## PROCEEDINGS

### OF THE TOPICAL WORKSHOP ON ELECTRONICS FOR PARTICLE PHYSICS



*PARIS, FRANCE, 21–25 SEPTEMBER 2009* 

Organized by

- CNRS/IN2P3: Centre National de la Recherche Scientifique / Institut National de Physique Nucléaire et Physique des Particules
- UPMC: Université Pierre et Marie Curie Paris 6
- LAL: Laboratoire de l'Accélérateur Linéaire, Orsay
- LPNHE: Laboratoire de Physique Nucléaire et des Hautes Energies, Paris
- OMEGA: Orsay Microelectronics Groups Associated, Orsay

with support from CERN, the European Organization for Nuclear Research

Full workshop programme on the Web http://indico.cern.ch/event/twepp09

#### **ABSTRACT**

The purpose of this workshop was to present original concepts and results of research and development for electronics relevant to particle physics experiments as well as accelerator and beam instrumentation at future facilities; to review the status of electronics for running experiments; to identify and encourage common efforts for the development of electronics; and to promote information exchange and collaboration in the relevant engineering and physics communities.



#### **ORGANIZATION**

The TWEPP-09 workshop was held on **21–25 September 2009** at the Institut des Cordeliers, Paris, France, and was organized by five French institutions, with support from CERN:

> - CNRS/IN2P3: Centre National de la Recherche Scientifique / Institut National de Physique Nucléaire et Physique des Particules
> - UPMC: Université Pierre et Marie Curie Paris 6
> - LAL: Laboratoire de l'Accélérateur Linéaire, Orsay
> - LPNHE: Laboratoire de Physique Nucléaire et des Hautes Energies, Paris
> - OMEGA: Orsay Microelectronics Groups Associated, Orsay

#### **Local Organizing Committee**



#### **Scientific Committee**



#### **Scientific Committee Assistant and Proceedings Editor**

Evelyne DHO, CERN

#### **OVERVIEW**

The purpose of the workshop was:

- to present original concepts and results of research and development for electronics relevant to particle physics experiments as well as accelerator and beam instrumentation at future facilities
- to review the status of electronics for running experiments
- to identify and encourage common efforts for the development of electronics
- to promote information exchange and collaboration in the relevant engineering and physics communities.

The main subjects of the workshop were recent research and developments in the following areas relevant to particle physics experiments:

- Electronics for Particle Detection, Triggering and Acquisition Systems
- Electronics for Accelerator and Beam Instrumentation
- Custom Analog and Digital Circuits
- Programmable Digital Logic Applications
- Optoelectronic Data Transfer and Control
- Packaging and Interconnect Technologies
- Radiation and Magnetic Tolerant Components and Systems
- Production, Testing and Reliability
- Power Management and Conversion
- Grounding, Shielding and Cooling
- Design Tools and Methods

The welcome address was given by **François VASEY**, chairperson of the programme committee, and the introduction was given by **Christophe de la TAILLE**.

A topical session on low power designs and techniques took place on Thursday afternoon, with two invited contributions.

Three dedicated working group meetings were held, one on microelectronics, one on optoelectronics, and one on power conversion and management.

An optional tutorial took place on the Friday afternoon following the workshop. Michael Schulte, Shuvra Bhattacharyya and Anthony Gregerson lectured on FPGA Tools and Techniques for High Performance Digital Systems.

#### **PLENARY SESSIONS AND INVITED TALKS**

Opening Plenary Session 1 – Chaired by François VASEY

- Research activities at Pierre & Marie Curie University (INDELICATO, Paul)
- Status and Perspective of Research at IN2P3 (AUGE, Etienne)
- Micro-Electronics at IN2P3 (DE LA TAILLE, Christophe)

Opening Plenary Session 2 – Chaired by Guy WORMSER

- The future of the LHC programme and machine (BERTOLUCCI, Sergio)
- HEP experiments in Japan: The Next Generation (ITOH, Ryosuke)
- ILC-CLIC (KLUGE, Alexander)

Plenary Session 1 – Chaired by Geoff HALL

- Experiment protection at the LHC and damage limits in LHC(b) silicon detectors (FERRO-LUZZI, Massimiliano)

Plenary Session 2 – Chaired by Alessandro MARCHIORO

- Low Power analog Design in Scaled CMOS Technologies (BASCHIROTTO, Andrea)

Plenary Session 3 – Chaired by Wesley SMITH

- Buses and Boards, making the right choice (GIPPER, Jerry)

Plenary Session 4 – Chaired by Philippe FARTHOUAT

- Low Noise Design for Large Detectors (JOHNSON, Marvin)

Plenary Session 5 – Chaired by François VASEY

- Key technologies for present and future optical networks (ANTONA, Jean-Christophe)

Plenary Session 6 – Chaired by Magnus HANSEN

- Recent Advances in Architectures and Tools for Complex FPGA-based Systems (SCHULTE, Michael)

Plenary Session 7 – Report from Working Groups, Chaired by Alain GONIDEC

#### **TOPICAL SESSION**

Chaired by John OLIVER

- Low Power SoC Design (PIGUET, Christian)
- Two-Phase Cooling of Targets and Electronics for Particle Physics Experiments (THOME, John Richard)

#### **PARALLEL SESSIONS**

Parallel sessions A were chaired by:

- CHRISTIANSEN, Jorgen
- DE LA TAILLE, Christophe
- MARCHIORO, Alessandro
- MUSA, Luciano
- PETROLO, Emilio
- SMITH, Wesley

Parallel sessions B were chaired by:

- CHRISTIANSEN, Jorgen
- FARTHOUAT, Philippe
- GONIDEC, Alain
- HALL, Geoff
- QUINTON, Stephen
- VASEY, François
- WIJNANDS, Thijs
- WYLLIE, Kenneth
- YAREMA, Ray

#### **POSTERS**

John OLIVER and Ray YAREMA chaired the posters session

#### **TUTORIAL**

Michael SCHULTE, Shuvra BHATTACHARYYA, and Anthony GREGERSON lectured on FPGA Tools and Techniques for High Performance Digital Systems

#### **INDUSTRIAL EXHIBITION**

- C.A.E.N, Viareggio, Italy
- PHYSICAL Instruments, Le Perreux sur Marne, France

#### **SPONSORS**

- ACEOLE
- IN2P3 / CNRS
- Université Pierre et Marie Curie

#### **NEXT WORKSHOP**

The next workshop will take place in Aachen, Germany, on 20–24 September 2010.

#### TWEPP-09 Executive Summary

J. Christiansen<sup>a</sup>, P. Farthouat<sup>a</sup>, A. Gonidec<sup>a</sup>, G. Hall<sup>b</sup>, M. Hansen<sup>a</sup>, K. Kloukinas<sup>a</sup>, A. Marchioro<sup>a</sup>, L. Musa<sup>a</sup>, J. Oliver<sup>c</sup>, E. Petrolo<sup>d</sup>, C. De La Taille<sup>e</sup>, F. Vasey<sup>a</sup>, T. Wijnands<sup>a</sup>, K. Wyllie<sup>a</sup>, R. Yarema<sup>f</sup>

> <sup>a</sup> CERN, 1211 Geneva 23, Switzerland
> <sup>b</sup> Imperial College, London, United Kingdom
> <sup>c</sup> Harvard University, Cambridge, USA
> <sup>d</sup> INFN, Rome, Italy
> <sup>e</sup> Laboratoire de l'Accélérateur Linéaire, Orsay, France
> <sup>f</sup> Fermilab, Batavia, USA

#### I. SOME STATISTICS

The Topical Workshop on Electronics for Particle Physics (TWEPP-09) took place in Paris, France, from 21 to 25 September 2009. Fourteen invited and 131 contributed papers (63 oral and 68 poster) were presented in 10 plenary and 11 parallel sessions to an audience of approximately 240 participants. Twenty-six of these participants came from the United States and five from Japan, while the majority originated from Europe. Of all presented papers, 25% referred to the LHC project, 43% to the SLHC upgrade programme, and 32% to ILC and other experiments.

#### II. SESSION SUMMARIES

Some of the main conclusions from the sessions dedicated to ASICs (A), Packaging and Interconnects (B), Optoelectronics (C), Systems, Installation and Commissioning (D), Radiation tolerant components and systems (E), Power (F), Trigger (G), Programmable logic, boards, crates and systems (H), and Posters (I) are summarized in sections A to I below. Owing to space constraints, many contributions had to be omitted from this summary, but the interested reader can find them all at [1].

Invited presentations during the Monday afternoon opening session surveyed the High Energy Physics (HEP) activities of CNRS-IN2P3 & IRFU in France, their microelectronics activities, and the research programme at the Pierre et Marie Curie University (the workshop host). In the second part of the opening session, three major High Energy Physics programmes were reviewed: the LHC programme at CERN, the next generation HEP experiments in Japan, and the ILC-CLIC development challenges.

On the following days, all sessions were introduced by invited plenary talks, as mentioned in the sections below.

#### *A. ASICs*

The ASICs sessions were particularly well attended this year, with a very high number of good paper submissions and large turnout. The community is clearly vigorously restarting an R&D phase for chips to be used in future detectors and this was reflected by the large variety of projects presented.

The sessions consisted of 18 oral presentations and two invited talks. The first invited speaker addressed some of the most important issues of designing analog circuits using sub-100 nm technologies. While providing continuous advantages and benefits for digital designers, mainly in terms of size and energy consumption per logic function, these new technologies are not always well received by analog designers, owing to at least three factors:

- reduced power supplies for deep submicron technologies lead to small signal swing ranges and reduced biasing possibilities;
- process variations at very small size are becoming more important (due to atomic effects) and affect the matching of devices;
- intrinsic transistor gains are becoming significantly worse than with older technologies, so that intricate circuit solutions have to be introduced.

Some interesting and innovative solutions relevant for HEP applications were proposed, often resorting to new circuit topologies and the application of digital correction techniques.

The second invited talk covered the very critical issue of low power design for System-On-Chip solutions. With the explosion of the market of portable and battery operated electronic devices, this is perhaps the most important subject in today's large commercial projects. While it becomes easier to add functionality and computational power to large chips (think about the various multi-core processors or very large DSP chips), doing this within an acceptable power budget becomes the overwhelming preoccupation of designers. System architects are using tricks and optimization techniques at different abstraction levels to contain power dissipation, and have to be knowledgeable and master complexity at all these different levels, from the simple transistor leakage to the interaction and parallelism of many functional units. Some of these techniques are very relevant and educative for designers in HEP, where the development of electronics for future detectors (see for instance the new intelligent trackers or very high granularity calorimeters being discussed in HEP) sometimes depends critically on the capacity of designers to come up with solutions that remain within an acceptable overall power budget.

The Tuesday morning ASICs session was dedicated to pixel electronics and imagers.

Two talks addressed Monolithic Active Pixel Sensors (MAPS), which have now reached a very mature state, as shown by their use in the EUDET telescope or in the vertexing of physics experiments. The large area (2 cm<sup>2</sup>) obtained with MIMOSA26, together with its zero-suppressed and high speed digital readout, illustrates this maturity. The progress in technology with deep P-wells also allows more sophisticated MAPS readout (including the use of PMOS transistors without degrading the collection efficiency), as shown in the FORTIS chip or the TPAC (Tera Pixel Active Calorimeter for the ILC). The SOI process can also be considered as a monolithic pixel sensor, although the charge is not collected by diffusion but in a depleted junction made possible by the process. KEK has been providing MPW access to the OKI 0.2 µm process, which has been further improved to minimize back gate effects. Finally, a fourth type of monolithic pixel sensor is the TFA (thin film on ASIC) proposal, which deposits amorphous silicon on the readout chip to form the sensor. The very small signal amplitude necessitates an ultra low noise readout, which has been designed and tested successfully. The detector leakage current still makes it difficult to clearly identify the MIP.

Hybrid pixel electronics was addressed in two talks. One was dedicated to the PANDA experiment at GSI, where the high rate and lack of trigger necessitate custom developments in 130 nm technology exploiting Time over Threshold information. At the LHC, ABCN, the next generation ABCD chip for ATLAS strip readout, has been evaluated. The results presented focused on system aspects to reduce the power dissipated, such as allowing internal serial powering and reducing the digital power.

The Tuesday afternoon ASICs session had presentations on chips for gas-detectors and calorimeters and was followed by a Microelectronics Users Group meeting (MUG) (see the MUG report at the end of this section).

A new 130 nm CMOS pixel chip for micropattern gas-detectors was presented. As with several other chips, this was not just a new front-end ASIC, but more functionality was included by incorporating an on-pixel TDC. The on-pixel TDC is capable of doing time measurements already in the pixel element. This functionality was made possible by the higher integration capability of deep submicron technologies.

The electronics being designed for the very high granularity calorimeters at the ILC takes advantage of the particular bunch crossing structure of that machine and relies on powering schemes where the FE electronics is actually turned on with a very low duty cycle. Owing to the demand on dynamic range for these calorimeters, designers find it easier to work with technologies allowing a relatively large power supply voltage and several chips are therefore being developed in a 0.35 µm BiCMOS technology. As an example, the HARDROC ASIC is designed for a very high granularity calorimeter readout where the detector is not only capable of doing energy measurements but also has some fine-granularity capability sufficient to explore the concept of particle-flow in these future calorimeters.
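The benefit of such a low duty cycle powering scheme can be illustrated with a small arithmetic sketch. The numbers below (on-time, repetition period, and per-channel power) are hypothetical values chosen only for illustration and are not taken from any of the presented chips; the function name `average_power` is likewise an assumption of this sketch.

```python
def average_power(p_on_mw, p_standby_mw, duty_cycle):
    """Mean power (mW) of a front-end that is switched on for a
    fraction `duty_cycle` of the time and idles otherwise."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * p_on_mw + (1.0 - duty_cycle) * p_standby_mw


# Hypothetical numbers, for illustration only (not from the source):
on_time_ms = 1.0    # assumed time the front-end is powered per cycle
period_ms = 200.0   # assumed repetition period of the bunch trains
duty = on_time_ms / period_ms

# Assumed 1 mW per channel when on, negligible power when off.
p_avg_mw = average_power(p_on_mw=1.0, p_standby_mw=0.0, duty_cycle=duty)
print(f"duty cycle = {duty:.3%}, average power = {p_avg_mw * 1000:.1f} uW per channel")
```

Under these assumptions the average power scales directly with the duty cycle, which is why any residual standby consumption quickly becomes the dominant term in such schemes.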

ILC calorimeters also rely on very accurate calibration circuitry, typically requiring high precision DACs. Such a component was presented, also developed in a 0.35 µm process to take advantage of the higher signal voltage swing allowed in this technology.

A more advanced, medium performance 130 nm BiCMOS technology with SiGe transistors was instead used for an upgrade of the ATLAS LAr calorimeter front-end with again the possibility of using large swing signals and relatively large supply voltages. Preliminary measurements on the test chips indicate nice functionality and good matching with the simulation, reassuring us that using a high volume, mature technology is always an extra guarantee of success for all designs.

The Thursday ASICs session hosted several talks about chips designed for neutrino experiments: analog memories, combining different record lengths and speeds for optimum signal analysis and realized in AMS 0.35 µm; a centralized multichannel readout chip PARISROC designed in 0.35 µm SiGe that autotriggers and digitizes both charge and time information to a data-driven readout system; an analog-to-digital converter to be integrated in a readout chip (a low power current driven architecture reaching 10 bits at 25 MS/s showed good preliminary results and can be used for PET readout systems); and a new high precision and large dynamic range time-to-digital converter designed in 0.35 µm CMOS.

An exploratory new detector type using an intrinsic weakness of conventional circuits (latchup sensitivity due to parasitic SCR structures) was presented. A simple working test circuit was realized with conventional discrete components while more work will be necessary to realize a working integrated version.

Finally, two chips belonging to the family of components for the GigaBit Transceiver (GBT) system were also presented. A promising prototype of a laser driver optimized for the opto-components specified by the Versatile Link and GBT Projects was designed in 130 nm. This component does not yet fully satisfy the timing specifications, but was functionally fully working. The corresponding receiver component (a transimpedance amplifier to be connected to the receiving pin-diode) was presented; measurements of the prototype devices with real opto-receivers were shown to be fully satisfactory and the designers expressed confidence that the final iteration will contain all functionality that was omitted in this version and will completely qualify the components for the desired system specification.

The Microelectronic User Group (MUG) meeting consisted of three short presentations followed by an open discussion session. The opening presentation summarized the design tools and foundry access services that CERN provides to the HEP electronics designers community. CERN currently supports two technology nodes, a 130 nm node for CMOS and BiCMOS circuits and a CMOS 90 nm node. To increase the design productivity with the 130 nm node, CERN took the initiative of developing a Mixed Signal Design Kit by incorporating the foundry Physical Design Kit (PDK) along with foundry proprietary Digital Standard Cell libraries. The Design Kit supports the new Cadence CAE tools based on the Virtuoso IC6.1 OA (Open Access) and the SOC Encounter 7.1 OA platforms. A significant part of the development work for the Design Kit was subcontracted.

To enhance the functionality of the 130 nm Mixed Signal Design Kit, a set of customized Methodology Design Workflows was developed. These flows demonstrate the use of the CAE tools and the procedural steps involved in accomplishing specific design tasks, with the aim of providing standardized design workflows for partners collaborating in common design projects. CERN has organized three training workshop sessions, running until the end of 2009, at which these Methodology Workflows will be demonstrated. More sessions are planned for early 2010, depending on demand.

CERN is providing the HEP community with foundry access services for design prototyping and production runs either directly with the foundry or through the MOSIS MPW service provider. During the period of 2008–2009 there were five MPW runs organized on the 130 nm CMOS8RF node, one MPW run on the 130 nm BiCMOS (SiGe) node and one MPW run on the 90 nm CMOS9LP/RF node. The majority of the designs were fabricated on the 130 nm CMOS node, with 20 designs and a total silicon area of 100 mm<sup>2</sup>.

The second presentation was an invited talk on Mixed-Signal Challenges and Solutions for advanced process nodes. The presentation addressed the implementation and verification challenges for mixed signal designs in deep submicron technologies and the impact on the engineering cost of the product. Modern CAE tools solutions were presented, based on the increased interoperability of the front-end (analog full custom) design flow with the back-end (digital) design flow.

The third presentation was a demonstration of the Digital Block Implementation Methodology Workflow based on a digital design example. This workflow is part of the 130 nm Mixed Signal Design Kit and is extensively automated through the use of scripts. All design steps were demonstrated, from RTL code synthesis through floor planning, placement, routing, clock-tree generation, timing optimization, and timing verification to physical design verification.

During the open discussion session, questions about the technical capabilities of the newly developed CMOS 130 nm Mixed Signal Design Kit were addressed, and organizational issues about the distribution of the kit and the training workshops were discussed. The organization of MUG meetings outside the context of the TWEPP workshops was suggested. Regular MUG meetings of one or two days' duration would provide a more effective means of exchanging technical information and experience on the use of new technologies and design techniques within the community, and would allow designers to stay up to date with ongoing developments.

#### *B. Packaging and Interconnects*

Packaging and interconnect technologies are receiving increased attention by the detector and electronics community, as vital and critical contributors to future high-resolution low-mass tracking detectors. The appropriate technologies must offer high interconnect density, low mass, high yield and high reliability at an affordable cost. The silicon detectors (strips and pixels) must be connected to readout chips mounted on appropriate hybrids/modules and must finally be integrated into complex tracker systems with optimally engineered power distribution and cooling systems. The papers presented in this session can be classified in three basic categories:

- Silicon strip detectors with front-end chips mounted on hybrids connected with wirebonding;
- Interconnects for hybrid pixel detectors;
- 3D interconnect schemes for pixel detectors.

Across these three categories the challenges of using thinned readout chips and detectors plus very light support structures and services (power, cooling) to minimize material are being actively addressed.

Two presentations on silicon strip detectors reported the use of a flex Kapton hybrid technology, as it offers the required interconnectivity with relatively low mass. Experience from current tracker systems has clearly shown that a significant emphasis needs to be put on manufacturability and reliability of such hybrids to enable large-scale trackers to be produced. Wire bonding between the detector and the readout chip and to the hybrids is extensively used. This is a well known technology with wide experience in the community which covers well the needs of silicon strip detectors.

For the ATLAS silicon strip detector upgrade the approach taken is to minimize the actual connections through the hybrid itself. The readout chips are directly connected to the single-sided silicon strip detectors with staggered wire-bonding (without pitch adapters) and readout and control signals are whenever possible connected directly between neighbouring readout chips (20 per hybrid). The complexity of the hybrid itself is thereby significantly reduced (no need for very fine pitch connections and micro vias). The use of wirebonding to connect the front-end hybrids to the long (1.2 m) power distribution and readout busses running along a detector stave with up to 24 readout hybrids is also being evaluated.

The ORIGAMI concept for double-sided silicon strip detectors (BELLE experiment upgrade) uses folded flexible Kapton pitch/routing adaptors to connect strip signals from one side of the detector to a Kapton hybrid on the other side, thereby handling the full readout of both detector sides with a single active hybrid. This requires a complex and delicate mounting, gluing, and wire bonding process that has been demonstrated on prototypes using a well defined assembly sequence with dedicated and optimized tooling.

For the ATLAS silicon strip detector upgrade the cooling of the front-end chips is assured through the silicon detector itself to a stave with integrated cooling pipes. The gluing of the hybrid on top of the silicon sensor and the cooling has been verified to work correctly when taking appropriate care of the isolation, shielding and gluing scheme. For the ORIGAMI approach a cooling scheme based on readout chips directly glued to a cooling pipe has shown encouraging results.

For hybrid pixel detectors the community is evaluating multiple fine pitch (< 50 µm) interconnect technologies available commercially or at experimental level (solder-based bump bonding, Solid Liquid Inter Diffusion (SLID), direct metal–metal thermocompression bonding). Such processes must be compatible with sensor and readout chip wafers coming from different manufacturers and their critical parameters are available pitch, yield, and total effective cost. The use of Through Silicon Vias (TSV), mainly based on the via last process, is also being evaluated to implement abuttable detector assemblies, making the I/O signals of the readout chips available on their back side. The use of the SLID connection scheme has shown very encouraging results for an ATLAS pixel upgrade application.

3D interconnect technologies, currently being intensively developed by the microelectronics industry, have received a lot of attention from the HEP pixel community as they can offer unique possibilities for highly integrated pixel detectors. The use of multiple levels of active CMOS layers (tiers) allows novel pixel architectures to be implemented. Multiple tiers can, for instance, integrate more functions per pixel cell (for a given CMOS technology node) and the sensitive analog part can be implemented on a tier separate from the noisy digital logic. The close integration of the silicon detector and its readout electronics is also a possibility. A large number of groups (17) have joined in a 3D integration consortium to evaluate the possibilities and features of a commercially available 130 nm CMOS technology with via first TSVs and a tier connection scheme based on direct metal–metal thermocompression bonding. A large number of different circuits have been submitted in a shared two-tier MPW submission using only one mask set. Pixel sensors for use with some of these circuits are also being prepared. The use of 3D integrated circuits bonded to MAPS detectors is also being evaluated by one institute. This large community is eagerly waiting to get its circuits back to make detailed performance, radiation tolerance, and yield analyses of this new 3D technology. Appropriate CAE design, simulation, and checking tools for highly integrated 3D designs have been found to need improvement if complicated designs are to be handled efficiently and reliably.

The increased attention to modern interconnect and packaging technologies will certainly have a major positive impact on the physics performance (resolution and mass) of future tracking detectors. The vital questions of productivity, reliability, and finally cost will require continued, and when possible coordinated, activities in the HEP community in the coming years.

#### *C. Optoelectronics*

Future HEP detectors will make substantial use of optical connectivity for high-speed triggering, data readout, and control. The systems and components will have to comply with very demanding engineering constraints in terms of mass, volume, power dissipation, operational temperature, and radiation tolerance. Five papers were presented in this session, all related to future applications.

Two closely linked projects target the development of 4.8 Gb/s serial links to connect detector front-ends to counting rooms: the GBT and the Versatile Link projects. The architecture and transmission protocols of the GBT chipset were presented, together with an extensive progress report spanning a wide range of designs: laser driver, transimpedance amplifier, SerDes, control/monitoring, all of them in 130 nm CMOS technology. The status of these ASICs was reviewed and the implementation of the link protocol in an FPGA was demonstrated. A project roadmap to 2011 was presented. The Versatile Link project develops the optical physical layer of the link. Reports on the front-end transceiver developments and on the fibre radiation resistance were given. The capability to evaluate the functional performance of transceiver modules up to 10 Gb/s was highlighted, together with initial comparative results of commercial modules. Radiation resistance of lasers and PIN photodiodes operating at 850 nm as well as 1310 nm was discussed with encouraging results shown for neutron fluences well in excess of 10<sup>15</sup> n/cm<sup>2</sup>. Interconnecting the electronics with the optics is a challenge at those data rates, and a path to develop a customized low mass package for a front-end transceiver including ASICs and optocomponents was described. The radiation-induced attenuation in two candidate multimode optical fibres was studied up to a total dose of 700 kGy. In line with results published in the literature, the measured attenuation was found to be much more pronounced at −25 °C than at room temperature.

VCSEL and PIN diode arrays for an SLHC ATLAS pixel detector will have to operate at neutron fluences of the order of several 10<sup>15</sup> n/cm<sup>2</sup>. Results from several irradiation runs were presented indicating promising performance, provided VCSEL annealing is taken into account. In the particular case of PIN diodes, the initial response of the diodes was shown to be recoverable after irradiation by increasing the diode reverse bias voltage by an order of magnitude.

Finally, the use of Passive Optical Networks (PONs) for timing distribution applications was demonstrated in a test set-up based on commercial PON transceivers and FPGAs. A full duplex bidirectional data flow with fixed latency was demonstrated using a simplified protocol.

At the end of the session, the Opto Working Group met to discuss the need for further meetings. Participants agreed that despite the emergence of well targeted common projects, a continuing communication forum on the subject is necessary. This need is enhanced further by the long foreseen timescales of the R&D period ahead of us. The working group chairs were thus encouraged to organize on a regular basis mini-workshops with invited and contributed presentations.

#### *D. Systems, Installation and Commissioning*

The completion of the commissioning of much of the LHC electronics systems allowed time in this session for presenting non-LHC projects. In these, there was clear evidence of the benefit of experience gained from LHC but also of developments that could be applied to future upgrades.

One very important subject is the noise experience of the LHC experiments in actual operational conditions, since this was hard to predict. CMS has studied and summarised the experience from commissioning its sub-detector systems over more than a year, including substantial periods of cosmic data taking. The results seem to be entirely positive. There is no evidence of much unexpected noise, except for minor issues involving HCAL photodetectors, and background effects traced to high power lights in the cavern, or associated with temporary welding operations. The widespread use of optical fibre transmission is likely to be one reason for this success, but it seems that prior concerns about grounding and shielding may have identified issues early enough for them to be overcome. Next year will be the acid test.

One specific CMS example is the case of the Muon Drift Tube system which was described. The cosmic data collected have been invaluable to study detector performance and it has behaved very efficiently and stably during data-taking. The complexity of its electronics was explained, including readout, trigger functions, services and monitoring. The quality of the DT data was found to be very good and independent of magnetic field with few integrity problems. At present, the DT Trigger has the expected performance and efficiencies and spatial and temporal resolutions are also as expected. The relatively lengthy period of operations has provided further confidence that the system is ready for LHC physics once beam operations resume.

There were two presentations from the newly installed CMS Preshower detector. The first described the readout system consisting of 9U format boards equipped with optical receivers and FPGAs to carry out data reduction. The large size of the FPGAs required careful testing of their connectivity on the boards and specific test modules were developed to optimise speed and efficiency. The installation and performance of the front-end system of the Preshower were then presented. The detector was installed in a short period of time and first commissioning results indicate that 99% of channels are working with the expected signal-to-noise ratio. A change in the pedestal behaviour with magnetic field has been observed but is not expected to cause problems.

The non-LHC presentations covered several topics. A system for proton imaging was the first, consisting of a tracker of silicon microstrip detectors and a calorimeter of YAG:Ce scintillating crystals. To sustain the necessary event rate of 1 MHz, a data acquisition system was developed based on FPGA technology and parallel processing.

A novel system was presented for acquiring and processing the data from radio-telescopes mapping the distribution of hydrogen gas in the universe. Specific requirements of the system are the large signal frequency range, the difficulties of transmitting fast timing signals across the large telescope array, and the high data throughput. Fast ADCs are used to convert to the digital domain early in the system. Complex signal processing is carried out in FPGAs and the high bandwidth is supported by Gb/s serial links between the modules. The challenges of precisely timing this system and handling the data bandwidth overlap nicely with requirements for many future developments.

A number of underwater neutrino experiments are running or planned for Mediterranean locations, and the data transmission system for the future KM3NeT project was presented. The emphasis is on a simple and reliable system which requires optical transfer of all data from photomultiplier modules up to 4000 m underwater to the shore. Wavelength division multiplexing will be employed to minimise the number of data fibres. The proposed scheme would use shore-based lasers sending light to passive modulators mounted on the underwater modules. Commercial developments, such as mating connectors that can survive high pressure, are being investigated for this project. The particular constraints on the layout and installation of the system were discussed.

#### *E. Radiation Tolerant Components and Systems*

A substantial amount of design effort in the particle detector community is now focusing on new designs for the SLHC, where radiation levels around the interaction points will increase by approximately a factor of 10.

The Medipix3 full pixel readout chip, fabricated in 130 nm CMOS technology, was irradiated with X-rays up to 460 Mrad. Results confirm that this technology is a strong candidate for the fabrication of future pixel chips in the SLHC.

The new readout electronics for the ATLAS LAr Calorimeter upgrade is being designed in a more radiation-hard technology to deal with the expected constraints of SLHC. Current developments are focusing on radiation tolerant ASICs for the analog and digital front-ends (both using a $0.13 \mu m$ CMOS process), the mixed-signal front-end ADCs, the silicon-on-sapphire serializer ASIC, the high-speed off-detector FPGA based processing units, and the power distribution scheme. First results of the ADC output stage were presented.

In the accelerator community, a considerable effort is made to reduce radiation damage to electronic equipment already installed and operational in the LHC underground areas. In contrast with the electronics located in the tunnel, most of the electronics in the protected alcoves does not rely on radiation tolerant designs. A system test campaign to validate the electronics built for the LHC cryogenic system was described. Whereas the radiation hardness of the signal conditioner cards for the LHC tunnel was once more confirmed in a complete system test, the insulated temperature conditioners and the AC heaters did not operate correctly due to damage from neutrons and from total dose. The short-term solutions currently being investigated are relocation and radiation shielding.

#### *F. Power*

Power supplies and power distribution are key issues for current experiments and will be a major challenge for future experiments. A lot of work is being invested in the subject and the number of presentations and posters submitted this year has reached the same high level as last year. There were 6 oral presentations in the dedicated power session, 2 posters and another 5 presentations during the ATLAS-CMS power working group session that followed. All contributions but one were related to the powering of the ATLAS and CMS upgraded trackers for SLHC; all of them dealt with power distribution efficiency.

A presentation on the powering of undersea experiments showed that, for these applications, a unipolar DC distribution together with DC-to-DC converters gives the most efficient power distribution scheme.

The two main routes being pursued for the powering of trackers at SLHC (one based on DC-to-DC converters and one based on a serial powering scheme) have made impressive progress in a year.

On the serial powering front, a dedicated Serial Powering Interface (SPI) ASIC has been produced and successfully tested on readout hybrids. This chip contains some power devices (shunt regulator and linear regulators) and also service components such as AC-coupled LVDS transceivers. Flip-chip packaging techniques have been used to optimize the connection of the ASIC to the hybrid it powers. The capability of distributing the shunt transistors in the readout front-end ASICs was also validated on ATLAS readout hybrids using the available features of the ABCN chip. Some work has been done to optimize the power dissipated by the front-end electronics. With small feature size CMOS technologies (130 nm and below) the power consumption is dominated by the digital part of the readout chips and no longer by the analog part. To reduce this digital power it is proposed to use two separate power lines, one for the analog part and one for the digital part at a lower voltage. It is then more efficient to have the digital power supply delivered by the shunt regulators and to produce the higher analog voltage with switched-capacitor DC-to-DC converters embedded in the front-end ASICs. Such a converter has been designed in a 130 nm CMOS technology; its efficiency should be about 80%. In the coming year some additional work at the system level is required to include all the control elements needed to implement, for instance, the necessary protection mechanisms as well as efficient on and off switching of power elements. Prototyping of these elements in 130 nm (or below) technology is foreseen.

On the DC-to-DC converter front, two main types of activity were reported, one related to EMC issues and the second to the design of a radiation-hard converter.

Studies of EMC issues when switching DC-to-DC converters are used close to the detector and its front-end electronics have been carried out. Several tests have been done with different types of converters and different front-end systems (current CMS modules and prototypes of new ATLAS modules). The noise performance has improved considerably thanks to optimization of the layout of those devices and of the shielding of the air-core inductors. In most cases, the noise level when using these converters is identical to that obtained using linear power supplies.

A CMS project was presented that aims to study the EMC issues related to the presence of a large number of DC-to-DC converters and to take them into account early enough in the detector system design phase. Such an approach would avoid the difficult and expensive implementation of late corrections.

The work towards the development of a radiation-hard DC-to-DC converter has been threefold: identification of suitable technologies, design of ASICs in these technologies and design of a complete converter with these components.

Five technologies have been extensively tested with radiation. One 0.25 µm technology has exhibited very good behaviour and is the baseline for future developments, while a 0.35 µm technology identified last year can be used as backup.

Three ASICs have been designed, two in 0.35 µm technology and one in the newly identified 0.25 µm technology. Only the first two had been tested before the workshop. They work satisfactorily and a complete converter has been designed with one of them. This converter has been used with an ATLAS prototype module and has been irradiated. A small but acceptable loss in efficiency after 5 Mrad has been observed.

As in the case of serial powering, an optimization of the power dissipated in the front-end leads to the distribution of two separate voltages, one for the analog part of the chips and one for the digital part. It is proposed to distribute power at twice the needed voltage and to implement step-down switched capacitor DC-to-DC converters in the front-end ASICs. Such a converter has been designed in 130 nm CMOS technology; simulation shows that up to 90% efficiency can be obtained.
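
The benefit of distributing power at a higher voltage can be illustrated with a small calculation. The sketch below is not taken from the presented designs: the load power, load voltage, cable resistance and the function name `cable_loss` are hypothetical and chosen only for illustration; the 90% converter efficiency is the simulated upper figure quoted above. Under these assumptions, doubling the distribution voltage halves the cable current for a fixed load power, so the ohmic loss in the cables drops by about a factor of four before the converter's own losses are counted.

```python
def cable_loss(load_power_w, load_voltage_v, ratio, cable_resistance_ohm, converter_eff=1.0):
    """Ohmic loss (W) in the supply cable when power is sent at ratio * load voltage."""
    # Power that must enter the step-down converter to deliver load_power_w.
    input_power_w = load_power_w / converter_eff
    # Current flowing in the cable at the higher distribution voltage.
    cable_current_a = input_power_w / (ratio * load_voltage_v)
    return cable_current_a ** 2 * cable_resistance_ohm


# Hypothetical example: 10 W load at 1.2 V, 0.1 ohm round-trip cable resistance.
direct = cable_loss(10.0, 1.2, ratio=1, cable_resistance_ohm=0.1)
ideal = cable_loss(10.0, 1.2, ratio=2, cable_resistance_ohm=0.1)
stepped = cable_loss(10.0, 1.2, ratio=2, cable_resistance_ohm=0.1, converter_eff=0.9)

print(f"direct distribution       : {direct:.2f} W lost in cables")
print(f"2x voltage, ideal converter: {ideal:.2f} W lost in cables")
print(f"2x voltage, 90% converter  : {stepped:.2f} W lost in cables")
```

The same function can be called with other conversion ratios to see how the cable loss scales as the inverse square of the ratio.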

A work programme for the coming year has been presented. It includes the design of a DC-to-DC converter with the newly identified 0.25 µm technology, some optimization of the packaging in order to minimize the size of the converter and the ohmic losses, the integration of protection functions, and some studies at the system level.

CMS has selected the DC-to-DC converter solution as its baseline for powering (with the serial power scheme as fall-back solution) and is planning to use such devices for the first upgrade of their pixel detector. ATLAS is still pursuing the two options in parallel and will make a decision once they have both been tested on a prototype stave.

The very dense agenda of the power sessions and of the ATLAS-CMS power working group left little time for discussion, and it was agreed to organize a dedicated one-day workshop in early 2010 in order to analyse the results in more depth and allow more time for discussion.

#### *G. Trigger*

The trigger sessions were filled with results from running and commissioning experiments, as well as studies of upgrades to the LHC experiments.

The challenges for the trigger and data acquisition systems for experiments such as NA62 and COMPASS involve processing high data rates without data losses and with high efficiency. Experiments are moving to all-digital systems to allow more complex and flexible logic as well as more comprehensive monitoring.

The present status of important aspects of LHC experiment trigger systems was shown for LHCb, CMS and ATLAS. These presentations were largely focused on the activities of commissioning and the first operations phase of the trigger systems, which were used in data taking, recording cosmic muons for long periods, after the machine stop of autumn 2008. While waiting for the first collisions, activities were mainly concentrated on the development of timing and energy calibration procedures.

Regular data taking runs are preparing the detectors for the restart of the LHC. Strategies for setting the parameters that will be used in colliding beams have been developed. Furthermore, online and offline Data Quality and Monitoring has been set up to provide intensive and precise trigger studies on performance and efficiency.

A large fraction of the work is still dedicated to the development of the software tools, which are essential for system operations and monitoring. Layered software frameworks have been developed in many cases, for configuring, controlling and testing, partial or complete trigger systems.

In the framework of the SLHC upgrade, two proposals for a Level-1 tracking trigger, one from ATLAS and one from CMS, were presented. Studies are required for a detailed understanding of detectors, for pile-up simulations and data reduction techniques. Two key issues are very important for compatibility with the existing sub-detector systems: first, the trigger rate must not exceed the present one by much and secondly, the level-1 trigger latency must not increase by more than a few microseconds. Upgrade studies for the SLHC CMS Trigger highlighted new ideas on architectures and tests of newer technologies, such as Advanced Telecommunications Computing Architecture (ATCA), high speed serial links, cross-point switches and large Field Programmable Gate Arrays (FPGAs) with integrated serial links. These offer capabilities and flexibility significantly greater than the present LHC trigger systems in more compact hardware.

#### *H. Programmable Logic, Boards, Crates and Systems*

This plenary session consisted of one invited talk and three oral presentations. In fact, several other presentations during the workshop could have qualified for this session but were given elsewhere due to synergy with other sessions.

The invited talk on Recent Advances in Architectures and Tools for Complex FPGA-based Systems presented the expected short and medium term development of FPGA technology. It introduced a set of tools available or under development to help firmware designers to profit from these advances.

The three contributed presentations introduced FPGA based solutions to implement functions that in the past were solved by other means.

The first presentation described a TDC (Time to Digital Converter) implementation based on high frequency oscillators. The second described the FPGA implementation of a high speed serial transceiver, SerDes and CoDec suitable for the counting house side of the GBT based optical link (see sections A and C). The third presentation introduced an FPGA based solution for a Bit Error Rate tester.
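
As an illustration of the principle behind a bit error rate tester, the sketch below compares a received bit stream against a locally generated pseudo-random reference and counts the mismatches. It is not the design presented at the workshop: the choice of the PRBS-7 pattern (polynomial x^7 + x^6 + 1), the seed, and the function names `prbs7` and `count_bit_errors` are assumptions made for this example, and the receiver is assumed to be already synchronised with the transmitter.

```python
from itertools import islice


def prbs7(seed=0x7F):
    """Yield the bits of a PRBS-7 sequence (x^7 + x^6 + 1, period 127)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        # Feedback is the XOR of the two tap positions of the 7-bit register.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


def count_bit_errors(received, seed=0x7F):
    """Compare received bits with the reference pattern; return (errors, total)."""
    errors = 0
    total = 0
    for rx_bit, ref_bit in zip(received, prbs7(seed)):
        errors += rx_bit ^ ref_bit
        total += 1
    return errors, total


# Emulate a link: transmit ten periods of the pattern and flip three bits.
tx = list(islice(prbs7(), 1270))
rx = tx.copy()
for position in (10, 500, 999):
    rx[position] ^= 1

errors, total = count_bit_errors(rx)
print(f"{errors} errors in {total} bits, BER = {errors / total:.2e}")
```

A hardware implementation would add pattern synchronisation and much longer counters, since realistic error rates require very large numbers of bits to measure.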

All presentations pointed to the fact that the range of areas where FPGAs may be a valid alternative to ASICs or to expensive instruments for particular tasks is widening rather fast, and that the complexity of firmware development, in particular its verification and testing, is increasing rapidly.

#### *I. Posters*

The courtyard of the Institut des Cordeliers provided a spacious setting for the TWEPP-2009 Poster Session. Some sixty posters were displayed describing forefront work on ASICs, Radiation Effects, Power, Systems and Triggering to name a few. The display area covered three full walls and allowed viewers full access to the posters over the course of the week. The dedicated poster session was particularly well attended by presenters and viewers alike and the resulting discussions were quite lively and animated. For the first time this year, posters were grouped by topic. By doing this it was easier for attendees to locate posters of interest to them. In addition, each oral session had a separate projector to show where posters related to that session could be found. In general, the grouping of posters appeared to work well and was well received.

#### III. CONCLUSION

As confirmed by the large attendance this year, the TWEPP workshop seems to have established itself as a reference European event for the community of electronics designers in High Energy Physics. Even though the currently running machines and experiments will keep us busy for many years to come, a healthy development cycle targeting future applications has started, as highlighted by the majority of the presented contributions.

In a dynamic and rapidly evolving environment, a high quality forum to exchange ideas, collaborate and create synergies is a necessity. This year, the workshop could benefit in Paris from excellent conditions both in terms of venue and organization, thanks to the outstanding preparation and efficiency of the local organizing committee [2].

#### **IV. LINKS**

- [1] http://indico.cern.ch/event/twepp09
- [2] http://twepp09.lal.in2p3.fr/



# Programme Overview







## *MONDAY 21 SEPTEMBER 2009 OPENING PLENARY*

#### The future of the LHC programme and machine

Sergio Bertolucci CERN, 1211 Geneva 23, Switzerland

sergio.bertolucci@cern.ch























#### **Next steps in commissioning with beam**


- **complete the BPM checks ( 70%H, 30% V done)**
- **adjust and capture beam 1**
- **beam 1 & beam 2 timing**


- **experiments magnets : turn on solenoids and toroids**
- **possible to allow for first collisions at 2** × **450 GeV**
- **turn on IP2 / 8 spectrometers - verify perfect bump closure**
- **start to use collimators, increase intensity**
- **check out the beginning of the ramp, ~ 450 GeV to 1 TeV**
- **QPS commissioning**
- **beam dump commissioning**
- **full ramp commissioning to initial physics energy of 3.5 TeV**
- **first collisions at physics energy of 2** × **3.5 TeV**
- **increase intensity and partial squeeze**






















































Detector performance needs to be maintained despite the new environment we will find at sLHC (pile-up, radiation, ….) …… in particular now when we don't know anything about the new energy domain



High-mass (~TeV Z',W',..) can tolerate some degradation; backgrounds are low

WW scattering (Higgs couplings or vector boson fusion) needs forward jet reconstruction and central jet veto

Vertex, missing  $E_T$ ,  $p_T$  resolution and efficiencies remain important, for many channels of interest

Electron and muon identification fundamental for W/Z, W'/Z', and SUSY



TWEPP 2009



























### **Inner Tracking Detectors**

**Phase 2 : > 600-700 fb<sup>-1</sup>, L ~ 10<sup>35</sup> cm<sup>-2</sup> s<sup>-1</sup>, ~ 2018**

- $\checkmark$  The present silicon and straw tracker will definitely not survive and will need to be replaced
- $\checkmark$  Both ATLAS and CMS plan a major upgrade, needing a substantial shutdown (ATLAS  $\sim$ 18 months) for in situ installation/integration
















### **Detectors Upgrade Strategy**

- **Major R&D and construction work needed.** Even if we learned the lesson with the first LHC detectors, it will take many years of construction work and a few years to integrate it and get it operational (ID in particular).
- **Designing today also means that we assume the technical feasibility of sLHC and we integrate in the design the worst pile-up and radiation/activation environment**
- While the financial green light for this new enterprise will probably take a few years and will be tuned to the first LHC discoveries, **the detector community has to act now**, preparing technology, making choices, testing prototypes and going deeply in the engineering design.


### **Summary**

- Both accelerator and experiments are vigorously planning the LHC Luminosity upgrade
- The accelerator will have to consolidate its injection chain. A series of new machines are in preparation. The LINAC4 is already an active project, ready for 2014. Several solutions exist for phase 2 upgrade, but need now
- The experimental challenge for the detectors is in the tracking and in the trigger, which will need to be fully rebuilt around 2018
- The detector upgrade projects have started and will now enter the usual phase of proposal and approval (LOI, TP, TDRs, MOU, ….). The project organization is slowly taking shape!
- I am sure that, once the first LHC discoveries are evident, this luminosity upgrade strategy will become a natural and necessary road map of the LHC program and of the HEP community at large


## To conclude


- By year 2013, **experimental results** will be dictating the agenda of the field.
- **Early discoveries will greatly accelerate the case** for the construction of the next facilities (sLHC, Linear Collider, ν-factory ...)
- No time to idle: a lot of work has to be done in the meantime






### Ryosuke Itoh<sup>a</sup>

### <sup>a</sup> IPNS, KEK 1-1 Oho, Tsukuba, Japan

ryosuke.itoh@kek.jp

*Abstract*

HEP experiments in Japan are now stepping into their next phase. J-PARC, a newly built high-intensity proton synchrotron facility, has recently started operation. A new long-baseline neutrino experiment, T2K, is now at the commissioning stage using its beam. In parallel, the upgrade of KEKB/Belle, a new-generation B-factory experiment at KEK, is about to start. The accelerator will be upgraded to SuperKEKB, whose luminosity is expected to be about 50 times higher. The detector will also be upgraded to Belle II to keep up with this drastic increase. In this talk, a detailed review of these new experiments is given, with some coverage of the readout and DAQ technologies.

### I. INTRODUCTION

Japan has a long history of accelerator-based HEP experiments. The first electron synchrotron, with an energy of 1.3 GeV, was built at the Institute for Nuclear Studies of the University of Tokyo (INS) from 1956. The construction of the second large accelerator, the 12 GeV proton synchrotron (KEK PS)[1], was started at KEK in 1970 and many early HEP experiments were performed with it. In 1999, the long-baseline neutrino experiment K2K[2] was started.

After the KEK PS, an electron-positron collider called TRISTAN[3] was constructed at KEK from 1983 and four experiments (TOPAZ, AMY, VENUS and SHIP) started data taking in 1986. In 1995, the construction of the KEKB accelerator for the B-factory experiment was started, reusing the TRISTAN tunnel, and data taking by the Belle experiment has been going on since 1999.

Recently, HEP experiments in Japan have been stepping into a new generation. The operation of KEK-PS was terminated in 2005 and a new proton accelerator complex named J-PARC[4] has been built at the Tokai campus of KEK. The beam from the accelerator is fed into the neutrino facility for T2K[5], which is the successor of the previous K2K experiment at KEK-PS. The commissioning of the facility has just started.

On the other hand, KEKB and Belle have been running for more than 10 years and their upgrades are about to start. This talk summarizes the preparation status of these Japanese next generation experiments, T2K and SuperKEKB/Belle II[6].



Figure 1: Aerial view of J-PARC.

### II. J-PARC AND T2K

### *A. J-PARC*

J-PARC is a proton synchrotron facility built at the Tokai campus of KEK. It consists of a 400 MeV injection linac, a 3 GeV RCS (rapid cycle synchrotron) and a main 50 GeV synchrotron. The proton beam from the RCS is also used to provide both neutron and muon beams to the Materials and Life Science Experimental Facility (MLF). Two facilities, the hadron facility and the neutrino beam line, have been constructed to receive protons from the main 50 GeV synchrotron. An aerial view of J-PARC is shown in Fig. 1.

The beam power of J-PARC is 10 to 100 times higher than that of KEK-PS, aiming at MW-class operation. In December 2008, beam acceleration up to 30 GeV was achieved, with fast extraction to the beam abort dump. The MLF at the RCS also started operation for user runs with a power of 20 kW. Slow beam extraction to the hadron experimental hall was achieved in January 2009. Some problems remain in high energy/power operation of the complex, and operation at 30 GeV with a power of 0.1 MW is foreseen for 2009 and 2010.

### *B. T2K*

T2K (Tokai to Kamioka) is a new-generation long-baseline neutrino experiment. It is the successor of the K2K experiment at the KEK 12 GeV PS. The physics goal of T2K is the measurement of the lepton mixing angle $\theta_{13}$, and it also aims at the discovery of CP violation in the lepton mixing matrix. T2K consists of the neutrino facility of J-PARC, the near detector complex located 280 m from the target station, and the Super Kamiokande (SK) detector located in Kamioka, ∼300 km from J-PARC, as shown in Fig. 2.

The neutrino beam from J-PARC is tilted by 3 degrees from the axis to SK so that the neutrino flux peaks in the energy range to which SK is sensitive and in which the oscillation is maximal. The construction of the neutrino facility at J-PARC was completed on schedule and beam commissioning started in April 2009.
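
The energy at which the oscillation is maximal can be estimated with the standard two-flavour approximation, $P = \sin^2 2\theta \, \sin^2(1.267\, \Delta m^2 L / E)$, with $\Delta m^2$ in eV<sup>2</sup>, $L$ in km and $E$ in GeV. The sketch below is only an illustration: the baseline is the ∼300 km quoted above, while the mass-squared difference of $2.4 \times 10^{-3}$ eV<sup>2</sup> and the function names are assumptions that do not come from this paper.

```python
import math

BASELINE_KM = 300.0      # "~300 km" as quoted in the text
DELTA_M2_EV2 = 2.4e-3    # assumed value, not taken from this paper


def oscillation_probability(energy_gev, baseline_km, delta_m2_ev2, sin2_2theta=1.0):
    """Two-flavour oscillation probability for the given energy and baseline."""
    phase = 1.267 * delta_m2_ev2 * baseline_km / energy_gev
    return sin2_2theta * math.sin(phase) ** 2


def first_oscillation_maximum_gev(baseline_km, delta_m2_ev2):
    """Highest energy (GeV) at which the oscillation phase equals pi/2."""
    return 1.267 * delta_m2_ev2 * baseline_km / (math.pi / 2)


e_max = first_oscillation_maximum_gev(BASELINE_KM, DELTA_M2_EV2)
p_max = oscillation_probability(e_max, BASELINE_KM, DELTA_M2_EV2)
print(f"first oscillation maximum near {e_max:.2f} GeV (P = {p_max:.3f})")
```

With these assumed inputs the maximum falls in the sub-GeV region, which is the kind of energy an off-axis beam configuration is used to select.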



Figure 3: ND280: T2K off-axis detector system.





Figure 2: T2K configuration

### *C. Near detector : ND280*

The "near" detector, newly built 280 m downstream of the target station, is a key component of T2K. It consists of two systems. One is the on-axis detector called INGRID, placed along the 0 degree axis, whose purpose is to measure the direction and intensity of the neutrino beam. The other is the off-axis detector ND280[7], placed 2.5 degrees off-axis, which measures the flux and energy spectrum of the neutrino beam.

Fig. 3 shows the configuration of the off-axis detector. The detector is composed of lead/scintillator tracking detectors for $\pi^0$, TPCs and Fine-Grained Scintillator detectors (FGDs), and downstream electromagnetic calorimeters. The detector complex is housed in a solenoid coil surrounded by a magnet yoke, recycled from the UA1 experiment[8], which provides a 0.2 Tesla magnetic field. New technologies are used in the detector. The Micromegas[9] technology is used for the gas-amplification readout of the TPC. The dE/dx resolution measured in the beam test is 6.9%, which provides a $> 5\sigma$ $e/\mu$ separation for momenta above 200 MeV/c, with a spatial resolution of 320 (650) µm for a 15 (75) cm drift length. The FGD is a solid active target with plastic (1st layer) and water (2nd layer) scintillators, and the scintillation light from them is fed through WLS fibers into Multi-Pixel Photon Counter (MPPC) arrays from Hamamatsu. The beam test result confirms the expected performance with a pulse height of $\sim 30$ p.e. for minimum ionizing electrons.

The magnet installation was completed in 2008 and the installation of the FGDs and TPCs is in progress. The commissioning of the entire detector system is scheduled to take place by the end of 2009.

### *D. SK DAQ upgrade*

Super Kamiokande (SK) is a large water Cerenkov counter system consisting of 13000 PMTs in 50,000 tons of water. The detector is used as the far detector of T2K. Recently the data acquisition system of SK has been upgraded with new front-end electronics so that it can process all the PMT hits with no dead time[10]. The system has no central trigger: each PMT hit exceeding the threshold is sent asynchronously to the event builder, where the average data flow is ∼430 MB/sec. "Software trigger" processing is performed by combining the hits in neighboring time slices, and the data flow is reduced to ∼9 MB/sec at the storage level. Fig. 4 shows a schematic view of the upgraded SK DAQ.
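
The idea of a software trigger applied to an untriggered hit stream can be sketched as a sliding time window that looks for clusters of hits. The actual SK algorithm is not described here, so everything in the example below is an assumption made for illustration: the function name `software_trigger`, the 200 ns window, the threshold of 4 hits and the hit times themselves.

```python
from collections import deque


def software_trigger(hit_times_ns, window_ns, min_hits):
    """Return the start times of windows containing at least min_hits hits.

    hit_times_ns must be sorted in ascending order.
    """
    window = deque()
    triggers = []
    last_trigger_end = float("-inf")
    for t in hit_times_ns:
        window.append(t)
        # Drop hits that are older than the sliding window ending at t.
        while t - window[0] > window_ns:
            window.popleft()
        # Fire once per cluster: the window must start after the previous
        # trigger window has ended.
        if len(window) >= min_hits and window[0] > last_trigger_end:
            triggers.append(window[0])
            last_trigger_end = window[0] + window_ns
    return triggers


# Hypothetical hit stream: isolated noise hits plus one cluster near 5000 ns.
hits = [0, 1000, 2500, 4000, 5000, 5010, 5020, 5030, 5050, 7000, 9000]
triggers = software_trigger(hits, window_ns=200, min_hits=4)
print("trigger windows start at:", triggers)
```

Only the hits inside triggered windows would then be written to storage, which is how a scheme of this kind reduces the data flow.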

### *E. Physics sensitivity of T2K and schedule*

The physics sensitivity of T2K is summarized in Fig. 5. The lepton mixing angle $\sin^2 \theta_{13}$ can be measured down to 0.01, which is a factor of 10 improvement on the current best limit set by CHOOZ[11]. The error on the mixing angle $\sin^2 \theta_{23}$ is also expected to be reduced to ∼1/10.



Figure 4: New data acquisition system for Super Kamiokande. No hardware trigger is used (triggerless DAQ).



### *A. Operation History*

Fig. 7 shows the operation history of the KEKB accelerator. Operation started in July 1999 and the luminosity of the machine gradually increased. Recently the world's highest luminosity of $2.108 \times 10^{34}$ cm<sup>-2</sup>sec<sup>-1</sup> has been achieved. The machine is still running and Belle has accumulated an integrated luminosity of $\sim$950 fb<sup>-1</sup> by now (September 2009), which provides the world's largest data sample of $B$ meson pairs.



Figure 7: KEKB operation history





Figure 5: Physics sensitivity of T2K.

The commissioning of T2K started in April 2009 and the first physics results are expected in 2010.

### III. KEKB AND BELLE

The Belle experiment[12] is a B-factory experiment at the KEKB $e^+e^-$ collider located at the KEK Tsukuba campus. Fig. 6 shows the KEKB accelerator and the Belle detector.

In order to achieve such a high luminosity, the crab crossing scheme is used in the KEKB accelerator. The electron and positron bunches are rotated before the collision so that they collide "head-on", as shown in Fig. 8. Crab cavities were installed in both the electron and positron rings in 2007. It has been confirmed that the luminosity increases by about 30% with the crab cavities compared to that without them.



Figure 8: Principle of crab crossing



Figure 10: Upgrade to SuperKEKB.

### *B. Physics results by Belle*

The Belle experiment has produced a number of important physics results. The most prominent is the observation of CP violation in B meson decays[13], as shown in Fig. 9. The CP phase $\sin 2\phi_1$ is measured to be $0.642 \pm 0.031 \pm 0.017$ and is confirmed to be non-zero. This brought the Nobel Prize in Physics to Profs. Kobayashi and Maskawa in 2008.



Figure 9: The observation of CP violation in B meson decays by Belle.

Other important results are the discoveries of the new particles $X(3872)$, $Y(3940)$ and $Z(4430)$, which are considered to be composed of 4 quarks[14]. Evidence of $D^0 - \bar{D}^0$ mixing was also found by the experiment for the first time.

However, in order to go further, and especially to search for New Physics, event statistics more than 50 times higher are required. This is the reason for the upgrade to SuperKEKB and Belle II, as discussed in the following sections.

### IV. SUPERKEKB AND BELLE II

Fig. 10 shows the upgraded SuperKEKB accelerator. The target luminosity of the machine is $L = 8 \times 10^{35}$ cm<sup>-2</sup>sec<sup>-1</sup>, which is ∼50 times higher than that of the existing KEKB. In order to achieve such a high luminosity, various modifications are made to the KEKB ring. The main improvement is an extremely small beam size of less than 50 nm, which is called the "nano-beam" scheme. To realize the nano-beam, the beam emittance is required to be very small, and a damping ring is newly added to the injection linac for this purpose.

The Belle detector is also upgraded to Belle II to keep up with the increased luminosity. A comparison of the Belle and Belle II detector systems is shown in Fig. 11. Since the hit rate of each detector is expected to increase drastically, the main issue of the upgrade is to manage the detector occupancy by pixelizing the detection units. The key changes are 1) the addition of a pixel detector, and 2) the upgrade of the particle identification (PID) device. The details of each upgrade are described in the following subsections.

### *A. Pixel detector*

To achieve the better vertex resolution in Belle II that is essential for the study of CP violation in B meson decays with a low background, a pixel detector is newly added. The detector is composed of 2 layers of very thin monolithic silicon pixel sensors made using the DEPFET[15] technology, which was originally developed for TESLA and ILC. The thickness of a sensor is only 50 µm with a pixel size of 50 × 50 µm<sup>2</sup>, and the sensors are placed just outside the beam pipe of 1 cm radius. Together with the 4 layers of silicon strip detectors (DSSD) which surround the pixel detector, the vertex resolution is expected to improve by more than a factor of two. Fig. 12 shows the current design of the DEPFET pixel detector.



Figure 11: Comparison of Belle and Belle-II.



Figure 12: Design of DEPFET pixel detector

### *B. Particle Identification Device*

The performance of the particle identification (PID) is the key issue for the search for New Physics in the rare decays of B mesons. The device used in Belle is a threshold-type Cerenkov counter using aerogel as the radiator, and its performance is limited. The PID upgrade consists of two different types of detector. The first is the device for the barrel region. The detector is a variant of the DIRC[16] used in the BaBar detector, which detects the Cerenkov ring produced in quartz bars surrounding the central drift chamber. There are several candidates for the upgrade. Among them, the TOP (Time Of Propagation)[17] counter is the most promising candidate; it utilizes the 3-D information of the Cerenkov ring by detecting the projected position of the Cerenkov light obtained by the precise measurement of the detection timing.

The other is the device for the endcap region. A proximity-focusing RICH[18], with aerogel as the radiator, is planned. Fig. 13 shows the current design of these devices.

Since the performance of both detectors relies on the detection efficiency for Cerenkov photons, the choice of photon sensor is the key issue. R&D on two different sensors is in progress: the MCP-PMT (Micro Channel Plate PMT) and the HAPD (Hybrid Avalanche Photo Diode) [19]. Two candidate MCP-PMTs, the HPK SL10 and the Photonis 85015, are being tested for use in the barrel PID. The HAPD is a good candidate for the endcap PID sensor, as is the MCP-PMT. Pictures of the candidate sensors are shown in Fig. 14 with preliminary test results.



Figure 13: TOP detector used for particle identification in barrel region, and aerogel RICH in endcap region.

### *C. Data Acquisition System*

The requirements on the data acquisition system are demanding, because it must cope with the drastic increase in the trigger rate of the Belle II detector. Table 1 compares the DAQ requirements for Belle and Belle II. As seen, the L1 trigger rate becomes ∼20 kHz with an event size of 300 kbytes after the front-end reduction in Belle II. This implies a data flow of 6 Gbytes/sec before the HLT (High Level Trigger) processing. The events to be used for the actual physics analysis are estimated to be ∼10% of the total, and the other background events are to be discarded by the HLT before storage.

|                               | Belle     | Belle II   |
|-------------------------------|-----------|------------|
| L1 trigger rate               | 0.5 kHz   | 20 kHz     |
| Raw event size                | 40 kB     | 500 kB     |
| Front end data size reduction |           | 1/3        |
| Event size at L1              | 40 kB     | 300 kB     |
| HLT reduction                 | 1/2       | 1/10       |
| Data flow at storage          | 20 MB/sec | 700 MB/sec |

Table 1: Comparison of requirements to DAQ for Belle and Belle II. The data flow at storage includes the additional data bandwidth to store HLT processing results.
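The data-flow figure quoted in the text can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original talk; the function and constant names are illustrative, and the inputs are the Belle II values from Table 1. Note that the 700 MB/sec listed in the table is larger than the plain product computed here because, as the caption states, it includes additional bandwidth for HLT processing results.

```python
# Illustrative check of the Belle II data-flow numbers quoted above.
# All names are hypothetical; the input values are taken from Table 1.

def pre_hlt_flow(l1_rate_hz: int, event_size_bytes: int) -> int:
    """Data flow into the HLT, in bytes per second."""
    return l1_rate_hz * event_size_bytes


def post_hlt_flow(pre_hlt_bytes_per_s: int, hlt_reduction_denominator: int) -> float:
    """Data flow after the HLT keeps 1/denominator of the events."""
    return pre_hlt_bytes_per_s / hlt_reduction_denominator


BELLE2_L1_RATE_HZ = 20_000           # 20 kHz L1 trigger rate
BELLE2_EVENT_SIZE_BYTES = 300_000    # 300 kB per event at L1
BELLE2_HLT_DENOMINATOR = 10          # HLT reduction of 1/10

belle2_pre = pre_hlt_flow(BELLE2_L1_RATE_HZ, BELLE2_EVENT_SIZE_BYTES)
belle2_post = post_hlt_flow(belle2_pre, BELLE2_HLT_DENOMINATOR)

print(f"before HLT: {belle2_pre / 1e9:.1f} GB/s")
print(f"after HLT (events only): {belle2_post / 1e6:.0f} MB/s")
```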



Figure 15: Belle II DAQ Design



Figure 14: Candidates of photon sensors. MCP-PMT(left) and HAPD(right).

### *D. Physics reach and Plan*

The goal of the upgrade is to acquire more than 50 times higher statistics of B meson pairs. With these statistics, the measurement precision of the unitarity triangle is expected to reach the order of 1%. Such a high precision enables the search for New Physics by looking at the "shifts" between various measurements, as shown in Fig. 16.



Figure 16: New physics search in various measurements by Belle II.

The new DAQ system is designed by inheriting the concept of the existing Belle DAQ system. In the Belle DAQ system, a unified readout scheme is used for most subdetectors; it consists of a Q-to-T converter and a pipeline TDC implemented on a common pipeline readout module (COPPER) [20]. In Belle II, the readout electronics are placed near the detector and only the digitized signals are transferred to the data acquisition system through optical fibers. We recycle the COPPERs and use them as readout modules by replacing the TDC daughter cards with a newly developed data link receiver connected to the readout electronics over the fibers. The data transmission scheme is unified and implemented using Xilinx's RocketIO technology. Fig. 15 shows the global design of the Belle II DAQ system.

Although the schedule of the upgrade is not yet fixed, since it depends on the budget situation of the Japanese government, it is quite likely that the upgrade work will start in 2010. The final decision on the detector configuration will be made by early 2010 and construction is expected to start in spring 2010. The commissioning of Belle II is expected in the summer of 2013. Fig. 17 shows the expected luminosity accumulation until 2020.



Figure 17: Expected luminosity accumulation

### V. SUMMARY

- 1. Japan has a long tradition of accelerator-based HEP experiments.
- 2. A new proton facility called J-PARC has just started operation, and the commissioning of the T2K experiment is now under way.
- 3. The Belle experiment at the KEKB $e^+e^-$ collider has been running for more than 10 years and has already produced various physics results.
- 4. The upgrade of Belle and KEKB, SuperKEKB, is about to start, aiming at a luminosity more than 50 times higher.
- 5. Both T2K and SuperKEKB will be the flagship experiments in Japan in the coming decades.

### **REFERENCES**

- [1] P.Paul *et al.*, "KEK 12GeV PS 2008 Review Report", http://www-ps.kek.jp/kekps/eppc/Review08/index.html (2008).
- [2] The K2K collaboration, M.H.Ahn *et al.*, "Measurement of Neutrino Oscillation by the K2K Experiment", Phys. Rev. **D74**, 072003 (2006).
- [3] KEK, "Report of TRISTAN Project" (in Japanese), KEK Progress Report 96-2 (1996).
- [4] Accelerator Group, JAERI/KEK Joint Project Team, "Accelerator Technical Design Report for J-PARC", http://hadron.kek.jp/accelerator/TDA/tdr2003/index2.html (2003).
- [5] Y.Itow *et al.*, "The JHF-Kamioka neutrino project", arXiv:hep-ex/0106019 (2001).
- [6] K.Abe *et al.*, "SuperKEKB Letter of Intent", KEK Report 04-4 (2004).
- [7] T2K-ND280 collaboration, http://www.nd280.org/ (2008).
- [8] A.Astbury *et al.*, "A $4\pi$ solid-angle detector for the SPS used as $p\bar{p}$ collider at c.m. energy of 540 GeV", CERN/SPSC/78-06, SPSC/P 92 (1978).
- [9] Y.Giomataris *et al.*, "A High granularity position sensitive gaseous detector for high particle flux environment", Nucl. Instr. Meth. **A376**, 20 (1996).
- [10] S.Yamada *et al.*, "Commissioning of the New Electronics and Online System for the Super-Kamiokande Experiment", talk at IEEE NPSS Real Time Conference 2009, Beijing, China, May 10-15 (2009).
- [11] CHOOZ Collaboration (M.Apollonio *et al.*), "Search for neutrino oscillations on a long baseline at the CHOOZ nuclear power station", Eur. Phys. J. **C27**, 331 (2003).
- [12] A.Abashian *et al.*, "The Belle Detector", Nucl. Instr. Meth. **A479**, 117 (2002); A.Akai *et al.*, Nucl. Instr. Meth. **A499**, 191 (2003).
- [13] K.Abe *et al.* (Belle Collaboration), "Observation of Large CP Violation in the Neutral B Meson System", Phys. Rev. Lett. **87**, 091802 (2001).
- [14] S.K.Choi, S.L.Olsen, *et al.* (Belle Collaboration), "Observation of a near-threshold omega-J/psi mass enhancement in exclusive B→K omega J/psi decays", Phys. Rev. Lett. **94**, 182002 (2005), and other papers.
- [15] J.Ulrici *et al.*, "Spectroscopic and image performance of DEPFET pixel sensors", Nucl. Instr. Meth. **A465**, 247 (2000).
- [16] BaBar-DIRC Collaboration (I.Adam *et al.*), "The DIRC particle identification system for the BaBar experiment", Nucl. Instr. Meth. **A538**, 281 (2005).
- [17] K.Inami *et al.*, "Development of a TOP counter for the super B factory", Nucl. Instr. Meth. **A595**, 96 (2008).
- [18] T.Iijima *et al.*, "Studies of a proximity focusing RICH with aerogel radiator for future Belle upgrade", Nucl. Instr. Meth. **A595**, 92 (2008).
- [19] S.Nishida *et al.*, "Development of an HAPD with 144 channels for the aerogel RICH of the Belle upgrade", Nucl. Instr. Meth. **A595**, 150 (2008).
- [20] T.Higuchi *et al.*, "Modular pipeline readout electronics for the SuperBelle drift chamber", IEEE Trans. on Nucl. Sci. **52**, 1912 (2005).

## ILC-CLIC

### Alex Kluge (CERN, Geneva, Switzerland)

alex.kluge@cern.ch

## **Tracker Read-out at ILC & CLIC**

**Presented by Alexander Kluge**

**TWEPP 2009, Sept 21–25, 2009**





**Data rate & Power pulsing**

**Higgs physics:**

- Study of light standard-model Higgs boson (< ~225 GeV) properties using the ZH radiation and WW fusion processes
- Precise measurement of Higgs mass (50 MeV) and width (7%)
- Higgs coupling to gauge bosons and quarks (to ~10% precision)

**Top-quark physics:**

- Precision top measurements (at √s = 350 GeV)
- Measurement of top mass (to ~150 MeV) and width (5% of predicted 1.4 GeV width)

These precision measurements allow one to look for departures from the standard model and to constrain parameters of new physics models.

**Supersymmetry:**

- Complete study of light sparticles
- Discovery of heavy sparticles

### **And in addition:**

- Probe for theories of extra dimensions
- New heavy gauge bosons (e.g. Z')
- Excited quarks or leptons



# *TUESDAY 22 SEPTEMBER 2009 PLENARY SESSION 1*

## Experiment protection at the LHC and damage limits in LHC(b) silicon detectors

M. Ferro-Luzzi<sup>a</sup>

<sup>a</sup> CERN, 1211 Geneva 23, Switzerland

Massimiliano.Ferro-Luzzi@cern.ch

### *Abstract*

The Large Hadron Collider (LHC), once in operation, will represent approximately a 200-fold increase in stored beam energy with respect to previous high energy colliders. Safe operation will critically rely on machine and experiment protection systems. A review is given of possible beam failure modes at the LHC and of the strategy adopted in the LHC experiments to protect the detectors against such events. Damage limits for the detectors are discussed.

### I. INTRODUCTION

The proton momentum $(7 \text{ TeV/c})$ and intensity (2808 bunches of $1.15 \times 10^{11}$ protons each) at the CERN Large Hadron Collider (LHC [1]) will be such that the total energy stored in one beam, 360 MJ, will be more than two orders of magnitude above the maximum beam energy stored in previous high energy colliders, like TEVATRON and HERA. Even during injection into the LHC, at 450 GeV, the energy stored in a single nominal batch of protons (288 bunches) will be 2.4 MJ, i.e. in excess of the maximum energy stored in a TEVATRON proton beam (1.5 MJ) or in a HERA proton beam (2 MJ). Equipment damage potential also relates to the energy density of the beam. In this respect, considering the small LHC beam dimensions, the maximum energy density will be about a factor of 1000 higher than for other accelerators. To cope with these extreme conditions, a robust machine protection system has been developed for the LHC machine [2].
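The stored-energy figures above follow directly from the bunch count, the bunch population and the energy per proton. The short sketch below reproduces them; it is an illustration rather than part of the original paper, the function name is hypothetical, and it assumes that the energy per proton can be taken equal to the quoted beam energy (the proton rest mass is negligible at these energies).

```python
# Illustrative arithmetic for the stored beam energies quoted in the text.
# Assumption: energy per proton = quoted beam energy (rest mass neglected).

JOULE_PER_EV = 1.602176634e-19  # exact SI value of the elementary charge


def stored_energy_joules(n_bunches: int, protons_per_bunch: float,
                         proton_energy_ev: float) -> float:
    """Total kinetic energy carried by a train of identical bunches."""
    return n_bunches * protons_per_bunch * proton_energy_ev * JOULE_PER_EV


# Full nominal beam at top energy: 2808 bunches at 7 TeV
full_beam_j = stored_energy_joules(2808, 1.15e11, 7e12)

# One nominal injected batch: 288 bunches at 450 GeV
injected_batch_j = stored_energy_joules(288, 1.15e11, 450e9)

print(f"full beam:      {full_beam_j / 1e6:.0f} MJ")
print(f"injected batch: {injected_batch_j / 1e6:.1f} MJ")
```

Both results agree with the 360 MJ and 2.4 MJ quoted in the text to within rounding.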

Past experience with beam accidents in particle physics detectors, particularly in vertex detectors, teaches us that experiments should implement a dedicated experiment protection system against beam failures. At the LHC, beyond relying on passive machine protection elements (absorbers and collimators), the experiments will have (i) a stand-alone protection system capable of detecting potentially harmful beam conditions and, when required, triggering a beam abort on the appropriate time scale, (ii) the capability to inhibit injection into the LHC machine, and (iii) the means to monitor particle rates in the experiment during injection and stop the process if necessary.

The purpose of this article is to give an overview of experiment protection at the LHC. Section II describes the general LHC layout. Because the machine protection system constitutes the bulk and first line of defense of LHC experiment protection, we outline in section III its general strategy and principal features (a more detailed and more expert description can be found in Refs. [1, 2, 3]). Section IV briefly reviews possible beam failure scenarios. In section V we describe the main features of the experiment protection systems. The special case of movable detectors is covered in section VI, while in section VII we discuss the damage potential of LHC beams. Finally, section VIII gives a summary and outlook.

### II. THE LHC MACHINE AND EXPERIMENTS



Figure 1: Schematic top view of the LHC (courtesy Rudiger Schmidt, CERN). Beam 1 (clockwise) and beam 2 (anticlockwise) are injected in IR2 and IR8 and both are extracted in IR6. Note that IR1 and IR5 are also hosting forward detector experiments, LHCf [5] and TOTEM [7] respectively (not indicated on the figure).

Figure 1 shows the general layout of the LHC, which is divided into eight octants joined by eight insertion regions (IR). Four of these insertion regions (IR1, IR2, IR5 and IR8) traverse experimental areas. The RF system for beam acceleration is located in IR4. The clockwise beam (beam 1) is injected near interaction point 2 (IP2), while the anticlockwise beam (beam 2) is injected near IP8. Apart from a few specific collimators, the collimation system is implemented in IR3 (for momentum cleaning) and IR7 (for betatron cleaning). Beam extraction is implemented in IR6. Figure 2 shows the layout of two insertion regions. The top figure displays the IR5 layout, similar to IR1, while the bottom figure shows the IR8 layout, similar to IR2. ATLAS [4] and LHCf [5] are installed around IP1, which can be rated the safest of all interaction points in terms of possible beam failure scenarios. CMS [6] and TOTEM [7] are located at IP5, one arc away from the beam dump section in IR6. ALICE [8] and LHCb [9] are hosted in IR2 and IR8, which are the regions of beam injection (beam 1 and beam 2 respectively), just about 200 m away from the IP. Furthermore, ALICE and LHCb each have a dipole spectrometer magnet in the experiment and three compensator magnets deviating the beams in the vertical plane (for ALICE) and ring plane (for LHCb). Finally, ATLAS and CMS have so-called TAS absorbers, which are 1.8 m long copper blocks situated at $\pm 19$ m from the IP. These are needed to protect the inner triplet of cryogenic quadrupoles around the IP from the excessive heat load due to particles from proton-proton collisions. Incidentally, the TAS absorbers also protect the inner detectors of ATLAS and CMS from a variety of beam failures. Due to the lower design luminosity of ALICE and LHCb, the inner triplets in IR2 and IR8 do not require such protection. The different configurations in each experimental area imply that, beyond beam failure scenarios common to all, some experiments will be more exposed to specific beam failures.

### III. MACHINE PROTECTION AND BEAM INTERLOCK SYSTEM

The LHC machine protection system has been described in great detail in Ref. [2] and references therein. It relies on both passive and active protection. The former is based on aperture limitation and dilution/absorption of beam losses (by collimators, absorbers, diluters). The latter implements fast detection of problem conditions (beam loss and beam position monitors, quench detectors, etc.) and fast beam extraction (LHC beam dumping system or LBDS). At the LHC, about 85% of the 27 km of the ring circumference is composed of superconducting magnets operated at 1.9 or 4.5 K. The combination of a large stored energy in the beams and a massive usage of cryogenic superconducting magnets requires a sophisticated collimation system with unprecedented performance [1, 10]. In contrast to other machines such as HERA, RHIC and TEVATRON, the LHC machine cannot be operated without collimation, because of the tight quench margins<sup>1</sup>. This by itself will ensure a significant level of safety for the experiments: the collimators must define the aperture at all times. For an assumed beam loss lifetime of 10 h, the collimation system must catch with 99.9% probability the particles that would otherwise be lost on sensitive items, such as the cold aperture (superconducting magnets) or the detectors around the interaction points.
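To give a feeling for the scale of this requirement, the following rough sketch estimates the proton loss rate implied by a 10 h beam lifetime and the fraction that would escape a 99.9% efficient collimation system. It is an order-of-magnitude illustration only and not a calculation from the paper: it assumes the nominal beam population quoted in the introduction, a simple exponential decay (so that the initial loss rate is N/τ), and a single global capture probability; all names are hypothetical.

```python
# Order-of-magnitude sketch (assumptions: nominal beam population from the
# introduction, exponential decay with lifetime tau, one global capture
# probability for the collimation system).

N_BUNCHES = 2808
PROTONS_PER_BUNCH = 1.15e11
BEAM_LIFETIME_S = 10 * 3600          # assumed 10 h beam loss lifetime
CAPTURE_PROBABILITY = 0.999          # fraction caught by the collimators

stored_protons = N_BUNCHES * PROTONS_PER_BUNCH

# For N(t) = N0 * exp(-t / tau), the initial loss rate is N0 / tau.
loss_rate_per_s = stored_protons / BEAM_LIFETIME_S

# Protons per second that are NOT intercepted by the collimation system.
escaping_rate_per_s = loss_rate_per_s * (1.0 - CAPTURE_PROBABILITY)

print(f"total loss rate:    {loss_rate_per_s:.2e} protons/s")
print(f"escaping the cleaning: {escaping_rate_per_s:.2e} protons/s")
```

Under these assumptions roughly $10^{10}$ protons are lost per second, of which about $10^{7}$ per second escape the collimators around the whole ring.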



Figure 2: Layout of two insertion regions, IR5 (top) and IR8 (bottom). Warm magnets are indicated in red (MBXW, MBXWS, MKI, MSI, MQI, MBI), cold magnets in light blue (MQXA, MQXB, MBRC, MQY, MQML, MBX, MQM/L, MQM, MBA). Yellow elements indicate absorbers (TAS, TAN) or collimators (TDI). XRP1 & XRP2 show the positions of the TOTEM Roman Pot stations. Distances are shown in meters.

A sketch of the beam interlock system (BIS) is shown in Fig. 3. Two redundant optical loops per beam transport so-called BeamPermit signals around the ring. Each pair of loops is composed of a clockwise and an anticlockwise propagating signal loop. Two beam interlock controllers (BIC) per insertion region are used to make a logical AND of a number of logical signals provided by the users (UserPermit signals). When a UserPermit signal is set False, the BeamPermit is removed (the optical signal loops are interrupted), which fires the dump system and blocks injection from the Super Proton Synchrotron (SPS). For example, beam loss monitors and beam position monitors may detect abnormal conditions and fire a beam dump, or quench protection sensors may detect a developing quench and fire a beam dump. In total, there will be several thousand LHC devices with input to a BIC, which imposes severe availability and reliability levels<sup>2</sup> [11]. The LHC beam dump system, described in Ref. [1], relies on at least 14 out of 15 kicker magnets firing to extract a beam. The kick amplitude is coupled to an energy tracking system which ensures that beams are properly extracted at any energy [2]. Every beam filling scheme contains an abort gap of at least 3 $\mu$s in the bunch structure, corresponding to the dump kicker rise time. The abort gap in each beam is tracked and monitored. Each beam has an independent dump system. When fired by the BIS, the extraction of the full beam is completed within less than 270 $\mu$s from the removal of the UserPermit signal at the BIC.

<sup>1</sup>The quench levels for slow, continuous losses are expected to be approximately $7 \times 10^8$ protons m<sup>-1</sup> s<sup>-1</sup> at 450 GeV and $7.6 \times 10^6$ protons m<sup>-1</sup> s<sup>-1</sup> at 7 TeV [1].

<sup>2</sup>A fraction of these user inputs may be masked under specific conditions.
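The interlock logic described above can be summarized in a few lines. The sketch below is a toy model for illustration only, not the actual BIS implementation: the function names are hypothetical, masking of inputs is ignored, and the turn duration of about 89 $\mu$s is the value quoted elsewhere in this paper.

```python
# Toy model of the beam interlock logic (illustrative; all names are
# hypothetical and input masking is not modelled).
from typing import Iterable

LHC_TURN_US = 89.0          # one LHC turn, as quoted in the paper
MAX_DUMP_LATENCY_US = 270.0 # upper bound on full-beam extraction time


def beam_permit(user_permits: Iterable[bool]) -> bool:
    """BeamPermit is the logical AND of all UserPermit inputs."""
    return all(user_permits)


def extraction_possible(kickers_fired: int, min_required: int = 14) -> bool:
    """The dump system needs at least 14 of its 15 kickers to fire."""
    return kickers_fired >= min_required


# A single False UserPermit removes the BeamPermit.
permit_ok = beam_permit([True, True, True])
permit_lost = beam_permit([True, False, True])

# The extraction latency bound expressed in units of LHC turns.
dump_latency_turns = MAX_DUMP_LATENCY_US / LHC_TURN_US

print(permit_ok, permit_lost, f"{dump_latency_turns:.1f} turns")
```

In this picture a dump request is completed within about three turns of the machine, which sets the time scale on which active protection can react.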



Figure 3: Sketch of the LHC beam interlock system based on optical loops around the ring (extracted from [3], courtesy Rudiger Schmidt, CERN). Explanation in the text.

### IV. BEAM FAILURES

Beam failures can occur on different time scales. Slow, steady losses resulting from beam degradation on the time scale of seconds or minutes may damage the detectors, for instance by increased radiation dose, but do not necessarily require an automated beam abort (recovery of good beam conditions may be attempted). On the contrary, faster losses require rapid, automated reaction via the BIS. For example, beam losses due to a tripping magnet will generally develop on the time scale of several turns<sup>3</sup> (for warm magnets) to several milliseconds (for superconducting magnets). Ultra-fast losses, on the time scale of one turn or less, are tackled by passive protection. Such losses are due to e.g. an injection failure or a beam extraction failure.

### *A. Failures with circulating beams*

A large class of beam failure scenarios involves circulating beams (at any energy), where beam degradation may be due to a magnet failure or wrong change of settings, to an RF failure, to a collimator failure or wrong change of settings, etc. In these cases beam perturbations will generally affect a large portion of the ring and therefore are likely to be detected by the machine protection system before experiments are affected. An exception to this are possible faults in local bumps, which may affect a single IR with minimal effects outside the local bump region. In this respect, vertex detectors in ALICE and LHCb may be more exposed than those in ATLAS and CMS, as TAS absorbers (for the latter experiments) could help limit direct hits or high rate splashes to the innermost silicon detectors. A recent study of this type of failure has been reported in Ref. [12].

### *B. Beam failures at injection*

Another class of failures involves beams at injection. Here, an incomplete or unsynchronized kicker fire or wrong magnet settings in the transfer line could detrimentally affect ALICE or LHCb. Wrong magnet settings in the LHC, in particular in any of the experimental IRs (e.g. D1 magnet, see Fig. 2), could cause local damage and affect only the experiment of that particular IR. At 450 GeV, wrong magnet settings can potentially produce much larger deviations of the beam than at top energy. Again, the absence of TAS absorbers and the presence of spectrometer/corrector magnets in ALICE and LHCb may render these two experiments more prone and/or more exposed to such beam failures. A recent study shows how various wrong magnet settings can direct the beam into the vicinity of the ALICE and LHCb experiments [13, 14]. The results of these studies will be used to set software interlocks on a number of critical magnets around the IRs.

To mitigate the risk for this class of failures, a number of movable and fixed absorbers are placed upstream of the experiments [15]. Furthermore, injection into an empty LHC ring will always start with a single bunch of low intensity, in order to probe at once the settings of all static beam-steering elements of the machine and transfer line and/or to detect unexpected aperture limitations. Once circulating beam is established, injection of high intensity batches may proceed. This procedure is enforced by interlocks [16]. Circulating beam current will be measured in each LHC ring by beam current transformers that will set an interlock flag True (the BeamPresence flags, one per ring) if the measured current is at least 5 $\mu$A ($\approx 2.8 \times 10^9$ stored protons). In the SPS injector a similar device sets a flag (the ProbeBeam flag) which determines whether the prepared beam batch is safe for injection, i.e. whether it is below a certain predefined intensity limit. If this flag is False, then beam transfer from SPS to LHC can only occur if the BeamPresence flag is set True. If ProbeBeam is True, beam transfer can occur irrespective of the BeamPresence flag value. Defining an acceptable value of the safe limit for beam transfer requires a broad understanding of all possible risks involved, for machine equipment as well as for the experiments. This value is currently set to $10^{10}$ protons and could be configured to a maximum of $10^{11}$ protons under special conditions.
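The injection interlock described above reduces to a simple truth table, and the correspondence between the 5 $\mu$A threshold and the number of stored protons can be verified numerically. The following is a minimal sketch under stated assumptions, not the actual interlock code: the function names are hypothetical, and the revolution frequency is derived from the approximate turn duration of 89 $\mu$s quoted in this paper.

```python
# Sketch of the injection interlock logic and the current-to-protons
# conversion (illustrative; names are hypothetical).

ELEMENTARY_CHARGE_C = 1.602176634e-19
LHC_TURN_S = 89e-6                     # approximate turn duration


def beam_presence_flag(current_a: float, threshold_a: float = 5e-6) -> bool:
    """True if the circulating current is at least the threshold."""
    return current_a >= threshold_a


def probe_beam_flag(batch_protons: float, safe_limit: float = 1e10) -> bool:
    """True if the prepared SPS batch is below the safe intensity limit."""
    return batch_protons < safe_limit


def transfer_allowed(probe_beam: bool, beam_presence: bool) -> bool:
    """A probe beam may always be injected; a high-intensity batch
    requires beam already circulating in the LHC ring."""
    return probe_beam or beam_presence


def protons_from_current(current_a: float) -> float:
    """Stored protons giving this current: I = N * e * f_rev."""
    revolution_frequency_hz = 1.0 / LHC_TURN_S
    return current_a / (ELEMENTARY_CHARGE_C * revolution_frequency_hz)


threshold_protons = protons_from_current(5e-6)
print(f"5 uA corresponds to about {threshold_protons:.1e} protons")
```

For example, a batch of $3 \times 10^{12}$ protons offered to an empty ring gives `transfer_allowed(probe_beam_flag(3e12), beam_presence_flag(0.0)) == False`, whereas the same batch is accepted once a probe bunch is circulating.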

### *C. Beam failures at extraction*

Finally, a class of failures involves beam extraction. A relevant failure of the extraction system for experiments (especially in IR5) is a possible unsynchronized beam abort (kicker prefire). In such a case, a number of bunches may be swept during the kicker rise time. Of these bunches, up to 24 may continue their trajectory in the ring, possibly creating large instantaneous losses (mostly caught by collimators). Such beam failures could occur about once per year.

Early simulation studies showed that the IP5 inner triplet and CMS could be severely affected by a kicker pre-fire event [17]. As a consequence of this study, additional protective absorbers were added around IR6 in order to largely reduce the impact of such extraction failures [18, 19].

<sup>3</sup>One LHC turn is about 89 $\mu$s.

### V. LHC EXPERIMENT PROTECTION SYSTEMS

The individual LHC experiment protection systems will be centered on diamond-based beam conditions monitors (BCM), a feature common to ALICE, ATLAS, CMS<sup>4</sup> and LHCb. Additional detectors may participate in the experiment protection system, such as scintillator counters in the case of CMS or Čerenkov counters in ATLAS. The most exposed TOTEM detectors, the Roman Pots, will rely on nearby LHC beam loss monitors for protection [20]. Here, we limit our discussion to the diamond BCM. We outline the general picture of the BCM systems, emphasizing commonality, and refer the reader to the bibliography for the details of each individual BCM [21, 22, 23, 24].

Polycrystalline CVD<sup>5</sup> diamond pads have been selected as the primary sensors of LHC experiment protection for their compactness, simplicity, reliability and radiation resistance. Such sensors have been successfully used at other high energy physics experiments (BABAR [25], BELLE, CDF [26] and ZEUS). BCM diamond pads for LHC experiments were developed first within the RD42 collaboration for ATLAS and CMS and subsequently ported to LHCb and ALICE. For a recent review of diamond detectors in high energy physics applications see Ref. [27].

The LHC experiment beam conditions monitors are generally composed of an array of diamond pads located in the vicinity of the beams, typically 4 or 8 diamond sensors on each side of the IP. Each sensor is about $1 \times 1$ cm<sup>2</sup> in size and 0.3 to 0.5 mm thick. Some selected characteristics of the diamond BCM systems of LHC experiments are listed in table 1. As an indication, the average primary flux of minimum-ionizing particles (MIP) per diamond pad at $r \approx 4$ cm (radius from beam axis) and $z \approx 2$ m (longitudinal distance from IP) is expected to be of the order of 0.05 per inelastic proton-proton interaction, at 14 TeV center of mass energy.

The BCM systems will be operated stand-alone with a dedicated readout chain. The readout schemes and speeds differ from experiment to experiment. As shown in table 1, some systems integrate over  $\sim 40 \mu s$ , others implement bunch-by-bunch rate capability (25 ns readout speed) which also allows monitoring beam halo by timing. All systems use an FPGA-based readout board to process the data and generate a decision. The detailed algorithms and threshold definitions are specific to each experiment. The experiment protection systems will dump both beams when generating a beam abort.

Table 1: Selected characteristics of CVD diamond beam conditions monitors of LHC experiments.

| Experiment | System | Diamond pads | Radial distance from beam line | Longitudinal distance from IP ($-z$ side) | Longitudinal distance from IP ($+z$ side) | Readout frequency | Ref. |
|------------|--------|--------------|--------------------------------|-------------------------------------------|-------------------------------------------|-------------------|------|
| ALICE      | BCM-A1 | 4            | 15 cm                          | –                                         | $+4.5$ m                                  | 40 $\mu$s         | [21] |
|            | BCM-A2 | 4            | 6 cm                           | –                                         | $+13.5$ m                                 | 40 $\mu$s         |      |
|            | BCM-C  | 8            | 8 cm                           | $-19$ m                                   | –                                         | 40 $\mu$s         |      |
| ATLAS      | BCM    | 4 per side   | 5.5 cm                         | $-1.83$ m                                 | $+1.83$ m                                 | 25 ns             | [22] |
|            | BLM    | 6 per side   | 6.5 cm                         | $-3.45$ m                                 | $+3.45$ m                                 | 40 $\mu$s         |      |
| CMS        | BCM1L  | 4 per side   | 4.5 cm                         | $-1.8$ m                                  | $+1.8$ m                                  | 5 $\mu$s          | [23] |
|            | BCM2   | 12 per side  | 4.5 and 29 cm                  | $-14.4$ m                                 | $+14.4$ m                                 | 40 $\mu$s         |      |
| LHCb       | BCM-U  | 8            | 3 cm                           | $-2$ m                                    | –                                         | 40 $\mu$s         | [24] |
|            | BCM-D  | 8            | 3.1 cm                         | –                                         | $+2.8$ m                                  | 40 $\mu$s         |      |

Given that all experiments will use a non-maskable input to the local BIC, the LHC machine will not operate if any of the experiment UserPermit signals is missing. This imposes strong availability requirements on the experiment protection system, in particular the BCM, which is required to be ready from the first day of LHC operation. The experiment protection systems are required to implement post-mortem data retrieval and analysis that allows reconstructing a posteriori the few milliseconds preceding any beam abort.


In general, the primary purpose of the BCM systems is to protect the experiments against circulating beam failures. Nonetheless, the experiments are considering the use of BCM systems and other detectors to monitor possible abnormal rates at all times, in particular during injection, to generate a feedback warning for the experiment and LHC control rooms and/or to inhibit further injection if necessary.

### VI. THE SPECIAL CASE OF MOVABLE DETECTORS

Several experiments will make use of movable detectors in the LHC machine. These require special interlock functionality in order to reduce the risk of beam damage when the detectors are in the closed position for physics. TOTEM will use silicon strip detectors in pairs of Roman Pot devices located at $z \approx \pm 147$ m and $z \approx \pm 220$ m from IP5 [7]. These Roman Pots consist of movable vacuum enclosures that enable bringing the silicon sensors to a distance of 1.2 mm from the beam. ATLAS will implement scintillating fiber detectors in a similar configuration in IR1 [28], though not before the year 2010. The LHCb vertex locator (VELO [29]) at IP8, composed of 42 silicon strip modules mounted in the vacuum, is also a movable detector. It is divided into two halves (left and right of the beams) that can be retracted sideways by 30 mm during LHC filling. The 21 VELO modules of each half are enclosed in a thin-walled aluminium box (250 $\mu$m) that separates the beam vacuum from the detector vacuum. In nominal physics operation the silicon detectors will be precisely positioned around the colliding beams and the box material will approach the beams to a mere 5 mm distance (the silicon edges reaching a radial distance of about 7 mm).

<sup>4</sup>The TOTEM trackers T1 and T2 are mechanically integrated in CMS and protected by the CMS BCM [20].

<sup>5</sup>Chemical vapor deposition.

Because of the expected beam excursions during beam filling and preparation for physics, all movable detectors are required to be in the retracted (or garage) position during these operations. Beam modes have been defined to characterize the operational states. Interlock flags derived from the beam modes will be transmitted to the experiment protection systems for conditioning their interlocks. One particular flag will signal when movable detectors are allowed to leave their garage position. If this flag is set False and a movable detector is not in garage position, then the experiment protection system will fire a beam dump. Furthermore, whenever a movable detector is not in garage position the corresponding experiment will inhibit injection from the SPS.
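The two interlock conditions for movable detectors stated above can be written as a small decision function. This is a minimal sketch for illustration, not the experiments' actual implementation; the class, function and parameter names are hypothetical.

```python
# Sketch of the movable-detector interlock conditions (illustrative only;
# all names are hypothetical).
from dataclasses import dataclass


@dataclass
class InterlockDecision:
    request_beam_dump: bool
    inhibit_injection: bool


def movable_detector_interlock(movable_allowed: bool,
                               in_garage: bool) -> InterlockDecision:
    """Evaluate the interlocks for one movable detector.

    movable_allowed: flag derived from the beam mode; True when movable
                     detectors may leave their garage position.
    in_garage:       True when the detector is fully retracted.
    """
    return InterlockDecision(
        # Dump the beam if the detector is out while movement is not allowed.
        request_beam_dump=(not movable_allowed) and (not in_garage),
        # Injection from the SPS is inhibited whenever the detector is out.
        inhibit_injection=not in_garage,
    )


# Detector closed for physics while the flag allows it: no dump, but no
# further injection either.
physics = movable_detector_interlock(movable_allowed=True, in_garage=False)
print(physics)
```

The design choice worth noting is that the injection inhibit depends only on the detector position, so it holds regardless of the state of the beam-mode flag.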

Protection of movable detectors critically relies on the experiment or machine protection systems (BCM or BLM). A local excursion of a circulating beam, or a failure in the motion system of a movable detector, may dangerously bring the beam envelope into overlap with the detector enclosure or other nearby elements. The motion systems are too slow to react to such eventualities. Therefore, these abnormal conditions must be detected by the BCM (or BLM) and instantly lead to a beam dump trigger.

### VII. DAMAGE POTENTIAL OF LHC BEAM FAILURES

All vertex detectors and most inner trackers at LHC experiments are based on silicon technology. Beam-induced damage for silicon detectors may have different causes, among which: heat deposition, radiation damage and charge-induced breakdown.

- Heat deposition: A crude estimate<sup>6</sup> suggests that an instantaneous rate of $R \approx 10^{13}$ protons/cm<sup>2</sup> will not increase the local temperature of silicon by more than a few degrees (neglecting particle showering in the silicon). Given the lightness of silicon vertex detectors and the assumed beam failure scenarios, heat-induced damage to silicon seems unlikely, although thermal shock effects have not been considered in detail for the LHC detectors.
- Radiation damage: Displacement damage incurred during a beam failure consumes part of the radiation dose budget. However, silicon detectors at LHC experiments are designed to sustain fluences<sup>7</sup> of up to several $10^{14}$ $n_{eq}/\text{cm}^2$, corresponding to an absorbed dose of about 10 Mrad. Example studies by ALICE [30] and ATLAS [31] indicate that, given the assumed scenarios and occurrence probabilities, increased radiation dose due to beam failures is not expected to significantly cut down the detector lifetime. Nonetheless, all experiments will carry out detailed monitoring of radiation fluences, so that minimization of radiation damage may be attempted by beam tuning.
- Charge-induced breakdown: A sudden high particle rate may drastically change the electric field configuration in the silicon, which locally becomes conductive, and possibly destroy local features of the sensors, depending on the technology used. For example, the bias voltage may be shifted across a silicon oxide dielectric layer between strip implant and readout strip. The silicon oxide layer breaks down at about 1 V/nm and thus, depending on the particle rate, the sensor may be locally damaged (e.g. production of pin holes). A direct hit to front-end integrated circuits may cause even greater damage, as the loss of a readout chip generally implies the loss of many detector channels.
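The crude heat-deposition estimate quoted in footnote 6 can be reproduced numerically. The following is a minimal sketch that uses only the values given in the text (1.66 MeV cm<sup>2</sup>/g, $R \approx 10^{13}$ protons/cm<sup>2</sup>, $C_p \approx 0.7$ J g<sup>-1</sup> K<sup>-1</sup>); the function name is illustrative, and showering and heat flow are neglected as in the original estimate.

```python
# Crude temperature-rise estimate for silicon hit by an instantaneous proton burst.
MEV_TO_JOULE = 1.602176634e-13  # SI conversion factor for 1 MeV


def temperature_rise_kelvin(fluence_per_cm2,
                            stopping_power_mev_cm2_per_g=1.66,
                            specific_heat_j_per_g_k=0.7):
    """Return dT = (dE/dx) * R / C_p, neglecting showering and heat flow."""
    energy_per_gram_j = (stopping_power_mev_cm2_per_g
                         * fluence_per_cm2 * MEV_TO_JOULE)
    return energy_per_gram_j / specific_heat_j_per_g_k


print(f"{temperature_rise_kelvin(1e13):.1f} K")  # prints "3.8 K"
```

The result scales linearly with the fluence, so a hundredfold larger instantaneous rate would correspond to a rise of a few hundred kelvin under the same simplifying assumptions.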

The damage potential of an LHC beam for bulk inactive material has been studied by simulations and cross-calibrated with a controlled experiment at injection energy [32]. SPS batches of increasing intensity were directed into a stack target of selected materials. The results of these studies were found to be in reasonable agreement with simulations (at the 30% level) and indicated that, for copper, the melting point was reached at about $2.4 \times 10^{12}$ protons and clear damage became visible at about $4.8 \times 10^{12}$ protons. These studies led to a definition of the SafeBeam value for LHC equipment at injection energy ($10^{12}$ protons) and, based on simulation, at top energy ($10^{10}$ protons). Below this value, the number of active inputs to the BIS may be relaxed by masking specific inputs.

Concerning silicon detectors, high particle rate tests were performed at the CERN Proton Synchrotron (PS) with batches of 24 GeV bunches by ATLAS [33] and CMS [34]. Proton bunches<sup>8</sup> were directed onto silicon detector modules, with peak bunch densities of the order of $3 \times 10^{10}$ protons/cm<sup>2</sup>. The detectors were under bias and the front-end electronics were kept under voltage. Both groups concluded that LHC beam losses producing particle rates of up to $3 \times 10^{10}$ protons/cm<sup>2</sup> in about 40 ns would not cause irreversible damage to the studied silicon detectors, although a reset of the front-end electronics may be required. Furthermore, laboratory tests were carried out to infer, from the response to laser beam pulses, the damage potential of high MIP rates on ATLAS silicon strip detectors under bias [35]. Damage to aluminium readout strips was observed at laser pulse densities corresponding to rates of the order of 10<sup>9</sup> MIP in 6 ns injected in a single strip (thus, on a surface area smaller than $0.1 \text{ mm}^2$).

More recently, LHCb has carried out a high rate test at the CERN PS Booster, exposing a VELO module to particle rates ranging from 2.5 to $9000 \times 10^9$ protons/bunch with a beam spot size of the order of 1 cm<sup>2</sup> and bunch length $\sim 100$ ns. The detector was tested with high voltage bias on and off, and with front-end electronics turned on and off (in all possible combinations). The beam was sent perpendicular to the sensor plane in various places of the sensors and directly on the electronic readout chips. No obvious damage was observed and the detector was fully operational at the end of the high rate test. More details can be found in Ref. [36].

<sup>6</sup> $\Delta T \approx (1.66 \text{ MeV cm}^2/\text{g}) \cdot R/C_p = 3.8 \text{ K}$, with a specific heat of silicon $C_p \approx 0.7 \text{ J g}^{-1} \text{ K}^{-1}$.

<sup>7</sup> Equivalent non-ionizing energy loss: $1\,n_{eq}$ = 1 MeV neutron equivalent displacement damage in silicon.

<sup>8</sup> Bunch length 42 ns, bunch intensity $10^{11}$ protons and bunch separation of 256 ns.

Quite generally, detector components in the experiments, in particular those close to the beam line such as silicon sensors, might not be as sturdy as machine equipment. Although the actual damage limits (in terms of MIP cm<sup>-2</sup> ns<sup>-1</sup>) of the silicon detectors used in LHC experiments are as yet unclear, recent experience with TEVATRON or LEP experiments [37] would suggest reducing the limit for ProbeBeam to the lowest possible value. However, LHC beam instrumentation is limited in sensitivity, which precludes efficient machine studies at intensities below about $3 \times 10^9$ protons. In addition, dealing with bunches of such small intensity may require time-consuming adjustments in the injector chain. Therefore, a trade-off value for the ProbeBeam flag has been chosen that balances experiment protection against machine operation efficiency.

### VIII. SUMMARY AND OUTLOOK

In summary, with the Large Hadron Collider a new domain for stored beam energy is entered which imposes extreme requirements on machine and experiment protection. The installation of these protection systems is completed and commissioning is well advanced.

The damage risk for silicon vertex detectors depends on the detailed design of the sensors (pixels versus strips, p-in-n versus n-in-n, AC versus DC coupling, geometry, etc.), which varies broadly across LHC experiments. It may also depend on the state of the detector (value of silicon bias voltage, state of front-end chip supply voltage, etc.). A detailed characterization of the most exposed detectors in each experiment and a good understanding of the risks associated with possible beam failures can lead to a better policy of operation of these detectors when the LHC is not in stable beam conditions. For instance, the advantages of detector stability (no charge-up effects, no temperature excursions, etc., when all voltages are kept on at all times) will have to be weighed against a possible risk increase for the detectors in situations where beams are not ready for physics.

Further detector tests and beam failure simulation studies will help refine the operation policy of the machine and detectors, and define initial dump thresholds, especially during the beam commissioning phase.

### ACKNOWLEDGEMENTS

The author would like to stress that the presentation herein covers the work of a large number of people. Presumably, he has not been able to do justice to all the excellent work done on the development and construction of the LHC machine and experiment protection systems, and apologises for the inevitable omissions. Furthermore, he would like to address special thanks to Mario Deile, Antonello Di Mauro, Richard Hall-Wilton, Richard Jacobsson, Daniela Macina, Siegfried Wenig and Jörg Wenninger for their help in preparing the presentation for the TWEPP 2009 Workshop.

### **REFERENCES**

- [1] O. Brüning, P. Collier, P. Lebrun, S. Myers, R. Ostojic, J. Poole and P. Proudlock editors (Geneva, CERN, 2004), LHC Design Report v.1: the LHC Main Ring, CERN-2004-003-V-1a.
- [2] R. Schmidt *et al.*, Protection of the CERN Large Hadron Collider, *New J. of Phys.* **8** (2006) 290.
- [3] B. Todd, A Beam Interlock System for CERN High Energy Accelerators, Ph.D. thesis, Brunel University, West London, and CERN-THESIS-2007-019, Geneva, Switzerland, October 2006.
- [4] W. W. Armstrong *et al.* (The ATLAS Collaboration), ATLAS: Technical Proposal for a General-Purpose pp Experiment at the Large Hadron Collider at CERN, CERN-LHCC-94-43 (1994), http://cdsweb.cern.ch/record/290968.
- [5] O. Adriani *et al.* (The LHCf Collaboration), LHCf Experiment: Technical Design Report, CERN-LHCC-2006-004 (2006), http://cdsweb.cern.ch/record/926196.
- [6] M. Della Negra, L. Foà, A. Hervé, A. Petrilli *et al.* (The CMS Collaboration), CMS Physics: Technical Design Report. Vol. 1: Detector Performance and Software, CERN-LHCC-2006-001 (2006), http://cdsweb.cern.ch/record/922757.
- [7] V. Berardi *et al.* (The TOTEM Collaboration), Total Cross-Section, Elastic Scattering and Diffractive Dissociation at the Large Hadron Collider at CERN: TOTEM Technical Design Report, CERN-LHCC-2004-002 (2004), http://cdsweb.cern.ch/record/704349.
- [8] F. Carminati *et al.* (The ALICE collaboration), ALICE: physics performance report, volume I, *J. Phys. G* **30** (2004) 1517; B. Alessandro *et al.* (The ALICE collaboration), ALICE: physics performance report, volume II, *J. Phys. G* **32** (2006) 1295.
- [9] S. Amato *et al.* (The LHCb Collaboration), LHCb Technical Proposal: a Large Hadron Collider Beauty Experiment for Precision Measurements of CP Violation and Rare Decays, CERN-LHCC-98-004 (1998), http://cdsweb.cern.ch/record/622031.
- [10] R. Assmann *et al.*, The Final Collimation System for the LHC, 10th European Particle Accelerator Conf. EPAC 2006 (Edinburgh, 26-30 June 2006), CERN-LHC-PROJECT-Report-919.
- [11] R. Filippini *et al.*, Reliability Assessment of the LHC Machine Protection System, in Proc. of the 2005 Particle Accelerator Conference, Knoxville, Tennessee, p. 1257; R. Filippini *et al.*, Reliability Analysis of the LHC Beam Dumping System, in Proc. of the 2005 Particle Accelerator Conference, Knoxville, Tennessee, p. 1201; A.V. Fernández, F. Rodríguez-Mateos, Reliability of the quench protection system for the LHC superconducting elements, *Nucl. Instrum. Methods Phys. Res. A* **525** (2004) 439.
- [12] R.B. Appleby, LHC circulating beam accidents for near-beam detectors, LHC Project Report 1176.
- [13] R.B. Appleby, LHCb Injected Beam Accidents, LHC Project Report 1174.
- [14] R.B. Appleby, ALICE Injected Beam Accidents, LHC Project Report 1175.
- [15] V. Kain, What is required to safely fill the LHC, in 3rd LHC Project Workshop, Divonne-les-Bains, France, Jan 2006, CERN-AB-2006-014, J. Poole editor, http://cdsweb.cern.ch/record/939516.
- [16] B. Puccio, R. Schmidt, J. Wenninger, Beam Interlocking Strategy between the LHC and its Injector, in Proc. of the 10th ICALEPCS Int. Conf. on Accelerator & Large Expt. Physics Control Systems, Geneva, 10 - 14 Oct 2005, http://epaper.kek.jp/ica05/INDEX.HTM.
- [17] A.I. Drozhdin, N.V. Mokhov, M. Huhtinen, Impact of the LHC beam abort kicker prefire on high luminosity insertion and CMS detector performance, in Proc. of the 18th Biennial Particle Accelerator Conference, New York (1999), p. 1231.
- [18] B. Goddard, R.W. Assmann, E. Carlier, J. Uythoven, J. Wenninger, W. Weterings, Protection of the LHC against Unsynchronised Beam Aborts, in Proc. of the European Particle Accelerator Conference, Edinburgh, Scotland, UK, 26 - 30 Jun 2006 (LHC-PROJECT-Report-916).
- [19] Report on final simulation results for the impact of LHC extraction kicker pre-fire is in preparation (B. Goddard, private communication).
- [20] M. Deile, private communication.
- [21] A. Di Mauro, private communication, and H. Schindler, Protecting the ALICE experiment from beam failures, Master Thesis, Technische Universität, Wien, to be published.
- [22] S. Wenig, private communication; M. Mikuž *et al.*, Diamond Pad detector telescope for beam conditions and luminosity monitoring in ATLAS, Nucl. Instrum. Methods Phys. Res. A 579 (2007) 788; A. Gorišek *et al.*, ATLAS diamond Beam Condition Monitor, Nucl. Instrum. Methods Phys. Res. A 572 (2007) 67; H. Pernegger, H. Frais-Kölbl, E. Griesmayer and H. Kagan, Design and test of a high-speed single-particle beam monitor, Nucl. Instrum. Methods Phys. Res. A 535 (2004) 108.
- [23] A. Macpherson, private communication; L. Fernandez-Hernando *et al.*, Development of a CVD diamond Beam Condition Monitor for CMS at the Large Hadron Collider, Nucl. Instrum. Methods Phys. Res. A 552 (2005) 183.
- [24] C.J. Ilgner, private communication, and C.J. Ilgner, Implementation of a Diamond-Beam-Conditions Monitor into the LHCb Experiment at CERN, Nuclear Science Symposium Conference Record, 2007, IEEE, p. 1700.
- [25] A.J. Edwards *et al.*, Radiation monitoring with CVD diamonds in BABAR, Nucl. Instrum. Methods Phys. Res. A 552 (2005) 176.
- [26] R. Eusebi et al., A Diamond-Based Beam Condition Monitor for the CDF Experiment, Nuclear Science Symposium Conference Record, 2006, IEEE Vol. 2, p. 709.
- [27] R.S. Wallny, Status of diamond detectors and their high energy physics application, Nucl. Instrum. Methods Phys. Res. A 582 (2007) 824.
- [28] See e.g. B.D. Girolamo, Luminosity measurement at ATLAS with Roman Pots and scintillating fibre detectors, Nucl. Instrum. Methods Phys. Res. A 581 (2007) 526.
- [29] See e.g. M.G. van Beuzekom, The LHCb Vertex Locator: Present and future, Nucl. Instrum. Methods Phys. Res. A 579 (2007) 742.
- [30] P. Giubellino, A. Morsch, L. Leistam, B. Pastirck, Radiation in ALICE from a misinjected beam to LHC, CERN-ALICE-INT-2001-03.
- [31] D. Bocian, Accidental Beam Losses during Injection in the Interaction Region IR1, CERN-LHC-Project-Note-335.
- [32] V. Kain, K. Vorderwinkler, J. Ramillon, R. Schmidt, J. Wenninger, Material damage test with 450 GeV LHC-type beam, in Proc. of the 2005 Particle Accelerator Conference, Knoxville, Tennessee, p. 1607.
- [33] A. Andreazza, K. Einsweiler, C. Gemme, L. Rossi, P. Sicho, Effect of accidental beam losses on the ATLAS pixel detector, Nucl. Instrum. Methods Phys. Res. A 565 (2006) 50.
- [34] M. Fahrer, G. Dirkes, F. Hartmann, S. Heier, A. Macpherson, Th. Müller, Th. Weiler, Beam-loss-induced electrical stress test on CMS Silicon Strip Modules, Nucl. Instrum. Methods Phys. Res. A 518 (2004) 328.
- [35] K. Hara, T. Kuwano, G. Moorhead, Y. Ikegami, T. Kohriki, S. Terada and Y. Unno, Beam splash effects on ATLAS silicon microstrip detectors evaluated using 1-w Nd:YAG laser, Nucl. Instrum. Methods Phys. Res. A 541 (2005) 15.
- [36] L. Eklund (for the LHCb Collaboration), LHC Operations - Beam Incidents, presented at the 2009 VERTEX Workshop (13-18 September 2009), Amsterdam, The Netherlands (proceedings to be published soon).
- [37] For CDF, see e.g. J. Spalding, TEVATRON commissioning and interaction with experiments, in third meeting of the TEV4LHC Workshop, CERN, April 28-30 2005, http://mlm.home.cern.ch/mlm/TEV4LHC.html and http://indico.cern.ch/conferenceDisplay.py?confId=a052004; for LEP, see e.g. J. Rothberg, Limitations due to Backgrounds at LEP1, in Proc. of the 5th Workshop on LEP Performance, Chamonix, January 1995, CERN-SL-95-08-DI, J. Poole editor, http://cdsweb.cern.ch/record/277821, or also Chapter XIII / section 5 of The ALEPH handbook: 1995, vol. 2, C. Bowdery editor.

# *TUESDAY 22 SEPTEMBER 2009*

# *PARALLEL SESSION A1 ASICS*

# A ten thousand frames per second readout MAPS for the EUDET beam telescope

C. Hu-Guo<sup>a\*</sup>, J. Baudot<sup>a</sup>, G. Bertolone<sup>a</sup>, A. Besson<sup>a</sup>, A. S. Brogna<sup>a</sup>, C. Colledani<sup>a</sup>, G. Claus<sup>a</sup>, R. De Masi<sup>a</sup>, Y. Degerli<sup>b</sup>, A. Dorokhov<sup>a</sup>, G. Doziere<sup>a</sup>, W. Dulinski<sup>a</sup>, X. Fang<sup>a</sup>, M. Gelin<sup>b</sup>, M. Goffe<sup>a</sup>, F. Guilloux<sup>b</sup>, A. Himmi<sup>a</sup>, K. Jaaskelainen<sup>a</sup>, M. Koziel<sup>a</sup>, F. Morel<sup>a</sup>, F. Orsini<sup>b</sup>, G. Santos<sup>a</sup>, M. Specht<sup>a</sup>, Q. Sun<sup>a</sup>, O. Torheim<sup>a</sup>, I. Valin<sup>a</sup>, Y. Voutsinas<sup>a</sup>, M. Winter<sup>a</sup>

<sup>a</sup> IPHC, University of Strasbourg, CNRS/IN2P3, 23 rue du loess, BP 28, 67037 Strasbourg, France

<sup>b</sup> IRFU - SEDI, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France

\*Corresponding author: Christine.Hu@ires.in2p3.fr

#### *Abstract*

Designed and manufactured in a commercial CMOS 0.35 μm OPTO process to equip the EUDET beam telescope, MIMOSA26 is the first reticle-size pixel sensor with digital output and integrated zero suppression. It features a matrix of pixels with 576 rows and 1152 columns, covering an active area of $\sim$224 mm<sup>2</sup>. A single point resolution of about 4 μm was obtained with a pixel pitch of 18.4 μm. Its architecture allows a fast readout frequency of ~10 k frames/s. The paper describes the chip design, the tests and the main characterisation results.

## I. INTRODUCTION

EUDET is a project supported by the European Union within the 6th Framework Programme structuring the European Research Area. Its aim is to support detector R&D in Europe for the next large particle physics project, the International Linear Collider (ILC).

Within the EUDET collaboration, a high resolution beam telescope [1] is being developed. It consists of 2 arms of 3 measurement planes (Fig. 1). The planes are equipped with MAPS (Monolithic Active Pixel Sensors), which provide excellent tracking performance [2]. The beam telescope delivers an impact position resolution of $\sim$2 $\mu$m on the surface of the Device Under Test (DUT). It will be operated at the DESY II 6 GeV electron beam facility, as initially planned, as well as at the CERN-SPS 120 GeV/c pion beam facility.



Figure 1: Schematic of the pixel telescope layout

In order to accelerate its commissioning, the construction of the telescope was organised in two stages. In the first stage, a demonstrator telescope, exploiting existing CMOS MAPS sensors with analogue readout (MIMOSA17), was built. It has been operating successfully since 2007 [3]. In 2009, the final telescope is being equipped with the sensors presented in this paper: MIMOSA26. It provides an active surface exceeding 2 cm², which is 4 times larger than that of MIMOSA17. The readout time ($\sim 100\,\mu$s) is about an order of magnitude shorter [3] than with the previous sensor.

The design of the architecture of MIMOSA26 optimised for the EUDET beam telescope is discussed in this paper, followed by the preliminary test results obtained in the laboratory and with particle beams.

#### II. MIMOSA26 ARCHITECTURE

MIMOSA26 is a full scale sensor, designed in 2008 and fabricated at the beginning of 2009, in a CMOS 0.35 µm OPTO technology. It combines the architectures of MIMOSA22 and SUZE-01, already validated by two separate prototyping lines [4]. MIMOSA22 [5, 6] is composed of 128 columns of 576 pixels, each column being terminated by a discriminator [7]. The pixel contains a pre-amplifier and Correlated Double Sampling (CDS) circuitry. The matrix is read out in rolling shutter mode. SUZE-01 [8], a reduced scale prototype chip, incorporates the zero suppression logic, the memory buffers and the serial transmission. The measured Temporal Noise (TN) of MIMOSA22 (the pixel array associated with the discriminators) is about 0.6-0.7 mV, corresponding to $\sim$12 e<sup>-</sup>, while the Fixed Pattern Noise (FPN) is $\sim 0.3$ mV, corresponding to $\sim 6$ e<sup>-</sup>. The detection efficiency is close to 100% up to a discriminator threshold of $\sim$6 times the noise standard deviation (6N), with a fake hit rate below $10^{-4}$ and a spatial resolution better than 4 $\mu$m [6]. This performance was obtained with the 120 GeV/c pion beam at the CERN-SPS.

Figure 2 shows the block diagram of MIMOSA26. It features 1152 columns of 576 pixels with a pixel pitch of 18.4 $\mu$m, covering an active area of $\sim$224 mm<sup>2</sup>.
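The quoted matrix dimensions can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch based on the column count, row count and pixel pitch stated above; the function and key names are not taken from any MIMOSA26 software.

```python
COLUMNS = 1152      # pixel columns, from the text
ROWS = 576          # pixel rows, from the text
PITCH_UM = 18.4     # pixel pitch in micrometres, from the text


def sensor_geometry(columns=COLUMNS, rows=ROWS, pitch_um=PITCH_UM):
    """Return pixel count and active-area dimensions of a square-pitch matrix."""
    width_mm = columns * pitch_um / 1000.0
    height_mm = rows * pitch_um / 1000.0
    return {
        "pixels": columns * rows,
        "width_mm": width_mm,
        "height_mm": height_mm,
        "area_mm2": width_mm * height_mm,
    }


geometry = sensor_geometry()
print(geometry["pixels"])                 # 663552
print(round(geometry["area_mm2"], 1))     # 224.7
```

The matrix therefore holds about 660,000 pixels on roughly 21.2 mm × 10.6 mm, in line with the active area of about 224 mm<sup>2</sup> given in the text.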

The voltage signal induced by the charges collected through an Nwell/P-epi diode is amplified in each pixel by a preamplification stage [5]. The signal from two successive frames is extracted by the clamping technique (in-pixel CDS [7]). In the rolling shutter read-out mode, the 1152 pixel signals of the selected row are simultaneously transmitted to the bottom of the pixel array where 1152 column-level, offset compensated discriminators perform the analogue-to-digital conversion. A second double sampling, implemented in each discriminator stage, removes pixel to pixel offsets introduced by each in-pixel buffer [7]. In principle, that allows using a common threshold for all discriminators. However, in order to minimise the charge injection coming from the thousands of switches of discriminators during the calibration and the readout phases, the 1152 discriminators are sub-divided into 4 groups of 288 discriminators. Each group has its own threshold provided by a separate bias DAC.



The discriminator outputs are connected to a zero suppression circuitry [8], organised in pipeline mode, which scans the sparse data of the current row. It skips non-hit pixels and identifies strings of contiguous hit pixels (pixels with signals above the threshold). The length and the address of the first pixel of each string are stored successively in two SRAMs, allowing a continuous read-out. A data compression factor ranging from 10 to 1000, depending on the number of hits per frame, can be obtained. The sparsified data of a frame are then sent out during the acquisition of the next frame via one or two 80 Mbit/s LVDS transmitters.
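The string encoding described above can be illustrated in software. The sketch below is not the on-chip logic, which is pipelined hardware whose limits on string length and strings per row are not described here; it only shows, under those simplifying assumptions, how a row of discriminator outputs reduces to (address, length) pairs. The function name `find_strings` is hypothetical.

```python
def find_strings(row_bits):
    """Return (start_address, length) for each run of contiguous hit pixels."""
    strings = []
    start = None                      # address of the first pixel of the open string
    for address, hit in enumerate(row_bits):
        if hit and start is None:
            start = address           # a new string begins
        elif not hit and start is not None:
            strings.append((start, address - start))
            start = None              # the string ended on the previous pixel
    if start is not None:             # string running up to the end of the row
        strings.append((start, len(row_bits) - start))
    return strings


row = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
print(find_strings(row))  # [(2, 3), (8, 1), (10, 2)]
```

An empty row produces no output at all, which is where the large compression factors at low occupancy come from.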

An optional PLL module, allowing the generation of a high frequency clock from a low frequency reference input clock, and an 8b/10b encoder for high speed data transmission with clock recovery, are implemented in the design. The on-chip programmable biases, voltage references and the selection of the test mode are set via a JTAG controller.

The possibility to test each block (pixels, discriminators, zero suppression circuit and data transmission) is an important aspect of the chip: DfT (design for testability). This capability is implemented in the MIMOSA26 design.

## III. TEST & EVALUATION IN LABORATORY

MIMOSA26 sensors were tested extensively in the laboratory. The tests were first performed with the analogue part from pixel outputs in order to check the responses of all pixels. Next, the digital outputs were tested, in 4 different configurations:

- 1152 discriminators alone (isolated from the pixel array)
- all discriminators connected to the pixel array
- zero-suppression circuitry alone
- full chain including the pixel array, the discriminators and the zero-suppression logic.

## *A. Tests of the analogue part of the pixel array*

The analogue response was studied on 8 different sensors in order to evaluate the pixel noise, the charge collection efficiency and the uniformity of the response over the sensitive area. All sensors exhibited very similar performances.

The result of the pixel noise measurements is illustrated by Figure 3, which displays the noise level of all pixels composing one of the sensors. One can see that the noise is uniformly distributed and that there are no dead pixels. The average noise value amounts to $\lesssim 14$ e<sup>-</sup> ENC at a read-out clock frequency of 80 MHz.



Figure 3: Distribution of the pixel noise of MIMOSA-26 at the nominal frequency (80 MHz).

The charge collection efficiency (CCE) was investigated by illuminating the sensors with a <sup>55</sup>Fe source. The CCE was derived from the reconstructed clusters generated by the 5.9 and 6.49 keV X-rays. The measured values are shown in Table 1, where they are compared to the CCE values observed with MIMOSA22. The latter are well reproduced with MIMOSA26, which validates the extension of the MIMOSA22 pixel design to full scale.

| Cluster size | seed | 2x2 | 3x3 | 5x5 |
|--------------|------|-----|-----|-----|
| MIMOSA26 | 22 % | [illegible] | [illegible] | [illegible] |
| MIMOSA22 | [illegible] | [illegible] | [illegible] | 86 % |

Table 1: MIMOSA26 CCE measurements compared to those of MIMOSA22.

## *B. Tests of the digital part*

The behaviour of the discriminators isolated from the pixel array was studied on 15 unthinned and 6 thinned ($\sim 120\ \mu$m) sensors. The noise performance was estimated for each discriminator group separately. The measurement consisted of recording the response of the discriminators to a fixed voltage while progressively raising their threshold.

The outcome of the study is illustrated in Figure 4, which displays the response of a group of 288 discriminators as a function of the threshold value (the S-curve). The slope of the transition and its dispersion were interpreted in terms of temporal noise (TN) and fixed pattern noise (FPN). The TN is $\sim 0.4$ mV and the FPN is $\sim 0.2$ mV. These results reproduce well the observations made with MIMOSA22 [6], and show that all discriminators are fully operational at the nominal readout frequency.
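The S-curve interpretation can be sketched with simulated data. The example below assumes Gaussian temporal noise, so that each discriminator's firing probability follows a complementary error function of the threshold; the width of each transition is taken as the TN and the spread of the transition midpoints across the group as the FPN. This mapping is a common convention and is assumed here, since the text does not detail the analysis procedure. All function names, the threshold grid and the random seed are illustrative.

```python
import math
import random
import statistics


def s_curve(threshold_mv, offset_mv, noise_mv):
    """Firing probability of one discriminator, assuming Gaussian temporal noise."""
    return 0.5 * math.erfc((threshold_mv - offset_mv) / (math.sqrt(2.0) * noise_mv))


def crossing(thresholds, responses, level):
    """Threshold at which a falling S-curve crosses `level` (linear interpolation)."""
    for i in range(len(thresholds) - 1):
        r0, r1 = responses[i], responses[i + 1]
        if r0 >= level > r1:
            t0, t1 = thresholds[i], thresholds[i + 1]
            return t0 + (r0 - level) * (t1 - t0) / (r0 - r1)
    raise ValueError("S-curve does not cross the requested level")


def extract_noise(thresholds, curves):
    """Return (TN, FPN) in mV from a list of S-curves, one per discriminator."""
    midpoints, widths = [], []
    for responses in curves:
        midpoints.append(crossing(thresholds, responses, 0.5))
        upper = crossing(thresholds, responses, 0.1587)  # midpoint + 1 sigma
        lower = crossing(thresholds, responses, 0.8413)  # midpoint - 1 sigma
        widths.append((upper - lower) / 2.0)
    return statistics.mean(widths), statistics.pstdev(midpoints)


def simulate_group(n=288, tn_mv=0.4, fpn_mv=0.2, seed=1):
    """Ideal S-curves for a group of discriminators with random offsets."""
    rng = random.Random(seed)
    thresholds = [-3.0 + 0.05 * i for i in range(121)]   # -3 mV to +3 mV
    offsets = [rng.gauss(0.0, fpn_mv) for _ in range(n)]
    curves = [[s_curve(t, off, tn_mv) for t in thresholds] for off in offsets]
    return thresholds, curves


thresholds, curves = simulate_group()
tn, fpn = extract_noise(thresholds, curves)
print(f"TN = {tn:.2f} mV, FPN = {fpn:.2f} mV")
```

With the input values of 0.4 mV and 0.2 mV, the extraction recovers the TN closely and the FPN within the statistical spread expected for 288 samples.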



Figure 4: Response of a group of 288 isolated discriminators.

In the next step, the discriminators were connected to the pixel array. The chip response was assessed at 80 MHz (112.5 µs frame read-out time) with the 15+6 sensors mentioned earlier. Four sensors were also studied at a read-out clock frequency of 20 MHz. The noise measurements performed with isolated discriminators were repeated with each group of 288 connected discriminators. The values observed are shown for one group in Figure 5.



Figure 5: Response of a group of 288 discriminators connected to the pixel array.

The total TN amounts to  $\sim 0.6$ -0.7 mV, which is basically the value of the pixel TN. The total FPN amounts to  $\sim 0.3$ -0.4 mV, which is dominated by the FPN of the 1152 discriminators. These values remain nearly constant when varying the read-out clock frequency from 80 to 20 MHz. The conclusion of the tests at this stage is that the complete array reproduces the performances extrapolated from MIMOSA22 [6].

Next, the zero-suppression logic, disconnected from the rest of the chip, was investigated. Various patterns were emulated with a pattern generator and run through the logic millions of times without any error up to frequencies of 115 MHz (i.e. 1.4 times the nominal frequency). All critical configurations, e.g. with strings overlapping two contiguous blocks, were checked repeatedly and found to be treated properly.

Finally, the signal processing of the complete chain, ranging from the pixel array to the output of the data transmission, was characterised on several different sensors. Their outputs were studied in the absence of any radiation source in order to evaluate the ratio of noisy pixels to all pixels, corresponding to the fake hit rate. The ratio of fake hits is a function of the discriminator threshold. Table 2 summarises the results. One observes that discriminator threshold values ranging from 5 to 5.5 times the noise value keep the fake hit rate at a level of $10^{-4}$ (i.e. $< 70$ pixels per frame). This result remains essentially unchanged when varying the operation temperature from $+20^{\circ}$C to $+40^{\circ}$C. It was also checked that multi-hit frames translate into the right output memory patterns.

| Discriminator threshold | 4 N | 5 N | 5.5 N | 6 N | 8 N | 10 N |
|-------------------------|-----|-----|-------|-----|-----|------|
| Fraction of pixels above threshold, $N_{pix>Vth}$ ($10^{-4}$) | $\leq 8$ | $\sim 1.5$ | $\sim 1$ | $\sim 0.5$ | 0.1 | 0.03 |

Table 2: Fake hit rate of a MIMOSA-26 sensor measured as a function of the discriminator thresholds.
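To relate the fake hit rates of Table 2 to an absolute number of pixels per frame, the rate is simply multiplied by the matrix size. The following sketch uses the matrix dimensions given in the paper and the Table 2 values (in units of $10^{-4}$, with the approximate and upper-limit entries taken at face value); the names are illustrative.

```python
N_PIXELS = 1152 * 576   # columns x rows of the MIMOSA26 matrix

# Fake hit rate in units of 1e-4 versus threshold in units of the noise N (Table 2).
FAKE_RATE_1E4 = {4.0: 8.0, 5.0: 1.5, 5.5: 1.0, 6.0: 0.5, 8.0: 0.1, 10.0: 0.03}


def fake_pixels_per_frame(rate, n_pixels=N_PIXELS):
    """Average number of noise-induced hits per frame for a given fake hit rate."""
    return rate * n_pixels


for threshold, rate_1e4 in FAKE_RATE_1E4.items():
    n_fake = fake_pixels_per_frame(rate_1e4 * 1e-4)
    print(f"{threshold:>4} N : {n_fake:6.1f} fake pixels per frame")
```

A rate of $10^{-4}$ corresponds to about 66 pixels per frame, consistent with the figure of fewer than 70 quoted in the text.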

Finally, the power consumption of the sensor was measured and found to be ~750 mW for the whole chip. This value agrees well with the simulated one, and corresponds to ~250 mW/cm<sup>2</sup> and to ~640 $\mu$W per column. The latter value reflects a consumption of $\sim$250 µW per pixel and $\sim$350 µW per discriminator.
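The power figures can be cross-checked against each other. The sketch below only rearranges the rounded numbers quoted in the text, so agreement at the level of several per cent is all that should be expected; the function and key names are illustrative.

```python
CHIP_POWER_MW = 750.0   # measured total power, from the text
COLUMNS = 1152          # number of columns, from the text


def power_budget(chip_power_mw=CHIP_POWER_MW, columns=COLUMNS,
                 power_density_mw_per_cm2=250.0,
                 pixel_uw=250.0, discriminator_uw=350.0):
    """Derive per-column power and implied chip area from the quoted figures."""
    return {
        "per_column_uw": chip_power_mw * 1000.0 / columns,
        "implied_area_cm2": chip_power_mw / power_density_mw_per_cm2,
        "pixel_plus_discriminator_uw": pixel_uw + discriminator_uw,
    }


budget = power_budget()
print(round(budget["per_column_uw"]))          # 651
print(budget["implied_area_cm2"])              # 3.0
print(budget["pixel_plus_discriminator_uw"])   # 600.0
```

Dividing the total power by the number of columns gives about 650 µW, close to the quoted ~640 µW. The implied area of about 3 cm<sup>2</sup> is larger than the active area, which suggests that the quoted power density refers to the whole chip; that reading is an inference, not a statement from the text.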

## IV. BEAM TEST RESULTS

From July to October 2009, MIMOSA26 was operated 3 times on particle beams at the CERN-SPS. Parts of these beam periods were devoted to the integration of the sensors in the EUDET beam telescope, where they are intended to equip all planes of the final telescope version. Separate beam tests were performed to evaluate the sensor performance.

The tests started with a set of 3 sensors introduced as Devices Under Test (DUT) in the EUDET telescope demonstrator. The 3 sensors were operated synchronously and the track reconstruction was running smoothly after only a few days of running. The next step of the EUDET programme consisted in replacing all 6 analogue output sensors composing the telescope demonstrator with MIMOSA26 chips. The complete telescope was commissioned in September 2009 with ~120 GeV/c pions at the CERN-SPS.

Six other sensors, some of them thinned to $120\ \mu$m, were combined to build another telescope, which was installed at the CERN-SPS for the sensor assessment. They were operated for about 10 days with ~120 GeV/c pions and their response to the beam particles was studied as a function of the discriminator threshold value.

A discriminator threshold scan was performed, similar to those performed in the laboratory, in order to derive the value of the total noise. The TN was found to be  $\sim 0.6$ -0.7 mV and the FPN was observed to be  $\sim 0.3{\text -}0.4$  mV. These values reproduce well those observed in the laboratory.

Next, the rate of fake hits was determined (at room temperature and at 80 MHz). Table 3 summarises the results for two different sensors, illustrating the spread of the responses between chips. One observes that a threshold slightly above 5 times the noise value keeps the fake hit rate of the order of $10^{-4}$ or below.

| Discriminator threshold | 5N | 6N | 7N | 8N | 10N | 12N |
|-------------------------|----|----|----|----|-----|-----|
| Fake rate of chip Nr. 24 ($10^{-4}$) | 1.6 | 0.6 | 0.24 | 0.095 | 0.026 | 0.017 |
| Fake rate of chip Nr. 1 ($10^{-4}$) | 3.3 | 1.2 | - | 0.23 | 0.054 | |

Table 3: Values of the average fake hit rate due to pixel noise fluctuations as a function of the discriminator threshold.

The characteristics of the noise of the pixel array were studied in some detail in order to evaluate its impact on the occupancy of the zero-suppression logic. Figures 6 and 7 display the number of times each pixel exhibits a noise fluctuation above a threshold of 6 times the average noise (6N), based on a test of ~40,000 frames without beam.

Figure 6 also shows the distribution of the number of pixels per frame with noise fluctuations above this threshold.



Figure 6: Number of pixels per frame with a noise fluctuation passing a discriminator threshold of 6N.



One observes that the average number of fired pixels per frame is about 40. Compared to the total number of pixels composing the sensor ($\sim 660,000$), this corresponds to a rate of $\sim 0.6 \times 10^{-4}$. The noise fluctuations above the threshold follow a Gaussian (more precisely a Poisson) distribution, with a standard deviation equal to the square root of the mean value.
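The occupancy figures above follow from two short calculations: the rate is the mean number of fired pixels divided by the matrix size, and for a Poisson distribution the standard deviation is the square root of the mean. A minimal sketch, with illustrative names and the matrix size taken from the paper:

```python
import math

N_PIXELS = 1152 * 576   # columns x rows of the MIMOSA26 matrix


def noise_occupancy(mean_fired_per_frame, n_pixels=N_PIXELS):
    """Return (fake hit rate, Poisson standard deviation of fired pixels per frame)."""
    rate = mean_fired_per_frame / n_pixels
    sigma = math.sqrt(mean_fired_per_frame)
    return rate, sigma


rate, sigma = noise_occupancy(40)
print(f"rate = {rate:.2e}, sigma = {sigma:.1f} pixels")  # rate = 6.03e-05, sigma = 6.3 pixels
```

With a mean of 40 the Poisson distribution is already close to a Gaussian of width 6.3, which is why the two descriptions are nearly interchangeable here.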

Figure 7 shows whether the noise fluctuations are concentrated in a few pixels firing frequently or distributed among a large number of pixels firing from time to time. One observes that a relatively modest fraction of the pixels generates most of the fake hits. For instance, 0.2 % of the pixels fire at least once every 100 frames due to their noise fluctuations. More statistics need to be accumulated in order to evaluate how much these values vary from one sensor to another.

The detection efficiency was evaluated for different threshold values and on different sensors, as well as the cluster multiplicity distribution and the single point resolution. The events collected were triggered with a 7×7 mm² scintillator slab. Good quality tracks were reconstructed through the telescope for ~80 % of the triggers. Figure 8 represents the distribution of the particles' impacts in each of the 6 MIMOSA26 sensors, which gives an image of the beam spot. The correlation between impacts in different planes is clearly visible.



Figure 8: Beam spot derived from about $10^4$ beam particle tracks reconstructed through the telescope (6 planes of MIMOSA26).

A detection efficiency of $\sim 99.5 \pm 0.1\%$ was obtained for a fake rate of $\sim 10^{-4}$ (Fig. 9). This very satisfactory performance is, however, slightly below the one observed with MIMOSA22. Besides the preliminary nature of the analysis, which may be partly at the origin of the difference, the latter is also suspected to follow from the large number (1152) of discriminators integrated in MIMOSA26, translating into threshold dispersions which also limit the sensor performance. Solutions to this limitation exist and will be implemented in the next full scale sensor, to be fabricated in Spring 2010 for the STAR vertex detector.

Figure 9: Variation of the detection efficiency with the fake hit rate.

The threshold dependence of the cluster multiplicity was also evaluated. Figure 10 shows the cluster multiplicity for 3 different threshold values.



Figure 10: Cluster multiplicity for different threshold values.

Figure 11 summarises the variation of the resolution with the discriminator threshold. Its value varies around 4.5 µm, which exceeds the values observed with MIMOSA22/-22bis by $\gtrsim 0.5$ µm. This feature is not consistent with the observed cluster characteristics of MIMOSA26, which can be considered identical to those of MIMOSA22/-22bis. The investigation of this inconsistency is still ongoing.



Figure 11: Variation of the resolution with the discriminator threshold value.

## V. CONCLUSION & PERSPECTIVES

MIMOSA26 is the first reticle size, fast readout MAPS with integrated on-chip data sparsification, developed for the EUDET beam telescope. The assessment of MIMOSA26 is not yet completed, but the preliminary conclusion is that its architecture provides the tracking capability needed for this telescope.

The fast readout architecture of MIMOSA26 will serve as the baseline architecture for the vertex detectors of several experiments, such as the STAR Heavy Flavor Tracker (HFT) upgrade. It will also be extended to the CBM Micro Vertex Detector (MVD) (SIS-100) and is proposed for the ILC vertex detector.

#### VI. REFERENCES

- [1] T. Haas, "A pixel telescope for detector R&D for an international linear collider", *Nucl. Instrum. Meth. A,* vol. 569, No. 1, Dec. 2006, pp. 53-56
- [2] M. Winter et al., "Vertexing based on high precision, thin CMOS sensors", Proc. of the 8th ICATPP, Como, Italy, October 2003
- [3] M. Winter, "Towards the final EUDET telescope", talk given at the EUDET Annual Meeting, 6th Oct. 2008, Amsterdam
- [4] Ch. Hu-Guo *et al.*, "CMOS pixel sensor development: a fast read-out architecture with integrated zero suppression", *Journal of Instrumentation (JINST)* 4 (2009) P04012
- [5] A. Dorokhov *et al.*, "Optimization of amplifiers for monolithic active pixel sensors for the STAR detector", *2008 IEEE Nucl. Sci. Symp. Conf.,* Oct. 2008, Dresden, Germany
- [6] M. Gelin *et al.*, "Intermediate Digital Chip Sensor for the EUDET-JRA1 Beam Telescope", *2008 IEEE Nucl. Sci. Symp. Conf. Record*, October 2008, Dresden, Germany.
- [7] Y. Degerli, "Design of fundamental building blocks for fast binary readout CMOS sensors used in high-energy physics experiments", *Nucl. Instr. and Meth. A 602* (2009) 461-466
- [8] A. Himmi *et al.*, "A Zero Suppression Micro-Circuit for Binary Readout CMOS Monolithic Sensors", these proceedings

# Front End Electronics for Pixel Detector of the PANDA MVD

# D. Calvo<sup>a</sup>, P. De Remigis<sup>a</sup>, T. Kugathasan<sup>a,b</sup>, G. Mazza<sup>a</sup>, M. Mignone<sup>a</sup>, A. Rivetti<sup>a</sup>, and R. Wheadon<sup>a</sup>

<sup>a</sup> INFN, Sezione di Torino, 10125 Torino, Italy

<sup>b</sup> Università di Torino, Dip. di Fisica Sperimentale, 10125 Torino, Italy

kugathas@to.infn.it

## *Abstract*

ToPix 2.0 is a prototype, in a 0.13 $\mu$m CMOS technology, of the front-end chip for the hybrid pixel sensors that will equip the Micro-Vertex Detector of the PANDA experiment at GSI. The Time over Threshold (ToT) approach has been employed to provide a high charge dynamic range (up to 100 fC) with a low power dissipation (15 $\mu$W/cell). In an area of $100\ \mu m \times 100\ \mu m$, each cell incorporates the analog and digital electronics necessary to amplify the detector signal and to digitize the time and charge information. The ASIC includes 320 pixel readout cells organized in four columns and a simplified version of the end of column readout.

## I. INTRODUCTION

PANDA [1] will be one of the main experiments at FAIR, the future facility for antiproton and ion research under construction at Darmstadt, Germany.

PANDA will exploit antiproton-proton and antiproton-nucleus reactions for precise QCD studies. Its physics program includes the spectroscopy of charmonium states and the investigation of open charm production, the search for glueballs and hybrids, the study of the behavior of hadrons in nuclear matter, and precise $\gamma$ ray spectroscopy of hypernuclei. The PANDA experiment will be located at the High Energy Storage Ring (HESR), which will provide a high quality antiproton beam (see Table 1).





PANDA is a fixed target experiment. The experimental apparatus is divided into two parts: the target spectrometer, which surrounds the interaction point, and the forward spectrometer, which covers the angular region below $10^\circ$.

The Micro Vertex Detector (MVD) [2] is located in the innermost part of the experimental apparatus and will consist of silicon pixel and silicon strip detectors to obtain precise tracking of all charged particles. Since the MVD tracks a high number of low momentum particles [3], it is possible to achieve particle identification through the measurement of the energy loss per unit path-length (dE/dx). Physics simulations show that an accurate measurement of an energy loss up to 2.3 MeV allows the separation of different particle species (protons, kaons, pions/muons/electrons).

Figure 1 shows the present design of the MVD:

- Four barrels
  - Two inner layers: hybrid pixel
  - Two outer layers: double sided strip
- Six forward disks
  - First four disks: hybrid pixel
  - Last two disks: pixel + strip

The MVD requires 11M pixel readout channels covering $0.14\ \mathrm{m^2}$, and 70k strip readout channels covering $0.5\ \mathrm{m^2}$. The custom solution for the readout of the pixel detector is motivated by the high track density (up to $12.3\ \mathrm{Mhit/(s \cdot cm^2)}$) and the absence of a trigger signal.



Figure 1: The PANDA MicroVertex Detector (MVD)

## II. PIXEL READOUT CHIP

The specifications for the readout electronics are given by the PANDA radiation environment and the close proximity of the MVD to the interaction point. Table 2 summarises the specifications for the pixel readout cell.

Table 2: Pixel specifications

| Parameter            | Value                            |
|----------------------|----------------------------------|
| Pixel size           | $100\ \mu m \times 100\ \mu m$   |
| Noise level          | $200\ e^-$ rms                   |
| Linear dynamic range | Up to 100 fC                     |
| Power consumption    | $< 20\ \mu W$                    |
| Input polarity       | Selectable                       |
| Leakage compensation | Up to 50 nA                      |

The ASIC has to give simultaneous time stamping and charge measurement. Table 3 shows the ASIC specifications.

Table 3: ASIC specifications

| Parameter           | Value                       |
|---------------------|-----------------------------|
| Trigger             | Self triggering             |
| Active area         | $O(1\ \mathrm{cm^2})$       |
| Data rate           | $O(0.8\ \mathrm{Gbit/s})$   |
| Radiation tolerance | 10 Mrad                     |
| Time resolution     | 6 ns (50 MHz clock)         |

## *A. Time over Threshold Technique*

The pixel readout architecture is based on the Time over Threshold technique [4] [5] which makes possible a low power charge digitization. The value of the injected charge is measured through the time needed to discharge a capacitor with a constant current.

The output voltage of a Charge Sensitive Amplifier is given by:

$$
v_{out}(t) = \frac{Q_{in}(t)}{C_f} = \frac{1}{C_f} \int_0^t \left[ I_{in}(t') - I_{dis}(t') \right] dt'
$$

where $Q_{in}(t)$ is the collected charge at the input node, $C_f$ is the feedback capacitance, $I_{in}(t)$ the injected current and $I_{dis}(t)$ the discharging current.

It is possible to assume the charge injection as instantaneous: $Q_{inj} = \int_0^{\epsilon} I_{in}(t')dt'$.

The discharging current is constant: $\int_0^t I_{dis}(t') dt' = I_{dis}t$. With these assumptions the output voltage can be written as:

$$
v_{out}(t) = \frac{Q_{inj} - I_{dis}t}{C_f}
$$

When  $t = ToT$  the voltage output is 0:

$$
v_{out}(ToT) = \frac{Q_{inj} - I_{dis}ToT}{C_f} = 0
$$

and the linear relationship between the injected charge and the ToT is thus obtained:

$$
ToT = \frac{Q_{inj}}{I_{dis}}
$$

The ToT technique achieves good linearity and excellent resolution even when the preamplifier is saturated, thus allowing a high dynamic range for the charge measurement.
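The linear relation above can be evaluated numerically. The following is a minimal sketch of the ideal formula only, assuming the discharging current (5 nA) and clock frequency (50 MHz) quoted elsewhere in this paper for ToPix 2.0. The function names are hypothetical, and offsets and saturation effects are ignored.

```python
I_DIS = 5e-9   # discharging current in A (value quoted for ToPix 2.0)
F_CLK = 50e6   # clock frequency in Hz (ToPix 2.0 design value)


def tot_seconds(q_inj: float, i_dis: float = I_DIS) -> float:
    """Ideal Time over Threshold, ToT = Q_inj / I_dis."""
    return q_inj / i_dis


def tot_clock_counts(q_inj: float, i_dis: float = I_DIS, f_clk: float = F_CLK) -> int:
    """Ideal ToT expressed in clock periods, rounded to the nearest count."""
    return round(tot_seconds(q_inj, i_dis) * f_clk)


for q_fc in (1, 10, 100):
    q = q_fc * 1e-15
    print(f"{q_fc:>3} fC -> {tot_seconds(q) * 1e6:5.1f} us, {tot_clock_counts(q)} clock counts")
```

For the top of the dynamic range (100 fC) this gives 20 µs, or 1000 clock periods at 50 MHz, which fits within a 12 bit time stamp.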



Figure 2: Output ToT signals

#### *B. Analog Front End*

The analog front end generates a pulse whose width is proportional to the charge injected by the pixel detector. It consists of a Charge Sensitive Amplifier with feedback, a leakage compensation system, and a comparator whose threshold is tunable via a 5 bit DAC.



Figure 3: Analog ReadOut Channel

## *1) Charge sensitive amplifier*

The charge sensitive amplifier is the core component of the ToT stage.

The input stage is a gain enhanced cascode amplifier with capacitive feedback. Its input DC level is fixed by the input transistor current bias.

The output stage is a source follower with selectable polarity in order to maximize the output dynamic range. The output DC level is regulated by the leakage compensation system.



Figure 4: Charge Sensitive Amplifier

#### *2) Feedback circuit*

The feedback circuit generates a constant current to remove the charge deposited on the input node. It consists of a differential stage which receives the output signal of the CSA and the reference voltage at its inputs, and injects the discharging current ($I_{dis} = 5$ nA) at the input node of the CSA. At equilibrium it behaves as an equivalent $8\ \mathrm{M\Omega}$ feedback resistor.

#### *3) Leakage compensation*

The pixel sensor leakage current may be up to 50 nA. If the leakage current is smaller than the designed discharging current (5 nA), it generates a voltage offset which unbalances the differential stage, so that the effective discharging current depends on the leakage current. If the leakage current is larger than the discharging current, the extra current charges the feedback capacitance ($C_f$), quickly saturating the Charge Sensitive Amplifier. In the upper part of the dynamic range the Time over Threshold can be very long, reaching up to $20\ \mu s$ for a 100 fC input. Therefore, a very low cutoff frequency is necessary in order to prevent these long signals from being clipped by the leakage compensation circuit. Such clipping would introduce an undesirable nonlinearity, because the resulting compression curve might depend on the value of the leakage current.



Figure 5: Leakage compensation stage

A compact filtering resistor with very high value is implemented through a PMOS with the gate shorted to the source. The filtering capacitor is implemented through MOS devices.

#### *4) Comparator*

The comparator has a folded cascode input stage with two CMOS inverters.



Figure 6: Comparator with 5 bit DAC

To mitigate the threshold dispersion a local five bit DAC is added in each pixel, to allow a fine tuning of the threshold on a pixel by pixel basis.

The DAC can sink or source current to a low impedance node: 1 bit selects the polarity and the other 4 bits set the current value. The DAC full scale range is set by an external component on the PCB.
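The sign-plus-magnitude behaviour described above can be illustrated as follows. This is a hedged sketch, not the ToPix 2.0 register map: the bit ordering (MSB as polarity bit), the meaning of the polarity bit, the LSB current `i_lsb` and both function names are assumptions made for the example.

```python
def dac_current(code: int, i_lsb: float) -> float:
    """Current delivered by a 5 bit sign-magnitude DAC.

    Assumption: bit 4 selects the polarity (set = sinking, negative sign)
    and bits 3..0 give the magnitude in units of i_lsb.
    """
    if not 0 <= code <= 0b11111:
        raise ValueError("code must fit in 5 bits")
    magnitude = code & 0b01111
    sign = -1 if code & 0b10000 else 1
    return sign * magnitude * i_lsb


def best_trim_code(offset: float, i_lsb: float) -> int:
    """Code whose current best cancels a measured offset (same units as i_lsb)."""
    return min(range(32), key=lambda c: abs(offset + dac_current(c, i_lsb)))


print(dac_current(0b00101, 1.0))          # sourcing 5 LSB
print(dac_current(0b10101, 1.0))          # sinking 5 LSB
print(bin(best_trim_code(3.2, 1.0)))      # code closest to cancelling +3.2 LSB
```

A pixel by pixel tuning procedure would measure each pixel's threshold offset and store the corresponding code in the pixel configuration register.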

#### *C. Saturation and Cross Talk*

When the CSA is not saturated the charge is collected on the feedback capacitance $C_f$. When it is saturated, the extra charge is collected on the input capacitance $C_{in}$: the gain of the CSA drops and the input can no longer be considered a virtual ground.



Figure 7: Saturated and linear mode

Table 4: CSA modes

| Modality          | Non Saturated              | Saturated                                  |
|-------------------|----------------------------|--------------------------------------------|
| Charge collection | on $C_f = 24$ fF           | on $C_{in} \approx 200$ fF                 |
| CSA gain          | $1/C_f = 41.7$ mV/fC       | drops                                      |
| $v_{out}$         | $(Q_{inj}-I_{dis}t)/C_f$   | $v_{s}$                                    |
| $v_{in}$          |                            | $(Q_{inj} - v_s C_f - I_{dis}t)/C_{in}$    |

Due to the inter-pixel capacitance, a voltage signal at the input of one channel induces a spurious signal at the input of the adjacent pixel. When the CSA saturates, a voltage signal is present at the input node and the cross-talk effect is greatly magnified.



Figure 8: Cross talk effect

Figure 8 shows the result of two simulations performed to estimate the cross talk effect with an inter-pixel capacitance of 100 fF. In the first simulation (8.a) the injected charge is 90 fC: the ToT signal saturates the preamplifier (dark line), and a spurious signal is present in the adjacent channel due to the cross talk (light line). In the second simulation (8.b) the injected charge is 5 fC: the ToT signal does not saturate the preamplifier and the cross talk is negligible.

### *D. Digital Readout*

In each pixel the control logic receives the signal from the comparator and, at its rising and falling edges, stores the value on the time stamp bus in the 12 bit leading edge and trailing edge registers, respectively. A 12 bit configuration register is also present. The registers are based on DICE cells [6] in order to be tolerant to Single Event Upsets.
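The ToT can be recovered off-line from the two stored time stamps. The sketch below assumes that the time stamp bus carries a free-running 12 bit binary counter that wraps around at 4096. The text does not state the encoding (the actual bus might be Gray coded, for instance), so this is an illustration only, and the function name is hypothetical.

```python
COUNTER_BITS = 12
COUNTER_MODULUS = 1 << COUNTER_BITS   # 4096 counts before the time stamp wraps


def tot_counts(leading_edge: int, trailing_edge: int) -> int:
    """ToT in clock counts from the two 12 bit time stamps.

    The modulo handles the case where the counter wrapped around
    between the leading and the trailing edge.
    """
    return (trailing_edge - leading_edge) % COUNTER_MODULUS


print(tot_counts(100, 1100))    # no wrap-around
print(tot_counts(4000, 904))    # counter wrapped between the two edges
```

The result is unambiguous only as long as the ToT is shorter than one full counter period.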

The pixels are arranged in columns. Each column has a readout logic based on a fixed priority scheme, which reads the timestamps of the pixel cells and reads or writes the configuration bits.
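A fixed priority scheme can be pictured as a priority encoder that always serves the pending pixel with the highest priority first. The sketch below assumes that the lowest pixel index has the highest priority and that no new hits arrive during the readout. Neither detail is given in the text, and the function name is hypothetical.

```python
def read_column(pending: list[bool]) -> list[int]:
    """Return pixel indices in the order a fixed priority scheme would serve them.

    Assumption: the lowest index has the highest priority.
    """
    busy = list(pending)
    order = []
    while any(busy):
        idx = busy.index(True)   # the highest-priority pending pixel wins
        order.append(idx)
        busy[idx] = False        # pixel is cleared once its time stamps are read
    return order


print(read_column([False, True, False, True, True]))
```

A consequence of such a scheme is that low-priority pixels wait longer when the column occupancy is high.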

## III. ASIC FLOORPLAN

Each readout cell incorporates the analog and digital electronics necessary to amplify the detector signal and to digitize the charge information (Fig. 9). Each cell contains a calibration circuit which, when enabled, makes possible the injection of a test current pulse. Moreover, 16 readout cells have an external connection to the wirebonding pads in order to connect a sensor or inject an external calibration signal.

ToPix 2.0 [7] has 320 pixel readout cells arranged in four columns: two short columns with 32 pixels each and two folded columns with 128 pixels each. In the final version of ToPix the length of each column will be $\approx 11$ mm. In this prototype, folded columns are employed to estimate the effect of the column length on the data transmission. In this way it is possible to implement a long column in a limited area, thus saving on cost. Each column has a simplified readout logic.



Figure 9: Layout of the readout cell

Fig. 10 shows a photo of the chip, whose size is $5\ \mathrm{mm} \times 2\ \mathrm{mm}$.



Figure 10: ToPix 2.0 Photo

The final version of ToPix will consist of a matrix of $116 \times 110$ cells with a pixel size of $100\ \mu m \times 100\ \mu m$, thus covering a $1.28\ \mathrm{cm^2}$ active area.

## *A. Test Results*

#### *1) ToT Linearity*

Figure 11 shows the result of the ToT linearity simulation and of the measurement on two readout pixels (p-type sensor signal). The result of the fit on the simulated values is:

$$
ToT_p = 188\,\frac{\mathrm{ns}}{\mathrm{fC}}\,Q_{inj} + 300\,\mathrm{ns}
$$

The two measurements are compatible with this linear fit.
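In practice such a fit is used in the opposite direction, to convert a measured ToT back into a charge. The following is a minimal sketch that uses only the fit coefficients quoted above. The function names are hypothetical, and a real calibration would use per-pixel coefficients.

```python
SLOPE_NS_PER_FC = 188.0   # slope of the simulated ToT fit quoted in the text
OFFSET_NS = 300.0         # offset of the simulated ToT fit quoted in the text


def tot_from_charge(q_fc: float) -> float:
    """ToT in ns predicted by the linear fit for a charge given in fC."""
    return SLOPE_NS_PER_FC * q_fc + OFFSET_NS


def charge_from_tot(tot_ns: float) -> float:
    """Inverse of the linear fit: charge in fC for a ToT given in ns."""
    return (tot_ns - OFFSET_NS) / SLOPE_NS_PER_FC


print(tot_from_charge(100.0))     # ToT in ns at the top of the dynamic range
print(charge_from_tot(19100.0))   # charge in fC recovered from that ToT
```

The fitted slope of 188 ns/fC is close to the ideal value $1/I_{dis} = 200$ ns/fC expected for a 5 nA discharging current.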



Figure 11: ToT linearity for a P-Type sensor

#### *2) ToT dispersion*

The channel to channel ToT dispersion is $\frac{\Delta ToT}{ToT} \approx 10\%$. The discharging feedback current has only a minor influence on the uniformity between the different channels (Figure 12).



Figure 12: ToT dispersion

This result shows that the current source that biases the feedback stage is not the major contributor to the ToT dispersion. The other blocks that contribute to the ToT dispersion are the differential pair of the feedback stage and the differential pair of the leakage compensation. Monte Carlo simulations have been performed to understand how to improve the ToT uniformity. The critical block is the leakage compensation stage, where mismatch in the input transistors creates an offset that unbalances the feedback circuit, changing the effective discharging current.

#### *3) Threshold dispersion*

The local 5 bit DAC in each pixel for the threshold dispersion mitigation has been tested. Figure 13 shows the dispersion of the threshold values before the correction (light curve) and after the correction (dark curve).



Figure 13: Threshold dispersion

#### *4) First spectra with an epi-sensor*

ToPix 2.0 has been tested with an epitaxial sensor (thickness: $50\ \mu m$, size: $125\ \mu m \times 325\ \mu m$) connected by wire bonding to the external pad of the chip. Figure 14 shows the spectra obtained using a $^{241}$Am source (60 keV $\gamma$ photons).

In this case the signal to noise ratio is limited by the parasitic capacitance due to the external connections: bonding pad, wire bonding and protection diodes.



Figure 14: Epitaxial sensor measurement with a $^{241}$Am source.

## *B. Conclusions*

Tests show good agreement between specifications and measurements. An upgrade of ToPix 2.0 is currently under design. ToPix 2.0 is designed to work with a 50 MHz clock, whereas the clock of the PANDA experiment has recently been fixed at 160 MHz, and the new version has to be compatible with it. In order to keep the same *clock\_cycles* to *injected\_charge* ratio as ToPix 2.0, the discharging current has to be increased proportionally, from 5 nA to 16 nA.
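The scaling of the discharging current follows directly from the ideal relation $ToT = Q_{inj}/I_{dis}$. As a short worked derivation, with $N_{clk}$ denoting the ToT expressed in clock cycles and $f_{clk}$ the clock frequency (symbols introduced here for illustration only):

$$
N_{clk} = f_{clk}\,ToT = \frac{f_{clk}}{I_{dis}}\,Q_{inj}
\quad\Rightarrow\quad
I_{dis}^{new} = I_{dis}\,\frac{f_{clk}^{new}}{f_{clk}} = 5\,\mathrm{nA} \times \frac{160\,\mathrm{MHz}}{50\,\mathrm{MHz}} = 16\,\mathrm{nA}
$$

Keeping $N_{clk}/Q_{inj}$ constant therefore requires $I_{dis}$ to scale linearly with the clock frequency.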

The chip will be designed using a different flavour of the process, which allows a more robust power supply distribution at the expense of an increased pitch for some of the metal layers. Moreover, the sensitivity of the digital logic to SEU has to be decreased. Consequently, the area reserved for the digital part must be increased. To comply with the new, more stringent space requirement, the layout of the analog part has to be partially revised. Since the leakage compensation circuit is the analog block occupying the largest surface, a more compact design of this part is underway.

### **REFERENCES**

- [1] PANDA-Technical Progress Report, February 2005. http://www-panda.gsi.de/archive/TPR2005/panda tp.pdf
- [2] T. Stockmanns [PANDA Collaboration], "The microvertex-detector of the PANDA experiment at Darmstadt", Nucl. Instrum. Meth. A 568 (2006) 294.
- [3] Physics Performance Report for PANDA: "Strong Interaction Studies with Antiprotons" March 2009, http://arxiv.org/abs/0903.3905v1
- [4] I. Kipnis et al., "A time-over-threshold machine: the readout integrated circuit for the Babar silicon vertex tracker", IEEE Transactions on Nuclear Science 44, 3 (1997) 289.
- [5] I. Peric et al., "The FEI3 readout chip for the ATLAS pixel detector", Nucl. Instr. and Meth. A 565 (2006) 178.
- [6] T. Calin, M. Nicolaidis and R. Velazco, "Upset Hardened Memory Design for Submicron CMOS Technology", IEEE Trans. Nucl. Sci., vol. 43, pp. 2874-2878, Dec 1996
- [7] D. Calvo, et al., "The silicon pixel system for the Micro Vertex Detector of the PANDA experiment", Nucl. Instr. and Meth. A (2009), doi:10.1016/j.nima.2009.09.043

# Advanced Pixel Architectures for Scientific Image Sensors

# R. Coath<sup>a</sup>, J. Crooks<sup>a</sup>, A. Godbeer<sup>a</sup>, M. Wilson<sup>a</sup>, R. Turchetta<sup>a</sup> and the SPiDeR Collaboration<sup>b</sup>

<sup>a</sup> Rutherford Appleton Laboratory, Science and Technology Facilities Council, UK

<sup>b</sup> https://heplnm061.pp.rl.ac.uk/display/spider/

rebecca.coath@stfc.ac.uk

## *Abstract*

We present recent developments from two projects targeting advanced pixel architectures for scientific applications. Results are reported from FORTIS, a sensor demonstrating variants on a 4T pixel architecture. The variants include differences in pixel and diode size, the in-pixel source follower transistor size and the capacitance of the readout node to optimise for low noise and sensitivity to small amounts of charge. Results are also reported from TPAC, a complex pixel architecture with ~160 transistors per pixel. Both sensors were manufactured in the 0.18µm INMAPS process, which includes a special deep p-well layer and fabrication on a high resistivity epitaxial layer for improved charge collection efficiency.

#### I. INTRODUCTION

The scientific community often requires advanced image sensors, where the requirements can include high sensitivity, low noise, high charge collection efficiency and a tolerance to radiation. CMOS Monolithic Active Pixel Sensors (MAPS) can achieve these requirements, and have been demonstrated to be suitable for detecting minimum ionising particles (MIPs) [1].

Improvements in the detection capabilities of MAPS devices can be implemented in two ways: via the careful tailoring of the resistivity of the epitaxial layer and the process used, or via advanced pixel architectures. To achieve these requirements, we have been developing a novel process, INMAPS [2], alongside investigating 4T (four transistor) pixels. INMAPS contains a deep p-well layer and the option to fabricate on a high resistivity epitaxial layer for improved charge collection efficiency. The architecture of the 4T pixel can achieve lower noise and a higher conversion gain for increased sensitivity to small amounts of charge compared to the common 3T pixel.

Section II. will discuss the technologies involved in the INMAPS process and the 4T pixel architecture. Section III. will discuss two sensors developed using these technologies, FORTIS (4T Test Image Sensor) and TPAC (Tera-Pixel Active Calorimeter). Section IV. will present results from FORTIS, showing the benefits of these technologies, and an update on TPAC, which was presented at last year's conference [3]. Results from radiation hardness testing of FORTIS 1.0 will also be shown, as well as some preliminary findings from a beam test at CERN, which was performed as part of the SPiDeR (Silicon Pixel Detector Research and Development) collaboration [4]. Finally, the findings from both sensors will be summarised in Section V. and some next steps for both sensors as they become part of the SPiDeR collaboration will be detailed.

#### II. TECHNOLOGIES

This section describes the technologies developed and used by us for scientific image sensors.

## *A. The INMAPS 0.18µm Process*

A typical CMOS pixel consists of several elements on a p-type epitaxial layer. These elements are a diode (an n-type diffusion forming a junction on the p-type epitaxial layer), and some readout circuitry. A typical cross-section of the pixel, showing these elements, is given in Figure 1.

Figure 1: Cross-Section of a Typical CMOS Pixel

If in-pixel processing is required, complex readout circuitry often demands the use of full complementary MOS transistors (i.e. both PMOS and NMOS). However, the use of PMOS transistors requires an n-well implant on the p-type epitaxial layer. This forms additional parasitic p-n junctions, which act as charge collection areas and reduce the overall amount of charge collected by the diode. This problem can easily be overcome by using purely NMOS transistors, but this limits the functionality of the readout circuitry.

The INMAPS process was designed to address this issue [2]. An additional deep p-well layer was developed, which can be placed under parasitic n-wells to prevent them from collecting charge [5]. The deep p-well layer, which can be seen in Figure 2, is more highly doped than the p-type epitaxial layer, and acts as a potential barrier for minority carriers, reflecting them back into the epitaxial layer and allowing them to continue to diffuse, eventually being collected by the diode. In this way, PMOS transistors for complex in-pixel circuitry can be implemented successfully without significantly affecting the charge collection efficiency. As well as the deep p-well layer, the INMAPS process also features the use of epitaxial layer thicknesses up to 18µm. Advanced pixel architectures such as the 4T pixel can also be implemented as described in Section C.



Figure 2: Cross-Section of a Typical CMOS Pixel Showing Addition of INMAPS Deep P-Well Layer

## *B. Use of a High Resistivity Epitaxial Layer*

The resistivity of the silicon in which the CMOS pixel is placed defines the depth of the depletion region into the epitaxial layer that forms from the n-type implant creating the diode (for a given bias voltage). The typical resistivity of a standard epitaxial layer is between 10-100 Ω·cm [6].

When electron-hole pairs are generated within silicon by a MIP, the electrons will typically diffuse through the epitaxial layer, and if they are sufficiently close to the depletion region of the diode, they will be collected. If they are generated far away from the diode, they will travel within the epitaxial layer for longer distances than those generated close to the diode, which can lead to crosstalk between pixels and degrade the magnitude of the signal collected by the pixel which the MIP passed through.

In the ideal situation, the entire epitaxial layer underneath the diode would be completely depleted, changing the main charge transport mechanism from diffusion to drift, where the increased electric fields from the larger depletion region attract more charge than in the case of a smaller depletion region.

As the depletion region width increases with increasing resistivity of the epitaxial layer, one way to extend the depletion region further into the epitaxial layer and improve the charge collection efficiency is to use an epitaxial layer with a high resistivity between 1-10 kΩ·cm [6], [7]. We are currently investigating the use of a high resistivity epitaxial layer for both sensors presented in Section III.
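The dependence of the depletion width on resistivity can be estimated with the textbook one-sided abrupt junction formula, $W = \sqrt{2\epsilon V / (q N_A)}$, together with $\rho = 1/(q \mu_p N_A)$. The sketch below is a rough illustration and not a model of the INMAPS process: the constant hole mobility, the total junction voltage of 1 V and the function names are all assumptions, and in a real pixel the depletion depth is also limited by the epitaxial layer thickness and the diode geometry.

```python
import math

Q_E = 1.602e-19             # elementary charge, C
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon, F/cm
MU_HOLES = 480.0            # assumed constant hole mobility, cm^2/(V s)


def doping_from_resistivity(rho_ohm_cm: float) -> float:
    """Acceptor concentration (cm^-3) of p-type silicon with resistivity rho."""
    return 1.0 / (Q_E * MU_HOLES * rho_ohm_cm)


def depletion_width_um(rho_ohm_cm: float, v_total: float = 1.0) -> float:
    """One-sided abrupt junction depletion width in micrometres.

    v_total is the built-in plus applied reverse voltage (assumed 1 V).
    """
    n_a = doping_from_resistivity(rho_ohm_cm)
    w_cm = math.sqrt(2.0 * EPS_SI * v_total / (Q_E * n_a))
    return w_cm * 1e4


for rho in (10, 100, 1_000, 10_000):
    print(f"{rho:>6} ohm cm -> depletion width ~ {depletion_width_um(rho):5.1f} um")
```

Under these assumptions the width grows with the square root of the resistivity, so moving from 10 Ω·cm to 1 kΩ·cm increases it tenfold, from about 1 µm to about 10 µm, which is comparable to the epitaxial layer thicknesses discussed here.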

The use of a high resistivity epitaxial layer should increase the charge collection efficiency and reduce the crosstalk. The sensor's tolerance to ionising radiation should also be improved, as the effects of minority carrier lifetime degradation are expected to be reduced due to the increased charge collection speed [8].

## *C. The 4T Pixel Architecture*

One common pixel architecture present in CMOS image sensors is the 3T (three transistor) structure shown in Figure 3. This pixel architecture consists of a diode, a reset transistor, a source follower transistor and a row select transistor. The operation is as follows: first the diode is reset via the reset transistor, and then charge (generated from ionising particles or electromagnetic waves) is collected. After a set "integration" time, the row select transistor is turned on and the signal from the pixel is read out via external readout circuitry.



Figure 3: 3T CMOS Pixel Architecture

The 4T (four transistor) pixel architecture is shown in Figure 4. It adds three elements to the 3T architecture: the transfer gate (TX), the floating diffusion node (FD), and a pinned photodiode instead of a normal diode [9]. Charge will be collected by the pinned photodiode as long as TX is off, and is transferred to the floating diffusion node by turning on TX following the integration time. The pinned photodiode is manufactured with an additional shallow p-type implant above the standard n-type diffusion on a p-type epitaxial layer. Because of the p-n-p structure, when the floating diffusion is reset to a voltage above or equal to the pinning voltage and TX is turned on, the diode becomes fully depleted, allowing for full noiseless charge transfer.



Figure 4: 4T CMOS Pixel Architecture

There are two key benefits to the 4T pixel architecture. Both of these benefits are due to the fact that the charge collection area and the readout node within the pixel are separated, which is not the case in the 3T pixel. This allows low noise operation to be obtainable via correlated double sampling. The main source of noise within a CMOS pixel is kTC (or reset) noise from the resetting of the capacitive floating diffusion node through the resistive channel of the reset transistor (a few tens of electrons). By sampling the floating diffusion node before and after TX is turned on, correlated double sampling with a short sampling time can be performed, thus eliminating kTC noise. The remaining noise is due to readout noise, which can be thermal, 1/f or random telegraph signal noise, and typically gives an input referred noise of the order of several electrons, depending on the characteristics of the sensor [10].
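The size of the kTC noise and its cancellation by correlated double sampling can be illustrated numerically. The sketch below is a toy model only: the 2 fF capacitance is the floating diffusion value reported later for FORTIS 1.0, while the 300 K temperature, the 18 e- reset noise, the 4 e- read noise, the signal size and the function names are illustrative assumptions.

```python
import math
import random
import statistics

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def ktc_noise_electrons(c_farad: float, temp_k: float = 300.0) -> float:
    """RMS reset (kTC) noise of a capacitance, expressed in electrons."""
    return math.sqrt(K_B * temp_k * c_farad) / Q_E


def simulate_cds(n_frames, signal_e, reset_sigma_e, read_sigma_e, seed=1):
    """Return (noise without CDS, noise with CDS) in electrons rms."""
    rng = random.Random(seed)
    single, cds = [], []
    for _ in range(n_frames):
        reset_level = rng.gauss(0.0, reset_sigma_e)   # kTC offset frozen on the node
        sample_reset = reset_level + rng.gauss(0.0, read_sigma_e)
        sample_signal = reset_level + signal_e + rng.gauss(0.0, read_sigma_e)
        single.append(sample_signal)                  # signal sample alone
        cds.append(sample_signal - sample_reset)      # reset offset cancels
    return statistics.pstdev(single), statistics.pstdev(cds)


print(f"kTC noise for 2 fF at 300 K: {ktc_noise_electrons(2e-15):.1f} e- rms")
no_cds, with_cds = simulate_cds(20_000, signal_e=500.0, reset_sigma_e=18.0, read_sigma_e=4.0)
print(f"noise without CDS: {no_cds:.1f} e-, with CDS: {with_cds:.1f} e-")
```

In this model the reset offset is common to both samples and cancels in the difference, leaving only the uncorrelated read noise, which is increased by a factor of $\sqrt{2}$ because two samples are subtracted.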

The second benefit of the separated charge collection and readout nodes is that a high conversion gain can be obtained. The conversion gain defines the sensitivity of the pixel to small amounts of charge in the voltage domain. It is given by $V = q/C$, where $C$ is the capacitance where the charge is stored before readout. In the 3T case, this capacitance is the diode capacitance, but in the 4T case, this capacitance is the floating diffusion node capacitance, which can be geometrically tailored to give a smaller capacitance, depending on the application. If charge is transferred from a large capacitance (with a low conversion gain) to a smaller capacitance (with a higher conversion gain), then the sensitivity to small amounts of charge is increased.
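As a quick numerical illustration of $V = q/C$, the sketch below compares two capacitances. The 2 fF value is the floating diffusion capacitance reported later for FORTIS 1.0, while the 10 fF diode capacitance is a purely hypothetical figure chosen for comparison. The function name is also hypothetical.

```python
Q_E = 1.602176634e-19   # elementary charge, C


def conversion_gain_uv_per_e(c_farad: float) -> float:
    """Conversion gain q/C in microvolts per electron."""
    return Q_E / c_farad * 1e6


for label, c in (("floating diffusion, 2 fF", 2e-15), ("hypothetical diode, 10 fF", 10e-15)):
    print(f"{label}: {conversion_gain_uv_per_e(c):.1f} uV/e-")
```

A 2 fF node gives about 80 µV per electron, five times more than the 10 fF case.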

#### III. THE SENSORS

This section describes the sensors which have used the technologies introduced in the previous sections.

#### *A. FORTIS*

FORTIS (4T Test Image Sensor) is a prototype sensor containing up to thirteen different variants on a 4T pixel architecture. There have been two iterations of this sensor: FORTIS 1.0 and FORTIS 1.1, where the latter explored the variants chosen for FORTIS 1.0 further via fabrication with and without the deep p-well layer, and on both a standard and a high resistivity epitaxial layer. FORTIS 1.1 also contained an optimised processing step to reduce the noise associated with the source follower.

Both sensors consist of the same simple readout architecture, with decoders for row and column access to focus on one pixel variant array at a time, and a simple analogue output stage with sampling capacitors for storage of the reset and signal samples to implement correlated double sampling. In FORTIS 1.0, there were twelve different pixel variants, consisting of some reference pixel designs plus several geometric variations, such as variations in the size of the source follower transistor, the diode size, and the pixel pitch (6µm, 15µm, 30µm and 45µm). FORTIS 1.1 contains an extra pixel variant where four diodes have been combined at the floating diffusion node to investigate the effects of charge binning.

#### *B. TPAC*

TPAC (Tera-Pixel Active Calorimeter) is a MAPS sensor designed for a tera-pixel electromagnetic calorimeter at the International Linear Collider [3], [11]. The sensor contains ~28,000 pixels on a 50 µm pitch, and within each pixel, there are $\sim$160 transistors, comprising a preamplifier, a shaper, a comparator with trimming and masking logic, and a monostable element to generate the binary output pulse, representing a MIP "hit".

TPAC was the first of our sensors to utilise the special INMAPS deep p-well implant. Without it, the charge collection within the pixels would be severely reduced due to the number of PMOS transistors within the pixels. The latest version of TPAC was also fabricated on a high resistivity epitaxial layer.

#### IV. RESULTS FROM FORTIS

This section details the results from FORTIS 1.0 and 1.1.

## *A. FORTIS 1.0 Results*

A photon transfer curve (PTC) plots the dark corrected signal against the dark corrected noise. This is a standard way of characterising image sensors and gives a lot of information about their properties [12]. The PTC from one of the best pixels from FORTIS 1.0 can be seen in Figure 5. The results show that the conversion gain is high, corresponding to a floating diffusion capacitance of ~2 fF. The noise is 5.8 e- rms. This gives a substantial signal-to-noise ratio for a MIP, roughly 43 to 172 given that the typical signal value for a MIP is 250-1000 e- for a 12 µm epitaxial layer thickness.



Figure 5: PTC Results from the Best Pixel of FORTIS 1.0

### *B. FORTIS 1.1 Results*

Some interesting results from comparing fabrication on a standard and a high resistivity epitaxial layer have already been found via the use of charge collection efficiency scans. A white light source was focused down to a 2 µm × 2 µm spot size and then horizontally scanned across the centre of the diodes of three adjacent pixels. The charge collection from the three pixels was then analysed by looking at the location of the spot and the resulting signal obtained out of each pixel in turn.

Figure 6 shows the results from the standard resistivity epitaxial layer. The geometric features of the pixels are immediately clear; the peaks represent the positions of the diodes (i.e. where the spot was focused directly on the pixel of interest), and the dips mark where metal covers the pixel and light cannot get through. However, there are secondary peaks present within these scans, which represent the charge collected when the spot is focused on an adjacent pixel. These secondary peaks represent crosstalk.



Figure 6: FORTIS 1.1 Charge Collection Scan - Standard resistivity. The peaks and troughs represent the diode and metal within the pixel respectively, and the secondary peaks represent crosstalk

Figure 7 shows the results from the high resistivity epitaxial layer. The geometric features are again apparent, but the secondary peaks have significantly diminished. This shows that crosstalk has been reduced within the pixels, as the primary charge transport mechanism has changed. Charge diffusion within the epitaxial layer to neighbouring pixels is reduced. Instead, charge is attracted by the electric fields extending deeper into the epitaxial layer, as described in Section II, and is therefore more likely to be collected by the nearest pixel. This clearly shows the benefits of using a high resistivity epitaxial layer.



Figure 7: FORTIS 1.1 Charge Collection Scan - High resistivity. The secondary peaks as in Figure 6 have diminished significantly

#### *C. Radiation Hardness Results*

The best pixel (as shown in Figure 5) from five FORTIS 1.0 sensors was irradiated up to 1 MRad in steps of 10 kRad, 20 kRad, 50 kRad, 100 kRad, 200 kRad, 500 kRad and 1 MRad using 50 kVp x-rays from an x-ray tube. Between irradiations, when not being tested, the chips were stored at $-25$ °C. The noise increased significantly beyond 500 kRad, to the point where the signal-to-noise ratio decreased substantially and a MIP would not be reliably detectable. The suggested radiation tolerance for FORTIS 1.0 is therefore between 500 kRad and 1 MRad. The noise distribution for 0 kRad and 500 kRad is given in Figure 8. Between 0 and 500 kRad the noise increased logarithmically with irradiation level, from 6 to 9 e- rms, and the noise distribution clearly spreads out. This suggests that random telegraph signal noise and 1/f noise have increased; both are associated with charge trapping in the source follower transistor gate oxide and at the corresponding silicon-silicon dioxide interface [13].



Figure 8: Radiation Hardness RMS Noise Results from FORTIS 1.0 from a 32 x 32 Pixel Region

## *D. Beam Test Results*

As part of the SPiDeR collaboration, FORTIS 1.0 and FORTIS 1.1 have just returned from a beam test at CERN, where they were tested with 120 GeV pions. Both standard and high resistivity epitaxial layer chips were taken, as well as chips with and without deep p-well. The results are currently being analysed, and the benefits of using a high resistivity epitaxial layer should be visible. Some provisional results are shown in Figure 9, which show the first detection of MIPs with a 4T pixel architecture.

TPAC also went to the beam test at CERN as a stack of six sensors. Three scintillators with photomultiplier tubes (PMTs) were used in conjunction with the sensors, two in front of the stack and one at the rear. They detected the particles entering the stack and produced time tags, which allow the hits seen by the sensors to be correlated with the time at which the particles were detected and confirm that tracks were seen throughout the stack. The data from the beam test are currently being analysed, but early indications are that the time tags from the scintillators and PMTs correlate well with the hits from the sensors. Events were seen in all six sensor layers, showing that the particles were tracked through the stack. Results were also seen in the sensors fabricated on a high resistivity epitaxial layer.



Figure 9: Beam Test Results from FORTIS Showing MIP "Hits"

#### V. CONCLUSIONS AND NEXT STEPS

FORTIS has proved to be a very promising sensor for applications where low noise and high sensitivity to small amounts of charge are paramount. The noise was measured at the output of the best pixel of FORTIS 1.0 to be 5.8 e- rms, which is a key low noise result for particle physics applications. This pixel has also been shown to be tolerant to ionising radiation up to 500 kRad.

FORTIS 1.1 has yet to be fully characterised, and results from all pixels, including the geometric and processing variations, are expected within the next few months. FORTIS 1.1 will also undergo radiation hardness testing, which will be of interest for characterising the use of a high resistivity epitaxial layer, as it is expected that the sensors fabricated on such a layer will be more tolerant to radiation.

The TPAC sensor performed well in the recent CERN beam test. TPAC will be taken to DESY for a beam test in early 2010, to be tested with 1-6 GeV electrons and with tungsten layers within the stack, with the aim of detecting electromagnetic showers.

Both of these sensors were fabricated with the INMAPS 0.18µm process, with and without deep p-well, and on both a standard and a high resistivity epitaxial layer, allowing us to fully assess the benefits of the process.

The results lead on to discussions under the SPiDeR collaboration as to whether to pursue a 4T style digital electromagnetic calorimeter (DECAL) sensor, or to pursue a TPAC style one. Alongside this, FORTIS is also being assessed for scaling up to a 5 cm × 5 cm active area.

#### VI. ACKNOWLEDGEMENTS

The author would like to thank the rest of the SPiDeR collaboration: B. Allbrooke, O. Miller, N.K. Watson, J.A. Wilson, D. Cussans, J. Goldstein, R. Head, S. Nash, J.J. Velthuis, P.D. Dauncey, R. Gao, Y. Li, A. Nomerotski, J.P. Crooks, R. Turchetta, C.J.S. Damerell, M. Stanitzki, J. Strube, M. Tyndel, S.D. Worm, Z. Zhang. Thanks also to Carl Morris, Daniel Packham and Tim Pickering for their contributions to the testing of both sensors.

## **REFERENCES**

- [1] R. Turchetta. CMOS Sensors for the Detection of Minimum Ionising Particles. In *Proceedings of the 2001 IEEE Workshop on Charge-Coupled Devices and Image Sensors, Lake Tahoe, Nevada, USA, 7 - 9 June 2001*, pages 7–9, 2001.
- [2] J.A. Ballin et al. Monolithic Active Pixel Sensors (MAPS) in a Quadruple Well Technology for Nearly 100% Fill Factor and Full CMOS Pixels. *Sensors*, 8(9):5336, 2008.
- [3] J.A. Ballin et al. A Monolithic Active Pixel Sensor for a "Tera-Pixel" ECAL at the ILC. In *Proceedings of the 2008 Topical Workshop on Electronics for Particle Physics, Naxos, Greece, 15 - 19 September 2008*, pages 63–69, 2008.
- [4] The SPiDeR (Silicon Pixel Detector R&D) Collaboration. https://heplnm061.pp.rl.ac.uk/display/spider/.
- [5] R.A.D. Turchetta et al. Accelerated Particle and High Energy Radiation Sensor, 7 May 2004. US Patent App. 10/556,028.
- [6] O.V. Kononchuk et al. High Resistivity Silicon Wafer with Thick Epitaxial Layer and Method of Producing Same, 6 December 2001. US Patent App. 10/008,440.
- [7] W. Chen et al. Active Pixel Sensors on High-Resistivity Silicon and their Readout. *IEEE Transactions on Nuclear Science*, 49(3):1006–1011, 2002.
- [8] A.G. Holmes-Siedle and L. Adams. *Handbook of Radiation Effects*. Oxford University Press, USA, 2002.
- [9] R.M. Guidash et al. A 0.6µm CMOS Pinned Photodiode Color Imager Technology. In *Technical Digest of the IEEE International Electron Devices Meeting, IEDM'97, Washington DC, USA, 7th - 10th December 1997*, pages 927– 929, 1997.
- [10] C. Leyris et al. Impact of Random Telegraph Signal in CMOS Image Sensors for Low-Light Levels. In *Proceedings of the 32nd European Solid-State Circuits Conference, Montreux, Switzerland, 18th - 22nd September 2006*, pages 376–379, 2006.
- [11] M. Stanitzki et al. A Tera-Pixel Calorimeter for the ILC. In *Conference Record of the IEEE Symposium on Nuclear Science, Honolulu, Hawaii, 27th October - 3rd November 2007*, volume 1, pages 254–258, 2007.
- [12] J.R. Janesick. *Photon Transfer: DN → λ*. Society of Photo-Optical Instrumentation Engineers, USA, 2007.
- [13] X. Wang et al. Random Telegraph Signal in CMOS Image Sensor Pixels. In *Technical Digest of the IEEE International Electron Devices Meeting, IEDM'06, San Francisco, USA, 11th - 13th December 2006*, pages 1–4, 2006.

## **Performance of the ABCN-25 readout chip for the ATLAS Inner Detector Upgrade**

F. Anghinolfi<sup>a</sup>, W. Dabrowski<sup>b</sup>, N. Dressnandt<sup>c</sup>, J. Kaplon<sup>a</sup>, D. La Marra<sup>d</sup>, M. Newcomer<sup>c</sup>, S. Pernecker<sup>d</sup>, K. Poltorak<sup>a</sup>, K. Swientek<sup>b</sup>

<sup>a</sup> CERN, 1211 Geneva 23, Switzerland

<sup>b</sup> AGH University of Science and Technology, Faculty of Physics and Applied Computer Science, Kraków

<sup>c</sup> University of Pennsylvania, Physics Department

<sup>d</sup> University of Geneva

## francis.anghinolfi@cern.ch

#### *Abstract*

We present the test results of the ABCN-25 front-end chip implemented in CMOS 0.25 μm technology and optimised for the short, 2.5 cm, silicon strips intended to be used in the upgrade of the ATLAS Inner Detector. We have obtained the full functionality of the readout part, the expected performance of the analogue front-end and the operation of the power control circuits. The performance is evaluated with a view to minimizing the power consumption, as the upgraded detector may contain up to 70 million channels. System tests with the different power distribution schemes proposed for the future tracker detectors are possible with this chip. The ABCN-25 ASIC is now serving as the prototype readout chip in the development of the modules and staves for the upgrade of the ATLAS Inner Detector.

#### I. INTRODUCTION

A primary challenge for tracking detectors being developed for the SLHC environment is the high occupancy, which directly affects the granularity of the sensors and the number of electronic channels, expected to be about 10 times higher than in the present SCT detector. As a result, power consumption in the readout ASICs is one of the most critical issues, on top of the usual requirements concerning noise and radiation resistance. These requirements have to be considered taking into account present and expected trends in the development of industrial CMOS processes. In order to address all these aspects, an R&D proposal has been initiated to develop a new ASIC for the ATLAS Silicon Tracker Upgrade [1].

The ABCN-25 ASIC has been designed as a prototype test vehicle for the development of the stave/module readout concepts of the ATLAS tracker upgrade for the SLHC. It has been fabricated in the 0.25 μm CMOS technology from IBM. The reasons for developing the present prototype in the 0.25 μm technology were partly economic. Furthermore, we were able to reuse well-known functional blocks, such as the memory elements, the bandgap reference and the elements of the DAC circuits, and so complete the design in a relatively short time.

The first critical aspect of the future electronics in the detector concerns the delivery of power to the nearly 70 million readout channels and to the transmission and service devices, which have to be located inside the detector. The constraint is twofold, as it concerns both the minimization of the power and the reduction of the current. The dissipated power should be limited to less than 70 kW for the overall silicon strip detector, to comply with the estimated capacity of the cooling system. The supply current has a direct impact on the metallic conductor cross-section needed to deliver it, and this cross-section is in practice limited by the cables already existing in the present detector. Various schemes for power distribution, like serial powering of modules or DC-DC step-down converters on the detector, are under investigation in the frame of another R&D project [2]. The ABCN-25 ASIC incorporates two shunt regulator circuits to exercise the serial powering system of the detector modules, as well as a low drop voltage regulator for the front-end supply voltage.
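To put the two figures quoted above side by side, the short sketch below divides the 70 kW limit by the approximate channel count. The variable names are illustrative, and the result is only an average: the budget also has to cover the transmission and service devices, so the share available to the front-end ASIC itself is smaller.

```python
# Back-of-envelope budget; variable names are illustrative assumptions.
TOTAL_POWER_W = 70e3   # upper limit quoted for the overall silicon strip detector
N_CHANNELS = 70e6      # approximate number of readout channels

def power_per_channel_mw(total_power_w, n_channels):
    """Average power available per channel, in milliwatts."""
    return total_power_w / n_channels * 1e3

budget_mw = power_per_channel_mw(TOTAL_POWER_W, N_CHANNELS)
print(f"average budget ~ {budget_mw:.2f} mW per channel")
```

The average comes out at about 1 mW per channel, all services included.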

The second critical aspect concerns the readout architecture: the large amount of data generated by the modules will be transmitted off the detector through 4.8 Gb/s optical links [3]. The concentration of data from different modules towards the high throughput optical device requires a readout structure involving a per-module controller and a complex data concentrator/multiplexer at the end of the stave [4].

#### II. ABCN-25 CHIP MEASUREMENTS

The ABCN-25 architecture follows the concept of binary readout of silicon strip detectors as implemented in the ABCD3T ASIC [5]. It comprises 128 channels of preamplifier, shaper and comparator circuits with two memory banks, one used as a pipeline for the trigger latency and the other used as a derandomizing buffer. The front-end has been optimised for 2.5 to 5 pF detector capacitance (2.5 cm long silicon strip detector) and is compatible with either detector signal polarity. The shaper is designed for a 25 ns peaking time, providing a double pulse resolution of 75 ns.

The chips have been mounted on different boards and prototype hybrids [6] and tested with two test systems. The first, called SCT-DAQ, is the custom-designed hardware and software developed for testing the SCT detector modules actually installed in the detector. The second, called NI-DAQ, allows equivalent tests to be performed but is a new solution based on a VXI-NI crate, a commercial data acquisition board and the LabVIEW software [7].

#### *A. Front-End*

The design specifications of the analogue front-end are summarized in Table 1. Two parameters drive the design of the front-end, namely that the ENC be below 750 el. rms and that the power dissipation be below 0.7 mW per channel.

Note that after heavy radiation damage in the SLHC environment the expected signal from the silicon strip detectors is about half of that available in the present SCT detectors. Therefore the ENC is required to be significantly lower, below 750 el. rms, compared to 1500 el. rms in the present detector.



Table 1: Main specifications for the ABCN-25 front-end.

The noise figures of the front-end have been measured on a test board with a specific "clean" printed circuit board layout in front of the channel input pads, to minimize the parasitic capacitance of the tracks and remove any crosstalk from digital signals or power lines. Small SMD-type capacitors can be added on 2 channels to measure the noise versus the input capacitance. The measurement results are summarized in Figure 1. The measurements were made on two different chips. One set (J) reports values from zero external capacitance to 5.6 pF, with different bias currents of the input transistor. For the other set (M) the capacitance at the input has been extended up to 15.8 pF. For short strips the expected strip capacitance should not exceed 2.5 pF. For long strips it should be in the range of 10 pF. The measurement at 2.3 pF shows a noise of ~580 electrons rms at the nominal bias current of 140 μA. At high bias current (198 μA) the noise for 10.8 pF is ~1100 electrons rms. These measurements demonstrate that, although the design has been optimised for short strips, it also provides satisfactory performance for long strips when the bias current in the input transistor is increased, at the expense of additional power dissipation.

The black dots in Figure 1 correspond to noise measurements made with the chips mounted on one of the first hybrid prototypes, equipped with 20 ABCN-25 chips and connected to detector strips of 1 cm, 2.5 cm, 5 cm and 7.5 cm. They show a higher noise than with discrete capacitors for strip lengths above 2.5 cm. This deviation should be reviewed when more advanced versions of the hybrids are tested.



Figure 1: Measured noise figures of the ABCN-25 readout chip. Two sets of measurements (M and J), with different bias currents in the input transistor are shown.

For comparison, the noise figures from simulation under conditions close to those of the test (bias, temperature) are plotted in Figure 2. The expected noise at 3 pF capacitance is just below 600 el. rms for a drain current of 140 μA in the input transistor, matching the above measurements well. We have obtained 580 el. rms for a 2.3 pF test capacitance, to which a parasitic value of 0.5-0.8 pF should be added, due to stray capacitances on the printed circuit board.



Figure 2: Simulated noise figures for the front-end.

The gain has been measured by performing threshold scans for three different values of the input charge, namely 1.5, 2.0 and 2.5 fC. The input charge is applied simultaneously to one quarter of the channels, through a charge injection circuit present on the ABCN-25 front-end. The measurements are repeated 4 times to scan all channels. The plot in Figure 3 shows the excellent uniformity across the channels, with an average value of 97 mV and a deviation of less than 1%.
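The gain extraction from threshold scans can be sketched as follows. For each injected charge, the threshold at which the channel efficiency drops to 50% is determined, and the gain is taken as the slope of a straight-line fit of this threshold against charge. The threshold values below are hypothetical placeholders, chosen so that the slope lands near the quoted average on the assumption that it is expressed per fC; they are not measured ABCN-25 data, and the function name is illustrative.

```python
# Sketch of a per-channel gain extraction; the threshold values below are
# made-up placeholders, not measured ABCN-25 data.
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + offset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

charges_fc = [1.5, 2.0, 2.5]        # injected charges used in the text
vt50_mv = [160.0, 208.5, 257.0]     # hypothetical 50%-efficiency thresholds

gain_mv_per_fc, offset_mv = linear_fit(charges_fc, vt50_mv)
print(f"gain = {gain_mv_per_fc:.1f} mV/fC, offset = {offset_mv:.1f} mV")
```

Repeating the fit for each of the 128 channels and histogramming the slopes would give a distribution of the kind shown in Figure 3.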



Figure 3: Gain histogram over one ABCN-25 chip.

The linearity is estimated from the threshold scan measurements at charges up to 10 fC. The plot in Figure 4 shows the maximum deviations from a linear fit for 128 channels. The measurements include the nonlinearity of the front-end preamplifier and shaping stages as well as the dependence of the minimum discriminator overdrive on the threshold level.



Figure 4: Linearity measurement, maximum deviation from a linear fit from 1.5 fC to 10 fC for 128 channels of the same chip.

The time walk is measured from scans of the signal detection efficiency versus the injection pulse delay, relative to a clock reference, for signal charges of 1.25 fC to 10 fC and a threshold fixed at 1.0 fC. The plot in Figure 5 presents the result as the position of the leading edge of the discriminator output versus the fixed clock reference. The measured time walk is within 15 ns for the nominal conditions.



Figure 5: Time walk measurements on the 128 channels of ABCN-25 chip.

#### *B. Digital*



Figure 6: ABCN-25 functional block diagram.

Beyond the front-end part, the ABCN-25 contains the logic functions that process the binary data available at the discriminator outputs. The block diagram of the ABCN-25 ASIC, including the logic functions, is shown in Figure 6. The pipeline is made of a 128×256 addressable static RAM, storing the data for 6.4 μs. After an L1 signal is received, the data corresponding to 3 bunch crossings (BC) are transferred to the local data derandomizer. It is made of one 128×128 addressable static RAM, which can store up to 43 L1 events. The most complex part of the logic is the data compression module. This function scans, in a few clock cycles, the hit information contained in one event (including data from 3 BC) of the derandomizer and generates the 7-bit channel number and the 3-bit data for each channel that matches a preselectable data pattern. The 10 bits for each valid hit are buffered and made available for transmission.
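The data compression step described above can be illustrated with a simplified software model. The sketch below is not the ABCN-25 logic: the bit ordering inside the 10-bit word, the default selection criterion (any hit in the three bunch crossings) and all names are assumptions made for illustration. It also checks the pipeline storage time, taking a 25 ns bunch-crossing period.

```python
# Simplified model of the compression step; the selection criterion, the bit
# ordering and all names are illustrative assumptions, not the ABCN-25 logic.
N_CHANNELS = 128
BC_PERIOD_NS = 25          # assumed 40 MHz bunch-crossing clock
PIPELINE_DEPTH = 256

def pipeline_latency_us(depth=PIPELINE_DEPTH, period_ns=BC_PERIOD_NS):
    """Maximum trigger latency covered by the pipeline, in microseconds."""
    return depth * period_ns / 1000.0

def compress_event(bc_prev, bc_central, bc_next, accept=lambda p: p != 0):
    """Return one 10-bit word per selected channel.

    Each argument is a list of 128 hit bits for one bunch crossing. The word
    packs the 7-bit channel number above the 3-bit hit pattern.
    """
    words = []
    for channel in range(N_CHANNELS):
        pattern = (bc_prev[channel] << 2) | (bc_central[channel] << 1) | bc_next[channel]
        if accept(pattern):
            words.append((channel << 3) | pattern)
    return words

# Example: channel 5 fires in the central crossing, channel 100 in the last two.
prev, central, nxt = ([0] * N_CHANNELS for _ in range(3))
central[5] = 1
central[100] = 1
nxt[100] = 1
words = compress_event(prev, central, nxt)
print(pipeline_latency_us(), [f"{w:010b}" for w in words])
```

Passing a different `accept` predicate, for example one that requires a hit in the central crossing only, mimics the preselectable data pattern mentioned in the text.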

The data transmission is initiated either by the L1 signal (master mode) or by the reception of a token signal (slave mode). When a token is received, the ABCN-25 transmits its data to one adjacent ABCN-25 chip and then issues a token. In this way the data from the same event are appended as they pass from one chip to the next (up to 20 chips in the case of the "short" strip hybrids), up to the last chip in the data chain, which is set in "master" mode. This last chip analyses the data flow, generates a header and terminates the transmission of the event. Multiple logical mechanisms are involved, which make this part of the circuit rather costly in terms of the number of gates and consequently of power. One such feature, for example, is that the data flow can be fully reversed, providing a redundant path for sending data.

The current consumption versus supply voltage of the ABCN-25 digital part is plotted in Figure 7. It should be noted that below 1.7 V the current supplying the chip no longer comes from the power source, but rather from the I/O ports. The chip is functional for supply voltages as low as 1.3 V. The static current in the digital part is 48 mA at 2.5 V, about 30 mA higher than the expected value. Further investigations are needed and will be performed to identify the sources of this excess current. The switching current (the part scaling with the frequency) is 92 mA at 2.5 V and a clock frequency of 40 MHz.



Figure 7: ABCN-25 digital current versus voltage supply.

During the design phase, control bits were added to activate or deactivate the clocking on some functional blocks of the chip, to measure the dynamic current related to these functions. The dynamic currents taken by the different functions, measured by controlling these bits are shown in Table 2.

PIPE-STOP interrupts the clock of the main pipeline block, made of the 128×256 RAM cells, the R/W addressing circuit, and the Built-in Self Test circuit. REG\_STOP affects the correcting mechanism of the triple redundant registers. There are 8 such registers in this circuit, built in such a way that a one-bit corruption due to a Single-Event Upset (SEU) is detected and corrected. This functionality requires continuous running of the local clock. The REG\_STOP bit interrupts this clock, and consequently the SEU correction mechanism is not active. CD\_STOP interrupts the clock of the command decoder circuitry, which executes the non-critical commands, such as the R/W operations on the biasing registers. The "Readout clock STOP" entry of Table 2 corresponds to not applying the clock to the data serializer (this clock may run at either 40 or 80 MHz in normal conditions).

| Mode               | I digital | Reduction | %     |
|--------------------|-----------|-----------|-------|
| Normal             | 92 mA     |           |       |
| PIPE-STOP          | 72 mA     | 20 mA     | 22%   |
| REG\_STOP          | 87 mA     | 5 mA      | 5.5%  |
| CD\_STOP           | 87 mA     | 5 mA      | 5.5%  |
| Readout clock STOP | 87 mA     | 5 mA      | 5.5%  |
| All STOPS          | 57 mA     | 35 mA     | 38.5% |

Table 2: Digital supply current measurement for various digital functions deactivated.

The last row gives the current remaining when all the STOPS are applied. The remaining dynamic current (57 mA, 61.5% of the total current) is taken by the derandomizer, the large compression logic circuitry, and the readout controller. These measurements show that a significant amount of the dynamic current is taken by the data compression and readout logic parts. In this realization, these two parts are not designed with specific SEU error detection and/or correction mechanisms. Implementing SEU detection and correction in these functional blocks in the future will further increase the current. A careful optimisation, and possibly simplification, of the readout mechanism should be considered to limit the power consumption in the digital parts.
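The arithmetic behind Table 2 can be reproduced with a few lines. The sketch below recomputes the reductions from the quoted currents; the dictionary layout and function name are illustrative. Percentages computed from the rounded currents can differ from the tabulated ones by a few tenths of a percent.

```python
# Recomputes the reductions in Table 2 from the quoted currents (values in mA).
NORMAL_MA = 92
measured_ma = {
    "PIPE-STOP": 72,
    "REG_STOP": 87,
    "CD_STOP": 87,
    "Readout clock STOP": 87,
    "All STOPS": 57,
}

def reduction(mode):
    """Return (reduction in mA, reduction as a percentage of the normal current)."""
    saved = NORMAL_MA - measured_ma[mode]
    return saved, 100.0 * saved / NORMAL_MA

for mode in measured_ma:
    saved, percent = reduction(mode)
    print(f"{mode:20s} {saved:3d} mA  {percent:5.1f} %")

# The four individual reductions should add up to the "All STOPS" reduction.
individual = sum(reduction(m)[0] for m in measured_ma if m != "All STOPS")
print("sum of individual reductions:", individual, "mA")
```

The four individual reductions add up to the 35 mA saved when all stops are active, which suggests that the gated blocks contribute independently to the dynamic current.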

#### III. SHUNT REGULATORS

Serial powering of detector modules is one possible solution to the power distribution problem; however, it introduces new aspects that have to be addressed in the front-end ASIC. The scheme requires that each module, comprising 20 to 40 ABCN ASICs depending on the module design, be powered through a shunt regulator. The shunt regulator can be either an external device, one per hybrid, or a distributed structure, i.e. each ASIC contains a shunt regulator and these are connected in parallel on the hybrid.

The ABCN design comprises two prototypes of distributed shunt regulator circuits, which can be used alternatively. One circuit is a full shunt regulator. Another circuit comprises only shunt transistors, with gate control inputs, which are foreseen to be driven by an external voltage control loop, common for all ASICs connected in parallel on the hybrid.

#### *A. Internal Distributed Shunt Device*

The conceptual schematic diagram of the developed shunt regulator, suitable for connecting several shunt regulators in parallel on the hybrid, is shown in Figure 8. The circuit monitors the current flowing through the shunt transistor and compares it with 6 preset reference currents. If the shunt current exceeds a given reference current, the reference voltage, and hence the output voltage of the regulator, is adjusted. Through this mechanism the output voltages of several shunt regulators connected in parallel are adjusted to a common level.



Figure 8: Conceptual schematic diagram of the shunt regulator with auxiliary correction amplifier.

The effectiveness of this circuit has been verified on a dedicated test board on which 4 ABCN-25 chips are powered, activated and transmitting data. Current monitors are added in the power distribution lines to measure the current of each ABCN-25 individually, whereas the power source is set in the current source mode.



Figure 9: Voltage versus current on the 4 ABCN-25 test board. Squares : V versus I with the power supply set as a voltage source. Diamonds: V versus I with the internal shunt enabled, and the power supply as a current source.

In Figure 9, the voltage shunt operation with the 4 internal shunt devices in parallel is demonstrated, with the test board equipped with 4 ABCN-25. The voltage is limited below 2.8 V when the current on the board is forced well above the nominal current of 800 mA at 2.7 V.



Figure 10: Current distribution per ABCN-25 chip versus total current, internal shunt enabled.

The additional current is diverted through the shunt elements, as shown in Figure 10. The currents in the individual ABCN-25 chips show the expected behaviour. Looking at chip ABCN0, the current in the shunt device increases with the source current and is then limited to approximately 80 mA. As the source current increases further, the excess current passes through the shunt devices of ABCN2 and ABCN1, up to the same limit of around 80 mA, and finally through the shunt device of ABCN3.
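The sequential sharing described above can be mimicked with a toy model in which the shunts are filled one after another up to a fixed limit. The strict fill order, the hard 80 mA limit applied to every chip and the function name are simplifying assumptions read off the description of Figure 10; the real circuit adjusts its output voltage in steps and does not switch abruptly.

```python
# Toy model of the sequential current sharing described for Figure 10. The fill
# order and the hard 80 mA limit are simplifying assumptions.
SHUNT_LIMIT_MA = 80.0
FILL_ORDER = ["ABCN0", "ABCN2", "ABCN1", "ABCN3"]

def share_excess_current(excess_ma, order=FILL_ORDER, limit_ma=SHUNT_LIMIT_MA):
    """Fill the shunts one by one; return (per-chip shunt currents, unallocated rest)."""
    shares = {}
    remaining = excess_ma
    for chip in order:
        taken = min(remaining, limit_ma)   # each shunt saturates at the limit
        shares[chip] = taken
        remaining -= taken
    return shares, remaining

print(share_excess_current(50.0))
print(share_excess_current(200.0))
```

For 200 mA of excess current this model puts 80 mA in each of the first two shunts and 40 mA in the third, leaving the fourth idle.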

#### *B. Shunt Device with external control*

The schematic diagram of the distributed shunt devices with an external feedback control is shown in Figure 11. The external shunt control line drives in parallel the shunt elements distributed across the chips.



Figure 11: Conceptual schematic diagram of the shunt regulator with auxiliary correction amplifier.

In Figure 12, the voltage shunt operation with the 4 shunt devices in parallel is shown, with the test board equipped with 4 ABCN-25. The voltage is limited slightly below 2.5 V when the current on the board is forced above the nominal current of 700 mA at 2.45 V.

The additional current is diverted through the shunt elements, as shown in Figure 13. The excess current is reasonably evenly distributed across the ABCN-25 shunt devices. As expected, the degenerated current mirror circuit controlling the gate of the large shunt transistor helps to limit the differences in current that may result from different transistor parameters. It prevents one chip from taking the majority of the excess current.

Figure 12: Voltage versus current on the 4 ABCN-25 test board. Squares : V versus I with the power supply set as a voltage source. Diamonds : V versus I with the distributed shunt enabled, and the power supply set as a current source.



Figure 13: Current distribution per ABCN-25 chip versus total current, distributed shunt enabled.

## IV. PERSPECTIVE FOR THE ABCN CIRCUIT IN 130 NM PROCESS

The ABCN-25 power measurements demonstrate that, with this readout architecture, the digital power dominates the analogue power by a factor of approximately 3 to 4. A number of parameters contribute to this large factor, and they can be reviewed if the same functionality is transferred to a 130 nm technology.

Because the digital power supplies the on-chip LDO voltage regulator delivering the 2.2 V required by the analogue front-end, the supply voltage for the digital part is fixed at 2.5 V. With this arrangement it is not possible to operate the ABCN-25 chip at a lower supply voltage in the prototype hybrids and so take advantage of the resulting reduction in the power consumption of the digital part (the dynamic power scales with the square of the voltage).

With a 130 nm technology, the voltage applied to the digital part could be as low as 1 V or 0.9 V (our speed requirement of 40 to 160 MHz for the readout is far from the limits imposed by the technology). Simulations have shown that a factor of 6 in power reduction is possible, combining the effects of voltage reduction and standard layout techniques, which were not applicable in the 250 nm technology because of radiation dose tolerance. The estimates show that a reduction of the digital supply current down to 56 mA (56 mW power) is possible. The analogue front-end will still require a 1.2 V supply. The consequences of such choices (1.0 V or below for digital, 1.2 V for analogue) have to be considered in the framework of the discussions on the power distribution to the ASICs.
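The orders of magnitude quoted above can be checked with a short calculation. The sketch below assumes that the factor of 6 applies to the sum of the static and switching currents measured at 2.5 V, and separately shows what a pure quadratic voltage scaling of the dynamic part would give at 1.0 V with unchanged capacitance and clock frequency. Both are idealisations, and the function names are illustrative.

```python
# Rough power estimate; the pure V^2 scaling ignores leakage, clock-tree and
# library differences between the two technologies.
STATIC_MA, DYNAMIC_MA, VDD_250NM = 48.0, 92.0, 2.5   # values measured on ABCN-25

def power_mw(current_ma, voltage_v):
    """Power in milliwatts for a current in mA at a supply voltage in volts."""
    return current_ma * voltage_v

def scale_dynamic_power(power_at_v1_mw, v1, v2):
    """Dynamic power at v2, assuming unchanged capacitance and clock frequency."""
    return power_at_v1_mw * (v2 / v1) ** 2

total_mw = power_mw(STATIC_MA + DYNAMIC_MA, VDD_250NM)
dynamic_mw = power_mw(DYNAMIC_MA, VDD_250NM)
scaled_dynamic_mw = scale_dynamic_power(dynamic_mw, VDD_250NM, 1.0)
print(f"digital power at 2.5 V: {total_mw:.0f} mW")
print(f"after a factor-6 reduction: {total_mw / 6:.0f} mW")
print(f"dynamic part scaled by V^2 to 1.0 V: {scaled_dynamic_mw:.1f} mW")
```

Under these assumptions the digital part dissipates 350 mW at 2.5 V, and a factor of 6 brings this to about 58 mW, in line with the 56 mW estimate quoted in the text.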

#### V. ACKNOWLEDGEMENTS

The authors acknowledge support received from the European Community: W. Dabrowski and K. Swientek from the Seventh Framework Programme FP7/2007-2013 under Grant Agreement no 21214, and also from the Polish Ministry of Science and Higher Education, K. Poltorak from a Marie Curie Early Stage Research Training Fellowship of the European Community's Sixth Framework Programme under contract number MEST-CT-2005-020216.

#### VI. REFERENCES

- [1] F. Anghinolfi, W. Dabrowski, Proposal to develop ABC-Next, a readout ASIC for the S-ATLAS Silicon Tracker Module Design. <https://edms.cern.ch/document/722486/1>
- [2] M. Weber, Research and Development of Power Distribution Schemes for the ATLAS Silicon Tracker Upgrade. <https://edms.cern.ch/document/828970/1>
- [3] P. Moreira, The GBT project, in these proceedings.
- [4] P. Farthouat, A. Grillo, Read-out Electronics for the ATLAS upgraded tracker. <https://edms.cern.ch/document/781398/1>
- [5] F. Campabadal et al., Design and performance of the ABCD3TA ASIC for readout of silicon strip detectors in the ATLAS semiconductor tracker. Nucl. Instr. and Meth. A, 2005, vol. 552, pp. 292–328.
- [6] A. Greenall, Prototype flex hybrid and module designs for the ATLAS Inner Detector Upgrade utilising the ABCN-25 readout chip and Hamamatsu large area Silicon sensors
- [7] M. Dwuznik, S. Gonzalez Sevilla, Replacing full custom DAQ test system by COTS DAQ components on example of ATLAS SCT readout, in these proceedings.

# Reduction techniques of the back gate effect in the SOI Pixel Detector

R. Ichimiya<sup>a</sup>, Y. Arai<sup>a</sup>, K. Fukuda <sup>b</sup>, I. Kurachi <sup>b</sup>, N. Kuriyama <sup>c</sup>, M. Ohno <sup>d</sup>, M. Okihara <sup>c</sup> for the SOI Pixel collaboration

<sup>a</sup> Institute of Particle and Nuclear Studies, High Energy Accelerator Research Org., KEK, Tsukuba 305-0801, Japan

<sup>b</sup> Oki Semiconductor Co. Ltd., 2-4-8 Shinyokohama, Kouhoku-ku, Yokohama 222-8575, Japan

<sup>c</sup> Oki Semiconductor Miyagi Co. Ltd., 1, Okinodaira, Ohira-mura, Kurokawa-gun, Miyagi 981-3693, Japan

<sup>d</sup> National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba 305-8568, Japan

## ryo@post.kek.jp

### *Abstract*

We have fabricated monolithic pixel sensors in 0.2 µm Silicon-On-Insulator (SOI) CMOS technology, consisting of a thick sensor layer and a thin circuit layer separated by an insulating buried oxide, a structure that offers many advantages. However, it has been found that the electric field applied in the sensor layer also affects the transistor operation in the adjacent circuit layer. This limits the applicable sensor bias to well below the full depletion voltage. To overcome this, we performed a TCAD simulation and added an additional p-well (buried p-well) to the SOI process. Designs and preliminary results are presented.

#### I. INTRODUCTION

Silicon-on-Insulator (SOI) CMOS technology has many advantages for realizing high-speed, low-power LSI circuits, and it is now widely used in commercial and industrial production. SOI technology enables a monolithic pixel detector by bonding thick high-resistivity silicon for the sensor and thin low-resistivity silicon for the readout electronics, separated by an insulating buried oxide (BOX) layer. Contacts between the sensing nodes of the sensor layer and the readout circuitry are made through the BOX layer [1-3]. Compared to conventional bulk CMOS pixel sensors, the SOI pixel sensor has the following advantages:

- No mechanical bump bonding is required, which minimizes multiple scattering in the detector and makes smaller pixel sizes possible.
- Small parasitic capacitance  $(\sim 10 \text{ fF})$  of the sensing nodes gives a large conversion gain and lower noise.
- Small active volume in each transistor ensures latch-up immunity and high radiation tolerance.
- Both sensor and readout electronics can be fabricated with the industry-standard SOI process; further progress and lower cost are expected.



Figure 1: Cross-sectional view of the SOI pixel detector

We have been developing an SOI pixel process based on the OKI Semiconductor Co. Ltd. 0.2 $\mu$m CMOS fully-depleted (FD-) SOI commercial mass-production process [4].

### II. SOI PIXEL PROCESS

Figure 2 shows a simplified procedure for the fabrication of the SOI pixel detector. After the BOX layer is etched, p+/n+ implantation into the handle wafer is performed, and then contacts between the p+/n+ wells and the 1<sup>st</sup> metal layer are formed.

After wafer processing, the wafer backside is ground mechanically from 725 $\mu$m to 260 $\mu$m and then sputtered with 200 nm of aluminum. The detector bias voltage can be applied from the backside and also from the top pads, which are connected to a high-voltage n+ ring.

Characteristics and SOI process parameters are summarized in Table 1. Three types of transistors, Metal-Insulator-Metal (MIM) capacitors, depletion MOS (DMOS), lateral diodes and several kinds of resistors are provided.



Figure 2: Conceptual SOI pixel detector process flow.

Table 1: SOI pixel process specifications.

| Item             | Specification |
|------------------|---------------|
| Process          | 0.2 µm Low-Leakage Fully-Depleted SOI CMOS,<br>1 poly, 4 metal layers, MIM cap., DMOS option,<br>Core (I/O) voltage = 1.8 (3.3) V |
| SOI wafer        | Diameter: 200 mm<br>Top Si: Cz, $\sim$18 $\Omega$-cm, p-type, $\sim$40 nm thick<br>Buried oxide: 200 nm thick<br>Handle wafer: Cz, n-type, 700 $\Omega$-cm, 725 µm thick |
| Backside         | Thinned to 260 µm and sputtered with Al (200 nm) |
| Transistor       | Normal and low threshold transistors are available for both core (1.8 V) and I/O (3.3 V).<br>Three types of structure (body-floating, source-tie and body-tie) are available. |
| Optional process | Buried p-well (BPW) formation. |

We have been organizing Multi Project Wafer (MPW) runs periodically to reduce development cost and share knowledge. We have three MPW runs in 2009. Each MPW run carries about 15-20 designs from SOI pixel collaborators [5-7]. These MPW runs are open to any academic user.

#### III. BACK GATE EFFECT REDUCTION TECHNIQUE

While the SOI structure is ideal for realizing a monolithic pixel detector, the electric field applied in the sensor layer also affects transistor operation in the adjacent LSI circuit layer (back gate effect). Because of this phenomenon, a bias voltage sufficient to fully deplete the sensor could not be applied. To understand the back gate effect in detail, we performed a TCAD simulation. Figure 3 shows the simulated electron current density distribution of a core NMOS transistor. When a backside bias voltage $V_{\text{back}} = 30$ V is applied, a current path (back side channel) is formed below the gate at the lower surface of the SOI layer (displayed in orange). While the back side channel is open, the transistor remains ON even if a negative gate voltage is applied.
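To give a feeling for the bias voltages involved, the following sketch estimates the full depletion voltage of the handle wafer with the textbook abrupt-junction formula $V_{fd} = d^2 / (2 \varepsilon \mu_n \rho)$. It is not the TCAD model used in this work: the electron mobility is an assumed textbook value, the built-in voltage is neglected, and the function names are illustrative. Thickness and resistivity are taken from Table 1.

```python
import math

EPS_SI = 11.7 * 8.854e-14  # permittivity of silicon [F/cm]
MU_N = 1350.0              # electron mobility [cm^2/(V s)], assumed textbook value


def full_depletion_voltage(thickness_um, resistivity_ohm_cm):
    """Abrupt-junction estimate V_fd = d^2 / (2 * eps * mu_n * rho)."""
    d_cm = thickness_um * 1e-4
    return d_cm ** 2 / (2.0 * EPS_SI * MU_N * resistivity_ohm_cm)


def depletion_depth_um(bias_v, thickness_um, resistivity_ohm_cm):
    """Depletion depth grows as sqrt(V) until the full thickness is reached."""
    v_fd = full_depletion_voltage(thickness_um, resistivity_ohm_cm)
    return thickness_um * math.sqrt(min(bias_v, v_fd) / v_fd)


# 260 um thinned handle wafer, 700 Ohm-cm n-type (Table 1)
v_fd = full_depletion_voltage(260.0, 700.0)
depth_30v = depletion_depth_um(30.0, 260.0, 700.0)
print(f"V_fd ~ {v_fd:.0f} V, depletion depth at 30 V ~ {depth_30v:.0f} um")
```

Under these assumptions the full depletion voltage comes out at a few hundred volts, far above the 30 V at which the simulated back side channel already opens.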



Figure 3: TCAD simulation result of the electron current density distribution of a core NMOS transistor ($V_d = 0.1$ V, $V_g = V_s = 0$ V). When a backside bias voltage $V_{back} = 30$ V is applied, a current path (back side channel) is formed below the gate at the lower surface of the SOI layer (displayed in orange). Note that the gate electrode (gray) in this plot is not to scale.

Based on this TCAD simulation study, we have introduced a buried p-well (BPW) implantation step in the handle wafer. A p-type dopant is implanted through the top Si layer to form a p-well just below the buried oxide (BOX) layer (Figure 4). The doping level of the BPW is about three orders of magnitude lower than that of the p+ sensor node and the source/drain regions, so it does not affect the transistor characteristics. We have optimized the implantation energy with a TCAD process simulation so that the peak density is located under the BOX region.

Figure 5 (6) shows the  $I_d$-V<sub>gs</sub> curve of an NMOS (PMOS) transistor of a TEG chip when the backside bias voltage is applied. The NMOS transistor in particular is affected by the applied backside bias voltage. However, by introducing the BPW, the back gate effect is effectively suppressed for both NMOS and PMOS transistors.



Figure 4: Buried p-well (BPW) implantation method. By implanting a light p-type dopant under the BOX layer, the back gate effect is effectively suppressed. In the pixel, the BPW can be used to extend the sensor node.



Figure 5: Measured NMOS  $I_d$ -V<sub>gs</sub> curves without BPW (left) and with BPW (right) shown for various backside bias voltages.



Figure 6: Measured PMOS  $I_d$ -V<sub>gs</sub> curves without BPW (left) and with BPW (right) shown for various backside bias voltages.

## IV. PIXEL DETECTOR TEST RESULTS

Figure 7 shows the I-V characteristics of the SOI pixel sensor. The breakdown voltage depends on the guard ring geometry and the BPW layout. The BPW layer reduces electric field gradients at critical points and thereby increases the breakdown voltage.

We are developing two kinds of pixel detectors: an integration type pixel detector named INTPIX and a counting type pixel detector named CNTPIX.



Figure 7: I-V characteristic of SOI pixel sensor (INTPIX3).

## *A. Integration type pixel (INTPIX)*

The integration type pixel (INTPIX) has a chip size of 5 mm by 5 mm and 128 x 128 pixels, each 20 µm square. Figure 8 shows the readout circuit implemented in each pixel. The detector signal is buffered by a source follower and then stored in a 100 fF capacitor ( $C_{\text{store}}$ ). When read\_x is asserted, V<sub>store</sub> is read out by an external ADC.

Figure 9 shows a visible light image taken by the INTPIX detector.







Figure 9: A visible light image taken by the INTPIX detector with a mask.

## *B. Counting type pixel (CNTPIX)*

Figure 10 shows a readout circuit for each pixel of the counting type pixel (CNTPIX). The preamplifier circuit is based on the design proposed by Krummenacher [8] which contains leakage current compensation circuitry. It is equipped with low and high threshold discriminators so that window comparator mode is possible. Discriminator output is fed to a 16-bit counter. The size of a pixel is about 60  $\mu$ m square.
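The window comparator mode described above can be illustrated with a small behavioural sketch. It is not a model of the CNTPIX circuit: the pulse heights are hypothetical, thresholds are in arbitrary units, and the saturating behaviour of the 16-bit counter is an assumption made here for illustration.

```python
COUNTER_MAX = 2 ** 16 - 1  # largest value a 16-bit counter can hold


def count_in_window(pulse_heights, low_threshold, high_threshold):
    """Count pulses that fire the low discriminator but not the high one."""
    count = 0
    for height in pulse_heights:
        above_low = height > low_threshold
        above_high = height > high_threshold
        # Saturating at COUNTER_MAX is an assumption, not taken from the paper.
        if above_low and not above_high and count < COUNTER_MAX:
            count += 1
    return count


# Hypothetical pulse heights in arbitrary units
pulses = [0.2, 0.9, 1.1, 2.5, 1.0, 0.05]
hits = count_in_window(pulses, low_threshold=0.5, high_threshold=2.0)
print(hits)
```

Only pulses between the two thresholds are counted, so both low-amplitude noise and high-amplitude events are rejected.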

Figure 11 shows an 8 keV X-ray image taken by the CNTPIX detector with a brass mask in front.







(CNTPIX).

#### V. SUMMARY

We are developing SOI pixel detectors based on the 0.2 $\mu$m OKI Semiconductor FD-SOI commercial mass-production process. While the SOI structure is ideal for realizing a monolithic pixel detector, the back gate effect caused by the applied bias voltage has to be overcome. We have developed a BPW implantation technique and confirmed that it suppresses the back gate effect effectively.

We have been organizing MPW runs that are shared with designs from SOI pixel collaborators. Two types of SOI pixel detector (integration type and counting type) have been developed and their functionality has been confirmed.

## VI. ACKNOWLEDGEMENTS

This work is supported financially by the Japan Science and Technology Agency (JST), and by the VLSI Design and Education Center (VDEC), the University of Tokyo, in collaboration with Cadence Corporation and Mentor Graphics Corporation.

## VII. REFERENCES

[1] Y. Arai, `Electronics and Sensor Study with the OKI SOI process', Proceedings of Topical Workshop on Electronics for Particle Physics (TWEPP-07), CERN-2007-007, pp. 57-63.

[2] Y. Arai, et al., `SOI Pixel Developments in a 0.15 $\mu$m Technology', 2007 IEEE Nuclear Science Symposium Conference Record, N20-2, pp. 1040-1046.

[3] SOIPIX Collaboration, <http://rd.kek.jp/project/soi/>

[4] K. Morikawa, Y. Kajita, M. Mitarashi, `Low-Power LSI Technology of 0.15 $\mu$m FD-SOI', OKI Technical Review, Issue 196, Vol. 70, No. 4, pp. 60-63, 2003.

[5] M. Battaglia, et al., Nucl. Instr. and Meth. A583 (2007) 526.

[6] D. Kobayashi, et al., IEEE Trans. Nucl. Sci., Vol. 55 (2008) p. 2872.

[7] K. Hara, et al., IEEE Trans. Nucl. Sci., Vol. 56, Issue 5 (2009), pp. 2896-2904.

[8] F. Krummenacher, Nucl. Instr. and Meth. A305 (1991) 527.

# Low noise, low power front end electronics for pixelized TFA sensors

K. Poltorak<sup>a</sup>, C. Ballif<sup>b</sup>, W. Dabrowski<sup>c</sup>, M. Despeisse<sup>b</sup>, P. Jarron<sup>a</sup>, J. Kaplon<sup>a</sup>, N. Wyrsch<sup>b</sup>

<sup>a</sup> CERN, 1211 Geneva 23, Switzerland

<sup>b</sup> Ecole Polytechnique Federale de Lausanne (EPFL), Institute of Microengineering (IMT), Photovoltaics and thin film electronics laboratory EPFL-STI-IMT-NE, PV-LAB, Rue Breguet 2, CH-2000

<sup>c</sup> AGH - University of Science and Technology, Faculty of Physics and Applied Computer Science, Al. Mickiewicza 30, 30-059 Cracow, Poland

Karolina.Poltorak@cern.ch

## *Abstract*

Thin Film on ASIC (TFA) technology combines advantages of two commonly used pixel imaging detectors, namely Monolithic Active Pixels (MAPs) and Hybrid Pixel detectors. Thanks to direct deposition of a hydrogenated amorphous silicon (a-Si:H) sensor film on top of the readout ASIC, TFA is similar to MAP imagers, while allowing more sophisticated front-end circuitry to extract the signals, as in Hybrid Pixel technology. In this paper we present preliminary experimental results for TFA structures obtained with 10 $\mu$m thick hydrogenated amorphous silicon sensors deposited directly on top of an integrated circuit optimized for tracking applications at linear collider experiments. The signal charges delivered by such an a-Si:H n-i-p diode are small, about 37 e-/$\mu$m for minimum ionizing particles; therefore low noise, high gain and very low power consumption of the front-end are of primary importance. The developed demonstrator chip, designed in 250 nm CMOS technology, comprises an array of 64 by 64 pixels laid out with a 40 $\mu$m by 40 $\mu$m pitch.

## I. THIN FILM ON ASIC

The next generation of particle colliders in high-energy physics presents many challenges for tracking detectors concerning segmentation, readout speed, level of integration, power-constrained low-noise electronics, mechanical complexity and radiation immunity [1]. In parallel to the commonly used MAP and Hybrid Pixel technologies, new trends and innovations aiming at improving detector performance are being developed [2]–[3]. One of these alternatives, called Thin Film on ASIC (TFA) technology, combines the advantages of both technologies mentioned above. In a TFA structure a thin sensor film is deposited directly on top of the readout ASIC, which eliminates the bump bonding that imposes limitations on sensor segmentation, cost and material budget. The low deposition temperature of the TFA sensor elements, around 200 $\degree$C, is compatible with post-processing on finished ASIC wafers. This allows for separate design, optimization and biasing of the sensor and readout electronics, as in the case of Hybrid Pixel detectors. A schematic diagram of the TFA structure is presented in Fig. 1.



Figure 1: Schematic representation of the TFA structure, composed of a p-i-n diode deposited directly on an ASIC.

## *A. Sensor*

The sensor is built on top of the ASIC by consecutive depositions of n-doped, intrinsic and p-doped films forming an n-i-p diode. The pixelized ASIC top metal, which serves as the sensor bottom contacts (anodes), defines the sensor segmentation. In order to keep this segmentation without patterning the n-layer, which is common over the whole ASIC surface, the n-layer is designed with a low conductivity, providing an isolation higher than 10 M$\Omega$. The common top electrode (cathode), deposited on the sensor p-layer, is a Transparent Conductive Oxide (TCO) made from Indium Tin Oxide. The sensing layer, made of hydrogenated amorphous silicon (a-Si:H), is placed between the ASIC top metal and the TCO electrode. This material has been studied over the past 30 years and is widely used in the solar cell industry and in various imaging devices [4]. An attractive feature of a-Si:H sensors is their high radiation hardness [5], which makes them an interesting and promising option for tracking detectors in high-energy physics experiments. However, the most recent results show that more studies of this material are needed before concluding on its potentially higher radiation hardness compared to crystalline silicon [4]. Despite significant progress in the technology of depositing thin-film hydrogenated amorphous silicon on ASICs, the signal charges delivered by such sensors are small, about 37 e-/$\mu$m for a minimum ionizing particle [6]. For a reasonable diode thickness of 15 $\mu$m, fully depleted, one can expect signals of up to 600 e−. Therefore a low-noise and high-gain front-end circuit is of primary importance.

#### *B. Readout electronics*

A schematic diagram of the developed readout circuit is shown in Fig. 2. The circuit is based on a charge sensitive preamplifier built around an unbuffered cascode stage with feedback capacitor  $C_f$  of 1.3 fF, which provides sufficiently high gain of 800 mV/fC in the single stage amplifier.
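As a rough cross-check of these numbers, the sketch below computes the expected signal charge from the sensor yield of 37 e-/$\mu$m quoted above, the ideal charge gain $1/C_f$ of a charge sensitive preamplifier, and the resulting output amplitude. The function names are illustrative, and the ideal $1/C_f$ relation ignores finite open-loop gain and parasitic capacitances, so it is only an approximation of the simulated 800 mV/fC.

```python
Q_E = 1.602e-19  # elementary charge [C]


def mip_signal_electrons(thickness_um, yield_e_per_um=37.0):
    """Signal charge in electrons for a fully depleted sensor."""
    return yield_e_per_um * thickness_um


def ideal_charge_gain_mv_per_fc(c_f_ff):
    """Ideal gain 1/C_f: 1 fC on 1 fF gives 1 V, i.e. 1000 mV."""
    return 1000.0 / c_f_ff


def output_amplitude_mv(n_electrons, gain_mv_per_fc):
    """Output step for a given charge and charge gain."""
    charge_fc = n_electrons * Q_E * 1e15
    return charge_fc * gain_mv_per_fc


gain = ideal_charge_gain_mv_per_fc(1.3)          # C_f = 1.3 fF
signal_10um = mip_signal_electrons(10.0)         # 10 um thick sensor
amplitude = output_amplitude_mv(signal_10um, 800.0)
print(f"1/C_f ~ {gain:.0f} mV/fC, signal = {signal_10um:.0f} e-, "
      f"amplitude ~ {amplitude:.1f} mV")
```

The ideal value of about 770 mV/fC is close to the quoted 800 mV/fC, and a 10 $\mu$m sensor is expected to deliver a signal of a few tens of millivolts at the preamplifier output.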



Figure 2: Schematic diagram of the charge sensitive preamplifier with the soft reset.

The dimensions of the input PMOS transistor M1 are 6 $\mu$m/0.28 $\mu$m, which allows us to keep the gate capacitance ($C_g = 10$ fF) small compared to the total input capacitance ($C_{int} = 40$ fF, including the detector capacitance $C_d$), which determines the noise performance of the circuit.

The preamplifier works as a gated integrator with acquisition time $t_{acq}$ and integration time constant $\tau_i$. The operation sequence starts with the reset phase, when switch $S_{res}$ is open and the reset current $I_{reset}$ flows through the current mirror M5-M6, feeding transistor M4. The gate of this transistor is biased by a constant voltage $V_{bias}$, so transistor M4 is kept in saturation, continuously discharging the feedback capacitor $C_f$. During the reset phase, switch $S_{out}$ stays open and no incoming signals are sent to the preamplifier output. In the next step, the preamplifier operates in the acquisition mode, in which the input signals are amplified and stored at the preamplifier output. In this phase, switch $S_{res}$ is closed and no current flows through transistor M4 in the feedback loop. The input signal is integrated on the feedback capacitor $C_f$ and transferred through switch $S_{out}$ to the output capacitor $C_{out}$. When the preamplifier returns to the reset mode, switch $S_{out}$ opens simultaneously and the preamplifier array is read out. Since capacitor $C_{out}$ is disconnected from the cascode output, the preamplifier is kept in the reset mode while the array of output capacitors is read out by a serial multiplexer. Once the data from the pixel array have been sent out, the output capacitors need to be discharged by a short reconnection to the preamplifier.

During the reset phase, the feedback capacitance is discharged through the transistor biased with a constant current. This is a novel solution compared to the commonly used voltage-controlled reset transistor. We have investigated this new scheme because otherwise the parasitic charge injection from the reset signal into the very small feedback capacitor $C_f$ would lead to saturation of the preamplifier. From this point of view, a small reset current is favorable. On the other hand, the preamplifier stage working in the soft reset regime operates as a transimpedance amplifier with parallel noise sources originating from transistors M4 and M6. For higher reset currents, the gain of the cascode stage working in the reset mode is decreased, and one could expect lower output noise. However, the circuit is more complex than this, since the two switchable modes, reset and acquisition, represent two different signal (and noise) input-to-output transfer functions.

### II. NOISE ESTIMATION

The presented design was optimised for the linear collider application, where the time window in which interesting events may appear is short, in the range of hundreds of nanoseconds. In order to minimize the influence of the sensor leakage current on the readout electronics, the preamplifier should be switched to the acquisition mode only when interesting events arrive at the sensor. During this time window the noise of the front-end electronics needs to be minimized to ensure a high signal-to-noise ratio (SNR). The noise estimation is performed separately for the reset phase and for the acquisition phase in the frequency domain. Since this circuit is time-variant and its input-to-output transfer function depends on the actual mode of the preamplifier operation, the noise calculations are more complex than in the case of time-invariant circuits. Therefore, we have employed a simplified model. It is assumed that the SNR in the acquisition phase is determined by two noise components:

- noise generated in the preamplifier during the acquisition phase,
- noise sampled at the reset phase.

The former term is due to the cascode input and load transistors (M1 and M3), as well as to the sensor leakage current. In order to describe the latter term, the following model is assumed: when the preamplifier operates in the reset mode, the noise at the output node is fed back to the input node through the feedback loop. When the preamplifier is switched from the reset to the acquisition mode, the noise at the input node is sampled. Subsequently, this sampled noise is transferred to the output node by the acquisition phase transfer function. The noise calculations were performed by using the models proposed by van der Ziel [7], slightly modified for the weak and moderate inversion regions of the MOS transistor [8].

#### *A. Noise in the acquisition phase*

The main noise sources taken into account originate from the cascode input transistor M1, the cascode load transistor M3 and the detector leakage current. The analysis is based on the van der Ziel expressions for noise power spectral densities [7] and the Enz-Krummenacher-Vittoz (EKV) analytical MOS transistor model [8]. The following noise sources were included in the analysis: for the input transistor M1, the channel thermal noise, the gate induced current (GIC) noise, the flicker noise and the correlation term; for the load transistor M3, the channel thermal noise term only. It is assumed that the overall shaping function of the preamplifier is equivalent to a gated integrator, where the integration time constant is defined by the bandwidth of the unbuffered cascode stage loaded with the input, output and feedback capacitances. A finite readout time cuts off the low-frequency noise components. After a detailed analysis of the circuit, one obtains the approximate formula (1), which describes the frequency transfer function  $K_{acq}(f)$  used further for the calculation of the Equivalent Noise Charge (ENC):

$$
|K_{acq}(f)|^2 = \frac{1}{1 + (2\pi f)^2 \tau_i^2} \frac{(2ft_{acq})^2}{1 + (2ft_{acq})^2},\tag{1}
$$

where  $\tau_i$  is the integration time constant given by formula (2):

$$
\tau_i = \frac{C_{int}C_{out} - C_f(C_{int} + C_{out})}{-C_{int}g_{ds3} + C_f(g_{ds3} + g_{m1})},\tag{2}
$$

Here $g_{m1}$ is the transconductance of transistor M1, $g_{ds3}$ is the output conductance of transistor M3, $C_{int}$ is the estimated total input capacitance including the a-Si sensor capacitance, the parasitic capacitances extracted from the layout and the gate capacitance of the input transistor M1, $C_{out}$ is the estimated total output capacitance and $t_{acq}$ is the duration of the acquisition phase. Assuming that the acquisition time $t_{acq}$ is much longer than the integration time constant $\tau_i$, the charge gain of the circuit depends only on the value of the feedback capacitor $C_f$.
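The band-pass character of formula (1) can be seen by evaluating it numerically. The sketch below is only an illustration: the function name is hypothetical, and the values $\tau_i = 80$ ns and $t_{acq} = 1$ $\mu$s are the nominal ones quoted in the text.

```python
import math


def k_acq_squared(f_hz, tau_i_s, t_acq_s):
    """Squared magnitude of the acquisition transfer function, formula (1)."""
    # Low-pass term set by the integration time constant tau_i
    low_pass = 1.0 / (1.0 + (2.0 * math.pi * f_hz) ** 2 * tau_i_s ** 2)
    # High-pass term set by the finite acquisition time t_acq
    x = (2.0 * f_hz * t_acq_s) ** 2
    high_pass = x / (1.0 + x)
    return low_pass * high_pass


TAU_I = 80e-9   # integration time constant [s]
T_ACQ = 1e-6    # acquisition time [s]

for f in (1e4, 1e5, 1e6, 1e7, 1e8):
    print(f"f = {f:9.0f} Hz  |K_acq|^2 = {k_acq_squared(f, TAU_I, T_ACQ):.4f}")
```

Frequencies well below $1/(2 t_{acq})$ are suppressed by the finite acquisition time, and frequencies well above $1/(2\pi\tau_i)$ are suppressed by the limited bandwidth of the cascode stage.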

Fig. 3 shows the ENC calculated as a function of the input transistor bias current $I_D$ for an acquisition time $t_{acq}$ of 1 $\mu$s and the following capacitance values, extracted from the layout and estimated for the 10 $\mu$m thick a-Si sensor: $C_{int} = 40$ fF (including a detector capacitance of 5.5 fF), $C_f = 1.3$ fF, $C_{out} = 120$ fF.



Figure 3: Calculated noise originating from the acquisition phase. Acquisition duration 1  $\mu$ s.

The nominal bias current of 2  $\mu$ A has been chosen as a compromise between power consumption, which is about 10  $\mu$ W, and the integration time constant  $\tau_i$ , equal to 80 ns, which defines minimum readout time  $t_{\text{acq}}$  and consequently, the sensitivity of the ENC to the parallel noise sources. For the nominal parameters of the circuit described above, the expected ENC is below 16 e−.

The noise related to the detector leakage current was calculated for three acquisition times: 1 $\mu$s, 0.5 $\mu$s and 0.3 $\mu$s. The ENC versus detector leakage current is presented in Fig. 4. The preamplifier is optimized assuming a maximum sensor leakage current of 10 pA. This noise should be compared with the noise originating from the reset phase.
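A simple order-of-magnitude estimate of the leakage current contribution can be obtained from Poisson statistics: an ideal gated integrator collects on average $I_{leak} t_{acq} / q$ electrons, and the fluctuation is the square root of that number. This textbook estimate is an assumption made here for illustration; it is not the calculation behind Fig. 4, which uses the full transfer function.

```python
import math

Q_E = 1.602e-19  # elementary charge [C]


def shot_noise_enc(i_leak_a, t_acq_s):
    """ENC in electrons: square root of the mean number of collected electrons."""
    return math.sqrt(i_leak_a * t_acq_s / Q_E)


I_LEAK = 10e-12  # assumed maximum sensor leakage current [A]
for t_acq in (1e-6, 0.5e-6, 0.3e-6):
    enc = shot_noise_enc(I_LEAK, t_acq)
    print(f"t_acq = {t_acq * 1e6:.1f} us: ENC ~ {enc:.1f} e-")
```

In this simplified picture the contribution scales with the square root of both the leakage current and the acquisition time, which is why short acquisition windows are preferred.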



Figure 4: Calculated noise originating from detector leakage current.

#### *B. Noise in the reset phase*

The noise in the reset phase is due to the channel thermal noise of transistors M1, M3, M4 and M6. During the reset phase the feedback transistor M4 is biased with the reset current $I_{reset}$. Therefore, the preamplifier transfer function $K_{res}(f)$ differs from that of the acquisition phase, $K_{acq}(f)$, defined by (1). Note that $K_{res}(f)$ strongly depends on the reset current $I_{reset}$, which sets the feedback transistor transconductance $g_{m4}$ and consequently the active feedback resistance. Taking into account the equations describing the spectral densities of the channel thermal noise of the MOS transistors listed above, and applying the preamplifier transfer function $K_{res}(f)$, one can calculate the root mean square (RMS) value of the noise that is fed back from the output to the input node. Subsequently, this value is transferred to the preamplifier output node by using $K_{acq}(f)$. This noise value, expressed in ENC, is presented as a function of the reset current $I_{reset}$ in Fig. 5. One can conclude that, in order to minimize the total noise in the acquisition phase, the reset current should be set to low values.



Figure 5: Calculated ENC originating from the reset phase as seen in the acquisition phase.

## III. EXPERIMENTAL RESULTS

A prototype chip, called Amorphous Frame Readout Pixel (AFRP), has been designed and manufactured in 0.25  $\mu$ m CMOS process. The photo of AFRP demonstrator chip is presented in Fig. 6.



Figure 6: AFRP chip.

The device contains an array of 64 by 64 pixels, each with an area of 40 $\mu$m by 40 $\mu$m, read out serially through a multiplexer. Due to the limited number of metal layers on the prototype chip, the active area of the input electrode is only 20 $\mu$m by 20 $\mu$m. The analog and digital grounds and power supply buses are separated to reduce the noise in the preamplifier. Two clock signals to read out the rows and columns of the chip, as well as a 10 MHz master readout clock, are supplied externally using the Low Voltage Differential Signaling (LVDS) standard. The readout time of the chip is less than 2.5 ms (600 ns/pixel). The 10 $\mu$m thick a-Si sensor was deposited directly on the AFRP chip surface. The deposition was done at the Institute of Microengineering (IMT, EPFL/STI) in Neuchatel using a Plasma Enhanced Chemical Vapor Deposition process.
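The quoted readout time follows directly from the array size and the per-pixel readout time, as the short check below shows. The variable names are illustrative only.

```python
N_ROWS = 64
N_COLS = 64
T_PIXEL_NS = 600  # readout time per pixel [ns]

n_pixels = N_ROWS * N_COLS
readout_time_ms = n_pixels * T_PIXEL_NS * 1e-6  # ns -> ms

print(f"{n_pixels} pixels x {T_PIXEL_NS} ns = {readout_time_ms:.4f} ms")
```

The result, about 2.46 ms for 4096 pixels, is consistent with the stated upper limit of 2.5 ms.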

## *A. Noise performance of the bare chip*

A bare AFRP chip was tested to characterise the noise performance of its 4096 pixels. The noise map of the 64 by 64 matrix of pixels is presented in Fig. 7.



Figure 7: Bare chip noise map for  $I_{\text{reset}} = 10 \text{ nA}$  and  $t_{\text{acq}} = 1 \mu\text{s}$ .

The noise of each pixel is the RMS value of its output voltage over 200 full chip scans, after subtraction of the voltage pedestal (the mean value of the 200 measurements), expressed in ENC. The noise performance of the raw pixels is homogeneous over the whole chip area. The noise distribution of the 4096 pixels, presented in Fig. 8, shows an ENC mean value of 49 e- and a standard deviation $\sigma$ of 4 e-.
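The per-pixel noise evaluation described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the measurements: the data layout `scans[k][row][col]`, the function names and the small example array are hypothetical, and dividing by the number of samples (population RMS) is an assumption. The conversion to electrons uses the calibrated gain of 713 mV/fC reported in this paper.

```python
import math

GAIN_MV_PER_FC = 713.0         # calibrated gain reported in this paper
ELECTRON_CHARGE_FC = 1.602e-4  # elementary charge expressed in fC


def pedestal_and_enc(samples_mv, gain_mv_per_fc=GAIN_MV_PER_FC):
    """Return (pedestal in mV, noise in electrons) for one pixel."""
    n = len(samples_mv)
    pedestal = sum(samples_mv) / n
    # Population RMS around the pedestal (dividing by n is an assumption).
    rms_mv = math.sqrt(sum((v - pedestal) ** 2 for v in samples_mv) / n)
    enc = rms_mv / gain_mv_per_fc / ELECTRON_CHARGE_FC
    return pedestal, enc


def noise_map(scans):
    """scans[k][row][col] is the output voltage (mV) of pixel (row, col) in scan k."""
    n_rows, n_cols = len(scans[0]), len(scans[0][0])
    return [
        [pedestal_and_enc([scan[r][c] for scan in scans])[1] for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical 2 x 2 pixel example with four scans
scans = [
    [[105.0, 100.0], [98.0, 100.0]],
    [[95.0, 100.0], [102.0, 100.0]],
    [[105.0, 100.0], [98.0, 100.0]],
    [[95.0, 100.0], [102.0, 100.0]],
]
enc_map = noise_map(scans)
print(enc_map)
```

With this gain, 1 mV RMS at the output corresponds to roughly 9 e-, so the measured mean ENC of 49 e- is equivalent to an output noise of a few millivolts.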



Figure 8: Bare chip noise spread over 4096 pixels for  $I_{reset}$  = 10 nA and  $t_{\text{acq}} = 1 \mu s$ .

The averaged noise dependence on reset current is shown in Fig. 9.



Figure 9: Calculated and measured noise comparison for the bare AFRP1 chip.

#### *B. Noise performance of TFA*

The noise performance of the TFA structure was investigated in the same way as for the bare AFRP chip. The chip noise maps for various values of acquisition time  $t_{\text{acq}}$  are presented in Fig. 10a – 10c. These noise maps show much larger spread across the pixel array compared to bare AFRP. This effect is even more pronounced for the longer acquisition times, which indicates that the spread is mainly due to variation of the sensor leakage current.



Figure 10: TFA structure noise map for reset current of 10 nA and a-Si diode bias of 55V.



Figure 11: TFA structure noise spread on 4096 pixels for diode bias of 55V.

Figures 11a – 11c show the noise distributions for various acquisition times. Besides relatively narrow peaks, we observe long tails corresponding to pixels with a noise much higher than the average. These effects are most likely related to leakage current variations, probably originating from non-uniformity of the sensor–ASIC interface. This issue needs to be investigated further. Fig. 12 shows a comparison of calculated and measured noise, taking the most probable values of the measured ENC distributions.



Figure 12: Calculated and measured most probable noise as a function of reset current. The 10  $\mu$ m thick a-Si diode was biased with 55 V.

## *C. Results obtained with a 405 nm blue laser*

Signals from a 405 nm blue laser were measured on a 10 $\mu$m a-Si diode reverse biased with voltages from 5 V to 60 V. In order to minimize the influence of the sensor leakage current on the front–end noise performance, the acquisition time was set to a short value of 300 ns. During this time period the blue laser was triggered and the signals were read out from the 4096 pixels. Twenty full chip scans were made with the laser pulse fired at the sensor surface. In order to illustrate the sensor response  $V_{signal}$  to the laser pulse, the pedestals, measured with no incoming laser pulses, were subtracted. The response of the fully depleted sensor, averaged over 20 scans, is presented in Fig. 13, clearly showing the laser pulse illumination map on our imaging device.



Figure 13: Signals from 405 nm blue laser, obtained on 10  $\mu$ m thick a-Si diode biased with voltage of 55V.

The full depletion bias voltage of the 10 $\mu$m thick sensor was measured by recording the maximum response $V_{\text{max signal}}$ to the laser pulse for varying diode reverse bias voltages $V_{a-Si \text{ bias}}$. This method, demonstrated in [4], is based on the variation of the induced current with depletion thickness in a-Si:H sensors. As shown in Fig. 14, the sensor response to a blue laser pulse increases as the square root of the applied voltage and starts to saturate at a bias voltage of 55 V. A further increase of $V_{a-Si \text{ bias}}$ does not increase the measured signal $V_{\text{max signal}}$.



Figure 14: a-Si diode maximum response to the 405 nm blue laser pulse for  $I_{reset} = 100$  nA,  $t_{acq} = 0.3 \mu s$ .

This result agrees with [4], where the full depletion bias voltage for an a-Si sensor of thickness d is estimated as  $0.48 \times d^2$, leading to about 48 V for a 10  $\mu$m thick diode. The response of the TFA structure to the 405 nm blue laser was calibrated by comparison with the response obtained on TFA sensors developed on the MacroPad chip [9], for which the calibration factors are known from X-ray measurements. As a result, a gain of 713 mV/fC was found (simulated gain:  $800 \text{ mV/fC}$), and this value was used for the ENC calculations presented above.
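The empirical relation from [4] and the gain comparison can be evaluated with a few lines of code. The sketch assumes that the thickness is given in $\mu$m and that the coefficient 0.48 carries units of V/$\mu$m$^2$; the function name is illustrative.

```python
def asi_full_depletion_voltage(thickness_um, coeff_v_per_um2=0.48):
    """Empirical estimate V_fd = 0.48 * d^2 (d in um), after [4]."""
    return coeff_v_per_um2 * thickness_um ** 2


v_fd_10um = asi_full_depletion_voltage(10.0)
v_fd_15um = asi_full_depletion_voltage(15.0)
gain_ratio = 713.0 / 800.0  # measured gain relative to simulated gain

print(f"V_fd(10 um) ~ {v_fd_10um:.0f} V, V_fd(15 um) ~ {v_fd_15um:.0f} V")
print(f"measured / simulated gain = {gain_ratio:.2f}")
```

Because of the quadratic dependence, a 15 $\mu$m thick diode would need roughly twice the bias voltage of the 10 $\mu$m diode used here, while the measured gain is about 10% below the simulated one.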

The average leakage current per pixel was measured as a function of the a-Si:H reverse bias voltage. As shown in Fig. 15, for the fully depleted a-Si sensor, biased with 55 V, the leakage current per pixel is about 1 nA, which is much higher compared to the assumed level of 10 pA.



Figure 15: 10  $\mu$ m thick a-Si sensor average leakage current per pixel.

## IV. CONCLUSIONS

A 64 by 64 pixel array based on TFA technology was designed, manufactured and tested. The readout electronics has a gain of 713 mV/fC and a power consumption of 10 $\mu$W/pixel. The noise of the bare AFRP ASIC is higher than expected, but it is satisfactory taking into account the expected signal from the a-Si sensors. The noise performance of the present TFA prototype is limited by the leakage current. Since the preamplifier was designed and optimized for a sensor leakage current of 10 pA, the readout electronics cannot cope with a 1 nA leakage current for acquisition times longer than 1 $\mu$s. The TFA structure was tested with 405 nm blue laser pulses, which were triggered precisely during the acquisition phase of the preamplifier. The precise image of the laser spot provides a solid proof of principle for the developed novel pixel detector concept. The main problem of the TFA structure is the excessive leakage current, which strongly depends on the quality of the a-Si–ASIC interface. Therefore, further improvements of the sensor deposition on the ASIC, including planarization of the ASIC surface, are needed.

#### **REFERENCES**

- [1] J.E. Brau, *The science and challenges for future detector development in High-Energy Physics*, SNIC symposium, Stanford (2006).
- [2] M. Moll *et al.*, *Nucl. Instr. And Meth. In Phys. Res.* **A546** (2005) 99-107.
- [3] P. Jarron *et al.*, *Nucl. Instr. And Meth. In Phys. Res.* **A518** (2004) 366-372.
- [4] M. Despeisse *et al.*, *IEEE Transactions On Nuclear Science* **Vol. 55, No. 2** (2008) 802-811.
- [5] N. Kishimoto *et al.*, *Journal of Nuclear Materials* **258-263** (1998) 1908-1913.
- [6] R. Aleksan *et al.*, *Nucl. Instr. And Meth. In Phys. Res.* **A305** (1991) 512-516.
- [7] A. van der Ziel, *Noise in Solid State Devices and Circuits*, Wiley, New York, (1986).
- [8] C. Enz, *et al.*, *Analog Integrated Circuits and Signal Processing* **8** (1995) 83-114.
- [9] M. Despeisse *et al.*, *Nucl. Instr. And Meth. In Phys. Res.* **A518** (2004) 35.

# *TUESDAY 22 SEPTEMBER 2009*

# *PARALLEL SESSION B1 SYSTEMS, INSTALLATION AND COMMISSIONING*

# Commissioning of the CMS DT electronics under magnetic field

C. Fernández-Bedoya<sup>a</sup>, G. Masetti <sup>b</sup> (on behalf of the CMS DT collaboration)

<sup>a</sup> CIEMAT, Madrid, Spain <sup>b</sup> Università & INFN sezione di Bologna, Italy

cristina.fernandez@ciemat.es

# *Abstract*

After several months of installation and commissioning of the CMS (Compact Muon Solenoid) DT (Drift Tube) electronics, the system has finally been operated under magnetic field during the so-called CRAFT (Cosmic Run at Four Tesla) exercise.

Over 4 weeks, the full detector has been running continuously under magnetic field and managed to acquire more than 300 million cosmic muons. The performance of the trigger and data acquisition systems during this period has been very satisfactory. The main results concerning stability and reliability of the detector are presented and discussed.

# I. THE CMS BARREL DRIFT TUBE SYSTEM

The Compact Muon Solenoid (CMS) [1] is a general-purpose detector designed to run at the highest luminosity at the LHC. The central feature of the CMS apparatus is a superconducting solenoid of 6 m diameter that generates a magnetic field of up to 4 Tesla. Such a high field was chosen to allow the construction of a compact tracking system inside the solenoid while still providing good muon tracking outside it.

Muons are measured in CMS by means of three different technologies of gaseous detectors. In the barrel, where the magnitude of the residual magnetic field is of the order of 2 Tesla in the iron return yoke and the neutron background and muon rate are expected to be as low as a few Hz/cm<sup>2</sup>, DTs (Drift Tubes) are used [2]. The drift tube chambers are responsible for muon detection and precise momentum measurement over a wide range of energies. The DT system also provides a reliable and robust trigger system with precise bunch crossing assignment, complemented by a set of Resistive Plate Chambers (RPC) which provides redundancy in the trigger.

The DT chambers are installed in the five wheels of the return yoke of the CMS magnet (named YB-2, YB-1, YB0, YB+1 and YB+2). Each wheel is divided into 12 sectors, each covering ~30º around the interaction point, and each sector is organized in four stations of DT chambers named MB1, MB2, MB3 and MB4 going from inside to outside, where MB stands for Muon Barrel. There are a total of 250 DT chambers in CMS. A schematic view of one CMS wheel is shown in figure 1.

A DT chamber is made of three (or two in MB4) Superlayers (SL), each made of four layers of rectangular drift cells staggered by half a tube width. The wires in the innermost and outermost SLs are parallel to the beam line and provide the track measurement in the magnetic bending plane $(r, \phi)$. In the central SL, the wires are orthogonal to the beam line and measure the $\theta$ position along the beam. The central $\theta$-measuring SL is not present in the MB4 chambers, which therefore measure only the φ coordinate.

The basic element of the DT chamber is the drift tube, which has a cross section of 13 by 42 mm. The total number of sensitive cells is around 172,000. Any charged particle going through a cell volume generates a signal (hit) on its anode wire, which is amplified and discriminated by the front-end electronics before being sent to the read-out boards for time digitization. The position of the charged particle can be derived from the time measurement, since the drift velocity in the cell volume is constant. Each cell provides a resolution of 250 µm, and the 100 µm target chamber resolution is achieved with the 8 track points measured in the two (r-φ) SLs.
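The relation between drift time and position, and the way eight measured points improve on the single-cell resolution, can be illustrated with a short sketch. The drift velocity used below is an assumed, purely illustrative value (the text only states that it is constant), the 1/√N scaling assumes independent and equally precise measurements, and the function names are hypothetical.

```python
import math

# Assumed, illustrative drift velocity in um/ns (not quoted in the text).
ASSUMED_DRIFT_VELOCITY_UM_PER_NS = 55.0


def drift_distance_um(drift_time_ns, v_drift=ASSUMED_DRIFT_VELOCITY_UM_PER_NS):
    """Distance between track and anode wire for a constant drift velocity."""
    return v_drift * drift_time_ns


def combined_resolution_um(single_cell_resolution_um, n_points):
    """Resolution of n independent, equally precise measurements."""
    return single_cell_resolution_um / math.sqrt(n_points)


if __name__ == "__main__":
    print(drift_distance_um(100.0))              # distance for a 100 ns drift time
    print(combined_resolution_um(250.0, 8))      # 8 points at 250 um each
```

With a single-cell resolution of 250 µm and eight points, this simple estimate gives roughly 88 µm, which is compatible with the 100 µm target quoted above.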



Figure 1: Transverse view of a CMS Barrel Yoke Wheel.

# *A. DT Read-out Electronics*

DT read-out electronics is designed to perform time measurement of the chamber signals that will allow the reconstruction of charged particle tracks. There are several levels of data merging in order to achieve a read-out of the full detector at a Level-1 trigger rate of 100 kHz.

A schematic view of the read-out chain is shown in figure 2. The first elements are the ROBs (Read Out Boards), based on the HPTDC (High Performance Time to Digital Converter) ASIC, which perform the time digitization of the hits coming from the chambers and assign them to the Level-1 trigger. They transmit their data through a ~30 m copper link to the 60 ROS (Read Out Server) boards located in the tower racks in the cavern. The ROS boards merge the information from one sector and perform several data reduction and data quality monitoring tasks. Each sector event is retransmitted through an optical link to the DDU (Device Dependent Unit) boards located in the counting room. The DDU boards merge data from up to 12 ROS to build an event fragment and send it to the global CMS DAQ through an S-LINK64 output at 320 MBps. These boards also perform error detection on the data and send fast feedback to the TTS (Trigger Throttling System).
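The figures quoted for the read-out chain imply a simple bandwidth budget, sketched below. The calculation assumes that 1 MB means 10^6 bytes and that the S-LINK64 output is the only limit; the function names are hypothetical and the result is an upper bound on the mean fragment size, not a measured value.

```python
# Figures quoted in the text.
LEVEL1_RATE_HZ = 100e3        # maximum Level-1 trigger rate
SLINK64_BYTES_PER_S = 320e6   # S-LINK64 output, assuming 1 MB = 10**6 bytes
N_ROS = 60                    # ROS boards in the tower racks
MAX_ROS_PER_DDU = 12          # ROS boards merged by one DDU


def min_ddu_boards(n_ros=N_ROS, ros_per_ddu=MAX_ROS_PER_DDU):
    """Smallest number of DDUs able to serve all ROS boards (ceiling division)."""
    return -(-n_ros // ros_per_ddu)


def mean_fragment_budget_bytes(link_rate=SLINK64_BYTES_PER_S,
                               trigger_rate=LEVEL1_RATE_HZ):
    """Largest mean event-fragment size one DDU output can sustain."""
    return link_rate / trigger_rate


if __name__ == "__main__":
    print(min_ddu_boards())               # DDUs needed for 60 ROS
    print(mean_fragment_budget_bytes())   # bytes per event per DDU
```

Under these assumptions at least five DDUs are needed, and each can ship on average 3200 bytes per event at a 100 kHz trigger rate.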



Figure 2: Schematic view of the DT Read-Out chain.

# *B. DT Trigger Electronics*

The purpose of the DT trigger system is to provide muon identification and precise momentum measurement, as well as bunch crossing identification. It provides an independent Level-1 muon trigger to the experiment, selecting the four best muon candidates on each event.

The first level of the DT trigger chain is located inside the so-called Minicrates, an aluminium structure attached to the DT chambers that houses the ROBs, the Chamber Control Board (CCB) and the first level of the trigger electronics: the Trigger Boards (TRB) and the Server Boards (SB).

TRBs contain the Bunch Crossing and Track Identifier (BTI), which provides independent segments from each chamber SL, and the Track Correlator (TRACO), which correlates φ segments in the same chamber by requiring a spatial matching between segments occurring at the same bunch crossing. TRB output signals are fed to the SB which selects the best two tracks from all TRACO candidates.

Track segments are sent to the Sector Collector boards in the tower racks, which perform trigger synchronization and send the encoded information of position, transverse momentum and track quality through high-speed optical links to the DT Track Finder (DTTF) in the counting room. The DTTF is divided into φ and η track finders that build full muon tracks and forward the data to the wedge and muon sorters, which provide the best four muon candidates to the global muon trigger. There are different spy modes all along the chain in order to verify the correctness of the data.



Figure 3: Schematic view of the DT Trigger chain.

# II. DETECTOR INSTALLATION AND COMMISSIONING

The 250 chambers which form the complete CMS Barrel DT System were assembled in four production laboratories (RWTH Aachen, CIEMAT Madrid, INFN Padova and INFN Torino), which shared the work according to the four different chamber types. In parallel to chamber assembly, parts assembly and electronics design were carried out in other laboratories (INFN Bologna – IHEP Protvino, INFN Torino – JINR Dubna, RWTH Aachen, IHEP Beijing, CIEMAT Madrid and INFN Padova). All parts of the DT readout and trigger electronics were extensively tested before and after installation. Construction of the chambers started in January 2002 and was completed in June 2006. Installation in the five Yoke wheels of the CMS detector started on the surface in July 2004 and was completed in the cavern in October 2007.

The commissioning of the DT Barrel System has been a long process, running at various stages in parallel to chamber production and installation. Besides dedicated test beam runs taken on prototypes and on final detectors prior to and during chamber construction [3][4][5][6][7], system commissioning was performed through the following phases:

- 1. Test of constructed chambers with large cosmic data samples at production sites before shipment to CERN, with final front-end electronics and temporary trigger and readout electronics. The tests covered gas tightness, efficiency, dead and noisy channels, and resolution;
- 2. Full dressing of the chambers with final onboard trigger and readout electronics (Minicrates). Tests repeated as in point 1 prior to installation, pairing to Resistive Plate Chambers (RPC) and survey on a dedicated alignment bench in order to determine wire positions with respect to the external reference marks of the general CMS barrel alignment system (built by the groups of Universidad de Cantabria, Santander and KFKI Budapest);
- 3. Installation and full test of each installed chamber through a cosmic test stand, with temporary cabling and local data acquisition system;
- 4. From April 2006, as integration progressed and the final cabling for powering and data transfer became available, chamber commissioning turned into sector commissioning, where four stations could be operated and read out together, thus also allowing the tracking of cosmic muons between different chambers.

- 5. The subsequent step of the detector commissioning was the so-called wheel commissioning, where all sectors in a whole wheel were tested and commissioned together. In November 2007 the read-out and trigger of one full wheel was achieved and by May 2008 the five wheels were finally operating together.
- 6. These commissioning periods were interspersed with global runs in which larger parts of the CMS detector were integrated and operated together. These global data-taking runs were carried out both with and without the magnetic field.

Measurements performed in the DT chambers during the commissioning phase included the identification of local tracks generated by cosmic muons, calibration patterns, and the measurement of the drift velocity and of the time pedestal for synchronization purposes. It is worth noting that only 0.2% of all the DT channels were found dead after the final detector installation and commissioning.

The DT Detector Control System (DCS) has been evolving together with the integration of the electronics. At present, all basic parts can be configured and monitored in an easy and flexible way, and further work is being done in order to present all the status information in a concise and comprehensible way. The same has happened with the online monitoring software, which at present allows subsystem shifters to check the quality of the data as they are produced by the detector and to provide fast feedback. Many plots are available to study detector efficiency, data integrity and trigger performance; more importantly, the summary of the status of the detector has been distilled into a limited number of concise plots.

# III. OPERATION UNDER MAGNETIC FIELD DURING THE CRAFT EXERCISES

The first data taking exercise with the CMS magnetic field was the Magnet Test and Cosmic Challenge (MTCC) during summer-autumn 2006 [8]. This challenge was the very first global exercise for CMS, in which 5% of the full system was installed and operated together in the surface hall, and it was the predecessor to the CRAFT (Cosmic Run at Four Tesla) exercises described here.

CRAFT exercises were two extended global data taking periods conducted in the experimental cavern with the magnetic field of the CMS detector on and with all the final systems in place: CRAFT08 ended on November 11<sup>th</sup> 2008 and CRAFT09 ended on September 1<sup>st</sup> 2009. These monthlong data-taking challenges had the following goals:

- Test the solenoid magnet at nominal field (3.8 T) in situ with the CMS experiment in its final installed configuration underground.
- Gain experience operating CMS continuously for one month.
- Collect more than 300 million cosmic triggers for performance studies of the CMS detectors.

These goals were successfully met and the cosmic muon dataset collected has proven invaluable for understanding the performance of the CMS experiment as a whole. During these campaigns, 100% of the DT system was operational and, owing to its favourable location for cosmic detection, the contribution of the DT system to the global campaign was extremely relevant. During CRAFT08, 370 million cosmics were collected, and 83% of these events were triggered and read out by the DT system. In CRAFT09 the collection increased to 523 million cosmics, 92% of them acquired by the DT system.

Since cosmic muons cross the detector from top to bottom, a dedicated configuration and synchronization was set up in the DT trigger chain. It required a coincidence of at least two chambers in the same or neighbouring sectors, without requiring the muon tracks to point to the nominal interaction point. In addition, the upper sectors were delayed with respect to the bottom ones to account for the time of flight, so that cosmic muons crossing both top and bottom sectors are triggered at the same bunch crossing.
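The size of the delay applied to the upper sectors can be estimated with a minimal sketch. It assumes that the muon travels at essentially the speed of light and uses the nominal 25 ns LHC bunch spacing; the 14 m top-to-bottom path length is a hypothetical value chosen for illustration only, not a figure from the text.

```python
SPEED_OF_LIGHT_M_PER_NS = 0.299792458
BUNCH_CROSSING_NS = 25.0  # nominal LHC bunch spacing


def time_of_flight_ns(path_length_m):
    """Flight time of a relativistic muon (speed taken as c)."""
    return path_length_m / SPEED_OF_LIGHT_M_PER_NS


def delay_in_bunch_crossings(path_length_m):
    """Delay for the upper sector, rounded to whole bunch crossings."""
    return round(time_of_flight_ns(path_length_m) / BUNCH_CROSSING_NS)


if __name__ == "__main__":
    hypothetical_path_m = 14.0  # illustrative top-to-bottom distance
    print(time_of_flight_ns(hypothetical_path_m))
    print(delay_in_bunch_crossings(hypothetical_path_m))
```

For path lengths of this order the flight time is a few tens of nanoseconds, so the required delay amounts to one or two bunch crossings.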

During these data taking periods we could confirm that the DT trigger rates were very stable with time. Since the cosmic rate in the cavern underground is low, random triggers were also injected in order to stress the system to the 100 kHz maximum expected during LHC running. No problems were seen in the DT read-out system when running at high trigger rate; data integrity was not affected and no backpressure or bottlenecks were detected in the read-out path.

Calibration events (around 100 Hz) were also injected during data taking. In the DT system the calibration mechanism works through so-called Test Pulses, in which signals are injected at the front-end level to simulate vertical tracks orthogonal to the chamber. This procedure allows inter-channel synchronization and is also a useful tool to scan for dead channels along the whole electronics chain. One of the goals of these campaigns was to verify that the calibration mechanism can be run within the roughly 2.5 µs of the LHC orbit abort gap, so that no dedicated running period would be needed. After a few corrections to the timing configuration to avoid leaks outside the orbit gap, the calibration stream operated very satisfactorily in both CRAFT exercises.

The DT system demonstrated high reliability and stability during these long data taking periods, in which it was operated continuously. Chambers and electronics were always powered on, except during magnet ramps, when the high voltage of the chambers was lowered for safety reasons.

Very few problems were seen during these data taking exercises. By CRAFT08, after one year of operation of the detector in many local and global data taking campaigns, only 1.2% of the detector was lost due to various types of problems, which were fixed during the 2008-2009 shutdown. The main activities during this shutdown included improvement of the secondary back-up copper connection to the Minicrate and reinforcement of the DT safety system in order to move toward a centrally supervised operation scheme.

No issues have been found in the DT electronics when running with the magnetic field on. The only unexpected effect observed was a problem reading the 1-wire temperature sensors in a few ROS boards while ramping up the magnet, which was easily solved with a power cycle of the crate.

The data integrity provided by the DT read-out system during these campaigns has been excellent. The number of events in which some inconsistency has been found is very low: 15 events out of 460 million. The configuration time of the DT DAQ system is below 1 minute, and only very rarely (twice in CRAFT08 and twice in CRAFT09) did an error in the DT read-out force the data acquisition to stop. During these campaigns it was also possible to verify that the TTS mechanism worked satisfactorily and that the DT system recovers smoothly from sporadic errors.

Figure 4 shows the percentage of errors versus run number in CRAFT09 as detected in the ROB/ROS system. These errors can be due to parts being off, lack of communication with Minicrates, transmission problems, etc. It can be seen that the number of errors is very low and that there is no dependence on the magnetic field.



Figure 4: Percentage of errors versus run number in CRAFT09 as seen in the DT read-out chain. The continuous pink line shows the value of the magnetic field.

The hit reconstruction efficiency in the chambers is measured as follows: the track segment built in the chamber is fitted excluding the hits in the layer under study and extrapolated to the cell under consideration, and the presence of a reconstructed hit in that cell is then checked.
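Once the extrapolations have been counted, the efficiency itself is a simple ratio. The sketch below is a minimal illustration of that last step; the binomial uncertainty is an assumption added here (the text does not state how errors are evaluated), the function name is hypothetical and the counts in the usage lines are invented.

```python
import math


def hit_efficiency(n_expected, n_found):
    """Efficiency and binomial uncertainty for one cell or superlayer.

    n_expected: extrapolated segments crossing the cell (hits in the
                layer under study excluded from the segment fit).
    n_found:    cases in which a reconstructed hit was present in the cell.
    """
    if n_expected <= 0:
        raise ValueError("need at least one extrapolated segment")
    if not 0 <= n_found <= n_expected:
        raise ValueError("n_found must lie between 0 and n_expected")
    eff = n_found / n_expected
    err = math.sqrt(eff * (1.0 - eff) / n_expected)
    return eff, err


if __name__ == "__main__":
    # Invented counts, for illustration only.
    eff, err = hit_efficiency(10000, 9850)
    print(f"efficiency = {eff:.4f} +/- {err:.4f}")
```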



Figure 5: Mean efficiency of cell hit detection within a SL computed with respect to the offline reconstructed segment (CRAFT09).

Cell efficiency is flat both with respect to channel number and across the different types and dimensions of chambers. Figure 5 shows the mean cell efficiency averaged over all the cells of each SL; the efficiency is higher than 98%, the inefficiency being partly due to the effect of the I-beams that separate the drift tubes. This efficiency was also very similar in both campaigns, and no significant differences have been seen with and without magnetic field.

The DT local trigger has also shown very good performance. Trigger primitives have quality bits assigned according to the number of drift cells in which aligned hits were found. In each SL an alignment of 3 out of 4 or 4 out of 4 hits is called Low (L) or High (H) quality, respectively. If such alignments are correlated between the two SLs, the quality of the trigger primitive becomes HH, HL or LL.
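The quality assignment described above can be sketched as a small classification function. This is an illustration only, not the BTI/TRACO firmware logic: the function names are hypothetical, and the handling of uncorrelated primitives (reporting the best single-SL quality) is an assumption, since the text only defines the correlated labels.

```python
def superlayer_quality(n_aligned_hits):
    """'H' for 4 of 4 aligned hits, 'L' for 3 of 4, None otherwise."""
    if n_aligned_hits == 4:
        return "H"
    if n_aligned_hits == 3:
        return "L"
    return None


def primitive_quality(hits_sl_a, hits_sl_b, correlated):
    """Quality label of a trigger primitive built from the two phi SLs."""
    qa = superlayer_quality(hits_sl_a)
    qb = superlayer_quality(hits_sl_b)
    if correlated and qa and qb:
        # 'H' sorts before 'L', giving 'HH', 'HL' or 'LL'.
        return "".join(sorted((qa, qb)))
    # Assumption: without correlation, report the best single-SL quality.
    if "H" in (qa, qb):
        return "H"
    if "L" in (qa, qb):
        return "L"
    return None


if __name__ == "__main__":
    print(primitive_quality(4, 3, correlated=True))
```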

As can be seen in figure 6, the measured trigger efficiency is 95% for any trigger quality and 73% for high quality correlated triggers. The efficiency of the position and direction determination by the DT local trigger is not at all affected by the presence of the magnetic field. This efficiency is as expected for cosmic muons and will be higher for LHC running, since the DT trigger system has been designed to trigger on muons synchronized with the beam clock, and the efficiency drops for cosmic muons, which have a random time of arrival with respect to the clock.






Figure 6: DT Local Trigger efficiency with respect to local reconstruction in the phi view for triggers of "any quality" (top) and "high quality" (i.e. HH/HL, bottom) (CRAFT09).

The distribution of the quality of the trigger primitives also remains unaffected by the magnetic field, as can be seen in figure 7 for all the chambers in YB-2.

Finally, no significant differences were observed in the DT system synchronization due to the magnetic field. A variation of the maximum drift time, corresponding to an apparent change of the drift velocity that may occur in the presence of a magnetic field, can degrade the trigger performance, since BTIs are configured to work with the same drift velocity everywhere within the same chamber. Figure 8 shows the difference between the mean of the bunch crossing distribution obtained with and without magnetic field. The largest effect is observed in MB1 in the external wheels YB+2 and YB-2, in agreement with expectations, and is in any case too small to affect the trigger capability.



Figure 7: Distribution of the quality of the trigger primitives for data taken with and without magnetic field in YB-2 (CRAFT08).



Figure 8: Difference between the mean of the bunch crossing distribution with and without magnetic field, as a function of the wheel number, for the four types of muon station (CRAFT08).

Every hit registered by the front-end electronics with a signal above a common threshold of 30 mV (corresponding to 9-10 fC) and not associated with the passage of a particle is considered a noise hit. A cell is defined as noisy if its hit rate is higher than 500 Hz. The number of noisy channels within the DT system has been analysed for different data taking conditions, with different subdetectors participating and for runs with the magnetic field switched off and on. Figure 9 shows the distribution of the cell noise rate in the system, which has an average value of 4 Hz. There are around 20 to 30 noisy cells out of the 172,200 cells, and the distribution is stable for different run conditions and in both CRAFT campaigns. The noisy cells are usually located at the edges of the chambers, where the high voltage cables pass through.
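The noisy-cell definition can be expressed as a short selection, sketched below. Only the 500 Hz threshold comes from the text; the function names, the cell labels and the counts are hypothetical, and the sketch assumes that noise hits have already been separated from hits associated with a particle.

```python
NOISY_THRESHOLD_HZ = 500.0  # definition quoted in the text


def noise_rate_hz(n_noise_hits, live_time_s):
    """Noise hit rate of one cell over the given live time."""
    return n_noise_hits / live_time_s


def noisy_cells(noise_hits_by_cell, live_time_s, threshold_hz=NOISY_THRESHOLD_HZ):
    """Sorted list of cells whose noise rate exceeds the threshold."""
    return sorted(
        cell
        for cell, n_hits in noise_hits_by_cell.items()
        if noise_rate_hz(n_hits, live_time_s) > threshold_hz
    )


if __name__ == "__main__":
    # Invented counts over 10 s of live time: rates of 4, 600 and 500 Hz.
    counts = {"cell_a": 40, "cell_b": 6000, "cell_c": 5000}
    print(noisy_cells(counts, live_time_s=10.0))
```

For scale, 30 noisy cells out of 172,200 corresponds to less than 0.02% of the channels.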

Even though the noise in the system is usually very low, large noise events affecting large regions of the detector have been seen sporadically during both campaigns. These events are independent of the magnetic field and occur very rarely, at intervals of the order of days. They affect neither the chamber performance nor the electronics chain, but further studies are ongoing in order to understand their source.



Figure 9: Distribution of the cell noise rate for different conditions of data taking (CRAFT08).

# IV. CONCLUSION

The Drift Tube system is an example of a very large and complex system that is at present working in a very efficient and stable way through long periods of data taking. The quality of the data acquired during the CRAFT campaigns is very good, and data integrity problems are extremely rare. During the whole period the chambers and the local trigger have shown high and stable performance, as expected for cosmic muon detection.

The cosmic data collected during this period have been very valuable for studying the performance of the detector and also for the first physics studies, which are being carried out [9]. The system presented has proven to be ready for the exciting periods ahead, and the whole DT muon barrel community is eagerly waiting for the first LHC collisions.

# V. REFERENCES

[1] CMS Collaboration, The CMS experiment at the CERN LHC. JINST 3 S08004. 2008.

[2] CMS Collaboration, The Muon Project, Technical Design Report, CERN/LHCC 97-32 (1997).

[3] C. Albajar et al., Test beam analysis of the first CMS drift tube muon chamber, 2004 Nucl. Instrum. Meth. A 525 465.

[4] M. Aguilar-Benitez et al., Study of magnetic field effects in drift tubes for the barrel muon chambers of the CMS detector at the LHC, 1998 Nucl. Instrum. Meth. A 416 243.

[5] M. Aguilar-Benitez et al., Construction and test of the final CMS Barrel Drift Tube Muon Chamber prototype, 2002 Nucl. Instrum. Meth. A 480 658.

[6] P. Arce et al., Bunched beam test of the CMS drift tubes local muon trigger, 2004 Nucl. Instrum. Meth. A 534 441.

[7] M. Aldaya et al., Fine synchronization of the muon drift tubes local trigger, 2007 Nucl. Instrum. Meth. A 579 951.

[8] The CMS Collaboration, The CMS Magnet Test and Cosmic Challenge (MTCC Phase I and II). CMS NOTE 2007-005.

[9] M. Chen et al., Measurement of the charge ratio in cosmic rays using global muon reconstruction in CRAFT data. CMS AN-2009/102.

# Data acquisition system for a proton imaging apparatus

V.Sipala<sup>a,b</sup>, M.Brianzi<sup>c</sup>, M.Bruzzi<sup>c,d</sup>, M.Bucciolini<sup>c,e</sup>, G.Candiano<sup>f</sup>, L.Capineri<sup>g</sup>, G.A.P.Cirrone<sup>f</sup>, C.Civinini<sup>c</sup>, G.Cuttone<sup>f</sup>, D.Lo Presti<sup>a,b</sup>, L.Marrazzo<sup>c,e</sup>, E.Mazzaglia<sup>f</sup>, D.Menichelli<sup>c,d</sup>, N.Randazzo<sup>b</sup>, C.Talamonti<sup>c,e</sup>, M.Tesi<sup>d</sup>, S.Valentini<sup>c,g</sup>.

<sup>a</sup> Dipartimento di Fisica, Università degli Studi di Catania, via S. Sofia 64, I-95123, Catania, Italy.

<sup>b</sup> INFN, sezione di Catania, via S. Sofia 64, I-95123, Catania, Italy.

<sup>c</sup> INFN, sezione di Firenze, via G. Sansone 1, I-50019 Sesto Fiorentino (FI), Italy.

<sup>d</sup> Dipartimento di Energetica, Università degli Studi di Firenze, via S. Marta 3, I-50139 Firenze, Italy.

<sup>e</sup> Dipartimento di Fisiopatologia Clinica, Università degli Studi di Firenze, v.le Morgagni 85, I-50134 Firenze, Italy.

<sup>f</sup> INFN, Laboratori Nazionali del Sud, via S. Sofia 62, I-95123, Catania, Italy.

<sup>g</sup> Dipartimento di Elettronica e Telecomunicazioni, Università degli Studi di Firenze, via S. Marta 3, I-50139 Firenze, Italy.

valeria.sipala@ct.infn.it

# *Abstract*

New developments in proton therapy for cancer treatment have led Italian physics researchers to build a proton imaging apparatus consisting of a silicon microstrip tracker to reconstruct the proton trajectories and a calorimeter to measure their residual energy. To meet clinical requirements, the detectors and the data acquisition system should be able to sustain a proton rate of about 1 MHz. The tracker read-out, using ASICs developed by the collaboration, acquires the detector signals and sends data in parallel to an FPGA. The YAG:Ce calorimeter also generates the global trigger. The data acquisition system and the results obtained in the calibration phase are presented and discussed.

# I. INTRODUCTION

Proton therapy is an effective clinical treatment for cancer, as it allows a dose distribution that conforms very closely to the target volume. In order to fully exploit the potential of proton dose release, the dose calculation should be performed with high accuracy. This requires knowledge of the proton stopping power inside the tissues. Up to now this information has been deduced from X-ray Computed Tomography, but the error associated with this procedure is significant. To overcome this problem, proton imaging can be used as a direct method for stopping power determination. Moreover, the same imaging system can be useful for verifying patient positioning.

The aim of the Italian project [1] is to develop a proton imaging system with density and spatial resolution better than 1% and 1 mm, respectively, as clinically required [2]. The apparatus presented reconstructs the map of the electron density by tracking each single proton through the traversed tissue and by measuring its residual energy. Our previous studies [3]-[5] indicate that proton imaging based on tracking individual protons traversing an object from many different directions, and measuring their energy loss and scattering angle, may yield accurate reconstructions of electron density maps with good density and spatial resolution, despite the fundamental limitation of Multiple Coulomb Scattering (MCS).

# II. PROTON IMAGING APPARATUS DESIGN

The proton imaging apparatus developed by the Italian collaboration includes a tracker with four x-y planes based on position sensitive microstrip detectors to determine particle entry and exit point and direction. Each tracker plane consists of two modules with sensors and electronic read-out positioned at 90° to each other.

Downstream of the tracker, a calorimeter is used for the residual energy measurement. It consists of four YAG:Ce crystals, optically separated and mounted in the same housing. Its readout system acquires the information about the residual energy of the particle and generates the trigger signal and the system global event number in order to label each single proton.

The proton energy used in a proton imaging apparatus must be 250-270 MeV in order to cross the entire patient thickness [2]. Moreover, when using the "single tracking technique", the system should be able to sustain a 1 MHz proton beam in order to acquire an image in a fraction of a second.

#### III. DATA ACQUISITION SYSTEM

Figure 1 shows the architecture of the data acquisition system. Before data acquisition starts, all tracker modules must send a trigger enable signal to the trigger generator board. When a single particle traverses all tracker modules and is stopped in the calorimeter, the global trigger signal and the global event number are generated and sent to all tracker modules and to the calorimeter acquisition board. The tracker modules and the calorimeter board then acquire data in parallel and move them to a buffer memory. Finally, the data are transferred through a commercial Ethernet module to a PC in order to reconstruct the most likely path. The data in all tracker modules and in the calorimeter are labelled with the global event number, which is used to associate the data unambiguously with the corresponding single proton crossing.
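The role of the global event number in the off-line association can be illustrated with a minimal event-building sketch. The record layouts, module identifiers and function names below are hypothetical; the only element taken from the text is that tracker and calorimeter data carry the same global event number.

```python
from collections import defaultdict


def build_events(tracker_records, calorimeter_records):
    """Group data by global event number.

    tracker_records:     iterable of (event_number, module_id, hit_strips)
    calorimeter_records: iterable of (event_number, samples)
    """
    events = defaultdict(lambda: {"tracker": {}, "calorimeter": None})
    for event_number, module_id, hit_strips in tracker_records:
        events[event_number]["tracker"][module_id] = hit_strips
    for event_number, samples in calorimeter_records:
        events[event_number]["calorimeter"] = samples
    return dict(events)


def complete_events(events, n_modules):
    """Keep only events seen by every tracker module and by the calorimeter."""
    return {
        number: data
        for number, data in events.items()
        if data["calorimeter"] is not None and len(data["tracker"]) == n_modules
    }


if __name__ == "__main__":
    # Invented records for a two-module example.
    tracker = [(1, "x0", [17]), (1, "y0", [40, 41]), (2, "x0", [99])]
    calorimeter = [(1, [10, 50, 30]), (2, [5, 20, 8])]
    print(complete_events(build_events(tracker, calorimeter), n_modules=2))
```

Keying every record on the event number means the modules can be read out independently and in any order without ambiguity in the association.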



Figure 1: Architecture of the data acquisition system of the proton imaging apparatus being built by the Italian collaboration.

# *A. Tracker read-out*

The tracker module includes a front-end board and a digital board. The detector is a 256-strip silicon microstrip detector produced by Hamamatsu [6], 200 µm thick with a 200 µm pitch. The active area is 53 × 53 mm<sup>2</sup>.

The silicon detector, mounted on the front-end board, is coupled to eight ASICs, each serving 32 front-end channels. The integrated circuit, developed by the collaboration in AMS 0.35 µm CMOS technology, uses a charge sensitive amplifier, a shaper and a comparator to convert the fast current signal from the microstrip crossed by the particle into a digital pulse 300-800 ns wide. The duration of the pulse depends on the amount of energy released by the proton and on the threshold value used. For a fixed threshold value, the Time Over Threshold (TOT) technique therefore also makes it possible to measure the charge released in the silicon detector.

In order to achieve a 1 MHz data acquisition rate, the output signals are sent in parallel to an FPGA located on the digital board, which performs zero suppression and moves the data to a buffer memory. A commercial Ethernet module is used both to transfer data to the central acquisition PC and to control the tracker module DAQ parameters.
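Zero suppression can be pictured as keeping only the addresses of the strips that fired instead of the full 256-bit pattern. The sketch below illustrates the idea in software; it is not the FPGA implementation, and the function name and record layout are hypothetical.

```python
N_STRIPS = 256  # strips per tracker module, as quoted in the text


def zero_suppress(event_number, latch_bits):
    """Return a compact record holding only the strips that fired.

    latch_bits: sequence of N_STRIPS values, truthy where a strip fired.
    """
    if len(latch_bits) != N_STRIPS:
        raise ValueError("expected one bit per strip")
    hit_strips = [strip for strip, bit in enumerate(latch_bits) if bit]
    return {"event": event_number, "strips": hit_strips}


if __name__ == "__main__":
    bits = [0] * N_STRIPS
    bits[17] = bits[18] = 1   # a two-strip cluster, for illustration
    print(zero_suppress(42, bits))
```

Since a single proton typically fires only a few strips per module, storing strip addresses rather than the full pattern greatly reduces the data volume to be buffered and transferred.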

# *B. Calorimeter read-out*

The material chosen for the calorimeter of the proton imaging apparatus is YAG:Ce scintillating crystal. Thanks to its fast scintillation light decay constant (70 ns), this crystal is able to sustain a 1 MHz proton rate. Moreover, the characteristic wavelength of maximum emission (550 nm) allows the crystal to be coupled to a commercial photodiode, which solves the problem of sensitivity to the magnetic field in the gantry. The calorimeter area is $60 \times 60$ cm<sup>2</sup> with a depth of 10 cm, chosen to stop protons of up to 200 MeV.

The read-out system consists of four charge sensitive amplifiers and four shapers. The outputs are sampled by a commercial 14-bit, 50 MHz acquisition board (UltraFast 2-4000 [7]). The number of samples needed to reconstruct the pulse is acquired. The amplitude of the signal, which is proportional to the proton residual energy, is then obtained by interpolating the data.
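The text does not specify which interpolation is used. As one possible illustration, the sketch below estimates the pulse amplitude with a three-point parabolic fit around the largest sample; this choice, the function name and the sample values are assumptions, not the method of the collaboration.

```python
def interpolated_amplitude(samples):
    """Peak amplitude and position from a parabola through three samples.

    Returns (amplitude, position), with position in units of the sampling
    period (20 ns for a 50 MHz sampling clock).
    """
    k = max(range(len(samples)), key=samples.__getitem__)
    if k == 0 or k == len(samples) - 1:
        # Peak at the edge of the window: no neighbours to interpolate with.
        return float(samples[k]), float(k)
    y_prev, y_peak, y_next = samples[k - 1], samples[k], samples[k + 1]
    denom = y_prev - 2.0 * y_peak + y_next
    if denom == 0:
        return float(y_peak), float(k)
    offset = 0.5 * (y_prev - y_next) / denom
    amplitude = y_peak - 0.25 * (y_prev - y_next) * offset
    return amplitude, k + offset


if __name__ == "__main__":
    # Invented samples of a parabolic pulse peaking between two samples.
    samples = [10.0 - (x - 2.3) ** 2 for x in range(5)]
    print(interpolated_amplitude(samples))
```

A parabola is only a local approximation of the shaped pulse near its maximum; a fit to the full pulse shape would be an alternative when more samples are available.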

In the readout system a hybrid charge sensitive amplifier with a short decay time constant will be used. As explained in [8], a low-noise non-inverting amplifier is inserted in a conventional charge-amplifier configuration (see Fig. 2). The discharge current is thereby increased, owing to the larger voltage drop across the feedback resistor $R_F$, and the decay time constant seen at the preamplifier output is reduced. In particular, the charge-to-voltage sensitivity is increased by a factor equal to the gain of the non-inverting amplifier, and the decay constant decreases by the same factor. This configuration allows a high acquisition rate with low noise.
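The scaling described above can be written compactly. Here $C_F$ (feedback capacitance) and $G$ (gain of the non-inverting amplifier) are symbols introduced for illustration, since only $R_F$ is named in the text, and the expressions assume an ideal amplifier:

$$
\left.\frac{V_{out}}{Q}\right|_{\text{conventional}} = \frac{1}{C_F}, \quad \tau_{\text{conventional}} = R_F C_F
\qquad\longrightarrow\qquad
\left.\frac{V_{out}}{Q}\right|_{\text{hybrid}} = \frac{G}{C_F}, \quad \tau_{\text{hybrid}} = \frac{R_F C_F}{G}
$$

The product of sensitivity and decay constant, $R_F$, is therefore unchanged: the output pulse is taller and shorter by the same factor $G$.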



Figure 2: Schematic of the hybrid charge sensitive preamplifier that will be used in the calorimeter readout in order to achieve a high acquisition rate.

# *C. Trigger system*

The trigger signal is generated using the calorimeter outputs: each calorimeter output is compared with a fixed threshold voltage to produce a digital pulse. The four digital pulses are summed so that a trigger signal is produced whenever a proton crosses any one of the four crystals.

The trigger signal forces the acquisition board to store into its local memory the samples of the calorimeter outputs and every FPGA to read its input latches and to store the data in the onboard RAM memory using zero-suppression.

Moreover, the trigger signal increases the counting of the global event number that is attached to all the data generated by the tracker modules and by the calorimeter.

The trigger board has been realized using commercial devices.

#### IV. RESULTS

The proton imaging apparatus is at an advanced stage of construction. The first results, obtained with the front-end board alone, were shown in a previous work [9].

At present, an x-y plane of the tracker (front-end board coupled with the digital board) is ready to be tested with a proton beam. Each tracker module must be calibrated before being tested with the proton beam. The results of the calibration of a single tracker module are presented in this section. The test with a beta source allows us to conclude that the module is fully efficient for released energies lower than those expected in the proton imaging application.

The YAG:Ce calorimeter has been characterized with different proton beam energies: the preliminary results are discussed.

# *A. Calibration phase of tracker module*

During the calibration phase, all 256 microstrips of the detector have been characterized using a test pulse connected to the front-end chip input through an integrated test capacitance. In the latest front-end board prototype, each chip, containing 32 front-end channels, has an adjustable threshold voltage value to allow for a better chip optimization. The strip outputs are acquired by the FPGA, moved to the buffer memory, transferred to the PC and analyzed off-line.

Fig. 3 shows the calibration curves, with the pulse duration as a function of the input charge. With these data and the time-over-threshold (TOT) technique, it is possible to estimate the charge released by the particle inside the detector.



Figure 3: Calibration curves of all detector channels. The pulse duration vs. input charge has been plotted.
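As an illustration of how such calibration data can be used, the sketch below inverts a calibration curve by linear interpolation to turn a measured pulse duration into a charge estimate. The calibration points, the function name `charge_from_tot` and the units are hypothetical placeholders, not values read from Fig. 3, and the off-line analysis actually used may differ.

```python
from bisect import bisect_left

# Hypothetical calibration points for one channel:
# (input charge in MIP, measured pulse duration in ns).
# Real values would come from the test-pulse scan of that channel.
CALIBRATION = [(1.0, 200.0), (2.0, 350.0), (5.0, 600.0), (10.0, 850.0)]


def charge_from_tot(duration_ns, calibration=CALIBRATION):
    """Invert a monotonically increasing calibration curve by linear interpolation."""
    charges = [q for q, _ in calibration]
    durations = [t for _, t in calibration]
    # Clamp to the calibrated range instead of extrapolating.
    if duration_ns <= durations[0]:
        return charges[0]
    if duration_ns >= durations[-1]:
        return charges[-1]
    i = bisect_left(durations, duration_ns)
    t0, t1 = durations[i - 1], durations[i]
    q0, q1 = charges[i - 1], charges[i]
    return q0 + (q1 - q0) * (duration_ns - t0) / (t1 - t0)


print(charge_from_tot(275.0))
```

Because each channel has its own curve, a per-channel table of this kind would be needed in practice; clamping at the ends of the table is a design choice made here to avoid extrapolating beyond the calibrated range.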

Other tests have been performed in order to estimate the threshold voltage dispersion and the minimum input charge that can be detected with our tracker module.

In Fig. 4 the efficiency of all channels is shown as a function of the threshold voltage value for a fixed input charge  $(Q=5MIP)$ . The tracker plane is fully efficient even at a threshold voltage 240mV above the baseline. The threshold dispersion within the full module can be estimated to be of the order of 70mV.



Figure 4: Plot of the efficiency for all channels vs. the threshold voltage value: the module is fully efficient up to ∆Vth<240mV.

Fixing the threshold voltage values at the minimum level above the noise, the efficiency curves have been plotted for different input charge values in order to estimate the minimum detectable input charge. As shown in Fig. 5, the system is fully efficient when a charge equivalent to the most probable charge released by a MIP is injected. A MIP in 200µm of silicon creates about 15000 electrons.



Figure 5: Plot of the efficiency for all channels vs. the input charge value: the module is fully efficient for injected charge greater than  $15000\,e^{-}$  (= 1 MIP).

# *B. Acquisition with beta source*

The single tracker module has been tested using a beta source  $(^{90}Sr)$ . A low-noise scintillator has been placed downstream of the detector in order to generate the trigger signal. A total of about 100000 events have been acquired at a maximum rate of about 20kHz. Using the pulse duration and the calibration data, the charge released in each single channel has been calculated.

Moreover, the time dependence between the trigger signal and the pulse delay has been studied. The distribution of the strip pulse start time with respect to the trigger is shown in Fig. 6. Most of the counts are located before the trigger signal. The trigger signal is faster than the strip pulses, so it is necessary to acquire in pre-triggering mode. The FPGA samples the signals continuously but sends to the buffer memory only the samples in the time window centred on the trigger signal.



Figure 6: Time distribution of a single tracker channel obtained with a beta source. The histogram maximum is located before the trigger signal.
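The pre-triggering acquisition described above can be sketched in software as a ring buffer that is overwritten continuously and copied out only when a trigger arrives. This is a minimal illustration under stated assumptions: the class name `PreTriggerBuffer`, the window lengths `pre` and `post`, and the choice to ignore a trigger that arrives while a window is still being completed are all hypothetical, and the FPGA firmware is not implemented this way in detail.

```python
from collections import deque


class PreTriggerBuffer:
    """Keep the last `pre` samples; on a trigger, store them plus `post` more."""

    def __init__(self, pre, post):
        self.post = post
        self.history = deque(maxlen=pre)  # continuously overwritten ring buffer
        self.pending = None               # window being completed after a trigger
        self.remaining = 0
        self.windows = []                 # stands in for the buffer memory

    def push(self, sample, trigger=False):
        if self.pending is not None:
            # A window is open: collect post-trigger samples (new triggers ignored).
            self.pending.append(sample)
            self.remaining -= 1
            if self.remaining == 0:
                self.windows.append(self.pending)
                self.pending = None
        elif trigger:
            # Window = pre-trigger history + trigger sample + `post` later samples.
            self.pending = list(self.history) + [sample]
            self.remaining = self.post
            if self.remaining == 0:
                self.windows.append(self.pending)
                self.pending = None
        self.history.append(sample)


buf = PreTriggerBuffer(pre=2, post=2)
for i in range(10):
    buf.push(i, trigger=(i == 5))
print(buf.windows)
```

In this example only the five samples centred on the trigger at sample 5 reach `windows`; all other samples are discarded as the ring buffer wraps.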

For higher released energy, the pulse duration increases and the delay decreases. This effect is clearly visible in Fig. 7, where the counts map of the signal delay with respect to the trigger start is plotted against the pulse duration. The maximum of the counts (red in the figure) shows the delay decreasing as the pulse duration increases.



Figure 7: Counts map of the strip pulse delay versus the pulse duration. The delay decreases as the pulse duration increases.

# *C. Characterization of the calorimeter*

The YAG:Ce calorimeter has been characterized at Laboratori Nazionali del Sud and at Loma Linda University Medical Center with different proton beam energies. Using a standard acquisition system, the crystal response at different proton energies has been observed. As an example, Fig. 8 shows the charge spectrum obtained with a single crystal and a 100MeV proton beam. The system has a good resolution, equal to 2.7%.



Figure 8: Charge spectrum obtained with 100MeV proton beam and a single crystal of the calorimeter: the resolution is equal to 2.7%.

In order to test the linearity of a single crystal, the charge spectra for three different energy values have been acquired. Fig. 9 shows the peak position in the charge spectrum as a function of the proton energy: in the energy range 35-200MeV the response of the crystal is linear to within 1.15%.



Figure 9: Peak position in the charge spectrum vs. the proton beam energy. In the 35-200MeV energy range the linearity is equal to 1.15%.

Each crystal has been characterized and shows a good resolution at different energies and a good linearity in the 35-200MeV energy range.

# V. CONCLUSIONS

A proton imaging apparatus is being built by the Italian collaboration. The goal is to build a system able to obtain an image by reconstructing the most likely path of each single particle from its entry and exit positions and directions and its residual energy. This technique addresses the problem introduced by the multiple Coulomb scattering of protons in matter. To meet clinical requirements, the detectors and the data acquisition system should be able to sustain a proton rate of about 1 MHz.

The detectors have been chosen, the architecture of the data acquisition system has been fixed, an ASIC containing 32 front-end channels has been developed and the complete data acquisition system is at an advanced stage of construction.

Each module of the tracker must be calibrated: the results obtained with a single module have been presented in this paper.

The YAG:Ce crystal calorimeter was completely characterized using front-end electronics built from commercial parts. New front-end electronics with a higher acquisition rate have been developed and will be used in the next test.

An x-y plane of the tracker is ready to be tested with protons and coupled with the YAG:Ce calorimeter. The next step will be to test the system at Laboratori Nazionali del Sud with a 62MeV proton beam.

# VI. REFERENCES

[1] G.A.P. Cirrone et al., Nucl. Instr. and Meth. A 576 (2007) 194–197.

[2] R. Schulte et al., IEEE Trans. Nucl. Sci. Vol. 51, N. 3 (2004).

[3] G.A.P. Cirrone et al., IEEE Trans. Nucl. Sci. Vol. 54, N. 5 (2007).

[4] M. Bruzzi et al., IEEE Trans. Nucl. Sci. Vol. 54, N. 1 (2007).

[5] C. Talamonti et al., Nucl. Instr. and Meth. A (2009), doi:10.1016/j.nima.2009.08.040.

[6] [http://www.hamamatsu.com](http://www.hamamatsu.com/)

[7] [http://www.strategic-test.com](http://www.strategic-test.com/)

[8] R. Bassini, C. Boiano and A. Pullia, IEEE Trans. Nucl. Sci. Vol. 49, N. 5 (2002).

[9] V. Sipala et al., Nucl. Instr. and Meth. A (2009), doi:10.1016/j.nima.2009.08.029.

# Commissioning and performance of the Preshower off-detector readout electronics in the CMS experiment

G. Antchev<sup>a,b</sup>, D. Barney<sup>a</sup>, W. Bialas<sup>a</sup>, R.S. Bonilla Osorio<sup>c</sup>, K.-F. Chen<sup>d</sup>, C.-M. Kuo<sup>e</sup>, R.-S. Lu<sup>d</sup>, V. Patras<sup>f</sup>, S. Reynaud<sup>a</sup>, J.S. Rodriguez Estupinan<sup>c</sup>, P. Vichoudis<sup>a</sup>

> <sup>a</sup>CERN, 1211 Geneva 23, Switzerland <sup>b</sup>INRNE-BAS, Sofia, Bulgaria <sup>c</sup>Universidad de los Andes, Bogotá, Colombia <sup>d</sup>National Taiwan University, Taipei, Taiwan <sup>e</sup>National Central University, Chung-Li, Taiwan <sup>f</sup>University of Ioannina, Ioannina, Greece

Paschalis.Vichoudis@cern.ch

# *Abstract*

The CMS Preshower is a fine grain detector that comprises 4288 silicon sensors, each containing 32 strips. The data are transferred from the detector to the counting room via 1208 optical fibres producing a total data flow of ~72GB/s. For their readout, 40 multi-FPGA 9U VME boards are used.

This article is focused on the commissioning of the VME readout system using two tools: a custom connectivity test system based on FPGA embedded logic analyzers read out through JTAG and an FPGA-based system that emulates the data-traffic from the detector. Additionally, the performance of the VME readout system in the CMS experiment, including the 2009 Cosmic ray at Four Tesla (CRAFT) run, is discussed.

# I. INTRODUCTION

# *A. The detector*

The CMS Preshower [1] is a fine grain detector located in front of the endcap Electromagnetic calorimeter. Its primary function is to detect photons with good spatial resolution in order to perform  $\pi^0$  rejection. The detector comprises 4288 63mm x 63mm silicon sensors, each of which is divided into 32 strips. Fig.1 shows the location of the Preshower in the CMS experiment.



Figure 1: The CMS experiment & the location of Preshower.

The micromodule [1] (see Fig.2), the building unit of the CMS Preshower detector, comprises a silicon sensor DC-coupled to a PCB hybrid containing the PACE3 [2] front-end electronics, all mounted on ceramic and aluminium support structures. The signals from the 32 strips of the micromodule are amplified, shaped and sampled continuously every ~25ns and temporarily stored in an analogue memory by the PACE3.

On reception of a level-1 trigger, three consecutive time samples (on the baseline, near the peak and after the peak) per strip are multiplexed, driven out of the micromodule and digitized by a 12-bit AD41240 ADC [3] on the Preshower 'system mother-board' (SMB).

Fig.3 illustrates a signal at the output of the preamplifier/shaper.



Figure 2: The micromodule.



Figure 3: The preamplifier/shaper output signal.

The digitized data from up to 4 micromodules are multiplexed and organized in a 600-byte packet by a K-chip [4] ASIC and transmitted through an optical link via the GOL [5] serializer ASIC to the Counting Room. The K-chip and GOL ASICs are also located on the SMB. Fig. 4 illustrates the on-detector readout scheme.



Figure 4: The on-detector data readout scheme.

# *B. The off-detector readout scheme*

The data transport from the 4288 micromodules of the on-detector system is achieved by 1208 optical channels. Since the maximum average level-1 trigger rate is 100kHz, the total data flow from the detector to the off-detector electronics reaches  $\sim$ 72GB/s (1208 x 600B/event x 100k events/s).

For the readout of the Preshower, 40 off-detector electronic cards, namely the CMS Preshower Data Concentrator Cards (ESDCC) [6], are used. Each ESDCC reads out 24 to 36 optical channels and interfaces with a CMS DAQ link having a bandwidth of ~200MB/s (~2kB/event).

Although the total downstream bandwidth of  $\sim$ 8GB/s (40 links x 200MB/s) is one order of magnitude lower than the total data flow from the detector, the Preshower can be read out without problems since significant online data reduction is performed in the ESDCCs. The data reduction includes pedestal subtraction, inter-channel gain calibration, common mode noise rejection, bunch crossing identification and threshold application [7].

It is worth mentioning that this level of data reduction (by a factor of  $\sim$ 10 or more) is feasible since the occupancy is relatively low in the Preshower - an average of about 2% at high luminosity.
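The bandwidth figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are ours and decimal units (1 GB = 10<sup>9</sup> bytes) are assumed.

```python
# Figures quoted in the text.
OPTICAL_CHANNELS = 1208
PACKET_BYTES = 600              # bytes per optical channel per event
TRIGGER_RATE_HZ = 100_000       # maximum average level-1 trigger rate
N_ESDCC = 40
DAQ_LINK_BYTES_PER_S = 200e6    # ~200 MB/s per ESDCC DAQ link

# Total flow from the detector into the off-detector electronics.
input_flow = OPTICAL_CHANNELS * PACKET_BYTES * TRIGGER_RATE_HZ
# Total capacity of the links towards the central DAQ.
output_capacity = N_ESDCC * DAQ_LINK_BYTES_PER_S
# Minimum average data reduction the ESDCCs must achieve.
required_reduction = input_flow / output_capacity
# Average output event size each ESDCC can afford at the maximum rate.
event_budget_bytes = DAQ_LINK_BYTES_PER_S / TRIGGER_RATE_HZ

print(f"input flow        : {input_flow / 1e9:.1f} GB/s")
print(f"output capacity   : {output_capacity / 1e9:.1f} GB/s")
print(f"required reduction: {required_reduction:.1f}x")
print(f"event budget      : {event_budget_bytes:.0f} B per ESDCC per event")
```

The result is an input flow of about 72.5 GB/s against 8 GB/s of output capacity, so a reduction factor of about 9 is the minimum needed, consistent with the factor of ~10 or more and the ~2kB/event quoted above.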

# *C. The ESDCC*

The ESDCC is a 9U-VME system based around eight high-density FPGAs.

Three of these FPGAs, incorporating embedded hardware deserializers, receive the serialized data streams from the detector and perform the on-line data reduction algorithms. Each of these FPGAs (referred to as 'reduction FPGAs') can treat up to 12 input data streams. The zero-suppressed data coming from the reduction FPGAs are merged by another FPGA (referred to as the 'merger FPGA') that also serves as the interface with the central CMS data acquisition system.

Additionally, one FPGA (referred to as the 'vme FPGA') is used as an interface with the VME bus, while another three FPGAs (with sufficient memory) receive the non-processed (often referred to as 'raw') data from the reduction FPGAs for event monitoring through the VME bus (these FPGAs are referred to as 'spy FPGAs').

A simplified block diagram of the ESDCC highlighting the data paths is shown in Fig. 5.



Figure 5: Data Paths of the ESDCC.

For the implementation of the ESDCC, the idea of a modular architecture has been adopted. The modularity allows the re-use of the modules by other systems [8][9]. The two modules the ESDCC is based around are:

- The optical receiver plug-in module.
- The VME 'host board'.

The optical receiver plug-in module, named the 'OptoRx' [10], is a daughter-board that hosts one reduction FPGA and the associated optical components while the VME host board is a motherboard in 9U VME format that incorporates the remaining FPGAs (merger, vme and spy FPGAs) as well as OptoRx sockets and other auxiliary components.

Fig.6 shows a picture of the ESDCC where three OptoRxs are plugged into a VME host board.



Figure 6: The ESDCC.

# II. COMMISSIONING

This article is focused on hardware and software tools & methods developed for the commissioning of a complex system: the ESDCC. The concept behind this development was to have a system able to verify the ESDCC system in three steps:

- Verification of the hardware production of the OptoRx and the VME host board modules that have been produced separately at different sites. This step involves only the hardware modules of the ESDCC.
- Verification of the functionality of the ESDCC as a whole. This step involves both the hardware and firmware of the ESDCC.
- Verification of the compliance of the ESDCC with the central CMS data acquisition and in-situ performance. This step involves the hardware, firmware and software of the ESDCC.

This chapter describes the first two steps of the commissioning while the last step is described in chapter III.

# *A. The hardware commissioning*

The hardware commissioning tools developed were targeted both for after-production tests at the production site and for reception tests at CERN. Experience with systems comprising high-density FPGAs (~1000pin BGA packages) has shown that performing connectivity tests between components on-board (after the typical production tests e.g. thermal stress/cycling, burn-in tests etc) is essential for the verification of the production. The fact that these tests had to be performed also in the production sites outside CERN added an extra complication in the development of the commissioning tools. The main problems were:

- Difficulty in transportation of heavy hardware equipment (e.g. VME crates etc) that would normally simplify the testing procedure.
- Software licensing complications of commercial JTAG boundary scan testing applications that would normally simplify the testing procedure.

In order to overcome these difficulties in the development of the hardware commissioning tools, a different strategy has been followed. The hardware commissioning system is a custom connectivity test bench based on FPGA embedded logic analyzers. The concept of the testing method is the following: in order to verify one connection line, the line must be toggled from one end and read/verified at the other end. To do so, one or more FPGAs of the unit under test generate certain patterns that are received by other FPGAs of the same unit. In case of open connections (e.g. from an FPGA to a connector), special PCBs are attached to the connectors and redirect the signals to other FPGA I/O lines. The patterns trigger the embedded logic analyzers at the receiving end and are recorded and read out through JTAG. A LabVIEW application compares the expected results with the ones received from the unit under test. It also presents the pin locations of the faulty connections for ease of debugging. To cover all interconnections of a module, a series of different tests is performed, where the transmitting and receiving ends are defined accordingly (by configuring the FPGAs with different firmware). An example of this method is shown in Fig.7 and Fig.8. Fig.7 illustrates the interconnections of the VME host board whilst Fig.8 shows a diagram of a test setup that covers the connections to/from the VME connectors. At the right of the VME host board under test, a special PCB (see Fig.9) is attached to the associated connectors to redirect the signals to FPGA I/O lines that can capture them.



Figure 7: The interconnections of the VME host board.





Figure 8: Setup for testing the lines from/to the VME connectors.

Figure 9: The special PCB used for the VME connector tests.
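The comparison step of the connectivity test, in which expected and captured patterns are compared and the faulty pins are reported, can be sketched as follows. The actual tool is a LabVIEW application; Python is used here only for illustration, and the function name `find_faulty_lines`, the walking-one test pattern, the net names and the pin labels are all hypothetical.

```python
def find_faulty_lines(expected, captured, pin_map):
    """Compare expected and captured samples; return (line name, pin) of faulty lines.

    expected, captured: lists of integers, one per captured sample, where
    bit i holds the logic level of line i.
    pin_map: list of (line_name, pin_location); entry i describes bit i.
    """
    if len(expected) != len(captured):
        raise ValueError("expected and captured patterns differ in length")
    mismatch = 0
    for exp_word, cap_word in zip(expected, captured):
        mismatch |= exp_word ^ cap_word  # accumulate every bit that ever differed
    return [pin_map[i] for i in range(len(pin_map)) if (mismatch >> i) & 1]


# Walking-one pattern: every line is driven high on its own once, so a
# stuck line or a short between two lines produces a mismatch.
N_LINES = 8
expected = [1 << i for i in range(N_LINES)]
# Hypothetical net names and pin locations, for illustration only.
pin_map = [(f"LB[{i}]", f"PIN_{i}") for i in range(N_LINES)]

# Simulate an open connection on line 3 (it always reads 0).
captured = [word & ~(1 << 3) for word in expected]
print(find_faulty_lines(expected, captured, pin_map))
```

Reporting the pin location together with the net name mirrors the purpose described above, namely pointing directly at the solder joint to inspect.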

Fig.10 shows a pattern captured by the embedded logic analyzer whilst Fig.11 shows the front panel of the LabVIEW application developed for analyzing recorded patterns.

The test system described above was used extensively at the production site of the VME host board. Indeed the tests (about 5 minutes per board) revealed serious soldering problems throughout the first production batch. Subsequent improvements to the production process were verified using this test system, with the result that the boards now being produced are far more reliable.

*Screenshot of the Quartus II embedded logic analyzer window (instance auto_signaltap_0) showing the captured lines LB[0] to LB[7].*

Figure 10: Pattern captured by the embedded logic analyzers.



Figure 11: The front panel of the LabVIEW application.

The OptoRx, contrary to the VME host board, consists only of open connections from the FPGA I/O to the socket and from the optical receiver to the FPGA hardware deserializer inputs. Therefore, special PCBs are again needed for its hardware commissioning. Although the concept of the test is the same as with the VME host board, the implementation of the special PCBs is slightly different. A module named the OptoTx has been developed based on the OptoRx: the only difference between these pin-to-pin compatible modules is that the optical receivers have been replaced by optical transmitters. Fig.12 shows a diagram of the test setup. Both the OptoTx and OptoRx are plugged into a motherboard that interconnects the sockets of the two modules. For the optical loopback, optical fibres are used. Fig.13 shows a picture of the OptoRx test setup.

By using this fast (1 min) but extensive test system,  $\sim$ 5% defective OptoRx modules (out of  $\sim$ 150) have been found after production.



Figure 12: Diagram of the OptoRx test setup.



Figure 13: The OptoRx test setup. The motherboard, the OptoTx module and the socket for the unit under test are shown.

# *B. The hardware & firmware commissioning*

For the hardware & firmware commissioning, a second FPGA-based test system known as the "ESDTE" (Preshower Data Traffic Emulator) has been developed by combining existing modules. As its name suggests, the ESDTE emulates the front-end of the Preshower, providing user-programmable data patterns combined (or not) with real previously recorded data from the detector.

The implementation of the ESDTE is based on existing components - the VME host board and the OptoTx mezzanine. The ESDTE operates like an 'inverted' ESDCC: data packets are downloaded through VME to the on-board memories and are then read by the FPGAs, serialized and transmitted optically.

The ESDCC functionality can thus be verified in the laboratory without the need for the real detector hardware as a data source. In addition, the ESDTE is able to generate rare (but possible) error conditions that are not easily reproducible with the real detector, such as the following:

- Data integrity errors
- Synchronization problems
- Interrupted packet transmission
- Not sending a packet (emulating missing triggers)
- Sending a packet without a trigger (emulating spurious triggers)
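The idea of injecting such error conditions into an otherwise normal packet stream can be sketched as below. This is a software illustration only, not the ESDTE firmware: the `Fault` names, the function `emit_packets` and the way each fault alters a packet are assumptions made here. Synchronization faults are left out because they depend on packet fields that are not described in this paper.

```python
from enum import Enum, auto


class Fault(Enum):
    NONE = auto()
    CORRUPT_DATA = auto()  # data integrity error
    TRUNCATE = auto()      # packet transmission interrupted part-way
    DROP = auto()          # missing trigger: no packet is sent
    SPURIOUS = auto()      # extra packet sent without a trigger


def emit_packets(payloads, faults):
    """Return the emitted packet stream for a list of triggered payloads.

    payloads: list of non-empty bytes objects, one per trigger.
    faults: dict mapping trigger index -> Fault to inject at that trigger.
    """
    stream = []
    for index, payload in enumerate(payloads):
        fault = faults.get(index, Fault.NONE)
        if fault is Fault.DROP:
            continue
        if fault is Fault.CORRUPT_DATA:
            # Invert the first byte to break the data integrity.
            payload = bytes([payload[0] ^ 0xFF]) + payload[1:]
        elif fault is Fault.TRUNCATE:
            payload = payload[: len(payload) // 2]
        stream.append(payload)
        if fault is Fault.SPURIOUS:
            stream.append(payload)  # duplicate emitted with no matching trigger
    return stream


packets = [b"\x01\x02\x03\x04"] * 4
stream = emit_packets(packets, {0: Fault.CORRUPT_DATA, 2: Fault.DROP})
print(len(stream))
```

Keeping the fault schedule as a mapping from trigger index to fault type makes a test run reproducible, which is the property that matters when a rare condition has to be replayed while debugging firmware.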

Flexible software tools have been developed to accompany this hardware, enabling easy control of both the ESDTE and ESDCC (C++ programs) as well as the subsequent data analysis (based on MATLAB), thus avoiding the use of the complex CMS DAQ software at this stage. The duration of the functionality test is about 60 min per ESDCC.

The ESDTE played an essential role both in the firmware development and debugging and in a second stage of hardware debugging of problems not spotted in the previous commissioning phase.



Figure 14: ESDCC functionality test setup.

The firmware of the ESDCC is expected to evolve with time, depending on the real conditions experienced within CMS. The ESDTE will thus form an important part of the development and debugging throughout the lifetime of the Preshower.

# III. INSTALLATION AND FIRST USE OF THE ESDCCS IN CMS

The Preshower detectors were installed in CMS in spring 2009. Shortly afterwards the first 20 ESDCC boards (from a required total of 40) were installed in their VME crates (10 per crate) and commissioned using the ESDTE. The filled crates were then installed in their final locations underground (one for the ES+ endcap and the other for the ES- endcap) and used to perform a first in-situ commissioning of the Preshower. Each crate was used twice: once for each plane of each endcap.

At the beginning of August CMS began operating 24/7 for 6 weeks with the magnet at full power - the so-called "CRAFT09" (Cosmic Ray At Four Tesla). During this period the Preshower included one ESDCC crate (front-plane of ES+) in the central CMS DAQ system whilst the other crate was used for debugging purposes. The first in-situ cosmic rays were seen in the Preshower in the middle of August. Near the end of August the remaining 2 crates were commissioned and installed and a day later all 40 ESDCCs were successfully included in the CMS DAQ system for the first time. The time between reception of these final cards from the producer and having them operating in CMS was about a week. This was only possible because of the fast but extensive test systems described earlier in this note.

# IV. SUMMARY

For the commissioning of the ESDCC readout system of the CMS Preshower subdetector, various test systems have been developed at CERN. These systems were targeted for the verification of the hardware production and the functionality of the ESDCC.

The hardware commissioning tools have been deployed both at the producer sites and at CERN. By using these systems, many faults have been found, both "batch-wide" (soldering problems) and on individual boards. These findings led to actions being taken at the producer, such that the final boards installed in CMS meet the specifications and are reliable.

The ESDTE functionality verification system has contributed significantly to the development of the ESDCC firmware and to the fast and efficient commissioning of the readout VME crates at CMS. The ESDTE will continue to be used for future firmware development.

Currently, all 40 ESDCCs are installed and operational at CMS.

# V. REFERENCES

[1] The CMS collaboration "The CMS experiment at the CERN LHC", 2008 JINST 3 S08004.

[2] P.Aspell et al "PACE3: A large dynamic range analog memory front-end ASIC assembly for the charge readout of silicon sensors", IEEE Nuclear Science Symposium Conference Record, 2005 Vol.2, 904.

[3] G. Minderico et al "A CMOS low power, quad channel, 12 bit, 40MS/s pipelined ADC for applications in particle physics calorimetry", 9th Workshop on Electronics for LHC Experiments Conference Record, 2003, pp. 88-91.

[4] K.Kloukinas et al "Kchip: A radiation tolerant digital data concentrator chip for the CMS Preshower detector", 9th Workshop on electronics for LHC Experiments Conference Record, 2003, pp. 66-70.

[5] P. Moreira et al "G-Link and Gigabit Ethernet compliant serializer for LHC data transmission", IEEE Nuclear Science Symposium Conference Record, 2000, Vol. 2, pp. 9/6-9/9.

[6] G. Antchev et al "A VME-based readout system for the CMS Preshower sub-detector", IEEE Trans Nucl Sci 54 623.

[7] D. Barney et al "Implementation of on-line data reduction algorithms in the CMS Endcap Preshower data concentrator card", 2007 JINST 2 P03001.

[8] G. Antchev et al "The TOTEM Front End Driver, its Components and Applications in the TOTEM Experiment", Proceedings of the Topical Workshop on Electronics for Particle Physics, 2007, pp.211-214.

[9] W. Beaumont et al "Design of the CMS-CASTOR sub detector readout system by reusing existing designs", presented at the Topical Workshop on Electronics for Particle Physics, 2009.

[10] S. Reynaud, P. Vichoudis "A multi-channel optical plugin module for gigabit data reception", Proceedings of the 12th Workshop on electronics for LHC and future experiments, 2006, pp.229-231.

# In-situ performance of the CMS Preshower Detector

W. Bialas

CERN, 1211 Geneva 23, Switzerland

Wojciech.Bialas@cern.ch

on behalf of CMS ECAL group

# *Abstract*

The CMS Preshower detector, based on silicon strip sensors, was installed on the two endcaps of CMS in March/April 2009. First commissioning showed that of the 137216 electronics channels almost all (>99.9%) are fully operational.

This report summarizes the electronics integration (on-detector) and in-situ performance in terms of noise (including common-mode pickup). First observations of in-situ cosmic rays during the CMS summer CRAFT program are presented.

# I. INTRODUCTION

The CMS Preshower (ES [1]) is a fine-grain detector placed in front of the endcap Electromagnetic calorimeter. Its primary role is to detect photons with good spatial resolution in order to distinguish pairs of closely-spaced photons from single photons. Silicon sensors, measuring 63 x 63 mm<sup>2</sup> and 320  $\mu$m thick and divided into 32 strips, are used as active elements. The complete Preshower detector contains 4288 sensors, mounted on 4 individual planes: two orthogonal planes form each CMS endcap Preshower. The location of the Preshower detector inside CMS is shown in fig. 1.



Figure 1: Location of Preshower detector in CMS.

# *A. On-detector*

The ES Control and Readout architecture can be divided into on-detector and off-detector parts. The on-detector part is based on 504 modules known as "ladders", each of which hosts 7-10 sensors and associated front-end electronics.

Each silicon sensor was glued to a ceramic support, in turn glued to an aluminium tile (allowing sensor overlap in one dimension). The front-end hybrid, holding the PACE3 [2] chipset, is screwed through the ceramic to the tile and wire-bonded to the sensor. The resulting "micromodule" is shown in figure 2.



Figure 2: Preshower silicon sensor micromodule.

The role of the front-end readout chip - PACE3 - is to amplify, shape, sample (at 40MHz) and store voltage signals generated by charged particles passing through the sensor. Each channel of PACE3 contains charge amplifier, followed by switchable gain shaper and analog memory to store sampled data. Three consecutive voltage samples are stored per triggered event. The front-end can operate at two gains: High gain (HG) is mainly used for detecting minimum ionizing particles (MIPs) during calibration stage with a limited dynamic range of about 0-60 MIPs; Low gain (LG) is used for normal physics data taking operation with a high dynamic range of about 0-400 MIPs.

Micromodules were assembled into "ladders". On top of each ladder, a system motherboard (SMB) was installed. There are 4 types (shapes) of SMB in the Preshower to enable an approximately circular coverage of the endcap regions in the range $1.653 < \eta < 2.6$. This complex double-sided PCB contains voltage regulators, analog-to-digital converters, data concentrator chips, gigabit optical transmitters and slow control circuitry.

Analog time samples delivered by the front-end hybrids are digitized by the ADCs [3], then stored and reformatted to data packets in a data concentrator chip (K-CHIP) [4]. Finally, through a gigabit optical hybrid (GOH) mezzanine [5], data are pushed out to the counting room through optical fibers.

# *B. Off-detector*

Figure 3: Intrinsic noise in ADC counts per sensor strip measured with detector operating in high gain (HG).

The principal components of the Preshower off-detector electronics are the Clock and Control System (FEC-CCS) [6] and the Preshower Data Concentrator Card (ESDCC) [7]. The role of the CCS is to redistribute clock and trigger information to the front-end electronics through "control rings" and to perform slow control of the on-detector system. It was implemented in 9U VME form factor as part of a common project for several other detectors present in CMS. The main objective of the ESDCC is to acquire data from the front-end, process them to obtain the necessary data reduction of about a factor 20 and then send the sparsified data to the central CMS DAQ system via the S-LINK common interface. This 9U VME card performs pedestal subtraction, common mode noise rejection, signal reconstruction, bunch crossing assignment and threshold application [8]. Four VME 9U crates (one for each plane of each endcap) in the counting room host in total 16 CCS and 40 ESDCC cards.
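The data-reduction steps listed above (pedestal subtraction, common mode rejection, threshold application) can be illustrated with a minimal Python sketch. The function name `reduce_sensor`, the use of the median as common-mode estimate and the threshold value are assumptions made for illustration only; the algorithms actually implemented in the ESDCC are described in [8].

```python
from statistics import median

def reduce_sensor(raw, pedestals, threshold):
    """Hypothetical sketch: pedestal subtraction, common-mode correction, threshold."""
    # 1) Pedestal subtraction, strip by strip
    subtracted = [r - p for r, p in zip(raw, pedestals)]
    # 2) Common-mode estimate: median over the strips of one sensor (an assumption)
    common_mode = median(subtracted)
    corrected = [s - common_mode for s in subtracted]
    # 3) Threshold application: keep only strips above threshold (sparsification)
    return {strip: amp for strip, amp in enumerate(corrected) if amp > threshold}

# Toy event: 32 strips, flat pedestal, +3 ADC common-mode shift, one 50 ADC hit
pedestals = [1000.0] * 32
raw = [1003.0] * 32
raw[7] += 50.0
hits = reduce_sensor(raw, pedestals, threshold=15.0)
print(hits)
```

Only the strip carrying the signal survives the threshold, which is the origin of the data reduction; the common shift of all strips is removed before the threshold is applied.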

# II. ASSEMBLY AND INSTALLATION

The Preshower detectors were assembled and tested on the CERN Meyrin site, starting from micromodule assembly and finishing with endcap "Dees" (one Dee is an assembly of two half-planes). During this assembly process, each detector element underwent rigorous quality control/assurance checks, including detailed visual inspections, thermal shock/cycling and power cycles, preceded and followed by functional tests. For example, vertical stacks of 6 ladders were placed in thermoregulated boxes and thermal cycles with power-cycle tests were performed (10 thermal cycles between -14 °C and +15 °C), followed by 24 h of continuous operation at -14 °C (the nominal operating temperature of the Preshower). During this latter test cosmic muons passing through the sensors were detected and used to perform a first calibration [9].

The final assembly of the detector Dees was again followed by functionality and reliability tests at ambient and sub-zero temperatures, finishing in December 2008. In March and April 2009 the four Preshower Dees were transported to the CMS cavern. Before installing them in CMS all services needed were deployed and tested, including low and high voltage power, cooling and neutral gas flow systems. The Preshower is the only part of CMS that included a final assembly stage underground: attaching pairs of Dees together to form the full endcaps. This delicate operation took place with the beam-pipe in place, necessitating numerous safety precautions. The process went without any problems and according to schedule.

Figure 4: Intrinsic noise in ADC counts per sensor strip measured with detector operating in low gain (LG).

The Preshower detector was ready for first commissioning in-situ by the end of April.

## III. COMMISSIONING

After installation of the Preshower in CMS, the first priority was to check that all connectivity at the detector side for high and low voltage power lines, control cables and optical fibres was correct, prior to "closing" CMS. The complete on- and off-detector systems were used for this first commissioning, including all final power supplies and two crates of off-detector electronics (the other two crates were not available at that time).

Figure 5: Common mode noise in ADC counts per sensor measured with detector operating in high gain (HG).

Figure 6: Common mode noise in ADC counts per sensor measured with detector operating in low gain (LG).

Data were taken by means of a local DAQ system based on the XDAQ framework (CMS standard) [10][11]. The resulting data (mainly pedestal runs) were analysed using on-line Data Quality Monitoring (DQM) software tools, giving prompt feedback to cabling teams in case of connection problems. Only a few problems were in fact found, mostly with optical connections, and these were quickly repaired. One sensor was found to have a short-circuit on its HV line inside the detector, so it (and its neighbour) has been disconnected. A further 2 channels are also not functioning, bringing the total of working channels down to 137150 from the nominal 137216. Thus after the first commissioning more than 99.9% of the ES was fully functional.

Analysis of the noise performance in HG and LG modes was also performed. Figures 3 and 4 show noise distributions in ADC counts for HG and LG respectively (with 1 MIP being equivalent to about 50 ADC counts in HG and 9 ADC counts in LG). These noise figures agree with laboratory measurements and are within the detector design specification. In HG mode the typical noise is around 5.6 ADC counts, corresponding to a signal to noise ratio of 9 (for single MIPs), while in physics operating mode (LG) the average noise is around 2.5 ADC counts and the S/N ratio is 3.5. In this latter mode of operation (fig. 4) one can see that a few hundred strips have noise above the norm, around 5 ADC counts. These strips were found to be located on specific micromodules on one type of ladder, and the cause of the excess noise was traced to system clock crosstalk. This feature has no effect on the overall performance of the detector itself.
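The channel bookkeeping and the signal to noise figures quoted above follow from simple arithmetic, sketched here in Python. All numbers are taken from the text; only the variable names are introduced for illustration.

```python
SENSORS = 4288
STRIPS_PER_SENSOR = 32

nominal = SENSORS * STRIPS_PER_SENSOR        # nominal channel count
dead = 2 * STRIPS_PER_SENSOR + 2             # two disconnected sensors plus 2 channels
working = nominal - dead
fraction = working / nominal

mip_adc = {"HG": 50.0, "LG": 9.0}            # ADC counts per MIP (approximate)
noise_adc = {"HG": 5.6, "LG": 2.5}           # typical noise in ADC counts
snr = {gain: mip_adc[gain] / noise_adc[gain] for gain in mip_adc}

print(nominal, working, f"{100 * fraction:.3f}%")
print({gain: round(value, 1) for gain, value in snr.items()})
```

With these rounded inputs the single-MIP S/N comes out close to 9 in HG and close to 3.6 in LG, consistent with the quoted values of 9 and 3.5.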

Analysis of common mode (CM) noise gave reasonably low levels, reassuring us that the ES does not pick up noise from neighbouring systems (through power lines etc.) nor from its own operation (see fig. 5 and 6). One can see that the common mode noise values are larger in HG than in LG, which can be explained by the power supply rejection ratio (PSRR) of the charge preamplifier and by the switched-gain shaper characteristics of the PACE3 front-end chip. Again these levels of CM noise do not affect the detector performance; indeed the ESDCC includes a CM correction algorithm.

# IV. COSMIC DATA - CRAFT'09

With LHC machine operation announced for the end of 2009, the CMS experiment began operating 24/7 for 6 weeks in the summer to test its readiness. During that time the magnet was switched on at full power. During this so-called CRAFT09 campaign (Cosmic Run At Four Tesla), the Preshower detector was included with one plane of one endcap. After an initial period of timing-in of the detector with particles, the Preshower delivered cosmic muon information to CMS. An example of a CMS event display showing muon hits in the Preshower is shown in figure 7. One specificity of CRAFT09 runs is that particles arrive asynchronously with respect to the system master clock. This feature brings obvious difficulties for precise timing-in of the detector for LHC machine operation. However, by fitting the front-end pulse shape to the 3 data time samples one can reconstruct the particle arrival time and the front-end signal amplitude. A preliminary plot of the energy spectrum in the Preshower due to incident muons is shown in figure 8. Note that this plot does not include a correction for the angle of incidence of the muons.
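The reconstruction of arrival time and amplitude from three time samples can be sketched as a two-parameter fit. The sketch below is only an illustration under stated assumptions: the CR-RC pulse shape, the shaping time `TAU_NS` and the grid-search strategy are hypothetical choices, not the PACE3 response or the method actually used in the analysis. Only the 25 ns sample spacing (40 MHz sampling) comes from the text.

```python
import math

SAMPLE_PERIOD_NS = 25.0   # 40 MHz sampling, from the text
TAU_NS = 25.0             # assumed shaping time (hypothetical value)

def pulse(t, tau=TAU_NS):
    """Assumed CR-RC pulse shape, normalised to a peak of 1 at t = tau."""
    if t <= 0.0:
        return 0.0
    x = t / tau
    return x * math.exp(1.0 - x)

def fit_three_samples(samples, t0_min=-25.0, t0_max=25.0, step=0.1):
    """Grid search over the pulse start time t0; the amplitude is solved
    analytically by linear least squares at each t0."""
    best = None
    n_steps = int(round((t0_max - t0_min) / step))
    for k in range(n_steps + 1):
        t0 = t0_min + k * step
        shape = [pulse(i * SAMPLE_PERIOD_NS - t0) for i in range(3)]
        norm = sum(v * v for v in shape)
        if norm == 0.0:
            continue
        amp = sum(s * v for s, v in zip(samples, shape)) / norm
        chi2 = sum((s - amp * v) ** 2 for s, v in zip(samples, shape))
        if best is None or chi2 < best[0]:
            best = (chi2, t0, amp)
    return best[1], best[2]

# Toy check: generate noiseless samples and recover the input parameters
true_t0, true_amp = -7.0, 120.0
samples = [true_amp * pulse(i * SAMPLE_PERIOD_NS - true_t0) for i in range(3)]
t0_fit, amp_fit = fit_three_samples(samples)
print(t0_fit, amp_fit)
```

Because the amplitude enters linearly, only the start time needs a search, which keeps the fit cheap even for asynchronous particles.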

# V. SUMMARY

The CMS Preshower detector was assembled in the second half of 2008. It was installed and commissioned successfully in CMS in the first half of 2009. The results gathered from the first check-out of the detector confirm that all connectivity is in place and that the noise performance is as expected. Virtually all channels (>99.9%) are operational, with just a small number of noisy channels observed. The average noise is 5.6 ADC counts in HG and 2.5 in LG operation mode. This leads to signal to noise figures of 9 for HG and 3.5 for LG for the single-MIP detector response, fully satisfying the detector design specifications. Common mode noise was found to be at a reasonably low level, confirming that the detector is immune to its outside environment. In summer 2009, the CMS Preshower joined the Cosmic Run At Four Tesla campaign, which resulted in successful timing-in with respect to the whole of CMS and in measurements of the energy deposit of cosmic rays in its silicon sensors.

#### **REFERENCES**

- [1] The CMS collaboration, The CMS experiment at the CERN LHC, 2008 JINST 3 S08004.
- [2] P. Aspell et al., PACE3: A large dynamic range analog memory front-end ASIC assembly for the charge readout of silicon sensors, IEEE Nuclear Science Symposium Conference Record, 2005 Vol.2, 904
- [3] G. Minderico et al., A CMOS low power, quad channel, 12 bit, 40MS/s pipelined ADC for applications in particle physics calorimetry, 9th Workshop on Electronics for LHC Experiments Conference Record, 2003, pp. 88-91.
- [4] K. Kloukinas et al., Kchip: A radiation tolerant digital data concentrator chip for the CMS Preshower detector, 9th Workshop on Electronics for LHC Experiments Conference Record, 2003, pp. 66-70.
- [5] P. Moreira et al., G-Link and Gigabit Ethernet compliant serializer for LHC data transmission, IEEE Nuclear Science Symposium Conference Record, 2000, Vol. 2, pp. 9/6-9/9.
- [6] K. Kloukinas et al., FEC-CCS: A common Front-End Controller card for the CMS detector electronics, 12th Workshop on Electronics for LHC and Future Experiments Conference Record, 2006, pp. 179-184.
- [7] G. Antchev et al., A VME-based readout system for the CMS Preshower sub-detector, IEEE Trans Nucl Sci 54 623.
- [8] D. Barney et al., Implementation of on-line data reduction algorithms in the CMS Endcap Preshower data concentrator card, 2007 JINST 2 P03001.
- [9] A. Elliot-Peisert et al., Quality Assurance Issues of the CMS Preshower, Frontier Detectors For Frontier Physics, La Biodola, Isola d'Elba, Italy, 24-30 May 2009.
- [10] J. Gutleber, S. Murray, L. Orsini, Comput. Phys. Commun. 153 (2003) 155.
- [11] P. Musella, Nucl. Instr. and Meth. A (2009), doi:10.1016/j.nima.2009.07.102.

Figure 7: CMS Event Display. Example of cosmic muon trace recorded in Preshower detector.

Figure 8: Energy deposit of minimum ionizing particles with Preshower sensors.

# *TUESDAY 22 SEPTEMBER 2009 PLENARY SESSION 2*

# Low Power Analog Design in Scaled Technologies

A. Baschirotto<sup>1,2</sup>, V. Chironi<sup>2</sup>, G. Cocciolo<sup>2</sup>, S. D'Amico<sup>2</sup>, M. De Matteis<sup>2</sup>, P. Delizia<sup>2</sup>

<sup>1</sup> University of Milano-Bicocca, Milan – Italy; <sup>2</sup> University of Salento, Lecce – Italy

## *Abstract*

In this paper an overview of the main issues in analog IC design in scaled CMOS technology is presented. Decreasing the MOS channel length and the gate oxide thickness has led to undoubted advantages in terms of chip area, speed and power consumption (mainly exploited in the digital parts). However, some drawbacks are introduced in terms of power leakage and reliability. Moreover, the lower supply voltage required by scaled technologies has led analog designers to find new circuit solutions to guarantee the required performance.

### I. INTRODUCTION

The development of silicon technology has been, and will continue to be, driven by system needs. The continuous and systematic increase in transistor density and performance, guided by CMOS scaling theory and described in "Moore's Law" ([1], [2]), has been a highly successful process for the development of silicon technology for the past 40 years. Technological scaling-down sustains the System-on-Chip (SoC) trend because it gives low cost and low power devices, suitable to operate at higher frequencies ([2]). Fig. 1 shows that the standard supply voltage $(V_{DD})$ of the analog devices embedded in deep sub-1µm CMOS technologies decreases with the transistor channel length. A low supply voltage is a necessity in scaled technologies. In fact the electromigration process, the leakage currents $(I_{OFF})$ and the breakdown events ([3], [4]) are related to the intensity of the electric fields inside the silicon. Thus a low $V_{DD}$ bounds these physical 2nd-order effects, which affect the reliability and the robustness of microelectronic circuits.



Despite that, Fig. 2 shows that the intensity of the power-down currents increases with the technological scaling-down. A large $I_{\text{OFF}}$ can be detrimental for portable and non-telecom devices, which are powered off most of the time. One possible approach to stop the increase of the $I_{\text{OFF}}$ currents is to invert the scaling-down of the CMOS transistor threshold voltage $V_{TH}$. Fig. 1 also shows that $V_{TH}$ approaches $V_{DD}$, inverting the decreasing trend of the last years ([5]) (e.g. in 65nm CMOS). From the analog circuit point of view, the $(V_{DD}-V_{TH})$ decrease leads to operating point issues and dynamic range reduction, so that novel design solutions are needed.

<sup>1</sup> Dept. of Physics "G. Occhialini"  <sup>2</sup> Dept. of Innovation Engineering

The paper is organized as follows. Section II gives an overview of the main issues in scaled CMOS technology at transistor level. In Section III three low-voltage analog circuit designs (a bootstrapped S&H, a multistage compensated opamp and an Active $G_m$-RC filter) are presented.



#### II. CMOS TECHNOLOGY MAIN TRENDS

The evolution of the analog performance of MOS devices through technology scaling can be seen in Table I ([6]) for the most important parameters. The influence of these and other effects will be discussed in the next sections.

Table I – MOS DEVICE PARAMETER TRENDS

| Node | nm | 250 | 180 | 130 | 90 | 65 |
|------|----|-----|-----|-----|----|----|
| $L_{GATE}$ | nm | 180 | 130 | 92 | 63 | 43 |
| $t_{OX}$ (inv.) | nm | 6.2 | 4.45 | 3.12 | 2.2 | 1.8 |
| Peak $g_m$ | μS/μm | 335 | 500 | 720 | 1060 | 1400 |
| $g_{ds}$** | μS/μm | 22 | 40 | 65 | 100 | 230 |
| $g_m/g_{ds}$ | | 15.2 | 12.5 | 11.1 | 10.6 | 6.1 |
| $V_{DD}$ | V | 2.5 | 1.8 | 1.5 | 1.2 | |
| $V_{TH}$ | V | 0.44 | 0.43 | 0.34 | 0.36 | 0.24 |
| $f_T$ | GHz | 35 | 53 | 94 | 140 | 210* |

#### *A. Power Reduction*

In digital CMOS circuits, the power consumption is mainly due to three current components: (i) the leakage current due to the reverse biased diodes formed between the substrate, the well, and the source and drain diffusion regions of the transistors, (ii) the short circuit current due to the presence of a current carrying path from the supply voltage to ground when certain PMOS and NMOS transistors are simultaneously ON for a short period due to the signal transitions at the input of the logic gates, and (iii) the switching current due to charging and discharging of the load capacitance. Among the three sources of power dissipation, the last component is by far the dominant one. Ignoring the internal capacitances of logic gates, the average power consumption for a logic gate due to charging and discharging of the load capacitance C is given by:

$$
Eq. 1 \qquad P_{dig} = f \cdot C \cdot V_{DD}^2
$$

where $V_{DD}$ is the supply voltage, and *f* is the operating frequency. From Eq. 1, the power consumption in digital circuits is reduced in scaled technologies, since $V_{DD}$ decreases.

In analog circuits, the performances are often limited by the thermal noise (this is the case, for instance, of an acquisition channel), which is inversely proportional to the bias current, i.e.:

Eq. 2 
$$
\frac{kT}{C} \approx \frac{\alpha}{I}, \quad I = P_{an} / (\beta \cdot V_{DD})
$$

where $\alpha$ and $\beta$ are two suitably chosen constants.

For a maximum output swing (i.e. signal amplitude) of $SW = V_{DD} - 2V_{sat}$, the achievable Dynamic Range (DR) can be written as:

Eq. 3 
$$
DR = \frac{(V_{DD} - 2V_{sat})^2}{(\alpha / I)}
$$

Then the power consumption in analog circuits depends on DR and $V_{DD}$ as follows:

$$
Eq. 4 \qquad P_{an} \propto \frac{DR}{V_{DD}}
$$

As a consequence, for a given DR, the $V_{DD}$ reduction required by technology scaling brings an increase of the analog power consumption. This result is the opposite of that for digital circuits. Thus, technology scaling is detrimental to analog circuit design.
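The opposite trends of Eq. 1 and Eq. 4 can be checked numerically. Eliminating the bias current between Eq. 2 and Eq. 3 gives $P_{an} = \alpha \beta \cdot DR \cdot V_{DD} / (V_{DD} - 2V_{sat})^2$, which reduces to Eq. 4 when $V_{sat} \ll V_{DD}$. The Python sketch below evaluates both expressions; the function names, the constants set to 1, and the values chosen for frequency, capacitance, DR and $V_{sat}$ are arbitrary assumptions for illustration.

```python
def digital_power(f_hz, c_farad, vdd):
    """Eq. 1: P_dig = f * C * VDD^2."""
    return f_hz * c_farad * vdd ** 2

def analog_power(dr, vdd, vsat, alpha=1.0, beta=1.0):
    """Eq. 2 combined with Eq. 3: I = alpha * DR / (VDD - 2*Vsat)^2, P_an = beta * VDD * I.
    alpha and beta are set to 1, so the result is in arbitrary units."""
    swing = vdd - 2.0 * vsat
    current = alpha * dr / swing ** 2
    return beta * vdd * current

for vdd in (2.5, 1.8, 1.2):
    p_dig = digital_power(1e9, 1e-15, vdd)          # 1 GHz, 1 fF (assumed)
    p_an = analog_power(1e6, vdd, vsat=0.1)         # DR and Vsat assumed
    print(f"VDD={vdd:.1f} V  P_dig={p_dig:.2e} W  P_an={p_an:.3e} (a.u.)")
```

For a fixed DR the analog figure grows as $V_{DD}$ drops, while the digital one shrinks quadratically.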

# *B.*  $(V_{DD} - V_{TH})$  Reduction

Technology scaling forces a reduction of both $V_{DD}$ (as seen before) and $V_{TH}$. However $V_{TH}$ scales more slowly than $V_{DD}$, and this reduces node by node the distance $(V_{DD}-V_{TH})$. From an intuitive point of view, the distance $(V_{DD}-V_{TH})$ represents the "free" voltage space for analog design. The reduction of this space in scaled technologies makes analog block design critical.

Considering, for instance, the analog switch realized with a pass-gate, as shown in Fig. 3, it can process a rail-to-rail input signal only with a minimum $V_{DD,min}$ given by:

Eq. 5 
$$
V_{DD,min} > 2V_{TH} + 2V_{OV}
$$

It follows that $V_{DD,min}$ is technology dependent. Fig. 4 shows the $V_{DD}$ margin tapering with technology scaling down to the 22nm node, where an analog switch will no longer be possible.
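The tapering margin can be illustrated by applying Eq. 5 to the $V_{DD}$ and $V_{TH}$ values listed in Table I for the 250nm to 90nm nodes. This is only a rough Python sketch: the overdrive `V_OV` of 0.1 V is an assumed value, and the same threshold is assumed for the NMOS and PMOS devices of the pass-gate.

```python
def passgate_vdd_min(vth, vov):
    """Right-hand side of Eq. 5: 2*VTH + 2*VOV."""
    return 2.0 * vth + 2.0 * vov

V_OV = 0.1  # assumed overdrive voltage in volts (not from the text)

# (VDD, VTH) in volts for each node in nm, from Table I
NODES = {250: (2.5, 0.44), 180: (1.8, 0.43), 130: (1.5, 0.34), 90: (1.2, 0.36)}

margins = {node: vdd - passgate_vdd_min(vth, V_OV) for node, (vdd, vth) in NODES.items()}
for node, margin in margins.items():
    print(f"{node} nm: margin = {margin:.2f} V")
```

Under these assumptions the margin shrinks monotonically from node to node, in line with the trend described for Fig. 4.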



Fig. 3 – Pass-gate functionality



# *C.*  $V_{TH}$ variations

Reductions of $(V_{DD}-V_{TH})$ are critical for analog design. They result from technology scaling and, for a given technology node, also from analog design choices. In fact $V_{TH}$ depends on several effects, the most important of which are:

- technology variations: process, supply voltage ($V_{TH}$ does not depend strongly on $V_{DD}$) and temperature (PVT) variations;
- analog design choices: device mismatch, short and narrow channel effects;
- layout choices: STI effects.

#### *1) PVT Variation*

For a 65nm technology,  $V_{TH}$  variations due to PVT variations can be very large.

Table II –  $V_{TH}$  VARIATION AT CORNER SIMULATIONS

| | Nominal, -40°C | Nominal, 27°C | Nominal, 120°C | Fast, -40°C | Fast, 27°C | Fast, 120°C | Slow, -40°C | Slow, 27°C | Slow, 120°C |
|---|---|---|---|---|---|---|---|---|---|
| $V_{TH}$ [mV] | 584 | 547 | 496 | 510 | 475 | 425 | 646 | 606 | 552 |
| $g_{ds}$ [µA/V] | 34.4 | 34.1 | 34.1 | 50.9 | 49.2 | 47.4 | 19.6 | 21.0 | 22.5 |
| $g_m$ [µA/V] | 548 | 486 | 432 | 667 | 583 | 505 | 392 | 370 | 348 |
| $g_m/g_{ds}$ | 15.9 | 14.3 | 12.7 | 13.1 | 11.8 | 10.6 | 20 | 17.6 | 15.5 |

For instance, Table II gives the operating parameters for an NMOS device (W=650nm, L=65nm, $V_{GS}$=730mV, $V_{DD}$=1.2V). The nominal value of $V_{TH}$ (nominal case & 27°C) is 547mV. This value changes with PVT from 425mV to 646mV, i.e. about ±110mV, a large amount. Considering, for instance, the design of the cascode current mirror of Fig. 5, the minimum supply voltage required for this block operation would be larger, for all the worst cases, than the maximum $V_{DD}$ allowed by the 65nm technology (1.2V).



This simple example shows how one of the most popular building blocks has to be reconsidered in scaled technologies.

#### *2) Analog design choices*

The $V_{TH}$ value is also affected by statistical variation around its actual value. Mismatch observations based on transistor pairs can be described well by a normal distribution with mean $\mu$ and standard deviation

$$
Eq. 7 \qquad \qquad \sigma_{\Delta VT} = A_{VT} / \sqrt{W \cdot L}
$$

where  $A_{VT}$  is a technology conversion constant (in mV· $\mu$ m). The usual rule-of-thumb for  $A_{VT}$  vs. technology node is

$$
Eq. 8 \qquad A_{VT} \approx \gamma \cdot T_{ox}
$$

i.e. about 1 mV·$\mu$m per nm of $T_{ox}$, where $T_{ox}$ is the MOS oxide thickness. This means that for the same device area (W·L) a scaled technology features better matching. Thus, all those circuits whose power consumption is limited by device matching (for instance flash ADCs, or multipath/multichannel analog systems) can exploit the improved scaled technologies, where the analog designer achieves the same matching performance with a smaller device size.
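Eq. 7 and Eq. 8 can be combined into a short Python sketch that estimates the threshold mismatch for a given device area. The function name and the default value $\gamma$ = 1 mV·µm/nm (the rule of thumb quoted above) are illustrative; the oxide thicknesses are the $t_{OX}$ (inv.) values of Table I, used here as a stand-in for $T_{ox}$, which is an assumption.

```python
import math

def sigma_delta_vt_mv(w_um, l_um, tox_nm, gamma_mv_um_per_nm=1.0):
    """Threshold mismatch (mV) of a transistor pair from Eq. 7 and Eq. 8."""
    a_vt = gamma_mv_um_per_nm * tox_nm          # Eq. 8: A_VT in mV*um
    return a_vt / math.sqrt(w_um * l_um)        # Eq. 7: sigma in mV

# Oxide thickness in nm per technology node in nm, from Table I
TOX_NM = {250: 6.2, 180: 4.45, 130: 3.12, 90: 2.2, 65: 1.8}

for node, tox in TOX_NM.items():
    sigma = sigma_delta_vt_mv(1.0, 1.0, tox)    # same 1 um x 1 um area at every node
    print(f"{node} nm: sigma(dVT) = {sigma:.2f} mV")
```

For a fixed area the estimated mismatch follows the oxide thickness, so the same matching is reached with a smaller device in a more scaled node.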

The  $V_{TH}$  value is also affected by the device size, due to the edge phenomena (for short and narrow channel cases) that are typically negligible in larger device size ([7]).

The narrow-channel effect becomes significant when the channel width is of the same order of magnitude as the thickness of the depletion region under the gate oxide. For MOSFETs with non-recessed oxide-isolation structures, a decrease of the channel width (W) leads to a $V_{TH}$ increase. In fact for large W, the additional inversion layer charge at the edge of the channel ($Q_{CHW}$) is negligible, while for narrow W, $Q_{CHW}$ becomes important and results in an increased $V_{TH}$ (see Fig. 6).

When short channel effects (SCE) occur, the depletion region under the gate includes all the charge from source to drain (Fig. 7). At source and drain, a part of the charge $(Q_{CHL})$ is due to the depletion region and therefore does not have to be generated by the gate voltage. This results in a $V_{TH}$ reduction. This effect is enhanced by the drain voltage, which can further reduce $V_{TH}$ (this is the Drain-Induced Barrier Lowering effect, DIBL). $V_{TH}$ could thus reach very low values. To avoid this situation, some additional technological steps (typically a modified doping profile at the channel edges, like HALO) are introduced to maintain a certain $V_{TH}$ value. In this situation the channel length reduction results in a larger $V_{TH}$ value.



Fig. 6 - Depletion layer under the gate at narrow channel effect



Fig. 7 - Depletion layer under the gate at short channel effect.

#### *3) Layout design*

The device size shrinking in scaled technologies allows a strong reduction of the overall die size. In this situation other dimensions limit the die size reduction. One of the most critical limitations proved to be the size of the LOCOS, which is the technology step used to separate two different active areas. The cross section of the LOCOS is shown in Fig. 8, where the "bird's beak" is an evident limitation to its size reduction. For this reason, in order to reduce the separation space between two active areas, a different technology step has been adopted. This is the shallow trench isolation (STI), whose cross section is shown in Fig. 9 ([8], [9]). This process step, which consists of an oxide deposition into a trench, achieves a completely abrupt transition between the active area and the isolation. In a simplified description, this abrupt transition applies a mechanical stress to the active area edge that increases $V_{TH}$ (in some simulation tools the STI effect is taken into account as a mobility variation).

Fig. 10 shows the STI effects for different layout designs. Case (a) refers to the single device layout, where the mechanical pressure is applied to both device edges. This means that $V_{TH0}$ is the maximum value for the threshold voltage. In case (b), in both devices one edge is immune from STI pressure and then $V_{TH1}$ is lower than $V_{TH0}$. Finally in case (c), the external devices feature a threshold voltage given by $V_{TH1}$, while the internal devices appear immune from STI and their threshold voltage $V_{TH2}$ is lower than both $V_{TH1}$ and $V_{TH0}$. Notice that the STI effect is often dominant with respect to the narrow channel effect previously described, so that for narrower gate sizes $V_{TH}$ tends to decrease.



Fig. 8 - Isolation using LOCOS process.







Fig. 10 – STI effects for different layout designs

As the STI effects can be evaluated only for a given layout design, the schematic simulations have to be deeply reevaluated after design layout, and only post-layout simulations can validate an analog design.



Fig. 11 - MOS output characteristics in a 65nm technology

# *D. DC-Gain Reduction*

Analog signal processing is often based on circuits embedding opamps. A key opamp parameter is the dc-gain, which depends on the MOS device intrinsic gain (i.e. $g_m/g_{ds}$). Technology scaling introduces a $g_m$ increase. However this is outweighed by a stronger $g_{ds}$ increase, which results in a lower intrinsic gain (see Table I). The $g_{ds}$ increase can be seen in Fig. 11, which shows the output characteristics of MOS devices of different L in a 65nm technology. The slope of these curves corresponds to $g_{ds}$. This strong reduction of the intrinsic gain forces the development of improved opamp structures to achieve a sufficiently large dc-gain.
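The loss of intrinsic gain can be quantified from the $g_m$ and $g_{ds}$ rows of Table I. The Python sketch below converts the ratio to dB and estimates how many cascaded stages would be needed for a given dc-gain. The stage count is a deliberately crude assumption (each stage is taken to contribute exactly one intrinsic gain, and the 60 dB target is an arbitrary example), not a design rule from the text.

```python
import math

# Peak g_m and g_ds in uS/um per technology node in nm, from Table I
GM = {250: 335.0, 180: 500.0, 130: 720.0, 90: 1060.0, 65: 1400.0}
GDS = {250: 22.0, 180: 40.0, 130: 65.0, 90: 100.0, 65: 230.0}

def intrinsic_gain_db(node):
    """Intrinsic gain g_m/g_ds of one device, in dB."""
    return 20.0 * math.log10(GM[node] / GDS[node])

def stages_needed(node, target_db=60.0):
    """Idealised number of cascaded stages, one intrinsic gain per stage (assumption)."""
    return math.ceil(target_db / intrinsic_gain_db(node))

for node in GM:
    print(f"{node} nm: {GM[node] / GDS[node]:.1f} V/V = {intrinsic_gain_db(node):.1f} dB, "
          f"{stages_needed(node)} stage(s) for 60 dB")
```

Under this idealisation the 65nm device needs one more stage than the 250nm device for the same dc-gain, which is why multistage structures become attractive.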

#### *E. Velocity saturation*

With technology scaling, the electric field across the channel increases and the carriers in the channel have an increased velocity. However, at high fields the relation between the electric field and the velocity is no longer linear: the velocity gradually saturates at the saturation velocity $(v_{sat})$, which increases the transit time of carriers through the channel. At low electric field $(\varepsilon)$, the velocity $(v)$ increases proportionally to $\varepsilon$:

Eq. 9 
$$
v = \mu \cdot \varepsilon
$$

where $\mu$ is the carrier mobility.



Fig. 12 - Velocity Saturation for large (a) and for small length (b).

For high electric field (i.e. small L) the velocity saturates to $v_{sat}$ ($\approx 10^5$ m/s). The main consequence is that the current depends linearly on $(V_{GS}-V_{TH})$ and, therefore, the transconductance saturates to $g_{msat}$:

Eq. 10 
$$
I_D = W \cdot C_{ox} \cdot (V_{GS} - V_{TH}) \cdot v_{sat}
$$

*Eq. 11* 
$$
g_{msat} = \frac{\partial I_D}{\partial (V_{GS} - V_{TH})} \cong W \cdot C_{ox} \cdot v_{sat}
$$
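Eq. 10 and Eq. 11 can be evaluated numerically to get a feeling for the orders of magnitude. In the Python sketch below the oxide capacitance is computed as $C_{ox} = \varepsilon_0 \varepsilon_r / t_{ox}$ with the SiO<sub>2</sub> relative permittivity of 3.9; using this simple formula with the 1.8 nm $t_{OX}$ of Table I, and the bias point taken from the device of Table II, are assumptions made for illustration.

```python
EPS0 = 8.854e-12      # vacuum permittivity, F/m
EPS_R_SIO2 = 3.9      # relative permittivity of SiO2
V_SAT = 1.0e5         # saturation velocity in m/s, from the text

def cox(tox_m):
    """Oxide capacitance per unit area, F/m^2."""
    return EPS0 * EPS_R_SIO2 / tox_m

def id_sat(w_m, tox_m, vgs, vth):
    """Eq. 10: drain current under velocity saturation, in A."""
    return w_m * cox(tox_m) * (vgs - vth) * V_SAT

def gm_sat_per_width(tox_m):
    """Eq. 11 divided by W: saturated transconductance in S/m (numerically equal to uS/um)."""
    return cox(tox_m) * V_SAT

TOX = 1.8e-9  # m, 65 nm node value from Table I (assumed applicable)
print(f"g_msat/W = {gm_sat_per_width(TOX):.0f} uS/um")
print(f"I_D = {id_sat(650e-9, TOX, 0.730, 0.547) * 1e6:.0f} uA")
```

The current is linear in the overdrive, so doubling $(V_{GS}-V_{TH})$ doubles $I_D$ while $g_{msat}$ stays constant.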

## III. SCALTECH ANALOG DESIGN

#### *A. ScalTech at transistor level*

The reduced "free" space $(V_{DD}-V_{TH})$ allowed in scaled technologies forces the designer to consider different MOS operating conditions where the $V_{TH}$ "cost" does not have to be fully "paid". This is the case when MOS devices are operated in the sub-threshold region ($V_{GS} \lesssim V_{TH}$). In this condition the MOS device presents the advantages of minimum overdrive, small gate capacitance, large $g_{m}/I_{D}$ and large voltage gain (the gain is typically 25%-30% higher than the gain in the saturation region). On the other hand, it suffers from larger drain current mismatch (input offset), large output noise current for a given $I_D$, and low speed. In fact the mismatch parameter $A_{VT}$ for a device in sub-threshold is typically three times higher than the value in the saturation region [5]. This means that when the offset is critical, sub-threshold devices need some offset compensation scheme, while when the offset can be tolerated they can be fully exploited (as in band-pass sigma-delta modulators). Finally, although sub-threshold devices exhibit a lower speed, this is compensated by the higher speed of the scaled technology, so that they can be used in typical analog baseband applications.

# *B. ScalTech at circuit level*

The use of scaled technology in analog design needs some new developments. These have to be introduced for every functional block. In the following, the cases of the basic analog switch, of the opamp, and of analog filters are discussed.

#### *1) Analog Switch*

A critical problem in designing analog sampled-data systems (like SC circuits, ADCs, etc.) operating at low supply voltage is the implementation of a MOS switch. Using an NMOS switch as the sampling switch in a T/H circuit has the main issue of an input-dependent finite ON-resistance given by:

$$
Eq. 12 \qquad R_{ON} \propto \frac{1}{(W/L) \cdot (V_{GS} - V_{TH})}
$$

Since $V_{GS} = (V_{DD} - V_{in})$, for a given W/L, $R_{ON}$ is signal dependent and becomes more resistive (giving a lower bandwidth) at low supply voltage. This problem is more critical when $V_{DD}$ decreases, as in scaled technologies. A popular solution is the use of a "bootstrapped" switch, whose functional and circuit schemes are shown in Fig. 13. Fig. 13-(b) shows that during the on-state the gate-to-channel voltage is kept constant, guaranteeing a constant switch conductance. This is done by connecting a capacitance (precharged at $V_{DD}$ during the off-state) between the gate and source terminals of the main switch ([10]). As a result, during the on-state the switch operates with a constant $V_{GS}$, i.e. with a constant on-resistance, as shown in Fig. 14. Several circuit implementations of the conceptual scheme of Fig. 13-(b) are present in the literature. One of the most popular is shown in Fig. 13-(c), whose complexity indicates the increased cost of this solution (in terms of area, power consumption, additional load for the previous stage, etc.).
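The difference between a plain NMOS switch and an ideal bootstrapped switch can be sketched from Eq. 12. In the Python example below the proportionality constant `k` (which lumps W/L and the process transconductance) is set to 1, so resistances are in arbitrary units; the supply and threshold values are assumed for illustration, and the body effect is ignored.

```python
def ron_plain(vin, vdd, vth, k=1.0):
    """Eq. 12 with V_GS = VDD - Vin. Returns None when the switch is off."""
    overdrive = vdd - vin - vth
    if overdrive <= 0.0:
        return None
    return 1.0 / (k * overdrive)

def ron_bootstrapped(vin, vdd, vth, k=1.0):
    """Ideal bootstrap: the gate follows the input, so V_GS = VDD for any Vin."""
    return 1.0 / (k * (vdd - vth))

VDD, VTH = 1.2, 0.45   # assumed values in volts
for vin in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"Vin={vin:.2f} V  plain={ron_plain(vin, VDD, VTH)}  "
          f"bootstrapped={ron_bootstrapped(vin, VDD, VTH):.3f}")
```

The plain switch becomes more resistive as the input rises and stops conducting once $V_{in}$ exceeds $V_{DD}-V_{TH}$, while the bootstrapped switch keeps the same on-resistance over the whole input range.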





Fig. 13 - Bootstrapped Switch: (a-b) conceptual scheme, (c) circuit implementation.



#### *2) Operational Amplifier*

The design of an opamp in scaled technologies has to face several problems. Among them the most critical ones regard the bias point and the frequency response.

Regarding the bias point, the differential pair of Fig. 15 has to be considered, since it is the opamp input stage.



Fig. 15 - Differential input pair

At low voltage it is mandatory to maximize the dynamic range, so a rail-to-rail output signal has to be processed with high linearity. To maximize the voltage swing, the input and output common-mode voltages of the cell then have to be fixed at $V_{DD}/2$.

Eq. 13 
$$
V_{i\_DC} = V_{o\_DC} = V_{DDmin}/2
$$

The opamp input node operating point requirements are:

*Eq. 14*

$$
V_{i\_DC} = V_{DD\,min}/2 = V_{GS} + V_{DSsat} = V_{TH} + 2 \cdot V_{ov}
$$

As a consequence,  $V_{DDmin}$  is given by

*Eq. 15*

$$
V_{DD\,min} = 2 \cdot V_{TH} + 4 \cdot V_{ov}
$$

This value can be quite large and in some cases precludes the use of standard opamp topologies.
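
To give a feel for the numbers, the sketch below evaluates Eq. 15 for a few overdrive voltages. The threshold of 0.3 V is the value listed in Table IV; the overdrive values are assumptions chosen only for illustration.

```python
def vdd_min(v_th, v_ov):
    """Minimum supply voltage from Eq. 15: V_DDmin = 2*V_TH + 4*V_ov (volts)."""
    return 2.0 * v_th + 4.0 * v_ov


V_TH = 0.3  # threshold voltage listed in Table IV

# Overdrive voltages below are assumed, not taken from the paper.
for v_ov in (0.05, 0.10, 0.15):
    print(f"V_ov = {v_ov:.2f} V -> V_DDmin = {vdd_min(V_TH, v_ov):.2f} V")
```

Even with a modest overdrive, the resulting minimum supply in this example exceeds the 0.55 V reported later for the filter, which is why a different input biasing strategy is needed.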

Regarding the frequency response, the dc-gain of a CMOS opamp decreases with technology scaling, due to the reduced intrinsic gain. In addition, due to the lower supply voltage, high-gain stacked-device structures such as cascodes cannot be used. Opamps in scaled technologies therefore use multistage structures, where each stage introduces a pole in the overall frequency response, so that the opamp typically presents several gain stages and hence several poles. Frequency compensation then becomes fundamental. Several compensation schemes can be exploited, based on capacitive feedback and/or transconductance feedforward ([11], [12]). These multistage opamp compensation topologies have to be compared in terms of ac performance (gain, bandwidth, phase margin), load driving capability, power consumption, complexity and occupied area (since the compensation capacitor does not scale with technology).

An example combining these techniques is the three-stage opamp shown in Fig. 16 ([13]). The compensation scheme is Single-Miller capacitor Feed-Forward Compensation (SMFFC): a transconductance feed-forward path provides a left-half-plane (LHP) zero that compensates the second pole (the first non-dominant pole).



Fig. 16 - Structure of the three-stage SMFFC amplifier

The compensation scheme is also shown in Fig. 17. In this scheme, the Common-Mode Feedback circuit (CMFB) is shown together with the differential-mode architecture. In fact, the feedforward paths used for the differential-mode compensation are not effective for the CMFB compensation. A critical point in low-voltage multistage opamps is therefore also the frequency compensation of the CMFB loop. In the scheme of Fig. 17, a feedforward path is introduced in the feedback loop by the "D" stage, which is effective only for common-mode signals. Table III summarizes the performance achieved with this opamp.



Fig. 17 - Structure of three-stage SMFFC amplifier with the CM-control.

Table III – MULTISTAGE OPAMP PERFORMANCE SUMMARY

| Parameter             | Performance      |
|-----------------------|------------------|
| Technology CMOS       | 65 nm            |
| Differential Gain/UGB | 84 dB / 200 MHz  |
| Common Mode Gain/UGB  | 85 dB / 136 MHz  |
| PSRR@1MHz             | 60 dB            |
| CMRR@1MHz             | 38 dB            |
| HD3@5MHz              | -82 dBc          |
| Output Noise@1MHz     | 27 nV/√Hz        |
| Power Consumption     | 10 mW            |

## *C. Analog Filters*

Continuous-time analog filters are typically implemented using Gm-C, Active-RC or Active-Gm-RC topologies. The Active-RC and Active-Gm-RC architectures have a feedback structure, so their frequency response can be limited by the opamp GBW. However, they can achieve a large linear range [16]. On the other hand, open-loop filters (such as Gm-C) appear attractive in terms of noise and power consumption, but need a large overdrive voltage to achieve a large linear range [17].

At low supply voltage, Active-RC and Active-Gm-RC filters can process rail-to-rail signals, whereas Gm-C filters cannot and are therefore extremely inefficient in scaled technologies. As a consequence, closed-loop circuits (such as Active-RC and Active-$G_m$-RC) have to be considered. Among them, thanks to its single-opamp topology, the multi-path Active-RC cell of Fig. 18 reduces the power consumption compared with the typical two-opamp biquadratic cell [18]. However, the frequency response of this cell is affected by the opamp GBW, which may be reduced when high-gain multi-stage opamp structures (with compensation schemes that reduce the GBW) are used. In a robust design the opamp GBW has to be 50 to 100 times higher than the filter pole frequency. This problem can be solved by the corresponding Active-Gm-RC structure of Fig. 19, where the opamp frequency response is taken into account in the overall filter frequency response. In this way the opamp GBW needs to be only 2 to 3 times higher than the filter pole frequency, which is much less demanding than for the multi-path structure.
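
The difference between the two GBW requirements can be made concrete with a short calculation. The pole frequency of 10 MHz used below is an assumed round number, not a value from the paper.

```python
def required_gbw_mhz(f_pole_mhz, ratio):
    """Opamp gain-bandwidth product needed for a given GBW/pole-frequency ratio."""
    return f_pole_mhz * ratio


F_POLE_MHZ = 10.0  # assumed filter pole frequency, for illustration only

# Ratios quoted in the text: 50 to 100 for Active-RC, 2 to 3 for Active-Gm-RC.
active_rc = [required_gbw_mhz(F_POLE_MHZ, r) for r in (50, 100)]
active_gm_rc = [required_gbw_mhz(F_POLE_MHZ, r) for r in (2, 3)]

print(f"Active-RC:    GBW between {active_rc[0]:.0f} and {active_rc[1]:.0f} MHz")
print(f"Active-Gm-RC: GBW between {active_gm_rc[0]:.0f} and {active_gm_rc[1]:.0f} MHz")
```

For the same pole frequency, the Active-Gm-RC cell therefore needs an opamp that is slower by more than an order of magnitude.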







Fig. 19 - Active-Gm-RC cell



Another key problem of both Active-RC and Active-Gm-RC filters (and of any virtual-ground-based structure) is the bias voltage to be applied at the filter and opamp input and output nodes. The typical approach is to bias input and output nodes at the same voltage level. This, however, runs into the bias problem shown above for the differential input stage (which is at the input of the opamp). The problem can be solved with the scheme of Fig. 20, where two current sources ($M_{B1}$-$M_{B2}$) connected at the opamp input nodes sink a target current in order to bias the opamp input nodes at a voltage lower than that of the opamp output nodes. This is done by means of the Input-CMFB, which reduces $V_{DDmin}$ because it forces $V_{oa\_DC}$ to a value lower than $V_{DD}/2$, while maintaining $V_{i\_DC}=V_{o\_DC}=V_{DDmin}/2$. The opamp input node voltage $V_{oa\_DC}$ is given by:

*Eq. 16*

$$
V_{oa\_DC} = \frac{V_{DD\,min}}{2} - I_1 \cdot \frac{R_1 \cdot R_2}{R_1 + R_2}
$$
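
A minimal numerical sketch of Eq. 16 follows. The supply voltage of 0.55 V is the one quoted for the filter; the current $I_1$ and the resistors $R_1$ and $R_2$ are hypothetical values chosen only to show the size of the shift.

```python
def v_oa_dc(vdd_min, i1, r1, r2):
    """Opamp input common-mode voltage from Eq. 16 (volts).

    vdd_min: supply voltage (V); i1: bias current sunk at the input node (A);
    r1, r2: resistors connected to the input node (ohms).
    """
    r_parallel = r1 * r2 / (r1 + r2)
    return vdd_min / 2.0 - i1 * r_parallel


# I1, R1 and R2 are assumed values, not taken from the paper.
v_in_node = v_oa_dc(vdd_min=0.55, i1=10e-6, r1=20e3, r2=20e3)
print(f"V_oa_DC = {v_in_node:.3f} V, compared with V_DD/2 = {0.55 / 2:.3f} V")
```

With these example values the opamp input nodes sit 100 mV below mid-supply while the filter input and output remain at $V_{DD}/2$.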

Using this structure, it has been possible to design a 0.55 V analog filter in a 65 nm technology with rail-to-rail input and output swing.

| Parameter                | Value           |
|--------------------------|-----------------|
| $V_{DD}$ [V]             | 0.55            |
| CMOS [µm]                | 0.13            |
| $V_{TH}$ [V]             | 0.3             |
| Current Cons. [mA]       | 5.8             |
| Power Cons. [mW]         | 3.5             |
| Filter Order             | 4<sup>th</sup>  |
| $G$ [dB]                 | $\Omega$        |
| $f_{-3dB}$ [MHz]         | 11.3            |
| In-band IIP3 [dBm]       | 10              |
| Out-Band IIP3 [dBm]      | 13              |
| 1dBcP [dBm]              | 0.5             |
| Noise [µV<sub>rms</sub>] | 110             |
| DR [dB] - THD@40dBc      | 60              |
| Area [mm<sup>2</sup>]    | 0.43            |

Table IV – 4TH-ORDER 65NM FILTER PERFORMANCE SUMMARY

# IV. CONCLUSION

This paper has presented an overview of the challenges that scaled technologies impose on analog circuit design. In particular, the decreasing intrinsic gain, the reduction of $V_{DD}-V_{TH}$ and the lower supply voltage push analog designers to develop new circuit solutions for the analog functional blocks. The cases of the analog switch, the opamp and Active-RC filters have been studied to demonstrate that new circuit solutions can guarantee the same analog performance in scaled technologies as well.

#### **REFERENCES**

- [1] G. E. Moore, "Cramming more components on to integrated circuits", Electronics, Vol. 38, no. 8, April 19, 1965; "Progress in digital integrated.
- [2] ITRS 2000, 2003, 2005 Editions.
- [3] Hei Wong, "A Physically-Based MOS Transistor Avalanche Breakdown Model", *IEEE Trans. on Electron Devices*, Dec. 1995, pp. 2197.
- [4] Qi, X.; et al. "Efficient subthreshold leakage current optimization– Leakage...", IEEE Circ. and Dev. Mag., Sept.-Oct. 2006, pp 39 – 47.
- [5] Pineda de Gyvez, J. Tunihout, "Threshold Voltage Mismatch and Intra-Die Leakage Current in Digital CMOS Circuits", *IEEE J. Solid-State Circuits*, vol. 39, no. 12, pp. 157–168, Jan. 2004.
- [6] J. Pekarik, et al., "RFCMOS Technology from 0.25 µm to 65nm: The State of the Art", CICC 2004
- [7] Y. Tsividis, "Operation and modeling of the MOS transistor", Oxford University Press,
- [8] C. Chen, J. W. Chou, W. Lur, and S. W. Sun, "A Novel 0.25 µm Shallow Trench Isolation Technology", Electron Devices Meeting, 1996, pp. 837-840, Dec. 1996.
- [9] G. Scott, et al. , "NMOS drive current reduction caused by transistor layout and trench isolation induced stress", Electron Devices Meeting, 1999. IEDM Technical Digest. International, 5-8 Dec. 1999, pp. 827 - 830
- [10] Andrew M. Abo and Paul R. Gray, "A 1.5-V, 10-bit, 14.3-MS/s CMOS Pipeline Analog-to-Digital Converter", *IEEE J. Solid-State Circuits*, May 1999.
- [11] R. G. H. Eschauzier and J. H. Huijsing, Frequency Compensation Techniques for Low-Power Operational Amplifiers. Boston, MA: Kluwer, 1995.
- [12] J. H. Huijsing, R. Hogervorst, and K.-J. de Landen, "Low-power low-voltage VLSI operational amplifier cells," *IEEE Trans. Circuits Syst*. I, vol. 42, pp. 841– 852, Nov. 1995.
- [13] Ivonne Di Sancarlo, Dario Giotta, Andrea Baschirotto, Richard Gaggl "A 65-nm 84-dB-gain 200-MHz-UGB CMOS Fully-Differential Three-Stage Amplifier with a Novel Common Mode Control". ESSCIRC 2008 pp. 314 – 317, 15-19 Sept. 2008
- [14] S. D'Amico, V. Giannini, A. Baschirotto, "A 4<sup>th</sup> order Active-gm-RC Reconfigurable (UMTS/WLAN) Filter", IEEE Journal of Solid State Circuits – Vol. 41- No. 7 - July 2006 – pp. 1630-1637

# *TUESDAY 22 SEPTEMBER 2009*

# *PARALLEL SESSION A2 ASICS*

# Gossipo-3: A prototype of a Front-End Pixel Chip for Read-Out of Micro-Pattern Gas Detectors

Christoph Brezina<sup>b</sup>, Klaus Desch<sup>b</sup>, Harry van der Graaf<sup>a</sup>, Vladimir Gromov<sup>a</sup>, Ruud Kluit<sup>a</sup>, Andre Kruth<sup>b</sup>, Francesco Zappon<sup>a</sup>

> <sup>a</sup> National Institute for Subatomic Physics (Nikhef), Amsterdam, <sup>b</sup> Institute of Physics, Bonn University

[vgromov@nikhef.nl](mailto:vgromov@nikhef.nl)

# *Abstract*

In a joint effort of Nikhef (Amsterdam) and the University of Bonn, the Gossipo-3 integrated circuit (IC) has been developed. This circuit is a prototype of a chip dedicated for read-out of various types of position sensitive Micro-Pattern Gas detectors (MPGD).

The Gossipo-3 is defined as a set of building blocks to be used in a future highly granular (60 µm) chip. The pixel circuit can operate in two modes. In Time mode every readout pixel measures the hit arrival time and the charge deposit. For this purpose it has been equipped with a high resolution TDC (1.7 ns) covering a dynamic range of up to 102 µs. The charge collected by the pixel is measured using the Time-over-Threshold method in the range from 400 e<sup>-</sup> to 28000 e<sup>-</sup> with an accuracy of 200 e<sup>-</sup> (standard deviation). In Counting mode every pixel operates as a 24-bit counter, counting the number of incoming hits.

The circuit is also optimized to operate at low power consumption  $(100 \text{ mW/cm}^2)$  that is required to avoid the need for massive power transport and cooling systems inside the construction of the detector.

# I. Introduction

A number of features make Micro-Pattern Gas Detectors [1] (MPGD) attractive for use in particle-physics experiments, astro-particle research and medical imaging. Among those are high spatial resolution, radiation hardness and an inherently low material budget. The availability of highly integrated readout electronics allows for the design of gas-detector systems with channel densities comparable to that of modern silicon detectors. The main specifications of such an IC will be compatible with the requirements imposed upon the ATLAS Pixel System for Upgraded Luminosities.

In 2007 we submitted the Gossipo-2 chip [2] as a prototype of a pixel readout array featuring high resolution TDC-per-pixel architecture. Although this design was successful, it has been found that some blocks of the circuit need modifications.

The main goal of the present prototype, called Gossipo-3, is to optimize the design of the building blocks for a future IC dedicated to the readout of MPGDs (e.g. the Timepix2 chip).

#### II. Micro-Pattern Gas Detectors

A Micro-Pattern Gas Detector (see Figure 1) is a position-sensitive proportional counter. A gas layer is used as the signal generator. The detector includes a CMOS pixel array and a Micromegas placed at a distance of 50 µm on top of it by means of a wafer post-processing technology (Integrated Grid or InGrid) [3], [4]. Above this grid a cathode foil is built. The cathode foil and the grid are put at negative voltage and the pixel array surface is at ground potential. The volume between the drift foil and the pixel array is filled with a suitable gas mixture.

When a minimum ionizing particle (MIP) passes the drift gap (see Figure 1), some primary electron-ion pairs are created along the track. Driven by an electric field, the primary electrons drift towards the pixels [5]. In the InGrid-pixel gap an avalanche multiplication occurs, making the charge sufficient to activate an on-pixel integrated circuit. The activated pixels give the complete image of the track (the projection of the track on the array surface). Moreover, the drift time measurements at the activated pixels indicate the vertical coordinates of the primary electrons. By combining the data of all participating electrons, a track segment can be reconstructed in space.

A point of concern is high-voltage breakdown (discharges) occurring in the InGrid-pixel gap. Discharges can damage or destroy the read-out chip. This problem has been solved by placing an adequate protection layer on the surface of the chip. In this case the charge of the discharge is limited by the capacitance of the protection layer attributed to the pixel and is only 8 pC [6].



Figure 1: Layout of the micro-pattern gas detector with the amplification structure based on an integrated grid

The pixel readout chip is the basic component of the detector. It needs a high-density pixel structure for accurate coordinate measurements. It should detect single primary electrons with high efficiency. Every pixel must be equipped with a high resolution TDC for the drift time measurements.

# III. The Pixel.

Each pixel in the Gossipo-3 prototype has a charge sensitive preamplifier and a discriminator to generate a Hit signal when the threshold level has been reached (see Figure 2).



In Time mode the digital part consists of a Local Oscillator, a Fast counter, a Slow counter and a Time-over-Threshold counter. The Hit signal starts the data-taking phase (see Figure 3). It triggers the Local Oscillator (LO), which starts to run at 580 MHz  $(T=1.7 \text{ ns})$, and activates the Fast, Slow and Time-over-Threshold (ToT) counters. The LO is stopped by the next rising edge of the external Clock signal (40 MHz). The Fast counter counts the number of pulses at the output of the LO. In this way the delay of the Hit signal within one Clock period is digitized. The Slow counter gives the number of full Clock periods (40 MHz) in the time interval between the Hit and the Trigger signal. The final states of the Fast and Slow counters give the time position of the Hit signal with respect to the Trigger signal. The accuracy of the time measurement is determined by the oscillation period of the LO (1.7 ns). The period of the Clock signal (25 ns) and the number of bits in the Slow counter (12 bits) together determine the dynamic range (102 µs) of the measurement.
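
The way the two counter values combine into a time measurement can be sketched as follows. The function name and the assumption that the Trigger is aligned to a Clock edge are illustrative and not taken from the paper; the clock and oscillator frequencies and the 12-bit width are those given in the text.

```python
T_CLK_NS = 25.0          # period of the external 40 MHz Clock
T_LO_NS = 1.0e3 / 580.0  # period of the 580 MHz Local Oscillator, about 1.7 ns
SLOW_BITS = 12           # width of the Slow counter


def hit_to_trigger_ns(fast_count, slow_count):
    """Time between Hit and Trigger reconstructed from the counters (ns).

    fast_count: LO cycles between the Hit and the next Clock rising edge.
    slow_count: full Clock periods between the Hit and the Trigger.
    Assumes the Trigger coincides with a Clock edge (not stated in the text).
    """
    if not 0 <= slow_count < 2 ** SLOW_BITS:
        raise ValueError("slow_count exceeds the 12-bit counter range")
    return fast_count * T_LO_NS + slow_count * T_CLK_NS


# LO cycles that fit in one Clock period, and the span of the Slow counter.
cycles_per_clock = T_CLK_NS / T_LO_NS
dynamic_range_us = T_CLK_NS * 2 ** SLOW_BITS / 1e3

print(f"LO cycles per Clock period: {cycles_per_clock:.1f}")
print(f"Dynamic range: {dynamic_range_us:.1f} us")
print(f"Example (fast = 7, slow = 100): {hit_to_trigger_ns(7, 100):.1f} ns")
```

The Slow counter span of 2<sup>12</sup> Clock periods reproduces the quoted dynamic range of about 102 µs.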

The ToT counter counts the number of full Clock periods (T = 25 ns) in the time interval during which the Hit signal is high. This time interval is proportional to the charge collected by the pixel, so the charge deposit can be digitized. The accuracy of the ToT method is limited by the noise-related time jitter on the slow falling edge of the signal at the output of the preamplifier (see Figure 3): σ = 27 ns, corresponding to 200 e<sup>-</sup>. The ToT is linearly proportional to the input charge in the range from  $\theta$  ns to  $\theta$  us (corresponds to 28000 e<sup>-</sup>). This matches the number of bits in the ToT counter (8 bits), which gives a 6.4 µs dynamic range (25 ns × 2<sup>8</sup>).
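
The ToT figures can be cross-checked with a few lines of arithmetic. The conversion factor below is derived from the quoted pair 27 ns and 200 e<sup>-</sup> and is assumed to apply linearly; the quantization estimate (one clock period divided by √12) is an added textbook approximation, not a result from the paper.

```python
import math

T_CLK_NS = 25.0  # the ToT counter is clocked at 40 MHz
TOT_BITS = 8     # counter width given in the text

# Counter span: 2^8 clock periods of 25 ns each.
tot_range_us = T_CLK_NS * 2 ** TOT_BITS / 1e3

# Scale quoted in the text: 27 ns of time jitter corresponds to 200 electrons.
ELECTRONS_PER_NS = 200.0 / 27.0


def time_to_charge_e(time_ns):
    """Convert a ToT time (or time uncertainty) to electrons, assuming the
    200 e- per 27 ns scale applies linearly."""
    return time_ns * ELECTRONS_PER_NS


# Rms error of an ideal 25 ns quantizer (LSB / sqrt(12)), in electrons.
quantization_sigma_e = time_to_charge_e(T_CLK_NS / math.sqrt(12.0))

print(f"ToT counter range: {tot_range_us} us")
print(f"Jitter contribution: {time_to_charge_e(27.0):.0f} e-")
print(f"Clock quantization contribution: {quantization_sigma_e:.0f} e-")
```

Under these assumptions the clock quantization contributes noticeably less than the noise-related jitter, consistent with the statement that the jitter limits the accuracy.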

With the arrival of the Token signal the data read-out phase is started. In this phase all the counters are configured into serially-connected shift registers. Driven by the Clock (40 MHz), the data will be shifted to the periphery of the chip. After the readout is completed all the counters will be reset.

In Counting mode the counters are combined into a 24-bit counter. In this mode only the information on the number of hits coming to the pixel in time interval between the Reset and the Trigger signals is available.

To reduce the dispersion of the threshold levels between the pixels, each pixel has a 4-bit DAC.



Figure 3: Time diagrams of operation in Time mode

# IV. Front-end circuit.

In gas-filled pixel detectors the parasitic capacitance at the input of the read-out circuit is very low (Cpar  $\approx 10$  fF). There is no need to compensate for the detector leakage current. This allows us to design a compact, low-noise, fast, low-power front-end circuit optimized for high performance time measurements.

Besides a preamplifier, the circuit (see Figure 4) also includes a special device protecting the input against micro-discharges in the detector (see Chapter VI). The charge sensitive preamplifier has constant current feedback. In this topology the feedback transistor (Tfb) operates as a floating current source. The signal charge is integrated on the drain capacitance of the transistor, which acts as the feedback capacitance in this topology. The capacitance has been chosen as small as 1 fF in order to provide high gain. The floating current source gradually discharges the feedback capacitance, resulting in a linear falling edge of the output signal. When no signal is present the feedback transistor operates as a 30 MOhm resistor.

In this circuit, the discriminator is capacitively coupled to the preamplifier. A constant current circuit controls the bias voltage at the discriminator input. It also provides recovery to the baseline after the hit. The time constant of the recovery has been chosen large (tens of microseconds) in order to avoid signal distortion. Channel-to-channel threshold dispersion is not influenced by the offset of the preamplifier and is determined by the mismatch in the discriminator circuit (σ = 70 e<sup>-</sup>). With the help of the on-pixel DAC (4 bits) it can be reduced to σ = 5 e<sup>-</sup>.
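
A rough estimate of what a 4-bit trim DAC can achieve is sketched below. The model is an assumption made for illustration: the DAC is ideal, its range spans plus or minus a chosen number of untrimmed sigmas, the residual after trimming is uniformly distributed within one LSB, and channels outside the range are ignored. The paper does not state how the quoted 5 e<sup>-</sup> was obtained.

```python
import math


def trimmed_sigma(sigma_untrimmed, n_bits, span_in_sigmas):
    """Residual threshold spread after trimming with an ideal n-bit DAC.

    The DAC range is assumed to cover +/- span_in_sigmas * sigma_untrimmed;
    the residual is taken as uniform within one LSB (sigma = LSB / sqrt(12)).
    """
    lsb = 2.0 * span_in_sigmas * sigma_untrimmed / 2 ** n_bits
    return lsb / math.sqrt(12.0)


# 70 e- and 4 bits come from the text; the spans are assumed values.
for span in (2.0, 3.0):
    residual = trimmed_sigma(70.0, 4, span)
    print(f"DAC range +/-{span:.0f} sigma -> residual sigma = {residual:.1f} e-")
```

With a range of about two sigmas this simple model gives a residual spread of the same order as the quoted value; a wider range trades resolution for coverage.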



Figure 4: The front-end circuit

Even with a power consumption of only a few microwatts, the front-end circuit demonstrates fast response (20 ns) and low noise (σ = 70 e<sup>-</sup>).

#### V. Local oscillator circuit

The local oscillator circuit includes a NAND gate with a chain of inverters in the feedback (see Figure 5). A positive signal at the input triggers the circuit to oscillate at the frequency determined by the delay in the feedback. The oscillation frequency (580 MHz) is 14.5 times higher than the clock frequency (40 MHz). This means that 14.5 oscillator cycles are within one clock period and that the position of the leading edge of the input pulse can be determined with an accuracy of 1.7 ns.



Figure 5. The local oscillator circuit

The gate delay of the CMOS inverters is sensitive to variations in temperature and power supply voltage. Careful studies of the stability of the oscillator frequency show that the period of the oscillations increases with temperature, with a slope of 2% per 10 °C, and decreases with increasing power supply voltage, with a slope of −12% per 100 mV. This is in agreement with simulations.
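
The two quoted slopes can be combined into a first-order model of the oscillator period. Adding the two relative changes is an assumption that is only reasonable for small deviations from the nominal operating point.

```python
T_NOMINAL_NS = 1.7  # nominal LO period from the text


def lo_period_ns(delta_temp_c=0.0, delta_vdd_mv=0.0):
    """First-order LO period for small deviations from the nominal point.

    Uses the slopes quoted in the text: +2 % per 10 degC and -12 % per 100 mV.
    Summing both relative changes is an assumed small-signal approximation.
    """
    relative_change = 0.02 * delta_temp_c / 10.0 - 0.12 * delta_vdd_mv / 100.0
    return T_NOMINAL_NS * (1.0 + relative_change)


print(f"+10 degC:           {lo_period_ns(delta_temp_c=10.0):.3f} ns")
print(f"+100 mV:            {lo_period_ns(delta_vdd_mv=100.0):.3f} ns")
print(f"+30 degC and +50 mV: {lo_period_ns(30.0, 50.0):.3f} ns")
```

In this model a temperature rise of 30 °C is offset by raising the supply by 50 mV, which illustrates why an adjustable supply is a convenient tuning knob.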

Modeling and measurements demonstrate that the channel-to-channel spread of the oscillation frequency can be kept low when the delay components are properly sized. Then, even in the worst case (when the circuit is active for the whole 25 ns clock period), the accumulated error will be less than the period of oscillation (1.7 ns). This means that there will be no discrepancy in the number of pulses generated by different local oscillator circuits within one pixel array.
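
The worst-case condition can be turned into a bound on the tolerable period mismatch. The sketch below uses a first-order estimate in which the accumulated error equals the relative period error times the active time; the 5% example value is hypothetical.

```python
T_CLK_NS = 25.0  # longest time the oscillator can be active
T_LO_NS = 1.7    # nominal oscillation period


def accumulated_error_ns(relative_period_error, active_time_ns=T_CLK_NS):
    """Timing error built up by an oscillator whose period is off by the
    given fraction, after running for active_time_ns (first-order estimate)."""
    return relative_period_error * active_time_ns


# Largest relative mismatch for which the error stays below one LO period.
max_relative_error = T_LO_NS / T_CLK_NS

print(f"Tolerable period mismatch: {100.0 * max_relative_error:.1f} %")
print(f"Error for a 5 % mismatch:  {accumulated_error_ns(0.05):.2f} ns")
```

Under this estimate the channel-to-channel mismatch has to stay below roughly 7% for the pulse counts of different pixels to agree.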

On the contrary, effects caused by variation of the fabrication process are much more significant. According to simulations the oscillation frequency in the fast corner is twice that in the slow corner.

Note that the frequency can be set to the required value by adjusting the supply voltage (Vdd) of the circuit. In the fast corner Vdd should be lowered, and in the slow corner a higher value of Vdd is needed. To make it possible to tune the oscillator frequency, a low-drop voltage regulator (LDO) has been designed.

# V. Low-drop voltage regulator

Figure 6 shows a simplified schematic of the voltage regulator, which generates a stable and adjustable (controlled by Uref) power supply voltage for all local oscillator circuits in the pixel read-out array. This topology involves an off-chip capacitor (10 µF). To reach the required performance the capacitor must have a low equivalent series resistance (less than 1 Ohm); in addition, the connection to the capacitor (on-chip wiring and package bonding wires) needs to have low resistance.



Figure 6. Simplified schematic of the LDO for all local oscillator circuits

The output voltage can be set in the range from 0.6 V to 1.1 V, which is sufficient to adjust oscillator circuits fabricated in all corners of the process. The circuit has a high PSRR (40 dB) over a wide frequency range. The equivalent output impedance is low (less than 1 Ohm), which keeps the output voltage stable when the load current changes.

#### VI. Discharge protection device

With the protection layer placed on the surface of the readout chip, the size of the discharge (in the case of a high-voltage breakdown between the InGrid and the pixel) is reduced to 8 pC. Even so, such a signal builds up a critical voltage at the input of the preamplifier and can damage the front-end circuit. In the Gossipo-3 we use a standard N-channel transistor as a protection device (see Figure 7).
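
An order-of-magnitude estimate shows why even the reduced discharge is dangerous. The sketch assumes that the full 8 pC lands on an unprotected input node whose only capacitance is the parasitic value of about 10 fF quoted for the front-end input; all other capacitances and conduction paths are ignored.

```python
def node_voltage_v(charge_c, capacitance_f):
    """Voltage developed on a capacitance by a deposited charge (V = Q / C)."""
    return charge_c / capacitance_f


DISCHARGE_C = 8e-12   # residual discharge, 8 pC
C_INPUT_F = 10e-15    # parasitic input capacitance, about 10 fF

print(f"Unprotected node voltage: {node_voltage_v(DISCHARGE_C, C_INPUT_F):.0f} V")
```

A voltage of this size is far beyond what a thin gate oxide can withstand, so a dedicated path for the discharge current is required.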



Figure 7. Layout of the device, protecting against discharges in the detector

In this device the drain, the bulk and the gate are tied to ground and the source is connected to the node to be protected. The inversion layer under the gate and the p-n junction in the substrate form two channels draining the discharge current, so the voltage at the protected node will not exceed the critical level even when a small device (W/L = 1 µm/0.24 µm) is used. Under typical operating conditions this device does not introduce noticeable leakage current (250 pA) or parasitic capacitance (1.3 fF) into the circuit.

#### VII. Summary

The Gossipo-3 is a prototype of building blocks to be used in a future front-end pixel chip for read-out of Micro-Pattern Gas Detectors.

Every pixel will be equipped with a high resolution TDC (1.7 ns) covering a dynamic range of up to 100 µs and a Time-over-Threshold counter to evaluate the charge deposit. The chip will also be able to operate in hit counting mode.

Each pixel has a low-noise (σ = 70 e<sup>-</sup>), fast (20 ns response) and low-power (3 µW) front-end circuit optimized for high performance time measurements. A compact device placed at the input of the front-end circuit provides protection against micro-discharges taking place in the detector.

To tune the oscillation frequency of the on-pixel local oscillator circuits, a voltage regulator has been designed. It will provide a stable, adjustable and load-current-independent power supply voltage for all pixel oscillators in the read-out array.

Gossipo-3 has been taped out for an MPW production run in 0.13 µm CMOS technology (September 2009).

# VI. References.

[1] Maxim Titov, "New developments and future perspectives of gaseous detectors", Nucl. Instr. and Methods, A581 (2007), pp. 25-37

[2] V.Gromov, R.Kluit, H. van der Graaf, "Development of a Front-end Pixel Chip for Readout of Micro-Pattern Gas Detectors.", Proceedings of the TWEPP-08 Topical Workshop on Electronics for Particle Physics, Naxos, Greece, 15-19 September 2008.

[3] M.Campbell et al, "GOSSIP: A vertex detector combining a thin gas layer as signal generator with a CMOS readout pixel array", *IEEE Trans. Nucl. Sci.,* vol.46, No.6, 1999.

[4] Harry van der Graaf et al., "Novel gas-based detection techniques", *Nucl. Instrum. and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment*, Volume 604, Issues 1-2, 1 June 2009, Pages 5-7.

[5] V.M. Blanco Carballo et al., "Charge amplitude distribution of the Gossip gaseous pixel detector", *Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment*, Volume 583, Issue 1, 11 December 2007, Pages 42-48.

[6] Harry van der Graaf et al., "New developments in gaseous tracking and imaging detectors", *Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment*, Volume 607, Issue 1, 1 August 2009, Pages 78-80.

# DIRAC v2: a DIgital Readout Asic for hadronic Calorimeter

R. Gaglione<sup>ab</sup>, C. Adloff<sup>b</sup>, M Chefdeville<sup>b</sup>, C. Drancourt<sup>b</sup>, G Vouters<sup>b</sup>

<sup>a</sup> IPNL, Université de Lyon, Université Lyon 1, CNRS/IN2P3, France / MICRHAU

<sup>b</sup> LAPP, Université de Savoie, CNRS/IN2P3, France

gaglione@lapp.in2p3.fr

# *Abstract*

DIRAC is a 64-channel mixed-signal readout integrated circuit designed for Micro-Pattern Gaseous Detectors (MICROMEGAS, Gas Electron Multiplier) or Resistive Plate Chambers. These detectors are foreseen as the active part of a digital hadronic calorimeter for a high energy physics experiment at the International Linear Collider. Physics requirements lead to a highly granular hadronic calorimeter with up to thirty million channels, probably with only hit information (digital calorimeter). The DIRAC ASIC has been specially designed for these constraints. Each channel of the DIRAC chip is made of a 4-gain charge preamplifier, a DC-servo loop, 3 switched comparators and a digital memory, thus providing additional energy information for a hit. A bulk MICROMEGAS detector with an embedded DIRAC v1 ASIC has been built. The tests of this assembly, both in the laboratory with X-rays and in a beam at CERN, are presented, demonstrating the feasibility of a bulk MICROMEGAS detector with embedded electronics. The second version of the ASIC, with improved noise and additional functionalities, has been tested on the bench; its characterisation is detailed and the foreseen associated detectors are presented.

# I. DIGITAL HADRONIC CALORIMETER AT THE INTERNATIONAL LINEAR COLLIDER

MICRO MEsh GAseous Structure (MICROMEGAS [1]), Gas Electron Multiplier and Resistive Plate Chamber detectors are three candidates for the active part of a Digital Hadronic CALorimeter (DHCAL) for a high energy physics experiment at the International Linear Collider. Physics requirements lead to a highly granular hadronic calorimeter with up to thirty million channels, probably with only hit information (digital calorimeter).

To validate the concept of digital hadronic calorimetry, a 1 m<sup>3</sup> technological prototype, made of 40 planes of 1 m<sup>2</sup> each, will be built.

Such a technological prototype involves no fewer than 400 000 electronic channels, which has led to the development of the DIRAC ASIC.
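
The channel count follows directly from the prototype geometry, as the short calculation below shows. It assumes pads of 1 cm<sup>2</sup>, the pad size used for the detector capacitance in Table 1, and 64 channels per ASIC; the resulting number of chips is an inferred figure, not one quoted in the paper.

```python
PLANES = 40                   # planes in the 1 m^3 prototype
PLANE_AREA_CM2 = 100 * 100    # 1 m^2 expressed in cm^2
PAD_AREA_CM2 = 1              # assumed pad size of 1 cm^2
CHANNELS_PER_ASIC = 64        # channels per DIRAC chip

channels = PLANES * PLANE_AREA_CM2 // PAD_AREA_CM2
asics = -(-channels // CHANNELS_PER_ASIC)  # ceiling division

print(f"Electronic channels: {channels}")
print(f"ASICs needed:        {asics}")
```

Several thousand chips would thus be needed for the prototype alone, which motivates a chip designed to be daisy-chained.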

## *A. Detectors signals*

Table 1 shows the signal characteristics of the three foreseen gaseous detectors. MICROMEGAS and GEM have roughly the same amplitude, whereas RPCs, which have a much higher gain, provide ten times more charge than Micro Pattern Gaseous Detectors. This implies different dynamic ranges in the ASIC.

Although the signals have different shapes, their bandwidths are very close and the detector capacitances are similar, so they can be handled by the same preamplifier.

Table 1: Signal characteristics of foreseen detectors

|                                          | MICROMEGAS    | GEM      | GRPC      |
|------------------------------------------|---------------|----------|-----------|
| Charge                                   | 1–100 fC      | 1–100 fC | 0.1–10 pC |
| C<sub>det</sub> (1 cm<sup>2</sup>)       | 60 pF         | 60 pF    | 60 pF     |
| t,                                       | < 2 ns        | < 2 ns   | < 2 ns    |
| Pulse width                              | complex shape | 20 ns    | 20 ns     |

# *B. Collider timing*

In addition to the detector signals, the ASIC must match the beam timing. The beam is composed of bunch-crossing trains every 200 ms (Figure 1). Inside a train, bunch crossings are periodic, according to Table 2 [2]. As the data rate for the HCAL is foreseen to be low (1 hit per channel per train), raw data will be held in front-end memory during trains and read by the data acquisition system between trains. Thus, no trigger decision is needed in the front end. Additionally, to save electrical power, the analog front-end will be shut down outside trains, giving an ON/OFF ratio lower than 1%.

Table 2: ILC beam train timing characteristics

|                    | Minimum | Nominal | Maximum |
|--------------------|---------|---------|---------|
| Bunches per train  | 1320    | 2625    | 5120    |
| Bunch period (ns)  | 189     | 369     | 480     |
| Train length (µs)  | 250     | 1000    | 2500    |
| ON/OFF ratio (%)   | 0.1     | 0.5     | 1.25    |
| Rate (Hz)          |         |         |         |
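
The train length and ON/OFF ratio in Table 2 can be cross-checked from the bunch count, the bunch period and the 200 ms train spacing. The helper names below are illustrative; the products come out close to the rounded entries of the table.

```python
TRAIN_PERIOD_MS = 200.0  # one bunch train every 200 ms

# (bunches per train, bunch period in ns), taken from Table 2
SCENARIOS = {
    "minimum": (1320, 189),
    "nominal": (2625, 369),
    "maximum": (5120, 480),
}


def train_length_us(n_bunches, period_ns):
    """Duration of one train in microseconds."""
    return n_bunches * period_ns / 1e3


def on_off_ratio_percent(length_us):
    """Fraction of time the beam is on, as a percentage of the train period."""
    return 100.0 * length_us / (TRAIN_PERIOD_MS * 1e3)


for name, (n_bunches, period_ns) in SCENARIOS.items():
    length = train_length_us(n_bunches, period_ns)
    print(f"{name}: train length {length:.1f} us, "
          f"ON/OFF ratio {on_off_ratio_percent(length):.2f} %")
```

These duty cycles indicate how much average power can be saved by switching the analog front-end off between trains.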



Figure 1: ILC beam structure

# II. DIRAC V2 ASIC

The DIRAC ASIC has been specifically designed to comply with ILC DHCAL requirements, for Micro Pattern Gaseous Detectors on the one hand and for Resistive Plate Chambers on the other hand.

A chip is made of 64 channels, divided into 2 banks of 32 channels. Each channel is made of a gated integrator with four dynamic ranges (50, 100, 200 fC or 10 pC), a baseline restorer and three comparators. Thresholds are common to a bank of 32 channels, and each threshold is set by an 8-bit DAC, so there are six DACs in the chip. The 2-bit result of the comparison is stored in an 8-event-deep memory and stamped with a 12-bit bunch identifier. Additionally, each channel has a trigger masking bit. A multiplexed analog readout has been implemented for fine detector measurements. A detailed schematic of the architecture of the second version of this chip is given in Figure 2.
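
One way in which three comparator outputs can be reduced to a 2-bit result is sketched below. The paper does not give the actual bit assignment used in DIRAC; counting the thresholds that are reached is an assumed thermometer-style encoding, and the threshold values are hypothetical.

```python
BANKS = 2
THRESHOLDS_PER_BANK = 3
DAC_COUNT = BANKS * THRESHOLDS_PER_BANK  # six threshold DACs per chip


def encode_hit(charge_fc, thresholds_fc):
    """Return a 2-bit code: the number of thresholds the charge reaches (0 to 3).

    thresholds_fc: three thresholds in ascending order, one per comparator.
    This counting scheme is an assumption, not the documented DIRAC encoding.
    """
    if len(thresholds_fc) != 3 or list(thresholds_fc) != sorted(thresholds_fc):
        raise ValueError("expected three thresholds in ascending order")
    return sum(charge_fc >= t for t in thresholds_fc)


EXAMPLE_THRESHOLDS_FC = (10.0, 50.0, 150.0)  # hypothetical settings

for charge in (5.0, 20.0, 60.0, 200.0):
    code = encode_hit(charge, EXAMPLE_THRESHOLDS_FC)
    print(f"{charge:6.1f} fC -> code {code:02b}")
```

Three thresholds give four possible outcomes, which is why two bits per event are enough to store coarse energy information.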



Figure 2: DIRAC v2 architecture



Figure 3: Daisy chain of several ASIC

During the bunch crossing, the gated integrator, synchronized to the LVDS clock, acquires the signal from the detector; at the end of this period, the measured charge is compared to the thresholds and the result is stored. Outside trains, the analog front-end is shut down, the memory is emptied and, if needed, the configuration is performed.

Configuration and readout use LVCMOS serial digital signals, and the output flags are open-drain signals. Thus, several ASICs may be chained (Figure 3) to equip a large area detector.

All the pins for digital I/O and daisy chaining are on the left side of the die (Figure 4); the analog power supplies, external voltage references and bias are on the right side; the detector inputs are on the lower and upper parts of the die. This simplifies the PCB routing, as the PCB is a part of the gaseous detector. The dimensions of the die are only 1.6 mm $\times$ 4.8 mm, and the chosen technology is the austriamicrosystems 0.35 µm CMOS process.



Figure 4: Die photograph of DIRAC v2

# III. ASIC TESTS

### *A. Bench*

The layout of the test board is as close as possible to that of the future detector boards. The acquisition chain used to test the DIRAC v2 ASIC consists of: a PC running Labview software; a CALICE HCAL DIF board [3], linked to the PC by USB, whose FPGA provides the state machine and clock generation for the ASIC; an intermediate board, which provides reference voltages (and will be used in the future for detector biasing); and finally the test board, with a socket for the ASIC on one side and 64 anodes of 1 cm<sup>2</sup> each to simulate the detector capacitance.

# *B. Procedure*

It is important to characterize each of the five received prototypes and to measure their dispersions, in order to check whether individual channel and/or chip calibration will be compulsory for the foreseen digital calorimeter. Thus, for each channel, each comparator has to be checked, and some figures of merit have to be extracted to verify conformity. The chosen figures of merit are the gain and the pedestal of each comparator, and the linearity error. The same methodology will be used in production to verify that each ASIC meets the specifications:

- Measure trigger efficiency *vs* threshold $(t)$ for different input charges $(q)$;
- For each input charge, extract the s-curve and fit it with a Fermi-Dirac distribution:

$$
S(t,q) = \frac{max}{1 + e^{\frac{t - \mu(q)}{w}}}
$$

where $max$ is the maximum efficiency, $\mu$ the abscissa of the inflexion point and $w$ the inflexion slope.

One example of s-curve and fit, for an injected charge of 60 fC can be seen in Figure 5. Then, the linearities can be plotted:

•  $M(\mu(q))$ :  $\mu$  *vs* input charge for each channel;

• Linear fit:

$$
F(q) = \frac{1}{g} \cdot q + b \qquad g: \text{gain}, \quad b: \text{pedestal}
$$

• Linearity error:  $\frac{F - M}{M}$  normalized in %.
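The procedure above can be sketched in a few lines of Python. This is only an illustration on synthetic, noise-free data, not the ROOT/C++ framework used by the authors: the s-curve is fitted here by a simple logit linearisation, $\ln(max/S - 1) = (t - \mu)/w$, rather than by a full non-linear fit, and the gain, pedestal and width values used to generate the data are assumptions chosen for the example.

```python
import math


def s_curve(t, mu, w, eff_max=1.0):
    """Fermi-Dirac-like trigger efficiency versus threshold t."""
    return eff_max / (1.0 + math.exp((t - mu) / w))


def linear_fit(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_s_curve(thresholds, efficiencies, eff_max=1.0):
    """Estimate (mu, w) using ln(max/S - 1) = (t - mu) / w."""
    ts, logits = [], []
    for t, s in zip(thresholds, efficiencies):
        # Keep only the transition region, where the logit is well defined.
        if 0.02 * eff_max < s < 0.98 * eff_max:
            ts.append(t)
            logits.append(math.log(eff_max / s - 1.0))
    slope, intercept = linear_fit(ts, logits)
    return -intercept / slope, 1.0 / slope


def gain_and_pedestal(charges, mus):
    """Fit F(q) = q/g + b to the inflexion points; returns (g, b)."""
    slope, intercept = linear_fit(charges, mus)
    return 1.0 / slope, intercept


def linearity_error_percent(charges, mus, g, b):
    """(F - M) / M in percent for each injected charge."""
    return [100.0 * ((q / g + b) - m) / m for q, m in zip(charges, mus)]


# Synthetic data: assumed gain 1.0, pedestal 6.0 and width 1.5 (threshold units).
charges = [20.0, 60.0, 100.0, 150.0, 200.0]
thresholds = list(range(256))
mus = []
for q in charges:
    curve = [s_curve(t, q / 1.0 + 6.0, 1.5) for t in thresholds]
    mus.append(fit_s_curve(thresholds, curve)[0])

g, b = gain_and_pedestal(charges, mus)
errors = linearity_error_percent(charges, mus, g, b)
```

On real data the efficiencies carry statistical fluctuations, so a weighted or non-linear fit would normally replace the logit shortcut used here.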



Figure 5: Example of s-curve

The input charge is injected with a GPIB-controlled pulse generator delivering a voltage pulse to the on-chip 1 pF test capacitor through a -20 dB attenuator. All data are collected on the PC by the Labview software and analyzed offline with a ROOT/C++ framework.

# *C. Results*

The characterisation of DIRAC v1 has been described in detail in [4].

DIRAC v2 brings some improvements to these results: Table 3 gives the results of a Gaussian fit to the gain and pedestal distributions for the five prototypes. These values are good enough to avoid any channel-to-channel ASIC calibration in the calorimeter.

The linearity is within  $+1/-3\%$  on the 20–200 fC range.

The noise, extracted from the s-curve, is the width of the transition from 100% to 0% efficiency. At  $5\sigma$ , the noise is less than 5 fC.

The minimum threshold is less than 10 fC, which corresponds to half of the most probable signal from minimum ionising particles in the MICROMEGAS detectors developed at LAPP.

Moreover, a power-on time of less than 3  $\mu$s has been measured. To obtain this result, the efficiency and the inflexion point of the s-curve have been observed as a function of the power-on time.

Table 3: Prototype dispersion summary

| Chip ID        |       |      |      |      |      |      |
|----------------|-------|------|------|------|------|------|
| Gain (fC/DACU) | mean  |      | 1.0  |      | 1.0  |      |
|                | sigma | 0.03 | 0.03 | 0.02 | 0.02 | 0.02 |
| Pedestal (fC)  | mean  | 6.2  | 4.0  | 6.7  | 6.9  | 5.6  |
|                | sigma | 2.0  | 1.6  | 1.8  |      |      |

<sup>1</sup>In a DHCAL, this cover will be a part of the calorimeter absorber.

# IV. ASIC EMBEDDED IN MICROMEGAS CHAMBER

To minimize shower leakage and the HCAL diameter, the active medium must be thinner than 8 mm. In that respect, the Bulk MICROMEGAS is an attractive option, as the fabrication process allows a PCB with front-end ASICs soldered on one side and anode pads patterned on the other side to be equipped with a MICROMEGAS mesh. This is crucial for the construction of a real scale DHCAL, for which large areas should be instrumented and dead zones minimized to ensure a good hermeticity of the HCAL.

A MICROMEGAS detector consists of a gas volume separated into a drift region and an amplification region by a thin mesh. An industrial technology, called bulk [5], has been chosen. The drift region is defined by a 3 mm thick resin frame manufactured by stereolithography. This frame also provides the gas inlet and outlet. The drift electrode (cathode) is made of a 5 µm thick copper foil and a 75 µm Kapton insulator, glued together on a 2 mm thick steel plate, which is the MICROMEGAS chamber lid<sup>1</sup>. The amplification electrode consists of a woven mesh of stainless steel wires (18 µm diameter, 56 µm pitch) held by insulating pillars patterned by photolithography (400 µm diameter) at precisely 128 µm from the anodes. This technology has been chosen for its performance (especially its near 100% efficiency for MIPs), its robustness [6][7] and its ease of industrialisation in a PCB workshop (a large number of detectors and large areas are expected in the future). A cut view of the detector is presented in Figure 6.



Figure 6: Micromegas with embedded ASIC

The first tests were made with an X-ray source  $(^{55}\text{Fe})$  through a small hole in the lid of the chamber. The energy resolution is measured and the gain of the detector *vs* high voltage is checked (Figure 7). The pedestal is at 5 ADC counts and the photopeak at about 770 ( $\sigma$=63). This gives an energy resolution  $\sigma_E/E$  of 8.5% at 5.9 keV. A typical gain is 10<sup>4</sup> for a -420 V mesh voltage.



Figure 7: Detector characterisation

This chamber has been tested in a 200 GeV pion beam at the CERN/SPS [8]. A drift cathode voltage of 460 V and a mesh voltage of 410 V have been used. Thresholds have been set to DAC codes 24, 40 and 80 (corresponding to 19, 32 and 64 fC respectively). Figure 8 shows the beam profile in hit counts.
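The three threshold settings quoted above can be turned into a small sketch of the semi-digital decision. The conversion factor of 0.8 fC per DAC code is not stated in the text; it is inferred here from the quoted pairs (24, 40, 80 DAC codes and 19, 32, 64 fC). The encoding of the 2-bit result as "number of thresholds exceeded" is likewise an assumption made for illustration, since the actual bit encoding of the chip is not described.

```python
# Assumed conversion, inferred from 24/40/80 DAC codes <-> 19/32/64 fC.
FC_PER_DAC = 0.8


def dac_to_charge_fc(dac_code, fc_per_dac=FC_PER_DAC, pedestal_fc=0.0):
    """Convert a threshold DAC code to an equivalent input charge in fC."""
    return dac_code * fc_per_dac + pedestal_fc


def encode_hit(charge_fc, thresholds_fc):
    """Number of thresholds exceeded (0 to 3), which fits in 2 bits.

    Hypothetical encoding used for illustration only.
    """
    return sum(charge_fc > threshold for threshold in thresholds_fc)


thresholds_fc = [dac_to_charge_fc(code) for code in (24, 40, 80)]

for charge in (10.0, 25.0, 50.0, 100.0):
    print(f"{charge:6.1f} fC -> code {encode_hit(charge, thresholds_fc)}")
```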

The measured hit multiplicity (mean number of pads hit per trigger) is 1.10. This is in very good agreement with previous measurements performed with analog readout prototypes of the same chamber geometry [9] [10].



Figure 8: Beam profile

# V. FUTURE IMPROVEMENT

Recent physics simulations may indicate that the lowest threshold for a DHCAL should be around 1 MIP-MPV. However, a 97% efficiency for the detector involves a minimum threshold of 1.3 fC. Thus, to perform further R&D on the MICROMEGAS detector, a new preamplifier is needed, with improved gain, bandwidth and noise, and without raising the power consumption too much. An improved design, presented in Figure 9, has been started. The differences with respect to the first version are a gain boost stage, a cascoded current source (I2) and a voltage follower at the output. With  $I1=200 \mu A$ ,  $I2=20 \mu A$ , I3=I4=10  $\mu$A, the preliminary simulation results give a gain of 108 dB, a bandwidth of 8 kHz, and a gain-bandwidth product of 2 GHz. The phase margin is 72° and the minimum phase (in the bandwidth) is 51°. With no increase in the bias current, the rms noise is 2.2 mV with  $C_{\text{det}}$=60 pF. This value will be improved by increasing I1.
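The quoted gain, bandwidth and gain-bandwidth product can be checked against each other with a short calculation. The sketch below assumes that the 108 dB figure is a voltage gain (20 log<sub>10</sub> convention) and that the gain-bandwidth product is simply the linear gain multiplied by the bandwidth; the function names are illustrative.

```python
def db_to_linear(gain_db):
    """Convert a voltage gain in dB (20*log10 convention) to a linear ratio."""
    return 10 ** (gain_db / 20.0)


def gain_bandwidth_hz(gain_db, bandwidth_hz):
    """Gain-bandwidth product for a single-pole response."""
    return db_to_linear(gain_db) * bandwidth_hz


gbw = gain_bandwidth_hz(108.0, 8e3)
print(f"GBW = {gbw / 1e9:.2f} GHz")
```

Under these assumptions the product comes out close to the 2 GHz quoted in the text.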



Figure 9: Preamplifier schematic

# VI. CONCLUSION

The construction of a bulk MICROMEGAS chamber with embedded ASIC has been demonstrated by producing and testing a first prototype equipped with a DIRAC v1 chip. The second version of the DIRAC ASIC is currently under characterisation. Further measurements of detection efficiency and hit multiplicity are foreseen to assess in more detail the performance of DIRAC based MICROMEGAS chambers. For that purpose, four chambers of small size  $(8\times8 \text{ cm}^2)$  are well suited and are currently under development. These measurements will be performed with beam at CERN. In parallel, the design of a medium area PCB ( $32\times48$  cm<sup>2</sup>) has begun, in order to build a 1 m<sup>2</sup> MICROMEGAS chamber.

# VII. ACKNOWLEDGMENTS

The DIRAC v1 preamplifier has been designed by H. Mathez, the acquisition board by C. Girerd and the acquisition software by C. Combaret. The mask and the mesh lamination have been performed at CERN by R. De Oliveira and his TS/DEM group. Chamber design, assembly, handcrafting and tests have been conducted at LAPP by the LC-Detector group. Special thanks are addressed to F. Peltier for his involvement in the detector assembly.

#### VIII. REFERENCES


- [1] I. Giomataris, P. Rebourgeard, J.-P. Robert, G. Charpak, *MICROMEGAS: A high-granularity position-sensitive gaseous detector for high particle-flux environments*, doi:10.1016/0168-9002(96)00175-1, *NIM-A* 376-1 29–35, June, 21, 1996.
- [2] C. Adloff *et al.*, *International Linear Collider Reference Design Report*, ILC Global Effort and World Wide Study, October, 2007.
- [3] CALICE wiki: [https://twiki.cern.ch/twiki/bin/](https://twiki.cern.ch/twiki/bin/view/CALICE/DetectorInterface) [view/CALICE/DetectorInterface](https://twiki.cern.ch/twiki/bin/view/CALICE/DetectorInterface)
- [4] R. Gaglione and H. Mathez, *DIRAC: a Digital Readout Asic for hAdronic Calorimeter*, in *IEEE-NSSS conference record*, October, 19–25, 2008, Dresden, 1 1815-1819.
- [5] I. Giomataris, R. De Oliveira *et al.*, *MICROMEGAS in a bulk*, doi:10.1016/j.nima.2005.12.222, *NIM-A* 560-2 405–408, May, 10, 2006.
- [6] P. Baron *et al.*, *Large bulk-micromegas detectors for TPC applications in HEP*, in *IEEE-NSSS conference record*, October, 27–28, 2007, Honolulu, 6 4640–4644.
- [7] T. Alexopoulos *et al.*, *Development of large size Micromegas detector for the upgrade of the ATLAS Muon system*, doi:10.1016/j.nima.2009.06.113, *NIM-A* in press.
- [8] R. Gaglione, C. Adloff, M. Chefdeville, A. Espargilière, N. Geffroy, Y. Karyotakis and R. De Oliveira, *A MICROMEGAS chamber with embedded DIRAC ASIC for hadronic calorimeter*, First Conference on Micro-Pattern Gaseous Detectors proceedings, Kolympari, Crete, Greece, June, 12–15, 2009, *JINST* in press.
- [9] C. Adloff, A. Espargilière, Y. Karyotakis, *Large surface MicroMegas with embedded front-end electronics for a digital hadronic calorimeter*, in *IEEE-NSSS conference record*, October, 19–25, 2008, Dresden, 1 1433-1435.
- [10] C. Adloff, J. Blaha, M. Chefdeville, A. Dalmaz, C. Drancourt, A. Espargilière, R. Gaglione, R. Gallet, N. Geffroy, J. Jaquemier, Y. Karyotakis, F. Peltier, J. Prast, G. Vouters, D. Attié, P. Colas, I. Giomataris, *MICROMEGAS chambers for hadronic calorimetry at a future linear collider*, arXiv:0909.3197, *JINST* in press.

# HARDROC, Readout chip of the Digital Hadronic Calorimeter of ILC

S. Callier<sup>a</sup>, F. Dulucq<sup>a</sup>, C. de La Taille<sup>a</sup>, G. Martin-Chassard<sup>a</sup>, N. Seguin-Moreau<sup>a</sup>

<sup>a</sup> OMEGA/LAL/IN2P3, LAL Université Paris-Sud, Orsay, France

seguin@lal.in2p3.fr

# *Abstract*

HARDROC (HAdronic Rpc Detector ReadOut Chip) [1] is the very front end chip designed for the readout of the RPC or Micromegas foreseen for the Digital HAdronic CALorimeter (DHCAL) of the future International Linear Collider.

The very fine granularity of the ILC hadronic calorimeters (1 cm<sup>2</sup> pads) implies a huge number of electronics channels ($4 \times 10^5$/m<sup>3</sup>), which is a new feature of "imaging" calorimetry.

Moreover, for compactness, the chips must be embedded inside the detector making crucial the reduction of the power consumption to 10  $\mu$ W per channel. This is achieved using power pulsing, made possible by the ILC bunch pattern (1 ms of data acquisition for 199 ms of dead time).

HARDROC readout is a semi-digital readout with three thresholds which allows both good tracking and coarse energy measurement, and also integrates on chip data storage.

The overall performance of HARDROC will be described with detailed measurements of all the characteristics. Hundreds of chips have indeed been produced and tested before being mounted on printed boards developed for the readout of large scale  $(1m^2)$  RPC and Micromegas prototypes. These prototypes have been tested with cosmics and also in testbeam at CERN in 2008 and 2009 to evaluate the performance of different kinds of GRPCs and to validate the semi-digital electronics readout system in beam conditions.

### I. REQUIREMENTS

# *A. "Imaging calorimetry" at ILC*

Imaging calorimetry consists of reconstructing each particle individually using the Particle Flow Algorithm (PFA). The calorimeters have to be highly granular and segmented.

To address the R&D for calorimeters efficiently, the CALICE collaboration [2] was created in 2003. This collaboration gathers around 280 physicists and engineers from 11 countries and 42 labs. CALICE has chosen to separate 2 axes of R&D. It was first decided to build "physics prototypes" in order to study the PFA, validate the simulation and check the performance of the detectors in test beam. The Electromagnetic Calorimeter (ECAL) made of W-Si and the Analog Hadronic Calorimeter (AHCAL) made of scintillating tiles read out by SiPMs have been tested in test beam at CERN, DESY and FERMILAB from 2003 up to last year, providing good physics data.

The CALICE collaboration then decided to move to large scale "technological prototypes" funded by the EUDET European program [3]. The aim of these prototypes is to study the feasibility of large scale, industrializable modules.

# *B. Electronics requirements*

Electronics requirements are very stringent. A hundred million large dynamic range channels have to be read out; on-chip zero suppression and auto-triggering on  $\frac{1}{2}$  MIP or on a few fC must be performed with ultra-low power: < 25 µW/channel.



Figure 1: ILC bunch pattern

Moreover for compactness, chips have to be embedded inside the detector without any external circuitry.

To minimize the data lines and the power consumption, the readout architecture [4] is common to all the calorimeters and based on a daisy chain using a token ring mode as shown in Figure 2. This readout matches the ILC beam shown in Figure 1.



Figure 2: Common readout architecture

# *C. DHCAL, Digital Hadronic Calorimeter*

There are 2 options for the ILC hadronic calorimeter: the conservative analog option (AHCAL), using an analog readout, and the digital option (DHCAL), well suited to the PFA and to the high granularity of the detector, which allows a "semi-digital" readout.

# *D. Technological prototype*

The absorbers are made of 40 steel plates of 20 mm  $(\sim 1\,X_0)$. As for the active medium, 2 options are also studied: Gaseous Resistive Plate Chambers or GRPC [5] and Micromegas [6].

In both cases, the granularity is high (1×1 cm<sup>2</sup>) as well as the segmentation ($5 \times 10^7$ channels for the entire HCAL).

# II. HARDROC

HARDROC is the name of the chip designed in SiGe 0.35 µm technology to read out the DHCAL. There have been 2 versions of this chip (HARDROC1 and 2). The main difference between these 2 versions is the package, which is a thin plastic 160-pin package for HARDROC2 (Figure 3), more suitable for embedding inside the detector.



Figure 3: TQFP160 package

The HARDROC readout is a semi-digital readout with two or three thresholds which allows both good tracking and coarse energy measurement, and also integrates on chip data storage.

The 64 channels of HARDROC2 (Figure 4) are made of:

- A fast low impedance preamplifier with a variable gain over 8 bits per channel
- A variable slow shaper (50-150 ns) and a Track and Hold to provide a multiplexed analog charge output up to 10 pC, only used for diagnostics
- 3 variable gain fast shapers followed by 3 low offset discriminators to auto-trigger from 10 fC up to 10 pC. The thresholds are in a ratio 1-10-100 for better physics performance of the semi-digital readout and are set by 3 internal 10-bit DACs. The 3 discriminator outputs are encoded in 2 bits
- A 128 deep digital memory to store the 2\*64 encoded discriminator outputs and the bunch crossing identification coded over a 24-bit counter
- Power pulsing and integration of a POD (Power On Digital) module for the management of the 5 MHz and 40 MHz clocks during the readout [4], to reach 10 µW/channel
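The memory figures in the list above imply a rough size for the on-chip storage. The sketch below assumes that each stored event consists only of the 2 bits per channel and the 24-bit bunch crossing identifier; any additional header or chip identifier bits that the real frame may carry are not described here and are ignored.

```python
N_CHANNELS = 64        # channels per chip
BITS_PER_CHANNEL = 2   # encoded output of the 3 discriminators
BCID_BITS = 24         # bunch crossing identifier
MEMORY_DEPTH = 128     # number of events that can be stored


def frame_bits():
    """Bits per stored event, assuming encoded outputs plus BCID only."""
    return N_CHANNELS * BITS_PER_CHANNEL + BCID_BITS


def memory_bits():
    """Total storage needed for a full memory under the same assumption."""
    return MEMORY_DEPTH * frame_bits()


print(frame_bits(), "bits per event,", memory_bits(), "bits in total")
```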



Figure 4: Simplified schematics

872 Slow Control registers with default configuration are integrated to set the required configuration.

### III. MEASUREMENTS

Measurements have been performed using a test board without any decoupling capacitors on bias and reference voltages as they slow down the power pulsing.

#### *A. Trigger path*

There are 3 variable CRRC fast shapers. The feedback network of each preamp can be changed independently through the Slow Control (SC) parameters. The peaking time is  $\approx$  20-25 ns.

The gain of FSB1 and FSB2 (Figure 5) can be varied by means of a 4-bit current mirror gain. FSB0 is dedicated to input charges ranging from 10 fC up to a few hundred fC, FSB1 to input charges from 100 fC up to 1 pC, and FSB2 to input charges from 1 pC up to 30 pC.

The gain of FSB0 is 2 mV/fC with the typical Slow Control parameters selected as Rf=100 kΩ, Cf=100 fF and preamp gain=128. This can be varied by a factor of 10 through the Slow Control parameters.



Figure 5: FSB0, 1 and 2 waveforms

The threshold of each discriminator is set by a 10 bit-DAC. The residuals to a linear fit are within  $\pm 5$  mV over a 2.2V dynamic range (Figure 6). The slope is 2.1mV/DAC unit which corresponds typically to 1fC/DAC unit for FSB0.
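The DAC slope and the FSB0 gain quoted above are enough to relate a threshold code to an equivalent input charge. The sketch below uses 2.1 mV per DAC unit and 2 mV/fC from the text, and ignores the pedestal offset, which is an assumption made to keep the example short.

```python
DAC_SLOPE_MV = 2.1           # mV per DAC unit, from the linearity measurement
FSB0_GAIN_MV_PER_FC = 2.0    # typical FSB0 gain quoted in the text


def dac_to_mv(code, slope_mv=DAC_SLOPE_MV):
    """Threshold voltage above the DAC zero level, in mV."""
    return code * slope_mv


def dac_to_fc(code, slope_mv=DAC_SLOPE_MV, gain_mv_per_fc=FSB0_GAIN_MV_PER_FC):
    """Equivalent FSB0 input charge in fC, neglecting the pedestal."""
    return dac_to_mv(code, slope_mv) / gain_mv_per_fc


full_scale_mv = dac_to_mv(2 ** 10 - 1)   # 10-bit DAC, top code 1023
print(f"full scale: {full_scale_mv:.1f} mV, 1 DAC unit = {dac_to_fc(1):.2f} fC")
```

The full scale obtained this way is consistent with the 2.2 V dynamic range mentioned in the text.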



Figure 6: 10 bit-DAC linearity

#### *B. Trigger efficiency measurements:*

Figure 7 displays trigger efficiency measurements performed on FSB0 when no signal is injected (pedestal measurements) and when 100 fC is injected. The 10% dispersion between the 64 channels is explained by the mismatch of the current mirrors of the variable gain preamp.



Figure 7: Scurves measurements on FSB0

This non-uniformity can be corrected by adjusting the preamp gain for each channel. This gain adjustment is performed over 8 bits, allowing a 1.5% adjustment. Figure 8 shows trigger efficiency measurements after gain correction for a small injected charge. The dispersion is within 1.5% after correction. This plot also shows that each channel of HARDROC can easily trigger on 10 fC.



Figure 8: Uniformity after gain correction

Figure 9, which displays the threshold as a function of the injected input charge, shows that each channel can also auto-trigger down to 4 fC, which corresponds to the 5σ noise limit.



Figure 9: 5σ noise limit

# *C. Analog and digital crosstalk*

The analog crosstalk has been measured (Figure 10). This 1% crosstalk is well differentiated and it has been checked that there is no long distance crosstalk.



Figure 10: Crosstalk measurement

The discriminator couples to the inputs through the ground or substrate (Figure 11). This coupling corresponds to 8 mV or 3 fC, which is smaller than the noise (5 fC).



Figure 11: Discriminator coupling

#### *D. Power consumption*

The maximum available power is 10 µW/channel, which corresponds to 180 µA for the entire chip. The ASIC is power pulsed to meet this requirement: all the bias and reference voltages are switched OFF between the bunch trains of the ILC beam. The static power consumption of the chip is 100 mW, i.e. 1.5 mW/channel, and so 7.5 µW/channel with a 0.5% beam duty cycle.
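The power-pulsing arithmetic can be reproduced with a few lines. The sketch below takes the 100 mW static consumption, the 64 channels and the 0.5% duty cycle from the text; without the intermediate rounding to 1.5 mW/channel, the average comes out slightly higher than the quoted 7.5 µW but still below the 10 µW budget. The function name is illustrative.

```python
def pulsed_power_uw_per_channel(static_power_mw, n_channels, duty_cycle):
    """Average power per channel in microwatts when the chip is ON only
    for the given fraction of the time (power-off consumption neglected)."""
    per_channel_uw = static_power_mw * 1e3 / n_channels
    return per_channel_uw * duty_cycle


average_uw = pulsed_power_uw_per_channel(100.0, 64, 0.005)
budget_uw = 10.0
print(f"{average_uw:.2f} uW/channel for a budget of {budget_uw:.0f} uW/channel")
```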

There are 3 independent signals of power-on: Analog, Digital and DAC. Each stage can be forced by slow control, overruling the power on pulse.

The "awake" time has been measured on the analog part and on the DAC part. It takes 2µs for the analog part to be operational and provide a discriminator output and 25µs for DAC part (Figure 12) to reach its nominal value within a few mV. The DAC is slower to settle as it is filtered internally to minimize its noise and inter channel coupling.



Figure 12: Crosstalk measurement

# IV. TEST BEAM MEASUREMENTS

# *A. Small prototypes*

8×32 cm<sup>2</sup> PCBs hosting four HARDROC chips (Figure 13) have been designed by IPNL Lyon and LAPP Annecy to study the signal connection between the different chips before extracting the data through a USB device. The boards have been associated with both RPC and µMEGAS detectors in order to validate the whole concept (semi-digital readout, daisy chain, stability, efficiency) through exposure first to cosmics and then to test beam at CERN.



Figure 13: 5 RPC planes of  $32\times8 \text{ cm}^2$  at CERN

The RPC detector shown in Figure 13 has been used in test beam at CERN in 2008 and 2009. It was the first time that the readout could be tested in real conditions and good data have been obtained allowing detector characterisation.

#### *B. Towards technological prototypes*

The good results obtained with the 8×32 cm<sup>2</sup> detector and HARDROC motivated the move to a scalable square-metre prototype, in order to address large-dimension issues for the detector as well as for the readout electronics. Such a prototype, made of 6 boards of 32×48 cm<sup>2</sup> read out by 96 HARDROC chips (Figure 14), has been designed by the IPNL electronics group and associated with a GRPC detector to be tested in test beam at CERN during the summer of 2009 (Figure 15). The full scale readout has been exercised and up to 93% efficiency has been obtained with this 1 m<sup>2</sup> PCB associated with the GRPC detector.



Figure 14: 1m<sup>2</sup> GRPC prototype



Figure 17: Labview program (©IPNL)



Figure 15: Beam profile in a  $1m^2$  GRPC chamber (©IPNL)

In parallel, two 32×48 pad PCBs equipped with HARDROC and associated with Micromegas have been tested under  ${}^{55}$Fe irradiation (Figure 16). A 1 m<sup>2</sup> Micromegas detector read out by 144 HARDROC chips will be tested in test beam this autumn.



Figure 16:  $32x48$  cm<sup>2</sup> micromegas chamber under  ${}^{55}Fe$ irradiation (©LAPP)

# V. SMALL PRODUCTION TEST

To equip large detectors, 300 hardroc chips have been tested using a testboard and a dedicated Labview program (written by IPNL Lyon).

DC levels, power consumption, DACs linearity, memory test and trigger efficiency measurements have been performed.

# VI. CONCLUSION

HARDROC exhibits good performance and is ready for production. The semi-digital readout and the daisy chain have been tested on large prototypes in test beam. The power pulsing, tested on the test bench, has to be validated in test beam. The production of 5000 chips to equip a 1 m<sup>3</sup> prototype is foreseen in 2010.

# VII. REFERENCES

- [1] web site: http://omega.in2p3.fr/
- [2] CALICE: https://twiki.cern.ch/twiki/bin/view/CALICE/WebHome
- [3] EUDET: http://eudet.org/
- [4] Presentation of the "ROC" chips readout, poster by F. Dulucq.
- [5] GRPC activities led by I. Laktineh *et al*, IPN Lyon
- [6] Micromegas activities led by C. Adloff *et al*, LAPP Annecy

# Design of High Dynamic Range Digital to Analog Converters for the Calibration of the CALICE Si-W Ecal readout electronics

L. Gallin-Martel, D. Dzahini, J-Y Hostachy, F. Rarbi, O. Rossetto

Laboratoire de Physique Subatomique et de Cosmologie, 53 rue des Martyrs, 38026 Grenoble CEDEX, France.

laurent.gallin-martel@lpsc.in2p3.fr

### *Abstract*

The ILC ECAL front-end chip will integrate many functions of the readout electronics including a DAC dedicated to calibration. We present two versions of DAC with respectively 12 and 14 bits, designed in a CMOS 0.35 $\mu$ m process. Both are based on segmented arrays of switched capacitors controlled by a Dynamic Element Matching (DEM) algorithm. A full differential architecture is used, and the amplifiers can be turned into a standby mode reducing the power dissipation. The 12 bit DAC features an INL lower than 0.3 LSB at 5MHz, and dissipates less than 7mW. The 14 bit DAC is an improved version of the 12 bit design.

# I. INTRODUCTION

The increasing number of electronics channels involved in present and future high energy physics detectors leads designers to integrate many different functions of the readout electronics in the same chip: preamplifier, shaper, ADC. In the International Linear Collider (ILC) project, the design of the front-end electronics for the electromagnetic calorimeter (ECAL) is even more challenging. Due to mechanical constraints, no package will be used for the front-end chip and the dies have to be embedded within the printed circuit board. Consequently the electronics has to be fully integrated and no discrete components can be used. This multi-channel chip also requires a high dynamic range Digital to Analog Converter (DAC) dedicated to its calibration [1]. Since the calibration process can generally be carried out at intermediate frequency (a few ksps to a few Msps), the key issues for such a DAC are the integral non-linearity (INL) and the power consumption. Switched Capacitor DACs (SCDAC) are well suited to meet these requirements. The linearity of a design implemented in a CMOS process is limited by the matching errors of the analogue components. For more than 12 bits the required matching is difficult to obtain and linearization techniques have to be used. High resolution multi-bit delta-sigma converters commonly use the Dynamic Element Matching (DEM) method to cancel the matching errors. The DEM allows such DACs to generate pure sinusoidal waveforms by turning the harmonic distortion into noise; this noise is then reduced by the converter's low pass filter [2]. When used in a calibration process the DAC has to provide a sequence of DC values, each value corresponding to a calibration point. In this case the DEM can be effective if several samples are accumulated for each calibration point. The response of the chip under calibration will be given by the mean value of the resulting distribution.

A 12 bit and a 14 bit SCDAC have been designed using a CMOS 0.35 µm process. This paper presents, for each chip, the different steps of the design including the choice of the topology, the DAC modelling and simulation, the layout implementation and finally the test of the chip.

## II. DESIGN OF A 5MSPS 12 BIT DAC

This first prototype was designed to provide a MEMS sensor with pure sinusoidal waveforms. Since the sampling rate is higher than the Nyquist rate, the DEM should improve the INL and the THD of the DAC.

# *A. DAC topology*

#### *1) Capacitor network:*

A linearization based on the DEM requires the DAC to be built from equally weighted unitary converters (thermometer DAC). A 12 bit design would require 4095 converters, which is difficult to implement. An alternative is to use a segmented array of capacitors, an example of which is shown in Figure 1. This 12 bit network is divided into two 6 bit arrays (MSB and LSB) connected together through a segmentation capacitor  $C_s$. Each sub-array comprises 63 capacitors on which the DEM can be applied. The network is terminated by a unitary capacitor and the overall capacitance is equal to 64C. Compared to a "full thermometer" topology, this scheme is much easier to implement, but it should be noted that the DEM has no effect on the  $C_s$  matching error.



Figure 1: 12 bit segmented array of switched capacitors

#### *2) Operational transconductance amplifier:*

The other critical sub-circuit of the DAC is the Operational Transconductance Amplifier (OTA) used to process the signal provided by the capacitor network. The present work inherits the OTA designed for a 12 bit high speed pipeline ADC [3]. This low power differential OTA, based on a folded cascode architecture, includes four auxiliary amplifiers to increase the open loop gain. The resulting 90 dB gain ensures the linearity required by a design of up to 16 bits. Moreover, it can be powered ON/OFF by a dedicated circuit, reducing its power dissipation by a factor better than 1000. This capability is particularly interesting since the calibration process is expected to represent a small fraction of the chip operating time.

#### *3) Direct Charge Transfer:*

The OTA is connected to the capacitor array using the Direct Charge Transfer (DCT) mode. The DCT principle is illustrated in Figure 2 for a single-ended amplifier.



Figure 2: Direct Charge Transfer

The signals  $\phi$1 and  $\phi$2 are two non-overlapping clocks derived from the chip main clock. The signal labelled  $Th_i$  is provided by the thermometer encoder and controls the capacitor  $C_i$. During  $\phi$1 the capacitor  $C_i$  is connected to Vref+ or Vref- depending on the state of the  $Th_i$  signal. Then, during  $\phi$2, all the capacitors of the array are connected in parallel with the feedback capacitor  $C_f$  to perform charge sharing. The  $C_f$  capacitor can be discharged or not during  $\phi$1, depending on the state of the signal labelled ena\_RAZ. The charge sharing architecture presents two important advantages. First, the OTA does not have to charge the feedback capacitor  $C_f$, so its power consumption can be kept low even for large values of  $C_f$. Second, if the signal ena\_RAZ is not activated, the charge sharing also acts as a first order low pass filter that reduces the noise induced by the DEM. The filter transfer function and its cut-off frequency are given by the following expressions, where  $f_e$  is the frequency of the DAC main clock and  $T_e = 1/f_e$  its period.

$$
H(z) = \frac{1}{1 + b - bz^{-1}} \tag{1}
$$

$$
b = \frac{C_f}{\sum C_i} = 1 \implies H(z) = \frac{1}{2 - z^{-1}}
$$

$$
f_{-3dB} = \frac{\ln 2}{2\pi T_e} \approx 0.11 f_e \tag{2}
$$
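The behaviour of this filter can be explored numerically. The sketch below implements the difference equation that follows from expression (1), $(1+b)\,y[n] - b\,y[n-1] = x[n]$, and searches for the frequency at which the magnitude of $H$ drops by 3 dB. The function names are illustrative and the frequency is expressed as a fraction of $f_e$. For $b = 1$ the exact discrete-time cut-off found this way is about 0.115 $f_e$, close to the approximate value given by expression (2).

```python
import cmath
import math


def dct_filter(samples, b=1.0, y0=0.0):
    """Apply y[n] = (x[n] + b*y[n-1]) / (1 + b), i.e. H(z) = 1/(1 + b - b z^-1)."""
    out = []
    y = y0
    for x in samples:
        y = (x + b * y) / (1.0 + b)
        out.append(y)
    return out


def magnitude(f_norm, b=1.0):
    """|H| at the normalised frequency f_norm = f / f_e."""
    z_inv = cmath.exp(-2j * math.pi * f_norm)
    return abs(1.0 / (1.0 + b - b * z_inv))


def cutoff(b=1.0):
    """Normalised -3 dB frequency, found by bisection on [0, 0.5]."""
    lo, hi = 0.0, 0.5
    target = 1.0 / math.sqrt(2.0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if magnitude(mid, b) > target:
            lo = mid   # still above -3 dB: the cut-off is at a higher frequency
        else:
            hi = mid
    return 0.5 * (lo + hi)


step_response = dct_filter([1.0] * 10)
print(f"cut-off = {cutoff():.4f} f_e, step response after 10 samples = {step_response[-1]:.4f}")
```

With $b = 1$ the step response halves its distance to the final value at every clock cycle, which is what the pole at $z = 1/2$ implies.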

#### *4) Differential implementation:*

The differential implementation of the DAC is shown in Figure 3. During  $\phi$1 the network is not connected to the amplifier and stores a charge proportional to the DAC input code. The charge sharing is performed during  $\phi$2. Another interesting property of this topology is its small sensitivity to the parasitic capacitors. The net labelled Sum\_MSB would be very sensitive to capacitive substrate coupling, but the OTA open loop gain maintains a DC voltage equal to VMC (OTA Common Mode) on this net. The net labelled Sum\_LSB is also sensitive, but since it is located on the LSB side of the network, this sensitivity is small. However, the network capacitors have to be connected in the right way in order to minimize the substrate coupling on this net. The other parasitic elements are in parallel with each capacitor in the two arrays. They are due to the capacitive couplings between the metallic interconnections and the capacitors themselves, and induce matching errors. The effect of these errors will be turned into noise by the DEM algorithm. Nevertheless, care is needed during the layout design to minimize these mismatches in order to reduce the overall noise of the chip. Finally, the most sensitive component is the  $C_s$  capacitor, which is not included in the DEM. This capacitor must match the MSB array mean value.



Figure 3: Differential implementation of the 12 bit DAC

#### *5) DEM algorithms:*

These algorithms aim to use each unitary converter at the same rate in order to average the matching errors. They can be either stochastic or deterministic. Data Weighted Averaging (DWA) is a deterministic algorithm entirely controlled by the data sequence [4] [5]. It rotates the elements at the maximum possible rate, so the average of the errors converges to zero quickly. Figure 4 shows how this algorithm rotates the elements of a 3 bit DAC. For each sample, the selected sub-array starts at the first previously unused position. This algorithm exhibits two other advantages. Whereas stochastic algorithms induce white noise, DWA shifts the noise to higher frequencies. Moreover, it can be directly described in VHDL.
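The rotation performed by DWA can be illustrated with a short sketch. It follows the description above (each selection starts at the first previously unused element and wraps around), applied to a 3 bit thermometer DAC with 7 unit elements. The function names and the mismatch values are made up for the example and do not come from the chip.

```python
def dwa_sequence(codes, n_elements):
    """Return, for each input code, the unit elements selected by DWA."""
    pointer = 0   # first element not used by the previous sample
    selections = []
    for code in codes:
        if not 0 <= code <= n_elements:
            raise ValueError("code must lie between 0 and n_elements")
        selections.append([(pointer + k) % n_elements for k in range(code)])
        pointer = (pointer + code) % n_elements
    return selections


def usage_counts(selections, n_elements):
    """How many times each unit element has been selected."""
    counts = [0] * n_elements
    for chosen in selections:
        for index in chosen:
            counts[index] += 1
    return counts


def mean_output(selections, weights):
    """Average over the samples of the summed weights of the selected elements."""
    total = sum(sum(weights[i] for i in chosen) for chosen in selections)
    return total / len(selections)


# 3 bit DAC: 7 unit elements with hypothetical mismatch (ideal weight = 1.0).
weights = [1.02, 0.97, 1.01, 0.99, 1.03, 0.98, 1.00]
rotation = dwa_sequence([3, 4, 2], 7)
dc_run = dwa_sequence([3] * 7, 7)
print(rotation)
print(usage_counts(dc_run, 7), mean_output(dc_run, weights))
```

For a DC code repeated over a full rotation, every element is used the same number of times, so the mean output depends only on the average element weight. This is the property exploited when several samples are accumulated for each calibration point.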



#### *B. DAC modelling and simulation*

The DEM efficiency can be demonstrated using either a statistical approach or a spectral analysis. In both cases a large number of samples has to be accumulated. Based on a model of the DAC, the DEM efficiency can be evaluated using high level simulation. Since the test bench of the chip is based on Labview, this software was also chosen for the simulation. In this way, data acquisition, simulation and data analysis can be carried out in the same environment. The differential implementation of the capacitor network is shown in Figure 5. The output voltage Vout as a function of the input code is given by expression (3) in the case of ideal capacitors (l and m are the LSB and MSB input codes).



Figure 5: Differential implementation of the 12 bit network

$$
Vout = \frac{1}{64} \left( \frac{2l - 63}{64} + 2m - 63 \right) Vref \qquad (3)
$$
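As a quick numerical check of expression (3), the short Python sketch below evaluates the ideal output for a few codes using exact fractions. The function name `vout_ideal` and the normalisation Vref = 1 are assumptions made for illustration.

```python
from fractions import Fraction


def vout_ideal(l, m, vref=1):
    """Expression (3): ideal output for LSB code l and MSB code m (each 0..63)."""
    if not (0 <= l <= 63 and 0 <= m <= 63):
        raise ValueError("l and m must lie between 0 and 63")
    return Fraction(1, 64) * (Fraction(2 * l - 63, 64) + 2 * m - 63) * vref


if __name__ == "__main__":
    print(vout_ideal(0, 0))    # most negative output, in units of Vref
    print(vout_ideal(63, 63))  # most positive output, in units of Vref
    # One LSB step, and the step across an MSB carry (l: 63 -> 0, m: 0 -> 1)
    print(vout_ideal(1, 0) - vout_ideal(0, 0))
    print(vout_ideal(0, 1) - vout_ideal(63, 0))
```

With these assumptions the output spans -4095/4096 to +4095/4096 of Vref, and every code step, including the carry from the LSB array to the MSB array, equals Vref/2048, as expected for a 12 bit converter over a 2 Vref range.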

For the MSB side of the network, the matching errors can be introduced assuming that a capacitor mC (respectively $m^*C$) is connected to Vref (respectively -Vref). In this case expression (3) becomes expression (4).

$$
\begin{aligned}
Vout &= \frac{B}{A} Vref = f(m, l, a, b) \\
A &= \frac{m + m^* + b}{b} - \frac{b}{l + l^* + a + b} \\
B &= \frac{2l - 63}{l + l^* + a + b} + \frac{2m - 63}{b} \\
m &= \sum_{i=1}^{m} C_i \\
m^* &= \sum_{i=m+1}^{63} C_i
\end{aligned}
\qquad (4)
$$

The DEM algorithm has to reorder the arrays before m and m\* are calculated. The matching errors of the segmentation and termination capacitors are introduced by the parameters a and b. The block diagram of the Labview program is shown in Figure 6. The pattern generator provides either a sinusoidal waveform or a sequence of DC values. The block labelled DEM & $\Sigma$ processes the MSB and LSB arrays using a DEM algorithm. The converter's low pass filter (LPF block) can be activated by the ena\_RAZ signal. The testing board is equipped with a 16 bit ADC that digitizes the DAC output voltage. The block Q16 performs a 16 bit quantization of the simulated values. Finally, simulated and measured data are processed in the same way by the Data Analysis block. It calculates the DAC non-linearity (INL, THD) and the DAC noise (SNR, RMS noise). This block also extracts the matching errors of the capacitors for the two input arrays.
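The Q16 and Data Analysis steps can be pictured with a short Python sketch. It is not the Labview code: the function names are hypothetical, and the endpoint-fit definition of the INL is an assumption, since the text does not state which definition the Data Analysis block uses.

```python
def quantize(values, n_bits=16, full_scale=1.0):
    """Round each value to the nearest level of an n_bits converter
    spanning -full_scale to +full_scale (stand-in for the Q16 block)."""
    step = 2.0 * full_scale / (1 << n_bits)
    return [round(v / step) * step for v in values]


def inl_in_lsb(measured):
    """Endpoint-fit INL of a code sweep, in units of the average step.

    measured[k] is the output recorded for the k-th code of the sweep;
    the straight line is drawn through the first and last points.
    """
    n = len(measured)
    if n < 2:
        raise ValueError("need at least two points")
    first, last = measured[0], measured[-1]
    lsb = (last - first) / (n - 1)
    return [(v - (first + k * lsb)) / lsb for k, v in enumerate(measured)]


if __name__ == "__main__":
    sweep = [0.0, 1.0, 2.5, 3.0, 4.0]  # toy transfer curve with one bump
    print(inl_in_lsb(sweep))
    print(quantize([0.3], n_bits=2))
```

Applying the same two functions to simulated and measured sweeps mirrors the idea of processing both data sets in the same way.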



Figure 6: High level simulation block diagram

Electrical simulations have been carried out to study the effect of the parasitic capacitors, to size the switches and to check the overall design.

#### *C. Layout considerations*

The layout of the MSB network is shown in Figure 7. It includes 63 unitary capacitors (0 to 62) and the segmentation capacitor $C_s$ arranged in a 4x16 array. It also includes 63 switches, their control logic and 2 non-overlapping clock generators. This layout is not a common centroid design and each unitary element consists of a single 500fF capacitor. The matching errors due to the metallic interconnections (from the Analog Extracted View) are lower than 0.1%. The DAC layout is shown in Figure 8. The size of the active part (without pads) is 1.6mm<sup>2</sup>. This figure shows why low sensitivity to substrate coupling is a critical issue of the design: the connection between the MSBn array and the OTA is about 1 mm long, inducing a large parasitic capacitor.



Figure 7: Layout of the MSB array



Figure 8: Layout of the active part of the DAC

#### *D. Simulation and test results*

The matching errors in the MSB array were recorded for the 15 tested chips. These errors range from 0.8% for the best DAC to 1.8% for the worst one. The distribution of the errors over the array (normalized to C0) is shown in Figure 9. All the DACs exhibit the same distribution shape. The larger values are located at the centre of the array. The gradient is not constant over the array, so a common centroid design would not have improved the matching. The reliability of the high level simulations can be checked when the matching errors are injected in the simulator. The simulated and measured INL and RMS noise are shown in Figure 10 when the DEM is not activated. The INL and the RMS noise, when the DEM is activated, are shown in Figure 11. The DEM improves the INL by a factor of 8 (3 bits). The remaining INL (0.3 LSB) is mainly due to the $C_s$ capacitor, which does not perfectly match the MSB array mean value. The 70µV RMS noise measured without DEM is dominated by the testing board noise. Without an external low pass filter, the DEM induces a 500µV RMS noise (1 LSB).



Figure 9: Matching errors in the MSB array



Figure 10: **INL and RMS noise without DEM** 



Figure 11: **INL and RMS noise with DEM** 

#### *E. Conclusion*

Although the matching errors are larger than expected with a 0.35µm process, the 12 bit DAC meets the requirements of the MEMS sensor application from the INL point of view. Since the output voltage will be processed by an external low pass filter, the noise was specified in different frequency bands. Spectral analyses show that the DAC also satisfies these in-band noise constraints. The DEM efficiency was demonstrated with this first prototype and higher resolutions can be foreseen (14 bits). The high level simulation was also validated and appears to be a fast and reliable tool that dramatically reduces the amount of time required for the design.

#### III. DESIGN OF 5MSPS 14 BIT DAC

This section presents the design and the test results of a 14 bit SCDAC dedicated to the CALICE ECAL FEE calibration. This chip was designed using the OTA and the DEM algorithm implemented in the 12 bit DAC. The DAC modelling and simulation were carried out in the same way.

#### *A. Block diagram*

A 14 bit design segmented into two sub-arrays would require at least 127 capacitors in each array. The chip area, the overall capacitance and probably the matching errors would be dramatically increased. Consequently this 14 bit DAC relies on a 3 segment network. In such a case the intermediate segment (ISB) is very sensitive to substrate coupling. A solution to overcome this difficulty is to use a second OTA. The block diagram of the 14 bit DAC is shown in Figure 12. The MSB, ISB and LSB arrays contain respectively 31 (5 bits), 31 (5 bits) and 15 (4 bits) capacitors. The simulation shows that the matching for Cf1, Cs2 and the MSB array mean value must be better than 0.3%. From the layout point of view, these 3 components are located in the same 64 capacitor array. The ISB-LSB network is located in another 48 capacitor array. Taking into account the poor matching obtained for the 12 bit DAC, a trimming capability was implemented for the Cf1 capacitor. A 0.1C trimming step allows the 0.3% matching constraint to be reached.
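The capacitor counts quoted above follow from the rule that a segment of n bits needs 2^n - 1 unit capacitors. A minimal Python sketch of this arithmetic (the function name `unit_capacitors` is an assumption made for illustration):

```python
def unit_capacitors(segment_bits):
    """Unit capacitors needed per segment: 2**n - 1 for a segment of n bits."""
    return [2 ** n - 1 for n in segment_bits]


if __name__ == "__main__":
    two_segments = unit_capacitors([7, 7])       # 14 bits split in two
    three_segments = unit_capacitors([5, 5, 4])  # MSB, ISB, LSB arrays
    print(two_segments, sum(two_segments))
    print(three_segments, sum(three_segments))
```

The 7+7 split needs 127 capacitors per array (254 in total), whereas the 5+5+4 split needs 31, 31 and 15 (77 in total), not counting segmentation, feedback and dummy capacitors.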



#### *B. Layout*

The layout of a capacitor array is similar to the layout used in the 12 bit DAC. The size of the dummy capacitors that surround each array has been increased. The layout of the active part of the DAC is shown in Figure 13 (area = 1.45mm<sup>2</sup>). Its topology is also similar to the topology of the 12 bit DAC layout.



#### *C. Simulation and test results*

The matching errors in the MSB array for the 9 tested chips range from 0.25% to 0.4%. No particular shape appears in the distribution over the 31 capacitor sub-array. The mismatch is entirely due to the difference between inner and outer rows in the 4x8 sub-array. The matching errors due to metallic interconnections are lower than 0.1% (from the Analog Extracted View). The INL and RMS noise without DEM (respectively with DEM) are shown in Figure 14 (Figure 15). Since the capacitor matching is better for this DAC, the DEM improves the INL by a factor of 2 (1 bit) and has a small effect on the noise. This noise is dominated by the testing board contribution (about 50µV). The remaining INL (0.5 LSB) is mainly due to the 0.1C trimming step. The optimal trimming value is the same for the 9 chips. The same value is also found with the simulation. The trimming values that surround the optimal one also lead to a 14 bit resolution:

$C_{opt}$: THD = -95dB

$C_{opt} \pm 1$: THD = -89dB







Figure 15: **INL and RMS noise with DEM** 

#### *D. Conclusion and perspectives*

The matching errors are much smaller in the 14 bit DAC compared to the 12 bit DAC, whereas the capacitor arrays are similar (larger dummies in the 14 bit DAC). The spread of the oxide thickness for the 12 bit DAC run is twice the spread for the 14 bit DAC run and is the worst among the chips submitted by LPSC in 2008/2009 with the same process.

For this prototyping run, the DAC satisfies the constraints of a 14 bit design even without the trimming capability (the optimal trimming value can be predicted in simulation). But these results may vary from one run to another depending on the process reliability. The DAC is intended to be included in the ECAL Front End Chip (SKIROC chip) and consequently its trimming process may impose some constraints on the other parts of the FEE electronics.

A self trimmed version of the 14 bit DAC will be submitted in 2010.

#### IV. REFERENCES

[1] J. Fleury et al., "SKIROC: A front-end chip to read out the imaging silicon-tungsten calorimeter for ILC", IEEE Nucl. Sci. Symp. Conf. Rec. (2007) 1847.

[2] I. Fujimori et al., "Multi-bit Delta–Sigma Audio DAC with 120-dB Dynamic Range", IEEE Journal of Solid-State Circuits, vol. 35, no. 8, pp. 1066-1073 (2000).

[3] F. Rarbi et al., "A low power 12-bit and 25-MS/s pipelined ADC for the ILC / Ecal integrated readout", IEEE Nucl. Sci. Symp. Conf. Rec. (2008) 1506.

[4] E. Najafi Aghdam, PhD Thesis, Université Paris XI (Orsay, France), June 2006.

[5] R. T. Baird et al., "Linearity enhancement of multi-bit $\Delta\Sigma$ A/D and D/A converters using data weighted averaging", IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing, vol. 42, no. 12, pp. 753-762 (1995).

# LAPAS: A SiGe Front End Prototype for the Upgraded ATLAS LAr Calorimeter

Mitch Newcomer<sup>a</sup>

For the ATLAS Liquid Argon Calorimeter Group

<sup>a</sup> University of Pennsylvania, Department of Physics and Astronomy, Philadelphia, PA

#### *Abstract*

We have designed and fabricated a very low noise preamplifier and shaper to replace the existing ATLAS Liquid Argon readout for use at the Large Hadron Collider upgrade (sLHC). IBM's 8WL 130nm SiGe process was chosen for its radiation tolerance, low noise bipolar NPN devices, wide voltage range and potential use in other sLHC detector subsystems. Although the requirements for the final design cannot be set at this time, the prototype was designed to accommodate a 16 bit dynamic range. This was accomplished by using a single stage, low noise, wide dynamic range preamp followed by a dual range shaper. The low noise of the preamp is made possible by the low base spreading resistance of the Silicon Germanium NPN bipolar transistors. The relatively high voltage rating of the NPN transistors is exploited to allow a gain of 650V/A in the preamplifier, which eases the input voltage noise requirement on the shaper. Each shaper stage is designed as a cascaded differential operational amplifier doublet with a common mode operating point regulated by an internal feedback loop. Measurement of the fabricated circuits indicates their performance is consistent with the design specifications, including the radiation tolerance targets.

#### I. INTRODUCTION

Although some components of the present Liquid Argon (LAr) electronics design may be adequate for use at sLHC, the lack of spares and the discontinuation of the processes in which the custom ASICs were designed mean that the complete ATLAS LAr electronics chain will need to be redesigned for operation at sLHC.

The ATLAS LAr Calorimeter is constructed of a series of cathode and anode plates submerged in liquid argon. Charged particles traversing it ionize the argon, and the liberated electrons create a current pulse on the positively charged anode that lasts for the 400ns electron drift time. The signal is conveyed to the front end electronics, located outside of the detector, via a 5 meter, $25\Omega$ cable. This part of the detector is expected to remain in the upgraded system [1]. The complete set of design goals for the upgraded detector awaits input from the operation of the current LAr system at high luminosity. For this work we assumed that the performance goals of the current LAr front end electronics would be sufficient, with the added requirement that the front end electronics be able to withstand an exposure to 300kRad of ionizing radiation and $10^{13}$ n/cm<sup>2</sup> [2]. Table 1 summarizes the basic design goals.

| Parameter                | Design goal                 |
|--------------------------|-----------------------------|
| Dynamic Range            | 16 bits in 2 ranges         |
| INL                      | 0.1% within each range      |
| ENI                      | 75nA                        |
| Max Signal Current       | 5mA                         |
| Shaping Time Const. (RC) | 15ns                        |
| Shaping Function         | $(RC)^2$-CR                 |
| Ionizing Radiation Tol.  | 30kRad                      |
| Neutron Equivalent Dose  | $10^{13}$ n/cm<sup>2</sup>  |

**Table 1 Design Goals for the upgraded LAr Front end electronics.** 

#### II. LAPAS CIRCUIT BLOCKS

Figure 1 shows a block view of the LAr front end. The detector is modelled as a 1nF capacitance followed by a $25\Omega$ transmission line, preamplifier and shaper. This work concerns the design of a prototype preamplifier and $(RC)^2$-CR shaper circuit on a single ASIC substrate. It is important to note that the shaping elements are constructed using the ASIC process passive components.



**Figure 1 Liquid Argon Front End Electronics blocks, with the detector modelled as a current source in parallel with a 1nF capacitance followed by a transmission line and decoupling capacitor. The LAPAS ASIC contains the preamp and shaping sections shown to the right.** 

#### *A. Technology*

The wide dynamic range and associated low noise requirement for the preamplifier led to the selection of a bipolar technology. IBM's 8WL process, which features Silicon Germanium (SiGe) bipolar transistors along with a wide selection of 130nm CMOS transistors, was selected for this first prototype based on its radiation hardness [3], low value of intrinsic base resistance and availability of passive components with tightly controlled parametric spread. Figure 2 shows a Monte Carlo prediction of the output amplitude spread for the $(RC)^2$-CR shaper transfer function due to part to part passive component variation, based on the history of 50 runs of the 8WL process.



**Figure 2 Monte Carlo simulation of the shaper output amplitude variation due to passive component variation based on the history of IBM's 8WL process runs.** 

#### *B. Preamplifier Design*



**Figure 3 Super common base Preamp similar to that used in the ATLAS LAr Calorimeter. Note that the C1 and C2 are external components.** 

The schematic of the preamp is shown in Figure 3. It is based on the "super common base" architecture used on the presently installed LAr front-end boards and described in previous publications [4],[5]. Thanks to the low spreading base resistance of the SiGe technology it employs an input transistor of manageable size (emitter length  $4 \times 20 \mu m$, 2 emitter stripe geometry) biased at 8mA collector current. Simulations predict that the preamplifier achieves good integral non-linearity (INL < 1%) and an overall equivalent series noise of  $\sim 0.26$nV/ $\sqrt{Hz}$, while dissipating 42mW.

#### *C. Shaper Design*

This design, in particular the shaping function, benefits from earlier work done by the LAr group to optimize the tradeoffs between the relatively long LAr drift time and the high LHC interaction rates. In this design the shaper is AC coupled by an external capacitor to the preamp or other source. To help eliminate common mode pickup on and off the ASIC, a two stage cascaded differential operational amplifier design has been employed (see Figure 4). All passive components except the four 100 $\Omega$ load resistors at ADC\_A,B are fabricated on the ASIC. As shown in Figure 4, the first stage provides some amplification and one of the two R-C integrations, implemented in its feedback loop. This stage is AC coupled to the second using a C-R differentiation. Placement of the differentiation here decouples the two stages, allowing independent biasing of the second stage. The second RC integration is implemented in the feedback of this second stage.



**Figure 4 Shaper block schematic.** 

The amplifying element of the shaper design is a differential operational amplifier constructed using an operational transconductance amplifier (OTA) gain block followed by a unity gain voltage amplifier. A simplified schematic of the OTA is shown in Figure 5. To maintain acceptable noise performance Q1 and Q2 are operated at a relatively high current density of 800μA each. Half of this current is removed by R1 and R2 before entering the OTA's mirror transistors U19 and U8. This allows lower power operation of only about 16mW for this stage. The supply voltage for the shaper is 5V in order to satisfy the wide dynamic range requirements. Although the SiGe NPN transistors can easily operate with this voltage across the base emitter junction, it was necessary to use thick gate CMOS devices in the current mirror structures. To achieve good matching the PMOS mirrors were cascoded (U4,U5,U6,U13,U16,U9). Intentional Miller capacitance was introduced in the input stage to prevent the high bandwidth NPN transistors ($f_T \sim 60$ GHz) from introducing unwanted oscillations. In addition a fast feedback path across the outputs (OdfA and Odfb) was added.



**Figure 5 The OTA block of the differential operational amplifier.** 

A relatively low gain common mode amplifier (not shown) compares the voltage at node CCM with an internal reference to maintain a stable DC operating point. The shaper realizes a  $CR-(RC)^2$  transfer function where the product of each coupled R and C is 15ns. By selecting low valued resistors for the shaping in the high gain (10X) stage, an equivalent input voltage noise of about  $2.2nV/\sqrt{Hz}$  is achieved.
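The frequency response of such a shaper can be sketched numerically. The Python snippet below assumes an idealized, unity-gain $CR-(RC)^2$ transfer function $H(s) = s\tau/(1+s\tau)^3$ with a common time constant $\tau$ = 15ns; amplifier gain, loading and parasitics of the real circuit are ignored, and the function names are hypothetical.

```python
import math

TAU = 15e-9  # shaping time constant from the text, in seconds


def shaper_gain(freq_hz, tau=TAU):
    """|H(j*2*pi*f)| for the idealized H(s) = s*tau / (1 + s*tau)**3."""
    x = 2.0 * math.pi * freq_hz * tau
    return x / (1.0 + x * x) ** 1.5


def peak_frequency(tau=TAU):
    """Frequency at which |H| is maximal: omega * tau = 1 / sqrt(2)."""
    return 1.0 / (2.0 * math.pi * math.sqrt(2.0) * tau)


if __name__ == "__main__":
    f_peak = peak_frequency()
    print(f"peak at {f_peak / 1e6:.2f} MHz, gain {shaper_gain(f_peak):.3f}")
```

Under these assumptions the pass band is centred near 7.5 MHz, with a mid-band gain of 2/(3√3), about 0.385, for the passive network alone.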

#### III. LAYOUT AND FABRICATION

The prototype ASIC, fabricated through MOSIS, consists of four independently powered preamplifiers and two dual gain shaper stages on a 1.6 X 2.1mm die housed in a 9X9 mm open cavity QFN64 package. Packaged ASICs were received in March. The layout of the fabricated die is shown in Figure 6.



**Figure 6 Layout of the LAPAS ASIC with 4 preamplifiers and two combination 1X, 10X shaper stages.** 

#### IV. MEASUREMENTS

Measurements of the fabricated ASICs show that all preamp and shaper circuits are functional, with gain, shape and dynamic range close to those predicted by SPICE simulation of the extracted layout. Figure 7 shows the response of the preamplifier to an input waveform shaped to mimic the detector signal after the transmission line.



**Figure 7 Measured Preamp response for a peak input current of 250µA with an input rise time of 20ns and fall time of 450ns.** 

The two traces in Figure 8 show the response of the shaper 1X and 10X outputs to the same preamplifier signal. The 14.1mV and 153mV peaks correspond to a nearly 10X difference in response.



**Figure 8 Measured response of both the 1X and 10X shaping amplifier inputs for a 20mV peak preamplifier output signal. The unusual signal shape reflects the differentiation of the triangular shaped preamp signal.** 

Figure 9 shows the measured Integral Non Linearity (INL) of the high gain stage. The high gain (10X) shaper output noise for this stage has been measured to be 130μV. This corresponds to an input referred current of 34nA, well below the calculated 65nA equivalent input noise of the preamp.

Given the preamplifier gain of 650V/A and the full scale calorimeter input current of 5mA the preamplifier output will be slightly larger than 3V. This range is covered by the 1X stage that is linear for inputs up to 4V while the high gain (10X) stage covers the range between 0 and 300mV. Both shaper stages exhibit a highly linear response with an INL of less than 0.1%.
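The numbers in this paragraph can be cross-checked against the design goals of Table 1 with a few lines of Python. This is only an arithmetic sketch; the variable names are chosen for illustration.

```python
import math

PREAMP_GAIN_V_PER_A = 650.0   # preamplifier transimpedance gain
MAX_SIGNAL_CURRENT_A = 5e-3   # full scale calorimeter current (Table 1)
ENI_A = 75e-9                 # equivalent noise current goal (Table 1)
HIGH_GAIN_RANGE_V = 0.300     # preamp output range covered by the 10X stage

# Preamplifier output at full scale input current
full_scale_v = PREAMP_GAIN_V_PER_A * MAX_SIGNAL_CURRENT_A

# Dynamic range implied by the full scale current and the noise goal
dynamic_range_bits = math.log2(MAX_SIGNAL_CURRENT_A / ENI_A)

# Input current at which the 1X stage takes over from the 10X stage
crossover_current_a = HIGH_GAIN_RANGE_V / PREAMP_GAIN_V_PER_A

if __name__ == "__main__":
    print(f"full scale preamp output: {full_scale_v:.2f} V")
    print(f"dynamic range: {dynamic_range_bits:.2f} bits")
    print(f"10X stage covers up to {crossover_current_a * 1e6:.0f} uA")
```

The full scale output comes out at 3.25 V, within the 4V linear range of the 1X stage, and the ratio of 5mA to 75nA corresponds to slightly more than 16 bits.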



**Figure 9 Measured Integral Non Linearity of the high gain shaper stage. The maximum deviation over the 300mV dynamic range is 0.05%. At 450mV the deviation (not shown) is 2%.** 

We have characterized 18 prototype ASICs and find the part to part gain variation is less than 3% except for one failed ASIC. Figure 10 shows the 1X shaper amplitude distribution for 17 of 18 chips with an input of 165mV. The RMS deviation is 1.8mV reflecting a 2% variation among channels, well within the measurement error of our socketed test equipment.



**Figure 10 Distribution of amplitudes among 17 ASICs for a 165mV preamp input signal. The RMS deviation is 1.8mV.** 

#### V. PRELIMINARY RADIATION STUDIES

Three ASICs were exposed to ionization doses of 200, 500 and 1000krad in three steps. The chips were exposed at Brookhaven National Laboratory's Gamma Irradiation Facility, then measured and returned for additional exposure. Our measurements indicate no significant change in linearity and little or no change in gain (see Figure 11) to within the sensitivity of our test apparatus. A small increase in gain was observed with the 500krad data, but this change is consistent with what we might expect from a change in equipment status over the several week period between measurements, such as a change in rise time or amplitude of the pulser and charge injector inputs. Further measurements will be performed to understand and validate these results.

**Figure 11 Output amplitude measurements of Chip 8 after 0, 200, 500 and 1000krad exposure to ionizing radiation at the BNL Gamma Irradiation Facility.** 

#### VI. RESULTS AND CONCLUSIONS

We have designed, fabricated and tested a first prototype of the ATLAS LAr front end electronics for the upgraded detector. Measurements with a socketed test board have confirmed many of the design objectives. Our future plans include continued testing with a LAPAS ASIC assembled on a printed circuit board for improved testability of the preamplifier. These tests will naturally lead to inclusion of the LAPAS into a more complete readout chain. The IBM 8WL technology appears to be robust in both its performance and radiation tolerance for use in the upgraded LAr detector.

#### VII. REFERENCES

[1] F. Gianotti et al., "Physics potential and experimental challenges of the LHC luminosity upgrade", Europ. Phys. J. Vol. C, no. 39, pp. 293-333, 2005.

[2] ATLAS Radiation Task Force predicted radiation levels, ATL-GEN-2005-001

[3] J.D. Cressler, "On the potential of SiGe HBTs for extreme environment electronics," Proc. IEEE., vol. 93, pp. 1559 – 1582, 2005.

[4] R. L. Chase and S. Rescia "A linear low power remote preamplifier for the ATLAS liquid argon EM calorimeter." IEEE Trans. Nucl. Sci., 44:1028, 1997.

[5] R.L. Chase et al., "Transmission Line Connections Between Detector and Front End Electronics in Liquid Argon Calorimetry", Nucl. Instrum. Meth. A 330 (1993) 228.

# *TUESDAY 22 SEPTEMBER 2009*

# *PARALLEL SESSION B2a PRODUCTION, TESTING AND RELIABILITY*

# Replacing full custom DAQ test system by COTS DAQ components on example of ATLAS SCT readout

M. Dwuznik<sup>a</sup>, S. Gonzalez-Sevilla<sup>b</sup>

<sup>a</sup> Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Krakow, Poland <sup>b</sup> DPNC, University of Geneva, 24 Quai Ernest Ansermet, 1211 Geneva 4, Switzerland

Michal.Dwuznik@cern.ch

#### *Abstract*

A test system developed for the ABCN-25 chip for the ATLAS Inner Detector Upgrade is presented. The system is based on commercial off-the-shelf (COTS) DAQ components by National Instruments and is foreseen to aid in chip characterization and hybrid/module development, complementing full custom VME based setups.

The key differences from the point of software development are presented, together with guidelines for developing high performance LabVIEW code. Some real-world benchmarks will also be presented together with chip test results.

The presented tests show good agreement of test results between the test setups used in different sites, as well as agreement with design specifications of the chip.

#### I. INTRODUCTION

The building blocks of a modern high energy physics experiment are complex structures, often built with a minimum material budget in order to maintain the required performance, which places stringent limitations on the way the components can be assembled. In the later stages of operating the detector, radiation fields may further increase the effort needed for repairs and maintenance.

Because of these requirements, the electronic components for HEP experiments (ASICs in particular) require careful testing, and the whole system must include many self-test scenarios.

Characterisation of the prototype ASICs from the first wafers, as well as semi-mass testing of production wafers, needs robust test systems that include both fast digital communication and intricate analogue measurements. Analogue measurements are performed predominantly in the R&D phase of the component design, with the test scope covering more and more complicated digital testing scenarios in later stages of the design life cycle, including system level tests of completed detector assemblies. The key property of the test system shifts gradually from flexibility and fast startup times in the first test stages to speed and reliability for the commissioned, collaboration-approved test system in later stages.

The previous generation of chips to be used in the LHC was usually tested with state-of-the-art custom systems built specifically for this purpose, with a significant fraction of the total project work-hours being devoted to building the accompanying electronics. When test systems were being developed for the components currently installed in the LHC experiments, there were no readily available commercial DAQ components with performance sufficient for extensive testing of components at 40 MHz (and beyond, because for many components performance margins were checked with e.g. a 50 MHz main clock frequency to account for radiation effects). The VME bus was in common use for previous generation setups.

Commercially available DAQ components now make digital communication with the device under test relatively easy, at frequencies well above the aforementioned base LHC frequency of 40 MHz. The electronics designed for the upgrade of the ATLAS experiment SCT is foreseen to transmit data with up to quadruple data rate of 160 Mbps (single data rate).

This paper describes the system (ABCNIDAQ) developed for testing the first batches of ABCN (Atlas Binary Chip Next) integrated circuits with the help of a National Instruments high speed digital input-output card (NI-PCI or NI-PXI 6562) within the LabVIEW environment. The system is fit for measurements of various components, including single chips mounted on prototype PCBs as well as chip assemblies used for exercising the various powering schemes and prototype module hybrids. Further extension of the system for tests performed at detector stave level is foreseen.

#### II. COMPONENTS TO BE TESTED BY PRESENTED SETUP

#### *A. ABCN-25 readout chip*

The current Atlas Binary Chip Next (ABCN) prototype flavour (ABCN-25) is a 128 channel ASIC implemented in 0.25 µm CMOS technology, for use with semiconductor strip sensors intended for the upgrade of the ATLAS Inner Detector. The block diagram of this chip is presented in Figure 1. The chip inherits a large part of its architecture from the ABCD chips built for the current ATLAS Semiconductor Tracker (SCT). Those chips were built in 0.8 µm Bi-CMOS DMILL technology [1]. The main functionality differences between the two chips are presented below.

ABCN-25 implements a slightly modified communication protocol compared to ABCD, which required modifications in the already existing test systems. Furthermore, the crucial components of the setups used for testing ABCD chips are no longer available for new users, creating the need to implement a new family of systems based on commercial DAQ components. The National Instruments LabVIEW environment has been chosen for ease of programming and interfacing to the selected NI hardware presented in section III.

Later prototypes of ABCN developed in either 130 nm or 90 nm are foreseen to maintain backward compatibility needed to operate them with the system described in [2].



Figure 1: ABCN-25 block diagram

The ABCN preamplifier architecture is designed to work with input signals of both polarities (coming from either p or n type strips), as well as with different strip lengths [4].

The large foreseen increase in the channel count, as well as in the sensor granularity, of the future upgraded ATLAS Inner Detector requires a novel approach to the problem of power distribution within the detector volume. The document [3] describes various powering schemes to be tested in order to keep the power consumption reasonable. In order to test the feasibility of various powering schemes for the digital and analogue parts of the chip, on-chip regulator circuitry was added.

Single Event Upset detection and correction circuits in crucial areas were also added. All but the Mask registers inside the chip are of write/read type, to provide a means of checking whether the chip's configuration is consistent with the required one. In total, there are 13 different on-chip registers compared to 6 in the current ABCD. To facilitate the identification of chips and decrease the number of possible mapping errors in the larger system, each chip is provided with a 16-bit fuse register holding a non-changeable ID number.

The ABCN-25 chip may operate in "single clock mode" with a single 40 MHz clock (just as ABCD), or in "double clock mode", where an additional 80 MHz clock signal is provided to drive the data link from the chip to the outside world at doubled speed. In this scenario the chip input data rate is 40 Mbps and the output data rate is 80 Mbps.

#### *B. Readout chain architecture*

At present, there is no final design of the detector module nor of any higher level structures. Nevertheless, the readout chain architecture seems already frozen. Most probably, the chips will be assembled on a flex hybrid with two columns, each consisting of enough chips to process signals from between 1000 and 2000 strips (the final channel count per chip is not yet decided). The diagrams in figures 2 and 3 show the two possible readout scenarios.



Figure 2: Readout chain diagram in standalone mode

In the standalone readout mode, very similar to the ABCD readout, the Master chip, upon receiving the trigger from the higher level DAQ, sends the header, the L1 trigger and beam crossing clock (BC clock) counter values, its own data and the token to the next chip in the chain, which adds its own data and passes the packet and the token to the next chip. The End chip (typically at the end of the 10 chip column) also adds a specific trailer to mark the end of data, and the data is successively passed back to the Master chip to be transmitted out.
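The order in which the event packet is assembled along the chain can be pictured with a small Python sketch. It only models the sequence of fields described above; the tuple representation, the field names and the function name `standalone_readout` are hypothetical and do not reflect the bit-level ABCN-25 data format.

```python
def standalone_readout(chip_hits, l1_count, bc_count):
    """Assemble one event packet for a column read out in standalone mode.

    chip_hits lists the data payload of each chip in chain order,
    Master chip first and End chip last (illustrative representation).
    """
    if not chip_hits:
        raise ValueError("the chain needs at least one chip")
    # The Master chip opens the packet with the header and the counters.
    packet = [("header",), ("l1", l1_count), ("bc", bc_count)]
    for position, hits in enumerate(chip_hits):
        # The chip holding the token appends its own data,
        # then hands the packet and the token to the next chip.
        packet.append(("data", position, hits))
    # The End chip closes the packet with the trailer.
    packet.append(("trailer",))
    return packet


if __name__ == "__main__":
    for field in standalone_readout([[1, 5], [], [7]], l1_count=3, bc_count=42):
        print(field)
```

A test system can use the trailer as the end-of-event marker when splitting an acquired bit stream into events, which is why the sketch keeps it as an explicit field.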



Figure 3: Readout chain diagram in Module Controller mode

In the module controller mode, there is no Master chip in the chain; all the chips but the End one are Slave chips. The local Module Controller Chips (MCC) will transmit triggers to the group of readout columns, gather the data back and combine the packages to be transmitted out at a 160 Mbps rate. For brevity, figure 3 shows only one column of three ABCN chips connected to the MC.

Both the standalone chains and future MC chains may be exercised using the same presented setup, for ABCN chips operating either in single or double clock mode.

# III. TEST SYSTEMS USED

# *A. SCTDAQ test system*

From the DAQ point of view, the most important part of the systems used for testing the ABCDs, as well as the hybrids and modules used for building the ATLAS SCT, was the Multichannel Silicon Tracker ABCD Readout Device (MuSTARD). It was a custom-built FPGA based VME module meant for operating 12 datastreams received from ABCD chips. Its onboard circuitry receives, decodes and stores the received data, with the possibility of realtime histogramming of the incoming events and transmitting the results to the host PC via the VME interface. The hardware based data handling makes it possible to issue consecutive triggers at the maximum rate, with each trigger sent immediately upon detecting the trailer of the previous trigger's data package. A description of the whole VME system used may be found in [7].

# *B. ABCNIDAQ test system*

The ABCNIDAQ test system consists of a set of procedures developed in LabVIEW and running on the host PC, with the actual generation and acquisition of data performed by National Instruments 6562 cards in either PCI or PXI flavour, complemented in different scenarios by various other equipment. In particular, for single chip test PCBs the NI-USB 6509 card (96 static TTL I/O lines) provides the logic levels used for configuring the ABCN-25 chip (settings like chip address, readout/clock mode configuration, operation of the built-in analogue multiplexer for DAC testing and so on). Other equipment used included a GPIB controlled generator for performing clock frequency sweeps and a multimeter for DAC measurements (for details see [9] and [10]).

The heart of the system is the NI-6562 PCI/PXI module, in essence a high speed, 200 MHz (up to 400 Mbps per channel in DDR mode), 16 channel digital LVDS-standard board. The device provides hardware timed, synchronous generation and acquisition of LVDS signals compatible with the signals used by the ABCN-25 chip. One readout chain as described above requires one output channel (Command in figure 2) and one input channel (LVDS data out, same figure) for data acquisition. In single clock mode the BC clock is exported on a dedicated CLK line provided on the board; in double clock readout mode the faster data clock is exported there instead, with the BC clock being generated on one of the output channels of the NI-6562. Thus, a 16 channel card may serve up to 8 readout chains in single clock mode or up to 7 in double clock mode. The trigger lines present on the device (PFI), together with the so-called "scripted generation", provide the flexibility needed to perform fast, hardware timed tests. Scripted generation uses defined waveforms as building blocks for tests. The commands configuring the chips, followed by a series of L1 triggers, are sent in one burst, with an acquisition trigger issued upon generation of each L1. Thus, only the interesting part of the chip response is gathered for further analysis, with separate events being stored in different acquired records.
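The channel budget quoted above (8 chains in single clock mode, 7 in double clock mode) follows from simple counting, sketched below. The function name is hypothetical; the only inputs are the figures given in the text: two channels per chain, plus one channel reserved for the BC clock in double clock mode.

```python
def max_readout_chains(total_channels=16, double_clock=False):
    """Number of readout chains one card can serve.

    Each chain needs one command output and one data input channel.
    In double clock mode one extra channel generates the BC clock,
    because the dedicated CLK line carries the faster data clock.
    """
    channels_per_chain = 2
    reserved = 1 if double_clock else 0
    return (total_channels - reserved) // channels_per_chain
```

In double clock mode the division leaves one channel unused, which is why the count drops from 8 to 7 rather than staying at 8.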

Unfortunately, real-time data decoding is not possible, so the acquired records may be significantly longer than the real incoming data package. This drawback could be overcome by using a directly accessible FPGA product from the Reconfigurable I/O (RIO) family of National Instruments products, which did not have all the necessary components readily available when the development of the presented setup started (late 2008). This limitation may significantly slow down the tests performed to measure the intrinsic noise characteristics of a component.

## IV. EXAMPLE ABCNIDAQ RESULTS

The system described above was used for part of the characterization and functionality verification of the first engineering run of ABCN-25 chips. In particular, the presented system was the first to confirm correct operation of the chip in the double clock scheme, which demonstrates the fast startup time of an approach based on a well tested and supported commercial DAQ component. To demonstrate the flexibility of the presented system, a few example results are shown below: a so-called three point scan showing analogue parameters calculated from digital scans for a single chip test board, an example of a threshold DAC linearity measurement, and an example of a test showing the usable range of the calibration pulse delay for two data links, each reading out a column of 10 chips on a prototype hybrid. More results obtained with both the ABCNIDAQ and the modified SCTDAQ setups may be found in [6]. In general, the results obtained are consistent with each other and no measurement artifacts have been observed.

# *A. Three point gain*

A three point gain scan consists of a threshold scan performed for 3 different input calibration charges. For each scan, the value of the threshold corresponding to 50 % occupancy is found, and a straight line is fitted to these three values as a function of the injected charge to obtain the gain and offset of each channel. The output noise sigma in mV is then divided by the gain to obtain the equivalent input noise charge.



Figure 4: Three point gain scan for 1.0, 1.5 and 2.0 fC, 1000 triggers per point

Figure 4 shows the results of a three point gain scan matching the required design parameters of the chip (around 100 mV/fC gain and around 400 e input equivalent noise charge).
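The three point gain procedure of this subsection can be sketched in a few lines. This is an illustration under assumptions, not the ABCNIDAQ code: the function names are hypothetical, the 50 % point is found by linear interpolation rather than by a fit, and the conversion uses 1 fC ≈ 6241.5 electrons.

```python
ELECTRONS_PER_FC = 6241.5  # 1 fC divided by the elementary charge


def threshold_at_half_occupancy(thresholds, occupancy):
    """Interpolate the threshold at which a falling occupancy crosses 0.5."""
    points = list(zip(thresholds, occupancy))
    for (v0, o0), (v1, o1) in zip(points, points[1:]):
        if o0 >= 0.5 > o1:
            return v0 + (o0 - 0.5) * (v1 - v0) / (o0 - o1)
    raise ValueError("occupancy never crosses 50 %")


def linear_fit(x, y):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def enc_electrons(noise_mv, gain_mv_per_fc):
    """Convert output noise (mV) into equivalent input noise charge (e)."""
    return noise_mv / gain_mv_per_fc * ELECTRONS_PER_FC
```

With the 50 % thresholds (in mV) for the three charges (in fC) as input, `linear_fit` returns the gain in mV/fC as the slope and the offset in mV as the intercept; `enc_electrons` then performs the noise conversion described above.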

# *B. DAC linearity*

Figure 5 presents the voltage at the 8-bit threshold DAC output. The on-chip analogue multiplexer was set up to route this DAC output voltage to a test pad connected to a GPIB controlled multimeter, while the ABCNIDAQ setup provided the configuration signals for the multiplexer and changed the DAC values.



Figure 5: Threshold DAC output voltage vs DAC code.

The range and step of the DAC match the design values of 800 mV and 3.2 mV respectively, and no significant nonlinearity is observed (output voltage varies from 2.066 V for 0 code to 1.272 V for 255 code).
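As a quick cross-check of the quoted endpoint voltages, the measured range and average step can be computed directly. The helper name is hypothetical and the calculation uses only the two endpoint values given in the text.

```python
def dac_range_and_step(v_first_code, v_last_code, bits=8):
    """Return (range, average step) in volts from the two endpoint voltages."""
    span = abs(v_first_code - v_last_code)
    return span, span / (2 ** bits - 1)


# Endpoint voltages quoted in the text for codes 0 and 255.
span_v, step_v = dac_range_and_step(2.066, 1.272)
```

For these endpoints the span is 0.794 V and the average step is about 3.11 mV per code, close to the design values.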

# *C. Calibration delay scan*

The calibration pulses sent via internal calibration circuit to perform scans needed for deriving the analogue frontend parameters need proper timing. Figure 6 shows the dependence of the measured occupancy on the calibration pulse delay setting for each of the channels of the two readout chains mounted on a prototype hybrid described in [8].



Figure 6: Example Strobe Delay Scan

The two columns of ten chips each were read out simultaneously using two different channels of the NI-6562 board. The results show correct decoding of the multichip data and simultaneous operation of two readout chains. Some chip to chip variation of the perceived delay range due to process variations is visible, but the overlap region between the chips is sufficient for a common setting of the delay for all the chips sharing the same data link. The sharp transition between the full occupancy (gray) and zero occupancy (white) regions, with virtually no transition area, shows good performance of both the calibration circuit and the preamplifier/shaper/discriminator chain in the channel.
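Choosing a common delay setting for all chips on one data link amounts to intersecting the usable delay windows of the individual chips. A minimal sketch follows; the function name and the (start, stop) representation of a window are assumptions made for illustration.

```python
def common_delay_window(windows):
    """Intersect per-chip usable delay windows given as (start, stop) pairs.

    Returns the shared (start, stop) window, or None if the chips have no
    delay setting in common.
    """
    start = max(lo for lo, _ in windows)
    stop = min(hi for _, hi in windows)
    return (start, stop) if start <= stop else None
```

Any delay value inside the returned window gives full occupancy on every chip of the chain.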

# V. LABVIEW PROGRAMMING TECHNIQUES USED

The software used in the presented system utilizes the LabVIEW Professional Development System. XML manipulation libraries are used for handling the configuration classes. The input/output channel assignment, chip addressing, clock configuration, number of connected hybrids and chips, and chip configuration registers are all stored in XML config files specific to the component tested as well as to the setup location (e.g. different cable lengths that need to be accounted for in the delay settings). The XML schema provides a means of checking the config validity at any given point. The configuration objects are implemented using the LabVIEW object oriented programming scheme to keep the source code uncluttered and easy to maintain. The subroutines crucial for application speed (in particular the data decoder) had to be implemented in a non-object-oriented way. As the graphical, data driven programming paradigm hides part of the implementation from view, even experienced text-based programmers may find it difficult to write fast graphical code.
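The idea of storing the setup in XML and checking its validity can be illustrated with a short Python sketch. All element and attribute names below (`setup`, `chain`, `chip`, `command_channel`, `data_channel`, `clock_mode`, `address`) are hypothetical, since the actual ABCNIDAQ schema is not given here. The checks are written by hand because the Python standard library has no XML Schema validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration file content, for illustration only.
EXAMPLE = """
<setup location="lab-A">
  <chain command_channel="0" data_channel="1" clock_mode="double">
    <chip address="32"/>
    <chip address="33"/>
  </chain>
</setup>
"""


def check_config(xml_text, n_channels=16):
    """Return a list of problems found; an empty list means the config passed."""
    problems = []
    root = ET.fromstring(xml_text.strip())
    used_channels = set()
    for chain in root.iter("chain"):
        for key in ("command_channel", "data_channel"):
            channel = int(chain.get(key, "-1"))
            if not 0 <= channel < n_channels:
                problems.append(f"{key}={channel} out of range")
            elif channel in used_channels:
                problems.append(f"channel {channel} assigned twice")
            used_channels.add(channel)
        if chain.get("clock_mode") not in ("single", "double"):
            problems.append("clock_mode must be 'single' or 'double'")
        addresses = [chip.get("address") for chip in chain.iter("chip")]
        if len(addresses) != len(set(addresses)):
            problems.append("duplicate chip address in chain")
    return problems
```

Running such a check before every test run catches wiring and addressing mistakes before any hardware is configured.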

Developing applications for today's multi-core machines lets one reduce the execution time by splitting the code into multiple threads. In LabVIEW, independent fragments of code (like separate loops) are automatically distributed to run on different CPU cores. Native LabVIEW threads include the Panel Update thread, which handles updating the application UI. Care must be taken to avoid the performance drop caused by too frequent calls to the application Front Panel (User Interface) controls and indicators. Most of the arithmetic functions used in LabVIEW are inherently polymorphic, e.g. a simple arithmetic multiplication may be used to operate on two integers or two floats, but also on two 1-D vectors or on a 2-D array and a constant. Thanks to that, scaling a whole array of double precision data by a calibration factor may be done via a single Multiply function call rather than element by element. One may also use the set of included in-place array operations to avoid excessive memory usage and data copy overhead.

During the development of the ABCNIDAQ system, benchmarks of array operations in LabVIEW were performed to detect and avoid possible bottlenecks. Dynamic and static array allocation were compared for double precision data types. Both scenarios involve a for loop running a given number $m$ of times, with each iteration placing one double precision number in the array at index $i$. In the dynamic case the InsertIntoArray function was used, which "inserts an element or subarray into n-dimensional array at the points you specify by index" according to the documentation. For the static case, a 1D $m$ element array of double precision numbers was preallocated using the InitializeArray function, and the elements were then written to it inside the for loop using ReplaceArraySubset, which "replaces an element or subarray in an array at the point specified by index". The execution times on a modern dual core notebook machine with 4 GB of memory are presented in table 1. The difference comes from the overhead of copying and reallocating the full array required by the InsertIntoArray function, despite its first-glance similarity to the ReplaceArraySubset function.

Table 1: Comparison of execution time between two methods of writing data to an array

| Number of array elements filled | Time taken (InsertIntoArray) | Time taken (ReplaceArraySubset) |
|---------------------------------|------------------------------|---------------------------------|
| 100000                          | 14 seconds                   | 20 milliseconds                 |
| 1000000                         | 56 minutes                   | 50 milliseconds                 |
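The same effect can be reproduced outside LabVIEW. The Python sketch below is an analogue, not a translation of the LabVIEW benchmark: the first function rebuilds the whole list on every iteration (mimicking the copy-and-reallocate behaviour attributed to InsertIntoArray), while the second writes into a preallocated list (mimicking InitializeArray followed by ReplaceArraySubset).

```python
def fill_by_insertion(m):
    """Grow the array one element at a time, copying it on every iteration."""
    data = []
    for i in range(m):
        # A brand-new list is built each time, so the old contents are copied.
        data = data[:i] + [float(i)] + data[i:]
    return data


def fill_preallocated(m):
    """Allocate the array once, then overwrite the elements in place."""
    data = [0.0] * m
    for i in range(m):
        data[i] = float(i)
    return data
```

Both functions return the same array, but the first performs on the order of $m^2/2$ element copies and the second only $m$ writes; timing them with `timeit` for increasing $m$ shows the gap widening rapidly.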

The analysis of data from binary readout architecture chips (including the ABCN-25) involves fitting the function

$$
f(V_{th}) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\left(\frac{V_{th} - V_{t50}}{\sigma \sqrt{2}}\right)
$$

to the threshold scan data $occ(V_{th})$, where $V_{th}$ is the threshold voltage value, $occ(V_{th})$ is the hit occupancy measured for a given injected charge, and $V_{t50}$ and $\sigma$ are the parameters of the fit. Such fits are done multiple times for each channel and their speed is crucial for the overall performance of the developed test system. The fits are done using the Levenberg-Marquardt method, and the same machine was used for benchmarking two implementations of the test function, with the results presented in table 2. The test function values for each threshold scan datapoint were calculated either via a MathScript node (the LabVIEW way of calling external Matlab libraries) or by constructing the cumulative Gaussian distribution function using the built-in arithmetic functions and numerical point by point integration. The native function implementation outperforms the external library call method by nearly 3 orders of magnitude.

Table 2: Time taken for fitting the  $f(V_{th})$  function to a 200 point threshold scan data

|            | MathScript | Built-in arithmetics |
|------------|------------|----------------------|
| Time taken | 9 seconds  | 12 milliseconds      |
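For readers without LabVIEW, the threshold scan fit can be sketched in pure Python. The sketch assumes that the error function argument is the difference between the threshold and $V_{t50}$, scaled by $\sigma\sqrt{2}$; the function names, the starting damping value and the stopping rule are illustrative choices, and this hand-written two-parameter Levenberg-Marquardt loop is not the LabVIEW implementation benchmarked above.

```python
import math


def s_curve(v, v50, sigma):
    """Expected occupancy at threshold v for a Gaussian noise distribution."""
    return 0.5 - 0.5 * math.erf((v - v50) / (sigma * math.sqrt(2.0)))


def fit_s_curve(thresholds, occupancy, v50, sigma, max_iter=200):
    """Levenberg-Marquardt fit; v50 and sigma are the starting guesses."""
    def cost(a, b):
        return sum((y - s_curve(v, a, b)) ** 2
                   for v, y in zip(thresholds, occupancy))

    lam = 1e-3
    best = cost(v50, sigma)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r using the analytic derivatives.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for v, y in zip(thresholds, occupancy):
            z = (v - v50) / (sigma * math.sqrt(2.0))
            e = math.exp(-z * z)
            d_v50 = e / (sigma * math.sqrt(2.0 * math.pi))  # df/dv50
            d_sig = z * e / (sigma * math.sqrt(math.pi))    # df/dsigma
            r = y - s_curve(v, v50, sigma)
            a11 += d_v50 * d_v50
            a12 += d_v50 * d_sig
            a22 += d_sig * d_sig
            g1 += d_v50 * r
            g2 += d_sig * r
        # Solve the damped 2x2 system (diagonal scaled by 1 + lambda).
        m11, m22 = a11 * (1.0 + lam), a22 * (1.0 + lam)
        det = m11 * m22 - a12 * a12
        if det == 0.0:
            break
        step_v50 = (m22 * g1 - a12 * g2) / det
        step_sig = (m11 * g2 - a12 * g1) / det
        trial_v50, trial_sig = v50 + step_v50, sigma + step_sig
        trial = cost(trial_v50, trial_sig) if trial_sig > 0.0 else float("inf")
        if trial < best:
            # Accept the step and relax the damping.
            v50, sigma, best = trial_v50, trial_sig, trial
            lam *= 0.1
            if abs(step_v50) + abs(step_sig) < 1e-12:
                break
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
    return v50, sigma
```

Using `math.erf` directly corresponds to the "built-in arithmetic" route described in the text: the model is evaluated natively at every data point rather than through an external script interpreter.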

# VI. ACKNOWLEDGEMENTS

M. Dwuznik acknowledges support from the Seventh Framework Programme FP7/2007-2013 under Grant Agreement no. 21214.

# VII. SUMMARY

ABCNIDAQ, a test system to be used in the ATLAS Inner Detector Upgrade programme, was built from commercial off-the-shelf DAQ components from National Instruments and programmed within the LabVIEW environment. The software part of the system proved to be robust and flexible enough to perform a broad range of tests and measurements of prototypes for the mentioned upgrade programme, including chip functionality verification and chip analogue parameter measurements. Despite the lack of the hardware-based data processing offered by FPGA based systems, the system speed and scalability seem sufficient for chip and hybrid level tests. The setup has been used for tests ranging from single chip tests to tests of a prototype 20 chip hybrid board, soon to be followed by a 40 chip "half module", and it is foreseen to be extended to use 3 NI 6562 boards in parallel.

# **REFERENCES**

- [1] W. Dabrowski et al., *Design and performance of the ABCD3TA ASIC for readout of silicon strip detectors in the ATLAS semiconductor tracker*, Nucl. Instr. and Methods Phys. A, Vol. 552 (2005), pp.292-328
- [2] F. Anghinolfi, W. Dabrowski, *Proposal to develop ABC-Next, a readout ASIC for the S-ATLAS Silicon Tracker Module Design*, https://edms.cern.ch/document/722486/1
- [3] M. Weber, *Research and Development of Power Distribution Schemes for the ATLAS Silicon Tracker Upgrade*, https://edms.cern.ch/document/828970/1
- [4] P. Farthouat, A. Grillo, *Read-out electronics for the ATLAS upgraded tracker*, https://edms.cern.ch/document/781398/1
- [5] J. Kaplon, F. Anghinolfi et al., *The ABCN front-end chip for ATLAS Inner Detector Upgrade*, Topical Workshop on Electronics for Particle Physics, Naxos, Greece, 15-19 Sep 2008, pp.116-120
- [6] F. Anghinolfi et al., *Performance of the ABCN-25 readout chip for ATLAS Inner Detector Upgrade*, in these proceedings
- [7] P. W. Phillips for SCT collaboration, *System Performance of ATLAS SCT Detector Modules*, 8th Workshop on Electronics for LHC Experiments, Colmar, France, 9 - 13 Sep 2002, pp.100-104
- [8] A. Greenall et al., *Prototype flex hybrid and module designs for the ATLAS Inner Detector Upgrade utilising the ABCN-25 readout chip and Hamamatsu large area Silicon sensors*, in these proceedings
- [9] National Instruments Corporation, *NI 6509 User Guide and Specifications*, http://www.ni.com/pdf/manuals/372117a.pdf
- [10] National Instruments Corporation, *NI PXI/PCI-6561/6562 Specifications*, http://www.ni.com/pdf/manuals/373772b.pdf

# Integrated test environment for a part of the LHCb calorimeter– TWEPP-09

C. Abellan<sup>a</sup>, M. Roselló<sup>a</sup>, A. Gaspar de Valenzuela<sup>a</sup>, J. Riera-Baburés<sup>a</sup> D. Gascón <sup>b</sup>, E. Picatoste <sup>b</sup>, A. Comerma <sup>b</sup>, X. Vilasís-Cardona <sup>a</sup>

<sup>a</sup>LIFAELS, La Salle, Universitat Ramon Llull. Quatre Camins 2-4, 08022 Barcelona, Spain. <sup>b</sup> Departament d'ECM, Universitat de Barcelona, Diagonal 647, 08028 Barcelona, Spain.

# cabellan@cern.ch

#### *Abstract*

An integrated test environment for the data acquisition electronics of the Scintillator Pad Detector (SPD) of the LHCb calorimeter is presented. It allows every single board to be tested separately or global system tests to be performed, while being able to emulate and debug every part of the system. This environment is foreseen for testing the production of spare electronic boards and for supporting the maintenance of the SPD electronics throughout the life of the detector. The heart of the system is an Altera Stratix II FPGA, and the main board can be controlled over USB, Ethernet or WiFi.

# I. INTRODUCTION

The maintenance of the electronics for the LHC experiments will be an issue throughout the life of the detectors. Electronic boards will have to be repaired or tested when the original designers and testers of the production electronics may no longer be involved in the experiment. For this reason, self-contained, easy to use and well documented test setups become almost mandatory.

The LHCb calorimeter is made of four chambers, namely a hadronic calorimeter, an electromagnetic calorimeter, a preshower (PS) and a Scintillator Pad Detector (SPD). The role of the SPD is to determine whether the crossing particle is charged or neutral, complementing the information from the preshower, mainly for the trigger system.

In this paper we present the design and implementation of such a test bench setup for the SPD of the calorimeter of the LHCb experiment [\[1\]](#page-177-0).

In the first part, we briefly describe the different electronic boards found in the SPD, their basic functions and the relationships between them.

In the second part, we describe the former test boards, the ones used during prototyping and production testing, together with the tests these boards were capable of and some of their limitations.

In the third part, we describe the implemented solution and the improvements made to the system.

Finally, in the fourth part, we draw brief conclusions.



<span id="page-173-0"></span>Figure 1: SPD simplified diagram

# II. SPD ELECTRONICS

The SPD is formed by a plane of detecting cells made of plastic scintillator. Each cell contains an optical fiber which transports the produced light to the corresponding input channel of a 64-channel multi-anode photomultiplier (PMT), the R7600-00-M64MOD from Hamamatsu.

For each PMT there is a Very Front End electronics card (VFE), which is in charge of the analogue signal processing and the digital conversion. The analogue signal processing mainly consists of integrating the signal, subtracting a fraction of the previous pulse to correct for spill-over and, finally, comparing the result to a programmable threshold in each clock cycle. This process is sensitive to the integration start time, which is controlled by the edge of a clock signal sent through the so-called control cables. The control cables also carry a synchronous serial line for low latency operations such as test pattern start and stop. Every VFE sends the resulting digital information, serialized, over the so-called Data Cable (DC) at a payload data rate of 2.6 Gbps. VFE boards are powered by specific radiation tolerant regulator chips hosted on a Regulator Board (RB).
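The per-channel processing chain described above (integrate, subtract a fraction of the previous pulse, compare to a threshold) can be modelled digitally as a minimal sketch. The function name, the choice of subtracting a fraction of the previous raw integral, and all numerical values are assumptions for illustration; in the VFE this is done by analogue circuitry.

```python
def vfe_bits(integrals, spill_fraction, threshold):
    """Produce one SPD bit per clock cycle for a single channel.

    `integrals` holds the integrated signal of consecutive clock cycles.
    A fraction of the previous cycle's integral is subtracted to correct
    for spill-over before the comparison with the threshold.
    """
    bits = []
    previous = 0.0
    for value in integrals:
        corrected = value - spill_fraction * previous
        bits.append(1 if corrected > threshold else 0)
        previous = value
    return bits
```

In this toy model a large pulse followed by its tail, for example integrals of 10.0 and then 3.0 with a spill fraction of 0.25 and a threshold of 2.0, fires only in the first cycle; without the correction the tail would also exceed the threshold.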

The connection of VFEs and RBs with the experiment control system as well as the clock distribution is made by Control Boards (CB). RBs use the same physical and electronic interface as VFEs so there is no difference from the CB's point of view.

The data obtained by the VFEs are sent to the Preshower Front End Boards (PSFEB), which are in charge of processing the PS signal and sending the information to the trigger and data acquisition (DAQ) paths.

CBs also contribute to the trigger system by collecting the SPD bits from a detector region and adding them up to evaluate its multiplicity. This calculation is made by adding the multiplicities coming from 4 or 7 VFEs, sent by the corresponding PSFEBs. This amounts to about 5.9 Gbps of input data. The multiplicity is sent to the Selection Board (SB) over a dedicated optical fiber.

All these boards and their links are summarized in [Figure](#page-173-0) [1](#page-173-0), and further information can be found in [\[2\]](#page-177-4).
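The multiplicity evaluation amounts to counting the SPD bits that are set in a detector region. A minimal sketch follows, assuming (for illustration only) that each VFE contributes one 64-bit word per clock cycle, with one bit per PMT channel; the function name is hypothetical.

```python
def spd_multiplicity(vfe_words):
    """Count the set SPD bits over the 64-bit words of several VFEs."""
    mask = 2 ** 64 - 1  # one bit per channel of a 64-channel PMT
    return sum(bin(word & mask).count("1") for word in vfe_words)
```

In the real system this sum is formed in hardware on every clock cycle, which is what makes the input data rate quoted above so demanding.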

# III. FORMER TEST BOARDS

# A. FPGA BASELINE

The data rates of these systems make it necessary to use FPGAs to implement the test boards. The FPGA board originally used for all test systems in our labs was made by a vendor called Parallax<sup>[3]</sup> and included a Stratix EP1S25F672C6. The form factor of this board makes it appropriate for prototyping purposes: it has a relatively small footprint and needs a single power supply, and its connectors are simple, inexpensive and easy to exchange.

Unfortunately this product was discontinued by the vendor, so our group decided to make its own enhanced version.

The original idea was to keep the board backwards compatible and add more pins and capabilities.



Figure 2: New FPGA board

<span id="page-174-1"></span>The new board has practically double the number of pins, a USB connection instead of the original serial one and a much more powerful Stratix II EP2S60F484C5 (a photograph can be seen in [Figure 2\)](#page-174-1).

# B. CONTROL BOARD TEST

The CB was originally tested with a specifically designed board that emulated the backplane on which the CB is plugged. The original board can be seen in [Figure 3.](#page-174-0)



Figure 3: Former Test Board for CB

The test board communicated with a PC over a USB connection and also over the Serial Protocol for the Experiment Control System (SPECS), which is an *ad hoc* protocol [\[4\]](#page-177-2).

The FPGA was also connected to an *ad hoc* optical receiver  [\[5\]](#page-177-1)  that could monitor data going out from the CB.

<span id="page-174-2"></span>Let us note that this setup needed two power supplies and a precision clock generator to work properly. Besides, the physical robustness was an issue.

<span id="page-174-0"></span>

Figure 4: Former Test Board for VFE

# C. VERY FRONT END TEST

VFEs are more complex to test since they have an optical and a digital part.

The digital part is tested with a board that emulates the roles of the CB and PSFEB. This board also has connectors to interface with the optical testing part. In this way, the digital test system can trigger a pulse of light and receive the resulting data from the VFE, all with accurately controlled timing. A photograph can be seen in [Figure 4.](#page-174-2)

The analogue part is tested in a separate test bench designed to pulse light into the PMT, as seen in [Figure 5.](#page-175-1) This requires a high voltage power supply, a dark box, and a motorized optical fiber that illuminates only the desired pixel of the photomultiplier.



Figure 5: Analogue Test Bench for VFE

# <span id="page-175-1"></span>D. LINK TEST

There were also some other small boards designed to supervise the links between different parts of the electronics. For example, clock distribution was very important, and a board was designed to supervise not only the timing but also the shape of the differential signal at both ends of the cables. A photograph of the board can be seen in [Figure 6.](#page-175-0)

<span id="page-175-0"></span>

Figure 6: Link Test Board

# E. MIXING DETECTOR AND TEST BOARDS

Test boards were used not only during production verification and debugging of the electronics but also to check its correct installation.

These tests showed the usefulness not only of connecting the electronics to test boards but also of connecting part of the electronics to the final detector environment. This is interesting both because it helps to isolate errors and because, during installation, not all the other parts need to be in place to test the current one.

As an example, we could test the VFEs when the PSFEBs were not yet installed: we used the CB to control the VFEs and a test board to receive and check the incoming data.

## IV. INTEGRATED TEST ENVIRONMENT

The integrated test environment solves the problems we encountered with the former test environments and adds a number of small enhancements that simplify the testing process.

One example is the extensive use of silkscreen labelling on the board to ease its use and avoid having to look for information elsewhere. Another is the mechanically robust construction.

Another interesting feature is that the test environment is self-contained and requires only a minimal laboratory setup. The new board regulates its own voltages, so it can be operated from a single power supply like the one used for laptops. This makes the system more compact and less error prone with respect to the input voltage.



Figure 7: Integrated Test Environment for the SPD

# A. TEST CAPABILITIES

The new test board integrates all the testing capabilities of the former boards. This includes testing all aspects of the CB, namely:

- Data from the VFEs to CB through PSFEBs
- Data going out of the CB by the optical link
- Control performed by the CB
- Precise clock distribution
- SPECS bus communication
- Interaction with other parts of the experiment

It also has the capabilities of the digital VFE test board, namely:

- Receiving data from a VFE
- Controlling a VFE as a CB would do
- Controlling a RB as a CB would do
- Triggering the optical test environment

Finally, it is able to do everything the Link test board could do:

- Inspecting communication  $CB \leftrightarrow VFE$
- Inspecting delays between clocks in a CB
- Acting as a passive VFE with terminated input

# B. CONNECTIVITY

Like the preceding test boards, the new test environment is able to connect to each board separately and perform the same tests on it. But it is also able to connect to all of them at the same time and to transfer data from one test to another [\(Figure 8\)](#page-176-2).

It could be argued that, with the preceding test boards, various boards could have been connected to a single PC and data transferred in software from one controlling program to another, thus "closing the loop" of the data flow. In reality, the data transmission speeds were far too low for this kind of link: just a couple of Mbps were available when thousands of Mbps would have been required.

In the new board all data flows from/to the FPGA, so all links are made inside it.



<span id="page-176-2"></span>Figure 8: Connection Diagram 1

Another interesting option is controlling a VFE through a CB that is itself controlled by the test board. In this way it is possible to inspect the behaviour of the system mounted exactly as it would be in the real detector [\(Figure 9\)](#page-176-0).



<span id="page-176-0"></span>Figure 9: Connection Diagram 2

# C. ADDITIONAL FEATURES

Further improvements have been made to the system, such as a new interface that supports Ethernet and/or WiFi connections. This implies a different approach to the interaction between the board and the controlling PC. Instead of running software on the PC that handles the board as an instrument from which data are retrieved, the board runs all the necessary software and the PC is only an interface to the user.

This interface has several advantages. It is operating system independent, because the PC only accesses a web server embedded in the board, and it does not require any dedicated software on the PC, which avoids the problem of keeping the software up to date on several different computers.

Another advantage is that this kind of interface makes it possible to use the boards remotely. For example, an expert could operate the board under test from their home university. Another option would be to leave the test board in the experimental zone, connected to the local network, and operate it from the control room.



<span id="page-176-1"></span>Figure 10: Xport Ethernet Interface

All this is easy to do thanks to commercially available web servers that include all the necessary electronics and are interfaced through a serial line [\[6\]](#page-177-5) [\(Figure 10\)](#page-176-1).

Another improvement is a much more mechanically robust structure. A 10 mm thick aluminium plate has been used together with 21 fixing screws to give both strength and stability to the whole board (see [Figure 11\)](#page-177-6). The plate is also used to dissipate the heat produced by the on-board regulator.



<span id="page-177-6"></span>Figure 11: Mechanical Drawing of the Integrated Test Environment

Some more improvements have been made in the mechanical design, such as a secure fixation for the large LVDS connectors of the data cabling and a better fixation of the optical receiver board.

Usability has been another aspect taken into account. Since future tests will probably not be performed in a fully equipped laboratory, the board has been made as independent as possible. For example, the board now has an embedded clock generator capable of generating its own clock (either at the exact frequency of the experiment or with small deviations), but also capable of receiving an external clock and filtering it to reduce possible jitter. Another example, already mentioned, is the use of a single power supply.

These are small improvements but have a large impact on the test procedure.

# V. CONCLUSIONS

As mentioned in the introduction, the main aim of this work was to be ready for future repairs and maintenance of the different electronic boards of the LHCb SPD. From the beginning, the whole design was intended for an average user, not only for experts. This requires the system to be usable, stable, less dependent on laboratory equipment and, where needed, remotely operable by an expert.

This was made possible by a better interface that includes network capabilities.

We have solved the problems discovered during real use of all the previous test systems.

Finally, some more test possibilities have been added so that more complex scenarios can be tested in anticipation of future complex problems.

#### VI. ACKNOWLEDGEMENTS

This work is partially supported by the Spanish MEC (project FPA-2008-06271) and by the "Generalitat de Catalunya" (AGAUR 2005SGR00385).

C.A.B. would like to thank the "Generalitat de Catalunya" for an F.I. Grant.

# VII. REFERENCES

<span id="page-177-0"></span>[1] The LHCb Collaboration, A.Augusto Alves Jr. et al. *The LHCb detector at the LHC* Journal of Instrumentation Volume 3, S08005 (2008).

<span id="page-177-4"></span>[2] D.Gascón et al, *The Front End Electronics of the Scintillator Pad Detector of LHCb Calorimeter*,12th Workshop on Electronics For LHC and Future Experiments, Valencia, Spain, 25 - 29 Sep 2006, pp.153-157

<span id="page-177-3"></span>[3] <http://www.parallax.com/>

<span id="page-177-2"></span>[4] Breton, D; Charlet, D, *SPECS: the Serial Protocol for the Experiment Control System of LHCb*, LHCb public note: CERN-LHCb-2003-004

<span id="page-177-1"></span>[5] Lax, Ignazio ; Avoni, G ; D'Antone, I ; Marconi, U, *The Gigabit Optical Transmitters for the LHCb Calorimeters*, 12th Workshop on Electronics For LHC and Future Experiments, Valencia, Spain, 25 - 29 Sep 2006, pp.153-157

<span id="page-177-5"></span>[6] http://www.lantronix.com/

# Picosecond time measurement using ultra fast analog memories.

Dominique Breton<sup>*a*</sup>, Eric Delagnes<sup>*b*</sup>, Jihane Maalmi<sup>*a*</sup>

<sup>*a*</sup> CNRS/IN2P3/LAL-ORSAY, <sup>*b*</sup> CEA/DSM/IRFU

# breton@lal.in2p3.fr

#### Abstract

The currently existing electronics dedicated to precise time measurement is mainly based on the use of constant fraction discriminators (CFD) associated with Time to Digital Converters (TDC). The time resolution measured on the most advanced ASICs based on CFDs is of the order of 30 ps rms. TDC architectures are usually based either on a voltage ramp started or stopped by the digital pulse, which offers an excellent precision (5 ps rms) but is limited by the large dead time, or on a coarse measurement performed by a digital counter associated with a fine measurement (interpolation) using Delay Line Loop, which exhibits a timing resolution of 25 ps, but only after a careful calibration. The overall precision of these systems includes the contribution of both elements ( $CFD + TDC$ ).

In the meantime, alternative methods based on digital treatment of the analogue sampled then digitized detector signal have been developed. Such methods permit achieving a timing resolution far better than the sampling period. Digitization systems have followed the progress of commercial ADCs, but the latter have prohibitive drawbacks such as their huge output data rate and power consumption. Conversely, high speed analog memories now offer sampling rates far above 1 GHz at low cost and with very low power consumption.

The new USB-WaveCatcher board has been designed to provide high performance over a short acquisition time window. It houses on a small surface two 12-bit 500-MHz-bandwidth digitizers sampling between 400 MS/s and 3.2 GS/s. It is based on the SAM chip, an analog circular memory of 256 cells per channel designed in a cheap pure CMOS 0.35 µm technology and consuming only 300 mW. The board also offers many functionalities. It houses a USB 12 Mbits/s interface permitting a dual-channel readout speed of 500 events/s. Power consumption is only 2.5 W, which permits powering the board from the USB alone.

When used for high precision time measurements, a reproducible precision better than 10 ps rms has been demonstrated.

The USB-WaveCatcher can thus replace oscilloscopes for a much lower cost in most high-precision short-window applications. Moreover, it opens new doors into the domain of methods used for very high precision time measurements.

#### I. STATE OF THE ART

The currently existing electronics dedicated to precise time measurement is mainly based on the use of constant fraction discriminators (CFD) associated with Time to Digital Converters (TDC). The constant fraction technique minimizes the time walk effect (dependency of timing on the pulse amplitude). Several attempts have been made to integrate CFD in multi-channel ASICs. But the time resolution measured on the most advanced one is of the order of 30 ps rms. Moreover, the quality of the result with this technique is strongly dependent on the input pulse shape, because pure delay lines require inductances which are not designable in an ASIC, and delays are thus replaced by signal shaping.

Two main techniques are used for the TDC architectures. The first one makes use of a voltage ramp started or stopped by the digital pulse. The obtained voltage is converted into digital data using an Analog to Digital Converter (ADC). The timing resolution of such a system is excellent (5 ps rms). But this technique is limited by its large dead time which can be unacceptable for the future high rate experiments. Another popular technique associates a coarse measurement performed by a digital counter with a fine measurement (interpolation) using Delay Line Loop. Such a system can integrate several (8-16) channels on an FPGA or an ASIC. The most advanced DLL-based TDC ASIC exhibits a timing resolution of 25 ps, but only after a careful calibration.

It has to be noticed that with all these techniques, the overall timing resolution is given by the quadratic sum of those of the discriminator and of the TDC, thus leading to a degraded performance.
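As a rough numerical illustration of this quadratic sum, the short Python sketch below combines the figures quoted above (30 ps rms for the best CFD ASIC, 25 ps rms for a DLL-based TDC, 5 ps rms for a ramp TDC). It assumes that the contributions are independent; the function name is ours and is only illustrative.

```python
import math

def combined_resolution(*sigmas_ps):
    """Quadratic sum of independent rms timing contributions, in ps."""
    return math.sqrt(sum(s * s for s in sigmas_ps))

# Figures quoted in the text (rms values, in ps).
cfd_plus_dll_tdc = combined_resolution(30.0, 25.0)   # about 39 ps rms
cfd_plus_ramp_tdc = combined_resolution(30.0, 5.0)   # about 30.4 ps rms
```

With these numbers the discriminator dominates as soon as the TDC is significantly better than 30 ps rms, which is why removing the discriminator from the chain is attractive.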

In the meantime, alternative methods based on digital treatment of the analogue sampled then digitized detector signal have been developed. Such methods permit achieving a timing resolution far better than the sampling period. For example, a 100-ps rms resolution has been reported for a signal sampled at only 100 MHz.

Digitization systems have mostly followed the progress of commercial ADCs, which currently offer a rate of 500 MHz over 12 bits. Their main drawbacks are the huge output data rate and power consumption. Their packaging, cooling, and tricky clock requirements also make them very hard to implement. Conversely, high speed analog memories now offer sampling rates far above 1GHz at low cost and with very low power consumption. They are a very interesting alternative to the use of ADCs wherever the sampling depth remains short. This will be shown again in this paper.

#### **II. THE SAM ANALOG MEMORY**

The SAM chip [1], whose block diagram is shown in Fig. 1, makes use of the 3.3 V CMOS AMS 0.35 µm technology. Its area is only 11 mm<sup>2</sup> and it integrates 60000 transistors. It houses two channels, each including 256 analog storage cells.

The high sampling frequency (Fs) of SAM is obtained by a virtual multiplication of the lower frequency clock (Fp) using internally servo-controlled Delay Line Loops (DLL). But, rather than using a linear structure, each analog memory channel is configured as a matrix of 16 lines with 16 capacitors each, as shown in Fig. 2. This structure was already used in the MATACQ chip described in [2]. In this structure, successive capacitors in the same column contain consecutive analog samples taken at 1/Fs intervals, whereas successive capacitors in the same line contain samples taken at 1/Fp = 16/Fs intervals. The sampling timing is ensured by a 16-step DLL associated with each column, the delay of which is servo-controlled to 1/Fp. The input signal is split, using for each line a voltage buffer feeding the analog signal. In each line also, a readout amplifier permits reading back the analog information stored in the capacitors. During readout, the read amplifier outputs are time-multiplexed towards an external ADC. Both the write and read operations are performed in voltage mode in the capacitors in order to ensure total voltage gain robustness against parasitic elements or component mismatch.
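The matrix organisation can be made concrete with a small Python sketch that maps a sample index to its capacitor and nominal sampling instant. The numbering convention (samples fill a column line by line, then move to the next column) and all function names are assumptions chosen for illustration, not a description of the chip's internal addressing.

```python
N_LINES = 16     # lines of the matrix, one input buffer per line
N_COLUMNS = 16   # columns, one 16-step DLL per column

def cell_position(sample_index):
    """Return (line, column) of the capacitor holding a given sample.

    Assumed numbering: consecutive samples go down a column, one line
    after the other, then continue in the next column (circular memory).
    """
    column, line = divmod(sample_index % (N_LINES * N_COLUMNS), N_LINES)
    return line, column

def sample_time_ns(sample_index, fs_gsps):
    """Nominal sampling instant of a sample, in ns, for Fs in GS/s."""
    return sample_index / fs_gsps

def column_clock_mhz(fs_gsps):
    """Frequency Fp = Fs / 16 of the clock being multiplied, in MHz."""
    return 1000.0 * fs_gsps / N_LINES
```

For Fs = 3.2 GS/s, two neighbouring capacitors on the same line are written 16/Fs = 5 ns apart, and Fp is 200 MHz.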



Figure 1: Block diagram of the SAM chip.

The matrix architecture offers the following advantages compared to the standard linear sampling DLL structure:

- Better analog bandwidth for the same power consumption dissipated in the input buffers.
- Lower switching noise during the write phase, as the switching time interval between two consecutive memory cells on the same line is long (1/Fp).
- Lower readout noise. This noise contribution is, to first order, proportional to 1 + Cr/Cs, where Cs is the capacitance of the storage element and Cr is the total capacitance of the readout bus connected to the negative input of the readout amplifier. Cr is smaller in a matrix structure because the read buses are shorter than in the linear structure.
- Faster readout time, as 16 consecutive capacitors are read simultaneously by the readout amplifiers.

Several useful peripheral functions have been implemented in the chip:

- A functional block memorizing the position of the last cell written before the stop signal arrival and calculating the index of the first cell to be read back using an offset register called Nd.
- A functional block selecting the 16 capacitors to be read in parallel. These capacitors may be spread over two consecutive columns, depending on the index of the first cell to be read.
- One 7-bit DAC for each line of the matrix to compensate the static offsets due to the use of amplifiers on each line in the matrix structure. These DACs must be set during a dedicated calibration phase with no signal on inputs.
- The DLLs have been designed to avoid unlocking during readout, which would introduce extra dead-time between acquisitions.
- A slow-control serial link to program various parameters of the chip (Nd, DACs settings, test mode, biasing of the amplifiers...).



Figure 2: Principle of the sampling DLL matrix.

To decrease the effect of digital switching, each analog memory channel is actually fully differential and all the potentially noisy digital input or output external signals use the LVDS standard.

#### **III. THE USB WAVE CATCHER BOARD**

The new USB-WaveCatcher board (see Fig. 3) has been designed to provide high performance over a short time window. It houses on a small surface two 12-bit 500-MHz-bandwidth digitizers sampling between 400 MS/s and 3.2 GS/s. It is based on the SAM chip described above.



Figure 3: the USB Wave Catcher board

As the SAM chip memory depth is 256 samples, the acquisition time window duration depends on the sampling frequency (80 ns for 3.2 GS/s up to 640 ns for 400 MS/s).

The inputs are DC-coupled and a programmable DC offset covering the whole dynamic range (±1.25 V) can be applied independently on the two channels in order to optimize the signal measurement. Trigger discriminators with programmable thresholds are located on each input. The board also houses individual programmable channel pulsers for reflectometry applications. The precision obtained for cable length measurements is then as good as 2 mm. The board can be triggered either internally or externally, and several boards can easily be synchronized. Trigger rate counters and a dead-time estimator are implemented. A charge measurement mode is also provided: the signal coming, for instance, from photomultipliers is integrated on the fly over a programmable time window. Photo-electron spectra can thus be produced very quickly.
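The link between timing precision and cable length precision in reflectometry can be sketched as follows. The pulse travels to the open end and back, so the length is half the round-trip time multiplied by the propagation velocity. The velocity factor of 0.66 is an assumption (a typical value for solid-polyethylene coaxial cable), and the function names are ours.

```python
C_M_PER_NS = 0.299792458   # speed of light in vacuum, in m/ns

def cable_length_m(round_trip_ns, velocity_factor=0.66):
    """Cable length from the delay between the direct and reflected pulse."""
    return 0.5 * velocity_factor * C_M_PER_NS * round_trip_ns

def length_resolution_mm(sigma_t_ps, velocity_factor=0.66):
    """Length uncertainty for a given rms error on the round-trip time.

    m/ns multiplied by ps gives mm, so no extra conversion factor is needed.
    """
    return 0.5 * velocity_factor * C_M_PER_NS * sigma_t_ps
```

Under this assumption, a round-trip time error of 20 ps rms corresponds to about 2 mm on the cable length.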



Figure 4: the USB Wave Catcher box

By default, the connectors implemented are BNC, but they can be replaced by LEMO or SMA on demand.

The board is packaged in a convenient plastic box (see Fig. 4). Its reduced power consumption (below 2.5 W) allows it to be powered from the USB alone. It houses a USB 12 Mbits/s interface permitting a dual-channel readout speed of 500 events/s. Faster readout modes are also available. In charge measurement mode, the sustained trigger rate can reach a few tens of kHz. A 480 Mbits/s version will soon be available.
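An upper bound on the full-event rate follows from the link speed and the event size. The sketch below is only an order-of-magnitude estimate: it assumes that each 12-bit sample is carried in a 16-bit word and it ignores any protocol or header overhead, neither of which is specified in the text.

```python
def max_event_rate_hz(link_bits_per_s, channels=2, depth=256, bits_per_sample=16):
    """Upper bound on the event rate for a raw waveform readout.

    bits_per_sample=16 is an assumption (12-bit sample stored in two bytes);
    protocol overhead is ignored, so real rates are lower.
    """
    bits_per_event = channels * depth * bits_per_sample
    return link_bits_per_s / bits_per_event

usb_full_speed_bound = max_event_rate_hz(12e6)   # about 1465 events/s
```

This bound is close to the maximum full-event acquisition rate of about 1.5 kHz quoted in the performance summary, and well above the 500 events/s of the standard readout mode.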

#### **IV. ACQUISITION SOFTWARE**

Dedicated acquisition software has been developed with CVI/LabWindows.



Figure 5: the main graphical user interface of the Wave Catcher acquisition software.

It offers an oscilloscope-like front panel and the same kind of possibilities for data taking and displaying. It gives access to all the features available on the board. Data can be stored in files on demand. Standard running can easily be handled directly on the front panel, and advanced modes are accessible via different menus. The main graphical user interface is displayed in Figure 5.

This acquisition software will soon be available as a Windows installation package on the LAL web site at the following URL:

http://electronique.lal.in2p3.fr/echanges/WaveCatcher/index.htm

#### V. TIME MEASUREMENTS

#### *A*. Raw measurements

In the usual configurations used for time measurement, the analog signals first have to be discriminated before being sent to TDCs, as described in the first section. In this case, the discriminator is an additional source of error. Using analog memories permits getting rid of the discriminator. The measurement is performed directly on the analog signal, which makes it independent of the pulse shape. However, in order to reach the ultimate possible precision, a precise time calibration of the memory has to be performed.

We will first present here the measurements performed without any time calibration.

An easy way to measure the jitter performance of a digitizing system consists in performing a measurement of its Effective Number Of Bits (ENOB). As shown on Fig. 6, it is expected from theory that ENOB diminishes with the increase of sine wave frequency for a given sampling jitter. The measurements performed on the Wave Catcher board, plotted as dots on the same figure, are compatible with a raw sampling precision of 16ps rms without any time correction.



Figure 6: ENOB measurement with no time calibration with a 300mV pp sine wave compared to simulation.
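The expected decrease of ENOB with frequency can be sketched with the textbook relations between aperture jitter, signal-to-noise ratio and effective number of bits. This is not the simulation used for Fig. 6; the function and its optional amplitude-noise term (rms noise divided by rms signal) are illustrative assumptions.

```python
import math

def enob(freq_hz, jitter_s, noise_to_signal_rms=0.0):
    """ENOB of a sampled sine wave limited by jitter and amplitude noise.

    The jitter-induced error relative to the signal is 2*pi*f*sigma_t (rms);
    it is added in quadrature to the amplitude noise-to-signal ratio.
    """
    jitter_term = (2.0 * math.pi * freq_hz * jitter_s) ** 2
    sinad_db = -10.0 * math.log10(jitter_term + noise_to_signal_rms ** 2)
    return (sinad_db - 1.76) / 6.02

# Jitter-limited ENOB for 16 ps rms sampling jitter.
enob_100mhz = enob(100e6, 16e-12)   # about 6.3 bits
enob_200mhz = enob(200e6, 16e-12)   # about 5.3 bits
```

In the jitter-limited regime each doubling of the sine frequency costs one bit, which is the slope against which measured points can be compared.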

The usual measurement one actually wants to perform is the time distance between two pulses. In order to characterize properly this type of signal, a simple setup has been used. It consists in sending a pulse from a generator both to the board and to an un-terminated cable. The signal is reflected at the open extremity of the cable and comes back to the board. This is a way to obtain very clean repetitive signals, but with slightly different amplitudes. Anyway, this is close to real life operation.



Figure 7: Two pulses generated by an open cable.

Different cable lengths have been used. The dual pulse (see Fig. 7) is sent totally asynchronously with respect to the board clock, and thus falls anywhere in the sampling matrix of the SAM chip. That way, all the types of jitter are taken into account in the measurement. The timing is measured with a fixed threshold (drawn in green on the left plot), whose crossing instant is interpolated with a fourth-degree polynomial. Whatever the distance between the pulses, the jitter is 22 ps rms, which again gives a single-pulse resolution of about 16 ps rms (22/√2, assuming equal and independent contributions from the two pulses) without any correction.
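A minimal sketch of such a threshold-crossing measurement is given below. It assumes that the fourth-degree polynomial is the one passing exactly through five samples around the crossing and that the crossing is located by bisection; the text does not say how the polynomial is actually obtained, so both choices, and all names, are ours.

```python
import math

def lagrange_eval(xs, ys, x):
    """Value at x of the polynomial passing through all points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def crossing_time(times, samples, threshold):
    """Time of the first rising crossing of `threshold`."""
    if len(samples) < 5:
        raise ValueError("at least five samples are needed")
    for k in range(len(samples) - 1):
        if samples[k] < threshold <= samples[k + 1]:
            break
    else:
        raise ValueError("no rising crossing found")
    # Five samples around the crossing define a fourth-degree polynomial.
    start = min(max(k - 2, 0), len(samples) - 5)
    xs, ys = times[start:start + 5], samples[start:start + 5]
    lo, hi = times[k], times[k + 1]
    for _ in range(60):   # bisection between the two bracketing samples
        mid = 0.5 * (lo + hi)
        if lagrange_eval(xs, ys, mid) < threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def single_pulse_sigma(sigma_difference):
    """Single-pulse rms from the rms of a two-pulse time difference."""
    return sigma_difference / math.sqrt(2.0)
```

The time difference between the two pulses is then the difference of two such crossing times, and `single_pulse_sigma(22.0)` gives about 15.6 ps.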

There are actually two main contributions in the time jitter: the random jitter and the Fixed Pattern jitter. Random jitter is dominated by two sources: the sampling jitter which is due to the random aperture jitter of the switches, and the noise on the signal itself. The smart design of the storage cells permits obtaining a very small aperture jitter, whereas the high SNR of the analog memory reduces the noise added to the signal. The Fixed Pattern jitter actually represents the main contribution as it will be shown below. But as its position is fixed in the matrix and very reproducible, it can be corrected.

#### *B*. Methodology for time calibration

The effect on the measured signal of the Fixed Pattern Time Distribution along the DLL, which is the main element of the time INL, can be seen in Figure 8. Time spread is exaggerated here in order to make the effect easier to understand. The latter actually appears as a fixed irregular distance between samples, due to the irregularities in the silicon and to fixed coupling effects.

Two methods can be envisaged to recover from this fixed time error after time INL calibration. The first one consists in using floating points for the time coordinates of the samples. This is not always easy to deal with, because it adds a dimension to the correlated arrays in the software. The second one consists in correcting the sampled points to extrapolate the position of the equidistant points located on the real signal. This has the supplementary advantage of making the FFT calculation of the signal possible. The algorithm used for the latter method is based on a simple third degree Lagrange polynomial interpolation. Three points are enough because the distance between the samples and the equidistant points always remains small (below 15% of the distance between samples) thanks to the matrix structure. Figure 8 shows how the signal is corrected that way.
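The second method can be sketched as follows, using a three-point Lagrange stencil in line with the remark that three points are enough; a four-point stencil would follow the same pattern. The function names and the representation of the calibration (a list of actual sampling times) are assumptions made for illustration.

```python
def lagrange3(xs, ys, x):
    """Value at x of the parabola through three points (xs[i], ys[i])."""
    x0, x1, x2 = xs
    y0, y1, y2 = ys
    return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

def resample_to_grid(actual_times, samples, period):
    """Estimate the signal at the equidistant instants k * period.

    actual_times[k] is the calibrated sampling time of sample k, i.e.
    k * period plus the time INL of the corresponding cell.
    """
    n = len(samples)
    if n < 3:
        raise ValueError("at least three samples are needed")
    corrected = []
    for k in range(n):
        c = min(max(k, 1), n - 2)   # centre of the three-point stencil
        corrected.append(lagrange3(actual_times[c - 1:c + 2],
                                   samples[c - 1:c + 2],
                                   k * period))
    return corrected
```

Because the equidistant point is always close to an actual sample, the interpolation error stays small, and the corrected array can be passed directly to an FFT.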



In order to perform this type of high precision calibration easily, a new technique has been developed. It consists in sending to the board channels a sine wave signal at a frequency between 100 and 200 MHz (the optimum is around 135 MHz for 3.2 GS/s) and of rather high amplitude (500 mV rms, i.e. 1.4 V pp). The part of the sine used for the measurements is the set of segments crossing the zero (mid-height) of the curve (see Fig. 9). With the well chosen frequency and amplitude described above, these segments can be treated as straight lines with a systematic error remaining below 1 ps rms. The mean length of these segments gives the time DNL, whereas the rms of their values gives the random sampling noise. If one integrates the time DNL and fits the result, the time INL can be calculated and then corrected online.



Figure 9: segments used for the DNL measurement.
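The calibration procedure can be sketched in a few lines of Python. The sketch assumes baseline-subtracted waveforms, converts segment lengths to time by normalising their average to the nominal sampling period, and omits the fit applied to the integrated DNL; these choices and the function name are ours.

```python
from statistics import mean, pstdev

def time_dnl_inl(events, period_ps):
    """Time DNL, time INL and random jitter from zero-crossing segments.

    events: list of waveforms of equal length, centred on zero.
    Returns three lists with one entry per pair of consecutive cells.
    """
    n_seg = len(events[0]) - 1
    lengths = [[] for _ in range(n_seg)]
    for wave in events:
        for i in range(n_seg):
            if wave[i] * wave[i + 1] < 0:   # segment crossing zero
                lengths[i].append(abs(wave[i + 1] - wave[i]))
    if any(not seg for seg in lengths):
        raise ValueError("some cells never saw a zero crossing")
    mean_len = [mean(seg) for seg in lengths]
    # Near the zero crossing the slope is constant, so amplitude steps are
    # proportional to time steps; the average step is the nominal period.
    scale = period_ps / mean(mean_len)
    dnl = [m * scale - period_ps for m in mean_len]
    inl, running = [], 0.0
    for d in dnl:
        running += d          # integration of the DNL
        inl.append(running)
    jitter = [pstdev(seg) * scale for seg in lengths]
    return dnl, inl, jitter
```

The resulting INL table is what an online correction would subtract from the nominal sampling instants of each cell.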

#### *C*. Measurements after calibration

#### 1) Calibration and characterization of the board

Thanks to the method described above (zero-crossing segments of a sine wave), fine measurements of the time characteristics of the board have been performed.

An example of raw time DNL is shown in Figure 10. The horizontal axis displays the cell number and the vertical axis the segment lengths in ADC counts (with 0.61 mV per ADC count). The rms of the segment length is 9 ps. Once integrated and fitted, it appears as shown in Figure 11 (the vertical axis is now in ps), with an rms of 16 ps, perfectly consistent with the ENOB measurement. Note that this shape is mainly chip-dependent, but very stable with time and temperature, thanks to the DLL servo-control.



Figure 10: example of raw time DNL.



Figure 11: corresponding raw time INL.

Once this time calibration is performed, the time INL can be corrected online by software with the Lagrange polynomial interpolation. When sending the same sine wave to the board, time DNL and INL can be measured again. Figures 12 and 13 exhibit the improvement in the time precision, with an rms of 0.33 ps for the time DNL and of 1.15 ps for the time INL.



Figure 12: time DNL after correction.

Of course, in order to be useful, this calibration has to be valid for every kind of input signal. Therefore, keeping the same calibration files, the jitter measurement has been performed with different frequencies of the input sine wave covering the range where the method is effective (~100 to 200 MHz). The rms of the INL always remains below 2.5 ps, thus validating the method.



Figure 13: time INL after correction.

Now, we can come back to the measurement of the time difference between two pulses.



Figure 14: dual-pulse time difference distribution after correction.

Once the calibration has been performed, the time difference between the same pulses as in Figure 7 is measured with the same method (threshold crossing time interpolated with a fourth-degree polynomial). In Figure 14, 5000 events are displayed (the horizontal axis grid step is 3.125 ps). Whatever the distance between the pulses, the jitter is 11 ps rms, which now gives a single-pulse resolution as good as 8 ps rms, improved by a factor of two compared to the raw one.



Figure 15: random jitter [ps rms].

Random jitter is a second order element in the raw jitter but as it cannot be corrected, its contribution may become more important after correction of Fixed Pattern jitter. Figure 15 shows the rms of random jitter along the different cells of the sampling matrix. It is higher at the junctions between the columns (every 16 samples), because that is where the jitter of the clock can be seen. The mean jitter value here is however lower than 2 ps rms.

#### 2) Time measurements with MCPPMTs

A preliminary test of the board on a Micro Channel Plate Photo Multiplier Tube (MCPPMT) characterization bench has been realized.



Figure 16: correlated pulses from MCPPMTs.

The bench comprises a laser sending light pulses to two quartz bars read out by MCPPMTs. The Transit Time Spread (TTS) of the MCPPMT had previously been characterized by high-end (CFD + TDC) commercial modules (~20 ps rms for 40 photo-electrons).

Figure 16 shows 5000 superimposed events. The FWHM of the signals is only 1.5 ns but the pulses are cleanly sampled by the board. The high SNR and the quality of the stored signal permit an easy normalization of the pulse heights before applying by software a CFD-equivalent algorithm in order to perform an alignment of their first edge (see Fig. 17) and a measurement of their distance.



Figure 17: same pulses normalized and realigned
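A software equivalent of a constant fraction discriminator can be sketched as follows: the time is taken where the leading edge reaches a fixed fraction of the pulse's own maximum, which is equivalent to normalizing the pulse height first. The fraction of 0.5, the linear interpolation between samples and the function name are assumptions; the algorithm actually used on the bench is not detailed in the text.

```python
def cfd_time(times, samples, fraction=0.5):
    """Time at which the leading edge reaches `fraction` of the peak.

    Works on a baseline-subtracted positive pulse; the crossing is
    located by linear interpolation between the two bracketing samples.
    """
    peak = max(samples)
    if peak <= 0:
        raise ValueError("no positive pulse found")
    level = fraction * peak
    for k in range(1, len(samples)):
        if samples[k - 1] < level <= samples[k]:
            t0, t1 = times[k - 1], times[k]
            v0, v1 = samples[k - 1], samples[k]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("leading edge not found")
```

Two pulses of identical shape but different amplitudes give the same `cfd_time`, so the difference between the values obtained on the two channels does not depend on the pulse heights.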

The distribution of the distances is plotted on Fig. 18 (main horizontal grid step is 10ps).



Figure 18: distribution of inter-pulse distance.

The sigma of the distribution of Figure 18 is 23 ps, which is in very good agreement with the measurements previously performed. This proves that the board can be used for the characterization of this kind of ultra fast photomultipliers. Very promising preliminary tests with SiPMs (Silicon Photo Multipliers) have also been performed.

#### VI. SUMMARY OF BOARD PERFORMANCES

The main characteristics and performances of the USB-WaveCatcher board are summarized below:

- 2 DC-coupled 256-deep channels with 50-Ohm active input impedance
- ±1.25 V dynamic range, with full range 16-bit individual tunable offsets
- Bandwidth > 500 MHz
- Signal/noise ratio: 11.9 bits rms (noise = 650 µV rms)
- Sampling frequency: 400 MS/s to 3.2 GS/s
- Max consumption on +5 V: 0.5 A
- Absolute time precision in a channel (typical):
  - without INL calibration: 20 ps rms (400 MS/s to 1.6 GS/s), 16 ps rms (3.2 GS/s)
  - after INL calibration: 12 ps rms (400 MS/s to 1.6 GS/s), 8 ps rms (3.2 GS/s)
- Trigger source: software, external, internal, threshold on signals
- 2 individual pulse generators for reflectometry applications
- On-board charge integration calculation
- Acquisition rate (full events): up to ~1.5 kHz over 2 channels
- Acquisition rate (charge mode): up to ~40 kHz over 2 channels

#### VII. CONCLUSION

This study proves the ability of fast analog memories to be used for high precision time measurement. Their main advantages are the very low power and cost and the ability to work directly with analog signals. Moreover, the shape of the signal is available if necessary.

Various evolutions of the SAM chip are under study, targeting either higher precision time measurements or a longer time window. As a first step, an R&D version of the chip has recently been submitted in order to study the optimization of the power and of the signal bandwidth in view of these more dedicated versions.

In a general way, the USB-WaveCatcher can replace oscilloscopes for a much lower cost in most high-precision short-window applications. Moreover, it can be used for very high precision time measurements, especially since said measurements can be performed directly on the analog signals.

#### VIII. REFERENCES

[1] SAM: a new GHz sampling ASIC for the H.E.S.S.-II Front-End Electronics, E. Delagnes et al, Nuclear Instruments and Methods in Physics Research Section A, Volume 567, Issue 1, p. 21-26

[2] D. Breton, E. Delagnes and M. Houry, "Very High Dynamic Range and High Sampling Rate VME Digitizing Boards for Physics Experiments", IEEE Transactions on Nuclear Science, VOL. 52, No. 6, Dec 2005, pp 2853-2860.

# *TUESDAY 22 SEPTEMBER 2009*

# *PARALLEL SESSION B2b RADIATION TOLERANT COMPONENTS AND SYSTEMS*

## **Measurement of Radiation Damage to 130nm Hybrid Pixel Detector Readout Chips**

R. Plackett, X. Llopart, R. Ballabriga, M. Campbell, L. Tlustos, W. Wong

CERN, 1211 Geneva 23, Switzerland

Richard.plackett@cern.ch

#### *Abstract*

We present the first measurements of the performance of the Medipix3 hybrid pixel readout chip after exposure to significant x-ray flux. Specifically the changes in performance of the mixed mode pixel architecture, the digital periphery, digital to analogue converters and the e-fuse technology were characterised. A high intensity, calibrated x-ray source was used to incrementally irradiate the separate regions of the detector whilst it was powered. This is the first total ionizing dose study of a large area pixel detector fabricated using the 130nm CMOS technology.

#### I. INTRODUCTION

This paper presents recent measurements of the performance of the Medipix3[1] pixel readout chip after exposure to large doses of x-rays. Medipix3 is the first full pixel readout chip to be fabricated in the IBM 130nm CMOS technology, thus its ability to survive irradiation is a strong indicator of the expected intrinsic radiation hardness of the technology[2], and of its suitability for use in high radiation environments such as the proposed sLHC[3] detector systems.

#### II. MEDIPIX3

Medipix3 is the most recent addition to the Medipix family of single photon counting pixel readout chips. It is designed as part of a hybrid pixel detector assembly. As with its predecessor, Medipix2[4], it provides individual readout channels for a 256 by 256 array of 55 µm square pixels. Each pixel channel is electrically connected to its corresponding structure in the sensor chip by means of a solder bump bond. Each channel provides analogue amplification, shaping and two discriminators driving two programmable binary counters. The chip as a whole is then read out with a 'shutter' signal. The primary design goal of both Medipix2 and Medipix3 is to provide single photon counting x-ray detection with high resolution, high dynamic range and high signal to noise ratio.

Medipix3 builds on the concept of Medipix2 but adds several new features and modes that extend its functionality, especially in the area of single photon spectrometry. A limiting factor in Medipix2's ability to reconstruct a spectrum was the charge sharing phenomenon. A photon falling between two or four pixels will share its energy amongst them, with each of the two or four signals produced having a significantly lower chance of passing the discriminator threshold. Medipix3 contains charge summing circuitry in its analogue front end, allowing four neighbouring pixels to communicate and allocate the full charge to the pixel with the largest initial signal, before the signal is passed to the discriminator. This effectively removes the distortion of the spectrum caused by charge sharing. In addition to this, Medipix3 can operate in a spectroscopic mode, whereby spatial resolution is sacrificed for a greater ability to determine photons' energies. Groups of four pixels are ganged together to form 110 µm square pixels, sharing each pixel's discriminators and counters between them. This gives each super pixel eight separate threshold levels and counters, which is sufficient to capture a detailed spectroscopic image. Additionally, Medipix3 can be read with less dead time than its predecessor and with multiple overlapping shutter signals. It is anticipated that the 130nm fabrication technology will be significantly more radiation hard than the 250nm technology used for Medipix2.
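The benefit of charge summing can be illustrated with a toy model in Python. It only mimics the behaviour described above (the summed charge of four neighbours is attributed to the pixel with the largest share) and is not a description of the chip's circuitry; the data layout, the energy values and the function names are assumptions.

```python
def single_pixel_hits(charges, threshold):
    """Pixels whose own charge passes the threshold (no charge summing)."""
    return [pixel for pixel, q in charges.items() if q >= threshold]

def charge_summing_hit(charges, threshold):
    """Attribute the summed charge of the group to its largest-signal pixel.

    charges maps the (row, column) of four neighbouring pixels to the
    charge each collected. Returns (pixel, total) or None if the summed
    charge is itself below threshold.
    """
    total = sum(charges.values())
    if total < threshold:
        return None
    winner = max(charges, key=charges.get)
    return winner, total

# An 8 keV photon shared between four pixels, with a 4 keV threshold.
shared = {(0, 0): 3.0, (0, 1): 2.5, (1, 0): 1.5, (1, 1): 1.0}
```

Here `single_pixel_hits(shared, 4.0)` is empty, so the photon is lost without charge summing, whereas `charge_summing_hit(shared, 4.0)` records one hit of 8 keV in pixel (0, 0).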

The first Medipix3 wafers were delivered at the beginning of 2009 and have been undergoing extensive testing in the intervening period. The charge summing and spectroscopic modes described above operate as expected and the pixel front end has been shown to operate with very low noise, the measured equivalent noise charge of a pixel being just ~60e- rms. This noise level was measured when running the chip in standard single pixel mode.

#### III. IRRADIATION STUDIES

The radiation tolerance of the Medipix3 readout chip is of interest to physicists working in HEP, at high intensity synchrotron x-ray sources and with the Medipix3 chip in commercial products. The studies that are reported in this section were carried out with a Seifert RP149[5] calibrated x-ray source. Whilst it is acknowledged that the effect of single point defects caused by photons is significantly less than that of defect clusters caused by hadrons, the very large flux of x-rays used means that some useful conclusions can be drawn even in comparison with hadronic irradiation. In these tests an unbonded Medipix3 chip was used to allow us to decouple the effects of sensor and readout chip irradiation.

Initially a single Medipix3 chip was exposed to 60Mrad of irradiation with the x-ray beam spot covering a majority of the pixel matrix. It was intended that the matrix would be read out continuously whilst the irradiation was in progress. This measurement demonstrated that one of the analogue voltage levels supplied to the pixels' front end by a Digital to Analogue Converter (DAC) was unexpectedly sensitive to irradiation at levels below 1Mrad. It was discovered that the design of several switches in the analogue section of the pixel left a leakage current path to ground through pairs of minimum sized NMOS transistors that were susceptible to radiation damage. At dose levels of 1Mrad the cumulative leakage current on this DAC across the pixel matrix was too large to sustain the required operating voltage. This voltage drop leads to the chip ceasing to function across the whole matrix, regardless of the level of irradiation the individual pixels have suffered. In addition, it was found that the electrostatic protection diode structure on each pixel was being damaged and degrading the performance of pixels on an individual basis, specifically increasing the noise in the pixel.

A second irradiation, with an integrated dose of 400Mrad, targeting the DAC and readout structures at the chip's periphery demonstrated that the effect on these structures was relatively small compared with that on the pixel matrix. It also showed that the effect on the performance was largest at approximately 3Mrad, as demonstrated in tests by F. Faccio, and that the performance of the DACs recovers after further irradiation. There was no measurable effect on the LVDS readout drivers or e-fuse identification logic. The 400Mrad beam spot overlapped the region of irradiation on the pixel matrix, giving a smaller region of pixels that received a dose of 460Mrad.

In order to further understand the leakage current problem on the pixel matrix a second chip was irradiated, this time in smaller steps of 100krad, up to the maximum damage level of 3Mrad. To reduce the total leakage current drawn by the chip, the x-ray spot was targeted at a corner, thus irradiating far fewer pixels. The chip operated up to a dose of 1500krad with a significant drift in the threshold value being recorded. The data from these measurements was used to determine the interplay of the voltages supplied to the analogue section of the pixels. By using this data to map the points where the voltage drop was causing switches to turn off, and by compensating by adjusting other balancing voltage levels, not affected by the radiation, it was possible to bring both chips back to an operating state very close to nominal.

By configuring the chip in this manner it was possible to read out the full matrices of both chips and take measurements of the increase in noise, gain and threshold variation with radiation by comparing the irradiated and unirradiated parts of the matrices.

#### IV. DAC STABILITY

As described above it was possible to read the values of the DACs continuously during the 400Mrad irradiation. Figure 1 shows their variation with the received radiation dose. These measurements clearly show the recovery effect after the 3Mrad level, with both types of DAC stabilizing as the dose becomes higher. The small step seen at the 400Mrad point is the immediate annealing effect as the x-ray tube was turned off. The rate of irradiation here was much faster than is expected in any realistic application and this immediate annealing would be a benefit in all expected applications. As can be seen from these results the NMOS and PMOS DACs have shifts of just 9mV and 33mV respectively, although the effect at 3Mrad is higher and in the case of the PMOS DAC in the opposite direction to the annealing.

Figure 1: The voltage produced by the NMOS and PMOS DACs between 0Mrad and 400Mrad.

#### **V.** PIXEL PERFORMANCE

Once the alternative operating point of the Medipix3 chips had been determined it was possible to operate the chips normally. This made it possible to measure the noise increase, gain variation and threshold stability by comparing irradiated and unirradiated parts of each pixel matrix.

The performance of the chip with the 60Mrad / 400Mrad / 460Mrad regions is shown in Figures 2 to 5.



Figure 2: The noise recorded across the pixel matrix. The dark region to the right is a yield artifact present before irradiation. The circular region centered on the matrix was irradiated to 60Mrad. The semicircular region centered on the bottom edge of the matrix was irradiated to 400Mrad.

The noise map shown in Figure 2 contains a yield artefact: as the chip was not expected to survive this x-ray dose, a perfect chip was not used. Once this has been accounted for, the mean noise across the matrix is 71.6e- with an uncertainty of 12.9e-. This is very close to the unirradiated value of 60e- and is well within operational parameters. The noise values for pixels in different irradiated regions along several columns are shown in Figure 3.



Figure 3: The noise as a function of row, showing 460Mrad (0 to 100), 60Mrad (100 to 200) and unirradiated (200 and above) regions of the chip.

By using the internal charge injection test circuitry to stimulate the analogue front end of the chips it is possible to measure the gain performance. It can be seen in Figure 4 that there is essentially no measurable gain variation with a 2ke- signal between irradiated and unirradiated pixels on the same matrix.



Figure 4: The noise peak and test pulse plateau shown for irradiated and non-irradiated pixels. It can be seen that the two lines completely overlap for the positive test pulse case, which replicates the nominal operating situation.

Very little increase in threshold variation can be seen in the threshold values achieved across the pixel matrix. The spread of threshold values is shown in Figure 5. The variations between the 0/60/400/460Mrad regions can be completely compensated for by the chip's five bit threshold equalisation circuitry that is designed to compensate for natural threshold variations between pixels.



Figure 5: The threshold variation across the pixel matrix.

The effects of irradiation to 3Mrad are slightly more pronounced than the effects of the higher irradiation levels; however, as before, the chips show gain, noise and threshold variations well within operational limits. The noise in the irradiated pixels is between 70 and 90e-, the gain variation with a 2ke- test pulse is still minimal, and the threshold variation is approximately 60 DAC steps and can be automatically equalised as before. No measurable increase in the analogue or digital currents drawn by the chip was observed.

#### VI. CONCLUSIONS

The results presented above provide confirmation that the Medipix3 chip and the 130nm CMOS technology are intrinsically radiation tolerant to levels that are several orders of magnitude higher than the 250nm fabrication technology. This has implications for the designs of future pixel detectors for sLHC and high intensity x-ray sources, and indicates that 130nm is a strong contender for their fabrication technology. It was intended to find an upper limit to the Medipix3 operation; however, the device seems to be operating well at 460Mrad and further measurements will be needed to find a breakdown point.

It should be noted that these measurements were carried out with an x-ray source and so should not be used for an accurate quantitative estimate of the effect of hadronic radiation on devices.

#### VII. ACKNOWLEDGEMENTS

The authors would like to thank F. Faccio for his advice and support during this program of measurements and the group at CTU Prague, for providing the Medipix3 readout system.

#### VIII. REFERENCES

- [1] X. Llopart, et al., *"Medipix3: a 64k pixel readout chip"*, iWoRiD, CPC, (2009).
- [2] L.Gonella et al., NIM A 582, 750-754 (2007).
- [3] F. Gianotti, Physics potential and experimental challenges of the LHC Luminosity upgrade, Available online as hep-ph/0204087
- [4] X. Llopart, et al., IEEE Trans. Nucl. Sci. NS-49 (5) (2002) 2279.
- [5] F. Faccio, et al., IEEE Trans. Nucl Sci. NS- 53 (2006) 2456.

## **Radiation tests on the complete system of the instrumentation of the LHC cryogenics at the CERN Neutrinos to Gran Sasso (CNGS) test facility.**

E. Gousiou, G. Fernandez Penacoba, J. Casas Cubillos, J. de la Gama Serrano

CERN, 1211 Geneva 23, Switzerland

#### [Evangelia.Gousiou@cern.ch](mailto:Evangelia.Gousiou@cern.ch)

#### *Abstract*

**There are more than 6000 electronic cards for the instrumentation of the LHC cryogenics, housed in crates and distributed around the 27 km tunnel. Cards and crates will be exposed to a complex radiation field during the 10 years of LHC operation. Rad-tol COTS and rad-hard ASICs have been selected and individually qualified during the design phase of the cards. The test setup and the acquired data presented in this paper target the qualitative assessment of the compliance of an assembled system with the LHC radiation environment. The assessment is carried out at the CNGS test facility, which provides exposure to an LHC-like radiation field.** 

#### I. THE CRYOGENIC INSTRUMENTATION ELECTRONICS

The cryogenic instrumentation electronics are placed all around the LHC tunnel and in protected areas.

Concerning the tunnel electronics, radiation has been a main constraint since the beginning of the design phase. Space or military technologies were incompatible with the budget of the project and instead, Commercial Off-The-Shelf (COTS) components were selected, qualified for operation under radiation and finally used [1, 2].

Conversely, the protected areas electronics were not designed to be radiation tolerant, because the radiation levels in the protected areas had been underestimated. Many of the components of the protected areas electronics are the same as the tunnel ones; nevertheless, there are several components for which no information exists on their performance under radiation.

The aim of the tests at the CNGS facility is to validate the complete systems (rather than individual components) in both cases: tunnel and protected areas electronics.

The cryogenic instrumentation electronics (for both the tunnel and the protected areas) are divided into conditioners, which measure temperature, pressure, liquid helium level and digital status, and actuator channels, which provide AC and DC power to the areas where helium needs to be heated up. Figure 1 shows the architecture of the system in the case of conditioner channels. A conditioner card holds two independent channels. Each channel has a front-end ASIC taking measurements on a sensor. The resulting waveform is sent to the ADC for digitization. A 16-bit word is then sent to the FPGA for the first stage of processing and the formatting of the data provided to the communication card. Up to 15 channels may be interfaced with the same communication card, which implements the WorldFIP protocol and places the data on the Fieldbus.



Figure 1: System architecture

The system offers very high accuracy, due to its auto calibrating features [1]:

- For each measurement on a sensor, there is a measurement on a high precision reference resistance which permits the correction of the gain drifts.
- The polarity of the input of the amplifier is inverted so as to correct its offset.
- Finally, the excitation current is applied in both directions in order to compensate for the thermocouple effects, as well as any dc offsets of the wiring.
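The three corrections listed above can be illustrated with a small numerical sketch. The signal model (a linear amplifier with a gain, an additive offset, and a constant thermal EMF in the wiring) and all numerical values are assumptions made for illustration; they are not taken from the card design.

```python
def raw_reading(resistance, current, polarity, gain, amp_offset, thermal_emf):
    """Simulated amplifier output for one acquisition (hypothetical error model)."""
    v_in = current * resistance + thermal_emf   # the thermal EMF does not follow the current sign
    return gain * polarity * v_in + amp_offset  # the amplifier offset does not follow the input polarity


def corrected_signal(resistance, current, **errors):
    """Combine four acquisitions so that amplifier offset and thermal EMF cancel."""
    def offset_free(i):
        # Inverting the amplifier input polarity cancels the amplifier offset.
        return (raw_reading(resistance, i, +1, **errors)
                - raw_reading(resistance, i, -1, **errors)) / 2
    # Reversing the excitation current cancels the thermal EMF.
    return (offset_free(+current) - offset_free(-current)) / 2


def measure_resistance(r_sensor, r_ref, current, **errors):
    """Taking the ratio to the reference measurement removes gain and current drifts."""
    return (r_ref * corrected_signal(r_sensor, current, **errors)
            / corrected_signal(r_ref, current, **errors))


# Hypothetical drifted gain, offset (V) and thermal EMF (V).
errors = dict(gain=103.7, amp_offset=2e-3, thermal_emf=5e-6)
estimate = measure_resistance(r_sensor=47.3, r_ref=100.0, current=1e-5, **errors)
print(f"estimated sensor resistance: {estimate:.6f} ohm")
```

In this simplified model the estimate does not depend on the gain, the offset or the thermal EMF, which is the point of combining the three techniques.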

#### II. RADIATION TOLERANCE STRATEGY

In the case of the LHC tunnel electronics, radiation was addressed in two main ways: a careful selection of components and a set of mitigation techniques [1].

#### *A. Components Selection*

- Customized development of a radiation hard front end ASIC and of a linear voltage regulator for power supplies and references.
- Use of anti-fuse FPGAs.
- Selection of a Fieldbus agent (implementing the WorldFIP protocol) that uses signal transformers instead of optical insulators.
- Qualification for operation under radiation of all the components in dedicated facilities [2,3].

#### *B. Mitigation Techniques*

- Triple modular redundancy is implemented on the FPGA registers.
- The weakest part of the data acquisition chain is an SRAM within the WorldFIP agent. Since SRAMs are usually prone to SEU, it is regularly refreshed in order to reduce the probability of an error.



Figure 2: Timing for data transfer between the different parts of the system

As Figure 2 indicates, the exchange of data between a conditioner card and its communication card takes place every second; the same timing is applied to the exchange between the communication card and the Fieldbus. Within the communication card, however, the SRAM is refreshed from the robust FPGA with a period of 20 ms.

- All the current supplies and the thermal dissipators are overdesigned.
- Finally, scheduled replacements during maintenance campaigns are foreseen where needed.
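As an illustration of the first two mitigation techniques, the sketch below shows a bitwise two-out-of-three vote and a periodic refresh of a vulnerable memory from a protected copy. It is a software analogy under simplified assumptions; the register width, the memory contents and the function names are hypothetical and do not describe the actual FPGA or WorldFIP agent logic.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three vote over three copies of a register."""
    return (a & b) | (a & c) | (b & c)


def refresh(memory: list, golden: list) -> None:
    """Overwrite the vulnerable memory with the protected copy (periodic refresh)."""
    memory[:] = golden


# Three copies of one 8-bit register; one copy suffers a single bit flip.
value = 0b10110010
copies = [value, value ^ 0b00001000, value]
voted = majority_vote(*copies)

# Memory contents are restored from the copy held in the protected logic.
golden = [0x12, 0x34, 0x56]
sram = [0x12, 0x34 ^ 0x01, 0x56]  # upset in one word
refresh(sram, golden)
```

A single upset in one register copy is outvoted, and an upset in the memory survives at most until the next refresh, which limits the time window in which corrupted data can be read.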

#### III. THE CNGS TEST FACILITY

The CNGS test facility is housed in the service gallery of the CNGS experiment [4]. The shower of particles escaping through the ducts that connect the main tunnel with the service gallery irradiates the Devices Under Test (DUT). The radiation levels depend on the position in the gallery (Figure 3). The radiation field is mixed (TID, NIEL and particles with $E > 20$ MeV simultaneously), as in the LHC. Since the field is wide and relatively homogeneous [5], testing complete systems becomes possible.



The facility provides:

- Several connections to the mains, protected with breakers.
- Real time radiation monitors and an online system for the data extraction.
- The WorldFIP communication.
- The possibility to transfer up to 96 signals from the DUT in the radiation area to the control room of CNGS at a distance of 2 km.

#### IV. THE TEST SETUP

#### *A. Devices Under Test*

Two crates were used to house all types of electronics (conditioners, actuators, communication and power cards), representing in total 50 channels of LHC tunnel and 16 channels of protected areas electronics.

To complete the setup, fixed loads were plugged into all the conditioners and, in the same way, fixed set points were given to all actuator channels, so that constant measurements are expected throughout the tests.

#### *B. Data Acquisition*

Two types of online data are acquired:

- The WorldFIP bus data, in exactly the same way as in the LHC.
- Current consumption and DC voltage level measurements. In order to gain access to those signals from the DUT, modifications needed to be made to the crate power supply card. Briefly, the power supply card receives the mains and provides the DC voltages required by all the cards in a crate. A 1 $\Omega$ resistance was inserted in series in the PCB tracks of the power card (Figure 4). The voltage drop across this shunt resistor provides an image of the current that can be read over the 2 km cables. A measurement set-up based on LabVIEW, a DMM and a switching module located in the control room of CNGS retrieves and stores these measurements.
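The shunt measurement described above can be sketched as follows. Only the 1 Ω shunt value comes from the text; the cable loop resistance, the DMM input resistance and the example reading are hypothetical values chosen to show why a voltage readout over long cables is only weakly affected by the cable.

```python
SHUNT_OHMS = 1.0  # series resistance quoted in the text


def current_from_shunt(v_drop: float, shunt_ohms: float = SHUNT_OHMS) -> float:
    """Current (A) through the shunt, from the measured voltage drop (V)."""
    return v_drop / shunt_ohms


def loading_error(cable_ohms: float, meter_ohms: float) -> float:
    """Fraction of the shunt voltage lost across the cable (simple voltage divider)."""
    return cable_ohms / (cable_ohms + meter_ohms)


CABLE_OHMS = 200.0  # hypothetical loop resistance of the 2 km cable
METER_OHMS = 10e6   # hypothetical DMM input resistance

v_read = 0.350      # hypothetical DMM reading in volts
current = current_from_shunt(v_read)
error = loading_error(CABLE_OHMS, METER_OHMS)
print(f"current = {current:.3f} A, relative loading error = {error:.1e}")
```

With a meter input resistance much larger than the cable resistance, almost no current flows in the readout cable, so the voltage seen in the control room is very close to the voltage across the shunt.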



Figure 4: Measurements on the power supply card

Figure 3: CNGS main tunnel and service gallery

#### *C. Testing Periods*

- The tests started with 1 month of dry run. During the first half of this period, the electronics were installed in the control room of CNGS and during the second half in the radiation area, in the same position as during the irradiation. This provided a clear confirmation of the reliability of the electronics and of the measurement system as well.
- The testing continued with 1.5 months in the low dose station of CNGS. Since it was the first time the complete system was tested, it was decided to start moderately in the low dose station. The radiation levels received during this period are given in Table 1:

Table 1: Radiation levels at the low dose station



Finally, the equipment was moved to the high dose station. The radiation levels received there in 1.5 months are given in Table 2.

Table 2: Radiation levels at the high dose station



#### V. RADIATION TEST RESULTS

#### *A. Tunnel Electronics*


The tests confirmed that the design of the tunnel electronics is well within the LHC radiation requirements. Until now, they have received in total 125 Gy and  $4 \cdot 10^{12}$ 1MeV eq. n/cm<sup>2</sup> and at the end of the test the levels are expected to reach 185 Gy and  $6 \cdot 10^{12}$  1MeV eq. n/cm<sup>2</sup>. No influence on the output accuracy has been noted in any of the 50 channels under test, nor any increase of the current consumption. Also, no SEE has been detected.

The extrapolation of those data to the LHC conditions [6, 7, 8], considering nominal operation, gives **more than 10 LHC years** (\*) for 95% of the cases. Regarding the remaining 5% (which represents electronics installed in the Dispersion Suppressor areas) the radiation tolerance in terms of nominal LHC years is currently estimated at  $2.5$  (\*) and the value is expected to increase by the end of the tests.
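The arithmetic behind such an extrapolation can be sketched as below, assuming that the lifetime is the tested level divided by the expected annual level, with the safety factor of 2 stated in the footnote. The annual levels used here are hypothetical placeholders and are not the values of references [6, 7, 8]; only the tested levels come from the text.

```python
def nominal_lhc_years(tested_level, annual_level, safety_factor=2.0):
    """Years of nominal operation covered by a tested level, with a safety factor."""
    return tested_level / (annual_level * safety_factor)


def lifetime(tested, annual, safety_factor=2.0):
    """The most restrictive quantity (TID or NIEL) sets the lifetime."""
    return min(nominal_lhc_years(tested[k], annual[k], safety_factor) for k in tested)


# Levels received so far by the tunnel electronics (from the text).
tested = {"tid_gy": 125.0, "niel_n_cm2": 4e12}
# Hypothetical annual levels at one location (placeholders, not from [6, 7, 8]).
annual = {"tid_gy": 10.0, "niel_n_cm2": 4e11}

years = lifetime(tested, annual)
print(f"nominal LHC years at this hypothetical location: {years:.2f}")
```

Because the tested electronics showed no failure, a figure obtained this way is a lower bound that grows as the test continues, which is consistent with the statement that the value is expected to increase by the end of the tests.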

Figure 5 shows the output of 5 different channels and Figure 6 focuses on one channel adding the design specs limits.



#### *B. Protected Areas Electronics*

#### *1) Insulated Temperature Conditioners*

There are around 2400 channels of this type of electronics in the protected areas of the LHC. During the tests at CNGS, two types of failures were encountered: failures due to cumulative effects (TID, NIEL) and SEUs.

#### *i. Cumulative Effects*

Twelve channels failed simultaneously after 70 Gy and  $2 \cdot 10^{12}$  1MeV eq. n/cm<sup>2</sup>. Since the radiation field is mixed, it is not possible to determine whether the TID or the NIEL is the main reason for the failure. Nevertheless, as the field at CNGS is LHC-like, TID and NIEL correspond to approximately the same number of LHC years. The extrapolation to the LHC conditions [9], considering nominal operation, gives for **94%** of the channels **more than 10 LHC years** (\*). For the remaining **6%** (which represents channels installed in the worst-case locations: UJ14, UJ16 and UJ56) the **nominal LHC years** are reduced to **4** (\*).

The failing component is a DC-DC converter. After its replacement, the channels were functional again.

#### *ii. SEU*

The SEU **cross section** estimated from the test results is $2 \cdot 10^{-9}$ cm<sup>2</sup>. The extrapolation to the LHC conditions [9], considering nominal operation and accounting for the total number of channels, gives **6 SEU/hr** (\*).
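The relation between a cross section and an upset rate can be sketched as follows, assuming the usual definitions (cross section = upsets per unit fluence per device, rate = cross section × flux × number of devices). The cross section and the approximate channel count come from the text; the hadron flux and the example test counts are hypothetical, and a single flux for all channels is a simplification.

```python
def cross_section(n_upsets, fluence_per_cm2, n_devices=1):
    """SEU cross section (cm^2) per device from a count of upsets at a known fluence."""
    return n_upsets / (fluence_per_cm2 * n_devices)


def upset_rate_per_hour(sigma_cm2, flux_per_cm2_hour, n_channels, safety_factor=2.0):
    """Expected upsets per hour over all channels, including a safety factor."""
    return sigma_cm2 * flux_per_cm2_hour * n_channels * safety_factor


SIGMA_CM2 = 2e-9   # cross section quoted in the text
N_CHANNELS = 2400  # approximate channel count quoted in the text
FLUX = 1e5         # hypothetical flux, hadrons / cm^2 / hour

rate = upset_rate_per_hour(SIGMA_CM2, FLUX, N_CHANNELS)
print(f"expected rate for the hypothetical flux: {rate:.2f} SEU/hr")

# Hypothetical test outcome: 20 upsets on one device after 1e10 hadrons/cm^2.
sigma_example = cross_section(20, 1e10)
```

The rate scales linearly with the flux, so the same function gives a correspondingly lower rate for operating periods with reduced radiation levels.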

The implementation of a mitigation technique is already in progress and consists of a software reset to be automatically launched by the control system.

The appearance of a SEU is illustrated in the following figures:



Figure 7: Insulated temperature channel in normal operation



Figure 8: Insulated temperature channel when a SEU occurs

#### *2) AC Heater Actuator*

The AC heater actuators represent less than 0.5% (45 channels in total) of the cryogenic instrumentation electronics and are only found in protected areas. They receive the mains and a set point and, by means of a solid state relay, provide a Pulse Width Modulation of the mains to a heater.



Figure 9: AC heater actuator channel

Three AC heater channels failed in the low dose radiation station after exposure to 5 Gy and  $7 \cdot 10^{10}$  1MeV eq. n/cm<sup>2</sup>.

The failing component was the solid state relay, which functions with optocouplers. When it was replaced the cards were functional again. Figure 10 shows the three channels failing almost simultaneously.



The same results were later reproduced with four more channels in the high dose radiation station.

Considering nominal LHC operation [9] for **65%** of the channels, we get **more than 10 LHC years** (\*). In the worst-case locations (UJ14, UJ16 and UJ56, where **20%** of the channels are installed) the **nominal LHC years** are reduced to **0.3** (\*). However, considering the 2009/10 LHC operation (where the expected radiation levels are two orders of magnitude lower), the expected lifetime increases by two orders of magnitude. Finally, as many commercial components are already installed in the worst-case areas, a study is under way on either a relocation or additional shielding; this will also benefit the LHC cryogenic electronics.

#### VI. CONCLUSIONS

The tests at CNGS have provided qualitative and quantitative knowledge on the radiation tolerance of the complete system of the LHC cryogenics instrumentation. The reliability of the tunnel electronics has been confirmed, whereas the weaknesses of the protected areas electronics have been revealed. In the second case, different techniques for addressing the problems are already being implemented.

#### VII. REFERENCES

[1] J. Casas et al., "The Radiation Tolerant Electronics for the LHC Cryogenic Controls: Basic Design and First Operational Experience", [Topical Workshop on Electronics for Particle Physics](http://cdsweb.cern.ch/search?sysno=002761921CER), Greece, 15 - 19 Sep 2008, pp.195-199.

[2] J. A. Agapito et al., "RAD-TOL Field Electronics for the LHC Cryogenic System", [7th European Conference On Radiation And Its Effects On Components And Systems,](http://cdsweb.cern.ch/search?sysno=002404503CER) Noordwijk, The Netherlands, 15 - 20 Sep 2003, pp.653-7.

[3] J. Casas et al., "SEU Tests Performed on the Digital Communication System for LHC Cryogenic Instrumentation", [Nucl. Instrum. Methods Phys. Res., A 485, 3 \(2002\) 439-43](http://cdsweb.cern.ch/ejournals.py?publication=Nucl.+Instrum.+Methods+Phys.+Res.,+A&volume=485&year=2002&page=439).

[4] K. Elsener, "General description of the CERN project for a neutrino beam to Gran Sasso (CNGS)", CERN AC Note, 2000-03.

[5] M. Brugger, "FLUKA CNGS Radiation Levels @ RadMonLocation", RadWG meeting, 24/04/2009.

[6] C. Fynbo, G. Stevenson, "Compendium of annual doses in the LHC arcs", LHC Project Note 251, 4 April 2001.

[7] C. Fynbo, G. Stevenson, "Radiation Environment in the Dispersion Suppressor regions of IR1 and IR5 of the LHC", LHC Project Note 296, 27 May 2002.

[8] C. Fynbo, G. Stevenson, "Estimation of extra dose contribution in the LHC arcs arising from proton losses far downstream of the high luminosity interaction points IP1 & IP5", LHC Project Note 295, 27 May 2001.

[9] Radiation-To-Electronic Study Group (R2E), [https://ab-div.web.cern.ch/ab-div/Meetings/r2e](https://ab-div.web.cern.ch/ab-div/Meetings/r2e)

\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

**(\*)** A safety factor of 2 has been applied.

Figure 10: Three AC heater channels failing at the low dose station

### Development of new readout electronics for the ATLAS LAr Calorimeter at the sLHC

A. Straessner<sup>a</sup>

on behalf of the ATLAS Liquid Argon Calorimeter Group

<sup>a</sup> Technische Universität Dresden, D-01062 Dresden, Germany

Arno.Straessner@cern.ch

#### *Abstract*

The readout of the ATLAS Liquid Argon (LAr) calorimeter is a complex multi-channel system to amplify, shape, digitize and process signals of the detector cells. The current on-detector electronics is not designed to sustain the ten times higher radiation levels expected at sLHC in the years beyond 2019/2020, and will be replaced by new electronics with a completely different readout scheme.

The future on-detector electronics is planned to send out all data continuously at each bunch crossing, as opposed to the current system which only transfers data at a trigger-accept signal. Multiple high-speed and radiation-resistant optical links will transmit 100 Gb/s per front-end board. The off-detector processing units will not only process the data in real-time and provide digital data buffering, but will also implement trigger algorithms.

An overview of the various components necessary to develop such a complex system is given. The current R&D activities and architectural studies of the LAr Calorimeter group are presented, in particular the on-going design of the mixed-signal and radiation tolerant front-end ASICs, the Silicon-on-Sapphire based optical link, the high-speed off-detector FPGA based processing units, and the power distribution scheme.

#### I. INTRODUCTION

The Liquid Argon (LAr) calorimeters of the ATLAS experiment [1] at the Large Hadron Collider (LHC) [2] consist of 182486 detector cells whose signals need to be read out, digitized and processed. For each detector element, the signal timing and energy deposit are determined. A total of 1524 Front-End Boards (FEB) [3] are installed directly on the detector in a radiation environment. In the current system, each of them performs the pre-amplification, shaping and gain-selection, analog buffering and digitization of up to 128 input channels. The analog sampling rate of the Switched Capacitor Array (SCA) of the FEB is 40 MHz, while digitization is performed at up to 100 kHz, after reception of a Level-1 trigger accept signal via the Trigger, Timing and Control (TTC) system. The FEBs also calculate analog sums of up to 32 channels as input to the Level-1 trigger system. The digitized output is transferred with optical links at 1.6 Gb/s per FEB to the 192 Readout Driver (ROD) boards of the back-end system [4]. The ROD implements digital FIR filters on Digital Signal Processors (DSP) to prepare the data for the higher-level trigger and data acquisition (DAQ) systems. The overall architecture of the current system is shown in Figure 1.



Figure 1: Architecture of the front-end electronics of the current LAr Calorimeter readout.

The main challenge for the design of the front-end system is the radiation in the ATLAS cavern. At nominal LHC operation, with a design luminosity of  $10^{34}$  cm<sup>-2</sup>s<sup>-1</sup> and after 10 years of operation, a Total Ionising Dose (TID) of 5 krad(Si), a Non-Ionising Energy Loss (NIEL) equivalent to  $1.6 \times 10^{12}$  n/cm<sup>2</sup> (1 MeV neutrons) and Single-Event Effects (SEE) from  $7.7 \times 10^{11}$  hadrons/cm<sup>2</sup> (> 20 MeV) are expected [5]. In the FEB, 11 types of ASICs with different kinds of radiation tolerant technologies, like DMILL and deep-submicron (0.25  $\mu$ m), are used. They are qualified to function with sufficient performance after these radiation levels, including safety factors of 10-30.

The super-LHC (sLHC) upgrade foresees an increase of instantaneous luminosities up to  $10^{35}$  cm<sup>-2</sup>s<sup>-1</sup> in the years beyond 2019/2020 and a prolonged operation of the accelerator and the detectors. The current FEB electronics is therefore expected to fail or to be seriously degraded during the sLHC phase [5]. Since only about 6% of spare boards and components are available, a continuous replacement of FEB boards or components is not feasible during sLHC running. It is therefore required to design new front-end electronics for the ATLAS LAr Calorimeters and to develop ASICs in more radiation-hard technology. Along with the radiation requirements, some of the design aspects of the current FEB are planned to be improved. The number of ASICs should be reduced, as well as the number of different voltage levels, currently supplied by 19 voltage regulators on board. The voltage distribution system with 58 Low-Voltage Power Supplies (LVPS) is also foreseen to be replaced. The total power consumption per FEB should however not be increased.

The new design gives the opportunity to implement a globally better performing readout system. The Level-1 trigger system should be able to cope with a more challenging scenario, with a higher trigger accept rate and a longer trigger latency. The latter may be necessary if trigger signals from the Inner Detectors of ATLAS, with a possibly longer readout time, are included in the trigger decision. Furthermore, it is foreseen to digitize all incoming data at 40 MS/s while keeping the current effective dynamic range of 16 bits. The data shall be transferred by fast optical links to the back-end system. Since each FEB produces about 100 Gb/s of data, multi-fibre links are envisaged. Still, each individual optical link must be able to operate at about 10 Gb/s and be radiation tolerant.

The large data rates are also a challenge for the back-end system. The ROD boards of the sLHC generation are planned to treat input from one front-end crate housing 14 FEBs, which corresponds to an input rate of 1.4 Tb/s. Signal processing must proceed within a short latency in the order of  $1 \mu s$ , since the digital output is foreseen to be fed via the RODs into the Level-1 trigger system. Thus, the hardware trigger will receive digital data with higher granularity which introduces a larger flexibility in the implementation of trigger algorithms. These usually sum up the energy of a given number of calorimeter cells but may also perform more complicated operations. Since the algorithms will be programmable and adjustable to sLHC running conditions, a better optimisation of the suppression of pile-up signals is possible, whose rate is expected to increase by up to a factor of 20 at the sLHC compared to nominal LHC rates. The data pipelines will be implemented in fast digital memory on the ROD board until the Level-1 trigger decision arrives and in dedicated Readout Buffers (ROB) for the higher level triggers and DAQ.
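The quoted data rates can be cross-checked with a short calculation. The sketch assumes that a future FEB keeps the 128 channels of the current board and ships one 16-bit word per channel and sample; both are assumptions, and the difference between the raw payload and the quoted 100 Gb/s is not broken down in the text.

```python
import math

CHANNELS_PER_FEB = 128        # as in the current FEB (assumed unchanged)
SAMPLE_RATE_HZ = 40_000_000   # 40 MS/s, from the text
BITS_PER_SAMPLE = 16          # assumed: effective dynamic range sent as one word

FEB_RATE_BPS = 100e9          # "about 100 Gb/s" per FEB, from the text
LINK_RATE_BPS = 10e9          # "about 10 Gb/s" per optical link, from the text
FEBS_PER_CRATE = 14           # from the text

raw_bits_per_s = CHANNELS_PER_FEB * SAMPLE_RATE_HZ * BITS_PER_SAMPLE
links_per_feb = math.ceil(FEB_RATE_BPS / LINK_RATE_BPS)
crate_rate_bps = FEBS_PER_CRATE * FEB_RATE_BPS

print(f"raw payload per FEB : {raw_bits_per_s / 1e9:.2f} Gb/s")
print(f"links per FEB       : {links_per_feb}")
print(f"ROD input per crate : {crate_rate_bps / 1e12:.1f} Tb/s")
```

Under these assumptions the raw payload is about 82 Gb/s per FEB, which is of the same order as the quoted 100 Gb/s, and 14 FEBs at 100 Gb/s reproduce the 1.4 Tb/s ROD input rate.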



Figure 2: Layout of the front-end prototype with pre-amplifier, dualgain shaper, gain selector (GSEL), analog-to-digital converter (ADC), multiplexer (MUX) and serializer. The analog trigger sums provide a possible interface to the current trigger electronics.

#### II. DEVELOPMENT OF RADIATION TOLERANT FRONT-END ELECTRONICS

The main components of the future FEB boards are an analog front-end for signal pre-amplification and shaping, an Analog-to-Digital Converter (ADC) with serial output, and a fast optical link. At the sLHC, they have to withstand a TID of 300 krad(Si) and a NIEL equivalent to  $10^{13}$  n/cm<sup>2</sup> (1 MeV neutrons). The current baseline design is shown in Figure 2. Results from prototypes are presented in the following sections.

#### *A. Pre-amplifier and Shaping*

The Silicon-Germanium (SiGe) BiCMOS technology is known for low noise and fast shaping times even after irradiation with high dose levels. The 8WL 0.13  $\mu$ m process of IBM is chosen to implement prototypes for the LAr pre-amplifier and shaping stages. It is a relatively economic option with  $f_T = 100$ GHz for fast ASIC solutions. The pre-amplifier is based on a "super common base" architecture like the one installed in the present FEBs. It achieves an overall equivalent series noise of  $25 \text{ nV}/\sqrt{\text{Hz}}$  and dissipates only 42 mW [6]. The fully differential shaping stage is split into two gain stages ( $\times$ 1 and  $\times$ 10), each consuming 100 mW. A bipolar  $CR - (RC)^2$  shaping is chosen, like in the current FEB. Including second stage noise, the front-end has an input-referred noise of ENI=72 nA (RMS), about 28% lower than that of the pre-amplifier currently used. With the prototype a linearity of better than 0.2% is achieved over the full dynamic range [7]. The peaking time of the shaper is measured to be about 37 ns when a triangular pulse with 20 ns rise and 400 ns fall time is injected, as expected for a typical physics pulse. This value could be further optimized to find the best compromise between electronic and pile-up noise suppression, which, respectively, decrease and increase with longer shaping times. The layout of the front-end ASIC is shown in Figure 3.
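The bipolar shape produced by a $CR - (RC)^2$ filter can be illustrated with the response of an ideal filter to a delta-like current pulse. The sketch assumes the textbook transfer function $H(s) = s\tau/(1+s\tau)^3$ with equal time constants, and the value of $\tau$ is hypothetical. It does not reproduce the 37 ns peaking time quoted above, which refers to a triangular input pulse and would require an additional convolution.

```python
import math


def cr_rc2_impulse_response(t, tau):
    """Delta-pulse response of an ideal CR-(RC)^2 filter, H(s) = s*tau / (1 + s*tau)**3."""
    if t < 0:
        return 0.0
    x = t / tau
    return (x - 0.5 * x * x) * math.exp(-x) / tau


TAU_NS = 15.0  # hypothetical shaping time constant in ns

# Scan 0..100 ns in 1 ps steps and locate the positive peak.
times = [i * 0.001 for i in range(100001)]
peak_time = max(times, key=lambda t: cr_rc2_impulse_response(t, TAU_NS))

print(f"peak of the delta response : {peak_time:.3f} ns")
print(f"analytic value (2-sqrt2)tau: {(2 - math.sqrt(2)) * TAU_NS:.3f} ns")
print(f"zero crossing at 2*tau     : {2 * TAU_NS:.1f} ns")
```

The response is positive up to $t = 2\tau$ and negative afterwards, which is the bipolar behaviour referred to in the text; a longer $\tau$ stretches both lobes.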



Figure 3: Layout of the LAr analog front-end ASIC design.

The analog front-end was also tested after irradiation. First, a SiGe test chip was irradiated with gamma rays up to 50 Mrad(Si) and to 1 MeV neutron equivalent fluences of up to  $2 \times 10^{15}$  n/cm<sup>2</sup>. The very high doses were chosen because the SiGe structures were tested with both high-performance transistors used in the ATLAS silicon tracker read-out and with high-breakdown transistors for LAr applications. The reciprocal gain difference before and after radiation increases linearly with increasing neutron-equivalent dose as expected, with an indication of saturation at highest doses. The same quantity shows a typical power-law dependence for gamma irradiation. The post-radiation gains were measured to stay above 40-50 over the dose ranges tested [6], as shown in Figure 4. SiGe BiCMOS is thus well suited for the LAr front-end electronics. Radiation tests with a full prototype of the LAr pre-amplifier and shaper were performed very recently and are reported elsewhere in these proceedings [7].
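The linear dependence of the reciprocal gain difference on neutron fluence can be written as $1/\beta_{\text{post}} - 1/\beta_{\text{pre}} = K\,\Phi$. The sketch below evaluates this relation; the initial gain and the damage constant $K$ are hypothetical values chosen for illustration, and the saturation seen at the highest doses is not modelled.

```python
def gain_after_fluence(beta_initial, damage_constant, fluence):
    """Post-irradiation current gain from 1/beta_post = 1/beta_pre + K * fluence."""
    return 1.0 / (1.0 / beta_initial + damage_constant * fluence)


BETA_INITIAL = 200.0      # hypothetical pre-irradiation gain
DAMAGE_CONSTANT = 5e-18   # hypothetical K in cm^2 per neutron
FLUENCE = 2e15            # highest fluence quoted in the text, n/cm^2

beta_post = gain_after_fluence(BETA_INITIAL, DAMAGE_CONSTANT, FLUENCE)
print(f"gain after irradiation (hypothetical parameters): {beta_post:.1f}")
```

In this model the gain loss is dominated by the term $K\,\Phi$ once it exceeds $1/\beta_{\text{pre}}$, so transistors with different initial gains converge to similar values at high fluence.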



Figure 4: Gain of 8WL transistors after neutron irradiation. Cadmium shielding was used in order to avoid excess damage from thermal neutrons.

#### *B. Mixed signal front-end ADC*

The output signals of the pre-amplification stage need to be converted into digital signals at a sampling rate of 40 MS/s. This rate was originally fixed by the LHC bunch crossing rate but will also be kept for the sLHC stage even if the upgraded accelerator operates with longer crossing intervals of 50 ns [8]. The ADC must provide 12 bit resolution in order to cover the full range of interesting energy deposits in the calorimeters from 20 MeV to 3 TeV with 15/16 bit dynamic range. Radiation tolerance and immunity to single event effects (SEE) are further requirements. Furthermore, the ADC output needs to be serialized to match the interface to the subsequent optical link component. Previously developed ADCs with similar performance of 12 bit at 40 MS/s [9] thus do not meet all requirements.

The R&D activities follow two strategies: evaluation of commercial off-the-shelf components advertised as being radiation tolerant, like AD9259, ST-RHF1201 and TI-ADS5281, and development of a custom ADC chip based on CMOS technology. The IBM 8RF 0.13  $\mu$ m CMOS technology was shown to be sufficiently radiation hard and is available at lower cost than an implementation in SiGe. The 12-bit pipeline ADC is composed of 8 stages of 1.5 bit resolution with digital error correction, which requires calibration constants to be stored in radiation hard memory. The main building block of the ADC is an operational transconductance amplifier (OTA), which is at the core of the sample-and-hold (S/H) and multiplying DAC subsystems of each digitisation stage. A sampling capacitance of 1 pF is chosen in the fast S/H stage to reduce the electronic noise, which should stay below 150  $\mu$ V in total. A test chiplet was submitted via CERN to the MOSIS foundry with an OTA structure and a cascade of two track-and-hold stages. The chiplet is currently in production and expected back for tests by late autumn 2009.
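The principle of a pipeline of 1.5-bit stages with digital error correction can be sketched with an idealised behavioural model. This is the generic textbook scheme (comparator thresholds at $\pm V_{\text{ref}}/4$, residue gain of 2) and not the design of the chip described above; the calibration constants, any back-end stage and all analog imperfections other than a comparator offset are left out, and the function name is hypothetical.

```python
def pipeline_convert(v_in, n_stages=8, comparator_offset=0.0):
    """Convert an input in [-1, 1] (units of Vref) with ideal 1.5-bit stages.

    Returns the list of stage decisions (-1, 0, +1) and the reconstructed value.
    """
    residue = v_in
    decisions = []
    for _ in range(n_stages):
        if residue > 0.25 + comparator_offset:
            d = 1
        elif residue < -0.25 + comparator_offset:
            d = -1
        else:
            d = 0
        decisions.append(d)
        # Multiplying DAC: amplify the residue by 2 and subtract d * Vref.
        residue = 2.0 * residue - d
    # Digital combination: the weight of each stage is half that of the previous one.
    estimate = sum(d / 2 ** (i + 1) for i, d in enumerate(decisions))
    return decisions, estimate


ideal_decisions, ideal_estimate = pipeline_convert(0.3)
shifted_decisions, shifted_estimate = pipeline_convert(0.3, comparator_offset=0.1)
print(ideal_decisions, ideal_estimate)
print(shifted_decisions, shifted_estimate)
```

The redundancy of the three-level decision means that a comparator offset smaller than $V_{\text{ref}}/4$ changes the individual stage decisions but not the reconstructed value, which stays within $2^{-n}$ of the input for $n$ stages.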

Digital tests of the ADC output stage were performed using commercial non-radiation hard components. Figure 5 shows the test setup. Test signals were fed into a  $4 \times 14$ -bit ADC block. The digitized signals were input to  $8 \times 64$ -bit DRAM, where they were combined with an 8-bit bunch counter. The signals were timed with a 40 MHz clock, similar to the TTC system of ATLAS. The subsequent multiplexer received  $4 \times 16$  bit at 40 MHz, converting the data to a 16 bit data stream at 160 MHz. The final serializer applied an 8b/10b encoding and was driving a 3.2 GHz optical link, at whose end a data receiver board measured the consistency of the data. For the high-speed components a 160 MHz crystal derived clock was used. The control logic of the system was based on Gray codes to be less sensitive to SEE. The DAQ chain was tested successfully and can be used to develop further concepts to reduce sensitivity to radiation damage.
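Two aspects of this test chain can be illustrated in a few lines: the line rate after 8b/10b encoding, and the property of a Gray code that makes it attractive for control logic. The sketch uses the standard binary-reflected Gray code; how the codes are actually used in the control logic is not described in the text, so this is only an illustration of the principle.

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Inverse of to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# Successive Gray codes differ in exactly one bit.
bit_changes = [bin(to_gray(i) ^ to_gray(i + 1)).count("1") for i in range(255)]

# Line rate of the test link: 16-bit words at 160 MHz, 10 line bits per 8 data bits.
payload_bps = 16 * 160e6
line_rate_bps = payload_bps * 10 / 8
print(f"payload {payload_bps / 1e9:.2f} Gb/s, line rate {line_rate_bps / 1e9:.1f} Gb/s")
```

The 2.56 Gb/s payload grows to 3.2 Gb/s after 8b/10b encoding, which matches the quoted speed of the optical link. With a Gray-coded state sequence, only one bit changes per transition, so a state register is never read in an intermediate multi-bit state.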



Figure 5: Test setup for the digital ADC logic.

#### *C. Radiation tolerant optical links*

A challenging project is the development of the very fast optical link, which needs to perform at 10 Gb/s and at the same time be insensitive to radiation. To achieve these requirements, Silicon-on-Sapphire (SoS) technology was selected. The  $0.25 \mu$ m UltraCMOS process provided by Peregrine Semiconductors promises low power consumption and low cross-talk needed for the mixed-signal ASIC design. It is relatively economical for small and medium scale chip development. In 2007, first TID and SEE radiation tests were performed [10]. After irradiation with gamma rays from a  ${}^{60}Co$  source up to 4 Mrad, only small leakage currents of about 250 nA and small threshold voltage increases of about 0.1 V and below were measured for both NMOS and PMOS transistors. When exposing the chiplet to a proton beam with an energy of 230 GeV, no SEE was observed in the shift registers at a flux of  $7.7 \times 10^8$  protons/cm<sup>2</sup>/s and they were still correctly functioning after total fluences of  $1.9 \times 10^{15}$  protons/cm<sup>2</sup>, which corresponds to 106 Mrad(Si).

A first prototype of the so-called Link-on-Chip (LOC) suffered from high jitter, which is expected to be overcome in the most recent design. The main building blocks of the new LOC2 transceiver are a 16:1 serializer with a CML driver running at 5 Gb/s. Eventually, the conversion to an optical signal is planned to be performed by XFP/SFP+ or Versatile Link [11] modules. The user data and clock are interfaced to an I/O buffer with 64b/66b encoding, scrambling and possibly data compression. For the prototype, the I/O buffer will be implemented in a standard FPGA with a 16-bit LVDS signal bus at the output to the serializer stage. The serializer is composed of three stages of 2:1 multiplexers running at 312.5 MHz, 625 MHz and 1.25 GHz, respectively. The last and most critical serialization step is implemented in the form of two fast transmission-gate D-flip-flops operating at 2.5 GHz. The post-layout simulations of the corresponding 2.5 GHz PLL and of all other components show that the LOC2 requirements are met. In particular, a bit error rate lower than  $10^{-12}$  is achieved, and the total jitter is of the order of 35 ps using an ideal power source. The power consumption is below 500 mW, or less than 100 mW per Gb/s. The LOC2 chip has been submitted to the foundry and measurement results are expected soon. The layout is shown in Figure 6. In an effort towards an even higher data rate, a 5 GHz LC-tank based PLL is also being designed. In preliminary simulations, random jitter below 1 ps (RMS) is observed. This component would be needed to reach the ultimate goal of a transceiver operating at 10 Gb/s.

Figure 6: Layout of the LOC2 test structure.
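The stage clocks quoted for the LOC2 serializer follow from halving the bus width at each 2:1 step. The following sketch reproduces that chain; it assumes that each stage is clocked at the rate of its parallel input lanes, and the function names are illustrative rather than taken from the design.

```python
def serializer_stages(input_width, line_rate_bps):
    """List (lanes_in, lanes_out, lane_rate_hz) for each 2:1 stage."""
    stages = []
    width = input_width
    while width > 1:
        lane_rate_hz = line_rate_bps / width   # rate of each input lane
        stages.append((width, width // 2, lane_rate_hz))
        width //= 2
    return stages


def power_per_gbps_mw(total_power_mw, line_rate_bps):
    """Power figure of merit in mW per Gb/s."""
    return total_power_mw / (line_rate_bps / 1e9)


stages = serializer_stages(16, 5e9)
for lanes_in, lanes_out, rate in stages:
    print(f"{lanes_in:2d}:{lanes_out:<2d} stage at {rate / 1e6:7.1f} MHz")

print(power_per_gbps_mw(500.0, 5e9), "mW per Gb/s (upper bound)")
```

With a 16-bit input and a 5 Gb/s output, the first three stages come out at 312.5 MHz, 625 MHz and 1.25 GHz, and the final 2:1 step at 2.5 GHz, as stated in the text. The 500 mW bound likewise gives 100 mW per Gb/s.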

#### *D. Power Distribution System*

The power supply scheme of the current LAr front-end electronics converts 380 V AC into 280 V DC, which is distributed to the LVPS close to the on-detector front-end crates. Each water-cooled LVPS consists of eight isolated switching DC-DC converters, which need to run in a radiation environment and in a significant residual magnetic field of up to 100 mT. In the new powering concept, the number of different voltages should be reduced and the total power consumption is limited to the current level. Furthermore, single points of failure should be avoided. In the upgrade design, point-of-load (POL) regulators are foreseen to perform DC-DC conversion close to the FEBs. In the Distributed Power Architecture, a main converter generates a single voltage on a distribution bus to which the POLs are connected. In an Intermediate Bus Architecture, a second set of bus voltages is provided from the main bus, and lower voltages are then supplied by the POL converters. Two commercial POLs (LTM4602 and IR3841) were tested for EMI sensitivity in different positions inside and outside the front-end crate. The outcome was that shielding is necessary if the POLs are placed inside the crate, on the back side of the FEBs. Radiation tests will be performed. Results on other commercial DC-DC converters are reported elsewhere [12].

#### III. HIGH BANDWIDTH BACK-END ELECTRONICS

In the R&D baseline layout, 14 FEBs are to be connected to one ROD, so that in total about 218 ROD boards will be needed. This assumes a continuous data stream at 100 Gb/s per FEB link, or 150 Tb/s in total for the whole back-end system. The possible architecture of the upgraded LAr back-end is shown in Figure 7. Modern fiber connectors in MPO/MTP style already combine 12 fibers, so that the transmission rate per link is feasible provided that the radiation-hard front-end link performs at 10 Gb/s. Ways to reduce the number of links are being evaluated, such as lossless data compression/decompression algorithms with ultra-short latency or a reduction of the number of bits per ADC. The latter is only an option if the effect on physics results is negligible.



Figure 7: Possible back-end architecture of the upgraded LAr read-out.
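A short calculation makes the fiber budget behind these numbers explicit. This is a sketch using only the rates quoted in the text; the helper name is hypothetical and no protocol overhead is included.

```python
import math

FEB_LINK_GBPS = 100        # continuous data stream per FEB link
FIBRE_GBPS = 10            # target rate of one radiation-hard optical link
FIBRES_PER_CONNECTOR = 12  # MPO/MTP style connector
TOTAL_TBPS = 150           # quoted total for the back-end system


def fibres_needed(link_gbps, fibre_gbps):
    """Number of fibers required to carry one FEB link."""
    return math.ceil(link_gbps / fibre_gbps)


n_fibres = fibres_needed(FEB_LINK_GBPS, FIBRE_GBPS)
fits_one_connector = n_fibres <= FIBRES_PER_CONNECTOR
implied_feb_links = TOTAL_TBPS * 1000 // FEB_LINK_GBPS

print(n_fibres, fits_one_connector, implied_feb_links)
```

At 10 Gb/s per fiber, ten fibers carry one FEB link, so a single 12-fiber connector is sufficient. The quoted total of 150 Tb/s corresponds to roughly 1500 such links.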

In the ROD, FPGA-based SERDES are used to receive the data. Digital signal processing in modern FPGAs is provided by a large number of DSP slices per module. The ROD implements a digital FIR filter to determine the pulse height and signal timing from a given number of sampling points. It should furthermore be capable of aligning the signals in time and identifying the bunch crossing for the subsequent trigger algorithms. It must eventually perform a summing of signals from neighbouring cells to reduce the data size before transfer to the Level-1 trigger system. The incoming data will also have to be buffered in fast digital memory on the ROD board until the Level-1 trigger decision arrives. A Readout Buffer (ROB), which is accessed by the higher-level triggers, must be implemented either on the ROD or on a separate board inside the ROD shelf system.
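The way a FIR filter can return both pulse height and timing from a few samples may be illustrated with a simplified optimal-filter sketch. It is not the ROD firmware: it assumes white noise, a small time shift so that the samples are well described by $s_i = A\,(g_i - \tau g'_i)$, and an invented five-sample pulse shape `g` with derivative `dg`.

```python
def filter_coefficients(g, dg):
    """Two FIR coefficient sets for white noise.

    a satisfies sum(a*g) = 1 and sum(a*dg) = 0  -> amplitude A
    b satisfies sum(b*g) = 0 and sum(b*dg) = -1 -> product A*tau
    """
    gg = sum(x * x for x in g)
    gd = sum(x * y for x, y in zip(g, dg))
    dd = sum(y * y for y in dg)
    det = gg * dd - gd * gd
    a = [(dd * x - gd * y) / det for x, y in zip(g, dg)]
    b = [(gd * x - gg * y) / det for x, y in zip(g, dg)]
    return a, b


def fir(samples, coeffs):
    """One FIR output: weighted sum of the samples."""
    return sum(c * s for c, s in zip(coeffs, samples))


# Hypothetical normalised pulse shape and its derivative (per ns).
g = [0.0, 0.6, 1.0, 0.7, 0.3]
dg = [0.05, 0.04, 0.0, -0.03, -0.02]
a, b = filter_coefficients(g, dg)

# Simulated noiseless samples: amplitude 250 counts, shifted by 1.5 ns.
true_amp, true_tau = 250.0, 1.5
samples = [true_amp * (x - true_tau * y) for x, y in zip(g, dg)]

amplitude = fir(samples, a)
tau = fir(samples, b) / amplitude
print(round(amplitude, 6), round(tau, 6))
```

Each output is a plain multiply-accumulate over the samples, which is the kind of operation that maps onto the DSP slices mentioned above. A realistic implementation would also weight the coefficients by the noise autocorrelation.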

Prototype ROD boards are built based on the Xilinx Virtex-5 FPGA (XC5VFX70T) and 75 Gb/s fiber-optic transceivers from Reflex Photonics with SNAP12 connectors. A ROD injector board exploring the Altera Stratix II GX FPGA provided the pseudorandom data source. Both are shown in Figure 8. With this test setup, a rate of 6.5 Gb/s per fibre was found to be feasible. The FIR filter and energy sums were successfully implemented. Timing alignment still needs to be studied. A total data processing latency below 1  $\mu$s was found to be achievable using parallel DSP slices on the FPGA, taking into account the fiber length of about 70 m between front-end and back-end. This is a first step towards a fully digital Level-1 trigger. More R&D is however needed to properly design the interface to the trigger system and to evaluate the implementation of possibly new algorithms that can profit from the higher granularity of the physics signals.



Figure 8: ROD injector board (left) and ROD prototype in ATCA format (right).
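To see how much of the 1 µs latency budget the 70 m fiber run consumes, the propagation delay can be estimated as below. The group index of 1.47 is an assumed typical value for silica fiber, not a number from this work, and the names are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
GROUP_INDEX = 1.47          # assumption: typical silica fiber
LATENCY_BUDGET_S = 1e-6     # quoted upper limit on total latency
FIBRE_LENGTH_M = 70.0       # quoted front-end to back-end distance


def fibre_delay_s(length_m, group_index=GROUP_INDEX):
    """Propagation delay of light in a fiber of the given length."""
    return length_m * group_index / SPEED_OF_LIGHT_M_PER_S


delay = fibre_delay_s(FIBRE_LENGTH_M)
remaining = LATENCY_BUDGET_S - delay

print(f"fiber delay      : {delay * 1e9:.0f} ns")
print(f"left for the ROD : {remaining * 1e9:.0f} ns")
```

Under this assumption the fiber accounts for roughly a third of the budget, leaving about two thirds of a microsecond for deserialization and digital processing.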

For integrating the ROD into a shelf system, the Advanced Telecommunication Computing Architecture (ATCA) is being evaluated as a framework that provides shelf management protocols, power management and fast fabrics, and supports module redundancy if needed. Developments are ongoing for fast data transfer over 10 Gb Ethernet between ROD boards and from RODs to an external ROB card inside the ATCA shelf. In a recent concept, the ROB is reduced to a single PC server with fast RAM, which the PCs of the higher-level trigger farm can access directly. For the data transfer into the RAM, Remote DMA is being tested. A custom-made buffer module as currently implemented in ATLAS [13] could therefore become obsolete. The data buffer may even be integrated into the ROD board, which is also being studied.

#### IV. SUMMARY AND OUTLOOK

The R&D activities for an upgraded read-out of the ATLAS LAr Calorimeter at the sLHC concentrate on the development of radiation-tolerant ASICs for the analog and digital front-end and on high-performance back-end electronics. Prototypes for the pre-amplifier and shaping system, for the ADC and for the optical link-on-chip are being produced and their functionality tested with promising results. Radiation tests of the SiGe, CMOS and SoS technologies showed sufficient immunity to radiation damage. These tests are currently being repeated with the recently built prototypes. Development and tests of the new POL powering scheme are starting, first evaluating commercial components. On the back-end side, the time-critical steps of the digital signal processing are being studied. Here, too, prototypes of the ROD board and ATCA test setups have been successfully installed. They are used to evaluate the performance and to develop new algorithms for data treatment, data volume reduction, and fast data transfer on fabrics. The requirement to replace the LAr Calorimeter readout and the opportunity to implement a fully digital calorimeter trigger lead to a series of R&D challenges, for which promising first results were obtained and which will be further pursued in the near future.

#### V. ACKNOWLEDGEMENTS

The work presented here has been performed within the ATLAS Collaboration, and the authors would like to thank the collaboration members for their important contributions to the work presented and for helpful discussions.

This work was supported in part by the German Helmholtz Alliance *Physics At The Terascale* and by the German Bundesministerium für Bildung und Forschung (BMBF) within the research network FSP-101 *Physics on the TeV Scale with ATLAS at the LHC*.

#### **REFERENCES**

- [1] The ATLAS Collaboration, G. Aad, *et al.*, JINST 3 (2008) S08003.
- [2] L. Evans, Ph. Bryant, eds., *LHC Machine*, JINST 3 (2008) S08001.
- [3] N.J. Buchanan, *et al.*, JINST 3 (2008) P03004.
- [4] The Liquid Argon Back End Electronics Collaboration, A. Bazan, *et al.*, JINST 2 (2007) P06002.
- [5] N.J. Buchanan, *et al.*, JINST 3 (2008) P10005.
- [6] M. Ullán, *et al.*, Nucl. Inst. Meth. A **604** (2009) 668.
- [7] F. M. Newcomer, *A SiGe ASIC Prototype for the ATLAS LAr Calorimeter Front-End Upgrade*, these proceedings.
- [8] J.-P. Koutchouk, F. Zimmermann, *LHC Upgrade Scenarios*, CERN Preprint, sLHC-Project-Report-0013, 2009.
- [9] G. Minderico, *et al.*, *A CMOS low power, quad channel, 12 bit, 40MS/s pipelined ADC for applications in particle physics calorimetry*, Proceedings of the 9th Workshop on Electronics for LHC Experiments, CERN Preprint, CERN-2003-006, 2003, p.88.
- [10] J. Ye, *et al.*, *ATLAS R&D proposal: Evaluation of the 0.25* µ*m Silicon on Sapphire technology for ASIC developments in the ATLAS electronics readout upgrade*, ATLAS R&D Proposal, ATU-RD-MN-0009.
- [11] Jan Troska, *The Versatile Transceiver Proof of Concept*, these proceedings.
- [12] S. Dhawan, *et al.*, *Progress on DC-DC converters for SiTracker for SLHC*, these proceedings.
- [13] J. Cranfield, *et al.*, JINST 3 (2008) T01002.

# *WEDNESDAY 23 SEPTEMBER 2009 PLENARY SESSION 3*

#### Buses and Boards

## Making the right choice

#### Jerry Gipper

#### Embedify LLC

#### Jerry.Gipper@embedify.com

#### *Abstract*

From motherboard to backplane to blade based computer systems, the choices are numerous. This paper will cover the markets and trends within those markets that influence decisions made by board suppliers. Discussion will focus on the various form factors, the development and evolution of industry standards, and the consortia that support and develop these standards, including VITA, PICMG, and others. This paper will conclude with suggestions for choosing the right form factor for your application.

#### I. CLASSES OF BOARDS

There are many classes of boards. Unfortunately, they do not fit in well-defined buckets. The following is but one way of defining the different classes of boards. This is a guideline at best; many boards cross over between even these definitions.

#### *A. Reference platforms*

Reference platforms are designed by semiconductor manufacturers, provided to potential customers as a way to showcase new processors and chipsets. You can use these to shorten your hardware and software development time using their processors and chipsets. There is no set form factor for the industry and very little consistency within options from any particular semiconductor supplier.

The designs are often available for licensing. The supplier can provide you with Gerber files, bill of materials, schematics and other design aids. You can then modify the design to meet your specific goals.

On rare occasions, developers will deploy reference platforms "as is" in low unit volume products.

#### *B. Busboard*

Busboards have been around for over 30 years. They are defined as busboards because they use a parallel computer bus over a backplane for interconnection to other boards. Some, like VMEbus and CompactPCI bus, are designed to be inserted into a chassis with card guides to align them to a bused backplane. Others, like PCI bus cards, insert into slots on a motherboard. Slot cards is an alternate term for busboards.

#### *C. Blade*

Blades are a relatively new class of boards. They are defined by their use of a switch fabric or high-speed serial interconnect instead of a parallel bus. A parallel bus may be used locally on the blade but it is usually not carried to the backplane. Blade configurations have emerged to better address cooling, density, interconnect, and expansion issues. Blades are often used when large amounts of data need to be routed quickly to multiple destinations.

The big breakthrough has been the improvement in Ethernet performance to the point where it has become a reasonable alternative to parallel buses. Ethernet is ubiquitous, inexpensive, and easy to use. Ethernet is the dominant fabric choice, with PCI Express second, and serial RapidIO used in some cases.

All a blade needs to operate as a standalone computer is an external power supply. In this case, it starts to cross over into the definition of a motherboard.

Blades come in three types: general purpose, I/O or network processor, and switching blades. Switching blades are needed to configure a system into one of many different topologies, from point-to-point to a full mesh in which each blade is connected to every other blade.
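The cost of moving from a switched star to a full mesh can be seen by counting backplane links. The sketch below is a generic illustration; the 14-blade shelf is an arbitrary example size and the function names are not from any standard.

```python
def star_links(n_blades):
    """Links when one switching blade connects to every other blade."""
    return n_blades - 1


def full_mesh_links(n_blades):
    """Links when every blade connects directly to every other blade."""
    return n_blades * (n_blades - 1) // 2


for n in (4, 8, 14):
    print(f"{n:2d} blades: star {star_links(n):3d}, "
          f"full mesh {full_mesh_links(n):3d}")
```

The star grows linearly with the number of blades while the full mesh grows quadratically, which is one reason switching blades are used to build intermediate topologies.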

Examples of common blades are IBM Blade Servers, AdvancedTCA, MicroTCA, and VPX.

With the emergence of PCI Express, vendors are sure to develop even higher-performance motherboard and blade configurations.

#### *D. Carrier*

Carrier boards are designed to be host to some form of add-in module. Carriers have been gaining a lot of interest in the past few years as new mezzanines and modules with processor intelligence have been emerging on the scene, creating demand for host support.

Though most carriers are custom built for a specific application, almost any other class of board could be a carrier. It is not uncommon to see several mezzanines stacked on a VMEbus board or even on another mezzanine. Designers are very creative in the use of carriers.

There is no standard for carriers but guidelines for usage are provided in many mezzanine specifications.

#### *E. Mezzanine*

Mezzanines are designed to offer modularity to some other form factor. Missing or expansion features are added through mezzanines. Sometimes mezzanines are used to gain extra board space in the third dimension.

Mezzanines have a long history. They originally were very custom to specific suppliers. Over the years, there have been several efforts to provide standards that drive conformance to form factors that could gain widespread use.

Most board companies have several options. We continue to see small custom modules on most any design where functional density is a challenge. Board designers gain real estate by using small modules that fit in any available space. At the same time, PCI boards and PMCs have become more standardized. PCI boards are used in cost-sensitive applications where space is not a major issue while PMCs are used on boards with severe height restrictions. Most slot cards and many blades allow for the addition of a PMC or AdvancedMC form factor.

The xMC series is one of the best examples of standardized mezzanines; PMC, XMC, FMC, and AdvancedMC are just a few of the many choices.

#### *F. COM: Computer on Module*

Computer on Module, or COM, is a newer class of board. Efforts to define standards have gained momentum in the past few years. These are small, self-contained modules that include a processor and can operate with or without a carrier. COMs often have a common expansion strategy that allows them to be nested or interconnected in a standard fashion. COM Express is the most common variation, though dozens of others fit this class of board.

#### *G. Motherboard*

Motherboards are the grandfather of computer boards. In the early 1990s, efforts began to create standards for different form factors within the motherboard class. Many standards have evolved from traditional PC motherboards.

Motherboards are often installed in what many refer to as "pizza box" chassis. These 1U and 2U chassis can be stacked in racks and offer a very high computing density in a small space. They can be connected quickly via Ethernet and replaced without disrupting the entire system. However, pizza box stacks have cooling and density issues that some applications cannot tolerate.

Embedded applications tend to have more rigorous environmental conditions and longer life cycles as part of their requirements. As a result, motherboards designed for embedded applications have more environmental options and much longer life cycle commitments from suppliers.

Embedded motherboards are heavily Intel Architecture influenced. They also come in a wide variety of styles and sizes.

Examples include EBX, ETX, ITX, and PC/104.

#### II. TRENDS

Many trends impact the decisions that board suppliers make when defining their roadmaps and developing new products.

#### *A. Fragmentation of markets*

By the very nature of the wide range of usage models for embedded computing, the market is very fragmented. The fragmentation will only get worse as new uses for computers are discovered. In most cases, the needs of the users are diverging with little opportunity for convergence.

In cases where form factors target specific application segments there is some convergence and the industry players are working to define products that can be used by a broad range of designers. The telecom industry is a great example of how an industry has worked with suppliers through PICMG to define board and system technologies to address the greater cause.

Market fragmentation causes more choices to emerge, but because there are so many choices, prices tend to stay high.

#### *B. Embedded is moving mainstream*

Only two percent (2%) of the world's microprocessors go into PCs; the other 98% are embedded systems according to Jim Turley, Embedded Technology Journal.

For years, the embedded market has taken a backseat to the desktop and server markets. Now, as those markets have reached saturation, suppliers are looking for new outlets for their products. Both Intel and Microsoft have made very bold moves recently that help them establish beachheads in the embedded markets. Intel is more aggressive with longer product life cycles and in developing processor technology that is more suited for embedded applications. The news of Intel's acquisition of the leading real-time operating system supplier, Wind River Systems, further strengthens Intel's position.

Microsoft is not to be left out. Recent Windows 7 announcements have included the embedded strategy at the same time as the desktop and server products announcements were made. They have also worked diligently to consolidate and improve the embedded Windows roadmap.

Becoming mainstream could lead to larger players in the market, with lower prices but fewer options. Innovation could increase as competition heats up.

#### *C. Impact of SoCs and FPGAs*

Advances in systems-on-chip (SoC) processors and FPGAs are putting a real squeeze on board designers. In the past, it used to take complete boards of one size or another to provide the functionality required by an embedded computer. SoCs and FPGAs now have the capacity to incorporate much of this functionality and maintain the level of necessary performance at the same time. Throw in the fact that FPGAs are relatively easy to customize and off-the-shelf boards start to become obsolete except as host carriers for the SoC and FPGA silicon.

Common board types, especially mezzanines and small form factor boards, can be easily replaced by either an SoC or an FPGA. This could further reduce the number of commercial board suppliers and products in the market.

#### *D. Consumer electronics trends to watch*

The Consumer Electronics Association has highlighted four trends in the consumer electronics space that are having the greatest impact on electronics.

**Green as a Purchase Factor**: Materials and packaging; energy efficiency; recycling programs. Consumers are embracing the green movements and demanding products that are environmentally friendly.

**Evolving Command, Control and Display**: Touch screens; voice activation; motion sensing; 3D displays. The man-machine interface is a major challenge. As devices become smaller and more functional, connecting to humans is difficult. We will see a lot of innovation in this interface in the coming years.

**(No) Strings Attached**: Cutting cords; attaching services; shifting usage locations. It is all about being mobile. Devices of all types will have wireless connections changing the usage models.

**The Embedded Internet**: Localization; utilities and services; communication and commerce. Devices of all types are being enabled with browser capability, making it possible to exchange data in ways not previously thought possible. Intel talks about 15 billion connected devices in its embedded computing campaigns and is pushing this revolution.

The consumer market is the single biggest influence on electronics. Other application markets will need to adapt and then adjust as necessary to take advantage of the buying power of the consumer electronics market.

#### *E. Customization*

All board suppliers offer custom products and design services to some degree, some more than others. For most vendors, it is the majority of their business. Mass customization is the next natural evolutionary step for boards.

Lower cost customization processes could lead to more appropriate choices for lower unit volume users. Prices could move higher if improved processes for customization are not developed and implemented by board manufacturers.

#### III. CONSORTIA AND FORUMS

There are dozens of consortia that contribute technology to the embedded computing markets. These range from those doing components such as processors and chip sets, to those doing board and system standards, and software.

**VITA**: Creates and promotes standards used by developers and users having a common market interest in critical embedded systems using real-time, modular embedded computing systems. [www.vita.com](http://www.vita.com/)

**PICMG**: Develops open specifications for high performance telecommunications and industrial computing applications. [www.picmg.org](http://www.picmg.org/)

**Blade.org**: Developers and users dedicated to expanding the blade ecosystem and to accelerating the growth and adoption of innovative technologies and solutions in the blade market. [www.blade.org](http://www.blade.org/)

**PC/104 Embedded Consortium**: Develops and promotes the PC/104 standard for embedded computers. [www.pc104.org](http://www.pc104.org/)

**VXIbus Consortium**: Supports and promotes the VXIbus for the test and measurement community. [www.vxibus.org](http://www.vxibus.org/)

**PXI Systems Alliance**: Promotes and maintains the PXI standard. [www.pxisa.org](http://www.pxisa.org/)

**Power.org**: Developers, tool providers, and manufacturers united to lead open hardware innovation for industry standards and applications based on Power Architecture technology. [www.power.org](http://www.power.org/)

For a complete list visit: OpenSystemsMedia's consortia list at [www.embedded-computing.com/consortia](http://www.embedded-computing.com/consortia).

#### IV. FACTOR THIS: BOARD SELECTION CRITERIA

What should a designer look for when selecting a board form factor? Consider the following issues:

- **Backplane versus motherboard**: This is a key decision. Factors to consider include I/O management, expansion, cooling, and ruggedness. Many of the newer backplane solutions, commonly called blades, use serial networking interconnects so that single boards can operate in physically separated boxes or in a larger chassis with several boards together. Larger systems that require a lot of expansion capability tend to lean toward the backplane choice. Smaller, more constrained applications use a motherboard of some style. Sometimes a large product line will use motherboards at the low end and backplane style for the high end. Be sure to understand the range of your product needs.

- **Size**: Size is always important. Some applications are very space constrained. Every square centimeter is prime real estate, and its use must be optimized. Larger boards tend to be less costly because they present fewer manufacturing challenges, but they do consume valuable space. Do space studies to determine which trade-offs make sense for your project.

- **Chassis choices**: Selecting a form factor is not the only step. Do you need a chassis? What will you be using for an enclosure and power supply? Will you be managing one as part of your project? Many form factors have a great selection of chassis. Busboards, slot cards, and blades are dependent on the chassis to provide the mechanical support they need for good selections. PC-style motherboards are well supported, but they may not be as appropriate for embedded applications. Many of the small form factor board standards leave the chassis decisions and design up to you.

- **Functionality Expansion**: Will you need to add more functionality later? How that decision is made can dramatically influence the form factor choice. Does the board use a standard interface that already has a large selection of add-in options, or is it proprietary or limited in choices? Some expansion options take a large chunk of valuable real estate while others are low profile and space efficient. Be sure to check how well accepted the expansion option is if you think you need or will need it. Expansion options are a great way to add functionality to an already deployed system, improving chances to gain revenue from upgrades.

- **I/O Management**: Some form factors are better than others on I/O management. Improvements in the location, number, and type of I/O have shaped the evolution of PC motherboards. As new types of I/O such as USB, flash cards, SATA, and IEEE 1394 have taken over the serial and parallel connections of the past, board designers have made appropriate changes in the way I/O is managed by the board. Small form factor boards have even more unique choices better suited for embedded applications.

- **Power**: Something seemingly as straightforward as getting power to the board can be a huge obstacle. Backplane-based boards have power pinouts as part of the standard, but motherboard-based solutions are all over the map. Some are better than others when it comes to defining the power connectors and required voltages. The best solutions allow designers to use commonly available power supplies and connectors. It can be frustrating to have a "paperweight" that cannot be conveniently powered for lab development, so be sure the power solution is understood beforehand.

- **Thermal Management**: How does a particular form factor handle cooling? For some applications, this is a minor concern, but the majority will have some issues to consider, especially if you are using high-end processors for the project. Some form factors give you the choice of air, conduction, or even liquid cooling. Some are built into the board specification while others require some creative mechanical design and plumbing.

- **Ruggedness**: Into what type of environment will your product be deployed? Standard PC market boards do well in benign home or office environments but are not suitable in mobile, industrial, or military applications. Picking a form factor that can handle your environment is high on the list of items to consider. Some form factor specifications have shock and vibration options over a range of environments. Again, form factors designed specifically for embedded applications tend to do a much better job managing rugged requirements.

- **Standards**: Form factors endorsed and managed by a standards organization can be very important to many applications. A standard-supported form factor is more stable, well thought out, qualified, and usually has a planned evolution path. All this can help you manage future life-cycle issues as you improve and evolve your design. Standards developed by an established organization have the inputs of many technical experts who have had a chance to test and vet the design inputs. Ecosystems for well-developed standards tend to be larger and more robust, giving you better product choices.

- **Suppliers and Support**: Having choices in suppliers is just as important as choice in form factors. While our focus is mostly on de facto and true standards-based form factors, many are proprietary to a single company. This is less of a risk for one-off products that have a limited life span, but having a solution supported by several suppliers gives you options for prices, support, and life-cycle management. Who is the real target audience of the supplier you choose? PC market board suppliers are by definition focused primarily on the PC market. In choosing these boards for embedded use, you may be stuck with difficult revision management issues with these suppliers. Many companies with an embedded computing focus offer PC-style motherboards, providing support and life-cycle management while still leveraging PC motherboard technology. This comes at a slight cost premium, but the return on investment can be beneficial farther down the road.

- **Operating Systems**: The software landscape has evolved to keep up with the changing needs of the embedded computing industry.

Real-time operating system choices blossomed in the late 1990s. The number of suppliers has consolidated over the past few years, but the solutions are very mature.

Linux became a solid embedded operating system solution with release 2.6 where a number of key real-time features were made part of the base kernel. Now several companies are building on that base plus adding some extensions of their own. Linux has firmly gained a foothold as a solid choice.

Embedded Windows has gone through several improvements since Windows was first considered for embedded applications in the mid-1990s. Now Embedded Windows 7 is a key part of the Microsoft operating system strategy.

#### V. SUMMARY

As you can see, the choices abound. If all else fails, many of the vendors also will customize a board leveraging an existing design and adding features and sizes suitable to your application. In fact, many of the "standard" form factors emerge from such projects. Choose wisely.

# *WEDNESDAY 23 SEPTEMBER 2009*

# *PARALLEL SESSION A3 TRIGGER*

#### Integrated Trigger and Data Acquisition system for the NA62 experiment at CERN

B. Angelucci<sup>a</sup>, C. Avanzini<sup>a</sup>, <u>G. Collazuol<sup>\*,b</sup></u>, S. Galeotti<sup>c</sup>, E. Imbergamo<sup>d</sup>, G. Lamanna<sup>b</sup>, G. Magazzù<sup>c</sup>, G. Ruggiero<sup>b</sup>, M. Sozzi<sup>a</sup>, S. Venditti<sup>a</sup>, on behalf of the NA62 Collaboration

<sup>a</sup> Dipartimento di Fisica dell'Università e Sezione dell'INFN di Pisa, I-56100 Pisa, Italy

<sup>b</sup> Scuola Normale Superiore e Sezione dell'INFN di Pisa, I-56100 Pisa, Italy

<sup>c</sup> Sezione dell'INFN di Pisa, I-56100 Pisa, Italy

<sup>d</sup> Dipartimento di Fisica dell'Università e Sezione dell'INFN di Perugia, I-06100 Perugia, Italy

\* Corresponding author, e-mail address: gianmaria.collazuol@pi.infn.it

#### *Abstract*

The main goal of the NA62 experiment is to measure the branching ratio of the  $K^+ \to \pi^+ \nu \overline{\nu}$  decay, collecting O(100) events in two years of data taking. Efficient online selection of interesting events and loss-less readout at high rate will be key issues for such an experiment. An integrated trigger and data acquisition system has been designed. Only the very first trigger stage will be implemented in hardware, in order to reduce the total rate for the software levels on PC farms. Readout uniformity among different subdetectors and scalability were taken into account in the architecture design.

#### I. INTRODUCTION

The NA62 experiment at the CERN SPS aims at measuring O(100)  $K^+ \rightarrow \pi^+ \nu \overline{\nu}$  events in two years of data taking. The theoretical cleanness of the Standard Model (SM) branching ratio (BR) predictions for this decay mode makes it very attractive both as a powerful test of the CKM paradigm and as a probe for new physics beyond the SM. Experimentally, the detection of this process is very difficult due to the smallness of the signal (in the SM the expected BR is at the level of  $0.85 \times 10^{-10}$ ) and the presence of a very sizeable concurrent background, mainly from  $K^+ \to \pi^+\pi^0$  decays. The present measurement of this decay channel is based on 7 candidates collected by the E949 and E787 Brookhaven experiments [1], leading to a value of  $BR = (1.47^{+1.30}_{-0.89}) \times 10^{-10}$ .

NA62 is a fixed-target experiment in which a beam of positively charged hadrons, including a fraction of $\sim 6\%$ of kaons, will be produced by 400 GeV/c primary protons from the SPS accelerator. Kaon decays in flight will be observed in a fiducial region $\sim 100$ m long, in vacuum.



Figure 1: NA62 layout

Decay products and primary particles will be measured by spectrometers, respectively exploiting straw chambers (STRAWS) and silicon pixel detectors (GIGATRACKER), in order to achieve high-resolution momentum and angle measurements and consequently good rejection of kinematically constrained background. An efficient veto system for photons and charged particles (LAV, LKr and SAC) and the PID system for primary particles and decay products (CEDAR, RICH and MUV) will guarantee the identification of decay modes that are not kinematically constrained. Figure 1 shows the layout of the experiment.

In order to collect the required number of events in a reasonable amount of time, a very intense hadron beam will be employed ($3 \times 10^{12}$ protons per SPS pulse will produce $5 \times 10^{12}$ $K^+$ per year). An efficient on-line selection of candidates is very important for this experiment, because of the large reduction to be applied to the data before tape recording. On the other hand, a loss-less data acquisition system is mandatory to avoid adding artificial detector inefficiencies, e.g. when vetoing background particles. This paper will focus on the general architecture of the integrated DAQ and trigger system for the NA62 experiment.

#### *A. Requests to DAQ and Trigger System*

The rate of events in the decay region is strongly dominated by background. According to simulations, the rate on the main detectors is around 10 MHz (Table 1).

Table 1: Rates on principal detectors

| Detector | Rate (MHz) |
|----------|------------|
| CEDAR    | 50         |
| GTK      | 800        |
| LAV      | 9.5        |
| STRAWS   | 8          |
| RICH     | 8.6        |
| LKr      | 10.5       |
| MUV      | 9.2        |
| SAC      | 15         |

An additional rate of at least 1 MHz of muons coming from the beam production target must be taken into account. In this environment the requirements on the DAQ and trigger systems are:

- Very low DAQ inefficiency  $(< 10^{-8})$ ;
- High trigger efficiency ( $> 95\%$ );
- Fully monitored systems;
- Readout without zero suppression for candidates;
- Low random veto probability at trigger level;
- Scalability in terms of bandwidth.

The first requirement is uncommon in other DAQ systems, but it is crucial for the NA62 experiment, where the full reconstruction of the background is an important issue. For the same reason zero suppression, mainly in the veto detectors, must be avoided as much as possible during the acquisition process. A good trigger efficiency can be obtained using information coming from several detectors with an excellent time resolution, in order to reduce the random veto probability. The final acquisition rate will be of the order of tens of kHz.

#### *B. NA62 trigger and DAQ architecture*

A fully digital and integrated DAQ and trigger system has been designed to fulfill the requirements presented in the previous section. Digitization at an early stage of the readout system allows efficient monitoring of each stage of the chain, in order to detect any possible source of losses. The trigger system will be split into a hardware level and two software levels: the first stage (L0), implemented in hardware (for instance using FPGAs), will be used to reduce the total rate to $\sim 1$ MHz, while the second and third stages (L1 and L2) will be completely software based, exploiting powerful PC farms with large input bandwidth. The data accepted by the L2 will be directly transmitted to the event builder (EB) PC farm, to be permanently recorded.

The factor $\sim 10$ in rate reduction at the first stage will be obtained by a L0 trigger processor (L0TS) using information coming from the RICH, LAV, LKr calorimeter and MUV detectors. The trigger primitives from each detector involved in the L0 trigger decision will be built directly in the same data acquisition board devoted to digitization and monitoring. For all the detectors (apart from the GIGATRACKER) the building block of this system will be the TELL1 mother board developed for the LHCb experiment [2].

The TELL1 board (9U format) houses 5 Altera Stratix FPGAs, allowing a fully customizable configuration. A total RAM of 384 MB makes it possible to store the data in a first buffer stage while waiting for the trigger decision, which is delivered to the board through the TTC [3] interface. A credit card PC (CCPC) controls all the functionality of the board. The output stage uses a quad Gigabit Ethernet card (total output bandwidth of $\sim 4$ Gb/s). The input stage can be adapted to different purposes using 4 custom daughter boards. On these daughter boards, for instance, the analog data coming from the detector front end could be digitized. The use of a uniform system for all the subdetectors allows a common, fully integrated trigger and readout architecture, in which the same data chain can be used to monitor the whole system and the complications due to independent trigger and readout chains are avoided.

#### *C. The TDC board*

For the definition of trigger primitives and offline data analysis, several subdetectors will provide the time of arrival of a given event. A time resolution of O(100 ps) has to be guaranteed at event rates of O(10 MHz), and a good on-line time resolution is also important for the trigger. For this reason we have developed a daughter board (10-layer PCB) for the TELL1 motherboard, providing 128 TDC channels with 100 ps time resolution. Each mezzanine houses 4 HPTDC chips (developed at CERN [4]) controlled by an Altera Stratix II FPGA used for preprocessing (an on-board static RAM is also provided for this purpose) and monitoring. Miniaturized connectors are present on both sides of the board, allowing the connection of 128 channels from the subdetector front-end. Particular care has been taken to ensure very good clock stability. The 40 MHz clock coming from the TELL1 is stabilized by the Stratix II PLL and an external quartz-controlled QPLL [5]. After filtering residual noise from DC-DC converters, detailed tests showed that the level of the jitter in the clock is below 40 ps. The intrinsic time resolution of the whole chain for a single hit is measured to be at the level of 50 ps. The time resolution has also been measured in a test beam with a RICH prototype with 400 photomultipliers, and found to be in agreement with expectations. A very compact readout system of 512 TDC channels is obtained by mounting four TDC boards on a TELL1. The fine time multiplicity, which is crucial to define the trigger primitives, is computed in the TELL1 FPGAs by exploiting the high time resolution given by the TDCs. If a subdetector needs more than one TELL1 for readout, the TELL1s will be connected together in a daisy chain using two Gigabit links dedicated to sending and receiving trigger information.
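As an illustration of what a fine-time multiplicity means, the following Python sketch counts TDC hits that fall into the same time slot and reports the slots whose multiplicity reaches a minimum value. The function names, the slot width and the multiplicity cut are assumptions made for this example only; the firmware algorithm running in the TELL1 FPGAs is not described in this paper.

```python
from collections import Counter


def fine_time_multiplicity(hit_times_ns, slot_ns):
    """Count hits per time slot of width slot_ns (hypothetical slot width)."""
    return Counter(int(t // slot_ns) for t in hit_times_ns)


def primitive_slots(hit_times_ns, slot_ns, min_hits):
    """Return the sorted slot indices holding at least min_hits hits."""
    counts = fine_time_multiplicity(hit_times_ns, slot_ns)
    return sorted(slot for slot, n in counts.items() if n >= min_hits)


# Four hits cluster around 10 ns; the other two are isolated.
hits_ns = [10.1, 10.3, 10.2, 55.0, 10.4, 80.7]
slots = primitive_slots(hits_ns, slot_ns=3.125, min_hits=3)
print(slots)  # [3]
```

A fixed slot grid can split a coincidence that straddles a slot boundary; a sliding window would avoid this at the cost of more logic.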

#### *D. LKr calorimeter readout and trigger*

The LKr calorimeter was built for the NA48 experiment [6] to provide excellent energy, time and space resolution. In the NA62 experiment it will be mainly used as a veto counter for forward photons from the decay region, but we still want to profit from the good performance of the calorimeter, both for background studies and for adding other interesting physics cases to the NA62 main program. Thus the LKr calorimeter electronics will provide both time and pulse-height information. An effective approach, already used in NA48, is to perform a continuous sampling with flash ADCs instead of using two separate time and charge measurements. The LKr is composed of $\sim 13500$ channels sampled at 40 MHz with an effective resolution of 14 bits. Reading it out without zero suppression at the L0 trigger rate of 1 MHz would require a bandwidth of $\sim 1$ TB/s, which the existing NA48 LKr readout cannot sustain. The system has thus been modified in order to exploit large buffers (0.5 GB DDR2 per channel) and faster links. The old CPD boards, used to digitize and compute analog sums of groups of cells for trigger purposes, will be reused in $\sim 200$ "CARE" modules connected with $\sim 900$ Gigabit links to a readout farm ($\sim 200$ processor nodes). The 892 analog sums for the trigger (groups of 8x2 cells) will be sent to a system of 28 TELL1 boards housing 32 ADC channels each, to provide the first layer of the calorimetric trigger. A second layer of 3 TELL1 boards equipped with Gigabit mezzanine receivers (under design) will produce the LKr trigger primitives for the L0 central processor.
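The size of the first trigger layer follows directly from the numbers quoted above. The short Python check below uses only the 892 analog sums and the 32 ADC channels per TELL1 given in the text; the helper name `boards_needed` is introduced here for illustration.

```python
import math

N_ANALOG_SUMS = 892          # trigger sums of 8x2 cells (from the text)
ADC_CHANNELS_PER_TELL1 = 32  # ADC channels housed by one TELL1 (from the text)


def boards_needed(n_channels, channels_per_board):
    """Smallest number of boards able to host n_channels."""
    return math.ceil(n_channels / channels_per_board)


first_layer_boards = boards_needed(N_ANALOG_SUMS, ADC_CHANNELS_PER_TELL1)
spare_channels = first_layer_boards * ADC_CHANNELS_PER_TELL1 - N_ANALOG_SUMS
print(first_layer_boards, spare_channels)  # 28 4
```

This reproduces the 28 boards quoted for the first layer and shows that only four ADC channels remain unused.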

#### *E. L0 central processor*

The L0 central processor or L0 trigger supervisor (L0TS) will collect the information from all the detectors participating in the L0 trigger and take the final decision. Monte Carlo simulations showed that a factor 10 in rate reduction can be obtained using RICH, LAV, LKr and MUV information. The trigger decision will be dispatched synchronously to the TELL1 boards and other readout systems through the TTC. Two solutions are under investigation to realize the L0TS:

- Exploiting parallel processing by Graphics Processing Units (GPUs) on a real-time Linux High Performance PC (HPPC) with fast I/O connections;
- A custom dedicated board with FPGAs and fast I/O connections.

The first solution is constrained by the requirement to take decisions with a stable latency of 1 ms, a value that depends on the front-end buffer size in some critical detectors. Such a long latency, made possible by the large buffers in the TELL1, will be exploited to compensate for the intrinsic Ethernet latency and the computing time in the GPU-based solution.

#### *F. L1 and L2 levels*

The L1 trigger will be entirely software based. For each subdetector a dedicated PC (or a small cluster of PCs) will be used to implement a fast reconstruction and apply standalone single-subdetector algorithms (presence of clusters in the LKr, track direction and momentum in the STRAWS, etc.). The input event rate for these PCs will be 1 MHz. The data will arrive at the L2 PC farm through a commercial GbE switch. At this level the full event will be completely reconstructed and more sophisticated high-level trigger algorithms will be implemented, with the requirement of reducing the total rate to tens of kHz for permanent recording on tape. Assuming a single event size of 10 kB (heavily dominated by LKr and GIGATRACKER), the total bandwidth to be recorded at the end of the chain will be of the order of 100 MB/s.
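The rate and bandwidth figures of the trigger chain can be cross-checked with a few lines of Python. The rates and the event size are taken from the text; the L2 output rate of 10 kHz is an assumption standing in for "tens of kHz", and the variable names are chosen for this example.

```python
# Rates in Hz and event size in bytes, as quoted in the text.
DETECTOR_RATE_HZ = 10e6   # typical rate on the main detectors
L0_OUTPUT_HZ = 1e6        # after the hardware L0 stage
L2_OUTPUT_HZ = 10e3       # ASSUMPTION: lower end of "tens of kHz"
EVENT_SIZE_BYTES = 10e3   # 10 kB per event


def reduction_factor(rate_in_hz, rate_out_hz):
    """Rate reduction achieved between two trigger stages."""
    return rate_in_hz / rate_out_hz


def bandwidth_mb_per_s(rate_hz, event_size_bytes):
    """Data throughput in MB/s for a given event rate and event size."""
    return rate_hz * event_size_bytes / 1e6


l0_factor = reduction_factor(DETECTOR_RATE_HZ, L0_OUTPUT_HZ)
software_factor = reduction_factor(L0_OUTPUT_HZ, L2_OUTPUT_HZ)
storage_bandwidth = bandwidth_mb_per_s(L2_OUTPUT_HZ, EVENT_SIZE_BYTES)
print(l0_factor, software_factor, storage_bandwidth)  # 10.0 100.0 100.0
```

With these inputs the hardware stage provides a factor 10, the software levels a further factor 100, and the output bandwidth comes out at 100 MB/s; a higher L2 output rate scales the bandwidth proportionally.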

#### **REFERENCES**

- [1] S. Adler *et al.*, *Phys. Rev. D* 77 (2008) 052003.
- [2] G. Haefeli *et al.*, *Nucl. Instrum. Meth.* A560 (2006) 494.
- [3] B. G. Taylor, *Prepared for 8th Workshop on Electronics for LHC Experiments, Colmar, France, 9-13 Sep 2002*
- [4] http://tdc.web.cern.ch/tdc/hptdc/docs/hptdc\_manual\_ver2.2.pdf
- [5] http://proj-qpll.web.cern.ch/proj-qpll/
- [6] V. Fanti *et al.*, *Nucl. Instrum. Meth.* A574 (2007) 433.

## A digital calorimetric trigger for the COMPASS experiment at CERN

J. Friedrich<sup>a</sup>, S. Huber<sup>a</sup>, B. Ketzer<sup>a</sup>, M. Krämer<sup>a</sup>, I. Konorov<sup>a</sup>, A. Mann<sup>a</sup> and S. Paul<sup>a</sup>

<sup>a</sup>Physik Department E18, Technische Universität München, 85748 Garching, Germany

mkraemer@e18.physik.tu-muenchen.de

#### *Abstract*

In order to provide a trigger for the Primakoff reaction, the trigger system of the COMPASS experiment at CERN will be extended in 2009 by an electromagnetic calorimeter trigger. To gain from the various benefits of digital data processing, an FPGA-based implementation of the trigger is foreseen, running on the front-end electronics, which are used for data acquisition at the same time. This, however, requires further modification of the existing trigger system in order to combine the digital calorimeter trigger, with its higher latency, and the analogue trigger signals, which will also make use of digital data processing.

#### I. THE COMPASS EXPERIMENT AT CERN

The COmmon Muon and Proton Apparatus for Structure and Spectroscopy (COMPASS) is a fixed-target experiment at CERN, which uses muon and hadron beams from the Super Proton Synchrotron (SPS) to address a wide variety of physics programs. The beam is provided in spills by slow extraction from the accelerator, each lasting around 5 s and followed by approximately 30 s without extraction. COMPASS is a 60 m long, two-stage magnetic spectrometer (see Figure 1), where both stages are equipped with hadronic and electromagnetic calorimeters. Thanks to its two magnets, with an integrated magnetic field of 1 Tm and 4 Tm, respectively, COMPASS has a large acceptance range [1].



Figure 1: *Rendered view of the COMPASS spectrometer (muon setup).*

The physics program of COMPASS addresses, among other topics, reactions such as Primakoff or deeply virtual Compton scattering, which either directly or indirectly produce high-energy photons. Therefore an electromagnetic calorimeter trigger is desirable. However, the electromagnetic calorimeter of the second spectrometer stage was not equipped with trigger logic so far. Thus in December 2009 the decision was taken to design a trigger system including this detector. The following sections give a short overview of the calorimeter and its readout and discuss the trigger logic and implementation in particular.

#### II. ECAL2 - ONE OF THE ELECTROMAGNETIC CALORIMETERS OF COMPASS

#### *A. Signal detection*

ECAL2, the electromagnetic calorimeter that is placed further downstream in the COMPASS spectrometer and provides calorimetry for the second stage, consists of 3068 cells of $3.8 \times 3.8 \, \text{cm}^2$ each, organized in a $64 \times 48$ grid. It has a hole of $2 \times 2$ cells allowing the beam to pass through. The central part is equipped with 860 Shashlik modules, while the outer part is completed with GAMS and radiation-hard GAMS modules. Photomultipliers are used to amplify the signals, which are fed through shaper cards to the readout electronics.

#### *B. Readout*

The readout, based on Field Programmable Gate Arrays (FPGAs), utilizes versatile sampling Analog to Digital Converters (ADCs) mounted on mezzanine cards (see [2]), which themselves are mounted on 9U VME carrier cards (Figure 3). Using 12-bit ADCs capable of sampling at 40 MHz in an interleaved mode, one mezzanine card reads out 16 channels at a combined sampling frequency of 80 MHz (Figure 2). Four of these mezzanine cards are mounted on one carrier card, which therefore provides 64 channels and is equipped with another FPGA to manage the mezzanine sampling ADCs. In total 3072 channels are read out in this way.



Figure 2: *Mezzanine sampling ADC module*



Figure 3: *VME carrier card*

#### III. THE CALORIMETRIC TRIGGER

The calorimetric trigger, which is implemented for ECAL2, is tightly integrated into the readout system and runs mostly on the FPGAs that handle the readout. Only a dedicated backplane had to be developed.

#### *A. Concept of the digital calorimetric trigger*

The concept of the trigger, optimized for a planned measurement of the Primakoff reaction in autumn 2009, foresees summing up the energy of all signals that belong to a certain time slice and occur in a selected part of the calorimeter. This part can be chosen freely and can also cover all calorimeter cells. Most of the effort is spent on detecting signals at channel level, using a digital constant fraction algorithm after an initial pedestal subtraction, which provides the amplitude and timing of detected signals (see Section *B.*). The amplitudes of the retrieved signals are normalized for each channel individually using energy calibrations, while the time dispersion of the signals is corrected using time calibrations. This makes fine tuning on the hardware side, i.e. fine adjustment of high voltage bases and cable lengths, unnecessary, and therefore simplifies hardware adjustment. Both energy and time calibrations are monitored and updated continuously, using CPU based online data processing (see Section *C.*). The summation of signals is implemented in several stages. 16 channels are summed on the ADC mezzanine card, while the outputs of the four mezzanine cards mounted on one carrier card are summed on that carrier card. Finally, a custom VME backplane combines the data from eight carrier cards. Additionally, multiple backplanes can be interconnected, so that one or more backplanes can provide a global sum. The VME backplane applies two thresholds, setting the level on two independent outputs, which give triggers synchronous to the internal 80 MHz clock.
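The staged summation can be sketched in Python as follows. The group sizes (16 channels per mezzanine, 4 mezzanines per carrier, 8 carriers per backplane) come from the text; the function names, the assumption that channels are mapped to cards in consecutive order, and the example amplitudes and thresholds are hypothetical. The sketch treats a single time slice and ignores the selection of a calorimeter region.

```python
CHANNELS_PER_MEZZANINE = 16
MEZZANINES_PER_CARRIER = 4
CARRIERS_PER_BACKPLANE = 8


def group_sums(values, size):
    """Sum consecutive groups of `size` values."""
    if len(values) % size:
        raise ValueError("input length must be a multiple of the group size")
    return [sum(values[i:i + size]) for i in range(0, len(values), size)]


def backplane_sums(channel_amplitudes):
    """Channel -> mezzanine -> carrier -> backplane sums for one time slice."""
    mezzanine = group_sums(channel_amplitudes, CHANNELS_PER_MEZZANINE)
    carrier = group_sums(mezzanine, MEZZANINES_PER_CARRIER)
    return group_sums(carrier, CARRIERS_PER_BACKPLANE)


def trigger_outputs(energy_sum, threshold_1, threshold_2):
    """Two independent threshold decisions on the same sum."""
    return energy_sum > threshold_1, energy_sum > threshold_2


# Hypothetical normalized amplitudes for the 3072 readout channels.
amplitudes = [0] * 3072
amplitudes[100] = 700
amplitudes[101] = 500
sums = backplane_sums(amplitudes)
global_sum = sum(sums)  # interconnected backplanes providing a global sum
print(len(sums), sums[0], trigger_outputs(global_sum, 1000, 2000))
# 6 1200 (True, False)
```

With 512 channels per backplane, the 3072 readout channels fill six backplanes, which is why a global sum requires the backplanes to be interconnected.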

#### *B. The constant fraction discriminator*

The main component of this digital trigger is the pulse shape analysis, done on channel level, which consists of a digital implementation of a Constant Fraction Discriminator (CFD).

#### Implementation

The digital CFD calculates for each sample $i$ the difference $d_i$ between the signal $s_i$ and a delayed and amplified version of the signal itself, $a \cdot s_{i-n}$ (see Figure 4).



Figure 4: *The digital constant fraction discriminator: Shown is the signal, the delayed and amplified signal and the difference of both. The time of the signal is extracted by linear interpolation to the point, where the difference crosses zero.*

$$
d_i = s_i - a \cdot s_{i-n} \tag{1}
$$

Thereby the CFD triggers a signal under following conditions:

$$
d_{i-1} > 0 \;\; \text{AND} \;\; d_i \le 0 \;\; \text{AND} \;\; s_{i+m} > thr, \tag{2}
$$

where thr is a programmable threshold, which should be high enough to suppress noise. The time of the signal is made of a coarse time, which is given by the sample index,

$$
t_{coarse} = i,\tag{3}
$$

and a fine time, which is estimated by linear interpolation to the zero crossing of $d$,

$$
t_{fine} = \frac{d_i}{d_{i-1} - d_i}.\tag{4}
$$

Note that the fine time is negative, thus the time of the signal $t_{signal}$ in units of clock cycles is given by the sum of coarse and fine time.

$$
t_{signal} = t_{coarse} + t_{fine} \tag{5}
$$

In order to correct for the difference of signal generation and propagation in the analog part of the readout, a time shift  $t_{shift}$ , which is measured and continuously monitored using CPU based online data processing for each channel individually (see Section *C.*), is applied to the signal time.

$$
t_{signal,sync} = t_{signal} + t_{shift} \tag{6}
$$

The $t_{signal,sync}$ is used to determine the coincidence of signals in different calorimeter channels by filling a normalized amplitude into time bins. The normalized amplitude is given by

$$
a_{normalized} = c_{ecalib} \cdot s_{i+m}, \tag{7}
$$

where $c_{ecalib}$ is an integer coefficient, which depends on the energy calibration of the calorimeter and is optimized with respect to the desired dynamic range and trigger threshold as well as to the calibration constants of the calorimeter. It is set for each channel individually.
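Equations (1) to (7) can be put together in a short Python sketch. It is a floating-point illustration written for this text, not the authors' C model or the FPGA firmware, which works with integer arithmetic. The function name `digital_cfd`, the parameter names and the example values ($n = 1$, $a = 2$, $m = 1$, the pulse shape and the threshold) are assumptions.

```python
def digital_cfd(samples, delay, gain, lookahead, threshold,
                t_shift=0.0, c_ecalib=1):
    """Return a list of (t_signal_sync, a_normalized) for detected pulses.

    samples are pedestal-subtracted; delay = n, gain = a, lookahead = m.
    """
    hits = []
    d_prev = None
    for i in range(delay, len(samples) - lookahead):
        d_i = samples[i] - gain * samples[i - delay]            # Eq. (1)
        crossing = d_prev is not None and d_prev > 0 and d_i <= 0
        if crossing and samples[i + lookahead] > threshold:    # Eq. (2)
            t_fine = d_i / (d_prev - d_i)                       # Eq. (4), <= 0
            t_signal = i + t_fine                               # Eqs. (3), (5)
            t_sync = t_signal + t_shift                         # Eq. (6)
            a_norm = c_ecalib * samples[i + lookahead]          # Eq. (7)
            hits.append((t_sync, a_norm))
        d_prev = d_i
    return hits


# Hypothetical pulse peaking at sample 6 (arbitrary ADC units).
pulse = [0, 0, 0, 2, 6, 9, 10, 6, 3, 1, 0, 0]
hits = digital_cfd(pulse, delay=1, gain=2, lookahead=1, threshold=5)
print(hits)  # one hit: time close to 4.4 clock cycles, amplitude 10
```

For this pulse the difference is $d_4 = 2$ and $d_5 = -3$, so the zero crossing is found at $i = 5$ with $t_{fine} = -0.6$, which gives a signal time of 4.4 clock cycles. The amplitude is read one sample later, at the pulse maximum.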

#### **Performance**

In order to determine the performance of this algorithm, it was modeled in C, respecting the limitations of the FPGA logic. Using various COMPASS raw data from 2008 and 2009, the time resolution and especially the achievable temporal alignment of all calorimeter channels are determined by fitting the temporal residual (Figure 5) of all 3068 calorimeter channels with a double Gaussian function and a constant background. The trigger time, which is used as reference for the signal time, is measured by several TDCs. In the case of the calorimeter, the uncertainty of the trigger time measurement is negligible in comparison to the uncertainty of the signal time. The time resolution is determined to be $\sigma_t \approx 0.9 \text{ ns}$ by calculating the weighted mean of both Gaussian contributions.

$$
\sigma_t = \frac{A_0 \cdot \sigma_{t,0} + A_1 \cdot \sigma_{t,1}}{A_0 + A_1} \approx 0.9 \, ns \tag{8}
$$
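Equation (8) is an amplitude-weighted mean of the two Gaussian widths, which the following Python lines spell out. The fitted amplitudes and widths are not given in the paper, so the numbers used below are purely illustrative values chosen for this example and must not be read as fit results.

```python
def combined_resolution(a0, sigma0, a1, sigma1):
    """Amplitude-weighted mean of two Gaussian widths, as in Eq. (8)."""
    return (a0 * sigma0 + a1 * sigma1) / (a0 + a1)


# ILLUSTRATIVE inputs only: a narrow core and a broader, weaker component.
sigma_t = combined_resolution(a0=3.0, sigma0=0.7, a1=1.0, sigma1=1.5)
print(round(sigma_t, 3))  # 0.9
```

The weighting means that a broad component with a small amplitude degrades the quoted resolution only mildly.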



Figure 5: *Temporal residual of all 3068 ECAL2 channels after applying time shifts. A channel threshold of 10 ADC channels is used. The residual is fitted with a double Gaussian and a constant background.*

#### *C. Monitoring*

Since the quality of the recorded data depends on the quality of the trigger, there are several mechanisms foreseen to monitor the digital calorimetric trigger.

#### Monitoring of calibration

Energy and time calibrations, which are loaded into the FPGAs at runtime, are monitored and updated using online data processing. This task is addressed with *Cinderella*, the online filter of the COMPASS experiment, which is part of the readout system and runs on a computer farm at the experimental site [3]. The energy calibration is monitored using LED pulses, which are injected into the calorimeter, while time calibrations are extracted by comparing signal times, obtained by pulse shape analysis, to the measured trigger time.

#### VME registers

Several VME registers, which are read out and written to a database once per spill, are utilized for online error detection. These registers include pedestals, which are updated upon each spill, and scalers for each individual channel. Comparing these with reference values provides online information about instabilities of the readout system and failing hardware.

#### Encode CFD information in the data stream

To provide more information for offline and online analysis, the results of the CFD trigger, i.e. signal time and amplitude, are encoded into the data stream, which is written to tape. Comparison of the parameters from the FPGA based CFD with CPU based pulse shape analysis allows misbehavior of the hardware to be detected. This task is addressed with the *Cinderella* online filter.

#### IV. INTEGRATION INTO THE TRIGGER SYSTEM

To form the trigger decision in the FPGAs, a time of 500 ns is required. Signal generation, conversion and transport, as well as the time of flight of a particle from the target to the calorimeter, add another 500 ns, which increases the latency of the digital trigger to $\approx 1\,\mu s$. Since the analogue triggers in COMPASS have a latency of 500 ns, they have to be delayed by an additional $0.5\,\mu s$, which in 2009 is achieved by adding delay cables. For the future, however, a digital solution based on FPGAs is planned.
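The latency budget above amounts to the following arithmetic, written out in Python. The three latencies are taken from the text; the cable propagation delay of 5 ns per metre is an assumption (a typical value for coaxial cable) used only to give a feeling for the cable length involved.

```python
FPGA_DECISION_NS = 500           # trigger decision in the FPGAs
SIGNAL_PATH_NS = 500             # signal generation, conversion, transport, time of flight
ANALOG_TRIGGER_LATENCY_NS = 500  # existing analogue triggers
CABLE_DELAY_NS_PER_M = 5.0       # ASSUMPTION: typical coaxial cable delay

digital_latency_ns = FPGA_DECISION_NS + SIGNAL_PATH_NS
extra_delay_ns = digital_latency_ns - ANALOG_TRIGGER_LATENCY_NS
cable_length_m = extra_delay_ns / CABLE_DELAY_NS_PER_M
print(digital_latency_ns, extra_delay_ns, cable_length_m)  # 1000 500 100.0
```

Under this assumption the additional 500 ns corresponds to roughly 100 m of cable per delayed signal, which makes the interest in a digital delay evident.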

#### **REFERENCES**

- [1] COMPASS COLLABORATION, *Nucl. Instr. Meth. A* 577, 455 (2007).
- [2] A. MANN, A Versatile Sampling ADC System for On-Detector Applications and the AdvancedTCA Crate Standard, 15th IEEE-NPSS Real-Time Conference, 2007.
- [3] T. NAGEL, Cinderella: an Online Filter for the COMPASS Experiment, diploma thesis, E18 TU Munich, 2005.

## The Level 0 Trigger Decision Unit for the LHCb experiment

H. Chanal, O. Deschamps, R. Lefèvre, M. Magne, P. Perret

Laboratoire de Physique Corpusculaire de Clermont Ferrand IN2P3/CNRS 24 avenue des Landais 63177 Aubière

chanal@clermont.in2p3.fr

#### *Abstract*

The Level 0 Decision Unit (L0DU) is one of the main components of the first trigger level (named level 0) of the LHCb experiment. This 16 layers custom board receives data from the calorimeter, muon and pile-up sub-triggers and computes the level 0 decision, reducing the rate from 40MHz to 1MHz. The processing is implemented in FPGA using a 40MHz synchronous pipelined architecture. The L0DU algorithm is fully configured via the Experiment Control System without any firmware reprogramming. An overall L0DU latency of less than 450ns has been achieved. The board was installed in the experimental area in April 2007 and since then has played a major role in the commissioning of the experiment.

#### I. INTRODUCTION

The LHCb experiment [1] is dedicated to b physics. It is installed at one interaction point of the Large Hadron Collider (LHC) at CERN. It is designed to exploit the large number of $b\bar{b}$ pairs produced in pp interactions at $\sqrt{s} = 14$ TeV at the LHC, in order to perform precise measurements and to search for new physics in CP asymmetries and rare decays in b-hadron systems. As the $b$ and $\bar{b}$ are produced at small angles and correlated, the detector has been designed as a single-arm spectrometer. Figure 1 shows the layout of the experiment. The Vertex Locator and the tracking system (TT, T1-T3) provide very good vertexing and tracking capabilities, while excellent particle identification is achieved thanks to two ring imaging Cherenkov detectors (RICH1 and RICH2), to the calorimeters (SPD/PS, ECAL and HCAL) and to five muon stations (M1-M5).



Figure 1: the LHCb detector

The interesting b decays account for a very small fraction of the 10MHz of visible interactions (around 1Hz for a branching ratio of  $10^{-4}$ ). In order to get an accurate selection of the events, a high performance versatile trigger has been developed.

This contribution will first introduce briefly the LHCb trigger with an emphasis on the level 0. The level 0 decision unit (L0DU) board and its internal processing will then be presented. The last part will finally focus on the project from the first prototype to the first data.

#### II. OVERVIEW OF LHCB TRIGGER

The whole LHCb detector runs with a 40MHz clock. It is not possible to store the data at such a high rate, and most events are useless for the physics analysis (for example when there is no collision). Figure 2 presents the two-stage trigger [2] which has been developed in order to reduce the rate from 40MHz to 2kHz for persistent storage. The first level is based on custom electronics and has to reduce the rate to 1MHz with a fixed latency of $4\mu s$. The second level (HLT) runs on a cluster of about 2000 PCs, which further reduces the rate to 2kHz. A lot of flexibility and good performance are needed at both stages.
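The numbers above fix the reduction factor of each trigger stage and the depth of the front-end pipelines that must hold the data during the L0 latency. The Python lines below make this arithmetic explicit; the rates and latencies are those quoted in the text and in the abstract, and the variable names are chosen for this example.

```python
BUNCH_CLOCK_HZ = 40_000_000
L0_OUTPUT_HZ = 1_000_000
HLT_OUTPUT_HZ = 2_000
CLOCK_PERIOD_NS = 25       # one 40 MHz clock cycle
L0_LATENCY_NS = 4_000      # fixed L0 latency of 4 us
L0DU_LATENCY_NS = 450      # upper bound on the L0DU latency (abstract)

l0_reduction = BUNCH_CLOCK_HZ // L0_OUTPUT_HZ
hlt_reduction = L0_OUTPUT_HZ // HLT_OUTPUT_HZ
pipeline_depth = L0_LATENCY_NS // CLOCK_PERIOD_NS
l0du_cycles = L0DU_LATENCY_NS // CLOCK_PERIOD_NS
print(l0_reduction, hlt_reduction, pipeline_depth, l0du_cycles)  # 40 500 160 18
```

The fixed latency corresponds to 160 bunch crossings, of which the L0DU itself uses at most 18 clock cycles.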



Figure 2: the LHCb trigger

#### III. THE L0 TRIGGER

Only the fastest sub-detectors can take part in the L0 event selection:

- The pile-up trigger which sends the reconstructed primary vertices, so that events with more than one interaction can be removed;
- The calorimeter trigger which sends the highest-$E_T$ $\gamma$, electron, $\pi^0$ and hadron candidates as well as the $\sum E_T$ and the SPD multiplicity;
- The muon trigger which sends the two highest  $p_T$  muons per quadrant.


Figure 3: the L0DU mezzanine

The data are merged and processed in the L0DU. If the data fulfil a set of simple conditions, the L0DU issues a validation signal, which is sent to the Readout Supervisor (RS) [3][4], from where it can be broadcast (L0Accept signal) to the whole experiment. The data from the whole detector are then sent to the HLT farm for the processing of the next trigger level. To ensure a high flexibility of the L0 trigger, the conditions to be used in the L0DU are fully configurable.

#### IV. L0DU ARCHITECTURE

#### *A. TELL1*

As shown in Figure 4, the L0DU is a mezzanine of the TELL1 board [5][6]. The TELL1 has been designed for the LHCb experiment to handle the DAQ output and the common part of the interfaces. The data is sent to the DAQ via a Gigabit Ethernet mezzanine. The TELL1 also provides Experiment Control System (ECS) access using a small embedded Credit Card PC (CCPC) running a Linux system connected with standard Ethernet.

In our case, the TELL1 provides remote access to the board registers through a dedicated I<sup>2</sup>C bus and allows the 3 L0DU FPGAs to be reprogrammed remotely through a JTAG bus.

#### *B. The L0DU mezzanine*

Figure 3 shows the L0DU mezzanine. It is a 16-layer 9U board. It relies on three FPGAs. The smaller one (EP1S10 from Altera) is used to access the registers and for synchronization tasks, while the two bigger ones (EP1S40 from Altera) do the processing. One of the processing FPGAs (FPGA1) deals with the calorimeter and pile-up inputs. The other (FPGA2) copes with the muon trigger inputs. The core of the L0 algorithm is executed on FPGA1, where all the relevant data are centralized. FPGA1 is the more heavily used of the two processing FPGAs: more than 65% of its logic resources are used.



Figure 4: the L0DU mezzanine plugged on a TELL1 board



Figure 5: functional schematic of the L0DU

The optical part is composed of 24 deserializers (TLK2501 from Texas Instruments) and two optical transceivers (HFBR-782BE from Agilent). It allows the connection of two fiber ribbons of 12 optical fibers each. 7 single optical fibers are used by the calorimeter trigger, 2 by the pile-up trigger and 8 by the muon trigger. There are 7 spares. The links between the optical transceivers and the deserializers run at 1.6GHz. Between the deserializers and the FPGAs, a 384-bit data bus runs at 80MHz. This part of the PCB required special care and an accurate simulation with the SpectraQuest software from Cadence.
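The link counts and bus width quoted above are mutually consistent, as the following Python check shows. The fiber allocation, the number of deserializers and the 80MHz word rate come from the text, and the 16-bit word width from the description of the time alignment. The 8b/10b line coding used to relate the payload rate to the 1.6GHz line rate is an assumption about the serializer family, and the variable names are chosen for this example.

```python
FIBRES = {"calorimeter": 7, "pile-up": 2, "muon": 8, "spare": 7}
N_DESERIALIZERS = 24
WORD_BITS = 16               # width of one deserialized word
WORD_RATE_HZ = 80_000_000    # 80 MHz word clock

total_fibres = sum(FIBRES.values())
bus_width_bits = N_DESERIALIZERS * WORD_BITS
payload_bps = WORD_BITS * WORD_RATE_HZ
# ASSUMPTION: 8b/10b line coding, i.e. 10 line bits per 8 payload bits.
line_rate_bps = payload_bps * 10 // 8
used_links = total_fibres - FIBRES["spare"]
print(total_fibres, bus_width_bits, line_rate_bps, used_links * payload_bps)
# 24 384 1600000000 21760000000
```

Under this assumption each link carries 1.28 Gbit/s of payload, and the 17 links in use deliver about 21.8 Gbit/s to the processing FPGAs.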

The clock and the L0Accept signals are received by an embedded TTC mezzanine [7]. The clock is broadcast to the three FPGAs using a dedicated LVDS network, while the synchronization signals are sent to the control FPGA, where they are processed and forwarded to the two processing FPGAs.

The L0DU is linked to the TELL1 using 200-pin connectors. Only two of the four processing FPGAs of the TELL1 are used, as the data is sent by the two L0DU processing FPGAs.

#### V. L0DU PROCESSING

The processing done in FPGA1 and FPGA2 can be decomposed into several blocks, as shown in Figure 5. First the data coming from the sub-trigger systems are treated by a pre-processing block, which includes the time alignment and some data sorting. The decision-making block then computes the decision and constitutes the core of the L0 algorithm. It is highly configurable and can easily be changed remotely without any change in the FPGA firmware.

#### VI. PRE-PROCESSING

#### *A. Time alignment*

The time alignment can be decomposed into two independent parts. In the first stage, the data coming at 80MHz from the 24 optical deserializers have to be demultiplexed and put in the same 40MHz clock domain. In our case, we have many incoming clocks (24), which cannot be routed in our FPGA clock networks. Figure 6 presents the acquisition principle. We use a single 160MHz internal clock to acquire the incoming data from the sub-detector inputs at both rising and falling edges. A multiplexer allows the selection of a given edge. In a second step, the 16-bit LSB and MSB words are acquired using two enable signals, one being delayed by 12.5ns with respect to the other. The result is then resynchronized with the local 40MHz clock.
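Functionally, the demultiplexing step turns two consecutive 16-bit words received at 80MHz into one 32-bit word in the 40MHz domain. The Python sketch below shows only this data-level behaviour, not the clocking scheme; the function name and the choice of which word arrives first are assumptions made for illustration.

```python
def demultiplex(words16, lsb_first=True):
    """Pair consecutive 16-bit words (80 MHz) into 32-bit words (40 MHz).

    lsb_first is an ASSUMPTION about the order of the two halves on the link.
    """
    if len(words16) % 2:
        raise ValueError("an even number of 16-bit words is required")
    out = []
    for k in range(0, len(words16), 2):
        first, second = words16[k], words16[k + 1]
        lsb, msb = (first, second) if lsb_first else (second, first)
        out.append(((msb & 0xFFFF) << 16) | (lsb & 0xFFFF))
    return out


words = demultiplex([0x1234, 0xABCD, 0x0001, 0x0002])
print([hex(w) for w in words])  # ['0xabcd1234', '0x20001']
```

In the firmware the two enable signals separated by 12.5ns play the role of the `first` and `second` selection in this sketch.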

To get the right phase for the acquisition of the incoming data, the corresponding clock is sampled in steps of 3.125ns with the local 160MHz clock (using both edges). Each value is digitized 256 times and averaged to get an accurate representation of the incoming clock cycle. According to the digitization results, a LUT allows an automatic choice of the edge and of the enable signal used in the data acquisition module.

In the second stage, all the sub-detector data are aligned on the same event. 24 dual-port RAMs with a depth of 256 are used to introduce the necessary configurable delays.



Figure 6: clock synchronization principle

#### *B. Data sorting*

In FPGA2, the muons are sorted using a merge sort algorithm in three steps. It allows the processing of the data in one clock cycle using simple comparators and an associated multiplexer. Only the three highest-$p_T$ muons are sent to FPGA1.

#### VII. DECISION BUILDING

The decision building flow is given in Figure 7.

#### *A. Compound data*

Compound data are created by combining the sub-trigger inputs. A compound datum can be either the sum or the difference of two elementary data, such as the $E_T$ of two calorimeter candidates or the $p_T$ of two muon candidates, or a mask applied to the address of a candidate. 36 pre-synthesized blocks, each containing one of these operations, are available in the latest L0DU firmware.



Figure 7: decision building flow

An 8-bit-wide, 3564-deep RAM has been introduced so that different L0 conditions can, if needed, be applied according to the position of the event within the LHC cycle.

#### *B. Elementary conditions*

An elementary condition block has been designed to define simple cuts on sub-trigger data. Each block is the combination of a data input, an operator  $(\geq, \leq, =, \neq)$ , and a threshold. 128 elementary condition blocks are available in the L0DU.

#### *C. Trigger channel*

Elementary conditions are combined in so-called trigger channels, each trigger channel being an "and" of any of the 128 elementary conditions. Up to 32 trigger channels can be defined.

#### *D. Decision*

The L0 decision is defined as an "or" of any of the 32 trigger channels.
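The three levels described above (elementary conditions, trigger channels, decision) can be summarized in a few lines of code. This is a minimal sketch of the logic only, not of the firmware: the data names (`sum_et`, `muon1_pt`, `dimuon_pt`), the thresholds, and the function names are hypothetical.

```python
import operator

# The four comparison operators available to an elementary condition.
OPS = {">=": operator.ge, "<=": operator.le,
       "==": operator.eq, "!=": operator.ne}

MAX_CONDITIONS = 128  # elementary condition blocks available in the L0DU
MAX_CHANNELS = 32     # trigger channels that can be defined


def l0_decision(data, conditions, channels):
    """Return (decision, fired channels) for one bunch crossing.

    conditions: list of (data name, operator, threshold)
    channels:   list of lists of indices into `conditions`
    """
    if len(conditions) > MAX_CONDITIONS or len(channels) > MAX_CHANNELS:
        raise ValueError("too many conditions or channels")
    if any(len(ch) == 0 for ch in channels):
        raise ValueError("a trigger channel needs at least one condition")
    cond = [OPS[op](data[name], thr) for name, op, thr in conditions]
    # A trigger channel is the "and" of its elementary conditions.
    fired = [all(cond[i] for i in ch) for ch in channels]
    # The L0 decision is the "or" of the trigger channels.
    return any(fired), fired


data = {"sum_et": 250, "muon1_pt": 30, "dimuon_pt": 45}
conditions = [("sum_et", ">=", 200),
              ("muon1_pt", ">=", 40),
              ("dimuon_pt", ">=", 40)]
channels = [[0, 1], [0, 2]]
decision, fired = l0_decision(data, conditions, channels)
```

In this example the first channel fails on the single-muon cut while the second passes, so the overall decision is positive.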

#### *E. Special trigger bit*

Two flags are implemented:

- A force trigger bit (FTB), which indicates a problem in the data time alignment or an error in the L0 processing, either from the L0DU itself or from a sub-trigger. This flag may be used to force the storage on disk of the event for further analysis.
- A timing trigger bit (TTB), which is used to flag special sequences in a time window of  $\pm 2$  bunch crossings around the current bunch crossing, for instance to get isolated events. Two modes are possible. The first bases the flag on the L0 decisions obtained for any of the 5 bunch crossings. The second looks at the results, for any of the 5 bunch crossings, of simple comparisons of the  $\sum E_T$  input coming from the calorimeter with two programmable thresholds.

# VIII. TEST BENCH

We developed a dedicated test bench at the laboratory in order to test the firmware and stress the various links, such as the optical fibers. A block diagram of the test bench is shown in Figure 8.



Figure 8: the L0DU test bench

This test bench uses two PCs to control the boards: the first one runs the user interface, while the second one provides access to the various registers and handles the booting of the CCPC. An RS board broadcasts the clock and the synchronization signals to the whole test bench. A specifically developed board, the GPL (Figure 9), is used to send known patterns in optical format and to receive the decision via a 16-bit LVDS link.



Figure 9: the GPL board

## IX. PROTOTYPING AND COMMISSIONING

Three versions of the board have been designed. The first prototype [8] was produced in 2001 to validate the various concepts. The second prototype [9] was built in 2005 and had all the required functionalities. To cope with the expected algorithmic flexibility, the FPGA size was increased in the final boards, which were received in 2007.

The final L0DU board was installed in the experimental area and connected to the RS in February 2007. The first combined tests were made with the calorimeter trigger in April 2007. The calorimeter system together with the L0DU triggered on their first cosmics at the end of 2007. Cosmics involving both the muon trigger and the calorimeter trigger were recorded in April 2008. The L0 system then provided the triggers on the first beam-induced particles in August 2008. Lastly, the pile-up joined the L0 trigger path in December 2008.

# X. CONCLUSION

A very flexible L0 trigger board has been developed. It has been in use in the experimental area since 2007. Specific algorithms have been extensively used to commission the detectors of LHCb and to record millions of cosmics and even thousands of VELO tracks during tests of the transfer line from the SPS to the LHC.

The L0DU board is ready for the beam restart in November 2009.

#### **REFERENCES**

- [1] The LHCb Collaboration, The LHCb detector at the LHC, JINST 3 (2008) S08005
- [2] The LHCb Collaboration, Trigger Technical Design Report, CERN-LHCC/2003-31 (2003)
- [3] R. Jacobsson et al., Readout Supervisor Design Specifications, LHCb Note 2001-12 (2001)
- [4] R. Jacobsson et al., Timing and Fast Control, LHCb Note 2001-16 (2001)
- [5] G. Haefeli, Contribution to the development of the acquisition electronics for the LHCb experiment, PhD Thesis, Ecole Polytechnique Federale de Lausanne, 2004
- [6] G. Haefeli et al., The LHCb DAQ interface board TELL1, Nucl. Instr. and Meth. A 560, 494 (2006)
- [7] J. Christiansen et al., TTCrx Reference Manual, CERN-EP/MIC (http://ttc.web.cern.ch/TTC/TTCrx manual3.10.pdf)
- [8] R. Cornat, Conception et réalisation de l'électronique frontale du détecteur de pied de gerbe et de l'unité de décision du système du premier niveau de déclenchement de l'expérience LHCb, PhD Thesis, Université Blaise Pascal Clermont-Ferrand 2, 2002
- [9] J. Laubser, Conception et réalisation de l'unité de décision du système de déclenchement de premier niveau du détecteur LHCb au LHC, PhD Thesis, Université Blaise Pascal Clermont-Ferrand 2, 2007

# Performance of the CMS Regional Calorimeter Trigger

P. Klabbers, M. Bachtis, S. Dasu, J. Efron, R. Fobes, T. Gorski, K. Grogg, M. Grothe, C. Lazaridis, J. Leonard, A. Savin, W.H. Smith, M. Weinberg

University of Wisconsin Madison, Madison, WI, USA

pamc@hep.wisc.edu

#### *Abstract*

The CMS Regional Calorimeter Trigger (RCT) receives eight-bit energies and a data quality bit from the HCAL and ECAL Trigger Primitive Generators (TPGs). The RCT uses these trigger primitives to find e/γ candidates and calculate regional calorimeter sums that are sent to the Global Calorimeter Trigger (GCT) for sorting and further processing. The RCT hardware consists of one clock distribution crate and 18 double-sided crates containing custom boards, ASICs, and backplanes. The RCT electronics have been completely installed since 2007.

The RCT has been integrated into the CMS Level-1 Trigger chain. Regular runs, triggering on cosmic rays, prepare the CMS detector for the restart of the LHC. During this running, the RCT control is handled centrally by the CMS Run Control and Monitor System communicating with the Trigger Supervisor. Online Data Quality Monitoring (DQM) evaluates the performance of the RCT during these runs. Offline DQM allows more detailed studies, including trigger efficiencies. These and other results from cosmic-ray data taking with the RCT will be presented.

#### I. INTRODUCTION

The Compact Muon Solenoid (CMS) is a general-purpose detector operating at the Large Hadron Collider (LHC). It was commissioned at the European Laboratory for Particle Physics (CERN) near Geneva, Switzerland. This large detector is sensitive to a wide range of new physics at the high proton-proton center-of-mass energy $\sqrt{s} = 14$ TeV [1]. First beam was seen in September 2008 [2].

At the LHC design luminosity of $10^{34}$ cm$^{-2}$ s$^{-1}$, a beam crossing every 25 ns contains on average 17.3 events. These $10^9$ interactions per second must be reduced by a factor of $10^7$ to 100 Hz, the maximum rate that can be archived by the on-line computer farm. This will be done in two steps. The level-1 trigger first reduces the rate to 75 kHz, and then a High Level Trigger (HLT), using an online computer farm, handles the remaining rate reduction.
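The rate figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers from the text; the function name and dictionary keys are hypothetical, and the split of the total reduction into a level-1 factor and an HLT factor is derived here rather than quoted from the paper.

```python
def trigger_rates():
    """Cross-check the quoted interaction and trigger rates (all in Hz)."""
    crossing_rate = 1 / 25e-9  # one beam crossing every 25 ns
    mean_events = 17.3         # average events per crossing
    interaction_rate = 1e9     # order of magnitude quoted in the text
    level1_rate = 75e3         # level-1 output rate
    storage_rate = 100.0       # rate that can be archived

    return {
        # 40 MHz x 17.3 is about 7e8, i.e. of order 1e9 interactions per second
        "raw_rate_hz": crossing_rate * mean_events,
        "total_reduction": interaction_rate / storage_rate,
        "level1_reduction": interaction_rate / level1_rate,
        "hlt_reduction": level1_rate / storage_rate,
    }


rates = trigger_rates()
```

Under these numbers the level-1 trigger has to provide a rejection of roughly $10^4$, leaving a factor of 750 to the HLT.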

The CMS level-1 electron/photon, τ-lepton, jet, and missing transverse energy trigger decisions are based on input from the level-1 Regional Calorimeter Trigger (RCT) [3]. The RCT plays an integral role in the reduction of the proton-proton interaction rate ($10^9$ Hz) to the High Level Trigger input rate ($10^5$ Hz) while separating physics signals from background with high efficiency. The RCT receives input from the brass and scintillator CMS hadron calorimeter (HCAL) and the PbWO$_4$ crystal electromagnetic calorimeter (ECAL), which extend to $|\eta| = 3$. An additional hadron calorimeter in the very forward region (HF) extends coverage to $|\eta| = 5$. A calorimeter trigger tower is defined as 5x5 crystals in the ECAL, of dimensions $0.087 \times 0.087$ ($\Delta\phi \times \Delta\eta$), which corresponds 1:1 to the physical tower size of the HCAL.

## II. RCT HARDWARE

## *A. PRIMARY RCT CARDS*

Eighteen crates of RCT electronics process data for the barrel, endcap, and forward calorimeters. There is another crate for LHC clock distribution. These are housed in the CMS underground counting room adjacent to and shielded from the underground experimental area.

Twenty-four bits comprising two 8-bit calorimeter energies, two energy characterization bits, an LHC bunch crossing bit, and 5 bits of error detection code are sent from the ECAL, HCAL, and HF calorimeter electronics to the nearby RCT racks on 1.2 Gbaud copper links. This is done using one of the four 24-bit channels of the Vitesse 7216-1 serial transceiver chip on calorimeter output and RCT input, for 8 channels of calorimeter data per chip. The RCT V7216-1 chips are mounted on mezzanine cards located on each of 7 Receiver Cards and the single Jet/Summary Card for all 18 RCT crates. The eight mezzanine cards on the Receiver Cards are for the HCAL and ECAL data and the single mezzanine card located on the Jet/Summary Card is for receiving the HF data. The V7216-1 converts the 1.2 Gbaud serial data to 120 MHz TTL parallel data, which is then deskewed, linearized, and summed before transmission on a 160 MHz ECL custom backplane to 7 Electron Isolation Cards and one Jet/Summary Card. The Jet/Summary Card receives the HF data and sends the regional  $E_T$  sums and the electron candidates to the Global Calorimeter Trigger (GCT). The GCT implements the jet algorithms and forwards the 12 jets to the Global Trigger (GT).
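The bit counts and link speeds quoted above are mutually consistent, as the following back-of-the-envelope check suggests. Two points are assumptions not stated in the text: that one 24-bit word is sent per 25 ns bunch crossing as three 8-bit transfers at 120 MHz, and that the serial link uses 8b/10b encoding (10 line bits per 8 payload bits). The function and key names are hypothetical.

```python
def link_budget():
    """Check the 24-bit word size and the 1.2 Gbaud line rate."""
    fields = {
        "energies": 2 * 8,      # two 8-bit calorimeter energies
        "characterization": 2,  # two energy characterization bits
        "bunch_crossing": 1,    # LHC bunch crossing bit
        "error_code": 5,        # error detection code
    }
    word_bits = sum(fields.values())

    crossing_rate = 40e6                  # assumed: one word per 25 ns crossing
    payload_rate = word_bits * crossing_rate
    parallel_rate = 8 * 120e6             # assumed: 8-bit bus at 120 MHz
    line_rate = payload_rate * 10 / 8     # assumed: 8b/10b line encoding

    return {
        "word_bits": word_bits,
        "payload_rate_bps": payload_rate,
        "parallel_rate_bps": parallel_rate,
        "line_rate_baud": line_rate,
        "towers_per_chip": 4 * 2,         # four channels, two energies each
    }


budget = link_budget()
```

Under these assumptions the payload of 960 Mbit/s matches the 120 MHz parallel interface and gives a 1.2 Gbaud line rate, and the four channels of one chip carry eight calorimeter energies.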

The Receiver Card (shown in Figure 1), in addition to receiving and aligning calorimeter data on copper cables using the V7216-1, shares data on cables between RCT crates. Lookup tables are used to convert the incoming calorimeter energy into several scales and set bits for electron identification. Adder blocks begin the energy summation tree, reducing the data sent to the 160 MHz backplane.



Figure 1: Front of a Receiver Card showing two Receiver Mezzanine Cards in place and Adder ASICs.



Figure 2: Electron Isolation Card showing 4 Sort ASICs (right) and 2 EISO ASICs (left).

The Electron Isolation Card (shown in Figure 2) receives data for 32 central towers and 28 neighboring trigger towers via the backplane. The electron isolation algorithm is implemented in the Electron Isolation ASIC described below. Four electron candidates are transmitted via the backplane to the Jet/Summary (J/S) Card. The electrons are sorted in Sort ASICs on the J/S Card and the top 4 of each type are transmitted to the GCT for further processing. The J/S Card also receives  $E_T$  sums via the backplane, and forwards them and two types of muon identification bits (minimum ionizing and quiet bits, described later) to the GCT. A block diagram of this dataflow is shown in Figure 3.

To implement the algorithms described above, five high-speed custom Vitesse ASICs were designed and manufactured: a Phase ASIC, an Adder ASIC, a Boundary Scan ASIC, a Sort ASIC, and an Electron Isolation ASIC [3]. They were produced in Vitesse FX™ and GLX™ gate arrays utilizing their sub-micron high-integration Gallium Arsenide MESFET technology. Except for the 120 MHz TTL input of the Phase ASIC, all ASIC I/O is 160 MHz ECL.

The Phase ASICs on the Receiver Card align and synchronize the data received on four channels of parallel data from the Vitesse 7216 and check for data transmission errors. The Adder ASICs sum up eight 11-bit energies (including the sign) in 25 ns, while providing bits for overflows. The Boundary Scan ASIC copies and aligns tower energies for e/γ algorithm data sharing and aligns and drives them to the backplane. Four 7-bit electromagnetic energies, a veto bit, and nearest-neighbor energies are handled every 6.25 ns by the Electron Isolation ASICs, which are located on the Electron Isolation Card. Sort ASICs are located on the Electron Isolation Card, where they are used as receivers, and on the J/S Cards, where they sort the e/γ candidates. All these ASICs have been successfully tested on the boards described and procured in the full quantities needed for the system, including spares. The boards described have been produced using these ASICs, and a sufficient quantity has been obtained to fill 18 crates and create a stock of spares.



Figure 3: Dataflow diagram for an RCT crate, showing data received and transferred between cards on the 160 MHz differential ECL backplane. Brief explanations of the card functionality are shown. For more details see the text or ref. [4].



Figure 4: The Master Clock Crate and cards. Central is the CIC, receiving the fibre from the TTC system, and moving outwards, 2 CFCm cards, and 7 CFCc cards.

A Master Clock Crate (MCC) and cards are located in one of the 10 RCT racks to provide clock and control signal distribution (Figure 4). Input to the system is provided by the CMS Trigger Timing and Control (TTC) system [5]. This provides the LHC clock, Bunch Crossing Zero (BC0), and other CMS control signals via an optical fibre from a TTCci (TTC input card), which can internally generate these signals or receive them from either a Local Trigger and Control board (LTC) or the CMS Global Trigger.

The MCC includes a Clock Input Card (CIC) with an LHC TTCrm mezzanine board [5] to receive the TTC clocks and signals via the fibre and set the global alignment of the signals. The CIC feeds fan-out cards, a Clock Fan-out Card Midlevel (CFCm) and a Clock Fan-out Card to Crates (CFCc), to align and distribute the signals to the individual crates via low-skew cable. Adjustable delays on these 2 cards allow fine-tuning of the signals to the individual crates.

# III. INPUT AND OUTPUT OF THE RCT

## *A. Trigger Primitive Generators - Input*

The HCAL Trigger Readout (HTR) Boards and the ECAL Trigger Concentrator Cards (TCCs) provide the input to the RCT using a Serial Link Board (SLB), a mezzanine board with the V7216-1 mounted on it. The SLB is configurable, with two Altera Cyclone® FPGAs for data synchronization at the V7216-1, Hamming code calculation, FIFOs, and histogramming. The clocking for the SLB is separate from the HTR and TCC primary clocking to ensure data alignment at the RCT. The HTR can have up to 6 SLBs and receives data from the front end on fibres into its front panel. The TCC has up to 9 SLBs and also receives front-end data via a fibre to its front panel.
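The 5 bits of error detection code in the 24-bit link word are compatible with a Hamming code over the remaining 19 bits (two 8-bit energies, two characterization bits, and the bunch crossing bit). The sketch below checks this with the standard bound $2^r \geq m + r + 1$ for a single-error-correcting Hamming code with $m$ data bits and $r$ parity bits. That the SLB uses exactly this flavour of Hamming code is an assumption; the text only states the number of code bits. The function name is hypothetical.

```python
def hamming_parity_bits(data_bits):
    """Smallest r such that 2**r >= data_bits + r + 1."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


data_bits = 2 * 8 + 2 + 1  # energies + characterization bits + crossing bit
parity_bits = hamming_parity_bits(data_bits)
word_bits = data_bits + parity_bits
```

For 19 data bits the bound gives 5 parity bits, which fills the 24-bit channel exactly.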

## *B. GCT Source Cards – Output*

Each RCT crate is connected to GCT Source Cards, which convert the parallel ECL output of the RCT to optical, so that it may be sent easily to the lower floor of the underground service cavern where the main GCT crate is located. They are located in the RCT racks, directly above the RCT crates.

# IV. OPERATION AND MONITORING

### *A. Commissioning the RCT at CMS*

Installation of the RCT is complete. The RCT has 10 racks that hold a total of 21 RCT crates, 6 GCT Source Card Crates, and a crate for clock distribution to the SLBs (see Figure 4). The MCC and eighteen of the 20 standard RCT crates are part of the final system. The remaining 2 RCT crates will be used for local testing and storage. Each rack contains a custom monitoring and power distribution system; a description can be found in reference [6].

### *B. RCT Trigger Supervisor*

The Trigger Supervisor (TS) is an online framework to configure, test, operate, and monitor the trigger components and to manage communications between trigger systems [7]. Individual cells are set up for each system, with a central cell interacting with multiple systems at one time using SOAP [8] commands.

The RCT Trigger Supervisor enables system configuration via a pre-defined key. A state machine allows actions to be defined for transitions between states.

For data taking, these states are controlled for all detector subsystems, including the trigger, with CMS Run Control. For internal and interconnection tests, configuration can be done centrally or standalone for a subsystem. Figure 5 shows the panel for RCT trigger key input and the state machine.

If needed, before configuration, large sections of the calorimeter trigger towers or individual trigger tower "slices"  $(1 \eta x 4 \phi)$  can be masked via masking tools included in the TS (one panel is shown in Figure 6). The information is stored in a database and retrieved during the RCT configuration and can be obtained for use in offline analyses.

The RCT Trigger Supervisor also monitors the system status (Figure 7). Link and clock error states are checked and can be masked if needed using a database or flat file. Error history is stored in a database. Alerts and alarms are implemented in an expert mode for now, but a system to send alerts and alarms to CMS Run Control is currently in place.

Another panel (Figure 8) can display a run history for the RCT. This displays a time-ordered list of runs in which the RCT was included. The list includes the trigger key used at the time of configuration, as well as a run settings key including current masks. The run settings key can be used to obtain the list of masked channels for use in offline analyses.



Figure 5: RCT Trigger Supervisor window for programming the RCT based on a pre-defined key (middle). The state machine and defined transitions (shown with arrows) are on the right.



Figure 6: RCT Trigger Supervisor panel for masking trigger tower slices. Masking details are written to a database and retrieved during configuration and also offline for data analysis.

[Screenshot of the RCT Trigger Supervisor monitoring panel, dated Thu Mar 5 19:57:54 2009. The header reports the lock status and the TTC error bit on the Master Clock Crate as OK. Below it, one status grid per RCT crate (Crates 0 to 11 are legible) shows a summary bit and one row of status indicators for each card, RC0 to RC6 and JSC.]
|                    | Craw 12                                                                     |                         |              |                          | Craw 13                                 |                      |              |                                                                | Crate 14                        |                                                  |                          | Crass 15                       |                                                    |                   | Crass 16                      |                                                   |              |                    | Craw 17                           |                      |               |
|                    | Sammary Blc: OK (M)                                                         |                         |              |                          | Summary 83: OK/16                       |                      |              |                                                                | Summary 83: OK/38               |                                                  |                          | Summary Bit: OK/16             |                                                    |                   | Summary Bit: OK/16            |                                                   |              |                    | Sammary Bit: OKOA                 |                      |               |
| Card               | xe.                                                                         | ster.                   | Porte        | Cant                     | <b>COL</b>                              | sing.                | <b>Frank</b> | Cant                                                           | <b>COL</b>                      | also.<br><b>Press</b>                            | Cant                     | <b>CO</b>                      | sing.<br><b>Press</b>                              | Card              | <b>KG</b>                     | sing.                                             | <b>Press</b> | Card               | KO.                               | sing.                | <b>County</b> |
| RCD                | <b>Course I don't like</b>                                                  |                         |              | RCD                      | <b>DOCTOR DOCTOR</b>                    |                      |              | RCD                                                            |                                 | <b>College Lincoln Direct</b>                    | RCD                      |                                | <b>College Lincoln Direct</b>                      | RCD               | <b>CONTRACTOR</b>             |                                                   |              | RCO                | <b>COLLEGE DOG</b>                |                      |               |
| RC1                |                                                                             | or in the filling.      |              | <b>RCL</b>               | <b>CONTRACTOR</b>                       |                      |              | <b>RC1</b>                                                     |                                 | <b>CONTRACTOR</b>                                | <b>RCL</b>               |                                | <b>CALL CALL DOC</b>                               | <b>RC1</b>        |                               | <b>CALL CALL OF</b>                               |              | RC1                | <b>COLLEGE DOG</b>                |                      |               |
| RC <sub>2</sub>    | <b>CK CK CK CK</b>                                                          |                         |              | RC <sub>2</sub>          | <b>DOM: DOM: DOM:</b>                   |                      |              | RC <sub>2</sub>                                                |                                 | <b>COLLECTED</b>                                 | RC <sub>2</sub>          |                                | <b>COMPUTER COMPUTER</b>                           | RC <sub>2</sub>   |                               | <b>Collin I</b> collineed                         |              | RCZ                | <b>COLLEGE DOG</b>                |                      |               |
| <b>RCS</b>         | <b>STATE CARD LOCAL</b>                                                     |                         |              | BC3                      | <b>CONTRACTOR</b> COMPANY               |                      |              | BC3                                                            |                                 | <b>CONTRACTOR</b> COMPANY                        | BC3                      |                                | <b>CONTRACTOR</b>                                  | BC3               |                               | <b>CONTRACTOR</b>                                 |              |                    | <b>BOX MONEY LINE ROOM</b>        |                      |               |
| RC4                | <b>TOONE LOCIN LOCAL</b>                                                    |                         |              | <b>RC4</b>               | <b>DOM: DOM: DOM:</b>                   |                      |              | <b>RC4</b>                                                     |                                 | OCON DON'T DOW                                   | RCA                      |                                | DON'T LOCKET DON                                   | RCA               |                               | OCON COLUMN DOOR                                  |              | RCA                | DOM: DOM: 10000                   |                      |               |
| RCS                | <b>DOM: LOOKE BOOK</b>                                                      |                         |              | 0<                       | OLDER DEPT TODAY                        |                      |              | <b>BCS</b>                                                     |                                 | Other Liberary Tother                            | <b>RCS</b>               |                                | DOWN TOOM TOOM                                     | RCS               |                               | DOM: TOOM DOOR                                    |              | RC5<br>ROS         | <b>DOM: DOM: DOM</b>              |                      |               |
| <b>RCG</b>         | <b>CONTRACTOR</b>                                                           |                         |              | <b>RCF</b>               |                                         | OUN LOCKE DON        |              | <b>RCF</b>                                                     |                                 | OCOE! EDCINE OCOE                                | <b>RCG</b>               |                                | DOM: LOCKE DOM:                                    | <b>RCF</b>        |                               | OCHE COIN DOM:                                    |              |                    |                                   | <b>DOM: DOM: DOM</b> |               |

Figure 7: RCT Trigger Supervisor window for monitoring the RCT links and clock status. Problems appear highlighted in red, and are on a per-card basis. Holding the pointer over a specific error type provides information about which link is in error.



Figure 8: RCT Trigger Supervisor run history panel, listing the key used during the run and a run-settings key used to obtain masking information from the database.

## *C. RCT Intercrate Tests*

The RCT is able to cycle the addresses of its LUTs on the Receiver and Jet/Summary Cards to emulate up to 64 LHC bunch crossings. To debug the internal connections of the RCT, all 18 crates are programmed and the GCT Source cards are used to capture the output.

A pattern is chosen and written to the LUTs, and the output is captured. The same pattern is also fed to the Trigger Emulator (next section); the predicted output is compared with the captured output and any errors are logged.

The bulk of the tests done so far have been internal, testing the timing of data sharing in and between the RCT crates. Patterns like walking zeros and ones, random, and simulated data were used. A number of small problems were found and fixed, and the timing was refined.
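The pattern-and-compare cycle described above can be sketched in a few lines of Python. This is only an illustration: the function names `walking_ones`, `walking_zeros` and `compare_outputs`, the 8-bit word width, and the list-of-integers data format are assumptions made for the example, not the interface of the RCT test software.

```python
def walking_ones(width):
    """Return `width` words, each with a single bit set, moving from LSB to MSB."""
    return [1 << bit for bit in range(width)]


def walking_zeros(width):
    """Return `width` words, each with a single bit cleared."""
    mask = (1 << width) - 1
    return [mask ^ (1 << bit) for bit in range(width)]


def compare_outputs(predicted, captured):
    """Compare the emulator prediction with the captured hardware output.

    Returns one (bunch-crossing index, predicted, captured) tuple per mismatch.
    """
    if len(predicted) != len(captured):
        raise ValueError("predicted and captured sequences differ in length")
    return [
        (bx, p, c)
        for bx, (p, c) in enumerate(zip(predicted, captured))
        if p != c
    ]


# Hypothetical example: 8-bit words, with bit 0 stuck high in bunch crossing 3.
pattern = walking_ones(8)
captured = list(pattern)
captured[3] |= 0x01
errors = compare_outputs(pattern, captured)
for bx, p, c in errors:
    print(f"BX {bx}: predicted 0x{p:02x}, captured 0x{c:02x}")
```

Logging the bunch-crossing index alongside both words is one way to make a timing slip (every word offset by one crossing) easy to tell apart from a single bad bit.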

Currently this is a stand-alone program, but it will be integrated into the Trigger Supervisor at the RCT level and centrally. Expansion of these tests to use the pattern capability of the HTR and TCC boards to test the links is also underway.

## *D. Trigger Emulator*

The trigger emulator is a software package designed to reproduce the hardware response of the trigger exactly. It replicates all of the on-board logic including all configurable options such as hardware registers and Look Up Tables (LUTs). It is used for hardware validation and monitoring.

The trigger emulator is versatile and can use either real data or pattern files to predict output. The files used by the HCAL and ECAL can be used as input to their TPG pattern generators, and files of data captured as output by the RCT, GCT, and GT can be compared directly. In this way errors are tracked down in the software, hardware, and firmware. Conversely, the algorithms can be validated by injecting physics patterns into the hardware pattern generators and verifying the output. Additionally, running the emulator with input from the HCAL and ECAL TPG emulators generates the RCT LUTs. These are saved to files and written to the physical LUTs via the Trigger Supervisor during configuration.

## V. DATA TAKING AND VALIDATION

## *A. Global Runs*

In order to integrate the detectors, trigger, and data acquisition, and to be ready for data taking when the beam restarts, there have been a series of "Global Runs" with most of the CMS detector included. So as not to interfere with the ongoing commissioning of CMS, these were confined to designated periods lasting from a few days to more than a month. Recently there was a month-long run with the CMS magnet at 4 T: CRAFT09 (Cosmic Run at Four Tesla 2009). The goal of this run was reached, and 300 million muon triggers were collected with the full detector. During this period, over 400 million calorimeter triggers were taken as well.

Various subsystems participated in the early runs, depending on their commissioning status, but by the CRAFT09 run all subsystems participated. The RCT took part with the HCAL and ECAL providing TPGs, the GCT receiving all RCT output, and the final decision made at the Global Trigger. The flexibility of the RCT LUTs allowed part or all of a calorimeter to be left out of a run if needed, while still accumulating calorimeter triggers. Separate keys for the Trigger Supervisor were created for these different LUT configurations. Data was studied offline and later checked online to validate algorithms and detect any problems (next section).

## *B. RCT Data Quality Monitoring*

#### *1) Online Data Quality Monitoring (DQM)*

In order to monitor the RCT as data is taken, real-time histograms are created and filled in the CMS High-Level-Trigger filter farm during data taking at a rate of about 10 Hz. A small set of selected histograms allows the shift crews to see if any problems have arisen. These include data validity checks with the emulator and comparisons to reference histograms that are highlighted if in error. One can also retrieve older runs with the same tool. A screen shot of the L1 Trigger summary page is shown in Figure 9, showing muon and calorimeter trigger plots.



Figure 9: Online L1 Trigger DQM summary page for a recent run. Muon trigger system plots are shown as well. Calorimeter triggering was with the ECAL barrel and entire HCAL.

#### *2) Offline DQM*

For more detailed analysis of the RCT performance, offline DQM is very valuable. Access to a greater number of events is possible, and more histograms and a data array are stored for more detailed analysis. Raw data from recent runs is available within hours on the mass storage systems and can be analyzed promptly.

The trigger emulator is fed the TPGs from the data and the RCT response is predicted, providing efficiencies at the RCT region level (Figure 10 and Figure 11). Plots of energy distributions and additional one-dimensional plots are able to show subtle differences and problems with triggering thresholds. In this way problems can be traced back to the hardware that caused them.
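As an illustration of how a per-region efficiency map could be built from such a comparison, the sketch below counts, for each RCT region, how often the observed value agrees with the emulated one. The definition of efficiency used here (exact $E_T$ agreement in regions where the emulator predicts a value), the function name `region_efficiency`, and the dictionary-based event format are assumptions for the example and are not taken from the CMS DQM code.

```python
from collections import defaultdict


def region_efficiency(events):
    """Fraction of emulator-predicted regions that the hardware reproduced.

    `events` is an iterable of (emulated, observed) pairs, each a dict that
    maps an (eta_index, phi_index) region to its E_T in hardware units.
    """
    expected = defaultdict(int)
    matched = defaultdict(int)
    for emulated, observed in events:
        for region, et in emulated.items():
            expected[region] += 1
            if observed.get(region) == et:   # require exact E_T agreement
                matched[region] += 1
    return {region: matched[region] / expected[region] for region in expected}


# Hypothetical two-event sample: region (10, 3) disagrees in the second event.
sample = [
    ({(10, 3): 12, (11, 3): 7}, {(10, 3): 12, (11, 3): 7}),
    ({(10, 3): 20, (11, 3): 5}, {(10, 3): 18, (11, 3): 5}),
]
efficiency = region_efficiency(sample)
print(efficiency)
```

A region whose efficiency stays well below one across many events points back to a specific card or link, which is the kind of localisation the text describes.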



Figure 10: Isolated e/γ candidate efficiency for ECAL barrel as a function of eta index (horizontal,  $\eta$ =0 at 10 and 11 boundary) and phi index (vertical) of the RCT regions.



Figure 11: RCT region efficiency with  $E_T$  matching, same coordinate system as for Figure 10. Minor inefficiencies are light green and under investigation. The red blocks were due to a swapped fiber.

## *C. RCT Performance*

During CRAFT09 the RCT was operated 24 hours a day, 7 days a week. During this period the RCT was configured repeatedly. The RCT consists of 18 crates, each with over $20 \times 2^{17}$ locations in the LUTs, and no configuration errors occurred due to RCT hardware problems. There were occasional software-related problems, but new versions of the software packages and bug fixes have addressed these.
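To give a sense of scale, the figures quoted above imply a lower bound on the number of LUT locations written in each full configuration (treating the quoted per-crate figure as exact for the purpose of the estimate):

$$
18 \times 20 \times 2^{17} = 18 \times 2\,621\,440 = 47\,185\,920 \approx 4.7 \times 10^{7}
$$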

The RCT performance was monitored online and offline on a daily basis. This caught problems early. The efficiencies of the RCT show near-perfect hardware performance, and minor problems are being repaired in time for the planned restart of the LHC.

#### VI. CONCLUSIONS

The commissioning of the Regional Calorimeter Trigger at CMS is complete and the RCT is now used almost daily to collect calorimeter triggers. This is possible because a suite of tools for operating and monitoring the RCT has been developed and is easy to use. Overall the performance of the RCT has been solid, and the tools have ensured that the RCT is ready for the restart of beam.

# VII. REFERENCES

- [1] CMS Collaboration, *The CMS experiment at the CERN LHC*, JINST 3:S08004, 2008.
- [2] http://cms-project-cmsinfo.web.cern.ch/cms-projectcmsinfo/news.html
- [3] W. Smith et al., CMS Regional Calorimeter Trigger High Speed ASICs, *Proceedings of the Sixth Workshop on Electronics for LHC Experiments*, Krakow, Poland, September 2000, CERN-2000-010.
- [4] CMS, The TRIDAS Project Technical Design Report, Volume 1: The Trigger Systems, CERN/LHCC 2000-38, CMS TDR 6.1.
- [5] http://ttc.web.cern.ch/TTC/intro.html <http://cmsdoc.cern.ch/cms/TRIDAS/ttc/modules/ttc.html>
- [6] P. Klabbers et al., Operation and Monitoring of the CMS Regional Calorimeter Trigger Hardware, *Proceedings of the 13th Topical Workshop on Electronics for Particle Physics*, Naxos, Greece, September 2006, CERN-2008-008.
- [7] https://twiki.cern.ch/twiki/bin/view/CMS/TriggerSupervisor
- [8] http://www.w3.org/2000/xp/Group/

# Analogue Input Calibration of the ATLAS Level-1 Calorimeter Trigger – TWEPP-09

J.D. Morris<sup>a,b</sup>

<sup>a</sup> On behalf of the ATLAS TDAQ Collaboration [1] <sup>b</sup> Queen Mary University of London, Mile End Road, London, E1 4NS, UK

john.morris@cern.ch

## *Abstract*

The ATLAS Level-1 Calorimeter Trigger is a hardware-based pipelined system using custom electronics which identifies, within a fixed latency of 2.5 $\mu$s, highly energetic objects resulting from proton-proton interactions at the LHC. It is composed of three main sub-systems. The PreProcessor system first conditions and digitizes approximately 7200 pre-summed analogue calorimeter signals at the bunch-crossing rate of 40 MHz, and identifies the specific bunch-crossing of the interaction using a digital filtering technique. Pedestal subtraction and noise suppression are applied, and final calibrated digitized transverse energies are transmitted in parallel to the two subsequent processor systems, which perform the algorithms and calculate the variables the trigger menu is tested against. Several channel-dependent parameters must be set in the PreProcessor system to provide digital signals which are aligned in time and properly calibrated. The different techniques which are used to derive these parameters are described, along with the quality tests of the analogue input signals and the status of the energy calibration.

# I. THE ATLAS LEVEL-1 CALORIMETER

# *A. The ATLAS Trigger*

The ATLAS trigger system consists of three separate components. The task of the ATLAS trigger is to reduce the event rate from 40 MHz to 200 Hz. A schematic of the ATLAS trigger can be seen in Figure 1.

The Level-1 trigger consists of a calorimeter trigger, which operates on reduced information from the calorimeters, and a muon trigger, which works on special trigger chambers within the muon detectors. The Level-1 system has a requirement that the latency be less than 2.5 $\mu$s. The Level-1 calorimeter and muon triggers have two outputs. The first is the real-time data path, which transmits information on the multiplicity of each trigger menu item to the central trigger processor (CTP). The CTP generates a Level-1 accept or reject decision, deciding whether the event is of interest. The second output is sent to the Level-2 trigger in the form of regions of interest (RoIs), which are small energetic regions in $\eta$ and $\phi$ that are used as the input seeds for the Level-2 algorithms.
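For orientation, the figures quoted above correspond to the following overall rate reduction and, assuming the 25 ns bunch spacing implied by the 40 MHz crossing rate, to the following Level-1 latency budget expressed in bunch-crossings:

$$
\frac{40\ \text{MHz}}{200\ \text{Hz}} = 2 \times 10^{5}, \qquad \frac{2.5\ \mu\text{s}}{25\ \text{ns}} = 100\ \text{bunch-crossings}
$$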

The Level-2 trigger, part of the HLT, is software based and comprises approximately 500 CPUs, taking the Level-1 RoIs as its input. The Level-2 trigger has access to the full granularity of the ATLAS detector and has a requirement that the latency be, on average, less than 40 ms.

The Event filter, or Level-3, is a software trigger comprised of approximately 1600 CPUs. The event filter has access to the full event information, calibration constants and offline algorithms.



Figure 1: The ATLAS trigger system.

#### *B. The Level-1 Calorimeter Trigger*

The ATLAS Level-1 calorimeter trigger (L1Calo) is fully described elsewhere [2]. L1Calo is a 1  $\mu$ s fixed latency, pipelined, hardware based system which uses custom electronics. The additional 1.5  $\mu$ s comes from cable delays. L1Calo consists of nearly 300 VME modules of 10 different types housed in 17 crates. L1Calo is located entirely off detector in the service cavern USA15.

Around 250,000 calorimeter cells are summed to 7168 L1Calo trigger towers. The granularity of L1Calo is described in Table 1.

| Position                         | $\Delta\eta \times \Delta\phi$ |
|----------------------------------|--------------------------------|
| $\lvert\eta\rvert < 2.5$         | $0.1 \times 0.1$               |
| $2.5 < \lvert\eta\rvert < 3.1$   | $0.2 \times 0.2$               |
| $3.1 < \lvert\eta\rvert < 3.2$   | $0.1 \times 0.2$               |
| $3.2 < \lvert\eta\rvert < 4.9$   | $0.4 \times 0.4125$            |

Table 1: Granularity of L1Calo trigger towers.

L1Calo has three processor types. The PreProcessor (PPr) digitizes the analogue calorimeter pulses, performs bunch-crossing identification and converts ADC counts to energy. The Cluster Processor (CP) identifies electrons, photons and single hadrons. The Jet/Energy-sum processor (JEP) performs jet finding and energy sums.

## *C. The PreProcessor (PPr)*

The calorimeter pulses are obtained through the receiver system, which provides input signal conditioning via variable-gain amplification. Due to the different hardware configurations of the different calorimeters, some signals are transmitted to L1Calo proportional to $E$ and some proportional to $E_T$. The gain on individual receivers is set such that, where necessary, a $\sin(\theta)$ correction from $E \rightarrow E_T$ is performed. The calorimeter pulse is sampled at 40 MHz by a 10-bit flash ADC. The pulse is sampled over five bunch-crossings, with the pedestal set at 32 ADC counts.
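The size of the $\sin(\theta)$ factor can be illustrated with a short Python sketch. It uses the standard relation between pseudorapidity and polar angle, $\eta = -\ln\tan(\theta/2)$, which gives $\sin\theta = 1/\cosh\eta$; this relation and the function names are supplied here for illustration and are not taken from the text. The sketch shows only the factor itself, not how receiver gains are actually programmed.

```python
import math


def sin_theta(eta):
    """sin(theta) for a given pseudorapidity, using eta = -ln(tan(theta/2))."""
    return 1.0 / math.cosh(eta)


def transverse_energy(energy, eta):
    """Convert an energy E to E_T = E * sin(theta)."""
    return energy * sin_theta(eta)


# The correction is unity at eta = 0 and falls steadily as |eta| grows.
for eta in (0.0, 1.0, 2.5):
    print(f"eta = {eta:3.1f}: sin(theta) = {sin_theta(eta):.3f}")
```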

Bunch-crossing identification is performed using a peak finder, which uses a special algorithm for saturated pulses. A finite impulse response (FIR) filter aids the peak finder by sharpening the signal and improving the signal to noise ratio. The final  $E_T$  is calculated using a look up table which removes the pedestal and provides noise suppression.
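A look-up table that removes the pedestal and suppresses noise could look roughly like the sketch below. For simplicity the table is indexed directly by the 10-bit ADC code, which is an assumption: in the real system the LUT input may be the filter output rather than the raw sample. The noise cut of 4 counts and the saturation value of 255 are hypothetical values chosen for the example.

```python
PEDESTAL = 32        # ADC counts, as quoted in the text
NOISE_CUT = 4        # hypothetical threshold above pedestal, in ADC counts
ADC_BITS = 10        # 10-bit flash ADC, as quoted in the text
MAX_OUTPUT = 255     # hypothetical saturation value of the LUT output


def build_lut(pedestal=PEDESTAL, noise_cut=NOISE_CUT):
    """Return a list mapping every ADC code to a pedestal-subtracted value."""
    lut = []
    for adc in range(1 << ADC_BITS):
        signal = adc - pedestal
        if signal < noise_cut:
            lut.append(0)                         # noise suppression
        else:
            lut.append(min(signal, MAX_OUTPUT))   # pedestal removed, saturating
    return lut


lut = build_lut()
print(lut[32], lut[35], lut[36], lut[132], lut[1023])
```

Because the table is precomputed for every possible input code, changing the pedestal or the noise cut for a channel only requires reloading that channel's table.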

## *D. The Processors*

Both the CP and JEP processors work on the $E_T$ values provided by the PPr. Both processors use sliding window algorithms which provide local $E_T$ maxima with multiple thresholds and isolation criteria. The CP uses a $0.1 \times 0.1$ granularity and operates in the $|\eta| < 2.5$ region, while the JEP uses a $0.2 \times 0.2$ granularity, sums the electromagnetic and hadronic layers of L1Calo and operates over the whole of the ATLAS $|\eta| < 4.9$ region.

#### II. TIMING CALIBRATION

The precise timing of all L1Calo trigger towers is important for the identification of the correct bunch-crossing and also for the correct measurement of the deposited energy. The timing of L1Calo is critical: if the timing is wrong, ATLAS will not record the correct event. In a physics event, the *pp* collisions take place at the interaction point and the time-of-flight of final state particles to the calorimeters is $\eta$ dependent. When a signal is sent from the calorimeters to L1Calo, transmission along the cables takes time. Due to the cabling of ATLAS, the time taken for a signal to travel from the calorimeters to L1Calo varies greatly and is both $\eta$ and $\phi$ dependent.

L1Calo timing is calibrated in two ways. Coarse timing, in steps of 1 bunch-crossing (25 ns) allows identification of the correct bunch-crossing. Fine timing, in steps of 1 ns, allows L1Calo to sample the calorimeter pulse at its peak.

#### *A. Coarse timing*

The coarse timing of L1Calo is set by a FIFO, in steps of 25 ns. The timing is calibrated using repetitive calorimeter pulser runs, provided by the calorimeter calibration system. The analogue signals sent from the calorimeters are lined up in L1Calo by adjusting the FIFO settings.
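The alignment step can be sketched as follows. The sketch assumes that a FIFO can only add delay, so every channel is lined up with the latest one; the function name `fifo_delays`, the channel labels and the observed peak positions are hypothetical.

```python
def fifo_delays(peak_bx):
    """Delays (in bunch crossings) that line all channels up with the latest one.

    `peak_bx` maps a channel name to the bunch-crossing index at which the
    pulser peak was observed with zero delay applied.
    """
    latest = max(peak_bx.values())
    return {channel: latest - bx for channel, bx in peak_bx.items()}


BUNCH_SPACING_NS = 25

# Hypothetical pulser-run result for three channels.
observed = {"ch0": 41, "ch1": 43, "ch2": 42}
delays = fifo_delays(observed)
for channel, delay in sorted(delays.items()):
    print(f"{channel}: delay {delay} BX = {delay * BUNCH_SPACING_NS} ns")
```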

The calorimeter calibration system provides pulser runs for each calorimeter partition; the partitions consist of the barrel and the different end-caps. This allows a common FIFO setting to be established for each partition, together with relative FIFO settings for all L1Calo channels within a partition. Pulser runs cannot be used to check the timing of one partition against another, so cosmic data is employed. Cosmic rays occasionally leave calorimeter deposits which span two calorimeter partitions, and these events are used to line up the timing of the different partitions.

A priority for first beam will be to establish the correct coarse timing for L1Calo. Initial collision events will be triggered by a logical *AND* between the beam pickup system and L1Calo, allowing L1Calo to study the timing for every channel in which a signal is observed. The FIFO settings will be adjusted so that every L1Calo channel observes the peak of the calorimeter pulse in the correct bunch-crossing.

#### *B. Fine timing*

The fine timing of L1Calo is set by the PHOS4 chip, which adjusts the timing of each channel in steps of 1 ns. Using the calorimeter calibration system, repetitive pulses are received by the L1Calo system. The PHOS4 setting of each channel is varied through all 25 settings and the signal shape is reconstructed and fitted offline to determine the peak of the signal. This method allows the timing of each L1Calo channel to be set to a precision of 1 ns. Figure 2 shows the output of a PHOS4 scan for a typical L1Calo channel.



Figure 2: A PHOS4 scan. The amplitude of a calorimeter pulse measured over 125 ns. A Landau-Gaussian is fitted to the data to determine the peak position.
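The reconstruction of a 125 ns pulse from a delay scan can be sketched in Python. Five samples taken 25 ns apart at each of 25 delay settings interleave into 125 points spaced by 1 ns. The sketch replaces the Landau-Gaussian fit with a simple parabola through the highest point and its two neighbours, assumes that a delay setting of d ns moves every sampling point d ns later, and uses a toy Gaussian pulse as input; none of these choices is taken from the L1Calo software.

```python
import math

SAMPLES_PER_READOUT = 5   # five bunch-crossing samples, as quoted in the text
STEP_NS = 25              # sampling period
N_DELAYS = 25             # PHOS4 settings scanned, 1 ns apart


def reconstruct_pulse(scan):
    """Interleave a delay scan into one pulse sampled every 1 ns.

    `scan[delay][sample]` is the mean ADC value for a given delay setting (ns)
    and sample index; the result is indexed by time in ns.
    """
    pulse = [0.0] * (SAMPLES_PER_READOUT * STEP_NS)
    for delay in range(N_DELAYS):
        for sample in range(SAMPLES_PER_READOUT):
            pulse[sample * STEP_NS + delay] = scan[delay][sample]
    return pulse


def peak_time(pulse):
    """Peak position in ns from a parabola through the maximum and its neighbours."""
    i = max(range(len(pulse)), key=pulse.__getitem__)
    if i == 0 or i == len(pulse) - 1:
        return float(i)
    left, mid, right = pulse[i - 1], pulse[i], pulse[i + 1]
    curvature = left - 2.0 * mid + right
    if curvature == 0:
        return float(i)
    return i + 0.5 * (left - right) / curvature


# Hypothetical input: a Gaussian-shaped pulse peaking at 57.3 ns on a pedestal of 32.
def toy_pulse(t_ns):
    return 32.0 + 200.0 * math.exp(-0.5 * ((t_ns - 57.3) / 20.0) ** 2)


scan = [[toy_pulse(s * STEP_NS + d) for s in range(SAMPLES_PER_READOUT)]
        for d in range(N_DELAYS)]
pulse = reconstruct_pulse(scan)
t_peak = peak_time(pulse)
print(f"peak at {t_peak:.1f} ns, delay setting {round(t_peak) % STEP_NS} ns")
```

Under the stated delay convention, the peak time modulo 25 ns is the setting that places one of the five samples on the peak.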

#### III. INTERNAL CALIBRATION

L1Calo must be calibrated internally. This means that all channels should behave the same way. All channels should have the correct receiver gain setting and a similar pedestal. Bunch-crossing identification is optimized with internal finite impulse response (FIR) filter settings.

## *A. Setting the pedestal*

L1Calo chooses to set the pedestal of each channel to $2^5 = 32$ ADC counts. As each L1Calo channel has a different response, a DAC scan is performed which determines the linear relationship between the DAC value and the ADC counts. The DAC scan shifts the analogue pulse into the sensitive voltage window of the ADC. Such a relationship can be seen in Figure 3. Each channel has a different slope and offset. When the pedestal is set, the slope and offset values are used to set each pedestal close to 32 ADC counts.



Figure 3: A DAC scan. The DAC value is varied to determine a linear relationship with the number of ADC counts. The slope and offset needed to set a pedestal of 32 ADC counts is determined.
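A DAC scan of this kind amounts to a straight-line fit followed by inverting the line at the target pedestal. The sketch below uses an ordinary least-squares fit; the function names and the scan values for the example channel are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + offset."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dac_for_pedestal(slope, offset, target_adc=32):
    """Nearest integer DAC setting that puts the pedestal at `target_adc`."""
    return round((target_adc - offset) / slope)


# Hypothetical scan of one channel following ADC = 0.25 * DAC - 8.
dac_values = [40, 80, 120, 160, 200, 240]
adc_counts = [2.0, 12.0, 22.0, 32.0, 42.0, 52.0]
slope, offset = fit_line(dac_values, adc_counts)
setting = dac_for_pedestal(slope, offset)
print(f"slope = {slope:.3f} ADC/DAC, offset = {offset:.1f} ADC, DAC = {setting}")
```

Because the DAC setting is an integer, the achieved pedestal is only close to 32 counts, which is consistent with the wording in the text.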

#### *B. Checking the pedestal*

Once the pedestal of each channel has been set with a DAC scan, the value and width of each channel's pedestal are checked with a pedestal run. This ensures that L1Calo is setting the pedestal correctly. Figure 4 shows the RMS of the pedestals of the electromagnetic section of L1Calo. The colour scale is in ADC counts; the pedestal width can be seen to decrease with increasing $|\eta|$, due to $\sin(\theta)$ attenuation.



Figure 4: Pedestal RMS of the electromagnetic section of L1Calo.

# *C. Finite Impulse Response (FIR) Filter*

L1Calo makes use of Finite Impulse Response (FIR) filters to improve bunch-crossing identification and to aid in noise suppression. The calorimeter signal pulses span many bunch-crossings and the FIR filters have the effect of sharpening the signal prior to bunch-crossing identification. Optimal performance is achieved when the filter coefficients match the pulse shapes. The FIR filter coefficients are individually settable for each L1Calo channel. A schematic of the FIR filter logic is shown in Figure 5.



Figure 5: FIR filter logic. $d_{1,\ldots,5}$ represent the input pulse and $a_{1,\ldots,5}$ represent the FIR filter coefficients.
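A five-coefficient FIR filter followed by a simple peak finder can be sketched as below. The peak condition used here (strictly greater than the previous value, and at least as large as the next) is an assumption for the example, as are the coefficient values and the input pulse; the handling of saturated pulses mentioned earlier is not covered.

```python
def fir_filter(samples, coefficients):
    """Apply a FIR filter; output[i] is centred on samples[i].

    Samples outside the readout window are treated as zero.
    """
    half = len(coefficients) // 2
    out = []
    for i in range(len(samples)):
        total = 0
        for k, a in enumerate(coefficients):
            j = i + k - half
            if 0 <= j < len(samples):
                total += a * samples[j]
        out.append(total)
    return out


def find_peaks(values):
    """Indices i with values[i-1] < values[i] >= values[i+1]."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]


# Hypothetical pedestal-subtracted pulse peaking in sample 6, with a small
# noise fluctuation in sample 1, and coefficients roughly matched to the shape.
pulse = [0, 1, 0, 2, 10, 40, 60, 35, 12, 3, 0, 1]
coeffs = [1, 4, 6, 4, 1]
filtered = fir_filter(pulse, coeffs)
print("peaks in raw samples:     ", find_peaks(pulse))
print("peaks in filtered samples:", find_peaks(filtered))
```

In this toy case the unfiltered samples produce an extra peak from the noise fluctuation, while the filtered samples leave only the true one, which is the effect the efficiency in Eq. (1) is meant to quantify.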

A Monte Carlo study of the effect of different sets of FIR filter coefficients has been performed. The efficiency of the bunch-crossing identification is defined as

$$
\epsilon = \frac{\text{\# pulses with correct peak}}{\text{All pulses}} \tag{1}
$$

Three different sets of FIR filter coefficients were used, and are shown in Figure 6. Set A, shown as stars, is just the peak finder with the filter in pass-through mode. Set B, the optimal FIR filter, shown as circles, has the FIR filter coefficients of each channel individually defined. Set C, shown as triangles, uses the same FIR filter coefficients for all channels, derived from channels which sit in the region with the highest noise ($\eta = 0$).



Figure 6: Monte Carlo bunch-crossing identification efficiency, shown for three different FIR filter coefficient settings. 1 ADC count corresponds to approximately 250 MeV.

As shown in Figure 6, Set A has the lowest efficiency, while Set B and Set C perform similarly. The L1Calo strategy for early collision data is to start with a relatively simple system and understand it before moving on to more complex settings. Therefore, following the approach of Set C, FIR filter coefficients will be defined for the hadronic, electromagnetic and forward calorimeter regions.

#### IV. ENERGY CALIBRATION

The number of ADC counts measured by L1Calo does not immediately translate to an energy in MeV. This requires calibration; the goal is to calibrate the system so that 1 ADC count corresponds to 250 MeV on the electromagnetic scale.

The calorimeter calibration system provides pulser ramp runs in which the pulses are delivered in a sequence of discrete amplitudes. Approximately 200 pulses are taken per energy step; this number can be changed if required. The energy given by the calorimeter is correlated with the energy measured by L1Calo and the energy ramp is fitted offline. The slope and offset of the fit are determined for each L1Calo channel. Figure 7 shows the trigger tower energy, the calorimeter energy and the energy ramp. Each L1Calo channel is being tuned and the calibration constants are becoming increasingly stable.



Figure 7: Calibration of L1Calo. Trigger tower energy (left), calorimeter energy (middle) and the correlation between the two (right).
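The ramp fit and the resulting correction can be sketched as follows. Each input point is taken to be the mean over the pulses recorded at one energy step. The idea of turning the fitted slope into a multiplicative gain correction, the function names, and the example values are assumptions for illustration; the text only states that a slope and offset are determined per channel.

```python
from statistics import mean

TARGET_MEV_PER_COUNT = 250.0   # calibration goal quoted in the text


def ramp_fit(calo_energy_mev, l1calo_counts):
    """Least-squares line: counts = slope * E_calo + offset (one point per step)."""
    ex, ey = mean(calo_energy_mev), mean(l1calo_counts)
    sxx = sum((x - ex) ** 2 for x in calo_energy_mev)
    sxy = sum((x - ex) * (y - ey) for x, y in zip(calo_energy_mev, l1calo_counts))
    slope = sxy / sxx
    return slope, ey - slope * ex


def gain_correction(slope):
    """Factor by which the gain would need to change to reach the target scale."""
    return (1.0 / TARGET_MEV_PER_COUNT) / slope


# Hypothetical ramp for a channel that reads 10% low (20 GeV gives 72 counts
# instead of the 80 counts expected at 250 MeV per count).
calo_mev = [5000.0, 10000.0, 20000.0, 40000.0]
counts = [18.0, 36.0, 72.0, 144.0]
slope, offset = ramp_fit(calo_mev, counts)
print(f"slope = {slope * 1000:.2f} counts/GeV, offset = {offset:.2f} counts")
print(f"gain correction = {gain_correction(slope):.3f}")
```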

#### V. PLANS FOR FIRST COLLISIONS

Once the LHC delivers collisions to the ATLAS detector, physics calibration will be a clear priority for L1Calo. Early events will be triggered by the beam pickup system, which detects when a bunch-crossing takes place. This will allow L1Calo to quickly determine the coarse and fine timing from physics events.

Offline analysis comparing reconstructed physics objects and L1Calo regions of interest will feed back into the overall calibration of L1Calo. The analysis of electrons and photons will enable L1Calo to determine the electromagnetic scale of the system.

Once L1Calo has been understood sufficiently, the plan is to increase the complexity of the calibration. L1Calo will increase the number of FIR filter coefficient settings if required. The hadronic scale will be determined and dead material corrections will be applied.

#### **REFERENCES**

- [1] The ATLAS Trigger/DAQ Authorlist, version 3.0, ATL-DAQ-PUB-2009-007, CERN, Geneva, 2009. *http://cdsweb.cern.ch/record/1207077*
- [2] R. Achenbach et al., The ATLAS Level-1 Calorimeter Trigger, 2008 JINST 3 P03001. (ATL-DAQ-PUB-2008-001) *http://www.iop.org/EJ/abstract/1748-0221/3/03/P03001*

# Precise Timing Adjustment for the ATLAS Level1 Endcap Muon Trigger System

Y. Suzuki<sup>a</sup>, O. Sasaki<sup>a</sup>, H. Iwasaki<sup>a</sup>, M. Ikeno<sup>a</sup>, M. Ishino<sup>a</sup>, S. Tanaka<sup>a</sup>,

T. Kawamoto<sup>b</sup>, H. Sakamoto<sup>b</sup>, S. Oda<sup>b</sup>, T. Kubota<sup>b</sup>, K. Kessoku<sup>b</sup>, Y. Echizenya<sup>b</sup>,

C. Fukunaga<sup>c</sup>, M. Tomoto<sup>d</sup>, T. Sugimoto<sup>d</sup>, Y. Okumura<sup>d</sup>, Y. Takahashi<sup>d</sup>,

S. Hasegawa<sup>d</sup>, Y. Itoh<sup>d</sup>, S. Kishiki<sup>d</sup>, T. Takeshita<sup>e</sup>, Y. Hasegawa<sup>e</sup>,

Y. Sugaya<sup>f</sup>, H. Kurashige<sup>g</sup>, A. Ishikawa<sup>g</sup>, T. Matsushita<sup>g</sup>,

A. Ochi<sup>g</sup>, C. Omachi<sup>g</sup>, T. Hayakawa<sup>g</sup>, T. Nishiyama<sup>g</sup>,

S. Tarem<sup>h</sup>, E. Kajomovitz<sup>h</sup>, S. Ben Ami<sup>h</sup>, A. Hershenhorn<sup>h</sup>, S. Bressler<sup>h</sup>,

Y. Benhammou<sup>i</sup>, E. Etzion<sup>i</sup>, N. Hod<sup>i</sup>, Y. Silver<sup>i</sup>,

D. Lellouch<sup>j</sup>, L. Levinson<sup>j</sup>, G. Mikenberg<sup>j</sup>

<sup>a</sup>KEK, High Energy Accelerator Research Organization, Tsukuba, Japan <sup>b</sup>ICEPP, University of Tokyo, Tokyo, Japan <sup>c</sup>Tokyo Metropolitan University, Hachioji, Japan <sup>d</sup>Nagoya University, Nagoya, Japan <sup>e</sup>Shinshu University, Matsumoto, Japan <sup>f</sup>Osaka University, Osaka, Japan <sup>g</sup>Kobe University, Kobe, Japan <sup>h</sup>Technion Israel Institute of Technology, Haifa, Israel <sup>i</sup>Tel Aviv University, Tel Aviv, Israel <sup>j</sup>Weizmann Institute of Science, Rehovot, Israel

Presented by Yu Suzuki

# *Abstract*

The ATLAS level-1 endcap muon trigger system consists of about 4000 Thin Gap Chambers (TGC) with 320,000 input electronics channels, and finds level-1 trigger candidates for muons in both endcap regions. On each endcap side, three TGC stations are deployed at intervals of about 1 m, about 15 m from the interaction point in the z-direction; the radius of each station (disc form) is about 25 m. Hit signals are usually not aligned in time because of differences in cable length and in muon time of flight. In order to supply reliable level-1 endcap trigger signals, we must adjust the timing of the hit signals for all channels with a precision of 2.5 ns. We must also align the bunch crossing phase used in the TGC system with that of the LHC; this, however, requires actual bunch crossing signals. In this paper we discuss strategies for the timing alignment of individual channels using the timing adjustment facility embedded in the TGC electronics system, and for the adjustment of the phase shift of the bunch crossing signals.

#### I. INTRODUCTION

To supply the level-1 endcap muon signals, we have installed about 4000 Thin Gap Chambers (TGC) covering almost the full region of both endcaps of the ATLAS detector ($1.05 < |\eta| < 2.4$) [1].

To form a trigger signal with the various coincidence logic operations, all hit signals from tracks generated in the pp collisions of one bunch crossing should in principle be aligned to the same timing. In practice, our detected signals are spread in total from 65 to 116 ns owing to

1. the time of flight of particles (45 to 64 ns), and

2. the signal propagation delay (9 to 60 ns).

Even after correcting this spread of signal timing for individual channels, we have to identify the bunch in which the signals were produced. We call this operation bunch identification. The bunch crossing signals (40 MHz), which arrive at the TGC electronics as the basic clock pulses supplied by the TTC system [2], are also delayed and vary among channels within 25 ns. We have to adjust the bunch crossing signals in all channels, and we have to synchronize the TGC bunch crossing signal with the one that comes from the LHC. Since this operation consumes luminosity, we have to estimate carefully the statistics it requires in order to minimize the luminosity dedicated to this work.

In the next section, we discuss how to cope with the timing spread caused by the time of flight of particles and the signal propagation delay in cables. Having lined up all the signals to one timing, we then have to adjust the bunch phase to that of the LHC. In section III we discuss this clock phase adjustment. For a smooth and quick scan to find the best clock phase, we needed to develop a new VME module, which we call the delay module. This module will be installed between the ATLAS central trigger processor, which gives the TGC the bunch crossing signals, and the TGC TTC system, in order to apply the phase shift delay uniformly to all channels at once. In that section we also discuss the role of this module in detail. The clock phase scan needs actual beam, that is, it consumes luminosity, so we have to fix a scenario carefully to do this in the most efficient way. We discuss the strategy we have established and the statistics needed according to a simulation study. Finally, in section IV we summarize the work on precise timing adjustment that we have done since the first beam circulation in September 2008, and give the outlook for collisions, which are foreseen at the end of 2009 or the beginning of 2010.



Figure 1: Schematic diagram of test pulse generation and detection between the ASD and PP ASIC chips

#### II. ADJUSTMENT OF INDIVIDUAL HIT SIGNALS

The TGC electronics system is divided into on-detector and off-detector parts. The on-detector part contains several kinds of custom ASIC chips developed in-house, among them the ASD (Amplifier, Shaper and Discriminator) chip and the PP (Patch Panel) ASIC. The ASDs are mounted in the vicinity of the TGC as front-end electronics, while the PP chips are installed at the first stage of the on-detector part. The PP ASIC has a delay circuit to adjust the timing of a hit signal in 0.83 ns steps up to 26 ns, a test pulse generator to check the ASD connectivity, and a circuit to synchronize the hit signal with the bunch crossing signal (clock). All these circuits are provided for each signal channel. A 16-channel ASD board on the TGC side and a PP ASIC in the on-detector part are connected with LVDS cables of 834 different types, whose lengths vary from 1.8 m to 12.5 m. The total number of cables used is about 10000. Since the coincidence circuits that generate trigger candidates are placed just behind the PP ASIC, one of our important tasks for timing adjustment is to align the timing of the signals for all channels in the PP ASIC.

The test pulse generator in a PP chip is used to simulate, for the ASD, the timing (time of flight and propagation delay) of particles. When a test pulse trigger signal arrives at the PP ASIC via the TGC TTC system, a test pulse is generated after a predefined delay interval and sent to the ASD, which immediately sends its output back to the PP chip. The delay interval can be set in two step modes: a coarse mode with steps of one clock (25 ns) from 0 to 7 clocks, and a fine mode with 0.83 ns steps from 0 to a maximum of 26 ns (the same precision as the signal delay circuit). We can therefore set the delay from 0 to 200 ns in 0.83 ns steps. If we set the predefined delay interval according to the cable length, the propagation speed and the time of flight, we can simulate a signal generated in a particular channel by a muon from the interaction point.
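The two-mode delay setting described above can be illustrated with a minimal Python sketch. It assumes that the quoted 0.83 ns step is exactly 25/30 ns and that the fine range of about 26 ns corresponds to 31 steps; neither value is stated in the text, and the function name `split_delay` is hypothetical.

```python
CLOCK_NS = 25.0            # coarse step: one bunch-crossing clock period
FINE_STEP_NS = 25.0 / 30   # assumption: exact value behind the quoted 0.83 ns
MAX_COARSE = 7             # coarse range from the text: 0 to 7 clocks
MAX_FINE = 31              # assumption: 31 fine steps, i.e. about 26 ns


def split_delay(target_ns):
    """Split a requested delay into (coarse clocks, fine steps, achieved ns)."""
    if target_ns < 0:
        raise ValueError("delay must be non-negative")
    # Use as many whole clocks as allowed, then cover the rest with fine steps.
    coarse = min(int(target_ns // CLOCK_NS), MAX_COARSE)
    fine = round((target_ns - coarse * CLOCK_NS) / FINE_STEP_NS)
    fine = min(fine, MAX_FINE)
    achieved = coarse * CLOCK_NS + fine * FINE_STEP_NS
    return coarse, fine, achieved
```

Under these assumptions a requested delay of 60 ns maps to 2 coarse clocks and 12 fine steps, and requests beyond the range saturate at 7 clocks and 31 fine steps.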

Since the time of flight is known from the geometrical position of TGC region covered by the ASD, we can examine the delay timing caused only by the propagation time of signals in a cable between the ASD and PP ASIC.

In fig. 1, a schematic diagram of the cable connection between two chips and the test pulse generator is shown.

Besides the simple measurement of the propagation speed of signals, we must also consider the smearing effect due to the attenuation of signals that pass through long cables. This effect can be seen clearly in fig. 2: the longer the cable, the stronger the effect, both for the test pulse input at the ASD and for the ASD signal returned through LVDS and observed at the PP.



Figure 2: Attenuation effects observed after long propagation in cables of length 2.8m and 48.1m. "LVDS" indicates the ASD output of a TGC signal observed just in front of the PP ASIC, while "Test Pulse" is the test pulse generated in the PP and observed in front of the ASD. The full width (time range) of both scope pictures is 400ns, divided into 10 subunits of 40ns each.

We have made a systematic study of this effect and found an additional delay of roughly 0.2 ns/m for test pulses observed at the ASD, although the dependence is not linear (we have parametrized this additional delay as a polynomial function of the cable length). We have included this effect in the precise delay adjustment in addition to the standard propagation delay of 5 ns/m.

With this optimization of the propagation speed in a cable, we have measured the trigger timing distribution. To correct the timing we must also know the lengths of all the cables (about 10000). At first we simply took the cable lengths (from 1.8 to 12.5 m) from the information given by the cable production company. In this case the signals were distributed broadly from -4 ns to 10 ns with a standard deviation of 1.5 units (1.23 ns), as shown by the (blue) slash-hatched histogram in fig. 3. As the distribution was unexpectedly broad, we then treated the cable lengths more carefully. We measured the delay of the cables of each type, for all 834 types. We found that the delay fluctuated, and that for every type its average was shifted from the value expected from the nominal cable length. The shift depends on the type (cable type dependency). We incorporated the measured shift values into the cable length estimate for all types. The histogram with the 25% darkened (red) pattern is the timing distribution corrected for this cable type dependency. Although the distribution is significantly better than with the simple cable length correction, several channels are still insufficiently corrected, as shown by the entries outside the 4 ns region in this histogram. We suspected that some cables of a particular type had lengths quite different from the nominal one. For all cables showing timing shifts bigger than 2.5 ns (data falling outside the central 6 bins), we therefore measured the actual lengths individually (individual cable correction) and used this information for the cable length correction. The result is the timing distribution shown by the histogram with the 50% darkened (black) pattern. With this final correction of the cable lengths, we have aligned the signal timing of all channels with 2.5 ns precision, as can be seen in the figure.
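The three levels of cable length correction described above (nominal length, cable type dependency, individual cable correction) can be pictured as a lookup with fallbacks. The Python sketch below is only an illustration: the record layout, the function names and the idea of expressing the type shift as a length are assumptions, and only the 5 ns/m propagation delay and the 2.5 ns limit come from the text.

```python
PROPAGATION_NS_PER_M = 5.0  # standard propagation delay quoted in the text


def effective_length_m(cable, type_shift_m, individual_length_m):
    """Best available length estimate for one cable.

    cable: dict with keys 'id', 'type', 'nominal_m' (hypothetical layout).
    type_shift_m: per-type correction to the nominal length.
    individual_length_m: lengths measured cable by cable, keyed by cable id.
    """
    if cable["id"] in individual_length_m:
        # Individual cable correction takes precedence over everything else.
        return individual_length_m[cable["id"]]
    # Otherwise apply the cable type dependency to the nominal length.
    return cable["nominal_m"] + type_shift_m.get(cable["type"], 0.0)


def cable_delay_ns(cable, type_shift_m, individual_length_m):
    """Propagation delay from the best length estimate."""
    length = effective_length_m(cable, type_shift_m, individual_length_m)
    return PROPAGATION_NS_PER_M * length


def needs_individual_measurement(residual_ns, limit_ns=2.5):
    """Flag a channel whose remaining timing shift exceeds the 2.5 ns limit."""
    return abs(residual_ns) > limit_ns
```

The design point is the order of precedence: an individually measured length, where it exists, replaces both the nominal value and the type-level shift.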



Figure 3: Trigger timing distributions. The three histograms result from three different cable length corrections; see the text for details. The width of one unit on the abscissa is 0.83ns, the smallest PP delay adjustment step.

# III. BUNCH CROSSING IDENTIFICATION

As shown in fig. 4, TGC signals are intrinsically distributed over a 25 ns interval. The TGC electronics system tries to catch the hit signals when a level-1 trigger is issued for a bunch crossing. Some hit signals will easily be lost, however, if the bunch crossing signal in the TGC system is not correctly adjusted to the one that the LHC machine produces for the given level-1 trigger signal.



Figure 4: A typical TGC hit distribution[3]

The TGC electronics always records the hit signals observed in three contiguous bunches around the triggered bunch (previous, current and following bunches). The number of hit signals observed in the previous or following bunch will not be negligible if the bunch crossing signal is slightly advanced or delayed with respect to the one given by the LHC. To measure the phase shift between the bunch clocks of the TGC system and the LHC, a useful quantity is therefore the ratio of the number of hits observed in the triggered bunch to the total number of hits observed in all three bunches. If the phase shift is correctly adjusted, the numbers of hits counted in the previous and following bunches are in principle zero. The timing difference that gives the maximum ratio, closest to one, is therefore the actual shift between the two systems, and we can adjust the bunch crossing phase to the LHC in this way. Figure 5 shows a simulation result for the dependence of this ratio on the phase shift. While the LHC provides stable beam-beam collisions, we will be able to find the point of maximum ratio, as demonstrated in the figure, by plotting this ratio while changing the TGC bunch clock phase artificially in 1 ns steps with 1000 events per point.
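The ratio and the search for its maximum can be written down in a few lines. The following Python sketch is illustrative only: the function names are hypothetical and the scan is represented as a plain dictionary from the applied phase (in ns) to the hit counts in the previous, current and following bunches.

```python
def in_time_ratio(prev_hits, curr_hits, next_hits):
    """Fraction of hits recorded in the triggered bunch."""
    total = prev_hits + curr_hits + next_hits
    if total == 0:
        raise ValueError("no hits recorded for this scan point")
    return curr_hits / total


def best_phase(scan):
    """Return the phase setting whose in-time ratio is largest.

    scan: dict mapping phase_ns -> (prev_hits, curr_hits, next_hits).
    """
    return max(scan, key=lambda phase: in_time_ratio(*scan[phase]))
```

A well-adjusted phase puts nearly all hits into the current bunch, so its ratio approaches one, while a mis-set phase leaks hits into one of the neighbouring bunches.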



Figure 5: The ratio of the number of hits observed in the triggered bunch to the total number observed in all three bunches versus the bunch crossing difference between TGC and LHC (simulation)



Figure 6: The delay module we have made to adjust the phase shift between the TGC and LHC bunch crossing signals. The circuit is housed in a 6U VME module.

The phase shift is the one unique parameter that lets the whole TGC system synchronize with the LHC system. As discussed in section II, we have aligned the timing of all individual TGC channels with 2.5 ns precision. In this process we have naturally also adjusted the bunch phase for all channels. The one last parameter we have to adjust is therefore this phase shift. Since at the time of writing the LHC has not yet supplied any bunch signal, we could not adjust it yet. Our precise timing adjustment will be completed when we optimize this phase shift. We expect the phase shift adjustment to be done smoothly and quickly if the LHC works steadily.

We had a problem of how to apply this unique parameter to the whole TGC system at once. An LHC bunch crossing signal is delivered to the TGC system from the ATLAS Central Trigger Processor (CTP). On the TGC side, TTCvi modules [2] receive this signal, which is then fanned out and distributed to the TTCrx installed in the various parts of the TGC system. Since the TTCvi has no facility to delay the received bunch signal before its fan-out, and no such module exists in the standard TTC module set, we had to build a delay module ourselves and install it between the CTP and the TTCvi to insert a delay equal to the phase shift. The module can insert a delay in 0.5 ns steps with 64 steps in total (0-31.5 ns). The delay is generated simply using coaxial cables of different lengths. A picture of this module is shown in fig. 6. The circuit is installed in a 6U VME module and its VME control mode is A24D16.
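As a small illustration of the 0.5 ns granularity, the Python sketch below converts a desired phase shift into one of the 64 step settings and back. The function names are hypothetical, and how a step setting is actually written to the module over VME is not described in the text, so it is left out.

```python
STEP_NS = 0.5   # delay step of the module, from the text
N_STEPS = 64    # number of settings, covering 0 to 31.5 ns


def delay_code(phase_shift_ns):
    """Nearest step setting (0..63) for a requested delay in ns."""
    code = round(phase_shift_ns / STEP_NS)
    if not 0 <= code < N_STEPS:
        raise ValueError("requested delay is outside the 0-31.5 ns range")
    return code


def code_to_ns(code):
    """Delay in ns produced by a given step setting."""
    return code * STEP_NS
```

For example, a requested shift of 12.3 ns is rounded to setting 25, which corresponds to 12.5 ns.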

## IV. SUMMARY

We have adjusted the timing of individual channels using the embedded test pulse function and the delay adjustment system, with which we can tune the signal delay with 0.83 ns precision. With these facilities we have aligned the timing of hit signals, which are usually widely distributed owing to differences in signal cables and geometrical positions even when the signals originate from muon tracks produced at the same time. Without this alignment we could not form trigger signals using the trigger coincidence logic. We have used about 10000 LVDS cables of 834 different types (lengths from 1.8 m to 12.5 m) to connect the front-end ASD chips with the corresponding PP ASICs of the on-detector electronics in the TGC system. After a precise estimate of the propagation speed of signals in a cable, which also includes the attenuation effect, and with the optimization of the lengths of individual cables, we have managed to adjust the hit signals to within 2.5 ns for all individual channels, as shown in fig. 3.

Although we definitely need beam collisions in the LHC to adjust the phase shift of the bunch crossing clocks between the LHC and the TGC electronics, we are ready to estimate this phase shift with sufficient precision and the least consumption of luminosity. According to the simulation study, we should be able to find the phase shift if we take 1000 events per data point while artificially changing the shift from -15 ns to 9 ns in 1 ns steps, for 25 points. If the L1A rate is the expected 500 Hz, which will be achieved if the LHC luminosity is  $10cm \text{ }sr$ , it takes 2 s to collect the events needed to calculate the ratio discussed in section III for one data point. Once the phase shifts of the individual channels have been adjusted, the phase shift is the single parameter left to adjust between the TGC system and the LHC. We had, however, no delay adjustment facility to modify the phase shift for all channels at once. We have therefore made a delay module that does this in 0.5 ns steps over a range of 64 steps.
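The arithmetic behind these numbers is simple enough to check directly. The Python sketch below uses only the values quoted above (scan from -15 ns to 9 ns in 1 ns steps, 1000 events per point, 500 Hz L1A rate); the function names are hypothetical.

```python
def scan_points(start_ns, stop_ns, step_ns):
    """Phase settings of the scan, both end points included."""
    return list(range(start_ns, stop_ns + 1, step_ns))


def data_taking_time_s(n_points, events_per_point, l1a_rate_hz):
    """Pure data-taking time, ignoring any reconfiguration overhead."""
    return n_points * events_per_point / l1a_rate_hz


points = scan_points(-15, 9, 1)               # 25 scan points
per_point_s = data_taking_time_s(1, 1000, 500)            # 2 s per point
total_s = data_taking_time_s(len(points), 1000, 500)      # 50 s in total
```

The pure data-taking time for the whole scan is therefore under a minute. The overall estimate of about half an hour given in this summary is larger; the paper does not itemize what the remainder consists of.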

The timing adjustment for individual channels has been finished. If the LHC supplies stable beam, we expect to find the best phase shift in about half an hour. Our timing adjustment will then be complete, and the TGC trigger system will supply reliable L1A candidate signals for muons in the endcap region to the ATLAS CTP.

#### **REFERENCES**

- [1] G. Aad *et al.* (ATLAS collaboration), "The ATLAS Experiment at the CERN Large Hadron Collider", JINST 3:S09003, 2008 (electronic publication).
- [2] CERN RD12 project, "Timing, Trigger and Control (TTC) System for the LHC", http://ttc.web.cern.ch/TTC/
- [3] ATLAS collaboration, "ATLAS Muon Spectrometer Technical Design Report", CERN/LHCC 97-22, 1997

# Framework for Testing and Operation of the ATLAS Level-1 MUCTPI and CTP

R. Spiwoks<sup>a</sup>, D. Berge<sup>a</sup>, N. Ellis<sup>a</sup>, P. Farthouat<sup>a</sup>, S. Haas<sup>a</sup>, J. Lundberg<sup>a</sup>, S. Maettig<sup>a, b</sup>, A. Messina<sup>a</sup>, T. Pauly<sup>a</sup>, D. Sherman<sup>a</sup>

> <sup>a</sup>CERN, 1211 Geneva 23, Switzerland <sup>b</sup> University of Hamburg, 20146 Hamburg, Germany

#### Ralf.Spiwoks@cern.ch

# *Abstract*

The ATLAS Level-1 Muon-to-Central-Trigger-Processor Interface (MUCTPI) receives information on muon candidates from the muon trigger sectors and sends multiplicity values to the Central Trigger Processor (CTP). The CTP receives the multiplicity values from the MUCTPI and combines them with information from the calorimeter trigger and other triggers of the experiment and makes the final Level-1 decision. The MUCTPI and CTP are housed in two 9U VME64x crates and are made of nine different types of custom designed modules. This paper will present the framework which is used for debugging, commissioning and operation of all modules of the MUCTPI and CTP.

Testing of the modules has been considered right from design. Most types of modules contain diagnostic memories at the input of the module which can be used to capture incoming data or to inject data into the module. Testing of the modules can be achieved by capturing data at input of a down-stream module, by reading out data from a monitoring buffer, or by reading out monitoring counters.

A layered software framework using C++ has been developed for configuring and controlling all modules and for testing them independently or grouped into complete subsystems. The lowest level uses the ATLAS VME library and driver. At the next higher level, a compiler translates a description of the VME registers from XML to C++ code. This code, together with existing code for some components, e.g. HPTDC, DELAY25, and JTAG, is combined into the low-level library of the module. A menu program provides access to all methods of the module low-level library. Generators create data for the test memories. Simulators calculate expected results. Generators, simulators and the low-level library are combined into a suite of test programs which cover the full functionality of the MUCTPI and CTP. The low-level library is also used by the control and monitoring programs which integrate the sub-systems into the ATLAS experiment control and monitoring framework.

#### I. INTRODUCTION

The ATLAS experiment at the Large Hadron Collider (LHC) at CERN uses a three-level trigger system. The Level-1 trigger [1] is a synchronous system operating at the bunch crossing (BC) frequency of 40.08 MHz of the LHC. It uses information on clusters and global energy in the calorimeters and on tracks found in the dedicated muon trigger detectors. An overview of the ATLAS Level-1 trigger is shown in Figure 1. The Level-1 central trigger consists of the Muon-to-Central-Trigger-Processor Interface (MUCTPI), the Central Trigger Processor (CTP), and the Timing, Trigger and Control (TTC) partitions.



Figure 1: Overview of the ATLAS Level-1 Trigger

The MUCTPI [2] combines trigger information from the two dedicated muon trigger detectors, the Resistive Plate Chambers (RPC) in the barrel and the Thin-Gap Chambers (TGC) in the end-cap region. The CTP [3] forms the Level-1 trigger decision (accept or reject) for every BC, and distributes it to the TTC partitions. It also receives timing signals from the LHC and fans them out to the TTC partitions. The TTC partitions perform the distribution of the timing, trigger and control signals to all sub-detector front-end electronics. In the ATLAS experiment there are about 40 TTC partitions. For a full overview see [4].

#### II. THE MUCTPI

The MUCTPI [2] receives the muon candidates from all 208 trigger sectors, calculates multiplicities for six programmable  $p_T$  thresholds and sends the results to the CTP. It resolves cases where a single muon traverses more than one sector and thus avoids double counting. The MUCTPI sends summary information to the Level-2 trigger and to the data acquisition (DAQ). It identifies, in particular, regions of interest (RoI) for the Level-2 trigger processing. The MUCTPI can also take snapshots of the incoming sector data for diagnostics and accumulate rates of incoming muon candidates for monitoring.

The MUCTPI is implemented as a single-crate 9U VME system with three different types of modules and a dedicated active backplane as shown in Figure 2.



Figure 2: Overview of the MUCTPI

The octant module (MIOCT) receives the muon candidates from the trigger sector logic and resolves overlaps. The active backplane (MIBAK) performs the multiplicity summing, the readout transfer and the timing signal distribution. The CTP interface module (MICTP) receives timing and trigger signals from the CTP and sends multiplicities to the CTP. The readout driver module (MIROD) sends summary information to the Level-2 trigger and the DAQ.



Figure 3: The MUCTPI in ATLAS

A prototype of the MUCTPI was installed in the experiment in 2005. It provided almost full functionality and missed only some flexibility in the overlap handling. The MUCTPI has been upgraded incrementally to the final system. Figure 3 shows the setup in the experiment with 16 MIOCTs, the MIROD and the MICTP. The MICTP is currently the last prototype module which, although it provides full functionality, will soon be replaced by a new MICTP. The new MICTP is based on a more recent FPGA allowing all logic to be in a single device. It also uses the same PCB as the MIROD. This is useful for providing spares to the MUCTPI. Another complete and another partial MUCTPI are available in the laboratory as spares as well as for firmware modification and software development.

### III. THE CTP

The CTP [3] receives, synchronizes and aligns trigger inputs from calorimeter and muon triggers, and others. It generates the Level-1 Accept (L1A) according to a programmable trigger menu. The CTP has, in addition, the following functionality: it generates a trigger-type word accompanying every L1A; it generates preventive dead time in order to prevent front-end buffers from overflowing; it generates summary information for the Level-2 trigger and the DAQ; it generates a precise time stamp using GPS with a relative precision of 5 ns and an expected absolute precision of 25 ns after calibration; it generates other timing signals like the Event Counter Reset (ECR). The CTP can measure the timing of the trigger inputs which is very important during commissioning. It can take snapshots of the incoming trigger inputs for diagnostics and accumulate rates of incoming trigger inputs and internally generated trigger combinations for monitoring.



Figure 4: Overview of the CTP

The CTP is implemented as a single-crate 9U VME system with six different types of modules and three dedicated backplanes as shown in Figure 4. The machine interface module (CTPMI) receives timing signals from the LHC. The input module (CTPIN) receives trigger input signals, synchronizes and aligns them, and sends them to the Pattern-In-Time (PIT) backplane using a switch matrix. The monitoring module (CTPMON) performs bunch-by-bunch monitoring. The core module (CTPCORE) forms the L1A using Look-Up Tables (LUTs) and Content-Addressable Memories (CAMs), and sends summary information to the Level-2 trigger and the DAQ. The output module (CTPOUT) sends timing signals to the TTC partitions and receives calibration requests. The calibration module (CTPCAL) time-multiplexes the calibration requests of the detectors and receives additional front panel inputs. The Pattern-In-Time (PIT) bus transports the synchronized and aligned trigger signals from the CTPINs to the CTPCORE and the CTPMON. The common (COM) bus contains timing signals. The calibration (CAL) bus transports the calibration requests from the CTPOUTs to the CTPCAL.

The final CTP was installed in the experiment in 2006. Figure 5 shows the CTP with the CTPMI, three CTPINs, the CTPMON, the CTPCORE, four CTPOUTs, and the CTPCAL. There is an additional NIM-to-LVDS fan-in module for receiving NIM trigger signals and routing them to one of the CTPINs. Another two complete CTPs are available in the laboratory as spares as well as for firmware modification and software development.



Figure 5: The CTP in ATLAS

## IV. TEST FRAMEWORK

## *A. Principles and Architecture*

The main challenge for the test framework is the considerable number of different types of modules: six in the CTP and three in the MUCTPI. In total there are enough modules to populate two full systems of each type and a third partially. One system of each type is installed in the experiment; the others are in the laboratory. The MUCTPI and CTP are also relatively complex: the inputs are numerous and large, there are many parameters for configuration and processing, and there are many different use cases. These include testing of prototypes, which requires rapid evolution of the firmware and software; testing of the modules, which guarantees the quality of the production; and operation, which provides the integration into the experiment.

The test framework is based on several principles. The VME interface is the same for all modules. This is true for the hardware, whose design is copied from module to module, for the firmware, which is used like an IP block [5], and for the software, i.e. the VME drivers and libraries. The modules were designed from the beginning with diagnostic memories which can be used at the input to capture data, or at the output to inject data into the processing. The modules were also designed with readout facilities and counters. The event-like readout is used for monitoring, and the counters, which can be integrating or on a bunch-by-bunch basis, allow counting of data or of the BUSY status at several stages of the processing. The entire test framework is based on the common software framework provided by the ATLAS Readout Driver Crate DAQ (RCD) [6]. This framework contains the ATLAS VME driver and library, as well as many utilities for bit strings, modules, JTAG chains, menus, and components like the HPTDC and DELAY25 chips. The framework also provides access to the ATLAS TDAQ control system.

*(Layered block diagram; labels recovered from the figure: SLC, VME, XML, Module, Python, Menu, Gen, Sim, Test, ATLAS, High-level.)*

Figure 6: The Framework Architecture

The test framework is organised in several layers, see Figure 6. At the lowest layer is the Scientific Linux CERN (SLC). On top of that is the ATLAS RCD framework with its VME driver and library. On top of that are the module libraries for each type of module. These libraries are partly generated automatically from XML files and partly hand coded. They consist of C++ objects with methods for access to all functionality provided by a module. On top of the module libraries are Python scripts, which are used for rapid testing, and menu programs, which give full access to all functionality. Ancillary objects are generators for providing test data and simulators for behavioural simulation of the modules. On top of these are the test programs which provide a test suite for all modules and systems. On top of the ATLAS TDAQ control software is the high-level software which allows one to configure, control and monitor the modules in the experiment.

## *B. Low-level Software*

The low-level software is based on the module libraries. Part of the module libraries can be generated in an automated way from XML code. The main idea is that a module's VME map is described in an XML file in terms of a module, its blocks, the registers in the blocks, the fields in the registers and the possible values for each field. An excerpt of such a description can be seen in Figure 7. A (pre-) compiler is then run over the XML file which generates C++ classes for bit string objects for all fields and registers and a C++ class for the module with read and write methods for all registers and fields using the bit strings. These methods can be used in test programs in a very simple and intuitive way. Future extensions foreseen to the tool are to add more detail, e.g. read-only, write-only, read-modify-write functions, and to add support for more complex parts of the VME map like memories and block transfers.

    <module name="MICTP" type="A32" size="0x00080000">
      <block name="Readout">
        <register name="MultiplicityConfig" addr="0x00000200">
          <field name="RamEnable" mask="0x00000002">
            <value name="DISABLED" data="0x00000000"/>
            <value name="ENABLED" data="0x00000002"/>
          </field>

Figure 7: Excerpt of a Module's VME Map using XML
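To make the idea of a register map driven by XML concrete, here is a minimal Python sketch. It is not the authors' tool: their compiler generates C++ classes, whereas this sketch interprets the XML at run time, and a plain dictionary stands in for the VME address space. The closing tags in the XML string and all function names are assumptions added so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Excerpt modelled on Figure 7; closing tags added so that the XML parses.
VME_MAP = """
<module name="MICTP" type="A32" size="0x00080000">
  <block name="Readout">
    <register name="MultiplicityConfig" addr="0x00000200">
      <field name="RamEnable" mask="0x00000002">
        <value name="DISABLED" data="0x00000000"/>
        <value name="ENABLED" data="0x00000002"/>
      </field>
    </register>
  </block>
</module>
"""


def load_fields(xml_text):
    """Return {(register, field): (addr, mask, {value_name: data})}."""
    fields = {}
    root = ET.fromstring(xml_text)
    for reg in root.iter("register"):
        addr = int(reg.get("addr"), 16)
        for fld in reg.iter("field"):
            values = {v.get("name"): int(v.get("data"), 16)
                      for v in fld.iter("value")}
            key = (reg.get("name"), fld.get("name"))
            fields[key] = (addr, int(fld.get("mask"), 16), values)
    return fields


def write_field(memory, fields, register, field, value_name):
    """Read-modify-write one field; other bits of the word are preserved."""
    addr, mask, values = fields[(register, field)]
    word = memory.get(addr, 0)
    memory[addr] = (word & ~mask) | (values[value_name] & mask)


def read_field(memory, fields, register, field):
    """Return the symbolic value name of a field, or the raw masked bits."""
    addr, mask, values = fields[(register, field)]
    raw = memory.get(addr, 0) & mask
    for name, data in values.items():
        if data == raw:
            return name
    return raw
```

Keeping addresses, masks and symbolic values in one description means that test code can say "set RamEnable to ENABLED" without repeating any hexadecimal constants.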

The C++ class automatically generated by the XML compiler is augmented by code containing all higher-level functions for sequences of operations as needed by the module. A menu program is then developed from the module library; it is based on a text-driven menu and provides access to all methods and thus to all registers and fields of the module. The code of the menu program can easily be extended whenever new features are included in the module library. It is foreseen in the future to develop a tool to derive the menu automatically from the module library. There is a menu program for each type of module, which gives detailed and complete control over the module and is intended to be used by a Level-1 central trigger expert.

In addition to the menu program for each type of module there is also a generator for generating input data for single-module and full-system tests. The patterns generated include counter-like patterns, walking ones, a toggling pattern, random data, and more complex data with many overlapping candidates for the MIOCT testing. For each type of module there is also a simulator which uses the same configuration as the hardware module and which generates the expected output data from given input data. This can be used for comparison between observed and expected data in tests. The simulation includes, e.g. the overlap handling of the MIOCT modules, the data processing for readout and counters of the MIOCT, MICTP, MIROD, CTPIN, CTPCORE and CTPMON modules, as well as the complete trigger generation in the CTPCORE module.
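The simpler pattern types named above are easy to sketch. The Python functions below are illustrative stand-ins, not the authors' generators: the function names, the word width parameter and the exact definition of the toggling pattern (alternating 0101... and 1010... words) are assumptions.

```python
import random


def counter_pattern(n_words, width):
    """Incrementing counter, wrapping at the word width."""
    mask = (1 << width) - 1
    return [i & mask for i in range(n_words)]


def walking_ones(n_words, width):
    """A single set bit that moves one position per word."""
    return [1 << (i % width) for i in range(n_words)]


def toggling_pattern(n_words, width):
    """Alternate between 0101... and its bitwise complement."""
    mask = (1 << width) - 1
    even_bits = sum(1 << b for b in range(0, width, 2))
    return [even_bits if i % 2 == 0 else even_bits ^ mask
            for i in range(n_words)]


def random_pattern(n_words, width, seed=0):
    """Reproducible pseudo-random words, so that failures can be replayed."""
    rng = random.Random(seed)
    return [rng.getrandbits(width) for _ in range(n_words)]
```

Seeding the random generator is a deliberate choice here: a test that fails on random data is only useful if the same data can be regenerated for the simulator and for debugging.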

#### *C. Test Suite*

Based on the module libraries, the generators, and the simulators, there is a suite of test programs for single-module and full-system tests. The single-module tests usually check register and memory access and are based on read-write tests. There are also single-module initialisation programs which write a default configuration to the module and read it back on request. The more interesting tests concern several modules or full systems. They usually load data from the generator or from a file into the modules, loop over the data, read back data from the readout or the counters, and compare them to the simulation. As an example, the "testCtpReadout" program configures the CTP, loads the CTPIN test memories with data which will generate an L1A, starts the trigger generation by enabling the CTPIN test memories and removing the BUSY from the CTPMI, reads the data from the CTPCORE readout FIFOs, and compares them to the expected data from simulation. This tests the full chain from the CTPIN memory and switching matrix, through the PIT bus, to the CTPCORE LUT and CAM processing, as well as the CTP timing.
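The load, read back, and compare cycle described above can be sketched as follows. This is a toy illustration under stated assumptions, not code from the test suite: the hardware and the simulator are represented by plain Python callables, and all names are hypothetical.

```python
from itertools import zip_longest


def compare_readout(observed, expected):
    """Return (index, observed, expected) for every word that differs.

    If the lists differ in length, the missing words show up as None.
    """
    return [
        (i, obs, exp)
        for i, (obs, exp) in enumerate(zip_longest(observed, expected))
        if obs != exp
    ]


def run_test(input_words, hardware, simulator):
    """Feed the same input to the hardware stand-in and the simulator, then compare."""
    observed = hardware(input_words)
    expected = simulator(input_words)
    return compare_readout(observed, expected)


# Toy stand-ins: the "simulator" keeps the low 8 bits of each word,
# the "hardware" does the same but has bit 3 stuck at zero.
def toy_simulator(words):
    return [w & 0xFF for w in words]


def toy_faulty_hardware(words):
    return [w & 0xFF & ~0x08 for w in words]


mismatches = run_test([0x01, 0x08, 0xF0], toy_faulty_hardware, toy_simulator)
```

Only the second input word exercises the stuck bit, so it is the only entry reported in `mismatches`.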

Other programs in the test suite are concerned with timing alignment, which is very important for the Level-1 trigger and the experiment. Using data from the CTPIN test memories or from the trigger inputs with a single candidate per LHC orbit, the CTPIN, CTPMON, and CTPCORE BC Identifier (BCID) values can be aligned with respect to the BCID in the CTPCORE readout by using the BCID offsets at several stages in the processing. Similarly, using data from the MIOCT test memories or from the muon sectors sending a test pattern, the MIOCT, MICTP and MIROD BCID values can be aligned with respect to the BCID in the MICTP by using the BCID offsets at several stages in the modules as well as the MIOCT muon sector data pipelines. As a consistency check, the MIOCT can capture the muon sector data in its test memory and compare the muon sector BCID offsets with the MIOCT BCID.
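The alignment can be pictured as a modular subtraction. The sketch below is an illustration only: it assumes a BCID that counts the 3564 bunch crossings of one LHC orbit (0 to 3563) and an offset that is added to the observed value; the function and stage names are hypothetical and not taken from the module libraries.

```python
ORBIT_LENGTH = 3564  # bunch crossings per LHC orbit


def bcid_offset(reference_bcid, observed_bcid):
    """Offset to add to the observed BCID so that it matches the reference, modulo one orbit."""
    return (reference_bcid - observed_bcid) % ORBIT_LENGTH


def apply_offset(bcid, offset):
    """Apply an offset to a BCID, wrapping around at the end of the orbit."""
    return (bcid + offset) % ORBIT_LENGTH


def offsets_for_stages(reference_bcid, observed_by_stage):
    """Compute one offset per processing stage from a single reference candidate."""
    return {
        stage: bcid_offset(reference_bcid, bcid)
        for stage, bcid in observed_by_stage.items()
    }
```

With a single candidate per orbit, each stage reports exactly one BCID, so one call to `offsets_for_stages` is enough to obtain a consistent set of offsets in this simplified picture.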

#### *D. High-level Software*

The high-level software provides integration of the MUCTPI and CTP into the experiment by supplying configuration, control, and monitoring to the ATLAS TDAQ control system [7].

The trigger configuration is taken from the ATLAS trigger database which stores the event selection strategy comprising the Level-1 trigger, Level-2 trigger, and Event Filter (Level-3 trigger). The trigger tool is a graphical user interface which allows one to browse and edit all trigger menus in the trigger database. The trigger menu compiler automatically translates the high-level description of the Level-1 trigger menu to all necessary configuration files of the CTP for loading the CTPIN switch matrices and the CTPCORE LUTs and CAMs [8].

In order to be integrated with the ATLAS TDAQ control system each module type needs to be described in a schema in the ATLAS configuration database. Such a schema contains the full configuration data, except for the trigger configuration which comes from the trigger database. The schema also contains provision for describing the flow of data between modules. This allows for an automatic setup of the inputs of the BUSY (including S-Link XOFFs) and MUCTPI sectors. A plug-in for each module type into the standard RCD controller provides the dynamic aspect of control in the sense that during setup the configuration of each module is read from the configuration database and written into the module using the low-level library. In order to organise the setup in logical sets, some plug-ins span several modules, e.g. all active MIOCTs of the MUCTPI, all active CTPINs of the CTP, or the BUSY monitoring of the CTP, which reads from all CTP modules.

#### *E. Monitoring*

The monitoring of the MUCTPI and the CTP follows a producer-consumer model: the producer of information is (part of) a plug-in controller which sends data to the ATLAS information service (IS) [9], a network-based information exchange system. The consumer reads data from the IS, analyses them, and presents the information, usually in the form of a graphical user interface developed using Qt. As an example, the display of the ATLAS MIOCT Monitoring GUI can be seen in Figure 8. It shows the status of all MIOCTs of the MUCTPI.

[Screenshot: main window of the MIOCT monitoring GUI, with one column per MIOCT module and rows including EventCtr, Derandomizer Fifo, Readout Fifo, BusyCtr, TurnCtr, Monitor, ReadoutWin, Sector, MonCtr, PhMonCtr, and Memory.]

Figure 8: The MIOCT Monitoring GUI



Figure 9: The MIOCT Rate Monitoring

Some regular monitoring tasks are run for data quality monitoring, e.g. the timing-in of the MUCTPI can be checked by reading the per-bunch counters of the MIOCTs and writing the data into histograms. Figure 9 shows an example from a special run in which the muon sectors sent a known test pattern. One can clearly see that one of the sectors is wrongly aligned in BCID. The alignment was corrected in the configuration database, and the corrected setting was used from the next run onwards.
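A check of this kind can be sketched in a few lines. The example below is a simplified illustration, not the monitoring code itself: it assumes that the test pattern populates essentially one BCID per sector, takes the most common peak position as the reference, and uses hypothetical names and a toy 10-bunch "orbit".

```python
from collections import Counter


def peak_bcid(counts):
    """BCID (list index) with the largest per-bunch counter value."""
    return max(range(len(counts)), key=lambda bcid: counts[bcid])


def misaligned_sectors(per_bunch_counts):
    """Map each off-peak sector to its shift relative to the most common peak BCID."""
    peaks = {sector: peak_bcid(counts) for sector, counts in per_bunch_counts.items()}
    reference = Counter(peaks.values()).most_common(1)[0][0]
    return {
        sector: peak - reference
        for sector, peak in peaks.items()
        if peak != reference
    }


# Toy example with a 10-bunch "orbit" instead of the real LHC orbit
toy_counts = {
    "sector_A": [0, 0, 0, 0, 50, 0, 0, 0, 0, 0],
    "sector_B": [0, 0, 0, 0, 48, 1, 0, 0, 0, 0],
    "sector_C": [0, 0, 0, 0, 0, 51, 0, 0, 0, 0],
}
shifts = misaligned_sectors(toy_counts)
```

The returned shift is a plain difference of peak positions; a real check would also have to handle wrap-around at the orbit boundary.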

#### V. SUMMARY

The ATLAS Level-1 MUCTPI and CTP framework for testing and operation covers single-module and full-system testing, provides integration into the experiment, as well as many monitoring facilities. With these the ATLAS Level-1 central trigger is ready for taking data with beam.

#### VI. REFERENCES

[1] The ATLAS Collaboration, "*The ATLAS Experiment at the CERN Large Hadron Collider*", JINST 3 (2008) S08003.

[2] S. Haas *et al.*, "*The ATLAS Level-1 Muon to Central Trigger Processor Interface*", Topical Workshop on Electronics for Particle Physics, CERN-2007-007, November 2007.

[3] R. Spiwoks *et al.*, "*The ATLAS Level-1 Central Trigger Processor (CTP)*", 11<sup>th</sup> Workshop on Electronics for LHC and Future Experiments, CERN/LHCC/2005/038 265, November 2005.

[4] S. Ask *et al.*, "*The ATLAS Central Level-1 Trigger Logic and TTC System*", JINST 3 (2008) P08002.

[5] R. Spiwoks, "*The VMEbus Interface of the Central Trigger Processor*", https://edms.cern.ch/document/428910.

[6] S. Gameiro *et al.*, "*The ROD Crate DAQ Software Framework of the ATLAS Data Acquisition System*", IEEE Trans. Nucl. Sci. **53** (2006) 907-911.

[7] The ATLAS Collaboration, "*ATLAS High-Level Trigger, Data Acquisition and Controls*", Technical Design Report, CERN/LHCC/2003-022.

[8] R. Spiwoks *et al.*, "*Configuration of the ATLAS Trigger*", 11<sup>th</sup> Workshop on Electronics for LHC and Future Experiments, CERN/LHCC/2005/038 269, November 2005.

[9] A. Corso-Radu *et al.*, "*First-Year Experience from the ATLAS Online Monitoring Framework*", 17<sup>th</sup> International Conference on Computing in High-Energy and Nuclear Physics, [http://cdsweb.cern.ch/record/1181482](http://cdsweb.cern.ch/record/1181482), March 2009.

# *WEDNESDAY 23 SEPTEMBER 2009*

# *PARALLEL SESSION B3 PACKAGING AND INTERCONNECTS*

# Construction and Performance of a Double-Sided Silicon Detector Module Using the Origami Concept

C. Irmler<sup>a</sup>, M. Friedl<sup>a</sup>, M. Pernicka<sup>a</sup>

<sup>a</sup> Institute of High Energy Physics, Nikolsdorfergasse 18, A-1050 Vienna, Austria

irmler@hephy.oeaw.ac.at

#### *Abstract*

The APV25 front-end chip with short shaping time will be used in the Belle II Silicon Vertex Detector (SVD) in order to achieve low occupancy. Since fast amplifiers are more susceptible to noise caused by their capacitive input load, they have to be placed as close to the sensor as possible. On the other hand, the material budget inside the active volume has to be kept low in order to constrain multiple scattering.

We built a low-mass sensor module with double-sided readout, where thinned APV25 chips are placed on a single flexible circuit glued onto one side of the sensor. The interconnection to the other side is made by Kapton fanouts, which are wrapped around the edge of the sensor, hence the name Origami. Since all front-end chips are aligned in a row on the top side of the module, cooling can be done by a single aluminum pipe.

The performance of the Origami module was evaluated in a beam test at CERN in August 2009, of which first results are presented here.

#### I. INTRODUCTION

The Belle detector [1] is located at the interaction point of the KEK B-factory (KEKB), a low-energy but high-luminosity asymmetric electron-positron collider in Tsukuba, Japan [2]. The beam energies of KEKB are 8 GeV for $e^-$ and 3.5 GeV for $e^+$. Since its inauguration in 1999, the luminosity of the collider has been increased continuously and reached a new world record of $2.11 \times 10^{34}$ cm<sup>-2</sup>s<sup>-1</sup> in June 2009.

A major upgrade of KEKB and the Belle detector (Belle II) is foreseen by 2013/2014. The target luminosity is $8 \times 10^{35}$ cm<sup>-2</sup>s<sup>-1</sup>, which is about 40 times the present value. Accordingly, a similar increase of the background at the interaction region is expected. This leads to a significant increase of the occupancy of the silicon vertex detector (SVD), which is currently about 10% for the innermost layer and thus already at the limit with respect to track finding. Moreover, the trigger rate will rise from 450 Hz to up to 30 kHz (10 kHz on average). In the present vertex detector (SVD2) [3], the sensors are read out by VA1TA [4] front-end chips. The VA1TA is operated at only 5 MHz and has a sample-and-hold circuit, which is blocked during readout. Since this would lead to an enormous dead time, the VA1TA clearly cannot be used at such high trigger rates. Both issues, occupancy and dead time, can be solved by using a front-end chip with shorter shaping time, faster readout clock and integrated pipeline for the readout of the Belle II silicon vertex detector (SuperSVD).
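As a quick check of the scale of the upgrade, the ratios implied by the numbers quoted above are

$$\frac{8 \times 10^{35}}{2.11 \times 10^{34}} \approx 38, \qquad \frac{30\ \mathrm{kHz}}{450\ \mathrm{Hz}} \approx 67, \qquad \frac{10\ \mathrm{kHz}}{450\ \mathrm{Hz}} \approx 22,$$

for the luminosity, the peak trigger rate, and the average trigger rate, respectively. The first ratio is the "about 40 times" stated in the text.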

#### II. APV25

The APV25 front-end chip, originally developed for CMS at CERN, was identified as fulfilling the requirements of the SuperSVD not only in terms of occupancy and dead time, but also concerning radiation hardness. Owing to its shaping time of 50 ns, an occupancy reduction by a factor of 12.5 can be achieved compared to the VA1TA with 800 ns shaping. This factor is not simply the quotient of the peaking times, which is 16, but also takes into account the measured shaper waveforms of the chips as well as the thresholds. Moreover, the APV25 offers a so-called *multi-peak mode*, which allows six consecutive samples of the shaper output to be read out. By processing these data with FPGAs, the actual hit timing can be determined with a precision of a few nanoseconds, resulting in an additional occupancy reduction [5, 6]. Thanks to the clock frequency of 40 MHz and a 192-cell-deep analog pipeline, the APV25 can be read out continuously up to a trigger rate of about 50 kHz without appreciable dead time.
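Two of the numbers in this paragraph follow from simple arithmetic. The quotient of the shaping times is

$$\frac{800\ \mathrm{ns}}{50\ \mathrm{ns}} = 16,$$

and, assuming that one pipeline cell is filled per 40 MHz clock cycle (an assumption of this estimate), the analog pipeline holds

$$192 \times \frac{1}{40\ \mathrm{MHz}} = 192 \times 25\ \mathrm{ns} = 4.8\ \mu\mathrm{s}$$

of sampled data.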

Unfortunately, faster shaping comes with a higher susceptibility to noise, while robust tracking requires high spatial resolution and thus, in practice, a minimum cluster signal-to-noise ratio (SNR) of ten. The cluster SNR is given by the sum of the cluster signals divided by the square sum of the RMS noise of all strips in the cluster. Since the signal is limited by the sensor thickness (300 µm), it is necessary to minimize the noise. The noise figure of the APV25 is given by $\mathrm{ENC} = 250\,e + 36\,e/\mathrm{pF}$ and is thus worse than that of the slow VA1TA, for which it is $\mathrm{ENC} = 180\,e + 7.5\,e/\mathrm{pF}$. The constant term of this equation cannot be reduced, but the second term is proportional to the capacitive input load of the chip, which is mainly given by the sensor geometry and the length of the interconnections between sensor and readout chip. Since the geometry is defined by physics requirements, the only way to reduce the capacitive load and thus ensure a high SNR is to place the APV chips as close as possible to the sensors.
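To make the effect of the capacitive load concrete, the two noise figures and the cluster SNR can be written as a short Python sketch. The two ENC formulas are taken from the text; everything else is an assumption of this example: the "square sum" of the noise is read as a sum in quadrature (square root of the sum of squares), the capacitance values are hypothetical, and the function names are invented for illustration.

```python
import math


def enc_apv25(c_pf):
    """Equivalent noise charge of the APV25 in electrons for a load of c_pf picofarads."""
    return 250.0 + 36.0 * c_pf


def enc_va1ta(c_pf):
    """Equivalent noise charge of the VA1TA in electrons for a load of c_pf picofarads."""
    return 180.0 + 7.5 * c_pf


def cluster_snr(signals, noises):
    """Sum of the strip signals divided by the quadrature sum of the strip noise values."""
    return sum(signals) / math.sqrt(sum(n * n for n in noises))


# Hypothetical capacitive loads in pF, chosen only to show the trend
noise_table = {
    c_pf: (enc_apv25(c_pf), enc_va1ta(c_pf))
    for c_pf in (10.0, 20.0)
}
```

In this sketch, doubling the load from 10 pF to 20 pF raises the APV25 noise from 610 e to 970 e, but the VA1TA noise only from 255 e to 330 e, which is why the capacitive load matters so much more for the fast chip.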

#### III. LADDER DESIGN

The current SVD2 is composed of four layers of 4-inch double-sided silicon detectors (DSSDs). From the first to the fourth layer, it consists of cylindrically arranged ladders with 2, 3, 5 and 6 DSSDs, respectively. The radius of the innermost layer is 20 mm, that of the outermost is 88 mm. Since KEKB is a low-energy machine, multiple scattering has to be considered with respect to vertex resolution and tracking efficiency, and the material budget in the sensitive area is an important issue. Therefore, in SVD2 up to three sensors were concatenated (ganged) and read out together by hybrids located outside the acceptance region at the edge of each ladder. Fig. 1 shows a photograph of all four ladder types. As all ladders are read out by the same number of hybrids, this scheme further reduces the number of readout channels, but also creates ambiguities, which have to be resolved by the tracking algorithm.



Figure 1: The ladders of all four layers of the present Belle Silicon Vertex Detector (SVD2). In the ladders of the outermost layer, three sensors are ganged and read out by hybrids on either side.

On the other hand, sensor ganging significantly increases the capacitive load of the readout chips. As described in Section II, this is no problem in the case of SVD2 because of the moderate peaking time and thus low noise figure of the VA1TA chip. However, the situation is different for the SuperSVD and the APV25, where the capacitive load of ganged sensors would be too high. In the past we built a prototype module using two 4" DSSDs read out by four APV25 chips on each side. Aiming to compare the SNR of a single sensor and of two ganged sensors, we concatenated 384 of the 512 strips on each side, so that one of the APV chips on each side is connected to a single sensor only.



Figure 2: Tentative layout of the SuperSVD with two DEPFET pixel layers surrounded by four cylindrical DSSD layers. The green sensors are read out conventionally, the red ones use the Origami chip-on-sensor concept.

Measurements have shown that even ganging of two 4" sensors leads to a poor cluster signal-to-noise ratio of 10 (n-side) or below (p-side), depending on the pitch, while the values of a single sensor are reasonable [8]. Hence ganging of two or more detectors is not an option with fast shaping.

Furthermore, it has to be considered that the SVD in Belle II will extend to a radius of 140 mm. As shown in Fig. 2, it will again consist of four cylindrical layers of double-sided silicon sensors, with two additional DEPFET pixel [7] layers in the innermost region; hence the SuperSVD will have six layers in total. Since the length of the ladders, and thus also the number of sensors and readout channels, increases with the radius, sensors made of 6" wafers are preferred.

In order to achieve a reasonable SNR and thus a good spatial resolution, each sensor will be read out individually by four or six APV25 chips per side, depending on its pitch and location. For the innermost layer, and also for the sensors located at the edges of layers 4 to 6, this can be done using conventional hybrids mounted outside the acceptance. The inner sensors will be read out by hybrids following the Origami chip-on-sensor concept, which is described in the next section.

#### IV. ORIGAMI CHIP-ON-SENSOR CONCEPT

From the discussion of fast shaping and sufficient signal-to-noise ratio, it is clear that the APV25 chips have to be placed as close as possible to the sensor strips, leading to a chip-on-sensor concept. This means that the readout chips, together with the hybrid circuit, sit on top of the sensor in order to minimize the length of the fanouts. Such a concept allows a single side of a DSSD to be read out. In 2006 we built a prototype module based on this scheme, where the short strips of a 4" DSSD (n-side) were read out by APV25 chips located on a hybrid made of a double-layer Kapton circuit [8], separated from the sensor by a sheet of rigid foam called Rohacell [9]. The module was tested in several beam tests, where it showed excellent performance. The achieved SNR was about 18 and thus significantly higher than that of modules using the same sensor but conventional readout.

Keeping the material budget in mind, the question was how to extend this chip-on-sensor concept to double-sided readout without doubling everything.

The solution is the "Origami chip-on-sensor concept", which we already presented earlier [8]. Nevertheless, we will briefly describe this idea here. Fig. 3 shows drawings of top and side views of an Origami chip-on-sensor module. In that scheme, the APV25 chips of both sides are placed on a single flexible circuit, mounted onto one side of the sensor. This flex-hybrid is made of only three copper layers and contains integrated pitch adapters to connect the strips on the same side as the hybrid. The channels of the opposite side are attached by small flexible fanouts wrapped around the edge of the sensor, hence the name Origami. All connections between flex pieces, sensor and APV chips are made by wire bonds. The depicted design is intended for a 4" DSSD with 512 strips on each side, each side being read out by four APV25 chips.

Thermal and electrical insulation between hybrid and sensor is given by a 1 mm thick layer of low-mass but rigid foam (Rohacell). Nevertheless, sufficient cooling of the APVs is required, since the power dissipation of each chip is about 350 mW. By testing several cooling options using a thermo-mechanical mockup of a future SVD ladder, liquid cooling was identified as the only feasible solution [8] and is thus foreseen in the concept. Arranging all front-end chips in a row allows cooling by a single aluminum pipe, which can eventually also be used as a mechanical support, together with a Zylon rib, which is foreseen in the longitudinal direction. The mechanical structure of the concept is kept very simple and is mainly based on the design of the present SVD ladders. A detailed study, which also addresses the stability of a whole ladder as well as thermal issues, has been started recently.
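As a rough cross-check of the cooling requirement, the following sketch estimates the number of APV25 chips and the total heat load of one Origami module. It assumes 128 input channels per APV25 (the chip's standard channel count, which is not stated in this section); the function names are purely illustrative.

```python
import math

APV25_CHANNELS = 128   # assumption: input channels per APV25 chip
CHIP_POWER_W = 0.350   # power dissipation per chip, as quoted in the text


def chips_per_side(n_strips: int, channels_per_chip: int = APV25_CHANNELS) -> int:
    """Smallest number of chips that covers all strips of one sensor side."""
    if n_strips <= 0 or channels_per_chip <= 0:
        raise ValueError("strip and channel counts must be positive")
    return math.ceil(n_strips / channels_per_chip)


def module_heat_load_w(strips_top: int, strips_bottom: int) -> float:
    """Total power dissipated by the readout chips of one double-sided module."""
    n_chips = chips_per_side(strips_top) + chips_per_side(strips_bottom)
    return n_chips * CHIP_POWER_W


if __name__ == "__main__":
    # 4" DSSD with 512 strips on each side, as in the depicted design
    print(chips_per_side(512), "chips per side")
    print(f"{module_heat_load_w(512, 512):.2f} W per module")
```

For the 512-strip design this gives four chips per side, in agreement with the text, and a heat load of about 2.8 W per module that the cooling pipe has to remove.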




Figure 3: Top and side views of the Origami chip-on-sensor concept for a 4" DSSD. Both are dimensional, but on a different scale. a) Top view: The four APV25 chips which read out the strips on the opposite side are shown in green for clarity and the flex pieces to be wrapped around the edges are straightened out. b) Side view: The wrapped flex, which connects the strips of the bottom side, is located at the left edge.

It is clear that using such a hybrid inevitably increases the material budget in the sensitive volume, but there is no alternative solution, particularly in the outer layers, to ensure reasonable SNR with fast shaping. To achieve the lowest possible material budget, the APV chips will be thinned down to approximately 100 $\mu$m. Even so, the calculated average material budget is about 0.72 % $X_0$, about 1.5 times that of the conventional design, but it offers a significant improvement in signal-to-noise ratio.
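To put these numbers in perspective, the sketch below converts a silicon thickness into a fraction of a radiation length and derives the conventional-design budget implied by the quoted factor of 1.5. It assumes a radiation length of 93.7 mm for silicon (the commonly tabulated value) and uses the 300 $\mu$m unthinned chip thickness quoted in the thinning section. The resulting chip values are local (directly under a chip), not the module average, which also includes sensor, Kapton, foam and cooling pipe.

```python
X0_SILICON_MM = 93.7  # assumption: radiation length of silicon, about 9.37 cm


def percent_x0(thickness_um: float, x0_mm: float = X0_SILICON_MM) -> float:
    """Thickness of a layer expressed in percent of a radiation length."""
    if thickness_um < 0 or x0_mm <= 0:
        raise ValueError("thickness must be >= 0 and X0 must be > 0")
    return 100.0 * (thickness_um * 1e-3) / x0_mm


def implied_conventional_budget(origami_percent: float, ratio: float) -> float:
    """Budget of the conventional design implied by the quoted ratio."""
    return origami_percent / ratio


if __name__ == "__main__":
    print(f"unthinned chip (300 um): {percent_x0(300):.3f} % X0 locally")
    print(f"thinned chip   (100 um): {percent_x0(100):.3f} % X0 locally")
    print(f"conventional design: {implied_conventional_budget(0.72, 1.5):.2f} % X0")
```

Under these assumptions, thinning reduces the local silicon contribution of a chip from roughly 0.32 % to roughly 0.11 % of $X_0$, and the quoted ratio implies about 0.48 % $X_0$ for the conventional design.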

#### V. APV THINNING

The APV25 is about 300 $\mu$m thick, but its active electronics is located only at the surface. We plan to thin it down to 100 $\mu$m in order to minimize the material budget of the Origami modules. Since this had never been tested before, we sent one wafer with 319 good dies to the French company EDGETEK/WSI for thinning and dicing. We received 314 good dies with an average thickness of 106 $\mu$m. Only 5 pieces were lost, corresponding to a yield of 98.4%. We randomly took 16 of them to equip 4 (conventional) hybrids. Moreover, we also assembled one hybrid with normal (unthinned) APVs for comparison. Electrical tests of all five hybrids have shown that there is no measurable difference in signal quality and noise between thinned and normal APV25 chips. Hence, we conclude that we can actually use thinned chips.
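The yield quoted above can be reproduced with a few lines. The sketch also attaches a simple binomial standard error, which is our own addition for illustration and is not quoted in the text.

```python
import math


def thinning_yield(good_before: int, good_after: int) -> float:
    """Fraction of good dies that survive thinning and dicing."""
    if good_before <= 0 or not 0 <= good_after <= good_before:
        raise ValueError("need 0 <= good_after <= good_before and good_before > 0")
    return good_after / good_before


def binomial_std_error(p: float, n: int) -> float:
    """Standard error of a binomial proportion p measured on n trials."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    y = thinning_yield(319, 314)
    err = binomial_std_error(y, 319)
    print(f"yield = {100 * y:.1f} % +/- {100 * err:.1f} %")
```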

#### VI. PROTOTYPE ASSEMBLY

In order to show that our idea is feasible in principle, we built the first fully functional prototype of an Origami chip-on-sensor module. For this we used the same 4" DSSD from Hamamatsu, Japan, as for the prototype modules described in sections III and IV.



Figure 4: An Origami hybrid and the two flex fanouts.

We started by designing the layout of the three flex circuits, which were later produced by the CERN PCB workshop. An image of the final flex pieces is shown in fig. 4. The hybrid is composed of only three copper and two Kapton layers with thicknesses of 10 $\mu$m and 25 $\mu$m, respectively. The component layer as well as the bonding pads of the fanouts are gold-plated. To comply with the design rules of the PCB workshop, both fanouts as well as the pitch adapters on the hybrid were implemented in a staggered 2-layer design.

### *A. Attaching Flexes on Bottom Side*

Assembling an Origami module requires about 12 steps in total, of which the most interesting or critical ones are described here. First of all, the two flex fanouts were glued onto the sensor side with the long strips, which later becomes the bottom side of the module. For this, a two-component epoxy paste adhesive (Araldite 2011) was used. The sensor was placed on a custom jig (jig1) with a porous stone inlay and held by vacuum. At this point, precise alignment of the flexes against the bonding pads of the sensor and the hybrid is very important. To adjust the distance between the two flexes we used a dummy prototype of the hybrid (fig. 5). Since this is not suitable for serial production, more precise alignment tools should be used instead.



Figure 5: Gluing of the flexes onto the bottom side (long strips) of the DSSD. The pieces are aligned against the bonding pads of the strips and a sample origami hybrid (top left) using a microscope.

After curing of the glue, wire-bonding between sensor and flexes is the next and also the last task, which has to be performed on this sensor side. In order to flip the sensor and the flexes, a second very similar jig (jig2) was stuck onto the first one with three alignment pins. Once the whole thing (both jigs and the module) is turned over and vacuum is switched to jig2, jig1 can be removed.

### *B. Hybrid Assembly*

In parallel to A, we equipped the hybrid with all passive electronic components, glued it onto the Rohacell foam, attached the APV chips and wire-bonded the power and control lines of the APV25s. Then we performed the first electrical tests and found several open vias in the Kapton hybrid, most of which could be repaired by soldering thin wires. Later, we found out that the broken vias were caused by a failure during hybrid production (the vias were not entirely filled with metal), which should not occur again in future. In the end, 7 of 8 APV25 chips worked well. We performed an internal calibration run with excellent results. The last chip still had a broken via in one of its two differential output lines, which could not be repaired. Afterwards the hybrid was glued onto the sensor and aligned to the bonding pads of both sensor and fanouts (fig. 6), followed by wire-bonding between the sensor and the pitch adapters of the top-side APVs.



Figure 6: The Origami hybrid after it was glued onto the top side of the sensor. A piece of metal is put onto it to press it down during curing of the glue.

### *C. Bend and Glue Fanouts*

Initially we thought that bending the fanouts around the edge of the sensor without damaging the underlying wire bonds of the top side would be the most critical task. Thanks to a micro-positioner with a custom vacuum nozzle, this task was fairly easy. This tool is depicted in fig. 7; it allows very precise positioning as well as lowering of the fanout, and it was also used to hold the pieces in place while the glue was curing. Afterwards the input channels of all APV chips were connected to their pitch adapters by wire bonding.



Figure 7: Bending and positioning of the flex fanouts using a custom vacuum tool attached to a positioner, which is normally used for probe needles.

### *D. Attaching Cooling Pipe and Frame*

One of the remaining tasks was to attach the cooling pipe onto the front-end chips. The area of the preamp/shaper of the APV25 has been identified as the region with the highest power dissipation by measurements with an infrared camera. In order to maximize cooling efficiency, this location was chosen to attach the cooling pipe. Since the chips of both top and bottom sides reside at the voltage potential of the sensor side to which they are connected, i.e. $\pm 40$ V, a thin electrically insulating but thermally conductive foil is placed between the chips and the pipe together with heat-conductive paste. Moreover, the cooling pipe is slightly flattened to improve the thermal contact. The connection to the cooling system (chiller) is made by two fittings, one at each end of the pipe.

Finally, the mechanical support was attached and the module was built into a frame for beam tests. For availability reasons we used a 5 mm high rib made of epoxy rather than Zylon as the support structure of the prototype module. A photograph of both sides of the final module is shown in fig. 8.



Figure 8: Top and bottom views of the final Origami module, built into a frame for beam tests.

#### VII. BEAM TEST PERFORMANCE

In August 2009 we performed a beam test at the CERN SPS beam line, where the Origami prototype module was tested together with other, already well tested, Belle DSSD prototype modules for comparison. The beam was a mixture of pions, protons and kaons at 120 GeV/c. The Origami module was operated for several hours both with and without cooling, and it showed excellent performance. For cooling we used a chiller and distilled water at a temperature of 13 °C, chosen to be slightly above the dew point to avoid condensation.

| Side   | Cluster SNR w/o cooling | Cluster SNR w/ cooling |
|--------|-------------------------|------------------------|
| top    | 16.7                    | 18.5                   |
| bottom | 11.5                    | 12.8                   |

Table 1: Cluster SNR of the Origami prototype module without and with cooling, respectively.

As shown in tab. 1, a signal-to-noise ratio of about 16.7 was achieved without cooling for the top side, which has the short strips. Due to the narrow pitch, the longer strips and the larger fanouts, the result for the bottom side (p-side) is slightly worse, but still above ten. However, considering the results of the reference modules, we observed that the noise level of the whole system was about 10% higher than in our previous beam tests, caused by a cabling failure resulting in a ground loop. This means that slightly better values can be expected with correct system cabling. Moreover, tab. 1 shows that cooling leads to an improvement of about 10 percent in the signal-to-noise ratio.
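The two statements above can be checked numerically against the values of tab. 1. The sketch below computes the relative SNR gain from cooling and estimates the SNR one would expect without the 10% excess noise. The latter assumes that the signal is unchanged and that the excess applies uniformly to the cluster noise, which is a simplification on our part.

```python
# Cluster SNR values taken from Table 1
CLUSTER_SNR = {
    "top": {"uncooled": 16.7, "cooled": 18.5},
    "bottom": {"uncooled": 11.5, "cooled": 12.8},
}


def relative_gain(before: float, after: float) -> float:
    """Relative change (after - before) / before."""
    return (after - before) / before


def snr_without_excess_noise(snr: float, excess_noise_fraction: float) -> float:
    """SNR expected if the excess noise were removed (signal assumed unchanged)."""
    return snr * (1.0 + excess_noise_fraction)


if __name__ == "__main__":
    for side, values in CLUSTER_SNR.items():
        gain = relative_gain(values["uncooled"], values["cooled"])
        corrected = snr_without_excess_noise(values["cooled"], 0.10)
        print(f"{side}: cooling gain {100 * gain:.1f} %, "
              f"cooled SNR without ground loop ~ {corrected:.1f}")
```

Both sides show a gain of about 11%, consistent with the "about 10 percent" stated in the text.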

We further applied a hit time finding algorithm [8] to the data and compared the results to those of previous beam tests. The results of this analysis are plotted in fig. 9. It is clearly visible that the present results (red crosses), and particularly those of the Origami module (red dots), show a precision similar to that of previous measurements (gray crosses).

#### VIII. SUMMARY AND OUTLOOK

Motivated by the Belle II upgrade, we developed the Origami chip-on-sensor concept, which allows both sides of a DSSD to be read out by APV25 chips using a single flexible circuit sitting on the top side of the sensor. This concept provides both a low material budget and short connections between the sensor and the front-end chips in order to ensure a sufficient signal-to-noise ratio at fast shaping. A prototype using a hybrid composed of a flexible 3-layer Kapton circuit and a 4" DSSD was built and successfully tested in a beam.

Recently, we started to design the mechanical structure of the future Belle II SVD, aiming to build a complete prototype ladder of its outermost layer using 6" sensors and an enhanced Origami based readout.




Figure 9: Time resolution in relation to signal-to-noise ratio of various Belle II prototype modules, measured at several beam tests.

#### **REFERENCES**

- [1] A. Abashian *et al.* (The Belle Collaboration), **The Belle Detector**, Nucl. Instr. and Meth. A 479 (2002), 117–232
- [2] S. Kurokawa, E. Kikutani, **Overview of the KEKB Accelerators**, Nucl. Instr. and Meth. A 499 (2003), 1–7, and other articles in this volume
- [3] H. Aihara *et al.*, **Belle SVD2 vertex detector**, Nucl. Instr. and Meth. A 568 (2006), 269–273
- [4] **VA1TA Chip**, http://www.ideas.no/products/ASICs/pdf/Va1Ta.pdf
- [5] M. Friedl *et al.*, **Obtaining exact time information of hits in silicon strip sensors read out by the APV25 front-end chip**, Nucl. Instr. and Meth. A 572 (2007), 385–387
- [6] M. Friedl *et al.*, **Readout and Data Processing Electronics for the Belle-II Silicon Vertex Detector**, this volume
- [7] P. Fischer *et al.*, **Progress towards a large area, thin DEPFET detector module**, Nucl. Instr. and Meth. A 582 (2007), 843–848
- [8] M. Friedl *et al.*, **The Origami Chip-on-Sensor Concept for Low-Mass Readout of Double-Sided Silicon Detectors**, CERN-2008-008 (2008), 277–281
- [9] Rohacell (http://www.rohacell.com) is a rigid type of styrofoam produced by Degussa.
- [10] Zylon (http://www.toyobo.co.jp/e/seihin/kc/pbo/) is a stiff, but light-weight material made by Toyobo.

# **Application of a new interconnection technology for the ATLAS pixel upgrade at SLHC**

A. Macchiolo<sup>a</sup>, L. Andricek<sup>b</sup>, M. Beimforde<sup>a</sup>, H.-G. Moser<sup>b</sup>, R. Nisius<sup>a</sup>, R.H. Richter<sup>b</sup>

<sup>a</sup> Max-Planck-Institut für Physik, Föhringer Ring 6, D80805, München, Germany <sup>b</sup> Max-Planck Institut Halbleiterlabor, Otto Hahn Ring 6, D81739, München, Germany

[Anna.Macchiolo@mpp.mpg.de](mailto:Anna.Macchiolo@mpp.mpg.de)

#### *Abstract*

We present an R&D activity aiming towards a new detector concept in the framework of the ATLAS pixel detector upgrade exploiting a vertical integration technology developed at the Fraunhofer Institute IZM-Munich. The Solid-Liquid InterDiffusion (SLID) technique is investigated as an alternative to the bump-bonding process. We also investigate the extraction of the signals from the back of the read-out chip through Inter-Chip-Vias to achieve a higher fraction of active area with respect to the present ATLAS pixel module. We will present the layout and the first results obtained with a production of test-structures designed to investigate the SLID interconnection efficiency as a function of different parameters, i.e. the pixel size and pitch, as well as the planarity of the underlying layers.

#### I. INTRODUCTION

An upgrade of the present LHC accelerator, which is designed to reach an instantaneous luminosity of $10^{34}$ cm<sup>-2</sup> s<sup>-1</sup>, is planned to increase this value by a factor of ten in a two-phase process [1].

In Phase 1, an upgrade to a peak luminosity of $(2-3) \times 10^{34}$ cm<sup>-2</sup> s<sup>-1</sup> is foreseen around the year 2014 without changes to the machine hardware. The Phase 2 upgrade (Super LHC, SLHC) is expected to be realized around 2018, reaching a maximum luminosity of $10^{35}$ cm<sup>-2</sup> s<sup>-1</sup> by modifications to the insertion quadrupoles and changes to the main machine parameters. In this scenario the innermost layers of the ATLAS vertex detector system will have to sustain very high integrated fluences of more than $10^{16}$ n<sub>eq</sub> cm<sup>-2</sup> [2]. The resulting defects in the semiconductor sensors cause higher leakage currents and a reduced charge collection distance. This leads to a higher noise contribution and reduced signal sizes. Thin planar pixel sensors are among the candidate technologies for the replacement of the ATLAS pixel system. At the same applied bias voltage a higher electric field is present across a thinner detector, and an increased Charge Collection Efficiency is expected with respect to sensors with a standard thickness of (250-300) μm [3]. Another challenge for the upgraded system is the increased occupancy in the pixel cells due to the higher luminosity. Reducing the pixel size in the inner part of the tracking detectors is a natural choice for decreasing the occupancy and at the same time increasing the spatial resolution. Succeeding the FE-I3 readout chip, the $20.0 \times 18.6$ mm<sup>2</sup> FE-I4 chip [4] will feature a reduced pixel size of $50 \times 250$ μm<sup>2</sup> to address these issues for the Phase 1 upgrade and for the outer layers of the ATLAS pixel system at the SLHC. For the innermost pixel layers at the SLHC a still higher rate capability is needed. A possible solution is the development of vertically integrated (3D) electronics, which can lead to an additional reduction of the pixel size and hence of the cell occupancy, because 3D circuits allow a more compact design thanks to multiple tiers. A preliminary version of the FE-I4 chip is already being translated into the 3D technology in a Multi-Project run with Tezzaron-Chartered [5].
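The occupancy argument can be illustrated with a first-order estimate in which the per-pixel occupancy scales with the pixel area and with the luminosity. This is a sketch only: it ignores cluster size, dead time and readout architecture, and the FE-I3 pixel size of 50 × 400 μm² is an assumption taken from the present ATLAS pixel geometry rather than from this section.

```python
FE_I3_PIXEL_UM = (50.0, 400.0)  # assumption: present ATLAS pixel cell
FE_I4_PIXEL_UM = (50.0, 250.0)  # pixel cell quoted in the text


def pixel_area_um2(pixel: tuple[float, float]) -> float:
    """Area of a rectangular pixel cell in square micrometres."""
    width, length = pixel
    return width * length


def relative_occupancy(new_pixel, ref_pixel, luminosity_factor: float = 1.0) -> float:
    """Per-pixel occupancy relative to the reference, to first order."""
    area_ratio = pixel_area_um2(new_pixel) / pixel_area_um2(ref_pixel)
    return luminosity_factor * area_ratio


if __name__ == "__main__":
    same_lumi = relative_occupancy(FE_I4_PIXEL_UM, FE_I3_PIXEL_UM)
    slhc = relative_occupancy(FE_I4_PIXEL_UM, FE_I3_PIXEL_UM, luminosity_factor=10.0)
    print(f"FE-I4 vs FE-I3 at equal luminosity: {same_lumi:.3f}")
    print(f"FE-I4 vs FE-I3 at tenfold luminosity: {slhc:.2f}")
```

In this simple picture the smaller FE-I4 cell alone does not compensate a tenfold luminosity increase, which is in line with the need for a still higher rate capability in the innermost layers.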

# *A. Vertical Integration for the ATLAS pixel upgrade*

The work reported in this paper aims at developing a new detector concept in the framework of the ATLAS Inner Tracker upgrade. In particular, we envisage a demonstrator module composed of pixel sensors, 75 and 150 μm thick, connected to their front-end electronics by the novel Solid–Liquid InterDiffusion (SLID) [6] process developed by the Fraunhofer Institut IZM-Munich. At the moment a chip designed to read out the ATLAS pixel detectors exploiting the 3D integration is not available. As a first step the present ATLAS FE-I3 chip will be used. We explore the SLID interconnection as a possible alternative to bump-bonding, which turned out to be the main cost driver during the production of the ATLAS pixel modules. We plan to use Inter-Chip-Vias (ICV), a process also developed by IZM, for the extraction of the signals from the ASIC backside. This allows for the design of new four-side buttable devices, without additional space needed for wire bonding. The chip in the demonstrator module will be thinned down to 50 μm.



Figure 1: Proposed module layout with thin sensors and the ICV-SLID vertical integration technology.

#### II. PRODUCTION OF THIN PIXEL SENSORS

The production of n-in-n and n-in-p thin pixel sensors has been completed by the Semiconductor Laboratory (HLL) of the Max-Planck-Institut (MPP). The final active thickness is 75 μm for the n-in-n devices, and 75 or 150 μm for the n-in-p devices. The pixel sensor geometry is such that they can be interfaced to a single FE-I3 chip. The method adopted for the production of these devices, developed at the MPI HLL, allows the sensor thickness to be adjusted freely down to 50 μm [7]. In a first step the backside implantation is performed on the standard wafers. Then the sensor wafers are bonded to a handling substrate and thinned to the desired thickness from the front side. After polishing, the front side processing and passivation are carried out. Finally the handle wafer can be selectively removed by deep anisotropic etching. This procedure allows for frames to be left in the handle wafer to improve the mechanical stability. For the present pixel production the handle wafer has not yet been etched away since it serves as a support during the ASIC interconnection phase. The pre-irradiation characterization has been completed for the p-type wafers [8]. As shown in Fig. 2, very good pixel performance has been achieved, with breakdown voltages that are high compared to the depletion voltages of 40 V for the 75 μm thick detectors and 100 V for the 150 μm thick detectors.
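The introduction argues that, at a given bias voltage, a thinner sensor sees a higher electric field. A minimal sketch of that argument is given below in the uniform-field approximation $E \approx V/d$. This is an assumption made for illustration: the real field profile depends on the doping and on the depletion state. The 100 V bias is an arbitrary example value, and the thicknesses are those quoted in this paper.

```python
def mean_field_kv_per_cm(bias_v: float, thickness_um: float) -> float:
    """Mean electric field V/d in kV/cm for a sensor of the given thickness."""
    if thickness_um <= 0:
        raise ValueError("thickness must be positive")
    thickness_cm = thickness_um * 1e-4
    return bias_v / thickness_cm / 1e3


if __name__ == "__main__":
    bias = 100.0  # example bias voltage in volts (illustrative choice)
    for thickness in (75.0, 150.0, 300.0):
        field = mean_field_kv_per_cm(bias, thickness)
        print(f"d = {thickness:5.0f} um: mean field ~ {field:.1f} kV/cm")
```

In this approximation a 75 μm sensor has four times the mean field of a 300 μm sensor at the same bias.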



Figure 2: Leakage current of the p-type pixel sensors.

#### III. THE SLID INTERCONNECTION

The ICV-SLID vertical integration technology makes it possible to build multi-layer semiconductor devices with inter-chip vias (ICV) for vertical signal transport. The SLID process starts with the deposition of a thin layer of TiW on the sensor and electronics surfaces as a diffusion barrier for the copper. Then a 5 μm thick Cu layer is applied on both sides, followed by a 3 μm thick Sn layer, which is electroplated only onto the sensor. The contact areas where these last processes take place are defined through a mask. Finally the sensor and the ASIC are aligned and brought into contact at a temperature of around 300 °C under a pressure of 5 bar. A Cu<sub>3</sub>Sn alloy is formed, which electrically and mechanically connects the two devices. The alloy is stable up to temperatures of 600 °C, allowing the SLID interconnection to be applied successively to different tiers. The possible influence of the SLID metal system on the sensor properties has been investigated and no detectable effects have been observed [9].

This technique is a possible alternative to bump-bonding and, since it consists of fewer processing steps, it has the potential of being cheaper. Moreover the SLID pads can have variable shapes and sizes, adjusted to the sensor geometry and to the requirements on the mechanical stability. A drawback is that reworking, in case of failure, is not feasible, as opposed to bump-bonding.

## *A. Test production with daisy chains*

A test production with daisy chains has been completed at the MPP. It consists of 6" wafers where both the sensor and chip parts have been implemented, mirrored with respect to the horizontal axis. This makes it possible to study the placement precision of the chips and the efficiency of the SLID connection for both a wafer-to-wafer and a chip-to-wafer approach, while using a single set of masks. Different pad sizes and pitches have been implemented to explore the limits of the technology. Aplanarities of the silicon surfaces have been realized by introducing steps in the SiO<sub>2</sub> layer (100 nm) or the aluminum layer (1 μm) below the SLID interconnection pad. Figure 3 depicts the schematics of the daisy chains.



Figure 3: Schematics of the daisy chain design for the SLID testing.

The wafer-to-wafer process has been successfully achieved, yielding an inefficiency of less than $10^{-3}$ for most of the tested chains. Table 1 summarizes the results of all the daisy chain measurements. The resistance per pad varies between 0.25 and 1.5 Ω in the different chains. The sizeable statistical uncertainties are not due to the limited number of measured connections but to the spread observed among different chains with the same geometry.

In addition, measurements of the misalignment for the wafer-to-wafer interconnection gave an average value of less than 5 μm for the first of the two packages tested, and (5-10) μm for the second one. In this kind of assembly the chip wafer is diced after it has been attached to the handle wafer used for the interconnection. A varying misalignment for the different structures present in a package can arise from the fact that the epoxy-based adhesive used to attach the chips to the handle wafer softens at the temperature needed for SLID before the Cu<sub>3</sub>Sn alloy is completely formed.

Observations performed with an infrared microscope revealed, in some of the chains with larger pad size (80x80 μm<sup>2</sup>), an outflow of the tin layer creating a short with the neighbouring pads. No problems were visible for the chains with smaller pad sizes and in the structures where we have implemented the pad geometry (27x60 μm<sup>2</sup>) needed for the interconnection of the ATLAS pixels (Figure 4). The need to optimize the amount of tin according to the pad size and the applied pressure leads to the requirement of limiting the number of different chip geometries in the same package.

A chip-to-wafer assembly using these daisy chain structures is on-going. In this case the chip structures are diced before being placed onto the handle wafer. This will be the method adopted for the face-to-face connection of the FE-I3 chip to the thin pixel sensors.



Figure 4: Infrared image of the interconnected daisy chain of the ATLAS pixel geometry. The darker rectangles correspond to the SLID pads.

## IV. THE INTER CHIP VIAS

The 3D-interconnection technology also offers the possibility of extracting the signals from the backside of the chip by using the Inter Chip Vias (ICV) process. The ICV technique is being applied for the read-out of the MPP thin pixel sensors connected to the FE-I3 chips.

The foreseen process flow for the FE-I3 starts with the etching of the vias on the chip pads originally designed for the wire bonding, after having removed the last aluminum layer. The via cross-section is 3x10 μm<sup>2</sup>, and the initial depth is 60 μm. For lateral via isolation a Chemical Vapour Deposition (CVD) of silicon dioxide is applied, and then the ICVs are metalized with a tungsten filling.

After this step the electroplating is performed on the front side, which is finally passivated. The chip wafer is then bonded to a handle wafer and thinned down from the backside to 50 μm to expose the vias. An isolation layer is deposited on the backside and the metallization needed to create the new contact pads is applied.

The design of the sensor-chip interconnection is at the moment being finalized with the determination of the optimal placement of the SLID pads and the vias.



Figure 5: Schematic view of the Inter Chip Vias

#### V. CONCLUSIONS

In the R&D effort towards thin pixel sensors and the ICV-SLID vertical integration, a first production of thin pixel sensors and SLID test structures has been completed. A high efficiency of the SLID interconnection was measured independently of the chosen pad sizes. This activity will proceed with the SLID interconnection of thin pixel structures to the ATLAS FE-I3 chip and the extraction of the signals from the backside of the chip.

## VI. REFERENCES

- [1] F. Gianotti et al., Eur. Phys. J. C 39 (2005) 293
- [2] M. Bruzzi et al., NIM A 541 (2005) 189
- [3] G. Casse et al., 'Studies on Charge Collections Efficiencies for Planar Silicon Detectors after doses up to 10<sup>16</sup> n<sub>eq</sub>/cm<sup>2</sup> and the Effect of varying Substrate Thickness', accepted for publication in IEEE Trans. Nucl. Sci., December 2009
- [4] M. Barbero et al., NIM A 604 (2009) 397
- [5] J.C. Clemens, these proceedings
- [6] A. Klumpp et al., Japanese Journal of Applied Physics 43, No. 7A
- [7] L. Andricek et al., IEEE Trans. Nucl. Sci., Vol. 51, No. 3 (2004) 1117
- [8] M. Beimforde, proceedings of the "7th International Hiroshima Symposium on Development and Applications of Semiconductor Tracking Devices", NIM A, to be published
- [9] A. Macchiolo et al., NIM A 591 (2008) 229

| Pad size [μm<sup>2</sup>] | Pitch [μm] | Aplanarity [μm] | Connections measured | Inefficiency $[10^{-3}]$ |
|---------------------------|------------|-----------------|----------------------|--------------------------|
| 30x30                     | 60         |                 | 8288                 | < 0.36                   |
| 80x80                     | 115        |                 | 1120                 | < 2.7                    |
| 80x80                     | 100        |                 | 1288                 | < 2.3                    |
| 27x60                     | 50, 400    |                 | 24160                | $0.5 \pm 0.1$            |
| 30x30                     | 60         | 0.1             | 5400                 | $1.0 \pm 0.4$            |
| 30x30                     | 60         | 1.0             | 5400                 | $0.4 \pm 0.3$            |

Table 1: Geometrical parameters and performances of various SLID connection options
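The paper does not state how the upper limits in the first three rows of the table were derived. One common choice, shown here only as a hedged sketch, is the limit obtained when no failed connection is observed among $n$ tested connections: the largest failure probability $p$ for which $(1-p)^n$ is still at least $1 - \mathrm{CL}$. The 95% confidence level used below is our assumption; with it, the three quoted limits are reproduced to the precision given in the table.

```python
import math


def zero_failure_upper_limit(n_connections: int, confidence: float = 0.95) -> float:
    """Upper limit on the failure probability when 0 failures are seen in n trials.

    Solves (1 - p)**n = 1 - confidence for p.
    """
    if n_connections <= 0:
        raise ValueError("number of connections must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # expm1/log1p keep precision for the very small probabilities involved
    return -math.expm1(math.log1p(-confidence) / n_connections)


if __name__ == "__main__":
    for n in (8288, 1120, 1288):
        limit = zero_failure_upper_limit(n)
        print(f"n = {n:5d}: inefficiency < {limit * 1e3:.2f} x 10^-3")
```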

# 3D electronics for hybrid pixel detectors – TWEPP-09

S. Godiot<sup>a</sup>, M. Barbero<sup>b</sup>, B. Chantepie<sup>a</sup>, J.C. Clémens<sup>a</sup>, R. Fei<sup>a</sup>, J. Fleury<sup>c</sup>, D. Fougeron<sup>a</sup>, M. Garcia-Sciveres<sup>c</sup>, T. Hemperek<sup>b</sup>, M. Karagounis<sup>b</sup>, H. Krueger <sup>b</sup>, A. Mekkaoui<sup>c</sup>, P. Pangaud<sup>a</sup>, A. Rozanov<sup>a</sup>, N. Wermes <sup>b</sup>

> <sup>a</sup> Centre de Physique des Particules de Marseille, France <sup>b</sup> University of Bonn, Germany <sup>c</sup>Lawrence Berkeley National Laboratory, California, USA

## twepp@cern.ch

#### *Abstract*

Future hybrid pixel detectors call for smaller pixels in order to improve spatial resolution and to cope with increasing counting rates. The foreseen way to meet these requirements is to shrink the microelectronics technology. However, this straightforward approach has some disadvantages in terms of performance and cost. New 3D technologies offer an alternative, with the added advantage of technology mixing.

For the upgrade of the ATLAS pixel detector, a 3D design of the read-out chip appears to be an interesting solution. Splitting the pixel functionalities into two separate levels will reduce the pixel size and open the opportunity to benefit from technology mixing. Based on a previous prototype of the read-out chip FE-I4 (IBM 130nm), this paper presents the design of a hybrid pixel read-out chip using three-dimensional Tezzaron-Chartered technology. In order to disentangle effects due to the Chartered 130nm technology from effects introduced by the 3D architecture, a first translation of the FE-I4 prototype was designed at the beginning of this year in Chartered 2D technology, and first test results are presented in the last part of this paper.

#### I. INTRODUCTION

Improving spatial resolution and dealing with higher luminosity and radiation levels is one of the challenges of the ATLAS read-out chip upgrade [1]. One way to decrease the pixel size is to split the pixel into two or more parts and to stack them vertically. In this way the pixel area is reduced roughly by a factor equal to the number of stacked circuits. This architecture is made possible by new 3D technologies.

Tezzaron offers one of the first commercial processes for 3D integrated circuits. This process combines Chartered 130nm technology and Tezzaron 3D technology. A first MPW run for High Energy Physics has been organized within a consortium of 15 institutes (France, Italy, Germany, Poland and the United States).

This paper presents one project submitted in this run, called FE-TC4 and designed in collaboration by Bonn, CPPM and LBL. Based on the FE-I4 prototype (pixel read-out prototype chip for ATLAS upgrades, in IBM 130nm [2]), FE-TC4 splits the pixel functionalities into two levels. The first one (Tier 1) is dedicated to the analogue part of the pixel and to the sensor connections and is described in section III. The second one (Tier 2) is dedicated to the digital part of the pixel. These two Tiers will be connected using 3D connections.

Two different designs were implemented for the digital Tier (Tier 2). The first one, detailed in section IV A, is specifically intended to study the parasitic coupling between Tiers. This Tier also provides some simplified data read-out functionality. The second one, described in section IV B, is based on the read-out structure foreseen for the ATLAS pixel Front-End upgrade and includes the "4-pixel region" architecture.

In order to evaluate issues specific to 3D technology, 4 test circuits have also been designed and are described in section VI. Specific issues such as the reliability of 3D connections and the influence of these connections on transistors have been addressed.

The main objectives of the FE-TC4 circuits are to demonstrate the feasibility of these new 3D hybrid pixel detectors, to measure the parasitic effects introduced by the 3D structure and to test the sensor hybridization on top of such a circuit.

#### II. BRIEF DESCRIPTION

# *A. Vertical integration with 3D Tezzaron-Chartered process*

3D technology was initially driven by the increasing demand for memory cells and by the idea of stacking memory vertically on a processor to allow better access times. This requires connections on both the top and the bottom of each stacked circuit. For this run, two Tiers are stacked face to face, that is to say, the transistor sides (tops of the individual circuits) face each other. This configuration implies connecting the sensor through the analogue Tier from the backside, and placing sensitive elements like preamplifiers directly opposite parts of the digital Tier. This "face to face" wafer-to-wafer bonding technology, imposed by the MPW run, is not the best interconnection scheme for our purposes.

Each of the stacked Tiers can be accessed from its backside by means of Through Silicon Vias (also called Super-Contacts). The Tezzaron process is based on a Via First technology: Super-Contacts are formed before the BEOL (Back End Of Line) of the Chartered process. This kind of technology allows very small via dimensions and pitches. Super-Contacts are drawn with a diameter of 1.2µm and a minimum pitch of 2.5µm. As Super-Contacts can be at most 12µm deep, wafers must be thinned in order to create through-silicon connections.

A 2-Tier circuit is sketched in Figure 1, underlining the physical stack and the placement of Super-Contacts.



Figure 1: 3D assembling of 2 Tiers (not to scale)

Tiers are bonded wafer to wafer with a Cu-Cu thermocompression process, using the 6<sup>th</sup> metal layer of the Chartered technology to form the bond interface.

This Bond Interface is formed by a uniform pattern of hexagonal metal shapes, as shown in Figure 2. In order to provide a strong mechanical coupling, this pattern covers the chip completely, and most of the interface bonds are not physically connected to an active signal. Electrically unused bonds are left floating.



Figure 2: Bond interface layout

Furthermore, to allow each Tier to be tested separately, IO pads for bonding or for probe testing can be re-formed above metal 6 (bond interface). This is made possible by one additional metal level offered by the Chartered technology, called the Re-Distribution Layer (RDL). However, wafers on which the RDL metallization is done are lost for 3D hybridization because their Interface Bonds are masked.

# *B. FETC4 project description*

The full Chartered reticle size is 26mm by 31mm. This reticle has been shared between all MPW participants into sub-reticles (A to K) of 6.4mm by 5.5mm.

| TX1 | TY1 | TY2 | TX2 |
|-----|-----|-----|-----|
| A1  | B1  | B2  | A2  |
| C1  | D1  | D2  | C2  |
| E1  | F1  | F2  | E2  |
| G1  | H1  | H2  | G2  |
| J1  | K1  | K2  | J2  |

Figure 3: Entire Chartered reticle layout

A particularity of this project is that it avoids the cost of two mask sets by implementing the two Tiers in the same reticle, whereas a conventional 3D structure stacks two different wafers (which can also be fabricated in different technologies). With only one mask set, and by taking care of layout mirroring between the two Tiers, two identical wafers can be bonded together face to face. For example, sub-reticle A1 (top Tier) is bonded with A2 (bottom Tier), and B1 is bonded with B2.

When the 4 sub-reticles (A1, B1, A2 and B2) are bonded face to face, four 3D chips are formed: A1 on top of A2, B1 on top of B2, but also A2 on top of A1 and B2 on top of B1. As only the top Tier will be thinned, the last two configurations are of no use because the Super-Contacts of the A1 and B1 chips cannot be accessed. The disadvantage of this choice is that half of the 3D chips are lost.
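The pairing described above can be reproduced with a minimal sketch. It assumes that flipping the top wafer simply mirrors one reticle row left to right, and that a stack is usable only when the thinned top chip is a "1" sub-reticle; the function names are hypothetical and serve only as illustration.

```python
# Sub-reticle names of one reticle row, left to right (from the reticle layout).
ROW = ["A1", "B1", "B2", "A2"]


def face_to_face_pairs(row):
    """Return (top, bottom) pairs when a wafer is flipped onto an identical one.

    Flipping the top wafer mirrors it left-right, so column i of the top
    wafer lands on column (n - 1 - i) of the bottom wafer.
    """
    n = len(row)
    return [(row[i], row[n - 1 - i]) for i in range(n)]


def usable(pairs):
    """Keep stacks whose thinned top chip is a '1' sub-reticle (assumption)."""
    return [(top, bottom) for top, bottom in pairs if top.endswith("1")]


pairs = face_to_face_pairs(ROW)
print(pairs)          # all four stacks formed by the bonding
print(usable(pairs))  # the two stacks whose Super-Contacts are accessible
```

Under these assumptions, two of the four stacks are usable, which is the 50 % loss mentioned in the text.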

Sub-reticles reserved for FE-TC4 project are C1, C2, D1 and D2. As presented in Figure 4, they contain the following circuits:

- AE : Tier 1, analogue chip FE-TC4-AE
- DC : Tier 2, digital chip FE-TC4-DC ("à la FE-I4", i.e. FE-I4-like)
- DS : Tier 2, digital chip FE-TC4-DS (for parasitic coupling study)
- C1-1, C1-2, C1-3, D1-1, D2-1, C2-1, C2-2, C2-3 : Test circuits
- SEU : Test circuit for SEU studies
- "α" area : Reserved for another project



Figure 4: FE-TC4 sub-reticles

# III. TIER 1 : ANALOGUE DESIGN

#### *A. Analogue Tier 1 design strategy*

A first translation of the FE-I4-P1 prototype chip from IBM 130nm to Chartered 130nm technology was performed in February 2009 with the design of the FE-C4 prototype (Chartered 2D technology with 8 metal levels). Based on this first step, the FE-TC4-AE analogue Tier chip has been designed in Tezzaron-Chartered 3D technology with 5+1 metal levels. For this attempt, the pixel size is kept identical to that of the FE-I4-P1 pixel: 166µm x 50µm.

This Tier is pin-to-pin compatible with the FE-I4-P1 circuit, even though some subsidiary elements have not been implemented. Without these elements, some of the I/O pads remain unused. In the 3D structure, the I/O signals of Tier 2 are transmitted through Tier 1 (where some pads must be included on the back-side for I/O connections), so these unused pads are reserved for Tier 2 signals or power supplies. These "digital" pads are not connected to the analogue Tier 1 core. They only transmit signals through Tier 1, from the bond interface to the back-side metal.

At the schematic level, the analogue Tier of FE-TC4 is identical to the FE-C4 prototype designed earlier in Chartered 2D technology, as well as to the previous IBM prototype FE-I4-P1. Components (transistors, resistors…) for the Chartered design have been chosen to be as close as possible to the components used in the IBM design. Because of the tight schedule for these runs, no optimization of the transistors has been made.

The 3D Tezzaron/Chartered run is limited to 5 metal levels for routing, plus a 6<sup>th</sup> level reserved for the bond interface. Starting from the FE-C4 design, the FE-TC4 was redesigned by:

- Reducing the number of metal layers in layout,
- Adding all 3D connections:
	- for the input signal from the sensor,
	- for the output signal to Tier 2.
- Changing 2D I/O pads into 3D I/O pads.

Reducing the number of metal layers is possible for this chip because the matrix has only 14 columns of 61 pixels. For a larger matrix, however, the power distribution must be reconsidered. One possible solution is to use metal levels of Tier 2 for Tier 1 power routing.

## *B. Analogue pixel with 3D connection*

The most time-effective approach has been adopted. The pixel schematic is kept identical to that of the FE-I4-P1 pixel, with amplifiers, discriminator, DACs, configuration registers and a simple readout part. Specific changes have been implemented for 3D assembly:

- Input metal contact for sensor hybridization is routed by Super-Contacts to Tier 1 back-side.
- An electrical contact using interface bonds has been added to route the discriminator output signal to the digital Tier (Tier 2).
- One switch and its configuration signal have been added to transmit the discriminator output signal either to Tier 2 or to the simple read-out part already existing in the pixel. Figure 5 indicates the location of this switch in the pixel schematic.

With these changes, the pixel output signal can be read out either in the same way as for the FE-I4-P1 and FE-C4 pixels (in this case, Tier 2 is not needed) or through the Tier 2 chip.



Figure 5: Pixel structure with switch for 3D connection

## IV. TIER 2 : DIGITAL DESIGN

The analogue Tier 1 output is read out by the digital Tier 2. Two versions of this digital chip have been designed. The first, called FE-TC4-DS, is intended to study parasitic coupling effects between the two Tiers in order to test issues related to the 3D architecture. The second version, called FE-TC4-DC, is a more complex read-out chip whose structure is close to the one foreseen for FE-I4.

# *A. FE-TC4-DS :Tier 2 dedicated for test*

In 2D pixel designs, care is taken to avoid parasitic coupling between analogue and digital signals. One of the methods used is to place the analogue and digital parts as far away from each other as possible in the pixel area. In a 3D arrangement such as this one, the analogue and digital parts face each other and shielding is the only possible way to reduce coupling.

The goals of FE-TC4-DS Tier ([3], [4]) are first to verify that the Tier 1 is working correctly and second to study the parasitic coupling between Tiers. This Tier is composed of a matrix of pixels with the same dimensions as the analogue Tier 1. This digital pixel is sketched in Figure 6.



Figure 6: FE-TC4-DS pixel structure

"Hit" is the output of Tier 1. The signal called "CEG" (Count Enable Global signal) enables counting for all the pixels. Counting of a digital hit (for test) is also possible. An 11-bit counter is used to count the number of hits. The result can be read via a 1-bit shift register.

To study the parasitic coupling between the two Tiers, the so-called "drum" cells are designed to generate digital noise in front of specific structures of the analogue Tier. Eleven different structures corresponding to specific analogue functions can be identified in the layout of the pixel, as shown in Figure 7.



Figure 7: Analogue pixel layout: 11 specific areas

In front of each area, a drum cell has been designed with the structure described in Figure 8. The only function of these drum cells is to generate digital switching activity (digital noise). An 11-bit shift register per pixel configures these cells: each drum cell can be activated (or not) independently of the others.



Figure 8: A "drum" cell
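The behaviour of the FE-TC4-DS pixel described above (an 11-bit hit counter gated by CEG and an 11-bit drum-cell configuration register) can be illustrated with a small behavioural model. This is only a sketch: the class and method names are hypothetical, and the counter wrap-around, the serial bit order and the shift direction are assumptions that the text does not specify.

```python
class DSPixel:
    """Behavioural toy model of one FE-TC4-DS pixel (not the real netlist)."""

    COUNTER_BITS = 11   # hit counter width, from the text
    DRUM_CELLS = 11     # one drum cell per analogue area, from the text

    def __init__(self):
        self.count = 0
        self.drum_enable = [0] * self.DRUM_CELLS

    def hit(self, ceg=True):
        """Count one hit if the global count-enable signal (CEG) is high."""
        if ceg:
            # Assumption: the counter wraps around on overflow.
            self.count = (self.count + 1) % (1 << self.COUNTER_BITS)

    def read_counter(self):
        """Serial read-out of the counter, MSB first (bit order is assumed)."""
        return [(self.count >> b) & 1
                for b in reversed(range(self.COUNTER_BITS))]

    def shift_in_config(self, bit):
        """Shift one bit into the drum-cell configuration register."""
        self.drum_enable = [bit] + self.drum_enable[:-1]

    def active_drum_cells(self):
        """Indices of the drum cells currently enabled."""
        return [i for i, bit in enumerate(self.drum_enable) if bit]


pixel = DSPixel()
for _ in range(5):
    pixel.hit(ceg=True)
pixel.hit(ceg=False)                        # ignored: counting disabled

wanted = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]  # enable cells 0 and 3 (hypothetical numbering)
for bit in reversed(wanted):                # last bit shifted in sits at index 0
    pixel.shift_in_config(bit)

print(pixel.count, pixel.read_counter(), pixel.active_drum_cells())
```

In such a model, a coupling study would amount to enabling one drum cell at a time and comparing the analogue noise or the hit count with and without digital activity facing the corresponding area.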

Moreover, in order to study a possible effect of shielding, different column layout configurations have been implemented. Five columns are designed without any shielding, 4 columns with shielding made of metal 5, 2 columns with shielding made of metal 3 and 2 columns with both shields (metal 3 and metal 5).

# *B. FE-TC4-DC : Complex read-out chip*

FE-TC4-DC is the second digital read-out Tier that will be bonded to the analogue Tier. To make the design as realistic as possible with respect to the ATLAS pixel requirements, the chosen architecture is very similar to that planned for the future FE-I4 ([5], [6]). In particular, it is based on the same 4-pixel regional structure, sketched below. Due to time constraints, however, a simplified periphery and readout control logic were implemented. After the 4-pixel digital region has been introduced, this periphery is described, underlining the main differences with respect to the FE-I4 architecture.



Figure 9: The 4-pixel regional digital logic

The 4-pixel architecture is schematized in Fig. 9. These four pixels form a 2 by 2 pixel block inside a Double-Column. In this design, latency counters and trigger management units, as well as read and memory management units, are shared between four adjacent pixels. The pixels still retain individual Time over Threshold (ToT) counters, as well as individual hit processing circuitry. Any discriminator that fires in the corresponding four analogue pixels starts the common latency counter in the digital Tier, effectively time-stamping a particular event. Note that when several pixels are hit in the same bunch-crossing inside a single 4-pixel region, a single latency counter is allocated. This has the important consequences of reducing digital activity, reducing digital power and improving the efficiency of the architecture. This structure is also well matched to the physics, as real pixel hits come in clusters. Furthermore, the digital logic can distinguish small hits from big hits by the time their comparator stays above threshold. The logic allows smaller hits to be associated with bigger hits in their immediate vicinity, either in the same region or in adjacent regions (so-called "neighbor logic" mechanism). This provides a handle to avoid recording these small hits with time-walk. One difference at the 4-pixel region level with respect to the FE-I4 is the existence of a hit memory, which forms the basis of a column-level readout shift register. This readout shift register provides an alternative way to read out pixel hits.
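The sharing of one latency counter between four pixels can be illustrated with a toy model. It is a sketch under strong simplifications and not the FE-I4 logic: the class names are hypothetical, the counter counts up from zero, ToT values are assumed to be known at hit time, the number of stored events is unlimited, and the neighbor logic and hit memory are left out.

```python
from dataclasses import dataclass, field


@dataclass
class RegionEvent:
    """One stored event: a shared time stamp plus per-pixel ToT values."""
    latency: int = 0
    tot: list = field(default_factory=lambda: [0, 0, 0, 0])


class FourPixelRegion:
    """Toy model of a 2x2 region with shared latency counters."""

    def __init__(self):
        self.events = []

    def bunch_crossing(self, tots):
        """Process one bunch crossing; tots holds the ToT of the 4 pixels."""
        for event in self.events:
            event.latency += 1          # age the stored events
        if any(t > 0 for t in tots):
            # One counter is allocated however many pixels fired.
            self.events.append(RegionEvent(latency=0, tot=list(tots)))

    def trigger(self, latency):
        """Return the ToT values of events matching the trigger latency."""
        return [e.tot for e in self.events if e.latency == latency]


region = FourPixelRegion()
region.bunch_crossing([3, 0, 1, 0])   # two pixels hit: one counter allocated
region.bunch_crossing([0, 0, 0, 0])   # no hit: nothing allocated
region.bunch_crossing([0, 5, 0, 0])   # a later, separate event
print(len(region.events), region.trigger(2), region.trigger(0))
```

The point of the sketch is that a clustered hit costs one counter rather than one per pixel, which is where the reduction in digital activity comes from.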

The complete Double-Column is made of 31 4-pixel regions, the top region having two dummy inputs because the corresponding analogue Tier contains only 61 pixels per column. To simplify the periphery, the signals used in the full-scale FE-I4 chip to read out data and to communicate pixel hits to the periphery, which are related to the FE-I4 control block, must be provided externally for this prototype. An alternative way to read out data through a simple shift register is also available. Finally, configuration of the chip and simple readout are achieved through multiplexed shift registers controlled by 2 enable bits.
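The region count and the two dummy inputs follow directly from the matrix geometry. The short check below uses only numbers quoted in the text; the variable names are illustrative.

```python
import math

PIXELS_PER_COLUMN = 61          # analogue Tier column height (from the text)
COLUMNS_PER_DOUBLE_COLUMN = 2
PIXELS_PER_REGION = 4           # 2 x 2 block

pixels = PIXELS_PER_COLUMN * COLUMNS_PER_DOUBLE_COLUMN
regions = math.ceil(pixels / PIXELS_PER_REGION)
dummy_inputs = regions * PIXELS_PER_REGION - pixels
print(regions, dummy_inputs)
```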

# V. SENSOR CONNECTION

In order to build a hybrid pixel detector, connections to a silicon sensor have to be made. We decided to keep the IZM bump-bonding technology already employed for ATLAS pixel modules (and foreseen for the upgrades).

Analogue and digital Tiers are bonded face to face. Only Tier 1 is thinned. Backside metallization on Tier 1 is used to form both wire-bond pads (for circuit inputs and outputs), and bump-bonding pads (above each pixel for sensor connection). Hence, due to geometric constraints, the sensor must be smaller than the read-out matrix. The sensor illustrated in Figure 10 is reduced to 7 columns of 48 pixels (instead of 14 columns of 61 pixels for Tier 1 matrix). The complete 3D final assembly is depicted in Figure 11.



Figure 10: Sensor layout

The design of this sensor has been done by the Munich group (Max-Planck-Institut für Physik, Werner Heisenberg Institut).



Figure 11: 3D final assembling for FE-TC4 project

# VI. 3D TEST CIRCUITS

In each FE-TC4 sub-reticle, test circuits have been implemented for testing the Chartered technology and basic 3D elements. The test structures are grouped into four chips which will be bonded in 3D configuration at the same time as the other chips of the project.

• Evaluating capacitances:

The values of the preamplifier feed-back capacitor and of the injection capacitor can be measured on dedicated arrays of 1024 capacitors. An array of 100x100 Super-Contacts is also implemented to measure the equivalent parasitic capacitance of one Super-Contact (relative to the substrate).

• Testing Super-Contact influence on transistors:

Super-Contacts are thin but deep vias which can be placed close to transistors (the minimum spacing to the active region is 0.5µm). To evaluate the influence of this new generation of contacts on transistor performance and to evaluate the parasitic coupling involved, different test structures have been designed. These test structures include enclosed and linear transistors placed at various distances from Super-Contacts. Moreover, a signal can be applied to the Super-Contacts in order to study the influence on transistors which have been implemented with different configurations of substrate taps or Nwell rings.

• Testing Bond Interface and Super-Contact connection reliability:

These tests can be performed only in 3D configuration. As depicted in Figure 12, chains of interface bonds connected in series with Metal 5 and chains of Super-Contacts are implemented to evaluate the rate of successful connections. The expected result of these tests is a very good yield as announced by Tezzaron.

• Testing the mechanical quality of thinned chip:

Creating Super-Contacts, bonding the two Tiers face to face, thinning Tier 1, and bonding on this thinned Tier may generate mechanical stresses, especially in Tier 1 areas where input-output pads are implemented. Test structures such as linear PMOS, enclosed PMOS, feed-back capacitor arrays, injection capacitor arrays, poly-silicon resistors and shift registers are implemented under or close to wire-bond pads (Back-Side metal).
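For the capacitance test structures listed above, the value of a single element follows from one measurement of the whole array. The sketch below assumes that all elements are connected in parallel and that any known parasitic contribution (pads, routing) can simply be subtracted; the function name and the measured totals are hypothetical.

```python
def per_element_capacitance(c_array, n_elements, c_parasitic=0.0):
    """Average element capacitance from one measurement of a parallel array."""
    if n_elements <= 0:
        raise ValueError("n_elements must be positive")
    return (c_array - c_parasitic) / n_elements


# Hypothetical measured totals (farads), chosen only to exercise the function.
c_feedback = per_element_capacitance(17.4e-12, 1024)
c_super_contact = per_element_capacitance(30e-12, 100 * 100)
print(f"{c_feedback * 1e15:.2f} fF per capacitor")
print(f"{c_super_contact * 1e15:.2f} fF per Super-Contact")
```

Measuring a large array rather than a single element keeps the measured total well above the resolution of typical bench instruments and averages out element-to-element spread.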



Figure 12: Test structures for Bond Interface (left) and Super-Contacts (right) reliability
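A series chain is continuous only if every connection in it works, so the fraction of good chains gives an estimate of the single-connection yield. The sketch below assumes independent, identically distributed connection failures; the function names and the numbers are hypothetical.

```python
def chain_yield(p_connection, n_links):
    """Probability that a series chain of n_links connections is continuous."""
    return p_connection ** n_links


def connection_yield(good_chains, total_chains, n_links):
    """Estimate the single-connection yield from the fraction of good chains."""
    if total_chains <= 0 or n_links <= 0:
        raise ValueError("total_chains and n_links must be positive")
    return (good_chains / total_chains) ** (1.0 / n_links)


# Hypothetical example: 90 of 100 chains of 1000 links each are continuous.
p = connection_yield(90, 100, 1000)
print(f"estimated single-connection yield: {p:.6f}")
```

Long chains make the test sensitive to very small failure rates, since even a 0.01 % failure probability per connection becomes visible when a thousand connections are placed in series.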

# VII. FE-C4 PRELIMINARY TEST RESULTS

Porting one design (FE-I4-P1) directly from IBM 2D technology (8LM) to Chartered 3D technology (5LM) appeared to be quite challenging. In order to disentangle problems due to the technology itself from problems caused by 3D connections and stacking, we decided to take a first step consisting of a simple conversion from IBM to Chartered 2D (8LM). The resulting chip, called FE-C4-P1, is an exact translation of FE-I4-P1 and, due to a tight schedule, even the transistor sizes remain unchanged, so their dimensions are not optimized. No detailed simulations other than the "typical" case have been performed.

The chip was submitted in February 2009 and tested in early May.

Preliminary results are very encouraging: Chip functionalities are fully working and analogue results (measured with the LBL set-up previously used for FE-I4-P1 tests) have demonstrated performances comparable to those of the IBM chip. Minimum matrix threshold close to 1000 e-, noise about 80 e- rms and threshold dispersion (un-tuned) of 200 e- have been measured.

One fundamental characteristic needed for IBL or SLHC upgrades is radiation tolerance up to a few hundred MRad. SEU behaviour of the chip is also an issue. FE-C4-P1 was therefore irradiated at the CERN PS irradiation facility with a 24 GeV proton beam up to about 400 MRad. After approximately 160 MRad, we noticed a problem with the digital registers of the chip, which tend to stay "blocked" in the "1" state. These registers can only be returned to the "0" state by powering off the chip. We currently think that this effect is due to shifts in the threshold voltages (VT) of the PMOS and NMOS transistors caused by irradiation, which in this non-optimal design tend to favour the "1" state.

This problem, which could easily be corrected in the next versions of the chip, in turn makes it difficult to tune the currents that drive the analogue parts. However, we were able to make analogue measurements on a few tens of pixels after 400 MRad. The mean noise of this set of pixels has been measured at 230 e- rms: even if it is a factor 3 higher than the value measured before irradiation, it stays at a reasonable level. Thus no show-stopper concerning the radiation hardness of the Chartered technology has been detected.

# VIII. CONCLUSION

Benefits of 3D circuits appear evident: Pixel size can be decreased by separating digital function in another Tier. Alternatively, more functionalities can be implemented in front of each analogue pixel. Moreover, each Tier can be designed in a different technology for better performances.

This chip is one of the first to demonstrate the feasibility of such a 3D circuit. By the end of 2009, test results will hopefully confirm the functioning and the quality of the assembly and reinforce the position of the 3D architecture in the technology selection for future detectors.

# IX. REFERENCES

[1] M. Barbero *et al*, "A new ATLAS pixel front-end IC for upgraded LHC luminosity", Nuclear Instruments and Methods in Physics Research A, Volume 604, Issues 1-2, 1 June 2009, pages 397-399

[2] A. Mekkaoui, FE-I4\_PROTO1, "ATLAS pixel upgrade for SLHC-electronics", Internal Document, 2008.

[3] B. Chantepie, "Synthesis document of FETC4 Tier 2", Internal Document, 2009

[4] B. Chantepie, "Proposal for simple digital", Internal Document, 2009

[5] D. Arutinov *et al*, "Digital Architecture and Interface of the New ATLAS Pixel Front-End IC for Upgraded LHC Luminosity", IEEE Trans. Nucl. Sci. 56, 388 (2009)

[6] M. Karagounis *et al*, "Development of the ATLAS FE-I4 pixel readout IC for b-layer Upgrade and Super-LHC", proceedings of TWEPP 2008. Published in 'Naxos 2008, Electronics for particle physics' 70-75

# *WEDNESDAY 23 SEPTEMBER 2009 PLENARY SESSION 4*

# Marvin Johnson

# *Abstract*

This paper describes methods for minimizing common mode noise in electronic detector systems. It discusses grounding issues, proper design of the signal path and experiment wide methods for low noise design. These principles are illustrated by several examples.

# I. INTRODUCTION

Detectors for high energy physics experiments have changed significantly in the last few years. The twin goals of higher resolution and lower cost have moved the readout electronics from circuit boards located in a counting house to dedicated chips mounted directly on the detector. This effort has resulted in much better detectors but at the expense of ever decreasing signal levels. Thus, control of electrical noise is becoming an increasingly important feature of detector design.

Since there are many good textbooks describing methods for minimizing common mode noise, I will concentrate on ways of applying these methods to detector design. I will illustrate these ideas with several examples that I have been involved in. The first section discusses grounds and noise currents. The next section covers some features of detector design while the last section discusses more general aspects of experiment design.

# II. GROUNDS AND NOISE CURRENTS

The term "electrical ground" means different things to different designers. A designer of a radio tower wants an electrical ground that can safely absorb several thousand amps from a lightning bolt. A building designer wants a ground that can keep the parts of a building and surrounding area at roughly the same potential as the center tap of the local power transformer. Detector designers have little need for either of these features. A good detector ground has a large capacitance so that noise currents flowing onto the ground do not change the voltage of the ground. It should also have a large surface area so that the current flow is not concentrated into a small area. This minimizes any magnetic field effects. From the detector point of view such a ground makes the noise current "disappear". The vacuum shell for the large CMS magnet is an example of a good detector ground.

An important feature of noise signals is that they are almost never a voltage source. That is, the noise source has some internal resistance so shunting even a small part of the current to a ground may significantly reduce the amplitude of the noise signal. One should always ground detectors even if the connections are not ideal.

Most detectors operate at high frequency so low frequency noise is usually not important. However, the high frequencies mean that inductance almost always dominates over resistance in determining impedance to ground. For example, a 20 cm long wire 500  $\mu$ m in diameter has nearly 10 ohms of inductive impedance at 40 MHz.

# III. FRONT END DESIGN

Many contemporary detector designs have average signal levels of only a few thousand electrons. To put this in perspective, if a detector is sensitive to a constant current of 56 nA for 10 ns, it will accumulate 3500 electrons in a charge sensitive amplifier. This amount of noise current can be generated from magnetic coupling between 1 cm of wire (such as a silicon strip) located 1 cm away from a conductor (such as a cooling pipe) carrying about 100  $\mu$ A of 10 MHz noise current. This example assumes an amplifier with 100 ohm input impedance. The obvious solution to this noise problem is to ground the pipe. If the pipe is 5 mm in diameter and 1 meter long, its impedance at 10 MHz from self-inductance is over 7 ohms so grounding the pipe at one end may not eliminate the noise signal.
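The charge figure quoted above can be checked with one line of arithmetic, Q = I·t divided by the elementary charge. The function name below is illustrative.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs (exact SI value)


def electrons_collected(current_a, time_s):
    """Number of electrons delivered by a constant current over a time window."""
    return current_a * time_s / ELEMENTARY_CHARGE


n = electrons_collected(56e-9, 10e-9)
print(round(n))  # close to the 3500 electrons quoted in the text
```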

This example illustrates that electrical properties of mechanical components are often important to the overall detector design. Many of the noise problems that I have worked on are the result of "unintended consequences" of other systems interacting with the readout electronics. It is also true that electrical design might solve mechanical problems. For example, one might use some of the mechanical support structure as a ground return so that overall detector mass is reduced. I think that it is important that there be one design team for the detector - not separate teams for mechanical, cooling and electronics. It may seem wasteful to have electronic engineers sit through a discussion of cooling but if the cooling pipes are conductors, how these are routed and grounded could well be crucial to the success of the detector.

Most designers do a good job on the basic input circuit for a detector. This is not the case for the return part of the circuit. This is best illustrated by an example. Fig. 1 shows a schematic of a simple liquid argon readout cell for a calorimeter. One side of the cell is at high voltage and the other side is the readout plate. Charged particles passing through the cell ionize the argon atoms. The electrons drift to the anode and are collected by the preamp.



Fig. 1. Circuit diagram for a simple detector circuit.

Charge flowing into the preamp must be balanced by charge flowing out of the ground of the preamp and back to the cathode of the detector cell. Otherwise, the charge on the cathode would continue to increase. An electrical circuit must be a complete path back to the starting point. Thus, when charge is collected from the argon cell, a similar amount of charge is sent out the amplifier ground which must return to the high voltage side of the cell. Think of a simple common emitter circuit shown in fig. 2.



Figure 2: Common emitter amplifier. Any current injected into the base flows out the emitter and then back to its source.

Charge flows into the base and out the grounded emitter. For the circuit to be complete, this charge must flow back to the high voltage side of the argon cell. If the return path encloses any varying magnetic fields, a noise signal will be induced in the circuit by Faraday's law. In particular, if the return path is through a remote high voltage supply (as shown in fig. 1), the detector is likely to be quite noisy. The best design is to install a capacitor between the high voltage line to the cell and the amplifier ground (fig. 3). This capacitor should be as close to the amplifier ground as possible. Additionally, adding a resistor in the ground return of the high voltage supply will force all the return current through the capacitor as well as breaking any ground loops involving the high voltage system.



Figure 3: This is the same as fig. 1 but with the addition of a capacitor to provide local signal return to the cathode plate.

This is a straightforward design but it can have subtle problems. A muon system employing both anode and cathode readout had the following problem. The noise level was satisfactory when the chambers were installed but over the next few months the noise increased roughly linearly with time. Fig. 4 shows a simplified schematic of this chamber. The designers have installed capacitors for proper return of the ground currents to the HV system. This looks fine on paper until one looks in more detail at the detector. This problem was traced to a poor ground connection between the anode and cathode boards. Both boards needed to be removed easily so the ground connection was made with a screw. Over time the surface of the screw oxidized, thereby increasing the resistance between the two grounds. The actual schematic looked like the one shown in fig. 5 where R represents the resistance of the screw. As R increases, more of the return current is forced onto different paths. If these paths enclose fluctuating noise currents, some of this noise will appear in the signal. The simple solution of adding an explicit ground connection between the two circuit boards eliminated the noise problem.



Fig. 4. Wire chamber with both anode and cathode readout. Note that there is only one signal return capacitor.



Fig. 5. This is identical to fig. 4 but with a resistor shown in the return path between the anode and cathode amplifiers. The resistor represents the added resistance of the oxidized mounting screw.
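The effect of the growing screw resistance R can be pictured as a two-branch current divider between the intended ground connection and all other (stray) return paths. This is a deliberately crude sketch: the stray paths are lumped into a single impedance magnitude, and all values and names are hypothetical.

```python
def return_current_split(r_screw, z_alternative):
    """Fraction of return current in the screw path and in the alternative path.

    Simple two-branch current divider; both arguments are impedance
    magnitudes in ohms.
    """
    total = r_screw + z_alternative
    return z_alternative / total, r_screw / total


# Hypothetical values: a 10 ohm stray path and a screw that oxidizes over time.
for r in (0.01, 1.0, 10.0, 100.0):
    direct, stray = return_current_split(r, 10.0)
    print(f"R = {r:6.2f} ohm -> {stray:.1%} of the return current on stray paths")
```

The stray fraction grows monotonically with R, which is consistent with a noise level that rises gradually as the contact oxidizes.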

Another example is a precision drift chamber with both anode and cathode readout. It worked well in test beams and in test setups outside the experiment. But when it was installed in the experiment and all the amplifiers installed, it would break out into stable oscillation after a few minutes. The time it took for the oscillations to start was variable. This behavior was the result of a poor design of the high voltage system itself. The drift chamber used a graded voltage system so that the drift velocities were roughly uniform throughout the detector. A schematic of the voltage distribution is shown in fig. 6.



Fig. 6. Schematic of the high voltage distribution for a precision drift chamber. The high voltage distribution line was 32 times the length of the chamber.

The cathode pads were fed from a common line through resistors which set the pad voltage. This common feed wire ran back and forth across the chamber 32 times. The far end of the wire was open and the near end was terminated in a large resistor. The entire circuit was etched on a polyimide sheet and installed with the cathode pads mounted directly over the preamp inputs. The source of the oscillation was the high voltage line which functioned as a cable resonator. That is, when some of the preamp output was coupled back into this line (through accidental coupling), it excited the natural resonance frequency of the line. The most likely feedback path was through the feedback capacitor via a poorly grounded ground plane. Of course, the feedback from the preamps was random but the line selected out its natural frequency. When there were enough preamps feeding energy into the line, the signal exceeded the preamp threshold and the entire chamber started to oscillate at the resonance frequency of the high voltage line. The oscillations started on noise signals so the start time just depended on achieving enough noise signal at one time to start the oscillation. A very simple fix for this problem would have been to have one line across the end of the chamber and 32 branch lines going to the preamps. The line would still have resonated but the frequency would have been above the bandwidth of the amplifier so no oscillations would have occurred.
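The benefit of the proposed fix can be estimated from the way a line resonance scales with length. The sketch below treats the line as open at both ends (the large terminating resistor is approximated as an open circuit), so that the lowest resonance is f = v / (2l). The chamber length and the propagation velocity are hypothetical values chosen for illustration.

```python
def half_wave_resonance(length_m, velocity_m_s=1.5e8):
    """Lowest resonance of a line treated as open at both ends: f = v / (2 l)."""
    return velocity_m_s / (2.0 * length_m)


CHAMBER_LENGTH = 1.0   # metres, hypothetical
PASSES = 32            # the feed wire crossed the chamber 32 times (from the text)

f_serpentine = half_wave_resonance(PASSES * CHAMBER_LENGTH)
f_branch = half_wave_resonance(CHAMBER_LENGTH)
print(f"serpentine line: {f_serpentine / 1e6:.1f} MHz")
print(f"single branch:   {f_branch / 1e6:.1f} MHz")
```

Whatever the exact boundary conditions, shortening each line by a factor of 32 raises its resonance by the same factor, which is what moves it out of the amplifier bandwidth.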

There are many other structures in detectors such as cooling lines, cables and so on that could form resonant systems. All that is needed for oscillations to occur is a resonance in the bandwidth of the preamp, electrical coupling to the preamp input and some coupling of the preamp output to the structure. The key to preventing this type of problem is to make sure conducting mechanical structures are well grounded and electrical structures are short enough so that any resonance is above the bandwidth of the amplifier.

A third example is a wire chamber that is read out from both ends. This example reads out the cathode on one end and the anode on the other but it could also read out both ends of a wire in order to get the coordinate along the wire. A simplified schematic is shown in fig. 7.



Fig 7. Schematic of a wire chamber that is read out from both ends. All the anode channels are read from one end and all the cathodes from the other end. The gap in the ground shows that there was a very poor ground connection between the two ends.

The return circuit is only at one end and there is a break in the return ground plane. The gap in the return circuit is equivalent to an infinite value of R in the muon chamber example so one might expect that this chamber did not work at all and that was the case. The signal return path through the external electronics was so long that the phase of the returned signals was shifted to give positive feedback so that the preamps oscillated. The symptom was that oscillations would occur depending on output cable position. What was happening was that the propagation velocity of the return signal depended on the capacitance of the ground line to the surrounding world. That is, the formula for the velocity of signal propagation on a cable is

$$
v = \frac{1}{\sqrt{LC}}
$$

where L and C are the inductance and capacitance per unit length. When the cable position was changed, the capacitance changed, which in turn changed the signal delay time. The overall delay was close to that needed for positive feedback, so one position of the output cable would cause oscillation and another would not.
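To make this dependence concrete, the short sketch below evaluates the velocity formula and the resulting one-way delay for two values of capacitance per unit length. The inductance, capacitance and path length are illustrative assumptions, not measurements from the chamber described here.

```python
import math

def propagation_velocity(l_per_m, c_per_m):
    """Signal velocity v = 1/sqrt(LC), with L in H/m and C in F/m."""
    return 1.0 / math.sqrt(l_per_m * c_per_m)

def delay_ns(length_m, l_per_m, c_per_m):
    """One-way delay in ns over a line of the given length."""
    return 1e9 * length_m / propagation_velocity(l_per_m, c_per_m)

# Assumed values: 250 nH/m, 10 m return path, two cable positions
L_PER_M = 250e-9
far = delay_ns(10.0, L_PER_M, 100e-12)   # cable far from surrounding metal
near = delay_ns(10.0, L_PER_M, 150e-12)  # cable close to surrounding metal
print(f"delay: {far:.1f} ns -> {near:.1f} ns")
```

Because the delay scales with the square root of the capacitance, a 50% change in capacitance shifts the delay by a little over 20%, which can be enough to move a marginal loop into or out of positive feedback.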

The fix for this problem was identical to the previous example: connect the grounds between the two ends. This eliminated the oscillation problem but the detector was still noisy. The circuit with the grounds connected is shown in fig. 8.



Fig. 8. This is identical to fig. 7 but the gap in the ground plane is replaced by a noise generator. The gap was shorted together but the difference in potentials between the two grounds causes current to flow through the ground plane. The resistance in this connection causes a noise voltage.

I have included a noise generator in the circuit. The ground potential at the two ends of the detector is not the same so some ground current flows through the new connections. The connections have resistance so this results in a noise voltage that is directly in the return path for the one set of preamps which means that the noise is in the readout. This is a difficult problem to solve. Making the return path have very low impedance will minimize the noise. The noise can only be eliminated by isolating the grounds of one or both sets of preamps so that no external ground current can flow. The next example describes the use of ground isolation to eliminate this ground loop.

Note that adding a capacitor to provide a local return for the high voltage is likely to make the noise problem worse. Now the ground current is flowing through the high voltage plane so both sets of preamps will see the noise. Also, the high voltage plane is likely to have more impedance than a well constructed ground connection so the noise signal will be larger.

What do you do if not everything is close together? This could be a large liquid argon calorimeter where the high voltage port is separated from the signal port, or a silicon detector where the preamps are connected to the sensors by a flex cable. This is just an extension of the previous example so we know the answer: either isolate the grounds of the preamps or make a very good ground connection.

Since most detectors involve high frequency signals, inductance is usually much more important than resistance. The formula for the self inductance of a rectangular conductor is

$$
L = .002l \left( Log \left( \frac{2l}{H+W} \right) + \frac{1}{2} - Log(e) \right)
$$

where l is the length of the conductor and H and W are the height and width of the conductor. The skin depth of copper at 1 MHz is 66  $\mu$ m so H is small for most detectors. Thus, the most efficient way to distribute material for a low inductance connection is to make a wide thin sheet.
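As a rough cross-check of the skin depth quoted above, the standard expression $\delta = \sqrt{\rho / (\pi f \mu_0 \mu_r)}$ can be evaluated directly. The resistivity used here is an assumed room-temperature value for copper, and the function name is only illustrative.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space, H/m
RHO_CU = 1.68e-8        # assumed copper resistivity at room temperature, ohm*m

def skin_depth_m(freq_hz, resistivity=RHO_CU, mu_r=1.0):
    """Skin depth delta = sqrt(rho / (pi * f * mu_0 * mu_r)), in metres."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0 * mu_r))

for f in (1e6, 10e6, 100e6):
    print(f"{f / 1e6:5.0f} MHz: {skin_depth_m(f) * 1e6:5.1f} um")
```

The depth falls as the inverse square root of frequency, so at the tens of MHz typical of detector signals only the outer few tens of microns of a conductor carry current.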

The formula for a wire (or cylinder since they are the same) is

$$
L = .002l \left( Log \left( \frac{2l}{R} \right) - \frac{1}{2} \right)
$$

where R is the radius of the wire. Again, we see that a large radius is important for a wire to have a low value of inductance.

Since the dependence in both cases is logarithmic, one rapidly reaches a point of diminishing returns. Also, these formulas break down as the width or radius approaches the length. But they do give us a guideline on how to proceed.
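The diminishing returns can be seen by evaluating the wire formula for a series of radii. This sketch implements the formula exactly as written above; it assumes a natural logarithm, lengths in cm and inductance in microhenries (the usual handbook convention for the 0.002 prefactor, which the text does not state explicitly). The conductor length is an arbitrary illustrative value.

```python
import math

def wire_inductance_uH(length_cm, radius_cm):
    """Wire formula from the text: L = 0.002*l*(ln(2l/R) - 1/2).

    Assumes natural logarithm, lengths in cm and L in microhenries.
    Only meaningful while the radius is much smaller than the length.
    """
    return 0.002 * length_cm * (math.log(2.0 * length_cm / radius_cm) - 0.5)

length = 30.0  # cm, illustrative
for radius in (0.05, 0.1, 0.2, 0.4, 0.8):
    print(f"R = {radius:4.2f} cm: L = {wire_inductance_uH(length, radius):.3f} uH")
```

Each doubling of the radius lowers the inductance by the same fixed amount, 0.002 l ln 2, so the material needed grows much faster than the benefit.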

A good example of both a low impedance ground plane and an isolated ground preamp is the layer 0 silicon detector for D0. This device has a radius of only 18 mm so the chips could not be mounted directly on the sensors. Rather, we used a roughly 300 mm long polyimide cable to attach the sensors to the chips. In order to minimize intrinsic noise, the cable capacitance must be made as small as possible. Thus, the cable was made without a ground plane. There is only one small trace to provide a return path for the bias voltage, which has a resistance of 4 ohms. The impedance at 10 MHz is more than twice this value. This is far too high an impedance for a low noise design so we must use some other connection. We must also keep the overall mass as low as possible. The best option is to use some of the mechanical structures as electrical elements. The cooling lines are plastic so they will not work. However, the body of the device is a 12-sided carbon fiber polygon with a diameter of 35 mm. A 35 mm cylinder 300 mm long has an inductive impedance of less than an ohm at 1 MHz. If we can make the support structure conductive, our problem is solved.

High modulus carbon fiber is quite conductive if one can make good electrical contact with the carbon fibers [1]. We have developed a method in which a mesh ground plane is etched on a 50  $\mu$m thick polyimide film coated with a 5  $\mu$m thick layer of copper, and the polyimide is then co-cured with the carbon fiber. Standard printed circuit vias were used to bring contacts to the reverse side of the material. This material was laid up with the copper layer facing the carbon fiber and the assembly was cured as a unit (fig. 9). This process results in a very low resistance device that is within a factor of 2 of an all copper structure at high frequencies. The high voltage coupling capacitor was mounted on the sensor so that the support structure remained at ground. The trace length between the capacitor and the sensor bias plane was kept as small as possible.



Fig. 9. Copper mesh co-cured onto the carbon fiber mechanical support.

The conductivity of a detector's mechanical structure can be very useful for some aspects of detector design but it can also create ground loops through the detector. Any detector with multiple independent readout sections is subject to possible ground loops. All one needs is to have different sections grounded to different locations and a conducting path through the detector. This was described in the third example above. The usual solution to this problem is to provide a dielectric break in the mechanical design that isolates the different readout sections. Sometimes design constraints prevent this. This was the case for the layer 0 detector. The small diameter and long length of this detector required a continuous carbon fiber structure. The only solution that eliminates this loop is to isolate the local electronics ground from the outside world grounds so this is what we did.

We can break up the isolation problem into three main parts: 1) the download and readout system, 2) the power supply, and 3) the circuit board layout. The readout system for layer 0 is LVDS so we chose to use the differential drivers themselves to isolate the readout. Other methods such as optical or magnetic coupling were studied but none were satisfactory for this environment. Power supply isolation was achieved by using a separate power supply for the isolated system. We selected a supply that had good AC isolation at high frequency. The circuit board was designed with minimum overlap between the two grounds. With all components installed, the resulting board had 33 ohms isolation between the grounds at 7 MHz with little frequency dependence. There are six boards in parallel in this system so the overall impedance is around 6 ohms. This is a lower limit since the cables connecting the detector to the outside world have some inductive impedance.

# IV. GENERAL TECHNIQUES

#### *A. Ungrounded Conductors*

One very common problem with detectors is isolated conducting components. By this I mean pieces of metal that are isolated from the rest of the detector. This isolation can be caused by oxidized aluminum or by glueing parts together with non conducting adhesives. Either one of these results in a conductor which can be at an arbitrary voltage. Signals induced on these components spread over the entire surface of the component by Gauss's law (fig. 10).



Fig. 10. Schematic of a noise source coupling to a front end circuit through an ungrounded piece of metal.

If one part of the component is close to a sensitive part of the circuit, the noise may simply be channeled directly to the sensitive component. Ungrounded components are one of the most common problem areas in detectors.

Bare aluminum oxidizes immediately. In order to ground aluminum one must establish a connection through this oxide to the base metal. This can be done by either a mechanical connection or by plating the metal. I will cover plating first. There are 2 common plating methods: Alodining and tin plating. Both methods work well but they have somewhat different applications. The Alodine process coats the Al with a coating that is only a few molecules thick. Thus, there is no change in the dimensions of the parts but the surface is easily scratched. It is most suitable for parts that have critical mechanical dimensions and will not be disassembled often.

Tin plating typically coats the material with 250  $\mu$m of tin, so it has good mechanical robustness but the parts grow in size. This method is good for cable trays and other parts that may need to be disassembled or be exposed to rough handling.

A mechanical connection can be made by use of a star washer or similar mechanical device. Star washers are lock washers with many sharp points. If they are properly tightened, they will cut through the aluminum oxide and form a good connection to the aluminum underneath. The main problem with this method is maintaining enough pressure on the washer to maintain a gas-tight connection. Otherwise, the aluminum will reoxidize under the washer. With careful application, these connections can last for several years.

Sometimes the aluminum oxide problem is only recognized after the detector is completed. There are some things that can be done after assembly. One is to use star washers. However, if there are few mechanical connections, this will not work. A second solution is to use a product called an alodine pen which allows alodining small sections of an aluminum part. One can then make reasonably good connections with only mechanical pressure such as with a clamp.

# *B. Power Distribution*

Transformers come with 0, 1 or 2 shields. A single shield is typically made as a conducting screen between the primary and secondary coils. A doubly shielded one usually has screens wrapped around the primary and secondary coils. A single shield reduces noise from capacitive coupling between primary and secondary by a factor of about 100. Two shields give an additional factor of 10. A single shield is connected directly to ground. Both shields of a doubly shielded transformer can be connected to a local ground but a better way is to isolate the ground of the secondary. That is, the secondary is attached to a ground isolated system. Then the shield of the secondary is connected to this ground. This arrangement would be suitable for a very sensitive experiment such as a dark matter search.

There is a serious safety issue with an isolated secondary ground. If the transformer fails, the ground of the secondary could rise to the voltage of the secondary. Since the grounds are isolated there would be nothing to trip the primary circuit breaker. This problem can be eliminated by attaching a saturable inductor between the two grounds. At very low currents, the inductance is high and there is a break between the two grounds. If a larger current flows (a few hundred milliamps), the core saturates, the relative permeability drops to 1 and the coil will present very little impedance to current flow.

# *C. Cables*

Covered cable trays grounded every few meters to a good instrument ground provide the best protection from noise pick-up in signal or power cables. If a covered tray is not possible, lining the bottom of the tray with a thin copper foil will usually give some benefit. It provides magnetic shielding for fields from below and it forms a ground plane for the cables passing over it. Of course, this works best for cables that lie directly on the copper ground plane. Also, the copper must have periodic ground connections.

# *D. Cable Shields*

Grounding cable shields is often controversial. It is usually best to ground only one end of the cable. Otherwise, you risk forming a ground loop. Sometimes a capacitor coupling is used at one end to break the low frequency ground loops. It is important to choose the best ground for the cable shield which could be either at the source or destination end. If the grounds are equal, I usually choose the rack end rather than the detector end because I want to route energy picked up by the cable away from the detector.

#### *E. Racks and Other Infrastructure*

Racks and other support structures should be welded together and connected with a low impedance connection to a good ground. This is especially true if the rack is being used as a cable ground. Connections between structures that are painted and bolted together are rarely adequate for a good ground.

# V. SUMMARY

Successful design of any low signal level device requires great attention to detail by skilled designers. I find it very useful to draw a simplified schematic of the entire detector including all the mechanical components that are potential conductors. Coupling strengths can be estimated by using simple formulas or by using various field calculation programs. I have been quite successful using finite element codes to calculate the capacitance between circuit elements. Once this is done, one can eliminate components that have negligible coupling to the electronics and make sure the others are adequately grounded.

# VI. REFERENCES

1. W. Cooper et al., Nucl. Instr. and Meth. A 550 (2005) 127

# *WEDNESDAY 23 SEPTEMBER 2009*

# *PARALLEL SESSION A4 TRIGGER*

# **Feasibility studies of a Level-1 Tracking Trigger for ATLAS**

I. Bizjak<sup>a</sup>, R. Brenner<sup>b</sup>, N. Konstantinidis<sup>a</sup>, M. Sutton<sup>c</sup>, <u>M. Warren</u><sup>a</sup>

<sup>a</sup> Department of Physics and Astronomy, University College London, UK

<sup>b</sup> Department of Nuclear and Particle Physics, Uppsala University, Sweden

<sup>c</sup> Department of Physics and Astronomy, University of Sheffield, UK

[warren@hep.ucl.ac.uk](mailto:warren@hep.ucl.ac.uk)

# *Abstract*

The existing ATLAS Level-1 trigger system is seriously challenged at the SLHC's higher luminosity. A hardware tracking trigger might be needed, but requires a detailed understanding of the detector. Simulation of high pile-up events, with various data-reduction techniques applied, will be described. Two scenarios are envisaged: (a) regional readout, in which calorimeter and muon triggers are used to identify portions of the tracker to read out; and (b) track-stub finding using special trigger layers. A proposed hardware system, including data reduction on the front-end ASICs, readout within a super-module and integrating regional triggering into all levels of the readout system, will be discussed.

#### I. INTRODUCTION

A tracking trigger is a relatively new proposal for the ATLAS upgrade, which already has a well established tracker project. A re-design would be ideal, but without a full physics study to support the case, and with viable possibilities to adapt the design as it stands, the additional effort and time required is likely too costly. To this end the work presented here has used the current state-of-the-art Pixel and Strip upgrade projects as a foundation. We have attempted to work within the architectural and technological constraints of the existing design. For most of the sub-systems we seek extensions of existing capabilities, with little complete re-design. In less well defined areas (e.g. most of the off-detector electronics), we use the current ATLAS SCT and Pixel topology as the baseline.

A track-trigger straddles two distinct components of ATLAS: detector (including readout) and trigger. These two groups have agreed parameters within which to operate (trigger rates, latency etc.) and we attempt to retain these where possible.

Various options for tracking readout exist, falling into 3 areas:

1) Bunch-crossing (BC) rate readout of the whole detector. This increases data-volume/bandwidth by a factor of order 300, and is deemed infeasible.

2) Auto/local event selection with special layers. On-detector logic selects good track-stubs independently of any triggers, and these are "pushed" out as needed. In the case of Strips, both sides of a module will be connected to each other. These connections could be at the chip, module, or super-module level, with increasing bandwidth requirements respectively. Early studies show that high readout rates are required as it is difficult to distinguish between low- and high-pT tracks (influenced by the magnetic field). Options for on-detector track-finding are also being investigated, although this requires grouping data from modules spread over multiple layers/discs, which poses difficult readout challenges. These ideas are in their infancy and are not covered in this paper.

3) Read out only regions of the detector prior to an L1A being issued, making use of seeding from early stages of the L1 trigger system. This is the focus of this paper.

# II. REGIONAL READOUT

Regional readout uses L1Calo and L1Muon to identify potentially interesting features at a few hundred kHz. They issue fast readout requests to specific regions in the tracker at this rate, providing the  $(\eta, \phi)$  position of the objects identified as interesting. In this way, only a small fraction of the detector is read out, and only at a reduced rate such that the required additional bandwidth will be modest.

Several variations are possible with this approach, depending on how fast the regional detector data can be read out and processed, and on the overall Level-1 Trigger latency envelope. Ideally, tracking information should be used directly within the Level-1 Trigger. However, ATLAS has also discussed an option for a two-stage Level-1 trigger, for use if the Inner Detector readout is too slow. This would require additional buffers on all ATLAS detector front-end ASICs (FEICs), in which data would be held until the slower, definitive hardware trigger decision is available.

#### *A. Regional Readout System Overview*

The track-trigger builds on the existing Level-1 Trigger architecture, in which a potentially interesting event is identified, and a signal synchronous with that event is sent to the detector front-end (FE) modules. The FEICs transfer the event data from their pipelines to a readout buffer where the data are queued until they can be transferred off-detector.

For regional data readout, the process has two important differences:

- The trigger in this case is a regional-readout-request (R3) which is not broadcast to all FE modules. Instead it is sent only to the Inner Detector modules that fall within the region-of-interest.

- Readout is minimally buffered - when an FEIC receives an R3, it must return the data as fast as possible employing prioritised multiplexing or a separate data path.



Figure 1: Conceptual Regional Readout System within ATLAS DAQ.

Figure 1 shows the system layout. The track-trigger process begins with the receipt of one or several RoIs from the L1Calo or L1Muon system by the RoI mapping hardware. The information is decoded and synchronised, generating readout requests to be sent to the modules within the RoI. At this stage the physical geometry of the detector can be used to send targeted RoI/R3 signals to the Readout Drivers (RODs) which map and forward them to the desired super-module.

The Super-Module Controller (SMC) ASIC (that resides at the edge of a super-module) decodes the signal for inclusion in the trigger, timing and control (TTC) signal distribution, using special lines/protocol to identify which modules should be read out.

The FE modules comprise a Module Control Chip (MCC) and many FEICs. Upon receiving an R3, the MCC prepares for readout of track-trigger data while forwarding the R3 signal to the FEICs, which copy the raw-event from mid-pipeline, process and insert it at the front of any queues. The data are then sent off-detector on a prioritised channel.

In the case of dedicated track-trigger links, these data would travel directly to the Track-Trigger Processor (TTP). It is more likely, though, that track-trigger data will be multiplexed with normal event data on the same links and be intercepted on the ROD for forwarding to the TTP.

# *B. Rates and Expectations*

Some estimates need to be made of the trigger rates we expect. We presume that the Level-1 rate remains at 100 kHz, and that the R3 rate lies somewhere between the bunch-crossing and L1 rates, at 400-500 kHz.

The detector will likely contain of order 4000 RoIs. Rough estimates from current detector expectations indicate that an RoI encompasses  $\sim$ 1% of the modules on the detector, and that ~4 RoIs are expected per event [1].
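These figures imply a modest average request rate per module, which can be sketched as follows. The calculation assumes that RoIs are spread uniformly over the detector and do not overlap, which is a simplification introduced here; the function name is hypothetical.

```python
def r3_rate_per_module_khz(r3_rate_khz, rois_per_event, module_fraction_per_roi):
    """Average R3 rate seen by one module, assuming uniform, non-overlapping RoIs."""
    return r3_rate_khz * rois_per_event * module_fraction_per_roi

for r3_rate in (400.0, 500.0):
    rate = r3_rate_per_module_khz(r3_rate, rois_per_event=4,
                                  module_fraction_per_roi=0.01)
    print(f"R3 at {r3_rate:.0f} kHz -> about {rate:.0f} kHz per module")
```

Under these assumptions a given module receives regional requests at well below the 100 kHz Level-1 rate.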

Figure **2** shows pictorially the scale of an RoI.



Figure **2**: Event display showing RoI geometry (RoI: ∆φ=0.2, ∆η=0.2 at Calo ∆z=40cm at beam line).

# III. IMPLEMENTATION

Incorporating a track-trigger, particularly as part of the Level-1 Trigger, into the ATLAS upgrade involves changes to many sub-detectors and almost all sub-systems of the inner-detector. As the overall architecture of the detector is affected, and will need to be re-evaluated, the constraints and requirements need to be examined:

- Trigger latency – Latency affects almost all aspects of the design, but in terms of trigger it defines the FE pipeline length – longer pipelines need more resources.

- Data volume – Bandwidth affects readout rate, deadtime and latency.

- Data transfer and synchronisation – Transferring different data types with differing constraints is difficult.

- Regional-readout-request distribution – Targeted R3s need more infrastructure.

- Off-detector readout and track-finder – This is a new sub-system where a fast and synchronous path to Level-1 Trigger is required.

#### *A. Overall Latency*

FEICs have finite pipelines, defining the Level-1 trigger latency. The current ATLAS has a maximum latency of  $\sim$ 3.2µs (128 BC). The upgrade already prefers more (6.4µs is a common assumption) [2], but this needs to be evaluated against cost and complexity – in both new hardware and increased power.

Much of the trigger latency is consumed by cable lengths between the counting room and the detector: a round trip takes 1 µs. As the track-trigger system needs to read out the RoI data prior to a Level-1 decision, an additional 1 µs round trip is required. The track-finding efficiency increases with processing time. An initial estimate, based on D0 [3], indicates a minimum of 2 µs.

Initial estimates:

| Contribution            | Latency                    |
|-------------------------|----------------------------|
| BC $\rightarrow$ RoI    | 1200 ns + 500 ns fibre     |
| Decode RoI/R3           | 650 ns + 500 ns fibre      |
| Data Volume             | 2375 ns                    |
| Readout                 | 325 ns + 500 ns fibre      |
| Track-Finder + L1       | 2000 ns + N + 500 ns fibre |
| **Total**               | 8550 ns + N                |

Figure 3: Chart showing contributions to latency.
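The initial estimates can be summed as a simple budget. The sketch below uses only the numbers listed above and a 25 ns bunch-crossing period (implied by 3.2 µs = 128 BC); the unspecified term N is left out of the sum, and the variable names are illustrative.

```python
BC_NS = 25  # bunch-crossing period implied by 3.2 us = 128 BC

# (stage, processing ns, fibre ns) as listed in the initial estimates
BUDGET = [
    ("BC -> RoI",         1200, 500),
    ("Decode RoI/R3",      650, 500),
    ("Data volume",       2375,   0),
    ("Readout",            325, 500),
    ("Track-finder + L1", 2000, 500),  # plus the unspecified term N
]

def total_ns(budget):
    """Sum processing and fibre contributions over all stages."""
    return sum(proc + fibre for _, proc, fibre in budget)

total = total_ns(BUDGET)
fibre = sum(f for _, _, f in BUDGET)
print(f"total = {total} ns + N ({total / BC_NS:.0f} BC + N)")
print(f"of which fibre = {fibre} ns")
```

With these inputs the fibre transits alone account for 2000 ns, and the total before N is added is already larger than the 6.4 µs figure mentioned above as a common assumption.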

#### *B. Data Volume and Dead-time*

Event data is the largest contributor to latency on-detector. Although queuing regional data in the FEICs would only slightly increase latency due to the low R3 rate per module, the peak latency would be much higher. It follows, therefore, that a module cannot accept a second R3 while busy with readout of the previous one, and data-volume equates to dead-time.

To reduce data-volume (and latency), data compression on the FE module is desirable. For track-finding not all hit data is useful: in general, if a module (or FEIC) has too many hits, or wide clusters, there will be little opportunity for a track-finder to identify unambiguous tracks.

To effect this, simulations have been carried out where the cluster width is restricted to  $\leq$ 3 strips and the number of clusters per FEIC and per MCC is capped. Using SLHC-like events (400 pile-up) it can be shown that  $<5\%$  of high-pT track derived hits are lost [1]. See Figure 4.

To further reduce data-volume, only the first strip of a 2-strip hit is used. Combining the low number of hits with known hit-count maxima allows for an efficient packing algorithm that will improve further with larger (more strip channel) chips.
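The data-reduction steps described above can be sketched for a single FEIC as follows. This is a minimal illustration, not the simulated algorithm: the width limit of 3 strips follows the text, while the cluster cap of 2, the choice to truncate rather than veto when the cap is exceeded, and the reporting of only the first strip for every cluster are assumptions made here.

```python
def find_clusters(hit_strips):
    """Group strip numbers into runs of adjacent strips: list of (first, width)."""
    clusters = []
    for strip in sorted(set(hit_strips)):
        if clusters and strip == clusters[-1][0] + clusters[-1][1]:
            first, width = clusters[-1]
            clusters[-1] = (first, width + 1)  # extend the current run
        else:
            clusters.append((strip, 1))        # start a new cluster
    return clusters

def reduce_hits(hit_strips, max_width=3, max_clusters=2):
    """Return the first strip of each narrow cluster, up to max_clusters.

    max_width=3 follows the text; max_clusters=2 is a hypothetical cap.
    """
    narrow = [c for c in find_clusters(hit_strips) if c[1] <= max_width]
    return [first for first, _ in narrow[:max_clusters]]

print(reduce_hits([5, 6, 10, 20, 21, 22, 23]))
```

Because the number of reported hits is bounded by the cap, each chip's contribution has a known maximum size, which is what makes a fixed, compact packing format possible.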



Figure 4: Plot showing cluster width differences between higher pT and min-bias events.

# *C. Data Transfer and Synchronisation*

Ideally regional data would have a dedicated path off-detector, allowing for fixed latency and no congestion. This introduces many new readout paths, and could double the number of optical links between the detector and counting room. This is obviously not desirable.

Sharing a readout "channel" with event-data makes sense (especially when considering the low data-volume), but this both de-synchronises the data and increases latency: Event data will most-likely be transferred in packets [4] with headers, trailers, bunch-crossing IDs, event IDs, chip IDs etc. A packet might be broken into frames allowing it to be transferred non-continuously. Regional data will need to wait for any in-progress packets or frames to finish transferring before initiating readout.

Smaller frames will have less impact on regional-data synchronisation, but will also decrease data-volume efficiency. A frame of order 10 bits would be a compromise worth investigating: 1 start bit, 1 normal/regional event select bit, and 8 bits of data.
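A frame of this kind could be packed and unpacked as in the sketch below. The bit ordering (start bit first, then the select bit, then the data byte) and the function names are assumptions for illustration; the text specifies only the field sizes.

```python
START_BIT = 1

def pack_frame(regional, data_byte):
    """Pack one 10-bit frame: [start][regional/normal select][8 data bits]."""
    if not 0 <= data_byte <= 0xFF:
        raise ValueError("data_byte must fit in 8 bits")
    return (START_BIT << 9) | (int(bool(regional)) << 8) | data_byte

def unpack_frame(frame):
    """Return (regional, data_byte); raise if the start bit is missing."""
    if (frame >> 9) & 1 != START_BIT:
        raise ValueError("missing start bit")
    return bool((frame >> 8) & 1), frame & 0xFF

def payload_efficiency(data_bits=8, overhead_bits=2):
    """Fraction of transmitted bits that carry data."""
    return data_bits / (data_bits + overhead_bits)

frame = pack_frame(regional=True, data_byte=0xA5)
print(f"{frame:010b}", unpack_frame(frame), payload_efficiency())
```

With two overhead bits per byte the payload efficiency is 80%, and regional data never waits for more than one 10-bit frame of normal event data to complete.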

# *D. Off-detector Readout*

Data from the detector are transferred, via optical links, to RODs in the counting-room. Regional data does not need to be processed by the ROD in a significant way. Here the ROD acts as a router diverting the incoming data out to the track-finder hardware.

As track-trigger data-volume is low, the number of links to the track-finder can be optimised and data concentrated (although queuing during times of peak volume needs to be taken into account). Tags will need to be added to the data to identify which link (or module ID) it belongs to. As the data will arrive relatively slowly from the front-end (a single optical link is shared by 12 modules) it might be fragmented when sent to the track-finder and require more tagging. The additional latency incurred while queuing can be reduced, on average, by prioritising older data (i.e. that with earlier bunch-crossing IDs).
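Prioritising older data amounts to a priority queue keyed on the bunch-crossing ID. The sketch below shows one way this could look in software; the class and field names are hypothetical, and BCID wrap-around is ignored for simplicity.

```python
import heapq
import itertools

class OldestFirstQueue:
    """Queue that releases fragments with the earliest BCID first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps arrival order

    def push(self, bcid, link_id, payload):
        """Queue a tagged fragment from the given link."""
        heapq.heappush(self._heap, (bcid, next(self._order), link_id, payload))

    def pop(self):
        """Remove and return the fragment with the earliest BCID."""
        bcid, _, link_id, payload = heapq.heappop(self._heap)
        return bcid, link_id, payload

    def __len__(self):
        return len(self._heap)

queue = OldestFirstQueue()
queue.push(105, link_id=2, payload=b"c")
queue.push(100, link_id=7, payload=b"a")
queue.push(100, link_id=3, payload=b"b")
while queue:
    print(queue.pop())
```

Each entry carries its link tag alongside the BCID, so fragments from the same event can be reassembled downstream even when they leave the ROD interleaved with other events.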

Detector layout plays a part in readout latency too. As an RoI will encompass adjacent super-modules, data from neighbouring super-modules should be routed to different RODs. For example, in the barrel, only every 3<sup>rd</sup> super-module, radially, should be connected to the same ROD.

# *E. Track-Finder*

Due to the distinct differences in layout between barrel and end-cap, the track-finder will have optimised configurations divided geographically along the length of the detector: barrel, end-cap and both. The detector will also be divided into quadrants, with overlap. This motivates independent track-finder units servicing the 24 zones.

To allow for asynchronous data, the track-finder unit will assign a processor to an individual event (BCID). Incoming data from the RODs will need to be routed first to its zonal unit (and duplicated in the case of overlaps) and then routed to the processor assigned to that event.

The processor is expected to operate using a "bingo" technique – as data arrives it is used and if tracks are found they are logged. This means tracks can be found even with incomplete data-sets.

By determining a processing cut-off time synchronous to the event being processed, all tracks found can be passed to the next stage of the trigger system synchronously if needed, with outstanding data discarded.

# *F. Regional-Readout-Request Distribution*

The regional-readout-request signal operates similarly to the L1-Accept  $(L1A)$  signal – it is synchronous to the BC it acts on, used to copy data from the front-end pipelines, generated by the Level-1 Trigger, and is desired to be low-latency.

However, unlike the L1A, the R3 is not broadcast, but instead targeted at specific modules. There are of the order 50000 modules in the tracker alone, so this is a large-scale system.

With ~4000 RoIs it will be most efficient to distribute RoI-IDs rather than R3 signals where possible. Using CERN Giga-Bit Transceivers (GBTs) in the counting room, we can distribute 6 to 10 RoIs/BC [5], allowing RoIs to be broadcast to all ROD-Crates (containing  $\sim$ 10 RODs each) via the TIM, or directly to each ROD (of order 200 in the SCT+Pixels).

Each ROD identifies which of its connected super-modules are inside the RoI and generates an R3 map for these modules. This requires a custom look-up table on each ROD which will need uploading at configuration.
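The look-up step on a ROD might be pictured as below. The table contents, the 12-bit module bitmaps and all names are hypothetical; the sketch only illustrates mapping received RoI-IDs to per-super-module R3 bitmaps.

```python
# Hypothetical look-up table loaded at configuration:
# RoI-ID -> {super-module index on this ROD: bitmap of modules to read out}
ROI_LUT = {
    17: {0: 0b000000000111, 1: 0b000000000011},
    18: {1: 0b000000001100},
}

def r3_map(roi_ids, lut=ROI_LUT):
    """Merge the module bitmaps of all RoIs received for one bunch crossing."""
    merged = {}
    for roi_id in roi_ids:
        # RoIs that do not touch this ROD are simply absent from its table
        for super_module, bitmap in lut.get(roi_id, {}).items():
            merged[super_module] = merged.get(super_module, 0) | bitmap
    return merged

for sm, bits in r3_map([17, 18]).items():
    print(f"super-module {sm}: {bits:012b}")
```

Overlapping RoIs are merged with a bitwise OR, so a module that falls inside two RoIs in the same bunch crossing is requested only once.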

The R3 signals are transferred using a special GBT word to the super-module where the SMC decodes the signal and forwards to the modules.



Figure 5: Schematic of R3 generation and distribution system.

As each module needs to be identified individually, point-to-point links between the SMC and the module would be ideal, but resources on-detector are limited. Sending the signal serially (at 40Mb/s) is slow and introduces latency (300-600ns). Latency can obviously be improved by broadcasting at higher rates.

A compromise between signalling and latency on-detector would be to split the super-modules into 'zones', allowing simultaneous short bitmaps to be sent to each group of modules.
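The trade-off can be estimated with simple arithmetic. At 40 Mb/s each bit takes 25 ns, so the 300-600 ns quoted above corresponds to 12-24 bits sent serially. The sketch assumes one bit per module and zones served by parallel lines, both of which are illustrative assumptions.

```python
import math

BIT_NS = 25.0  # one bit period at 40 Mb/s

def bitmap_latency_ns(n_modules, n_zones=1):
    """Time to shift one bit per module, with zones served in parallel."""
    bits_per_zone = math.ceil(n_modules / n_zones)
    return bits_per_zone * BIT_NS

for zones in (1, 2, 4):
    print(f"{zones} zone(s): {bitmap_latency_ns(24, zones):.0f} ns")
```

Each doubling of the number of zones halves the serial latency at the cost of one more signal line on the super-module.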

Other options include broadcasting just the central module ID and letting the modules decide whether they are inside the RoI or not.

# IV. CONCLUSION

Although a track-trigger has only recently been applied to the ATLAS upgrade design, options have been found for its incorporation. There are many outstanding issues, not least of which is the latency requirement, but all of the subsystems involved seem capable of the modifications required.

# V. REFERENCES

[1] ATLAS Track-trigger webpage,

<https://twiki.cern.ch/twiki/bin/view/Atlas/L1TrackTrigger>

[2] N. Gee, private communication

[3] H. Evans, meeting slides

<http://indico.cern.ch/getFile.py/access?contribId=5&resId=0&materialId=slides&confId=65243>

[4] Strips Readout Working Group, private communication

[5] CERN GBT Project, <http://cern.ch/proj-gbt>

# Design of a trigger module for the CMS Tracker at SLHC

# G. Hall, for the CMS Collaboration

#### Blackett Laboratory, Imperial College, London SW7 2AZ, UK

g.hall@imperial.ac.uk

# *Abstract*

The CMS experiment is planning a major upgrade of its tracking system to adapt to an expected increase in luminosity of the LHC accelerator to  $10^{35}$  cm<sup>-2</sup>.s<sup>-1</sup>. It will then have to cope with several hundred interactions per bunch crossing and fluxes of thousands of charged particles emerging from collisions. CMS requires tracker data to contribute to the first level trigger, to maintain the present 100kHz rate while increasing the trigger decision latency by only a few µs. A key part of a system to achieve this will be the design of a suitable module to generate trigger primitives.

One possible solution is based on so-called "stacked tracker modules" using closely spaced, coarsely pixellated sensor layers situated at intermediate radius within the tracker volume. A basic readout architecture is proposed and some of the electronic implications are described. Estimates of likely power consumption, data rates and link bandwidth requirements are given.

# I. INTRODUCTION

The upgrade of the LHC accelerator to Super-LHC (SLHC) foresees operating at $10^{35}$ cm<sup>-2</sup> s<sup>-1</sup> luminosity to provide increased statistics and allow deeper investigations into rare processes including, hopefully, discoveries of new physics. The LHC peak luminosity of $10^{34}$ cm<sup>-2</sup> s<sup>-1</sup> will eventually deliver about 50 $\text{fb}^{-1}/\text{yr}$ [1,2] and CMS was designed for 10 years of operation under these conditions. Most of CMS should survive predicted irradiation levels and perform well with few changes at higher luminosities, except for the Tracker which will gradually suffer degradation from radiation damage, probably reducing performance after about 500 fb<sup>-1</sup>. There are already plans to allow earlier replacement of some layers of the pixel detector, since that will suffer more radiation damage than the outer Tracker. Trigger and data acquisition systems would also take advantage of technology evolution to be improved to cope with SLHC data volumes and rates whenever luminosity increases should occur.
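
The annual figure quoted above can be checked from the assumptions listed in note [2]. The short sketch below does so; the only ingredient not taken from the text is the standard unit conversion 1 fb<sup>-1</sup> = 10<sup>39</sup> cm<sup>-2</sup>, and the variable names are illustrative.

```python
# Check of the ~50 fb^-1/yr figure using the assumptions in note [2].
PEAK_LUMINOSITY = 1e34      # cm^-2 s^-1, nominal LHC peak luminosity
SECONDS_PER_YEAR = 1e7      # s of operation per year (note [2])
EFFICIENCY = 0.5            # operational efficiency factor (note [2])
CM2_PER_INVERSE_FB = 1e39   # 1 fb^-1 = 1e39 cm^-2, since 1 b = 1e-24 cm^2


def integrated_luminosity_fb(peak, seconds, efficiency):
    """Integrated luminosity in fb^-1, treating the luminosity as constant."""
    return peak * seconds * efficiency / CM2_PER_INVERSE_FB


annual = integrated_luminosity_fb(PEAK_LUMINOSITY, SECONDS_PER_YEAR, EFFICIENCY)
ten_years = 10 * annual
print(f"{annual:.0f} fb^-1 per year, {ten_years:.0f} fb^-1 in 10 years")
```

Under these assumptions ten years of nominal running correspond to about 500 fb<sup>-1</sup>, the same scale at which Tracker performance is expected to start degrading.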

The present Tracker surrounds the interaction point and provides precise, efficient measurement of charged particle trajectories and secondary vertices. It comprises pixels in three barrel layers at radii 4.4-10.2 cm and ten barrel layers of silicon microstrips to a radius of 1.1 m. It includes two endcap disks in the pixel detector and nine in the strip tracker on each side, plus inner barrel disks, extending acceptance to $|\eta|$ of 2.5. With about 200 m<sup>2</sup> of active area the CMS system is the largest silicon tracker ever built [3]. The pixel system is quickly removable, in case of beam-pipe bake-outs. Inner layer replacement was foreseen after several years of high luminosity operation as sensors reach irradiation levels corresponding to 100-300 fb<sup>-1</sup> integrated luminosity. The microstrip and pixel detectors have operated with the rest of CMS taking cosmic ray data since 2008 and the performance looks very promising.

Most CMS sub-detectors will not change much for SLHC. It is important to maintain compatibility and retain the Level 1 trigger rate limit of 100kHz. Trigger latency can increase from  $\sim$ 3.2 $\mu$ s to 6.4 $\mu$ s, limited by electromagnetic calorimeter pipelines.
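
Expressed in bunch crossings, the latency figures translate directly into the depth of the front-end pipelines that must hold data while the Level 1 decision is pending. A minimal sketch, assuming the 25 ns bunch spacing that corresponds to the 40 MHz crossing rate (function and variable names are illustrative):

```python
BUNCH_SPACING_NS = 25  # period of the 40 MHz bunch-crossing clock


def pipeline_depth(latency_ns, bunch_spacing_ns=BUNCH_SPACING_NS):
    """Number of bunch crossings that must be buffered during the L1 latency."""
    return latency_ns // bunch_spacing_ns


present_depth = pipeline_depth(3200)   # ~3.2 us latency today
upgraded_depth = pipeline_depth(6400)  # 6.4 us latency at SLHC
print(present_depth, upgraded_depth)
```

The doubling of latency therefore corresponds to going from 128 to 256 buffered crossings.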

The notable exception is the tracking system, whose performance will eventually be degraded by radiation damage caused by immense particle fluxes. Greater radiation tolerance will be required, especially for sensors. In contrast, ASIC electronics should withstand SLHC radiation levels but the 0.25µm CMOS technology pioneered by CMS will be superseded by more advanced processes [4].

In the congested SLHC environment of 300-400 events per beam crossing, with thousands of particles emerging from interactions, higher granularity is required [5]. CMS also requires tracker data to be used in the first level trigger decision.

# II. THE UPGRADED TRACKER

The first phase of the machine upgrade might be five or six years after LHC start-up, to reach a peak luminosity of $(2\text{-}3)\times10^{34}$ cm<sup>-2</sup> s<sup>-1</sup>. Inner focusing magnets will be replaced, larger aperture collimators installed and the proton linac replaced to reach the ultimate LHC current. Around the same time the inner layer of the pixel system should be replaced. It looks possible to rebuild the pixel detector to achieve improved performance by reducing material [6,7]. In the longer term the pixel system for operation at $10^{35}$ cm<sup>-2</sup> s<sup>-1</sup> is expected to be similar in detector area and material budget to the Phase I device. It is also expected to evolve further, with a new ROC architecture and pixel size optimized for SLHC conditions.

High quality tracking and vertexing performance must certainly be maintained in the congested SLHC environment. From simulations of heavy ion events in the present tracker, with similar track density to SLHC, an extra pixel layer would restore track seeding losses. A new layout can be optimised for track finding and jet reconstruction. Granularity must increase because of leakage currents as well as track recognition.

Multiple scattering, photon conversions, bremsstrahlung and hadronic interactions are undesirable, and reducing them depends on limiting material within the Tracker. A major constraint for a new system is that cooling pipes, power cables and optical fibres follow complex, congested routes, and their installation was time consuming and difficult. It is unlikely they can be replaced.

Another major challenge is the requirement to use Tracker data for the first time in the Level-1 trigger. Single  $\mu$ , electron and jet Level 1 trigger rates at SLHC will greatly exceed 100kHz and cannot be reduced sufficiently by increasing  $p_T$ thresholds or by other improvements in calorimeter and muon system algorithms. Tracking information in the present High Level Trigger (HLT) already provides additional rejection power and motivates future use of Tracker data at L1. However, the constraints are very different, since the HLT has access to all the Tracker data for almost complete track reconstruction and has a relatively long time (~40ms) available. In contrast a L1 track trigger must make decisions in a few µs and it does not seem feasible to transfer data to HLT processors and fully reconstruct tracks. The data volumes are simply too large.

One proposal has been made to use cluster width information to eliminate low  $p_T$  tracks [8]. An alternative which has been simulated in some detail deploys closely spaced, coarsely pixellated sensor layers at intermediate radius and compares hit patterns [9] to eliminate data from low  $p_T$  tracks, thus reducing the data volume significantly. The  $p_T$  cut is set by the angle of a track in the layer, and the logic might be relatively simple. The development of modules which would allow this is the main subject of this paper.
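
The relation between the $p_T$ cut and the track angle in a stacked layer can be illustrated with a short calculation. The sketch below is not taken from the simulations cited here: it assumes a uniform 3.8 T solenoid field (the nominal CMS value, not stated in this paper), a track originating on the beam line, and the usual relation $R = p_T/(0.3B)$ for the radius of curvature in metres. The function name and defaults are illustrative.

```python
import math


def stub_displacement_um(pt_gev, layer_radius_m, separation_mm, b_field_t=3.8):
    """Azimuthal offset (um) between the hits in the two sensors of a stack."""
    # Radius of curvature in metres: R = pT / (0.3 * B), pT in GeV/c, B in tesla.
    curvature_radius_m = pt_gev / (0.3 * b_field_t)
    # Angle between the track and the radial direction at the layer,
    # for a track starting on the beam line: sin(alpha) = r / (2R).
    sin_alpha = layer_radius_m / (2.0 * curvature_radius_m)
    if sin_alpha >= 1.0:
        raise ValueError("track does not reach this radius")
    alpha = math.asin(sin_alpha)
    # Offset accumulated while crossing the radial gap between the sensors.
    return separation_mm * 1e3 * math.tan(alpha)


for pt in (1.0, 2.0, 5.0, 10.0):
    shift = stub_displacement_um(pt, layer_radius_m=0.25, separation_mm=1.0)
    print(f"pT = {pt:4.1f} GeV/c -> offset = {shift:6.1f} um")
```

Under these assumptions a 2 GeV/c track at 25 cm radius is displaced by roughly 70 µm across a 1 mm gap, which is comparable to a 100 µm pixel pitch; softer tracks are displaced further, so a window of a few pixels acts as a momentum cut.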

Presently there is no single design agreed for the Phase II Tracker. Simulations are vital, and alternative layouts are under consideration to investigate performance in detail.

# III. THE TRACK-TRIGGER CHALLENGE

The major difficulty implementing tracking triggers at Level 1 is the data volume. It is easy to see that it is not feasible to transfer all data off-detector for decision logic. For example a single layer at a radius of 25 cm with 2.5mm x 100 $\mu$m pixels is expected to have an occupancy $\sim$0.5% at $10^{35}$ cm<sup>-2</sup>s<sup>-1</sup>. This would require ~20M channels of coarse pixels, each hit contributing about 24 bits at the 40MHz bunch crossing rate, so a data rate of $\sim$96,000 Gb/s, needing an enormous number of optical links, and power. Therefore some method for on-detector data reduction, selective readout, or a combination, is essential. Pixellated trigger layers will be more power hungry than microstrip layers, so the challenge is obvious.
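
The data rate estimate can be reproduced in a few lines. The sketch below uses only figures quoted in this paper (channel count, occupancy, bits per hit and the 40MHz crossing rate); the names are illustrative.

```python
# Raw data rate of an unsuppressed pixel trigger layer.
CHANNELS = 20e6           # coarse pixels in a single layer at 25 cm radius
OCCUPANCY = 0.005         # ~0.5% of pixels hit per bunch crossing
BITS_PER_HIT = 24         # bits transmitted for each hit pixel
CROSSING_RATE_HZ = 40e6   # bunch-crossing frequency


def raw_rate_gbps(channels, occupancy, bits_per_hit, crossing_rate_hz):
    """Data rate in Gb/s if every hit is shipped off-detector."""
    hits_per_crossing = channels * occupancy
    return hits_per_crossing * bits_per_hit * crossing_rate_hz / 1e9


rate = raw_rate_gbps(CHANNELS, OCCUPANCY, BITS_PER_HIT, CROSSING_RATE_HZ)
print(f"{rate:,.0f} Gb/s")
```

About 100,000 hits per crossing at 24 bits each give 2.4 Mb per crossing, or 96,000 Gb/s at 40MHz.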



Fig. 1. The principle of selecting high transverse momentum tracks in stacked layers. A stub is a pair of hits passing the selection criteria.

The charged particle transverse momentum spectrum contains a large fraction of low $p_T$ tracks which are not useful for triggering. It is conceptually simple to estimate the transverse momentum using pairs of closely spaced layers [9], provided sensor element sizes, which depend on the radial location of the layer, are properly dimensioned (fig. 1). A double layer identifies "stubs", which are pairs of nearby hits in the two sensors that define a track with transverse momentum above a $p_T$ threshold. The method to find stubs is simply to compare a binary pattern of hit pixels on upper and lower sensors, possibly with a processing step on the detector which identifies clusters rather than using individual hits, since using individual hits is expected to lead to extra combinations. These should be the trigger primitives transferred to the L1 trigger system for more sophisticated algorithms to process for the final trigger decision.

Under SLHC conditions, the hit density means a high rate of combinatorial background if the sensor area which is searched for matching pairs is not carefully defined. This will depend on the radial separation of the two sensor planes (fig. 2).
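
The pattern comparison can be sketched as a window search between the hit lists of the two sensors. This is an illustration only, not the logic foreseen for the ASIC: hits are represented as hypothetical (row, column) indices, and the windows are taken as half-widths, which is an assumption of this sketch.

```python
def find_stubs(lower_hits, upper_hits, row_window=1, col_window=1):
    """Return (lower, upper) hit pairs whose offsets fall inside the search window.

    Hits are (row, column) pixel indices. The windows are half-widths in pixels,
    so row_window=1 searches a 3-pixel-wide region (an assumption of this sketch).
    """
    upper = set(upper_hits)
    stubs = []
    for row, col in sorted(lower_hits):
        for d_row in range(-row_window, row_window + 1):
            for d_col in range(-col_window, col_window + 1):
                candidate = (row + d_row, col + d_col)
                if candidate in upper:
                    stubs.append(((row, col), candidate))
    return stubs


lower = [(10, 3), (100, 7), (200, 20)]
upper = [(11, 3), (130, 7), (200, 21)]
stubs = find_stubs(lower, upper)
print(stubs)  # [((10, 3), (11, 3)), ((200, 20), (200, 21))]
```

The pair offset by 30 rows is rejected, as a low $p_T$ track would be. Widening the window accepts softer tracks but also admits more accidental pairs, which is the combinatorial background described above.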

Quite extensive simulations [10] have been carried out using a layout of the detector which includes a realistic model of individual detectors and the services thought to be required to power and read out the double sensor layer modules producing trigger stubs, which are here referred to as "PT modules". The objective is to understand better how the PT modules can best contribute to the trigger and what overall L1 trigger rate reduction is achievable. Some results are illustrated in Table 1 and fig. 2 for events containing muon pairs in the presence of high pileup, suggesting sensor separations of less than a few mm could meet the requirements.



Fig. 2. Efficiency for constructing stubs as a function of transverse momentum in high luminosity conditions at SLHC. Here the selection criteria are a row window of 3 pixels, and a column window of 2 pixels for 0.5mm spacing, and 3 pixels for 1mm-2mm.

Efficiency is the ratio of stubs to tracks above $2 \text{ GeV/c}$, while the fake rate is associated with stubs formed when hits from two tracks, neither of which would pass the $p_T$ cut on its own, are correlated. The reduction factor is the ratio of hits to stubs which need to be read out. As can be observed, a separation of about 1mm between sensors provides high efficiency and low fake rate with a reduction factor $\sim 20$. Efficiency falls as the separation between the layers increases, largely because of geometrical acceptance; the fake rate also increases as it becomes harder to reject accidental combinations.

Table 1. Simulation estimates of stub finding efficiency in 100µm x 2.5mm pixels stacked layers at 25cm radius, with 0.5% occupancy.

| Radial separation<br>[mm] | Efficiency<br>[%] | Fake rate<br>[%] | Reduction<br>factor |
|---------------------------|-------------------|------------------|---------------------|
| 0.5                       | 99.0              | 0.7              | 8.0                 |
| 1.0                       | 99.4              | 4.1              | 22                  |
| 2.0                       | 97.7              | 17.8             | 96                  |
| 3.0                       | 96.0              | 39.0             | 210                 |
| 4.0                       | 92.9              | 47.2             | 254                 |
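
The trade-off in Table 1 can be explored programmatically. In the sketch below the table is transcribed as it appears above, while the efficiency and fake-rate thresholds are arbitrary illustrative choices, not requirements stated by CMS.

```python
# Table 1 rows: (separation mm, efficiency %, fake rate %, reduction factor)
TABLE_1 = [
    (0.5, 99.0, 0.7, 8.0),
    (1.0, 99.4, 4.1, 22),
    (2.0, 97.7, 17.8, 96),
    (3.0, 96.0, 39.0, 210),
    (4.0, 92.9, 47.2, 254),
]


def acceptable_rows(table, min_efficiency=99.0, max_fake_rate=5.0):
    """Rows meeting the (illustrative) efficiency and fake-rate thresholds."""
    return [row for row in table
            if row[1] >= min_efficiency and row[2] <= max_fake_rate]


candidates = acceptable_rows(TABLE_1)
# Among the acceptable rows, prefer the largest data reduction factor.
best_separation = max(candidates, key=lambda row: row[3])[0]
print([row[0] for row in candidates], best_separation)
```

With these thresholds the 0.5 mm and 1.0 mm separations qualify, and 1.0 mm offers the larger reduction factor, in line with the choice discussed in the text.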

# IV. MODULE REQUIREMENTS

The studies into the definition of the trigger layers have begun to pose questions such as the following:

- how are the stubs to be used in the trigger and what rejection factors are achieved?
- how many layers are needed?
- what is their optimal location, allowing sufficient $\eta$ coverage?
- what is the impact of material, in trigger layers and elsewhere, on trigger performance?
- how important is z-measurement of the primary vertex, and what is the required resolution?
- what is the impact of the trigger layers, which will certainly be more massive than conventional tracking layers, on tracking performance?
- what are the likely cost, power requirements and contribution to the tracker material budget?
- can the layers be read out at the full 40MHz rate or is a L0 trigger, i.e. a signal preceding the L1 trigger, needed to select a region of interest?

Although the answers to some of the questions posed above are needed to guide the design of the PT module, it is also difficult to answer them without concrete details of a module design in mind. For this reason, this must be an iterative process. For the present a couple of concepts are being evaluated, in the hope of identifying an optimum design which can be prototyped by a collaboration within CMS.

#### *A. Schematic Module design*

The first module type is illustrated in Fig. 3. The pixel size is 100µm by 2.5 mm arranged in columns of 256 rows with 32 columns per module, so an approximate active sensor size of 25.6 mm x 80 mm contains 8192 pixels. The sensor is expected to be 200µm or less in thickness. Hits are read out to the upper and lower sides of the module where the connections between the two sensors are made, which allows a module to be constructed without material under the sensitive sensor area in the interests of minimizing multiple scattering in the measurement paths and reducing heat dissipation in the immediate proximity of the sensors. The data are read out from a column of 128 pixels and transferred to the end of the readout chip on each clock cycle. Typically less than one pixel per column will be hit in each beam crossing and this may be exploited to avoid a high speed serialiser which is expected to be too power hungry.
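
The module dimensions and the expected column occupancy follow from the pixel pitch and array size. A quick check, using only numbers quoted in this paper (the 0.5% occupancy is the value assumed for a layer at 25 cm radius); names are illustrative:

```python
ROW_PITCH_UM = 100        # pixel size along the fine-pitch direction
COLUMN_PITCH_UM = 2500    # pixel size along the coarse direction
ROWS = 256                # pixels per column across the full sensor
COLUMNS = 32              # columns per module
PIXELS_PER_READOUT_COLUMN = 128   # each half-column is read out to one side
OCCUPANCY = 0.005         # ~0.5% at 25 cm radius

height_mm = ROWS * ROW_PITCH_UM / 1000
width_mm = COLUMNS * COLUMN_PITCH_UM / 1000
pixels = ROWS * COLUMNS
mean_hits_per_readout_column = PIXELS_PER_READOUT_COLUMN * OCCUPANCY

print(f"{height_mm} mm x {width_mm} mm, {pixels} pixels")
print(f"mean hits per 128-pixel column per crossing: {mean_hits_per_readout_column:.2f}")
```

The mean of 0.64 hits per 128-pixel column per crossing is consistent with the statement that typically less than one pixel per column is hit.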

The readout ASIC (ROC) for each column is assumed to be a 128 channel front-end element, with amplifier and other circuits in each pixel, plus an "assembler" at the periphery where data to be used by the trigger are temporarily stored and comparisons of patterns between the two layers are made. To minimize the interconnections on the module and take advantage of the higher density of metal lines possible at the chip level, the assembler is part of the ROC ASIC, not a separate chip. Probably several columns will be amalgamated into a single chip, perhaps with up to 8 adjacent channels (20mm wide). Note that two ROCs are required to read out the full module width.

It is assumed that the high speed links required for the system will be based on the CERN GBT (Gigabit Bidirectional Transmission) [11] and Versatile Optical Link projects [12], which are developing radiation hard bidirectional optical links for use in the LHC experiment upgrades. At the edge of the module there is another ASIC, referred to as a "concentrator", which would be the interface to the GBT, or actually a component of the GBT chip set, which would be an economical solution. It would provide inputs to the module (clock, trigger, control data) and outputs from it (data for the track-trigger).



Fig. 3 The PT module seen in plan view (upper) and in section (lower), indicating connections required by the two sensor layers. The sensor is bump bonded to the ROCs. In this example 8 ROCs with a total of 8192 channels are required.

A significant contribution to the total power required for PT layers comes from the links, which are assumed to require 2W per link for 4.8 Gbps including error correction, of which 3.2Gbps is available for data. Roughly 3000 GBT links are required to read out a layer of about 40M pixels at a radius of 25cm, assuming a data reduction factor of 20, an occupancy of 0.5% and 24 bits transmitted for each selected pixel (sending data from only one of the two sensors in the double layer). These figures assume 50% use of the GBT bandwidth, so the link power requirement is 6kW for the layer, or 150µW per pixel channel. It seems that detector layout will constrain the location of the GBT transceivers to the close vicinity of the module. It is plausible that the link speed might double for the same estimated power consumption, so there is hope of improving on this contribution to the power budget. There are significant uncertainties in these estimates as factors such as the ability to select clusters and the optimal layout of links, as well as the use of bandwidth, must be better understood.
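
The link count and link power can be reproduced from the stated assumptions. The sketch below uses only the figures given in the preceding paragraph plus the 40MHz crossing rate; variable names are illustrative.

```python
PIXELS_IN_LAYER = 40e6       # both sensors of the stacked layer at 25 cm
FRACTION_READ = 0.5          # data sent from only one of the two sensors
OCCUPANCY = 0.005            # 0.5% of pixels hit per crossing
BITS_PER_HIT = 24
CROSSING_RATE_HZ = 40e6
REDUCTION_FACTOR = 20        # hits-to-stubs data reduction
LINK_PAYLOAD_GBPS = 3.2      # user bandwidth of a 4.8 Gbps GBT link
LINK_UTILISATION = 0.5       # 50% use of the available bandwidth
LINK_POWER_W = 2.0           # assumed power per link

trigger_rate_gbps = (PIXELS_IN_LAYER * FRACTION_READ * OCCUPANCY * BITS_PER_HIT
                     * CROSSING_RATE_HZ / REDUCTION_FACTOR / 1e9)
links = trigger_rate_gbps / (LINK_PAYLOAD_GBPS * LINK_UTILISATION)
link_power_kw = links * LINK_POWER_W / 1e3
link_power_per_pixel_uw = links * LINK_POWER_W / PIXELS_IN_LAYER * 1e6

print(f"trigger data: {trigger_rate_gbps:.0f} Gb/s, links: {links:.0f}")
print(f"link power: {link_power_kw:.1f} kW, {link_power_per_pixel_uw:.0f} uW per pixel")
```

The result is 4800 Gb/s of trigger data carried by about 3000 links, for 6kW in total. Doubling the link speed at constant power per link would halve both the link count and the link power.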

The logic of the readout chip design is illustrated in fig. 4. Identical chips are foreseen in the two layers, with certain elements not operational, because not needed, in one layer to save power. Hit data are transferred to a memory buffer in the assembler area each clock cycle, then data from the lower layer are passed to the upper layer. In the upper layer, hits are also passed from neighbouring columns so a comparison can be made between patterns from the lower layer and three columns in the upper layer to make a decision on valid stubs. Those patterns consistent with high  $p_T$  stubs are transferred from the module off-detector.

Very provisional estimates of local power requirements have been made which suggest that, using 130nm CMOS, about 100µW per channel might be achievable, leading to a total of $\sim$250 $\mu$W per channel once the link contribution of 150µW per channel is included. It should be emphasised that these evaluations are quite uncertain, since chip designs have not begun and the logic and local data transmission rates are not well understood. However, such estimates are essential in developing the module design.

Hit data should be stored on the pixel for full readout following a Level 1 trigger. Given the likely number of layers in the future Tracker, it is probably desirable to read out all hit data from PT modules despite the low  $p_T$  threshold, which will add further to the power required. This functionality would also be valuable for evaluation. It is estimated that a binary pipeline in each pixel would require too much space and an architecture similar to that used in the present pixel detector is under discussion.



Fig. 4 (left) Schematic of possible layout of ROC chip to read out 128 pixel columns, in this case grouped in units of 4 ROCs per chip. (right) Schematic of the data flow to allow comparison logic to be placed in the assembler area of the ROC, at the periphery of the sensor.

The total power consumption for stacked layers with these pixel dimensions can thus be estimated to be about 10kW for 40M pixels at 25 cm radius, and 19kW for 75M pixels at 35 cm radius. The total number of links required is 2900 and 5600 for the two cases, which does not allow for full readout of the layers, only track-trigger data. These layers will therefore represent the major contribution to power consumption of a likely layout of a new Tracker and great care will be needed not to allow power, material or numbers of links to increase significantly if the tracking performance is to be maintained.

#### *B. Alternative Module design*
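
The layer totals follow from the per-channel estimates given earlier: about 100µW for the front end plus 150µW for the links. A minimal check, with illustrative names:

```python
FRONT_END_UW = 100   # provisional local power per channel in 130nm CMOS
LINK_UW = 150        # link power per channel from the GBT estimate


def layer_power_kw(pixels, per_channel_uw=FRONT_END_UW + LINK_UW):
    """Total power of a stacked layer in kW for a given number of pixels."""
    return pixels * per_channel_uw * 1e-6 / 1e3


power_25cm = layer_power_kw(40e6)
power_35cm = layer_power_kw(75e6)
print(f"25 cm: {power_25cm:.2f} kW, 35 cm: {power_35cm:.2f} kW")
```

This gives 10 kW and 18.75 kW, the latter rounding to the 19kW quoted for the 35 cm layer.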

CMS now has experience of automated assembly but it will be highly desirable to optimise construction to take full advantage of commercial manufacturing. It may be possible to design a module with more advanced technologies, and transfer many of the assembly issues to industry. To do so requires a different approach to the logic and a careful evaluation of commercial methods, where multi-layer technologies continue to advance significantly [13].

The concept is derived from hybrid pixel detectors. The basic module consists of a matrix of read-out chips (TFEA: Tracker Front-End ASIC), each integrated circuit an array of 4 by 160 identical channels, mapped onto a corresponding array of 100 µm x 2000 µm elements on the silicon sensor. (Dimensions are illustrative, chosen to use a 150mm diameter sensor wafer.) The modules proposed are composed of a sandwich as illustrated in fig. 5 and assembled using a combination of standard technologies, such as wire-bonding and bump-bonding.



Fig. 5 Cross section of the module in the views along z and in the rphi plane. Dimensions are illustrative.

The ASIC should be large enough to cover the sensing area with a minimum of dead space and, to reduce module power consumption, should avoid moving data at high speed across chips whenever possible.

The integrated circuits are connected using wire-bonding or bump-bonding on a double-sided substrate (fig. 5). The example illustrated requires through-vias in an intermediate layer, of the type used commercially for low cost memory assembly. The read-out chips are then sandwiched between two silicon sensor layers connected to the chips using coarse pitch bump-bonding which should be readily available, e.g. $\sim$200 $\mu$m minimum pitch with relatively large bumps. The choice of an inexpensive and well known interconnection technology minimises costs, risk and investment and simplifies the manufacturing process. In addition, the concept is intended to allow straightforward testing of the ASICs, to enhance module production yield and simplify manufacture.

The architecture differs from the previous concept by aiming to perform all necessary functions *locally* on each front-end chip. No transfer of data to a correlator or assembler area on the chip is necessary as all front-end and triggering functions are performed in or close to each pixel in the TFEA chip.

The Front-End ASIC is composed of a number of identical functional units, all present in each channel but not necessarily activated depending on the position of the ASIC in the module. These units are:

- A front-end amplifier, shaper and discriminator providing a binary yes/no answer at each bunch crossing.
- An Event Store buffer, to store the decision of the previous stage until arrival of the L1 trigger.
- A Data Link unit used to send information retrieved from the Event Store buffer upon arrival of an L1 trigger.
- A Local Trigger link, used to send promptly at each bunch crossing information from the FE to the TFEA chip(s) in the layer below.
- A Trigger Logic block to correlate information from the FE blocks in order to find stiff tracks. Clearly this logic block must have connectivity to adjacent pixels.
- A Trigger Link block to send promptly the result of the previous Trigger Logic block, if positive, to an external trigger logic block.
- A Clock and Control block to receive and regenerate the clock necessary to operate the TFEA. It also contains a slow control interface necessary to address local configuration registers and ancillary logic.

The connectivity of these blocks inside each TFEA and across the two layers of a module is illustrated in fig. 6. Both layers obviously contribute to the formation of a trigger primitive but, as depicted, only the lower one saves and transmits the data upon arrival of the L1 trigger. It is expected that the hit information from only one of the two layers would be transmitted. As the correlation of the hit position is performed inside the chip on the lower layer of the assembly, this architecture should minimise data movement outside the TFEA chips and therefore reduce power consumption.

This conceptual design poses several important questions, including cost and the practicality of large scale, low profile wire bonding and of assembling large area sensors on a multi-layer substrate with double-sided ASIC assembly, as well as issues concerning the logic and data transmission schemes. Wire and bump bonding meeting these specifications are routinely done in high volume flash memory assemblies at very low cost and with high yield, and through-via substrates are also in widespread use.



Fig. 6. Block diagram for the logical functions of a pair of TFEA chips

#### *C. Developments ahead*

The next steps in developing these ideas are to evaluate the approaches to module development in more detail and to compare and contrast the pros and cons. For example it is important to

- understand the impact on the material budget
- understand the implications of different choices for power or logic
- identify and design building block circuits
- understand the requirements for commercial manufacture, including costs and the scale of technological challenges
- evaluate many issues for module construction, especially power and cooling.

There are also many practical details which must be better understood, such as the handling of z-offsets and the implementation of comparison logic, before hopefully arriving at a single concept for prototyping. It will be essential to evaluate a real module in a beam test, even with limited features, since this type of module has never been used in a previous experimental system.

# V. CONCLUSIONS

Modules which will provide trigger primitives for use in CMS look feasible. Once prototyped they will provide a new part in the detector toolbox but will contribute a large fraction of a future Tracker power and material budget. The physics objectives will become clearer in the next few years and may evolve so designs which are flexible will be needed. It is also crucial to improve understanding of power consumption, which is sensitive to occupancy and achievable rejection factors in the selection.

Benefits from ASIC technology feature size reduction are expected in implementing the features required but no dramatic performance gains are yet anticipated. In addition, the questions of how to process very large volumes of tracker data off-detector, and which trigger algorithms are required to utilize the information, may not be straightforward to answer, so there are further challenges ahead. "Conventional" assembly methods may be feasible but it will be important to evaluate commercial manufacturing, exploiting technology progress, which may have a very important role in building these novel features into a future tracking detector. Prototype module development is now very timely.

#### **ACKNOWLEDGMENTS**

Many colleagues in CMS developed the outstanding tracking detector which has now been completed and are contributing to preparations to improve it for an even more demanding future. The ideas for the second module concept originate with A. Marchioro, whom I would like to thank along with D. Abbaneo, K. Gill, M. Pesaresi, M. Raymond and A. Ryd for very valuable discussions in developing these ideas.

#### **REFERENCES**

- 1. The CMS Collaboration. *CMS Tracker Technical Design Report,* CERN/LHCC 98-6 (1998) and Addendum CERN/LHCC 2000-016 (2000)
- 2. The CMS estimate of integrated luminosity assumes  $10^7$ s operation annually at nominal luminosity with a 50% operational efficiency factor.
- 3. The CMS Collaboration *The CMS experiment at the CERN LHC* 2008 JINST **3** S08004, doi: [10.1088/1748-0221/3/08/S08004](http://dx.doi.org/10.1088/1748-0221/3/08/S08004)
- 4. G. Hall *Recent progress in front end ASICs for high-energy physics* Nucl. Instr. and Meths in Phys. Res., A541 (2005) 248-258
- 5. F. Gianotti, M. Mangano, T. Virdee, et al., *Physics Potential and Experimental Challenges of the LHC Luminosity Upgrade* Eur. Phys. J. **C39** (2004) 293–333. [doi:10.1140/epjc/s2004-02061-6.]
- 6. G. Hall, *The upgrade programme of the CMS Tracker at SLHC*. Vertex 2008. To be published in Proc of Science
- 7. R. Horisberger. Private communication and internal CMS presentations.
- 8. F. Palla 2007 JINST **2** P02002
- 9. J. Jones, G. Hall, C. Foudas, A. Rose. *A Pixel Detector for Level-1 Triggering at SLHC*, arXiv:physics/0510228. CERN Report CERN-2005-011(2005) 130-134
- 10. M. Pesaresi. Private communication and internal CMS presentations. PhD thesis in preparation.
- 11. https://espace.cern.ch/GBT-Project/default.aspx
- 12. https://espace.cern.ch/project-versatile-link/default.aspx
- 13. A. Marchioro, who developed the ideas in this section. Private communication.

# Trigger R&D for CMS at SLHC

G. Iles <sup>a</sup>, C. Foudas <sup>a</sup>, M. Hansen <sup>b</sup>, J. Jones <sup>c</sup>

<sup>a</sup> Imperial College, London, UK <sup>b</sup> CERN, 1211 Geneva 23, Switzerland <sup>c</sup> Princeton University, Princeton, NJ, USA

[g.iles@imperial.ac.uk](mailto:g.iles@imperial.ac.uk)

# *Abstract*

CERN has made public a comprehensive plan for upgrading the LHC proton-proton accelerator to provide increased luminosity commonly referred to as Super LHC (SLHC) [1]. The plan envisages two phases of upgrades during which the LHC luminosity increases gradually to reach between  $6-7\times10^{34}$  cm<sup>-2</sup>sec<sup>-1</sup>. Over the past year, CMS has responded with a series of workshops and studies which have defined the roadmap for upgrading the experiment to cope with the SLHC environment. Increased luminosity will result in increased backgrounds and challenges for CMS and a major part of the CMS upgrade plan is a new Level-1 Trigger (L1T) system which will be able to cope with the high background environment at the SLHC.

Two major CMS milestones will define the evolution of the CMS trigger upgrades: The change of the Hadronic Calorimeter electronics during phase-I and the introduction of the track trigger during phase-II.

This paper outlines alternative designs for a new trigger system and the consequences for cost, latency, complexity and flexibility. In particular, it looks at how the trigger geometry of CMS could be mapped onto the latest generation of hardware while remaining backwards compatible with current infrastructure.

A separate paper presented at this conference [2] looks at what could be possible if large parts of the trigger system were changed, or additional hardware added to create a time multiplexed trigger system.

# I. INTRODUCTION

Plans are already well advanced for upgrades to the LHC machine that will provide increased luminosity. The current CMS experiment will fail to reap the full benefit of these upgrades for a number of reasons. One of these is that the current trigger system will be overwhelmed. It will not be possible to set sensible energy thresholds without the trigger rate exceeding the maximum Level-1 Accept (L1A) rate of 100kHz. Hence the Global Trigger would be forced to restrict the trigger rate by simply pre-scaling the trigger and thus effectively negating any benefit from increased luminosity.

It is for this reason that work has started on trying to integrate a tracking trigger in a future trigger system.

This would help identify the most interesting events and bring the trigger rate back below 100kHz. A new trigger system could potentially have several other benefits, such as improved flexibility because it would be based solely on FPGAs. The improvements in technology could also make the system easier to design, build and maintain, which could have a substantial impact not just on the cost of the hardware, but also on the manpower cost to test and operate it.

The phase I upgrade of the Hadronic Calorimeter (HCAL) electronics will precede that of the tracker and will provide lateral information of the energy depositions within the HCAL. An upgraded trigger system implemented at the same time would provide improvements to cluster-based triggers, such as the tau trigger, whilst at the same time preparing the trigger for track trigger information. This will enable CMS to make more stringent isolation cuts and provide triggers of higher purity early in the upgrade program. Consequently, the time seems ripe to begin consideration of a new trigger system.

# II. CURRENT TRIGGER

The trigger in CMS is split into two stages: the L1T (Level-1 Trigger) operates on coarsely segmented data that is transmitted and analysed for every proton-proton bunch crossing; the HLT (High Level Trigger) operates on the high resolution data that is stored on-detector in pipeline memories and is only read out after receipt of a L1A. The L1T uses a mixture of ASICs and FPGAs to process data from each bunch crossing (i.e. 40MHz), while the HLT uses PCs to process events at up to 100kHz.

The L1T design is split into two paths. The calorimeter trigger path is described here, but there exists a similar path for the muon trigger.

The Trigger Primitive Generators (TPGs) provide coarsely segmented data from the detector front ends at "tower" resolution, which for the Electromagnetic & Hadronic Calorimeters (ECAL & HCAL) consist of energy depositions with some additional detail (e.g. energy spread). The RCT (Regional Calorimeter Trigger) uses a clustering algorithm to search for electron candidates. It also reduces the resolution further by building "regions". These are then used by a clustering algorithm in the Global Calorimeter Trigger (GCT) to find jets. The GCT then sorts the electrons and jets into rank (i.e. in order of importance) and transmits the data to the Global Trigger (GT) which searches for physics signatures.

#### III. UPGRADE PATH

A new trigger system would replace the RCT and GCT. It would be highly desirable if this could be achieved with little impact on the rest of the CMS detector. The minimal changes would probably require upgrading the TPG and GT interfaces to use multimode optical links running at speeds comparable to the latest iteration of FPGAs (i.e. up to 6.5Gb/s, perhaps up to 11Gb/s).

This was foreseen over a year ago and thus when a replacement had to be designed for the GCT-GT links it was based on a Xilinx Virtex 5 with multimode optics [3]. The Optical Global Trigger Interface (OGTI) design (fig. 1) is essentially the first step in an upgrade of the trigger. A beneficial aspect of the card is that there is spare link bandwidth and thus it would be possible to drive two GTs. An upgraded GT could therefore be developed in parallel with the existing GT without having an impact on normal CMS running.



Figure 1: OGTI Card. Xilinx XC5VLX110T FPGA and 4x POP4 optics providing 16 channels at 3.2Gb/s in a dual CMC form factor.

It might be useful to use the same concept for the TPGs, which would need their links upgraded (i.e. they would have dual outputs). This is relatively easy because the links between the TPGs and the RCT reside on a daughter card known as the SLB. Hence the second step in an upgrade program would probably be to switch these links to use optical multimode links and an FPGA.

A new RCT and GCT could then be developed in parallel with the output going to a new GT, which could then be fed into the existing GT as a technical trigger without compromising normal CMS operation.

Upgrading the links in CMS is relatively straightforward, but changing the data on them is not. The latter would require changing the TPGs and, while this is planned for the HCAL, there is currently no plan for ECAL. A second option might be to build adapter cards; however, this would impose a latency penalty that may or may not be acceptable. The following is therefore a consideration for a new trigger system design in which the data flowing from the TPGs remains unchanged, albeit concentrated onto faster optical links where possible.

# IV. TRIGGER GEOMETRY

The CMS coordinate system (fig. 2) has its origin centred at the nominal collision point. The azimuthal angle $\varphi$ (0 to $2\pi$ radians) is measured in the plane perpendicular to the beam.

The polar angle $\theta$ ($-\pi/2$ to $\pi/2$) is measured from the plane perpendicular to the beam, although it is more normally expressed in terms of pseudorapidity, η, because at a hadron collider particle production is roughly constant as a function of pseudorapidity.



Figure 2: The φ and η coordinate system used in the CMS detector.

The TPGs provide coarsely segmented data at "tower" resolution, which has an η, φ coverage of 0.087 x 0.087 rad up to $\eta = 1.74$. Beyond that the towers are larger [4].

The trigger geometry (fig. 3) is split into 18 regions in $\varphi$ and $\pm 11$ regions in η; however, regions $\pm 8$ and above (i.e. pseudorapidity $> 3.0$ and $< 5.0$) are only covered by the Forward HCAL.



Figure 3: A portion of the RCT input geometry. Only 4 of the 18 regions in φ are shown and only  $\frac{1}{2}$  of η. The approximate size of an electron, tau and normal jet are shown to give the reader an indication of size.

 Each region is sub-divided into 4x4 towers except for the HF that is divided into 2x2 towers. In the case of ECAL, these towers are further subdivided into 5x5 crystals. Electrons have a width of less than 2 towers in both dimensions. Tau jets are similar, although they can extend to 3 towers in the φ dimension. Standard jets span up to 9-12 towers in both dimensions. Both systems transmit 8bits of energy and one extra bit. ECAL transmits the Fine Grain Veto bit, which is asserted when 90% of the energy within a tower is not contained within two crystals in η (i.e. it is designed to identify a single electron/photon, while allowing for the fact that an electron might emit bremsstrahlung radiation in the magnetic field). HCAL transmits the Minimum Ionising Particle (MIP) bit, which indicates that the energy deposited was compatible with a muon passing through it.

The tower information arrives at the RCT in the form of cables with 4 channels (ABCD). Channels AB and CD both span a single tower in $\eta$, but 4 towers in $\varphi$, and when combined they span 2 towers in η. The links currently run at 1.2Gb/s with each bunch crossing comprising 2x9bits of tower data, 5bits of Hamming code and a single bit for BC0 identification.

The 4 links would combine nicely to create a single 4.8Gb/s link with room for additional information if the Hamming code and BC0 were discarded in favour of a once per orbit CRC check and a special 8B/10B k-code to indicate BC0. This would provide 8 towers per bunch crossing. However, there are some special circumstances in which channels ABCD do not originate from the same location and thus forming a single 4.8Gb/s link would not be possible. Instead there would have to be 2x 2.4Gb/s links which would require additional FPGA I/O.
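
The bit budget behind these figures can be checked with a few lines of arithmetic. The sketch below assumes a 40MHz bunch-crossing rate and 8B/10B encoding on both the existing and the merged link (the text states 8B/10B only for the proposed link); the function name is illustrative.

```python
BX_RATE_HZ = 40e6  # bunch-crossing rate quoted earlier in the text (40MHz)

def payload_bits_per_bx(line_rate_bps, bx_rate_hz=BX_RATE_HZ):
    """Payload bits per bunch crossing on a link, assuming 8B/10B encoding."""
    line_bits = round(line_rate_bps / bx_rate_hz)  # raw bits on the wire per crossing
    return line_bits * 8 // 10                     # 8 payload bits per 10 line bits

# Existing link: two 9-bit towers, 5 bits of Hamming code, 1 BC0 bit
current_frame_bits = 2 * 9 + 5 + 1

# Proposed merged link: 8 towers of 9 bits, Hamming code and BC0 bit dropped
merged_payload_bits = payload_bits_per_bx(4.8e9)
spare_bits = merged_payload_bits - 8 * 9

print(payload_bits_per_bx(1.2e9), current_frame_bits)  # both 24
print(merged_payload_bits, spare_bits)                 # 96 payload bits, 24 spare
```

Under these assumptions the existing 24-bit frame exactly fills a 1.2Gb/s link, and the merged link leaves 24 spare bits per crossing, which is the "room for additional information" mentioned above.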

#### V. TECHNOLOGY CHOICE

The two major advances over the last 5 years that are particularly useful for a trigger system are the continuing progress in FPGA technology, with embedded SerDes blocks operating at multi-Gb/s rates, and the move to the optical interconnects necessary to transmit these signals over distances of more than a few feet.

Despite the latest FPGAs now having an I/O bandwidth of several hundred Gb/s they are still approximately an order of magnitude below what would be needed to absorb all the TPG data of several Tb/s in a single FPGA.

The challenge is therefore to concentrate the data into multiple FPGAs with sufficient boundary condition data for the cluster algorithms to operate efficiently and within a timescale of  $< 1$ µs.

If we assume that in an upgrade there should be some spare capacity for additional tower information (e.g. improved energy resolution) and thus allocate 12bits rather than 9bits per tower, and we also assume a 4.8Gb/s, 8B/10B link synchronised to the LHC clock, then we can transmit 8 towers (i.e. half a region) per bunch crossing (25ns). It is of course possible to slightly improve the efficiency of the link by going to 64B/66B encoding. We may also prefer to run with a slightly faster asynchronous clock, at perhaps 5.0Gb/s; however, these are just details. The basic architecture should not be determined by these details, and the data packing on the fibres should not be optimised to the point that it becomes impossible to easily understand the system. Consequently, we require approximately 4 links per region to accept HCAL and ECAL data. It is assumed that any tracking trigger, possibly even muon trigger, would require substantially less bandwidth because it is only transmitting location information; however, for modularity reasons they may require multiple input links and perhaps a lower speed interface to the FPGA (i.e. < 1Gb/s).

#### VI. INITIAL CONCEPT

The original concept behind a new trigger system was to place all the ECAL, HCAL, muon and tracking trigger information into a single FPGA at tower resolution so that coincidences between different subsystems could be used to improve physics object recognition. The baseline design consisted of finding trigger objects centred within a single region that was bounded by a region on all sides and all corners so that an array of 3x3 regions was constructed (fig. 4). The boundary information would be provided by duplicating data where necessary. This led to the development of the Matrix card [2] that incorporated a 72x72 cross-point switch for data duplication.





This architecture has several disadvantages. The design is very inefficient because only 1/9 of the data is processed in any given processing card. Furthermore, duplicating and distributing such a large quantity of data is not trivial. For example, if we use our earlier assumption of 4 links per region to bring ECAL and HCAL data into the FPGA we would require 36 (9x4) links running at 4.8Gb/s. The largest Xilinx Virtex 6 FPGAs do have this many links, however there is little spare capacity for extra trigger input.

Furthermore, it is currently envisaged that the data duplication would take place with a combination of large, high speed serial, protocol agnostic, cross-point switches and optical / µTCA backplane interconnects. It is not clear whether the links would be able to pass through many of these components, as they might have to, without regeneration to avoid the jitter becoming too large. The inefficient nature of the design would require a large number of cards (252). Lastly, the large number of cards would require the sorting stage to consist of two stages (i.e. passing through 2 cards) because of the large fan-in. This would impose additional latency.
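
The cost of the 3x3 arrangement can be made explicit with a short calculation. The sketch below assumes one processing card per region and 14 regions in η (regions ±1 to ±7, i.e. where both ECAL and HCAL are present); this decomposition is not stated in the text, but it reproduces the quoted figure of 252 cards.

```python
PHI_REGIONS = 18        # regions in phi, from the trigger geometry
ETA_REGIONS = 14        # assumption: regions +/-1 .. +/-7 (ECAL and HCAL coverage)
LINKS_PER_REGION = 4    # ECAL + HCAL links per region at 4.8Gb/s, as assumed earlier

def initial_concept_budget(window=3):
    """Links per card, card count and useful data fraction for a window x window array."""
    regions_in = window * window
    links_per_card = regions_in * LINKS_PER_REGION
    cards = PHI_REGIONS * ETA_REGIONS          # assumption: one card per central region
    useful_fraction = 1 / regions_in           # only the central region is processed
    return links_per_card, cards, useful_fraction

links, cards, fraction = initial_concept_budget()
print(links, cards, fraction)  # 36 links per card, 252 cards, 1/9 of the data used
```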

# VII. SPLIT FINE/COARSE PROCESSING

An alternative approach was therefore considered. It is the requirement to fully contain a jet that requires such a large overlap between processing regions. It was therefore decided to split the fine and coarse processing into two parts. The fine processing would have the bandwidth to provide an overlap of just one tower in the first dimension and have an entire region of overlap in the second dimension. The fine processing would concentrate on electron and tau detection whereas the coarse processing would be used for jet detection.



Figure 5: Two processing cards exchanging data to perform fine processing (i.e. creating electron/tau clusters). The two shaded regions on either side provide data to build clusters centered within the 3 middle regions.

The basic concept (fig. 5) is to receive 5 regions of data in η, although potentially it could be φ, and locate electrons and taus centred on the 3 central regions (or 3+1 regions when one region is at an η limit). Hence 4 cards could span from $\eta = -3.0$ to $+3.0$ (i.e. where there is both ECAL & HCAL coverage). The 4 cards would cover η regions -7 to -4, -3 to -1, +1 to +3, and +4 to +7. If we assume that we need 4 links per region at 4.8Gb/s to receive 12bits of data for both HCAL and ECAL information then we would expect to require 20 input links excluding any tracking information. However, the barrel/endcap boundary is arranged in such a way that it is probably not possible to merge the 4x1.2Gb/s links into a single 4.8Gb/s link (i.e. the data sources are in different locations) and it would be necessary to use 2x2.4Gb/s links. Hence we expect that the cards covering $\eta = -3$ to $-6$ and $\eta = +3$ to $+6$ would require 22 links; however, this would need verification from ECAL and HCAL cabling experts.

In the second dimension, which would nominally be $\varphi$, 4 bidirectional links would provide either the overlap information or possibly pre-clustered objects. The latter potentially allows far more useful information to be transferred, possibly even allowing full size jets to be built; however, this requires study because it would require a more complex algorithm. A very similar concept is used in the current GCT to successfully cluster jets. The 4 bidirectional links would be transmitted over either a custom µTCA backplane or QSFP optical cables.

There are 18 regions in  $\varphi$  and thus a full system would require 72 cards distributed across 8 µTCA crates, with a pair of crates for each η segment.

The simplest way of handling the jets is to coarse grain the data into 2x2 tower squares and transmit them to a jet processing stage. The 2x2 tower resolution is more than sufficient for jet processing and would combine very nicely with the jet information from the HF, which is already at a 2x2 tower resolution. The jet cards would cluster jets centred on an area that spanned $\frac{1}{2}$ of $\eta$ and 2 regions in $\varphi$, but they would have access to 1 extra region in both $\eta$ and $\varphi$ so that jet clusters could be built with a size up to 10x10 towers. The electrons and jets would then be sorted in terms of rank (i.e. importance) before being forwarded to the GT. It would require 4 cards to sort the electrons and 2 cards to sort the jets. The GT would receive up to 16 electrons and 16 taus (4 per η segment), 8 central jets from the HCAL Barrel & Endcap, and 8 forward jets from the Forward HCAL.

The design currently uses 22 input links and 8 sharing links, all running at 5.0Gb/s. There would also need to be a link for slow control over Ethernet and another for DAQ. Hence 32 links are used. It is assumed that the bandwidth for a tracking trigger would be substantially less as it is simply indicating the presence of a high transverse momentum track. A single input link would be sufficient to provide 1bit of information per tower.

A minimum of 36 links are therefore necessary if we wish to reserve up to 4 links for a tracking and possibly even muon information.

The Xilinx XC5VTX150T has 40x 5.0Gb/s links, and the latest announcements from Xilinx for the Virtex 6 range include up to 36x 6.5Gb/s links (XC6VLX550T) for the LXT series and 48x 6.5Gb/s links plus 24x 11Gb/s links for the HXT series (XC6VHX565T).
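
The link budget for one processing card can be tallied against the devices listed above. This is a minimal sketch using only the counts quoted in the text; for the HXT part it conservatively counts only the 6.5Gb/s links, which is an assumption rather than a statement about the device.

```python
# Serial link counts quoted in the text (HXT: 6.5Gb/s links only, by assumption)
FPGA_LINKS = {
    "XC5VTX150T": 40,
    "XC6VLX550T": 36,
    "XC6VHX565T": 48,
}

def card_link_budget(input_links=22, sharing_links=8, service_links=2, reserve_links=4):
    """Return (links used, links required including the tracking/muon reserve)."""
    used = input_links + sharing_links + service_links  # service = Ethernet + DAQ
    return used, used + reserve_links

used, required = card_link_budget()
fits = {name: count >= required for name, count in FPGA_LINKS.items()}

print(used, required)  # 32 links used, 36 required
print(fits)            # every listed device meets the 36-link requirement
```

With these numbers the XC6VLX550T meets the requirement with no margin, whereas the XC5VTX150T leaves 4 links spare.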

# VIII. PROCESSING CARDS

The Mini-T5 (fig. 6) is an attempt to build a processing card with the capabilities necessary to realise the system described above. The same card would be used for the fine (electron/tau) processing, coarse (jet) processing and subsequent sorts.



Figure 6: The Mini-T5 technology demonstrator card. SNAP12 optics would be mounted bottom right. QSFP optics are mounted in the middle of the right hand side. Power supplies are at the top. The Samtec differential headers and the AMC card edge connector are on the left hand side.

It is based on a Xilinx Virtex-5 XC5VTX150T-2FFG1759C in a double width AMC form factor. The FPGA offers 40 links running at up to 5Gb/s. It is pin compatible with the XC5VTX240T if extra logic or links are required. It also uses the same GTX transceivers used in the Virtex-6 and thus it should be possible to upgrade the board with minimal changes to the firmware when the large Virtex-6 FPGAs become available.

There are two types of optics. SNAP12s are unidirectional devices providing either 12 inputs or outputs at up to 6.5Gb/s. An interesting alternative is the PPOD from Avagotech, which is very similar, but rated up to 10Gb/s, however questions remain over availability to relatively low volume science experiments. QSFPs offer 4 bidirectional links at up to 10Gb/s, but often in only a cable format (i.e. no MTP connector). This doesn't allow the fan in/out of fibres often required by a physics experiment. The Mini-T5 has 2xSNAP12-Rx, 1xSNAP12-Tx and 2xQSFPs.

Additional high speed link I/O is provided on the backplane on ports 0-7 (i.e. common options and fat pipes on the µTCA specification). Ports 1 and 3 have the option of being switched to LVDS ports on the FPGA to allow for reception/transmission of fast control such as Timing, Trigger & Control (TTC) and Trigger Throttle System (TTS).

The card also has Samtec QTH/QSH series headers on either side of the card, which are each connected to up to 40 LVDS pairs that can operate up to 1.25Gb/s. Samtec offers flex cables for these connectors and thus it is possible to hook adjacent cards together with very low latency and with a bandwidth similar to that of the QSFP optical inter card connection. Alternatively, it is possible to install daughter cards for additional tracking trigger I/O.

The card also has an external AT32UC3A microprocessor for offloading appropriate tasks and for AMC card functionality. The design is finished and is passing through pre-manufacture checks before being submitted for manufacture.

# IX. LATENCY

The latency associated with serial links is unpleasant (typically ~100ns for both transmission and reception); however, such links offer an excellent way of bringing large amounts of data into an FPGA and provide electrical isolation between sub-systems. The CMS TDR [5] allocates $\leq 1$ µs for both RCT and GCT including input and output links. Hence if we wish to retain a reasonable amount of time for processing within FPGAs we must have a maximum of 2 serial link transmissions within a combined RCT and GCT.

In the Mini-T5 example the first serial link period is used to provide the overlap area for the electrons and pass the coarse 2x2 tower information to the jet processing cards. The second serial link period is used for transmitting the data to sorting cards.
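
The budget argument can be put in rough numbers. The sketch below takes one reading of the figures above: each serial hop costs about 100ns in total, and only the two hops inside the combined RCT and GCT are charged to the 1µs allocation. If the input and output links are also charged to it, `link_hops` should be raised accordingly; the function name and defaults are illustrative.

```python
BX_NS = 25  # bunch-crossing period in ns

def processing_time_ns(budget_ns=1000, link_hops=2, link_latency_ns=100):
    """Time left for FPGA processing after serial link latency is subtracted."""
    return budget_ns - link_hops * link_latency_ns

remaining = processing_time_ns()
print(remaining, remaining // BX_NS)  # 800 ns, i.e. 32 bunch crossings

# Charging the input and output links as well (4 hops in total)
print(processing_time_ns(link_hops=4))  # 600 ns
```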

# X. SERVICES

The MCH in a µTCA crate (fig. 7) provides GbE and clock distribution to each slot, however CMS would probably require additional functionality. For example the LHC clock needs to be extracted from the biphase mark encoded TTC signal, which is distributed at 1310nm on single mode fibre. The fast control information (i.e. Channels A/B) encoded on the TTC signal needs to be distributed in a constant latency, upgradeable manner (i.e. LVDS at 400 or 800Mb/s). Some systems (e.g. trigger) have a very high data bandwidth, but generate a relatively small amount of data. For these systems it would be useful to have a data concentrator or DAQ channel per card.



Figure 7: The Vadatech VT891 crate with 12 full size AMC slots and redundant MCH/PM slots may be a good choice for a standard CMS µTCA crate.

Trigger systems also need a lot of inter card data sharing. This can be accomplished by modifying an existing µTCA backplane. This is standard practice in the µTCA community and relatively inexpensive.

# XI. CONCLUSIONS

A compact trigger architecture has been presented that remains backwards compatible with the current CMS experiment. It could be easily extended to incorporate a tracking trigger. A single card design is used for the entire system, albeit loaded with 4 different firmware versions, of which 2 are very similar.

#### XII. ACKNOWLEDGEMENTS

We would like to thank Sarah Greenwood (Imperial College) for layout of the Mini-T5 card and STFC for financial support.

# XIII. REFERENCES

- [1] F. Zimmermann et al., "CERN Upgrade Plans for the LHC and its Injectors", CERN-sLHC-PROJECT-Report-0016, 2009
- [2] J. Jones et al., "The GCT Matrix Card and its Applications", These Proceedings, Paris, France, 2009
- [3] G. Iles et al., "Performance and lessons of the CMS Global Calorimeter Trigger", TWEPP-08, Naxos, Greece, 2008
- [4] The CMS Collaboration, S Chatrchyan et al., "The CMS experiment at the CERN LHC", JINST 3 S08004, 2008
- [5] The Trigger and Data Acquisition Project, Vol. I, The Level-1 Trigger, CERN/LHCC 2000-038, CMS TDR 6.1, 15 December 2000.

# Design Considerations for an Upgraded Track-Finding Processor in the Level-1 Endcap Muon Trigger of CMS for SLHC operations

D. Acosta<sup>a</sup>, M. Fisher<sup>a</sup>, I. Furic<sup>a</sup>, J. Gartner<sup>a</sup>, G.P. Di Giovanni<sup>a</sup>, A. Hammar<sup>a</sup>, K. Kotov<sup>a</sup>, A. Madorsky<sup>a</sup>, M. Matveev<sup>c</sup>, P. Padley<sup>c</sup>, L. Uvarov<sup>b</sup>, D. Wang<sup>a</sup>

> <sup>a</sup>University of Florida/Physics, POB 118440, Gainesville, FL, USA, 32611 <sup>b</sup>Petersburg Nuclear Physics Institute, Gatchina, Russia <sup>c</sup>Rice University, MS 315, 6100 Main Street, Houston, TX, USA, 77005 madorsky@phys.ufl.edu

#### *Abstract*

The conceptual design for a Level-1 muon track-finder trigger for the CMS endcap muon system is proposed that can accommodate the increased particle occupancy and system constraints of the proposed SLHC accelerator upgrade and the CMS detector upgrades. A brief review of the architecture of the current track-finder for LHC trigger operation is given, with potential bottlenecks indicated for SLHC operation. The upgraded track-finding processors described here would receive as many as two track segments detected from every cathode strip chamber comprising the endcap muon system, up to a total of 18 per 60° azimuthal sector. This would dramatically improve the efficiency of the track reconstruction in a high occupancy environment over the current design. However, such an improvement would require significantly higher bandwidth and logic resources. We propose to use the fastest available serial links, running asynchronously to the machine clock to use their full bandwidth. The work of creating a firmware model for the upgraded Sector Processor is in progress; details of its implementation will be discussed. Another enhancement critical for the overall Level-1 trigger capability for physics studies in phase 2 of the SLHC is to include the inner silicon tracking systems into the design of the Level-1 trigger.

# I. CMS ENDCAP MUON LEVEL-1 TRIGGER SYSTEM OVERVIEW

The CMS Endcap Muon system consists of 540 six-plane cathode strip chambers<sup>1</sup>. Strips, milled on the cathode panels, run radially in the endcap geometry and thus provide a precise measurement of the φ-coordinate. Wires are stretched across strips and define the radial coordinate of muon hits.

# *A. Generation of Trigger Primitives*

Electronic components responsible for the generation of trigger primitives include:

- Cathode Front End Board (CFEB), 5 per chamber
- Anode Local Charged Track board (ALCT), 1 per chamber
- Trigger Mother Board (TMB), 1 per chamber


The CMS Endcap Muon system is comprised of two endcaps. Each endcap consists of 4 layers of Cathode Strip Chambers (CSCs); these layers are commonly called "stations". Station ME1 is the closest to the Interaction Point (IP), station ME4 is the farthest.

For trigger purposes, each endcap is subdivided into six 60° sectors. Each sector is served by one Sector Processor (SP) board; there are 12 SPs in the Endcap Muon Trigger system. Each SP is implemented as a 9U VME board; all SPs are housed in one VME crate that is located in the CMS Underground Support Cavern (USC55).

The TMB associated with each chamber can provide up to two trigger primitives on any bunch crossing. Each trigger primitive contains the following information:

- Cathode hit coordinate (half-strip number)
- Cathode pattern type (measure of the track bend angle)
- Anode hit coordinate (wiregroup number)
- Anode pattern type (collision or halo track)
- Trigger primitive quality

The trigger primitives generated by TMBs are delivered to Muon Port Cards (MPC), also located in the Peripheral Crates. There is one MPC per station (9 chambers), except for station 1, which has 2 MPCs because it contains 18 chambers. Each MPC receives up to 18 trigger primitives per bunch crossing (BX). The MPC selects the best three trigger primitives out of 18, and sends them via 1.6 Gbps optical links to the Sector Processor.

# *B. Track reconstruction in Sector Processor*

The Sector Processor (SP) receives trigger primitives from MPCs associated with all stations in a specific sector, for a total of up to 15 primitives per BX. In addition to that, the Barrel Muon system (Drift Tube Chambers, or DT) delivers up to two trigger primitives from the region where it overlaps with the Endcap Muon system. If one or two more DT trigger primitives are available at the same BX, they can be delivered with a delay of one clock cycle.

Track reconstruction involves the following hardware modules:

# *1) Conversion of raw trigger primitives into geometrical parameters.*

In the current design, the conversion of raw trigger primitives into φ and η (pseudorapidity) is performed using large 2-stage look-up tables (LUTs). The amount of memory required to convert a single trigger primitive is around 4MB.

<sup>1</sup> 468 chambers installed and operational and 72 additional chambers (ME4/2) to be fabricated and installed.

# *2) Multiple Bunch Crossing Analysis (BXA)*

Cathode Strip Chambers may not report all the trigger primitives related to a certain track at the same BX; some trigger primitives are delivered with a delay of one or even two BXs because of the drift time of charged particles inside the chamber or imperfect synchronization. In order to build a track that has such delayed trigger primitives, the SP needs to analyze up to 2 BXs in addition to the current one. The BXA keeps the history of trigger primitives belonging to the two previous BXs. All trigger primitives (current and delayed, a total of 9) from each station are sorted on each BX, and the best three primitives are sent for further processing. This ensures that the tracks are built taking the highest quality primitives into account.
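
The behaviour of the BXA for one station can be illustrated with a small software model. This is a sketch only: the `(quality, label)` tuple and the class name are hypothetical, and the model does not attempt to reproduce how the firmware handles primitives that have already been used in an earlier BX.

```python
from collections import deque

class BxAnalyzer:
    """Toy model of one station's BXA: a 3-BX window, sorted, best three kept."""

    def __init__(self, depth=3, keep=3):
        self.history = deque(maxlen=depth)  # current BX plus the two previous ones
        self.keep = keep

    def clock(self, primitives):
        """primitives: list of (quality, label) tuples for the current BX (up to 3)."""
        self.history.append(list(primitives))
        window = [p for bx in self.history for p in bx]  # up to 9 primitives
        return sorted(window, key=lambda p: p[0], reverse=True)[:self.keep]

bxa = BxAnalyzer()
out1 = bxa.clock([(5, "a"), (2, "b")])
out2 = bxa.clock([(7, "c")])
out3 = bxa.clock([(1, "d"), (6, "e"), (3, "f")])
out4 = bxa.clock([])  # primitives from the first BX have now left the window
print(out3, out4)
```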

# *3) Extrapolation Units (EUs)*

Each EU checks that  $\varphi$  and  $\eta$  parameters of two trigger primitives from two different stations (A and B) are within certain limits (windows) from each other.

In the current Track-Finder design, almost all possible combinations of stations have to be extrapolated; this brings the total number of extrapolations<sup>2</sup> to 210. In addition, the EUs for the ME1-ME2 and ME1-ME3 extrapolations provide a 2-bit extrapolation quality based on the φ difference between the trigger primitives.
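
The window test performed by an EU can be sketched as follows. The dictionary layout, the window values and the quality thresholds are all hypothetical; only the structure (a φ window, an η window, and a 2-bit quality derived from the φ difference) comes from the description above.

```python
def extrapolate(prim_a, prim_b, phi_window, eta_window):
    """True if primitives from stations A and B agree within both windows."""
    dphi = abs(prim_a["phi"] - prim_b["phi"])
    deta = abs(prim_a["eta"] - prim_b["eta"])
    return dphi <= phi_window and deta <= eta_window

def phi_quality(dphi, thresholds=(8, 16, 32)):
    """2-bit quality (0..3): higher for smaller phi difference. Thresholds are illustrative."""
    return sum(dphi <= t for t in thresholds)

me1 = {"phi": 100, "eta": 10}
me2 = {"phi": 110, "eta": 11}
print(extrapolate(me1, me2, phi_window=16, eta_window=2))  # windows satisfied
print(phi_quality(abs(me1["phi"] - me2["phi"])))           # dphi = 10
```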

#### *4) Track Assembly Units (TAUs)*

Each TAU takes one particular trigger primitive from ME2, ME3, and ME4, and tries to find as many valid extrapolations as possible to other stations. If the search is successful, the TAU reports a possible track candidate. There are 12 TAUs for collision tracks and 6 for halo tracks (accelerator produced muons outside the beam pipe). Each track candidate receives a rank that encodes the stations and extrapolation qualities used to construct it. The rank reflects the "quality" of the track candidate: the higher the rank, the more stations have participated in the track.

# *5) Transverse Momentum (Pt) Assignment Units (PAU)*

The track assembly results from available primitives are delivered to PAUs. There is one $P_t$ Assignment Unit per TAU. These units identify the track segments used to build each track candidate, assign φ and η parameters (taken from the best available track segments) to track candidates, and calculate the φ difference for the best available 2 or 3 stations. On the output, they provide the address for the $P_t$ Assignment Lookup Table ($P_t$ LUT).

# *6) Final Selection Unit (FSU)*

There are two FSUs: one for collision and one for halo tracks. Each FSU receives the ranks of all track candidates (12 collision or 6 halo candidates). The FSU keeps a history of track candidates 2 BXs in the past, and selects the best three collision tracks or the single best halo track out of all available candidates. Simultaneously, it checks for tracks that have η and φ parameters close to each other. If such tracks are found, only the one with the highest rank is kept; all others are removed. This η+φ track cancellation is necessary because TAUs may sometimes produce different track candidates that correspond to a single physical track. One more reason for the cancellation is chamber drift time (see BXA unit description above). This leads to multiple track candidates created over a duration of up to 3 BXs, so taking the track candidate history into account becomes necessary to find the best tracks.
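
The selection and cancellation step can be modelled in software as a greedy pass over the candidates in rank order. This is a sketch under stated assumptions: the field names and tolerance values are hypothetical, and the caller is expected to pass in the candidates from the current BX together with the stored history.

```python
def final_selection(candidates, phi_tol, eta_tol, keep=3):
    """Keep the highest-ranked candidates, dropping any that lie close in
    both phi and eta to a candidate that has already been kept."""
    survivors = []
    for cand in sorted(candidates, key=lambda c: c["rank"], reverse=True):
        is_ghost = any(
            abs(cand["phi"] - s["phi"]) <= phi_tol
            and abs(cand["eta"] - s["eta"]) <= eta_tol
            for s in survivors
        )
        if not is_ghost:
            survivors.append(cand)
    return survivors[:keep]

cands = [
    {"rank": 10, "phi": 100, "eta": 10},
    {"rank": 8, "phi": 102, "eta": 10},   # duplicate of the rank-10 track
    {"rank": 6, "phi": 500, "eta": 20},
    {"rank": 4, "phi": 900, "eta": 5},
    {"rank": 2, "phi": 1500, "eta": 30},
]
best = final_selection(cands, phi_tol=4, eta_tol=2)
print([c["rank"] for c in best])
```

Setting `keep=1` gives the halo case, where a single best track is selected.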

# *7) Output Multiplexer (OM)*

The results of the final selection are delivered to the OM. This module passes the track parameters of the best tracks selected by FSUs to its outputs. Priority is given to collision tracks. A halo track (if found) is multiplexed to the first unused output.

# *8) BX Correction Unit (BXC)*

The final step in the Track-Finder logic is Bunch-Crossing number correction. For the best performance the timing for a track should be set to the BX when the second trigger primitive for it was received. The BXC applies a variable delay to the output tracks to make sure this timing requirement is satisfied.

# *9) P<sup>t</sup> Assignment Lookup Table (P<sup>t</sup> LUT)*

The $P_t$ LUT is a separate hardware module implemented as a memory IC. The address of this memory is provided by the SP logic and is formed by the $P_t$ Assignment Units (see PAU description above). The output includes the track $P_t$ encoded as a 5-bit value, the track quality (2-bit value), and a "valid" flag.

# II. TARGETING SLHC

<span id="page-284-0"></span>The current design of the CMS CSC Endcap Track-Finder is totally adequate up to the current LHC design luminosity. However, for the SLHC operation, there are a number of problems that have to be addressed. This section lists these problems and proposed solutions.

# *A. MPC filtering*

Currently, the MPC selects the best three trigger primitives out of 18 available. However, with a luminosity upgrade to $L = 10^{35}\ \mathrm{cm^{-2}\,s^{-1}}$, we can expect at least 7 trigger primitives per BX in every MPC. This number is based on simulations [\[1\]](#page-287-0), and in reality could be higher.

Our current intention is to design an upgraded Track-Finder that can process all available trigger primitives (2 per chamber, or 18 per MPC). This would allow us to reduce significantly the dependence on background hits in the CSCs, the rate of which is unknown at this time for both LHC and SLHC.

# *B. Optical link bandwidth*

Trigger primitives are delivered from MPCs to SPs using 1.6 Gbps optical links. To deliver 18 trigger primitives instead of 3, we will need 6 times the bandwidth that we have now. To accommodate that, data links with larger throughput have to be used.

<sup>2</sup> φ and η extrapolations are counted separately. The number shown is for SP with mezzanine card upgraded in 2008, and does not include halo extrapolations.

We are considering two options: faster optical links working at a higher bit rate (10 Gbps), or multi-channel links running at a moderate bit rate (1.6 to 2.4 Gbps). Both options seem to be suitable for our purposes. The 10 Gbps links require fewer fibers but have to be run asynchronously to the machine clock to reach full bandwidth. The parallel links can be run in "traditional" mode (synchronous to the machine clock), but require special multi-core fibers and more serializer-deserializer pairs.
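
The two options can be compared with simple arithmetic. The sketch below assumes that the required bandwidth scales linearly with the number of primitives and that the encoding overhead is unchanged; it works in integer Mbps to avoid rounding surprises, and the function names are illustrative.

```python
def required_mbps(primitives=18, current_primitives=3, current_link_mbps=1600):
    """Bandwidth needed per MPC if it scales linearly with the primitive count."""
    return current_link_mbps * primitives // current_primitives

def channels_needed(total_mbps, channel_mbps):
    """Number of parallel channels needed (ceiling division)."""
    return -(-total_mbps // channel_mbps)

total = required_mbps()
print(total)                          # 9600 Mbps, i.e. 6 x 1.6 Gbps
print(channels_needed(total, 10000))  # a single 10 Gbps link
print(channels_needed(total, 1600))   # 6 channels at 1.6 Gbps
print(channels_needed(total, 2400))   # 4 channels at 2.4 Gbps
```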

Removing the MPC trigger primitive filtering and upgrading the optical links will require a complete MPC redesign and a system-wide replacement (60 boards).

# <span id="page-285-0"></span>*C. Trigger primitive conversion to angular coordinates.*

Currently, this conversion requires 4MB of memory per primitive, which is unacceptable for the upgraded design. We plan to use FPGA logic combined with much smaller LUTs implemented inside the FPGA. The fact that we plan to receive trigger primitives from all chambers means that chamber numbers do not have to be explicitly analyzed during the conversion, which leads to savings in logic and LUT size.

# *1) Coordinate systems*

The angular coordinates that were used in the current SP design are not very convenient. For example, the φ coordinate uses 4096 values per 62° sector, which is ~0.015° per φ unit. The corresponding angular coordinate in trigger primitives is the half-strip number, with unit value of 0.06665° for the majority of chambers. If a $\varphi$ scale is selected that has the unit value of $0.06665/4 = 0.0166625^{\circ}$, the half-strip to $\varphi$ conversion for most of the chambers becomes as simple as adding or subtracting one value and then appending two least significant bits.
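
A minimal sketch of this conversion is given below. It reads the description as: apply a per-chamber offset, then shift left by two bits so that one half-strip corresponds to four φ units. The offset and its units (half-strips) are hypothetical, as is the function name.

```python
HALF_STRIP_DEG = 0.06665           # half-strip pitch for the majority of chambers
PHI_UNIT_DEG = HALF_STRIP_DEG / 4  # proposed phi unit, 0.0166625 degrees

def half_strip_to_phi(half_strip, chamber_offset):
    """Add the chamber offset (in half-strips), then append two zero LSBs (x4)."""
    return (half_strip + chamber_offset) << 2

phi = half_strip_to_phi(10, 5)
print(phi, phi * PHI_UNIT_DEG)   # 60 phi units, about 1 degree
print(62 / 4096)                 # current scale, about 0.015 degrees per phi unit
```

Because the scale factor is a power of two, no multiplier or look-up table is needed for these chambers.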

The wiregroup number arriving with trigger primitives is currently converted into an η coordinate. This is also not the optimal coordinate for further SP logic processing, since the η unit value is not constant relative to the angular value of that coordinate (known as $\theta$). Ideally, compensating for that would require the extrapolation windows for η EUs to depend on the absolute value of $\eta$; in other words, the closer the track is to the beam axis, the wider the extrapolation windows that should be used. This compensation cannot be implemented in the current SP design because of insufficient logic size, so some average η extrapolation windows are selected that allow for track reconstruction of sufficient quality.

For the SLHC SP design, we intend to convert the wiregroup to  $\theta$  directly. This would allow for uniform extrapolation windows with no dependence on θ.

At the end of the pipelined logic, when the best three tracks are identified, the SP will still assign φ and η values to them as required along with any alignment corrections of the chamber positions for the best accuracy. However, this assignment for just three tracks consumes a very small amount of logic resources.

# <span id="page-285-1"></span>*2) Half-strip to φ conversion*

The track-finding algorithm can operate using a φ coordinate limited in precision to one strip in ME1/2, ME2/2, ME3/2, and ME4/2 chambers (0.1333º). This significantly reduces logic resources without compromising the performance.

The half-strip coordinate is first multiplied by a certain factor. For most chambers this factor is ½, which is equivalent to removing the least significant bit (LSB). For some chambers, this factor is exactly 1 (no operation). Finally, for a relatively small number of chambers, this factor is a certain "inconvenient" number, so an internal FPGA multiplier or LUT has to be used. The list of chamber types and corresponding factors is shown in Table 1.

Table 1: Multiplication factors for  $\varphi$  conversion.<sup>3</sup>

| <b>Chamber type</b>        | Strip angle      | F                          |
|----------------------------|------------------|----------------------------|
| ME1/2, ME2/2, ME3/2, ME4/2 | $0.1333^{\circ}$ | $\frac{1}{2}$ (remove LSB) |
| ME2/1, ME3/1, ME4/1        | $0.2666^{\circ}$ | 1 (no operation)           |
| ME1/1a                     | $0.2222^{\circ}$ | 0.8335                     |
| ME1/1b                     | $0.1695^{\circ}$ | 0.636                      |
| ME1/3                      | $0.1233^{\circ}$ | 0.4625                     |

When the best three tracks are identified, the Track Finder will still need to assign the precise φ values to them. However, the conversion to full-precision  $\varphi$  has to be done for only 3 trigger primitives, which leads to logic size reduction.
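The three cases in Table 1 can be sketched as follows. The factors come from the table; the fixed-point width `FRAC_BITS` and the function name are assumptions made for illustration, standing in for whatever precision the FPGA multiplier or LUT would actually use.

```python
# Multiplication factors from Table 1, keyed by chamber type.
FACTORS = {
    "ME1/2": 0.5, "ME2/2": 0.5, "ME3/2": 0.5, "ME4/2": 0.5,
    "ME2/1": 1.0, "ME3/1": 1.0, "ME4/1": 1.0,
    "ME1/1a": 0.8335, "ME1/1b": 0.636, "ME1/3": 0.4625,
}

FRAC_BITS = 10  # assumed fixed-point precision of the multiplier


def scale_half_strip(chamber: str, half_strip: int) -> int:
    """Scale a half-strip number to the common coarse phi unit (0.1333 deg)."""
    factor = FACTORS[chamber]
    if factor == 0.5:
        return half_strip >> 1      # remove the LSB
    if factor == 1.0:
        return half_strip           # no operation
    # "Inconvenient" factor: multiply by a pre-scaled integer constant and
    # shift back, as a hardware multiplier would.
    coeff = round(factor * (1 << FRAC_BITS))
    return (half_strip * coeff) >> FRAC_BITS
```

Only the ME1/1a, ME1/1b and ME1/3 branches cost a multiplier; the other seven chamber types reduce to wiring.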

#### *3) Wiregroup to θ conversion*

For the majority of chambers, this conversion can be done by a small LUT. It takes the wiregroup number as input, and provides a 7-bit θ value on the output.

The exception is the ME1/1 chambers, because of their unique tilted-wire design [2]. The SP may receive two half-strip numbers and two wiregroup numbers on each BX from such chambers, and it is impossible to match each of these half-strip numbers to one particular wiregroup number, so all combinations have to be taken into account. This requires each wiregroup parameter to be converted into two distinct θ outputs, or "duplicated". Figure 1 shows a graphical representation of the problem.



Figure 1: θ duplication in ME1/1 chambers

The current SP design does not implement this logic. To allow for using ME1/1 trigger primitives in the SP track reconstruction, η extrapolation windows are made wide enough to be insensitive to ME1/1 wire tilt. This should work fine for LHC, but with increased SLHC background tighter extrapolation windows may become necessary.

<sup>3</sup> This table shows strip angle for each chamber type. Half-strip angle can be calculated by dividing strip angle by 2.

The proposed wiregroup to $\theta$ conversion schematic is shown in Figure 2.



Figure 2: ME1/1 wiregroup to  $\theta$  conversion

The 6-bit wiregroup number is converted into a base $\theta$ value by an LUT. Simultaneously, two other LUTs that take strip numbers and the 2 most significant bits of the wiregroup as inputs produce 4-bit correction values, which are added to the base $\theta$ value and form the duplicated $\theta$ outputs.
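The data flow of Figure 2 can be illustrated with a small sketch. The LUT contents below are placeholders invented for the example (the real values depend on the ME1/1 wire-tilt geometry), and the names `BASE_THETA_LUT`, `correction_lut` and `wiregroup_to_theta` are hypothetical.

```python
WG_BITS = 6  # wiregroup number width, from the text

# Placeholder base LUT: one theta value per 6-bit wiregroup number.
BASE_THETA_LUT = list(range(1 << WG_BITS))


def correction_lut(strip: int, wg_msb2: int) -> int:
    """Hypothetical 4-bit tilt correction for a strip and wiregroup region."""
    return ((strip >> 3) + wg_msb2) & 0xF


def wiregroup_to_theta(wiregroup: int, strip_a: int, strip_b: int):
    """Return the two duplicated theta values for one ME1/1 wiregroup."""
    base = BASE_THETA_LUT[wiregroup]
    msb2 = wiregroup >> (WG_BITS - 2)        # 2 most significant bits
    theta_a = base + correction_lut(strip_a, msb2)
    theta_b = base + correction_lut(strip_b, msb2)
    return theta_a, theta_b
```

Each wiregroup yields one θ per strip candidate, so the downstream logic can try both combinations without knowing which strip truly belongs to which wiregroup.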

# *D. Geometry constraints for track building*

In the current SP design, we have to consider almost all combinations of trigger primitives since each of them may come from any chamber in the station. In the proposed upgraded design, since we receive all primitives from all chambers without filtering, it is possible to implement logic only for the physically allowed chamber combinations.

There are two considerations that must be taken into account:

- Track bending in the magnetic field is limited. The φ difference between primitives created by a single track in any two stations cannot be more than $\sim 10^{\circ}$.
- The track projection in the $\theta$ direction is a straight line; bending happens only in the φ projection. Therefore, a chamber coverage map in $\theta$ must be used to select valid chamber combinations.

Figure 3 shows such a map. As an example, one can clearly see that extrapolations between chambers ME1/2 and ME3/1 are not necessary because any single track originating at the Interaction Point (IP) cannot cross both of these chamber types. There are many other chamber type combinations that do not have to be considered. Note that for halo tracks, the chamber combinations would be different.

Using the above constraints, the track building maps were generated. Examples of such maps for collision tracks are shown in Figure 4.
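The θ constraint amounts to an interval-overlap test between chamber types. The sketch below shows the idea only: the coverage ranges are invented placeholders in arbitrary units, chosen so that ME1/2 and ME3/1 do not overlap as in the example above, and they are not the real CSC geometry.

```python
from itertools import product

# Placeholder theta coverage per chamber type, in arbitrary theta units.
# These ranges are invented for illustration, NOT the real CSC geometry.
THETA_COVERAGE = {
    "ME1/1": (0, 40),
    "ME1/2": (42, 90),
    "ME3/1": (0, 38),
    "ME3/2": (36, 90),
}


def overlaps(range_a, range_b):
    """True if two closed theta intervals share at least one value."""
    return range_a[0] <= range_b[1] and range_b[0] <= range_a[1]


def allowed_pairs(chambers_a, chambers_b, coverage=THETA_COVERAGE):
    """Chamber-type pairs that a track straight in theta could cross."""
    return [
        (a, b)
        for a, b in product(chambers_a, chambers_b)
        if overlaps(coverage[a], coverage[b])
    ]
```

Extrapolation units would then be instantiated only for the pairs returned by `allowed_pairs`, with the φ bending limit applied as a further cut.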

# *E. Upgraded design – implementation of modules*

#### *1) Extrapolation Units*

Since the CSC is not a pixel-type detector, when two trigger primitives are available from a certain chamber it is impossible to tell which half-strip coordinate corresponds to which wiregroup. This leads to additional complexities in the design of the track-finder because all combinations of half-strip and wiregroup coordinates should be analyzed. The current SP design takes this into account only for ME1 trigger primitives; ME2, ME3, and ME4 trigger primitives are assumed to have a perfect match between half-strip and wiregroup coordinates, which is a trade-off. In the upgraded design, we must take this into account for all stations.



Figure 3: θ coverage map.



$\bigoplus$ means a path to the chamber directly behind.

Figure 4: Track building maps for ME1 $\rightarrow$ ME2 and ME1 $\rightarrow$ ME3 extrapolations and track assembly.

Even with the geometry constraints shown above, the number of extrapolation units in the upgraded design will grow significantly. As can be seen from Table 2, the total number of required extrapolations is 2352 (1100 φ EUs plus 1252 θ EUs), which is $\sim$11 times more than in the current design.

#### *2) Final Selection Unit*

Such a large number of track candidates (54 collision and 54 halo) leads to a huge growth in final selection and $\varphi$+$\theta$ cancellation logic. Since we need to keep the latency as low as possible, the implementation of selection and cancellation logic is very straightforward: each candidate has to be compared with every other candidate simultaneously. The number of such comparisons is proportional to the square of the number of candidates. This means that the logic size for the FSU will grow relative to the current design by a factor of $\sim$20. Taking into account that the FSU already occupies the largest part of the logic in the current design, it may become problematic to select a suitable FPGA for such an upgraded design. The present SP board uses an FPGA from Xilinx's Virtex-5 family (XC5VLX155). The largest FPGA that should soon be available, the XC6VLX760, is just 5 times bigger.
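The quadratic growth can be made concrete with a short calculation. This sketch assumes one comparator per unordered pair of candidates, which is the simplest reading of a fully parallel selection; the function names are illustrative.

```python
def pairwise_comparisons(n_candidates: int) -> int:
    """Number of unordered candidate pairs compared in parallel."""
    return n_candidates * (n_candidates - 1) // 2


def growth_factor(n_old: int, n_new: int) -> float:
    """Ratio of comparator counts when the candidate count changes."""
    return pairwise_comparisons(n_new) / pairwise_comparisons(n_old)
```

For the 54 collision candidates quoted above, `pairwise_comparisons(54)` gives 1431 comparators, and doubling the candidate count roughly quadruples that number.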

| <b>Extrapolation</b> | φ EU | θ EU |
|----------------------|------|------|
| ME1-ME2              | 208  | 248  |
| ME1-ME3              | 232  | 336  |
| ME1-ME4              | 168  | 272  |
| ME2-ME3              | 132  | 132  |
| ME2-ME4              | 132  | 132  |
| ME3-ME4              | 132  | 132  |
| ME1-MB1              | 48   | 0    |
| ME2-MB1              | 48   | 0    |
| Total                | 1100 | 1252 |

Table 2: Numbers of  $\varphi$  and  $\theta$  extrapolations<sup>4</sup>

# *F. Other modules*

Implementation of other modules should not lead to any problems with FPGA capacity because the amount of logic they occupy is small relative to EUs and FSUs, and the logic size grows in direct proportion (not square) to the number of track candidates.

# *G. Pattern-based track reconstruction*

Taking into account possible implementation problems of the upgraded SP logic based on our current design, we have decided to evaluate an approach that can lead to significant logic size savings while providing all the functionality that is required for SLHC operation. It is very similar to the pattern search logic used in front-end boards, such as the ALCT. In the case of the Sector Processor, the pattern is created from the trigger primitives in chambers, so 4 "layers" of chambers are considered by pattern detectors. Besides logic size reduction, other benefits of this approach include:

- "Natural" ability to analyze multiple bunch-crossings.
- Virtually ghost-free track candidates, which improves the quality of final tracks reported to Global Trigger, reduces the size of selection logic and eliminates cancellation logic.
- Track timing is automatically set by the second trigger primitive. In the current SP design, we had to implement a special module and increase the latency to achieve that.

There are separate pattern detectors for $\varphi$ and $\theta$ projections. The sector is split into 5 $\varphi$ zones and 6 $\theta$ zones defined by the chamber coverage map; each zone has its own independent pattern detector.

The preliminary structure of the pattern used for φ zones is shown in Figure 5. Before φ pattern detectors can be applied, trigger primitives from each chamber are decoded as described in section [II](#page-284-0)[.C](#page-285-0)[.2\).](#page-285-1) Then, "raw hits" are recreated inside the FPGA logic. Each dot on the diagram represents a certain number of raw hits ORed together; this way, sufficient φ coverage is achieved while keeping the logic size of the pattern detector relatively small. The number of ORed hits for each dot is shown above the ME1 station. Such a structure allows for precise detection of high-$P_t$ tracks; low-$P_t$ tracks are detected with much lower precision, which is acceptable.
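A much simplified version of such a pattern detector is sketched below. It models a single straight pattern with one ORed window per station; the window widths, the coincidence threshold `min_layers` and the function names are assumptions for illustration, not the structure of Figure 5.

```python
def or_window(hits, centre, half_width):
    """OR together the raw hits within +/- half_width of centre."""
    lo = max(0, centre - half_width)
    hi = min(len(hits), centre + half_width + 1)
    return any(hits[lo:hi])


def pattern_fires(layers, key, half_widths, min_layers=3):
    """Return True if enough stations have a hit inside the pattern.

    layers      -- four lists of 0/1 raw hits, one per station (ME1..ME4)
    key         -- phi position of the pattern centre
    half_widths -- OR-window half-width per station (assumed values)
    min_layers  -- assumed coincidence threshold
    """
    matched = sum(or_window(h, key, w) for h, w in zip(layers, half_widths))
    return matched >= min_layers
```

Wider windows in the outer stations accept tracks that bend more, which is how a single detector can cover low-$P_t$ tracks at reduced precision.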



Figure 5: Possible pattern structure for φ zones

# *H. CSC+Tracker = Better Trigger*

One more important direction which is being investigated is the challenge of matching CSC triggers with an inner silicon Tracker. By doing this, we should be able to reach better rate reduction using the Tracker to confirm CSC trigger candidates, and improve track fitting.

# III. CONCLUSIONS

We are moving ahead quickly with the hardware-independent design and simulation of the logic blocks for the upgraded Track-Finder. So far, importing all available trigger primitives seems possible. If some serious obstacles are encountered that would prevent us from doing that, we will consider returning to trigger primitive filtering in MPC (7 primitives per BX from each MPC).

Additionally, simulations are being developed for matching CSC and Tracker trigger primitives to achieve better trigger system performance.

#### IV. REFERENCES

- <span id="page-287-0"></span>1. "US CMS SLHC Muon Trigger R&D", presentation by Darin Acosta (University of Florida), [http://indico.cern.ch/getFile.py/access?contribId=47&sessionId=5&resId=1&materialId=slides&confId=48781](http://indico.cern.ch/getFile.py/access?contribId=47&sessionId=5&resId=1&materialId=slides&confId=48781)
- 2. I.A.Golutvin et al, "ME1/1 cathode strip chambers for CMS experiment", Physics of Particles and Nuclei Letters, Volume 6, Number 4 / July, 2009

<sup>4</sup> Does not include halo extrapolations. For ME1 extrapolations, there are more $\theta$ EUs than $\varphi$ EUs because of ME1/1 θ duplication.

## The GCT Matrix Card and its Applications

J. Jones<sup>a</sup>, C. Foudas<sup>b</sup>, G. Iles<sup>b</sup>, M. Hansen<sup>c</sup>

<sup>a</sup> Princeton University, Princeton, NJ, USA <sup>b</sup> Imperial College, London, UK. <sup>c</sup> CERN, Switzerland

neutrinodeathray@gmail.com

## *Abstract*

The Matrix card is the first in what is expected to be a series of xTCA cards produced for a variety of projects at CMS. It was developed as a joint collaboration between colleagues at Princeton, Imperial College, LANL and CERN. The device comprises the latest generation of readily available Xilinx FPGAs, cross-point switch technology and high-density optical links in a 3U form factor. In this paper we will discuss the development and test results of the Matrix card, followed by some of the tasks to which it is being applied.

#### I. INTRODUCTION

The Matrix card was originally designed as part of the CMS GCT Muon and Quiet Bit System [1]. As such it was developed to provide a combination of reconfigurable optical links and firmware that can be adapted to different tasks without the redesign of the hardware itself. In this paper we will discuss the board's design and the testing of the prototypes. This includes the infrastructure required to control the board (based on Ethernet). The I/O and computing performance of the card have been studied in detail and these results are also discussed. Since the production of two prototypes, the board has been included in the design of a number of projects, including the LLRF control system for the FERMI free electron laser at Trieste and the calorimeter trigger upgrade project at CMS. In the FERMI project, the Matrix card provides a central timing and control point for the RF system. For the calorimeter trigger, its flexibility allows for changes in the algorithms without modification of the basic hardware and a reduction in latency by utilising wire-speed data duplication.

## *A. Card Specifications*

The Matrix card design has been specified previously in [1][2][3]. In summary, it is a 3U (standard width), full height Advanced Mezzanine Card (AMC). The key components are MTP optics (SNAP12 and POP4), a large Xilinx Virtex-5 FPGA (XC5VLX110T-3) and a Mindspeed 21141 72x72 4Gb/s protocol-agnostic cross-point switch. The latter of these components is the key feature of this design, allowing the reconfiguration of a system to handle different processing topologies. It also provides the possibility of wire-speed data duplication and dynamic redundancy management.



Figure 1: Schematic of the matrix card. 16 input and output channels are provided by the MTP optics on the front of the card while there are 20 channels on the edge connector that plugs into the backplane.

A variety of host functionality is required for an AMC, and this is provided by an NXP LPC2366 micro-controller. The controller is also responsible for programming the FPGA and its corresponding FLASH PROM, and has its own dedicated Ethernet interface, shown in figure 2. This interface is only used for testing. Reprogramming of the board in a crate can also be achieved using  $I^2C$  over the backplane.

## *B. Prototype Testing*

A prototype board was received from manufacture in December 2008. Since then the design has been extensively tested. Several minor flaws were discovered in the original design. However, none of these were critical, as most of them involved design oversights resulting in missing bias resistors on non-BGA components or configuration lines to the micro-controller. All of these faults were corrected by board rework.

The clock system has been tested, including driving it to and from a standard micro-TCA backplane and MCH. No issues have been observed.

The DDR2 memory has been tested at 300MHz (600Mb/s per pin DDR), with no errors during a 24 hour test period on one of the prototypes. However, the tests carried out so far are not believed to be thorough enough to guarantee long-term reliability and further study is required.

For the micro-controller, a UDP/IP firmware has been implemented and tested allowing 4MB/s communication with the board (performance is limited by the micro-controller clock frequency). A packetized FIFO interface has been built that connects the micro-controller and the FPGA. Further to this a programming interface has been developed that allows the Xilinx Impact tools to view the FPGA and PROM as devices attached to a parallel cable, whereas in fact the JTAG control is forwarded over a UDP interface to the LPC2366. This allows for seamless reprogramming of the board.

The serial links and cross-point switch have been tested extensively on all channels at a line rate of more than 3Gb/s. Results show a BER of less than one part in $10^{12}$ at 95% C.L. in most cases. However, six of the transmitter links have shown data instability which has been traced to a correlation with noise from the switching regulators on the board. This issue is currently under investigation.



Figure 2: Top view of a Matrix card. MTP optics, the FPGA, crosspoint switch and Ethernet interface can be seen as well as various power regulators.



Figure 3: Bottom view of a Matrix card. The DDR2 memory, edge connector and micro-controller are visible. Also note the large number of capacitors.

## II. THE CMS TRIGGER UPGRADE

It is envisaged that from 2011 onwards the CMS Level 1 trigger will be progressively upgraded to adapt to the physics requirements of the experiment. The Matrix card is expected to play a key role in this process as a development tool for new algorithms. Based on this a new implementation of the trigger system has been studied. This new algorithm will be described in the context of the calorimeter trigger.

#### *A. The Current Calorimeter Trigger*

The calorimeter trigger of CMS can be divided into four distinct components: the first of these is the front end of the detector and its corresponding readout system off-detector, which produces trigger primitives (energy clusters) and is therefore called a Trigger Primitive Generator (TPG). There are two kinds of calorimeter TPGs in CMS, those that come from the hadronic calorimeter and those that come from the electromagnetic calorimeter. The second link in the chain is the Regional Calorimeter Trigger (RCT), which performs electron finding and coarse-graining of data. The third component is the Global Calorimeter Trigger (GCT), which sorts the electrons by energy and searches for jets using the coarse-grained information from the RCT. Finally, the results of this process are passed to the Global Trigger (GT), which makes a decision on whether the data collected about the proton collision is worth saving based on the summary information provided by the GCT and Global Muon Trigger (GMT – not described here). In CMS, the top four candidates ranked by energy of every type of trigger object (jet, electron, etc.) are used to make this decision.

A multi-layered system like this creates several complications when considering changes to the trigger system. Most improvements in processing algorithms will require a corresponding increase in the data density of the processing system, and so ideally one would wish to merge the RCT, GCT and GT into a combined processing unit. The front-end is an exception to this because its output bandwidth is determined by the capabilities of the on-detector digitisation and readout electronics, which in turn are determined by many factors (e.g. power consumption) that are not critical for the off-detector components of the trigger system. In CMS each stage apart from the front end results in approximately a 20x reduction in data rate. An obvious target for an improved reconstruction path in CMS would be the use of full-resolution information in the reconstruction of jets, as is currently used in the Higher Level Trigger (HLT) [5]. Even with the advent of modern FPGAs with fast serial links, a brute-force attempt at this often runs into several issues [4]. Ultimately, increasing the input bandwidth of a processing system does not resolve scaling issues until the bandwidth of the link technology significantly exceeds the bandwidth requirements of data sharing imposed by the size of a trigger object. As a result of the typical size of a jet in CMS, the data sharing fraction required to contain it is significant. In fact this is a key reason why the RCT and GCT were separated in CMS in the first place, combined with the fact that serial link technology was far slower ten years ago than it is today.

## *B. A Future Calorimeter Trigger*

It is often stated that a serious issue pertaining to the use of serial links in a trigger system is their latency. In the context of the latest generation of modern hardware, this statement can be seen to be incorrect for two reasons:

Firstly, the latest generation of serial links (as found in a Xilinx Virtex-5) are capable of operating in an extremely low-latency mode, using fewer than four bunch-crossings of latency to serialise and de-serialise a parallel data stream. At a 6.5Gb/s line rate, this decreases to a latency similar to that of a standard I/O.

Secondly, serial links were not designed to be used for simplistic processing in geometric fashion where the data remains in a given device (FPGA) for a very brief time (~50ns). One should attempt to pipeline a processing algorithm and process data within a single device for as long as possible. The name itself implies the correct serial link usage model: *serialise*.

Based on the second of these points, we have considered a radically different topology for a future trigger system. The topology lends itself to comparison with the HLT in CMS, which is a time-multiplexed system where many PCs are used. A single PC is responsible for processing all the data in any given event over many bunch crossings.

In the current CMS trigger the TPG system receives approximately the same number of fibres as it transmits. Each input fibre transmits the data representing a specific detector region for each bunch crossing; in other words, the dimension of time flows through the fibre whereas the dimensions of eta and phi flow across the fibres themselves.

One can imagine a system where this is not the case, but instead the dimension of phi flows through the fibre for a given bunch crossing, and the dimensions of eta and time flow across the output fibres up to a user-defined granularity (see figure 4).



Figure 4: Time-multiplexed serialisation. The input data to the TPGs arrives in phi, eta segments per bunch crossing. A dynamic multiplexer is implemented in the TPG that converts this so that the output fibres have an entire detector segment in phi in each outgoing fibre.

This data ordering cannot be achieved on-detector, given that its fundamental mode of operation is to capture data in a time-ordered fashion. However, the TPG is capable of reordering this data in such a way.

This creates a latency penalty equal to the number of bunch crossings of delay introduced by the multiplexer (which itself is equal to the number of fibres entering the TPG). In typical implementations that have been studied, a 16:16 multiplexer was implemented at a resolution of 32 bits per channel. Such an implementation has a synthesised resource utilisation of 2% of an FPGA similar to the one on the Matrix card. The estimated maximum clock speed is so high as to have no effect on any algorithm implemented in the device (Xilinx tools estimate the performance at approximately 1GHz). 32-bit resolution corresponds to a 6.5Gb/s link at a quarter of the byte clock frequency (~160MHz). One of the important advantages of this implementation, which will be discussed later, is that redundancy can be easily incorporated into such a system by expanding the output bandwidth of the TPGs.
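The reordering and its cost can be sketched in a few lines. The round-robin assignment of bunch crossings to output fibres is an assumption about how the multiplexer might be scheduled, the 10 line bits per byte assume 8b/10b encoding (not stated in the text), and all function names are illustrative.

```python
BX_NS = 25.0  # LHC bunch-crossing period in ns (40MHz)


def time_multiplex(frames, n_fibres):
    """Reorder data so each output fibre carries whole bunch crossings.

    frames[bx][i] is the word arriving on input fibre i at crossing bx.
    Returns outputs[j], the list of (bx, words) sent on output fibre j,
    with crossings assigned round-robin (an assumed schedule).
    """
    outputs = [[] for _ in range(n_fibres)]
    for bx, words in enumerate(frames):
        outputs[bx % n_fibres].append((bx, list(words)))
    return outputs


def mux_latency_ns(n_fibres: int) -> float:
    """Latency penalty: one bunch crossing per fibre entering the TPG."""
    return n_fibres * BX_NS


def word_clock_mhz(line_rate_bps: float, line_bits_per_byte: int = 10,
                   bytes_per_word: int = 4) -> float:
    """Clock of the 32-bit word interface for a given serial line rate."""
    return line_rate_bps / line_bits_per_byte / bytes_per_word / 1e6
```

For a 16:16 multiplexer this gives a 400ns penalty, and `word_clock_mhz(6.5e9)` returns 162.5, consistent with the ~160MHz quoted above.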

At first it might seem strange to deliberately delay the data processing chain at the start for no obvious gain. However the benefits further along the processing chain more than outweigh the additional TPG latency.

In the context of CMS, such a system can absorb an entire phi-ring of data from the calorimeter in a single fibre when using a serial link operating at the peak line rate of a Matrix card (3.75Gb/s). Corresponding to 72 towers in phi, this implementation eliminates *all* boundary data sharing in that dimension and therefore also allows the serialisation of the processing algorithms, something that has never been previously achievable. Hence one observes a dramatic improvement in clock speed from the pipelined processing architecture, a task that FPGAs are well suited to.
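A rough capacity check shows why one fibre can hold a phi ring. The calculation below assumes that each output fibre has 16 bunch crossings in which to ship one event (matching the 16:16 multiplexer discussed earlier) and that the link uses 8b/10b encoding, leaving 80% of the line rate for payload; neither assumption is stated explicitly in the text.

```python
BX_NS = 25.0          # LHC bunch-crossing period in ns
TOWERS_IN_PHI = 72    # towers in one phi ring (from the text)


def payload_bits_per_tower(line_rate_gbps: float, mux_period_bx: int = 16,
                           coding_efficiency: float = 0.8) -> float:
    """Payload bits available per tower when one fibre carries a phi ring."""
    raw_bits = line_rate_gbps * BX_NS * mux_period_bx   # Gb/s * ns = bits
    return raw_bits * coding_efficiency / TOWERS_IN_PHI
```

At 3.75Gb/s this comes to about 16.7 payload bits per tower under the stated assumptions.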

When considered in equivalent terms, a traditional brute-force approach would result in a data sharing link to input link ratio of approximately 32:1 at 6.5Gb/s line rate for a full granularity processing system in phi and a quarter resolution in eta. The slowest line rate at which the links are even usable in a traditional scheme is 6.5Gb/s due to the data sharing constraints. By contrast the new system requires a sharing ratio of approximately 2:1 at a line rate of 3.75Gb/s. Such a system can be achieved using 16 copies of a 10 Matrix card system, or 160 cards in total. If one desires a full-granularity system, adding a further six cards per partition makes this achievable. For the brute-force approach this would result in approximately a 129:1 data sharing ratio and several thousand processing cards, which is completely infeasible. Table 1 shows the relationship between line rate and data sharing for each architecture.

Table 1: Data sharing ratios for different link speeds and architectures at CMS. These calculations assume a processing card with sixteen inputs and sixteen outputs, like the Matrix card. The numbers in brackets are the number of processing cards required for the implementation of a full trigger system.

| Architecture                        | 3.75Gb/s   | 6.5Gb/s    |
|-------------------------------------|------------|------------|
| Serialised, partial granularity     | 1.82 (160) | 1.27 (64)  |
| Non-serialised, partial granularity | N/A        | 32 (1440)  |
| Serialised, full granularity        | 2.91 (256) | 1.39 (64)  |
| Non-serialised, full granularity    | N/A        | 129 (5544) |
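The card counts in the 3.75Gb/s column follow directly from the partition sizes given in the text, as this small check shows. The constant and function names are illustrative.

```python
PARTITIONS = 16             # identical time-multiplexed copies (from the text)
CARDS_PARTIAL = 10          # Matrix cards per partition, partial granularity
EXTRA_CARDS_FULL = 6        # further cards per partition for full granularity


def total_cards(cards_per_partition: int, partitions: int = PARTITIONS) -> int:
    """Total processing cards for a full trigger system."""
    return partitions * cards_per_partition


partial = total_cards(CARDS_PARTIAL)
full = total_cards(CARDS_PARTIAL + EXTRA_CARDS_FULL)
```

Here `partial` evaluates to 160 and `full` to 256, against 1440 and 5544 cards for the non-serialised alternatives at 6.5Gb/s.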

A unique feature of this new approach is that the trigger system after the TPGs is effectively split into N identical modules (most likely individual processing crates), one of which might look like the one shown in figure 5.



Figure 5: A possible configuration for a trigger partition in a new trigger system. The input data from the TPGs for a given time slice would be received through 88 fibres into 10 matrix cards. Each card would share 4 fibres with each nearest neighbour, corresponding to the overlap region of a coarse-grained jet finder. The results would be sent to a final decision card for sorting. A matrix card would be inappropriate for the final decision card as it has too few input links, so a new board called the Mini-T is under construction that has a higher link capacity. In addition, an auxiliary card is required to provide CMS interfaces.

This system conveys several advantages:

- **System redundancy**: by providing additional spare output channels at the TPG, backup crates can be included that take over from a failed partition at runtime. Furthermore, if one does fail, it results in increased trigger dead time rather than a blind spot in the detector.
- **System reliability**: reduced data sharing requirements lower the demands on system connectivity. Ultra high-speed links are harder to manufacture and use, and should be avoided if possible.
- **Capacity for future expansion**: corresponding to the previous point, the lower line rate / fibre usage provides room for the addition of muon and tracker information in the future.
- **Separate testing partitions**: during periods when the LHC beam is not available, the trigger can be split into its partitions. Rather than requiring an individual sub-detector to only use their system component, a full trigger chain can be made available to each one for a 'slice' of the time. Furthermore, the full trigger chain can be easily tested in a small setup at an individual institution.
- **Ease of understanding**: the system is only partitioned in one dimension, making the individual processing elements far easier to understand.
- **Processing speed**: while the initial multiplexer loses most likely eight or sixteen bunch crossings by serialisation of the data, the final sorting algorithm gains a similar performance benefit, negating the effect. Furthermore, the serialisation of the processing algorithm results in significantly higher clock speed. In the GCT the processing system runs at 40MHz. Studies of the new system show that it will operate at over 200MHz.

## III. TIME SYNCHRONISATION AT FERMI

FERMI is a 4<sup>th</sup> generation Free Electron Laser (FEL) light source, currently under construction at the Synchrotrone Trieste site in Italy [6]. It operates as an approximately 3GHz RF system with a few components operating at approximately 12GHz using Travelling Wave Tubes (TWTs) for electron acceleration. As with all FELs, the quality of the light source is directly dependent on the accuracy of the phase and amplitude of the power driving each TWT in the system. In Trieste these constraints are very difficult to achieve, with a specification of less than 0.1 degrees error in phase relative to a master time reference and less than 0.1% in amplitude per cavity. 0.1 degrees of phase at 3GHz corresponds to approximately 90fs (0.1/360 of the ~333ps RF period), and so a timing precision better than this must be achieved.

When converted to its master reference frequency, the system clock is approximately 2.4GHz. This is an ideal operating frequency for a Xilinx gigabit transceiver, and so is used directly to provide a star-topology control system with the Matrix card at its centre. It is envisaged that this central processing system will be able to calibrate itself by measuring the loop propagation delay through a bi-directional optical link to each RF station. Knowing this, it is theoretically possible to re-phase all the RF stations such that the control system is aligned to within 50ps at all stations. This has been achieved with an accuracy of 300ps, but so far there is an error of 1UI which appears to be caused by the internal operation of the Xilinx GTPs. However, this already greatly exceeds the requirements for the operation of the control system (4ns resolution). Furthermore, it is believed that using the Matrix card, the GTPs can be substituted with LVDS I/O operating at up to 1.25Gb/s, which have a completely deterministic behaviour.

The advantage of this approach is that any variation in the propagation time through a fibre in one direction will likely correspond to the change in propagation time in the other direction (for example due to temperature variation). While the absolute limits of this approach are not yet known, for many applications the current results are already more accurate than necessary. One important detail of this approach, though, is that the reference clock at each end of the system must have a constant phase relationship. Therefore one must either have a reliable global clock network or a local VCXO at the slave end that can be locked to the recovered clock from the serial link. In figure 6 the first of these approaches is shown; the LLRF stations in FERMI also have a high-performance OCXO on each station that can be used instead of a global clock network.



Figure 6: The timing synchronisation system for the Matrix – Low Level RF (LLRF) system at FERMI. The cable delay is expected to be equal in each direction so by measuring the loop time through the system one can accurately calibrate and correct for it.
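The calibration principle can be written down compactly. This sketch assumes the two directions of each link are exactly symmetric and allows for a known fixed turnaround delay at the far end; the station names and the `turnaround_ns` parameter are hypothetical.

```python
def one_way_delay(loop_time_ns: float, turnaround_ns: float = 0.0) -> float:
    """One-way propagation delay, assuming equal delay in each direction."""
    return (loop_time_ns - turnaround_ns) / 2.0


def phase_corrections(loop_times_ns: dict) -> dict:
    """Extra delay to apply per station so all align with the slowest link."""
    delays = {name: one_way_delay(t) for name, t in loop_times_ns.items()}
    latest = max(delays.values())
    return {name: latest - d for name, d in delays.items()}
```

For example, measured loop times of 100ns, 140ns and 120ns for three stations give corrections of 20ns, 0ns and 10ns. Any drift common to both directions of a fibre, such as from temperature, shows up in the loop time and is corrected at the next measurement.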

#### IV. CONCLUSIONS

The Matrix card has been tested thoroughly and shown to operate at its anticipated maximum performance with few problems. Most of these issues have been resolved in a revision of the board that was recently received from manufacture. It remains to be seen whether the board revision also resolves the issues seen with six serial link transmitters.

An L1 trigger architecture has been developed that uses the Matrix card as its template and that results in a significant performance improvement over previous generations of hardware. It is expected that within the next year a development platform will be built based on this architecture that uses the Matrix card or one of its successors, and processing algorithms will be demonstrated.

The FERMI light source is currently undergoing installation and commissioning and it is expected to begin operation by the end of 2010.

#### V. ACKNOWLEDGEMENTS

The authors acknowledge the prior work of Matt Stettler and John Power at LANL in the development of the Matrix card hardware and the involvement of Tony Rohlev at Sincrotrone Trieste in the conception of the LLRF control system.

#### VI. REFERENCES

[1] M. Stettler et al., "The GCT Muon and Quiet Bit System, Design Production and Status", TWEPP, September 2008, Naxos, Greece.

[2] M. Stettler et al., "Modular Trigger Processing, the LHC GCT Muon and Quiet Bit System", IEEE NSS, October 2008, Dresden, Germany.

[3] J. Jones et al., "DAQ and Control Interfaces for the CMS Global Calorimeter Trigger Matrix Processor", IEEE NSS, October 2008, Dresden, Germany.

[4] G. Iles et al., "Trigger R&D for CMS at SLHC", TWEPP, September 2009, Paris, France.

[5] G. Bagliesi, "CMS High-Level Trigger Selection", Eur. Phys. J. C 33 (2004) s1035-s1037.

[6] T. Rohlev et al., "Sub-Nanosecond Machine Timing and Frequency Distribution via Serial Data Links", TWEPP, September 2008, Naxos, Greece.

# *WEDNESDAY 23 SEPTEMBER 2009*

# *PARALLEL SESSION B4 POWER, GROUNDING AND SHIELDING*

## Progress on DC-DC Converters for a Silicon Tracker for the sLHC Upgrade

S. Dhawan<sup>a</sup>, O. Baker<sup>a</sup>, H. Chen<sup>b</sup>, R. Khanna<sup>c</sup>, J. Kierstead<sup>b</sup>, F. Lanni<sup>b</sup>, D. Lynn<sup>b</sup>, C. Musso<sup>d</sup>, S. Rescia<sup>b</sup>, H. Smith<sup>a</sup>, P. Tipton<sup>a</sup>, M. Weber<sup>e</sup>

> <sup>a</sup> Yale University, New Haven, CT, USA<br><sup>b</sup> Brookhaven National Laboratory, Upton, NY, USA<br><sup>c</sup> National Semiconductor Corp, Richardson, TX, USA<br><sup>d</sup> New York University, New York, NY, USA<br><sup>e</sup> Rutherford Appleton Laboratory, Chilton, Didcot, UK

#### *Abstract*

There is a need for DC-DC converters which can operate in the extremely harsh environment of the sLHC Si Tracker. The environment requires radiation qualification to a total ionizing radiation dose of 50 Mrad and a displacement damage fluence of $5 \times 10^{14}$ /cm<sup>2</sup> of 1 MeV equivalent neutrons. In addition, a static magnetic field of 2 Tesla or greater prevents the use of any magnetic components or materials. In February 2007 an Enpirion EN5360 was qualified for the sLHC radiation dosage, but the converter has an input voltage limited to a maximum of 5.5 V. From a systems point of view this input voltage was not sufficient for the application. Commercial LDMOS FETs have since been developed using a 0.25 $\mu$m process which provide a 12 V input and are still radiation hard. These results are reported here and in previous papers. Plug-in power cards with a ×10 voltage ratio are being developed for testing the hybrids with ABCN chips. These plug-in cards have air coils but use commercial chips that are not designed to be radiation hard. This development helps in evaluating system noise and performance. GaN FETs have been tested for hardness to ionizing radiation and displacement damage, and preliminary results are given.

#### I. INTRODUCTION

The Silicon Tracker of the Inner Detector of ATLAS for the sLHC presents a difficult environment for electronics, and for power supply development in particular. In the high 2 Tesla magnetic field all magnetic materials would saturate and would not be usable. For inductors and transformers this leaves only non-magnetic cores, which greatly increase the size of the components. For a DC-DC converter the most promising approach is a buck converter. It can be constructed with only one inductor, an integrated circuit and a few discrete components.

In addition to the strong magnetic field there is also a harsh radiation environment. The requirement is a Total Ionizing Dose (TID) of about 50 Mrad along with a Non Ionizing Energy Loss (NIEL) requirement of $5 \times 10^{14}$ /cm<sup>2</sup> of 1 MeV equivalent neutrons. This excludes almost all switching devices that could be used in a buck converter, and until recently no technical solution existed.

It is known from previous work at CERN and elsewhere that some small feature CMOS processes are radiation hard. Starting from this point, in February 2007 an Enpirion EN5360 converter was exposed to 100 Mrad of gammas with no appreciable changes. Many commercial buck converters based on small feature processes were tested for radiation hardness, but with one exception (the EN5360) the tested converters failed after only a few hundred krad. Investigating this one exception led to the discovery of the foundry that fabricated the device and gave us insight into the mechanism behind its radiation hardness.

#### II. RADIATION EFFECTS IN MOSFETS AND OXIDES

The oxide layers in CMOS technology are known to be affected by ionizing radiation. As the name implies, ionizing radiation generates electron/hole pairs in the device. In particular, if there is an electric field across the oxide, the electrons, which are the more mobile of the two charge carriers, are swept out of the oxide, leaving the less mobile holes behind. The holes migrate through the oxide until they either recombine with an electron or are immobilized in a trap. This trapped positive charge in the oxide creates an electric field which can affect the behaviour of the device by causing voltage shifts or leakage currents. Specifically, in the gate oxide the positive charge produces a gate threshold shift which can prematurely turn the device on and, taken to extremes, leaves the device permanently conducting.

Table 1. Known radiation hard processes used at CERN. Note that the oxide thickness is limited to 7 nm or less.

**IBM Foundry Oxide Thickness**

| Lithography | Process Name | Operating Voltage (V) | Oxide Thickness (nm) |
|-------------|--------------|-----------------------|----------------------|
| 0.25 µm     | 6SF          | 2.5                   | 5                    |
|             |              | 3.3                   |                      |
| 0.13 µm     | 8RF          | 1.2 & 1.5             | 2.2                  |
|             |              | 2.2 & 3.3             | 5.2                  |

The magnitude of this radiation effect also depends on the thickness of the oxide  $(t_{ox})$ . Quantitatively, the voltage shift/unit dose changes approximately proportionally to  $(t_{ox})^2$ . However, at thicknesses of about 10 nm or less the change/unit dose decreases rapidly until below some threshold the change is negligible [1].
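As a rough illustration of the quadratic scaling, the sketch below compares the voltage shift per unit dose of two oxide thicknesses. It assumes the pure $(t_{ox})^2$ law, which according to the text holds only above roughly 10 nm; below that the shift falls off faster than this simple model predicts. The function name is introduced here for illustration only.

```python
def relative_shift_per_dose(t_ox_nm: float, t_ref_nm: float) -> float:
    """Ratio of voltage shift per unit dose between two oxide thicknesses,
    assuming the shift scales as t_ox**2 (thick-oxide regime only)."""
    if t_ox_nm <= 0 or t_ref_nm <= 0:
        raise ValueError("oxide thicknesses must be positive")
    return (t_ox_nm / t_ref_nm) ** 2


# Doubling the oxide thickness quadruples the shift per unit dose.
ratio = relative_shift_per_dose(20.0, 10.0)
print(f"20 nm vs 10 nm oxide: {ratio:.1f}x shift per unit dose")
```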

The CERN microelectronics group has used IBM processes that have been shown to be rad hard [2]. These processes, along with the oxide thicknesses used, are shown in Table 1.

This apparent immunity is consistent with the theory that the trapped positive charge in thin oxides is neutralized by electrons tunnelling from the SiO<sub>2</sub>/Si interface [3]. This prevents any long-term build-up of positive charge in the oxide. Figure 1 shows an example of how this could occur [4]. Two regions are defined.

- 1) The volume where charges would recombine (Tunneling Region) would be approximately 5 nm thick. No stable positive charge would remain.
- 2) Oxide farther than 5 nm from the SiO<sub>2</sub>/Si interface would define a second region (Oxide Trap Region) where fixed positive charge would remain and shift the gate threshold voltage.

When the thickness of the Oxide Trap Region decreases to near zero, only switching states would remain, making the oxide resistant to ionizing radiation. This is consistent with the observations we have made on buck converters and single devices from two foundries. However, devices from another foundry did not survive.

Our conclusion is that a thin oxide is a necessary condition for the functional immunity of CMOS devices to ionizing radiation. However, a thin oxide is not a sufficient condition, as the preparation of the oxide, the epi-layer and other properties also contribute to radiation hardness. Parameterizing these contributions is left for future work, when a sufficiently large sample of higher voltage rated ($> 12$ V) CMOS devices with thin oxides has been obtained from different sources (e.g. foundries).



Figure 1: Physical location of defects from their electrical response in CMOS devices.

Some ionizing radiation measurements on LDMOS devices constructed with thin oxides have been made. Some of the results on IHP foundry devices can be found in [5-6]. A more recent result is shown in Figure 2, which shows the ionizing radiation response of an LDMOS FET from another foundry.



Figure 2: LDMOS N-channel MOSFET constructed with 7nm gate oxide thickness. The device shows exceptional immunity to ionizing radiation effects to the final dose of 52 Mrad.

Table 2 shows a compendium of the radiation measurements made recently with the oxide thicknesses, final dose and state of the device at the end of the test.

Table 2. Compendium of recent radiation measurements made on MOSFETs and Buck Converters

| Company  | Device   | Process name/number | Foundry name, country | Oxide thickness (nm) | Time (s) | Dose before damage seen | Observation / damage mode          |
|----------|----------|---------------------|-----------------------|----------------------|----------|-------------------------|------------------------------------|
| IHP      | ASIC custom |                  | IHP, Germany          | 5                    |          | 53 Mrad                 | Slight damage                      |
| XYSemi   | FET, 2 A |                     | China                 | 7                    |          | 52 Mrad                 | Minimal damage                     |
| XYSemi   | XP2201   |                     | China                 | 7                    |          |                         | In development                     |
| XYSemi   |          |                     | China                 | 7                    |          |                         | In development, synchronous buck   |
| XYSemi   | XP8082   |                     | China                 | 12.3                 | 400      | 44 krad                 | Loss of V<sub>out</sub> regulation |
|          | TPS84620 | 0.35 µm             |                       | 20                   |          |                         |                                    |
| IR       | IR3841   |                     |                       |                      | 230      | 13 krad                 | Loss of V<sub>out</sub> regulation |
|          |          | CMOS 0.25 µm        | Dongbu HiTek, Korea   | 5                    |          |                         | Increasing input current           |
|          |          | CMOS 0.25 µm        | Dongbu HiTek, Korea   |                      | 2000     | 111 krad                | Loss of V<sub>out</sub> regulation |
| Enpirion | EN5360   |                     | IHP, Germany          | 5                    |          | 100 Mrad                | Minimal damage                     |
| Enpirion | EN5360   |                     | IHP, Germany          | 5                    |          | 48 Mrad                 | Minimal damage                     |

#### III. PLUG-IN CARDS WITH AIR COILS

Yale model 2151 (Figure 3) is designed with two different commercial converters, the MAX8654 and the IR3841; the former is monolithic, while the IR unit contains three dies in a package with optimized top and bottom FETs. In the monolithic device the FET performance is compromised by the requirements of the controller circuitry.

Figure 3 shows boards with three different air coils that are being developed. The types are 1) coils embedded in a PCB with 3 oz copper, 2) copper coils etched from 0.25 mm copper foil and 3) a 10 µH solenoid with the ferrite rod removed.



Figure 3: Plug-in cards with embedded, copper-coil and solenoid air coils. The carrier board is at top right.

The power input and output are on opposite ends, with Kelvin voltage monitoring points on the input connector side. In addition, an enable pin can be used to pulse the power on and off. The boards plug in to a carrier board (shown at top right in Figure 3) that can be installed and wired on the detector under test. This makes it convenient to carry out noise studies with different versions of the cards.



Figure 4: Embedded Coupled Spiral inductor with Inner layers 4 Oz Cu. Outer spiral for shielding

Fig. 4 shows four spirals in a four-layer printed circuit board. The outer spirals serve as a shield and can be left floating or connected to ground at one end. The inner spirals, connected in series, have 3 oz copper and are spaced 0.35 mm apart, while the outer spirals are separated much farther. The spacing was determined empirically and is a compromise: closer spacing increases the inductance through mutual coupling between the coil fields, but it also strengthens the proximity effect, which increases the AC resistance of a coil because the electromagnetic field chokes off a section of the coil to current flow. This effect is frequency dependent [7-9].
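The inductance gain from mutual coupling can be sketched with the standard relation for two series-aiding coupled inductors, $L = L_1 + L_2 + 2M$ with $M = k\sqrt{L_1 L_2}$. The coil values and coupling coefficient below are hypothetical and are not measurements of the boards described here; the sketch also ignores the proximity-effect losses discussed above.

```python
import math


def series_aiding_inductance(l1: float, l2: float, k: float) -> float:
    """Total inductance of two series-connected, magnetically aiding coils.

    l1, l2 are the self-inductances in henries and k is the coupling
    coefficient (0 = no coupling, 1 = perfect coupling).
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("coupling coefficient must lie between 0 and 1")
    mutual = k * math.sqrt(l1 * l2)
    return l1 + l2 + 2.0 * mutual


# Hypothetical example: two 100 nH spirals with k = 0.5.
total = series_aiding_inductance(100e-9, 100e-9, 0.5)
print(f"total inductance: {total * 1e9:.0f} nH")
```

With no coupling the two spirals would simply add; the coupled pair gains an extra $2M$, which is the benefit that is traded against the increased AC resistance.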

#### Noise Measurement with Detector

The tests were done in September 2009 at the University of Liverpool with a Stave 09 hybrid using ABCN25 readout chips. The detector had a Faraday cage made from aluminium foil, and the plug-in card was outside but adjacent to it.

The noise measurements for the various cards, with and without a clip-on common mode choke, are shown in Table 3. For comparison the noise was also measured with the readout chips powered by laboratory power supplies. There was no significant difference in noise between the various combinations, except that the solenoid produced about 30 % higher noise.
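The comparison can be reproduced from the input noise values listed in Table 3. The sketch below averages the readings per coil type and expresses the solenoid noise relative to the mean of the two planar coil types; the choice of reference is an assumption made here for illustration, not necessarily the one used by the authors.

```python
from statistics import fmean

# Input noise (electrons rms) per coil type, copied from Table 3.
NOISE_BY_COIL = {
    "Solenoid": [881, 885],
    "Copper Coil": [666, 634, 664],
    "Embedded": [686, 641, 648],
}


def mean_noise(table: dict) -> dict:
    """Average noise reading for each coil type."""
    return {coil: fmean(values) for coil, values in table.items()}


def relative_excess(value: float, reference: float) -> float:
    """Fractional excess of value over reference (0.1 means 10 % higher)."""
    return value / reference - 1.0


means = mean_noise(NOISE_BY_COIL)
planar = fmean(NOISE_BY_COIL["Copper Coil"] + NOISE_BY_COIL["Embedded"])
solenoid_excess = relative_excess(means["Solenoid"], planar)
print(f"solenoid excess over planar coils: {solenoid_excess:.1%}")
```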

Next, a plug-in card with an embedded spiral coil was placed on top of the hybrid and a plastic mechanical protection spacer (Fig. 5).



Figure 5: Plug in card on top of readout hybrid. 1 cm above sensor

The spacer had a 20 $\mu$m Al foil for shielding. A few years ago we determined that this thickness of foil provided sufficient shielding. The embedded coil card was about 1 cm from the silicon sensor, which was the closest that we could place the card. Our conclusion was that placing an embedded air coil card 1 cm from the sensor had no effect on the noise.

#### IV. GAN FETS

Other possibilities for radiation hard performance are devices made from III-V technology. One very promising group of candidates for this are High Electron Mobility Transistors (HEMTs) produced in GaN on top of a substrate of sapphire, SiC, or Silicon. Commercial devices are available that operate in a depletion mode (normally on). The gate has significant leakage compared to the oxide in MOSFETs but correspondingly has no possibility of charge trapping causing voltage shifts.

Figure 6 shows the results of irradiating a Nitronex 25015 HEMT with <sup>60</sup>Co ionizing radiation. As can be seen, the effect is very slight up to the total dose of 17.3 Mrad.



Figure 6: 25015 HEMT irradiated with <sup>60</sup>Co gamma radiation

| Coil        | Board # | Common Mode Choke | Power to DC-DC | Input Noise (electrons rms) |
|-------------|---------|-------------------|----------------|-----------------------------|
| Solenoid    | Max #2  | No                |                | 881                         |
| Solenoid    | Max #2  | No                |                | 885                         |
| Copper Coil | IR #17  | No                | Switching      | 666                         |
| Copper Coil | IR #17  | Yes               | Switching      | 634                         |
| Copper Coil | IR #17  | Yes               | Linear         | 664                         |
| Embedded    | Max #12 | No                | Linear         | 686                         |
| Embedded    | Max #12 | Yes               | Linear         | 641                         |
| Embedded    | Max #12 | Yes               | Linear         | 648                         |

Table 3. Equivalent noise charge of a DC-DC powered hybrid circuit using various buck inductors.

Three other devices, from Nitronex, Eudyna and Cree, have been irradiated to doses greater than 25 Mrad (as high as 200 Mrad with protons) with the devices placed in a switching mode during irradiation. In all three devices no effect of the ionizing irradiation has been observed, except for a small change in drain current during irradiation which reverts when the irradiation ends. Devices have also been irradiated with neutrons, but the measurements are still incomplete.

#### V. SUMMARY AND FUTURE WORK

The primary goal of this work is to produce a DC-DC buck converter that can be used in the upgraded ATLAS Silicon Tracker at the sLHC. It would have to survive a total ionizing radiation dose of 50 Mrad and a displacement damage fluence of $5 \times 10^{14}$ /cm<sup>2</sup> of 1 MeV equivalent neutrons. It would have to operate in a > 2 Tesla magnetic field while providing a 1.2 V output at several amperes from an input of 12 V or greater. In 2007 a commercial buck converter (Enpirion) based on a 0.25 $\mu$m process was found that would survive the ionizing dose requirement, although it did not have the required input voltage rating. To date this is the only commercial product that has met this requirement. In 2008 the foundry (IHP Microelectronics) that produced the Enpirion converter successfully added a 12 V MOSFET based on a 0.25 $\mu$m process. This MOSFET also proved to be radiation hard. Since then XYSemi, which uses a different foundry than IHP, has produced radiation hard MOSFETs on a similar process. No commercial products exist at this time, but the work is ongoing.

In parallel with the above work, plug-in power cards with commercial converters are being developed to test upgrade hybrids of the Si Tracker group. The commercial buck converters used are unlikely to be radiation hard but will allow testing of the form, fit and function of the buck converters.

Converter chips used for this purpose are the Maxim MAX8654 and the IR3841, with spiral and spring/solenoid coils.

More recently an investigation has started into the suitability of GaN HEMT devices for these applications. The results to date have been promising. All GaN devices tested to date have survived to 17 Mrad or greater ionizing dose. Displacement damage tests have started and are ongoing.

In the future the efforts described above will be combined into buck converters which will specifically target the electrical, environmental and size requirements of the upgrade Silicon Tracker at the sLHC.

#### VI. REFERENCES

- [1] N. S. Saks, M. G. Ancona, and J. A. Modolo, "Radiation effects in MOS capacitors with very thin oxides at 80 K", *IEEE Trans. Nucl. Sci.*, vol. NS-31, p. 1249, 1984
- [2] Kurt Hansler et al., "TID and SEE performance of a commercial 0.13 $\mu$m CMOS technology", Proceedings of RADECS 2003: Radiation and its Effects on Components and Systems, Noordwijk, The Netherlands, 15-19 September 2003 (ESA SP-536)
- [3] J. M. Benedetto, H. E. Boesch, Jr., F. B. McLean, and J. P. Mize, "Hole removal in thin gate MOSFET's by tunneling," *IEEE Trans. Nucl. Sci.*, vol. NS-32, p. 3916, 1985
- [4] T. R. Oldham, "Ionizing Radiation Effects in MOS Oxides", World Scientific, 1999
- [5] IHP SGB25VD First irradiation report F.Faccio CERN/PH/ESE Dated January 05, 2009 Private Communication
- [6] Dhawan et al Proceedings of the IEEE RT 2009 Conference Beijing China May 10-15, 2009; Submitted to IEEE Transactions on Nuclear Science
- [7] Lotfi, IEEE Trans on Magnetics, Vol.28, No 5, September 1992
- [8] Bruce Carsten, "High Frequency Conductor Losses in Switchmode Magnetics" seminar, www.bcarsten.com
- [9] F.E. Terman, "Radio Engineers' Handbook," McGraw-Hill 19

## Experimental Studies Towards a DC-DC Conversion Powering Scheme for the CMS Silicon Strip Tracker at SLHC

K. Klein, L. Feld, R. Jussen, W. Karpinski, J. Merz, J. Sammet

1. Physikalisches Institut B, RWTH Aachen University, 52074 Aachen, Germany

Katja.Klein@cern.ch

#### *Abstract*

The upgrade of the CMS silicon tracker for the Super-LHC presents many challenges. The distribution of power to the tracker is considered particularly difficult, as the tracker power consumption is expected to be similar to or higher than today, while the operating voltage will decrease and power cables cannot be exchanged or added. The CMS tracker has adopted parallel powering with DC-DC conversion as the baseline solution to the powering problem. In this paper, experimental studies of such a DC-DC conversion powering scheme are presented, including system test measurements with custom DC-DC converters and current strip tracker structures, studies of the detector susceptibility to conductive noise, and simulations of the effect of novel powering schemes on the strip tracker material budget.

#### I. INTRODUCTION

The Super-LHC (SLHC) is a proposed luminosity-upgrade of the LHC. It is currently foreseen to increase the peak luminosity in two phases: by a factor of two with respect to the nominal LHC peak luminosity four to five years after the start-up of the LHC (phase-1), and by a further factor of five ten years after LHC start-up (phase-2). This would lead to a drastic increase in the number of particles per event in the CMS tracker [1], from about 1 000 at design luminosity to 15 000-20 000 at SLHC phase-2. As a consequence, for phase-2 the sensitive cell size in the strip tracker must be reduced to limit the detector occupancy, and tracking information must be delivered to, and used by, the first level trigger, to keep the level-1 trigger rate at its current level [2]. Due to the increase in the number of readout channels and the need for fast, complex digital electronics it is unlikely that the strip tracker power consumption will decrease significantly compared to the current value of 34 kW. The use of smaller feature-size CMOS processes with lower operating voltages will lead to larger supply currents even for a constant power budget. While the long power cables that connect the detector to the power supply units are installed in a way that virtually excludes their replacement during the lifetime of the experiment, there is a strong desire to reduce the material inside the sensitive detector volume, in order to improve the performance of the upgraded detector.

Following a review process, at the beginning of 2009 the CMS Tracker Collaboration chose parallel powering with DC-DC conversion as its future powering scheme. Serial powering [3] serves as back-up solution. Reverting to the back-up must remain possible until the feasibility of a DC-DC conversion powering scheme has been proven.

DC-DC converters will be used to convert a high input voltage $V_{in}$ to the operating voltage $V_{out}$ required by the detector modules (likely to be 1.2 V or lower). The actual required conversion ratio, here defined as $r = V_{in}/V_{out}$, depends on the layout of the future tracker. Conversion ratios as low as two might be sufficient for the upgraded pixel detector at phase-1, whereas a factor of ten might be required for the track trigger layers at phase-2. Since the current in the supply cables is reduced by a factor $\epsilon \cdot r$, where $\epsilon$ denotes the converter efficiency, resistive power losses in the supply cables are reduced by a factor of $1/(\epsilon \cdot r)^2$.
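The scaling of the cable losses can be checked with a short calculation. For a fixed output power, the converter input current is $I_{in} = I_{out}/(\epsilon \cdot r)$, and resistive losses scale with the square of the current. The sketch below evaluates the resulting factor for a conversion ratio of eight and an efficiency of 80 %, the values assumed in the material budget simulation of this paper; the function name is introduced here for illustration only.

```python
def cable_loss_factor(conversion_ratio: float, efficiency: float) -> float:
    """Factor by which resistive cable losses change when a DC-DC converter
    with the given conversion ratio r = V_in / V_out and efficiency is used,
    for the same cable resistance and the same power delivered to the load."""
    if conversion_ratio <= 0 or not 0.0 < efficiency <= 1.0:
        raise ValueError("need r > 0 and 0 < efficiency <= 1")
    current_reduction = efficiency * conversion_ratio  # I_out / I_in
    return 1.0 / current_reduction ** 2


factor = cable_loss_factor(conversion_ratio=8.0, efficiency=0.8)
print(f"cable losses scale by {factor:.4f} (about 1/{1.0 / factor:.0f})")
```

The efficiency enters because the converter must draw more input power than it delivers, so an inefficient converter recovers less of the gain offered by the conversion ratio.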

The buck converter [4] is the simplest inductor-based step-down converter. With relatively few components and the ability to deliver currents of several amperes at efficiencies of 70-80 %, even for high conversion ratios, this DC-DC converter type is currently the best candidate for use in the CMS tracker. However, several challenges exist on the system level and must be addressed: switching with frequencies in the MHz range might inject conductive noise into the detector system; air-core inductors, needed because of saturation of ferrite cores in the 3.8 T magnetic field of CMS, might radiate electromagnetic noise; and the converter's size and mass must be reduced as much as possible without degrading its electrical performance. A low efficiency would cancel out the advantages of DC-DC conversion.

#### II. DC-DC CONVERTER DEVELOPMENT

#### *A. The AC2 Buck Converters*

Building on our previous experience reported in [5], we have developed DC-DC buck converters based on a commercial, not radiation-hard buck converter chip. The aim was to develop a small, light and low noise device as a proof-of-principle.

The basic schematic of the 2-layer PCB is shown in Fig. 1. The buck converter chip EQ5382D from Enpirion [6] delivers currents up to 0.8 A, up to a recommended maximal input voltage of 5.5 V, at a switching frequency of 4 MHz. Two types of custom toroidal air-core inductors with a diameter of 6 mm are used: the *Mini Toroid* with a height of 7 mm, an inductance of $\approx 600$ nH and a DC-resistance of 80-100 m$\Omega$, and the *Tiny Toroid* with a height of 4 mm, an inductance of $\approx 220$ nH and a DC-resistance of 40-50 m$\Omega$ (Fig. 2). Filter capacitors are implemented at the input and output of the converter. Different types of capacitors have been tested: standard capacitors are implemented on the *AC2-StandardC* board (Fig. 2, left), low-ESL capacitors in reverse geometry on the board *AC2-ReverseC*, and low-ESL InterDigitated Capacitors (IDC) with eight terminals on variant *AC2-IDC*. Our buck converters are 12 mm wide, 19 / 25 / 27 mm long (StandardC / ReverseC / IDC; without connectors) and 10 mm high. The weight amounts to about 1 g.
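To give a feeling for how the two inductor choices differ electrically, the sketch below evaluates the textbook relations for an ideal buck converter in continuous conduction: duty cycle $D = V_{out}/V_{in}$ and peak-to-peak inductor ripple current $\Delta I = (V_{in} - V_{out}) D / (L f)$. The inductances and the 4 MHz switching frequency are taken from the text; the operating point of 5.5 V in and 1.25 V out is an assumption chosen for illustration, and losses and parasitics are ignored.

```python
def duty_cycle(v_in: float, v_out: float) -> float:
    """Ideal buck converter duty cycle in continuous conduction."""
    if not 0.0 < v_out < v_in:
        raise ValueError("a buck converter needs 0 < v_out < v_in")
    return v_out / v_in


def ripple_current(v_in: float, v_out: float, inductance: float, f_sw: float) -> float:
    """Peak-to-peak inductor ripple current (A) of an ideal buck converter."""
    return (v_in - v_out) * duty_cycle(v_in, v_out) / (inductance * f_sw)


V_IN, V_OUT = 5.5, 1.25   # assumed operating point (volts)
F_SW = 4e6                # switching frequency from the text (Hz)
INDUCTORS = {"Mini Toroid": 600e-9, "Tiny Toroid": 220e-9}  # henries

ripples = {name: ripple_current(V_IN, V_OUT, l, F_SW) for name, l in INDUCTORS.items()}
for name, ripple in ripples.items():
    print(f"{name}: {ripple:.2f} A peak-to-peak")
```

The ripple scales inversely with the inductance, so the smaller and lighter coil is paid for with a larger current ripple at a given switching frequency.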



Figure 1: Schematics of the AC2-StandardC PCB.



Figure 2: Left: buck converter of type AC2-StandardC, with a toroid coil of type Tiny Toroid. Right: Mini Toroid. Details are given in the text.

#### *B. Material Budget*

One of the motivations for novel powering schemes is the possibility to reduce the material inside the sensitive tracker volume. To understand in a quantitative way the gain that can be expected, simulation studies have been performed within the CMS software framework, CMSSW, based on GEANT4 [7]. The geometry implementation of the current strip tracker has been used as a starting point, and only components relevant for power provision have been added or changed.

One AC2-StandardC converter with Mini Toroid has been "placed" close to the front-end hybrid for each silicon strip module. All components have been modelled in the software as realistically as possible, taking into account their size and material composition: the PCB with its copper layers; capacitors and resistors; the chip; the toroid coil (shielded) and the connectors. In Fig. 3, left-hand side, the contribution of the buck converters in the Tracker End Caps (TEC) is shown in units of radiation lengths, $x/X_0$, versus the pseudorapidity. The material contributed by the converters amounts to about 10 % of the material of the silicon strip modules.

When DC-DC converters are used, less copper is required in power cables and motherboards, as the input currents are reduced by the conversion ratio. A conversion ratio of eight and a converter efficiency of 80 % have been assumed in the simulation. The new cross-sections of conductors in power cables have been calculated by demanding that the voltage drop in these cables does not exceed the maximum allowed voltage drop of today's power supply system (4 V). The width of the power and ground rails in the motherboards has been computed allowing for a maximum power loss of 3 % in those boards. The material budget of all components belonging to the relevant categories of electronics or cables is shown in Fig. 3, right. For the TECs, 30.9 % of the material in these categories can be saved within the applied model, which corresponds to a saving of 8 % for the whole TEC material budget. Simulations for the complete CMS strip tracker are less detailed but show consistent results.

A similar study has been performed for a Serial Powering scheme [3]. All 17-28 modules of a TEC substructure were powered in series. Additional simulated components per module include a dedicated Serial Powering chip; a bypass transistor as a safety device; and capacitors and resistors for AC-coupling of data lines. The amount of copper in cables and motherboards has been estimated as for DC-DC conversion. The gain is found to be similar: for Serial Powering, 29.0 % of the material for TEC electronics and cables and 7.5 % of the total TEC material could be saved with our assumptions.



Figure 3: TEC material budget, for (left) all strip modules (open histogram) and all DC-DC converters (filled histogram), and (right) for the categories electronics and cables, in schemes without (open histogram) or with (filled histogram) DC-DC converters. The number in the legend of the right plot corresponds to the saving.

#### *C. AC2 Noise Characterization*

The effect of the AC2 buck converters on the noise behaviour of the current strip tracker modules has been studied in system tests. The set-up, described in detail in [5] and references therein, consists of a TEC substructure (*petal*) equipped with four silicon strip modules. The optical readout and control system is realized using prototype CMS tracker DAQ hard- and software. The APV25 readout ASIC [8] is a 128-channel chip manufactured in a  $0.25 \mu$ m CMOS process. For each channel, a charge-sensitive pre-amplifier, a CR-RC filter with a time constant of 50 ns, and a 192 cells deep pipeline are implemented. The read-out is fully analogue. The APV25 operating voltages, 2.5 V and 1.25 V, are provided by two DC-DC converters per module, which are integrated with an additional adapter board. Input voltages are provided by external lab power supplies. Supply currents per module amount to about 0.5 A and 0.25 A for 2.5 V and 1.25 V, respectively.

The quantity studied is the raw or total strip noise, defined as the RMS of the fluctuations around the pedestal. Module edge channels (strips 1 and 512) are capacitively coupled to the bias ring, which itself is AC-coupled to ground. Since the APV25 pre-amplifier input transistor is referenced to 1.25 V, noise (ripple) on this power line leads to an artificial (i.e. noise) signal at the pre-amplifier output. In addition, a common mode subtraction algorithm is implemented in the APV25, which subtracts common mode noise effectively for most channels except the noisier edge channels [5]. In consequence, edge channels provide a more direct access to the noise sensitivity of the strip module than other strips. The noise of strips 1 and 512 is added in quadrature.

A summary of results is shown in Fig. 4. The noise of the previous buck converter generation (AC1) as presented in [5] is compared with the new AC2 boards. Improvements in the AC2 with respect to AC1 include a more "linear" layout with well separated input and output rails and a larger distance between inductor solder pads. The AC1 board has been integrated using a similar adapter as for the AC2 boards. The different lengths of the AC2 boards have been compensated by additional connectors, to assure comparability of the measurements. Boards equipped with Mini Toroids or Tiny Toroids have been tested. With the Tiny Toroid, the low-ESL capacitors show a clear advantage over the standard capacitors. The IDCs in particular offer a good filtering performance. This, and the fact that neither shielding the coil nor increasing the distance led to an improvement, suggest that the noise increase is mainly due to conductive coupling. The lower noise with Mini Toroids can be explained by the fact that the larger inductance reduces the current ripple.



Figure 4: Combined edge strip noise for AC1, AC2-StandardC, AC2-ReverseC and AC2-IDC converters, with Mini Toroids (squares) or Tiny Toroids (circles). Here and in Figs. 7 and 8 the horizontal line represents the measurement without DC-DC converter, and its width is an estimate of the long-term reproducibility of the measurement.

The converter noise spectra have been measured with a dedicated EMC set-up [9]. The DC-DC converter is powered from a power supply via a Line Impedance Stabilization Network (LISN), and connected to an Impedance Stabilised Load. The Differential Mode (DM) or Common Mode (CM) noise current is picked up by a current probe at the input or output of the converter, and is analyzed with a spectrum analyzer. As examples, the DM noise spectra at the output are shown in Fig. 5 for the AC2-StandardC and AC2-IDC boards. The peaks up to 30 MHz have been added in quadrature, resulting in sums of 43.8 dB $\mu$ A, 41.6 dB $\mu$ A and 32.2 dB $\mu$ A for the AC2-StandardC, AC2-ReverseC and AC2-IDC, respectively. This confirms the trend observed in the system test. In contrast, the respective CM numbers of the three boards are quite similar to each other. The current strip modules are thus sensitive mainly to DM noise.
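The quadrature sum of spectral peaks quoted in dB$\mu$A can be sketched as follows. This is a minimal illustration, assuming that each peak level is first converted to a linear current, that the currents are then summed in quadrature, and that the result is converted back to dB$\mu$A. The function names and the example peak list are hypothetical; only the three summed levels compared at the end are taken from the text.

```python
import math


def dbua_to_ua(level_dbua):
    """Convert a level in dBuA to a current amplitude in uA."""
    return 10.0 ** (level_dbua / 20.0)


def quadrature_sum_dbua(peaks_dbua):
    """Sum spectral peaks (given in dBuA) in quadrature and return dBuA."""
    total_ua = math.sqrt(sum(dbua_to_ua(p) ** 2 for p in peaks_dbua))
    return 20.0 * math.log10(total_ua)


# Hypothetical spectrum with two equal peaks of 40 dBuA each.
print(round(quadrature_sum_dbua([40.0, 40.0]), 2))  # 43.01

# Linear ratio between the summed levels quoted for AC2-StandardC and AC2-IDC.
ratio = dbua_to_ua(43.8) / dbua_to_ua(32.2)
print(f"AC2-StandardC / AC2-IDC current ratio: {ratio:.1f}")  # 3.8
```

Two equal peaks thus add 3 dB, and the 11.6 dB difference between the AC2-StandardC and AC2-IDC sums corresponds to a factor of roughly four in DM noise current.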



Figure 5: Differential Mode output noise spectra for (left) AC2-StandardC and (right) AC2-IDC, for an input voltage of 5.5 V, an output voltage of 1.3 V and a load current of 0.5 A.

#### *D. AC2 Efficiency*

The efficiency  $\eta = P_{out}/P_{in}$  of the AC2 DC-DC converters has been measured with a dedicated set-up in which both the input voltage and the load current are programmable. Both parameters were swept within the specifications of the chip. The efficiency of the AC2-StandardC board with Mini Toroid is shown in Fig. 6 for an output voltage of 1.3 V. Efficiencies vary between 75 % and 85 % in most of the parameter space. At half the conversion ratio the efficiency is up to 15 percentage points higher. Differences between capacitor types are negligible ($< 1\,\%$).



Figure 6: Efficiency of the AC2-StandardC board with Mini Toroid, for an output voltage of 1.3 V, as a function of input voltage and output current.

A significant difference in efficiency is however observed between Mini Toroids and Tiny Toroids: the efficiency with the Mini Toroid is 5-30 % higher than with the Tiny Toroid, in spite of the lower DC-resistance of the latter. A larger current ripple  $\Delta I$  in Tiny Toroids and thus higher associated losses  $\propto (I_{out}+\Delta I)^2$  in the coil and losses  $\propto (\Delta I)^2$  in output filter capacitors might be the reason. Mini Toroids with their three times higher inductance are therefore preferred over Tiny Toroids, in spite of their slightly larger mass and size.

#### *E. AC2 Boards with Filters*

As is evident from system tests, the noise increase in current strip modules with the AC2 boards is mainly due to conductive DM noise, i.e. a ripple on the power line. Filtering should therefore improve the situation further. Two options have been studied: " $\pi$ -filters" (Butterworth filter) with two equal capacitors and one inductor, and a Low DropOut (LDO) regulator.

The filters have been realized as independent small PCBs that can be plugged into the AC2 boards, either at the input or the output. As the LDO regulator, the LTC3026 from Linear Technology [10] was used, with a dropout of 50 mV. Four versions of  $\pi$-filters have been tested:  $L = 2.55$  nH /  $C = 22 \mu$F or  $L = 18.5$  nH /  $C = 3.2 \mu$F for a cutoff frequency of 0.95 MHz; and  $L = 2.55$  nH /  $C = 2.2 \mu$F or  $L = 18.5$  nH /  $C = 220$  nF for a cutoff frequency of 3 MHz. The combinations for one cutoff frequency differ in the characteristic impedance. The 2.55 nH coils with a DC-resistance of 5 m$\Omega$ would be preferred, as they add less material and require less space.
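The relation between the component values and the cutoff frequency is not spelled out in the text. As a plausibility check, the sketch below assumes the standard doubly terminated third-order Butterworth low-pass prototype with equal source and load resistance $R$, for which the shunt capacitors are $C = 1/(\omega_c R)$ and the series inductor is $L = 2R/\omega_c$. This gives $f_c = 1/(\pi\sqrt{2LC})$ and $R = \sqrt{L/(2C)}$. The function names are hypothetical and the design equations are an assumption, not the authors' stated method.

```python
import math


def butterworth_pi_cutoff_hz(inductance_h, capacitance_f):
    """Cutoff frequency of a C-L-C Butterworth pi-filter (equal terminations assumed)."""
    return 1.0 / (math.pi * math.sqrt(2.0 * inductance_h * capacitance_f))


def butterworth_pi_impedance_ohm(inductance_h, capacitance_f):
    """Characteristic (termination) impedance under the same assumption."""
    return math.sqrt(inductance_h / (2.0 * capacitance_f))


# (L in H, C in F) combinations as quoted in the text.
FILTERS = [
    (2.55e-9, 22e-6),
    (18.5e-9, 3.2e-6),
    (2.55e-9, 2.2e-6),
    (18.5e-9, 220e-9),
]

for ind, cap in FILTERS:
    f_mhz = butterworth_pi_cutoff_hz(ind, cap) / 1e6
    z_mohm = butterworth_pi_impedance_ohm(ind, cap) * 1e3
    print(f"L = {ind * 1e9:5.2f} nH, C = {cap * 1e6:5.2f} uF: "
          f"f_c = {f_mhz:.2f} MHz, Z = {z_mohm:.1f} mOhm")
```

Under this assumption the quoted component values give cutoff frequencies of roughly 0.95 MHz, 0.93 MHz, 3.0 MHz and 3.5 MHz, approximately reproducing the two nominal cutoff frequencies, while the characteristic impedance differs by about a factor of seven to eight between the 2.55 nH and 18.5 nH variants.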

Results for filtering at the converter output are shown in Fig. 7, for all three variants of AC2 boards (equipped with Tiny Toroids). With both the LDO regulator and the  $\pi$-filter, a drastic decrease of the edge strip noise is observed for all three AC2 variants. *Dummy* corresponds to an unpopulated PCB of the size of the filter boards with a direct solder connection between the inductor pads. This cross-check shows that the board itself and the associated change of position lead to a slight decrease of noise, but cannot explain the improvement observed with real filters. Measurements with the EMC set-up described above confirm that the DM noise is reduced to a level below the sensitivity of the set-up, except for the filter with 18.5 nH / 220 nF (which still shows a drastic improvement). As expected, the CM noise was not reduced by filtering.

Filtering the input of the converter was tested as well but did not improve the edge strip noise significantly.

A high efficiency is crucial, and measures to reduce the noise impact of the converters should deteriorate the efficiency as little as possible. The efficiency with LDO regulator or  $\pi$-filter was measured and compared with the efficiency without filter. While the LDO regulator reduces the efficiency by typically 5 %, the efficiency loss with  $\pi$-filter is below 1 % in the whole accessible parameter range. The  $\pi$-filter is thus the favoured filtering device, due to its good filtering performance, small efficiency loss, low complexity and intrinsic radiation-hardness.

Figure 8 shows the result of a scan of the input voltage. While both the previous board (AC1) and the AC2-StandardC show a rise of the noise with input voltage, the measurement of AC2-StandardC with  $\pi$ -filter is on top of the measurement without converter across the whole input voltage range. These measurements have been performed with the Mini Toroid.



Figure 7: Combined edge strip noise for AC2-StandardC (circles), AC2-ReverseC (squares) and AC2-IDC (triangles) converters, for various filtering options. Details are given in the text.



Figure 8: Combined edge strip noise for AC1 (circles), AC2-StandardC (squares) and AC2-StandardC with  $\pi$-filter (triangles), as a function of the input voltage.

#### III. NOISE SUSCEPTIBILITY STUDIES

The commercial buck converter used on the AC2 boards switches at 4 MHz, while custom radiation-hard converters will be optimized for switching frequencies of 1-2 MHz, to reduce switching losses. It is important to understand the susceptibility of the future tracker modules to conductive noise as a function of the noise frequency, in order to identify critical bandwidths that should be avoided for the converter switching frequency. A test bench based on the Bulk Current Injection (BCI) method has been set up [9]. As the proposed successor of the APV25, the *CMS Binary Chip* [11], will not be available before early 2010, the susceptibility of today's silicon strip modules is currently being studied.

A strip module is powered via a LISN directly from a lab power supply. Noise is generated by a sine wave generator, amplified by a +50 dB amplifier and injected by an inductive current probe into the power lines. A second current probe is used to pick up the injected noise current, whose amplitude is then measured with a spectrum analyzer. While the noise frequency is swept, the amplitude of the noise current is kept constant. Noise currents in DM and CM of  $70 \text{ dB} \mu$ A have been injected into the 2.5 V and 1.25 V power lines.

Figure 9 shows the result for the peak readout mode of the APV25, in which only one sample is used (results in deconvolution readout mode, in which a weighted sum of three consecutive samples is formed, are similar). A peak at 6-8 MHz is observed. From the APV25 shaping time of 50 ns the highest susceptibility is expected at 3.2 MHz. The response is therefore not dominated by the bare front-end electronics but reflects the behaviour of the whole module. The observed peak is well above the expected future switching frequency, although peaks at higher harmonics of the switching frequency will extend into the sensitive region.
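The expected value of 3.2 MHz follows from the shaping time if the front-end is idealized as a CR-RC shaper with equal time constants $\tau$, whose magnitude response $\omega\tau/(1+\omega^2\tau^2)$ is largest at $\omega\tau = 1$, i.e. at $f = 1/(2\pi\tau)$. The sketch below evaluates this under that idealization; the function names are hypothetical and the real APV25 response is certainly more complex.

```python
import math


def cr_rc_gain(frequency_hz, tau_s):
    """Magnitude response of an ideal CR-RC shaper with equal time constants."""
    x = 2.0 * math.pi * frequency_hz * tau_s
    return x / (1.0 + x * x)


def peak_frequency_hz(tau_s):
    """Frequency at which the ideal CR-RC magnitude response is largest."""
    return 1.0 / (2.0 * math.pi * tau_s)


TAU = 50e-9  # s, APV25 shaping time quoted in the text
f_peak = peak_frequency_hz(TAU)
print(f"expected peak susceptibility: {f_peak / 1e6:.1f} MHz")  # 3.2 MHz

# Relative response at a few frequencies; 7 MHz lies inside the observed 6-8 MHz peak.
for f_mhz in (1.0, 3.2, 7.0):
    rel = cr_rc_gain(f_mhz * 1e6, TAU) / cr_rc_gain(f_peak, TAU)
    print(f"{f_mhz:4.1f} MHz: relative gain {rel:.2f}")
```

In this idealized picture the shaper response at 7 MHz is already below its maximum, which is consistent with the statement that the observed peak reflects the whole module rather than the bare front-end.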

The susceptibility is highest for injection of DM noise at 1.25 V. This is understood to be due to the fact that the preamplifier is referenced to 1.25 V. A ripple on this power line leads to artificial noise injection, as indicated earlier. This has been proven experimentally with a modified silicon module, in which the bias ring was AC-coupled to 1.25 V instead of ground. This module showed very little sensitivity to injected noise. In the CMS Binary Chip, the pre-amplifier will be referenced to ground.

#### IV. DC-DC CONVERTERS FOR THE CMS TRACKER UPGRADE

#### *A. Pixel Upgrade for SLHC Phase-1*

The current pixel detector will be replaced for phase-1 with a larger device. The number of barrel layers will be increased from three to four, and the number of forward disks will grow from two to three per side. The number of readout chips per cable and power supply increases considerably, leading to larger supply currents and consequently higher voltage drops on supply cables. The possibility of upgrading the power supplies alone has been studied and found to be unfeasible. However, DC-DC converters with a conversion ratio around two could be used with only lightly modified power supplies. Buck converters would be installed on the pixel supply tube at a pseudorapidity of  $\approx 4$, i.e. outside the sensitive tracker region, where more space is available and the mass of the converter is not so critical. Due to the distance to the pixel modules on the one hand, and the fact that the readout ASICs are equipped with linear regulators on the other hand, a certain amount of conductive and radiative noise will be tolerable.

#### *B. Outer Tracker Upgrade for SLHC Phase-2*

The layout of the future outer tracker is under development. DC-DC buck converters are currently foreseen both for track trigger layers, where currents of several Amps per module and a high conversion ratio might be required, as well as for the less demanding readout layers. As the modules are being optimised for low mass, the space constraints are severe. Separate "power boards" carrying the converters seem most feasible and could be integrated on the module periphery or the support structure.



Figure 9: BCI results for a noise current of  $70 \text{ dB}\mu\text{A}$ , for DM (solid lines) and CM (dashed lines) at 1.25 V (black) and 2.5 V (grey/green). The noise of strip 512 is shown as a function of the noise frequency. The step width was 100 kHz up to 10 MHz and 1.0 MHz above.

#### V. SUMMARY

DC-DC buck converters based on a commercial, non-radiation-hard chip and small, light-weight air-core toroids have been developed. The noise performance has been studied extensively in system tests. In combination with  $\pi$-filters, which lead to an efficiency loss below 1 %, the boards can be operated across the whole allowed input voltage range without adding extra noise to the test system. The material budget of the AC2 converters amounts to 10 % of the material of a current strip module. Due to savings in cables and motherboards, about 8 % of material could be saved by using such converters (for an efficiency of 80 % and a conversion ratio of eight). Plans exist to use buck converters for the pixel detector already in phase-1 and in the outer tracker during phase-2. These studies will therefore be continued using custom radiation-hard converter ASICs.

#### **REFERENCES**

- [1] The CMS Collaboration, JINST 3 S08004, 2008.
- [2] G. Hall, TIPP09, CMS CR-2009/042, 2009.
- [3] N. Wermes *et al.*, Nucl. Instrum. Meth. A565 (113-118), 2006.
- [4] R. W. Erickson, *DC-DC Power Converters*, Wiley Encyclopedia of Electrical and Electronics Engineering, 2007.
- [5] K. Klein *et al.*, TWEPP-08, CERN-2008-008.
- [6] Enpirion, USA; http://www.enpirion.com/
- [7] IEEE Trans. Nucl. Sci. 53 (270-278) 2006.
- [8] M. Raymond *et al.*, LEB 2000, CERN-2000-010 (2000).
- [9] R. Jussen, Diploma Thesis, RWTH Aachen University, CMS TS-2009/009.
- [10] Linear Technology, USA; http://www.linear.com/
- [11] M. Raymond and G. Hall, TWEPP-08, CERN-2008-008.

## System Integration Issues of DC to DC converters in the sLHC Trackers

B. Allongue<sup>a</sup>, G. Blanchot<sup>a</sup>, F. Faccio<sup>a</sup>, C. Fuentes<sup>a,b</sup>, S. Michelis<sup>a,c</sup>, S. Orlandi<sup>a</sup>

<sup>a</sup>CERN, 1211 Geneva 23, Switzerland <sup>b</sup> UTFSM, Valparaiso, Chile <sup>c</sup>EPFL, Lausanne, Switzerland

georges.blanchot@cern.ch

#### *Abstract*

*The upgrade of the trackers at the sLHC experiments requires implementing new powering schemes that will provide an increased power density with reduced losses and material budget. A scheme based on buck and switched capacitors DC to DC converters has been proposed as an optimal solution. The buck converter is based on a power ASIC, connected to a custom made air core inductor. The arrangement of the parts and the board layout of the power module are designed to minimize the emissions of EMI in a compact volume, enabling its integration on the tracker modules and staves.*

#### I. POWERING TRACKERS AT THE SLHC

Today's high energy physics experiments at the LHC embed large and very sensitive front-end electronics systems that are usually powered remotely through long cables. The trackers, in the innermost region of the experiments, have the largest density of channels, which must be powered with minimal cable mass and with reduced heat dissipation to avoid complex and massive cooling systems.

With the upgrade of the accelerator and its physics experiments already being planned, the detectors will require an increased number of electronic readout channels, which will demand more power. This increase of delivered power should be achieved without the addition of material in the detector volume, because of the lack of physical space to run more cables and because material in this volume is detrimental to the physics performance of the detector. A solution to deliver more power without increasing the cable volume and mass relies on the distribution of power through on-detector DC–DC converters. These converters must be capable of reliable operation in the high radiation environment (total ionizing dose of 250 Mrad(SiO<sub>2</sub>) and neutron fluences of  $2.5 \times 10^{15}$ n/cm<sup>2</sup>, 1 MeV neutron equivalent, based on the simulated environment in the central tracker detector over its projected lifetime) and the strong DC magnetic field (up to 4 T) of the detector.

To be compatible with this harsh environment, the electronic devices need to be designed in specific technologies that have been qualified for the required doses and fluences. Together with the high degree of miniaturization required, this fact imposes the development of a custom ASIC for the implementation of the power controller and switches in a known, radiation qualified technology [\[1\].](#page-309-0)

The LHC tracker operates with magnetic fields up to 4 T to bend the particles, thus allowing their identification. The DC-DC converters will be exposed to this DC magnetic field. This forbids the use of conventional ferromagnetic cores, since they saturate at flux densities below 3 T. Coreless (air-core) inductors have to be used instead, limiting the accessible values of inductance to below 700 nH in order to maintain affordable size and mass [\[2\].](#page-309-1)

A comparative study indicated that the buck converter is one of the most suitable converter topologies for the intended application [\[3\].](#page-309-2) Given the range of available coreless inductors, the switching frequency has to be set beyond 1 MHz in order to limit the current ripple.
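The link between inductance, switching frequency and current ripple can be illustrated with the textbook expression for an ideal buck converter in continuous conduction mode, $\Delta I_{pp} = V_{out}(1 - V_{out}/V_{in})/(L f_{sw})$. The sketch below is only an estimate under that idealization. The 10 V input, 2.5 V output and the 700 nH upper bound are values quoted in this paper; the function name and the chosen switching frequencies are hypothetical.

```python
def buck_ripple_pp(v_in, v_out, inductance_h, f_switch_hz):
    """Peak-to-peak inductor current ripple (A) of an ideal buck converter in CCM."""
    duty = v_out / v_in
    return v_out * (1.0 - duty) / (inductance_h * f_switch_hz)


V_IN, V_OUT = 10.0, 2.5  # V, bus and intermediate voltages of the proposed scheme

for inductance in (200e-9, 700e-9):
    for f_sw in (1e6, 2e6, 4e6):  # hypothetical switching frequencies
        ripple = buck_ripple_pp(V_IN, V_OUT, inductance, f_sw)
        print(f"L = {inductance * 1e9:3.0f} nH, f = {f_sw / 1e6:.0f} MHz: "
              f"ripple = {ripple:.2f} A peak-to-peak")
```

With inductances limited to a few hundred nH, the ripple at 1 MHz reaches several amperes in this idealized estimate, which shows why the switching frequency cannot be chosen lower.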

A typical tracker front-end system is made of strip detectors that are bonded to front-end hybrid circuits. These hybrids are fitted with several front-end chips. Several hybrid and detector modules are then mounted together to form a stave [\[4\].](#page-309-3) Based on this and on the estimated power requirements of the hybrids, an optimal powering scheme based on DC-DC converters [\(Figure 1\)](#page-305-0) has been defined [\[3\],](#page-309-2) which relies on an input voltage bus (10 V) distributed along the stave to all the hybrids. Each hybrid circuit would be equipped with one buck DC/DC converter delivering an intermediate bus voltage (2.5 V) that brings the power to each front-end chip with a conversion efficiency of 80 %. Each front-end chip would then convert the intermediate voltage down to the levels that it requires (1.2 V and 0.9 V) through integrated switched-capacitor point-of-load DC/DC converters, whose efficiency is expected to be around 95 %.
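The consequence of the two-stage scheme for the overall efficiency and the bus current can be estimated with a few lines. This is a rough sketch: the stage efficiencies and the 10 V bus are taken from the text, while the function names, the 10 W load and the comparison with a hypothetical direct 1.2 V supply (ignoring the 0.9 V rail) are illustrative assumptions.

```python
def chain_efficiency(*stage_efficiencies):
    """Overall efficiency of conversion stages connected in series."""
    total = 1.0
    for eta in stage_efficiencies:
        total *= eta
    return total


def bus_current(load_power_w, bus_voltage_v, efficiency):
    """Current drawn from the bus to deliver load_power_w to the front-end."""
    return load_power_w / (efficiency * bus_voltage_v)


eta_total = chain_efficiency(0.80, 0.95)  # buck stage, then switched-capacitor stage
load_power = 10.0                         # W per hybrid, hypothetical
i_bus = bus_current(load_power, 10.0, eta_total)
i_direct = load_power / 1.2               # hypothetical direct powering at 1.2 V

print(f"overall efficiency: {eta_total:.2f}")
print(f"bus current at 10 V: {i_bus:.2f} A (direct at 1.2 V: {i_direct:.2f} A)")
print(f"resistive cable loss ratio (direct / DC-DC): {(i_direct / i_bus) ** 2:.0f}")
```

The overall efficiency of about 76 % is the price paid for a bus current several times smaller than with direct powering, which is what allows the reduction of cable material.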



<span id="page-305-0"></span>Figure 1: Powering topology.

Beyond the environmental constraints imposed on this powering scheme, the electromagnetic compatibility between the tracker electronics and the DC-DC converter used to power it is essential. The sLHC tracker powered from DC/DC converters in close proximity to the front-end electronics must be able to achieve levels of performance equivalent to those obtained when using remote, regulated power supplies in the present system. The proximity of switching converters to the strips and front-end ASICs (less than 5 cm) exposes the front-end electronics to conducted and radiated couplings that could compromise the tracker performance. The compatibility can be achieved by appropriate design of the converter, together with an adequate integration in the front-end system. In order to succeed, the susceptibility of the front-end system to conducted and radiated noise needs to be explored. On the other hand, the conducted and radiated noise properties of the converters need to be characterized in a standard manner, enabling their EMC optimization for the targeted system.

#### II. RADIATED COUPLINGS AND INDUCTORS

Some preliminary system tests have revealed the sensitivity of the hybrid modules to radiated magnetic fields [\[5\].](#page-309-4) Several sources of magnetic noise emissions can be identified in a buck converter: the top side switch current, the low side switch current, and the output filter inductor current.



<span id="page-306-0"></span>Figure 2: current in buck output inductor.

The variation of current in the inductor [\(Figure 2\)](#page-306-0) results in a radiated magnetic field whose magnitude and direction depend strongly on the inductor topology. Three types of air-core inductor topologies (200 nH) have been characterized: air core solenoid, air core toroid and flat PCB toroid [\(Figure 3\)](#page-306-1). Appropriate shielding options were explored as well.



<span id="page-306-1"></span>Figure 3: solenoid (left), air core toroid (center), PCB toroid (right).

The magnetic field radiated by these inductors, driven with an RF source of 1.55 MHz at 0.9 A (peak), was measured up to distances of 10 cm in steps of 1 cm, using a calibrated magnetic field probe.

The solenoid, which is the most commonly available topology, emits the full magnetic field along its axis. The field surrounds the coil, which leads to the largest radiated emissions [\(Figure 4\)](#page-306-2). The addition of a shield aiming to attenuate the main magnetic field would result in a reduction of the inductance that could only be compensated by a larger number of loops, hence more material [\[2\].](#page-309-1)

The toroidal topology allows enclosing the main magnetic field (that sets the inductance value) inside the coil volume. A parasitic field is still emitted through the central hole of the toroid, as a result of the current flowing along the toroid loop [\(Figure 3\)](#page-306-1). This parasitic field is equivalent to that of a single loop turn with the diameter of the central hole of the toroid. This topology enables the introduction of a shield without significantly reducing the inductance value. The radiated emissions of this topology are 25 dB lower than those of the equivalent solenoid [\(Figure 4\)](#page-306-2). The addition of a shield brings a further reduction of 5 to 10 dB.
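The single-loop equivalence can be made quantitative with the textbook on-axis field of a circular current loop, $B(z) = \mu_0 I R^2 / (2 (R^2 + z^2)^{3/2})$. The sketch below is a quasi-static estimate only: the drive current of 0.9 A and the 1 cm to 10 cm distance range are those of the measurement described above, whereas the hole radius of 2 mm and the function names are hypothetical, and the probe position in the actual measurement is not necessarily on the loop axis.

```python
import math

MU_0 = 4.0e-7 * math.pi  # T*m/A, vacuum permeability


def loop_axis_field(current_a, radius_m, distance_m):
    """On-axis flux density (T) of a single circular current loop."""
    return MU_0 * current_a * radius_m**2 / (2.0 * (radius_m**2 + distance_m**2) ** 1.5)


def ratio_db(field, reference):
    """Field ratio expressed in dB."""
    return 20.0 * math.log10(field / reference)


CURRENT = 0.9       # A peak, drive current quoted in the text
HOLE_RADIUS = 2e-3  # m, hypothetical radius of the central hole

b_ref = loop_axis_field(CURRENT, HOLE_RADIUS, 0.01)
for cm in range(1, 11):
    b = loop_axis_field(CURRENT, HOLE_RADIUS, cm / 100.0)
    print(f"{cm:2d} cm: {b * 1e9:8.2f} nT ({ratio_db(b, b_ref):6.1f} dB re 1 cm)")
```

At distances large compared with the loop radius the field falls off with the third power of the distance, i.e. by about 60 dB per decade, and it scales with the loop area, which is why a small central hole keeps the parasitic emission low.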



<span id="page-306-2"></span>Figure 4: Magnetic field radiated by the inductors.

The third topology explored is a printed circuit board toroidal inductor (3.2 mm high, 15 mm diameter). In order to obtain the required inductance, its flat geometry must be compensated with a larger diameter (and area), which results in a non-negligible radiated field in its unshielded version. The addition of a copper shield (35 µm copper) wrapped around the inductor board reduces the magnetic emission down to a level that is comparable with that of the shielded air core toroid.

The connection pins of the inductors actually form an additional loop that gives rise to a magnetic field emission whose amplitude is comparable with the field emitted by the coil itself. Placing these pins as close together as possible reduces the loop area and hence the radiated field. In addition, the shield of the coil can be extended to the pins as well, achieving in this manner the lowest emission of magnetic field [\(Figure 4\)](#page-306-2).

#### III. BOARD LAYOUT ISSUES

The inductor is not the only source of noise emitted by the DC/DC converter. The currents flowing on the board tracks and the voltage waveforms give rise to couplings to the surrounding components of the system, and within the converter itself. This noise becomes visible in the form of common mode (CM) and differential mode (DM) currents that are conducted on input and output ports. The CM and DM currents are measured on a reference test stand [\[6\]](#page-309-5) in the frequency domain with calibrated probes connected to an EMI receiver; they are compared with reference levels.

To explore the impact of the board design on the resulting conducted noise, two DC/DC converter prototypes built on the basis of the same schematic are compared. The two converters used a radiation tolerant buck converter ASIC prototype (AMIS2) [\[7\]](#page-309-6) that integrated the switches and the controller. A shielded external PCB inductor was mounted on top of the boards to provide the main filter [\(Figure 5a](#page-307-0) and 5b).



Figure 5: Converter prototypes: separated input and output on V1 (a); top (b) and bottom (d) PCB inductor mounting on V2; solenoid inductor on V2 (c).

#### <span id="page-307-0"></span>*A. Board Layout.*

The mitigation of the radiated magnetic field is achieved with the reduction of the current loop areas, while the mitigation of the electric field emission is obtained through the reduction of copper areas that are subject to fast voltage transitions. However these basic guidelines find their limits in the choices made during the placement of the components and connectors.

The first converter (V1) features a physical segregation between the input and the output ports that leads to a non negligible ground inductance between them. This ground path carries switched power currents, resulting in a common mode voltage between the input and the output ports. This configuration develops common mode currents that are 25 dB above those developed by the second board [\(Figure 6,](#page-307-1) top). By placing the input and output connectors close together instead (V2), the ground inductance, and hence the CM voltage and currents, are significantly reduced.



<span id="page-307-1"></span>Figure 6: CM (top) and DM (bottom) noise for V1 (left) and V2 (right) layouts with top mounted PCB inductors.

However, the reduced distance between the input and output blocks increases the magnetic coupling between the input and output filter coils. As a result of this, the second board (V2) develops larger DM noise (up to 10 dB, [Figure](#page-307-1) 6, bottom).



<span id="page-307-2"></span>Figure 7: DM noise of prototype V2 with top (left) and bottom (right) mounted PCB inductor.

#### *B. Inductor Placement.*

Because it is a source of magnetic field emission, the inductor couples some noise currents onto the board that hosts it. An appropriate position of the inductor that minimizes the couplings with the components and loops of the PCB would reduce the levels of conducted and radiated noise. The conducted noise was measured on the second converter (V2) with the PCB inductor mounted on the top and on the bottom positions of the board. Moving the shielded PCB inductor from the top to the bottom side of the converter provided a reduction of the DM currents by up to 10 dB beyond 3 MHz [\(Figure 7\)](#page-307-2): in this position, the converter ground plane acts as a shield against the couplings between the coil and the other components. Attenuation is also observed in the CM noise at the switching frequency and its first harmonic.

#### *C. Inductor Type.*

The strength of the magnetic coupling between the inductor and the other components depends on the magnetic field lines radiated by the coil, and on the distance and the direction with respect to the other parts. The CM current of the second prototype was compared when using the PCB shielded inductor or the unshielded solenoid. With an appropriate placement, the latter induces slightly less CM and DM noise (< 6 dB of difference) because its reduced size allows some distance to be kept between the filters and the connectors, while at the same time orienting the magnetic axis perpendicularly to the filters. However, it was already seen that the solenoid actually radiates 40 dB more magnetic field towards the detector than the shielded PCB inductor.

#### IV. SUSCEPTIBILITY OF MODULES

The optimization of the noise performance of a front-end system is achieved by means of:

- The mitigation of the noise sources, for instance from the DC/DC converter.
- The improvement of the immunity of the system against these noise sources.

Independently of the powering scheme used, the noise performance of a front-end system can be improved significantly through appropriate layout choices. Uncontrolled powering loops, exposed preamplifier inputs or inadequate pin assignments on connectors and ASICs can radically compromise a system: the sensitive areas need therefore to be identified so that the powering device can be tuned to mitigate the coupling of critical noise frequencies and also to allow for system layout corrections.

#### *A. Susceptibility of Hybrids.*

The noise susceptibility of two versions of the hybrid prototypes for the ATLAS Short Strip Tracker (SST) has been explored when powering them with DC/DC converter prototypes. These hybrids incorporate twenty ABCn front-end chips that can process 128 input strip channels each; however this setup was not bonded to any strip detector, enabling the study of the noise susceptibility of the hybrids exclusively.

The first hybrid circuit (LPL) required 2.5 V at 4.5 A from one converter to power the front-end chips; the power for the analogue circuitry of the ABCn chips was derived from low dropout linear regulators that are embedded in the front-end ASICs. The second hybrid circuit (KEK) required two converters, one delivering 2.5 V for the digital section, the other one delivering 2.2 V for the analogue section of the front-end ASICs.

The gain of every input channel was first calibrated using the reference charge injection circuit of the ASICs. Then a threshold scan with a reference input charge of 2 fC was performed to obtain individual S-curves. The RMS parameter of the fitted curve is then divided by the calibrated gain to obtain the equivalent noise charge (ENC) of the channel.
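The extraction of the ENC from an S-curve can be sketched as follows. This is a simplified illustration, not the analysis used for the measurements: the S-curve is modelled as a complementary error function, the fit is replaced by a plain interpolation of the thresholds at which the occupancy crosses 84.13 % and 15.87 % (one sigma either side of the 50 % point), and the result is divided by the gain. All function names and the synthetic scan parameters are hypothetical; the gain of 110 mV/fC is the value quoted later for the equalized module and is used here only for illustration.

```python
import math

ELECTRONS_PER_FC = 1e-15 / 1.602176634e-19  # about 6241.5 electrons in 1 fC


def s_curve(threshold_mv, v50_mv, sigma_mv):
    """Ideal hit occupancy versus threshold for Gaussian noise."""
    return 0.5 * math.erfc((threshold_mv - v50_mv) / (math.sqrt(2.0) * sigma_mv))


def crossing(thresholds, occupancies, level):
    """Linearly interpolate the threshold where the falling curve crosses level."""
    points = list(zip(thresholds, occupancies))
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        if y0 >= level >= y1 and y0 != y1:
            return t0 + (y0 - level) * (t1 - t0) / (y0 - y1)
    raise ValueError("level not crossed within the scan range")


def sigma_from_scan(thresholds, occupancies):
    """Estimate the noise sigma as half the width between the one-sigma levels."""
    upper = 0.5 * math.erfc(-1.0 / math.sqrt(2.0))  # occupancy at v50 - sigma
    lower = 0.5 * math.erfc(1.0 / math.sqrt(2.0))   # occupancy at v50 + sigma
    return 0.5 * (crossing(thresholds, occupancies, lower)
                  - crossing(thresholds, occupancies, upper))


def enc_electrons(sigma_mv, gain_mv_per_fc):
    """Convert an S-curve width in mV into an equivalent noise charge in electrons."""
    return sigma_mv / gain_mv_per_fc * ELECTRONS_PER_FC


# Synthetic scan with hypothetical parameters: 50 % point at 220 mV, sigma of 7 mV.
thresholds = [0.5 * k for k in range(801)]  # 0 to 400 mV in 0.5 mV steps
occupancies = [s_curve(t, v50_mv=220.0, sigma_mv=7.0) for t in thresholds]

sigma = sigma_from_scan(thresholds, occupancies)
print(f"sigma = {sigma:.2f} mV, ENC = {enc_electrons(sigma, 110.0):.0f} electrons")
```

In a real analysis a least-squares fit of the error function to the measured occupancies would normally be preferred, since it uses all scan points and is less sensitive to statistical fluctuations than the two-point width used here.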

For both circuits, the ENC distribution was not degraded when powering them with the DC/DC converters, in comparison with the distribution obtained using linear power supplies [\(Table 1\)](#page-308-0). Furthermore, the ENC distribution of the KEK hybrid obtained with two DC/DC converters placed directly on top of the ASICs did not reveal any noise degradation either [\(Figure 8,](#page-308-1) a). This demonstrates the full compatibility of the hybrid circuit with DC/DC converters, even when those are in close proximity to the ASICs.

<span id="page-308-0"></span>

| ENC at 2 fC (electrons) | Average, Row 0 | Average, Row 1 | RMS, Row 0 | RMS, Row 1 |
|-------------------------|----------------|----------------|------------|------------|
| LPL Linear PS           | 392.3          | 390.8          | 27.5       | 27.7       |
| LPL with DC/DC          | 392.6          | 390.9          | 27.0       | 27.9       |
| KEK Linear PS           | 388.1          | 390.0          | 26.4       | 27.6       |
| KEK with DC/DC          | 386.1          | 387.0          | 27.4       | 26.0       |
| KEK, DC/DC on ASICs     | 386.4          | 388.0          | 27.8       | 26.3       |

**Table 1:** Noise on hybrids without strips.

## *B. Tests with Strips.*

Similar measurements were carried out on a similar setup [\[8\],](#page-309-7) using one LPL hybrid bonded to a strip detector [\(Figure 8,](#page-308-1) b, c, d). Here, the gain calibration is followed by the gain equalization of all channels. A threshold scan is then performed on every channel without injection of a test charge, and the resulting S-curves are fitted to obtain the threshold voltage and the RMS parameter. The equalized gains were measured to be about 110 mV/fC, enabling the scaling of the fitted RMS voltages into ENC. The measurement was carried out under three different conditions and the results were compared with those obtained with a linear power supply. First, the hybrid was powered with the DC/DC converter using a 40 cm long cable [\(Figure 8,](#page-308-1) b). Afterwards, the converter was moved to within 5 cm of the side of the hybrid [\(Figure 8,](#page-308-1) c), and finally the converter was moved as close as possible to one of the rows of ASICs (row 1) with the inductor facing the strips (less than 2 cm, [Figure 8,](#page-308-1) d).

The capacitance of the strip detector increases the reference noise obtained with the linear power supply, which reaches around 550 electrons. The measurements performed with the DC/DC converter at distances of 40 cm and 5 cm do not show any significant deviation with respect to the reference values. The only noise degradation is observed on row 1 when the converter is facing it at a distance of less than 2 cm [\(Figure 9\)](#page-308-2). Even in this configuration, the neighbouring row appears to be insensitive to the field radiated by the converter and by the inductor [\(Table 2\)](#page-308-3).



<span id="page-308-1"></span>Figure 8: system tests on a KEK hybrid without strip (a), and on an LPL module at far (b), close (c) and edge (d) positions of the converter.

**Table 2:** Noise from hybrids with strips.

<span id="page-308-3"></span>

| ENC at 0 fC (electrons) | Average, Row 0 | Average, Row 1 | RMS, Row 0 | RMS, Row 1 |
|-------------------------|----------------|----------------|------------|------------|
| Reference (Linear PS)   | 579            | 574            | 23.2       | 18.2       |
| Far (40 cm)             | 558            | 557            | 22.7       | 17.0       |
| Close (5 cm)            | 557            | 556            | 21.6       | 17.0       |
| Edge (< 2 cm)           | 574            | 716            | 21.0       | 134.0      |
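As a rough illustration of the size of the degradation in Table 2, the sketch below estimates the noise contribution that, added in quadrature to the reference noise, would give the average ENC seen on row 1 in the edge position. Quadrature addition assumes that the picked-up noise is uncorrelated with the intrinsic front-end noise; this assumption and the function name are not taken from the measurements themselves.

```python
import math


def excess_noise(total_enc, reference_enc):
    """Noise that, added in quadrature to the reference, gives the total."""
    if total_enc < reference_enc:
        return 0.0
    return math.sqrt(total_enc ** 2 - reference_enc ** 2)


# Average ENC values for row 1 taken from Table 2 (electrons).
reference_row1 = 574.0   # linear power supply
edge_row1 = 716.0        # converter at less than 2 cm

added = excess_noise(edge_row1, reference_row1)
print(f"Excess noise on row 1: about {added:.0f} electrons")
```

For the far and close positions the measured averages lie slightly below the reference, so the function returns zero and no excess can be attributed to the converter.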



<span id="page-308-2"></span>Figure 9: noise distribution of LPL module on row 1 using a linear power supply (left) and a DC/DC converter facing the row 1 strips and bondings (right). No effect was observed on row 0.

A more detailed analysis of the S-curve parameters reveals the magnetic coupling onto the input connections of the front-end ASICs and possibly onto the strips themselves. Indeed, the input pads of the ABCn are arranged in a way that requires the bonds to the strips to be stacked on two layers, resulting in different loop areas between the bond wires and the ground plane [\(Figure 10\)](#page-309-8). These alternating pick-up loop areas result in an alternating noise pattern that becomes visible when the S-curve RMS is plotted as a function of channel number [\(Figure 11\)](#page-309-9).



<span id="page-309-8"></span>Figure 10: Bonding pattern of the ABCn chips



<span id="page-309-9"></span>Figure 11: Noise pattern of the ABCn channels.

## V. CONCLUSIONS

The upgrade of the trackers at LHC requires new powering solutions to be explored. To cope with the increased power demand, a front-end power conversion system will be required, introducing new challenges in maintaining the required noise performance of the front-end systems. The proposed powering scheme based on DC to DC converters would enable a very efficient distribution of power. The converters would be located on the front-end modules, exposing these to new noise sources.

The inductor is a dominant source of magnetic field in the converter. The comparison of the inductors and their different shielding options allows the solenoid geometry to be excluded in favour of the toroidal topology. Although it is difficult to manufacture, the shielded PCB toroid has shown the lowest emission of magnetic field. As a second option, the air core toroid also provides a good reduction of the emissions. In both cases, care must be taken to minimize the connection loop between the inductor and the board, which was found to be a non-negligible magnetic field emitter; shielding this loop was found to bring a significant reduction of the emitted field.

The board layout also determines the noise emission of the converter, in particular the conducted emissions. The lowest common mode current is obtained by reducing the inductance between the input and the output, that is, by placing the connectors close together. Proximity couplings between the filters, and in particular with the coil, should be avoided by means of a careful orientation and by respecting minimal distances between them. If a large PCB inductor is used, it should preferably be placed at the bottom of the board, benefiting from the ground plane as a shield.

The compatibility between an optimized and unshielded DC to DC converter prototype that used discrete components and a shielded inductor, and a front-end hybrid prototype that used the ABCn ASICs, was explored. The tested front-end system was found to be sensitive to magnetic couplings from the DC-DC converter at the inputs of the front-end chips and possibly at the strips, within distances of 2 cm. No susceptibility was observed on the hybrids themselves or at distances beyond 5 cm. To achieve compatibility between the converter and the system, a careful layout of the interface between the strips and the input channels is required, together with adequate interconnection technologies. Also, the magnetic field emitted by the converters has to be minimized.

Given this, the powering of new front-end systems appears to be possible using custom DC to DC converters that use magnetic field tolerant inductors. Proper layout of the hybrid and the use of appropriate interconnection technologies that minimize the pick-up loops at the front-end inputs will ensure compatibility with compact custom DC to DC converters specifically designed for this application.

## VI. REFERENCES

- <span id="page-309-0"></span>[1] F. Faccio et al., "TID and displacement damage effects in Vertical and Lateral Power MOSFETs for integrated DC-DC converters", Proc. of RADECS 2009.
- <span id="page-309-1"></span>[2] S. Orlandi et al., "Optimization of shielded PCB aircore toroids for high efficiency dc-dc converters", Proc. of ECCE 2009.
- <span id="page-309-2"></span>[3] F. Faccio et al., "Custom DC-DC converters for distributing power in SLHC trackers", Proc. of TWEPP 2008.
- <span id="page-309-3"></span>[4] F. Farthouat et al., "Readout architecture of the ATLAS upgraded tracker", Proc. of TWEPP 2008.
- <span id="page-309-4"></span>[5] K. Klein et al., "System tests with DC-DC converters for the CMS silicon strip tracker at SLHC", Proc. of TWEPP 2008.
- <span id="page-309-5"></span>[6] G. Blanchot et al., "Characterization of the noise properties of DC to DC converters for the sLHC", Proc. of TWEPP 2008.
- <span id="page-309-6"></span>[7] S. Michelis et al., "ASIC buck converter prototypes for LHC upgrades", Proc. of TWEPP 2009.
- <span id="page-309-7"></span>[8] A. Greenhall, "Prototype flex hybrid and module designs for the ATLAS Inner Detector Upgrade utilising the ABCN-25 readout chip and Hamamatsu large area Silicon sensors", Proc. of TWEPP 2009.

# Performance and Comparison of Custom Serial Powering Regulators and Architectures for SLHC Silicon Trackers

T. Tic<sup>ab</sup>, P. W. Phillips<sup>a</sup>, M. Weber<sup>a</sup>

<sup>a</sup> STFC RAL, <sup>b</sup> IoP ASCR

Tomas.Tic@cern.ch

## *Abstract*

Serial powering is an elegant solution to power the SLHC inner trackers with a minimum volume of cables. Previously, R&D on the serial powering of silicon strip detector modules had been based on discrete commercial electronics, but with the delivery of the ATLAS Binary Chip Next chip in 0.25 micron CMOS technology (ABCN-25) and the Serial Powering Interface chip (SPi), custom elements of shunt regulators and transistors became available. These ASICs can be used to implement three complementary serial powering architectures. The features of these schemes and their performance with 10 and 20 chip ABCN-25 hybrids will be presented.

## I. INTRODUCTION TO SERIAL POWERING

The following subsections will introduce serial powering in the context of experiments on SLHC.

#### *A. A problem and a solution*

In the current ATLAS experiment at CERN's Large Hadron Collider (LHC) the SemiConductor Tracker (SCT) comprises 4088 detector modules, each powered by its own power supplies through its own cable. The overall mass of the electrical services is significant and, since the path between the detector modules and their radiation intolerant power supplies is long, power losses in the cables are also significant. One of the most important differences between LHC and its upgraded form, super-LHC (sLHC), is the ten times higher projected beam luminosity, resulting in much higher particle hit occupancy. There are few options for decreasing the number of ghost hits in a micro-strip silicon tracker with binary readout other than making the strips shorter. This in turn results in a much higher number of readout hybrids. With the current power distribution scheme the mass and volume of the electrical services, and the power dissipated in them, would be simply unbearable.

The problem sounds similar to domestic power distribution. Currently we are adjusting the power plant output voltages so that independently powered households receive their 230 volts. Some sort of voltage vs. current trade-off, with power management moved closer to the detector readout hybrids, will be required at the SLHC in order to keep the services volume and power losses bearable. The comparison with household power distribution is not chosen at random, as an alternative to serial powering would be to employ DC/DC converters, similar to the way in which households employ transformers for AC/AC conversion. Serial powering is less conventional in this respect, and reminds us of occasionally troublesome Christmas tree lights, but in the case of particle detectors it provides a viable solution to the power distribution problem.

## *B. Serial Powering - System Overview*

The voltage vs. current trade-off is very simple with serial powering. A number of detector readout hybrids are connected in series and supplied from a single current source as shown in Figure 1. Several tens of hybrids can easily be connected in this manner. The voltage on each hybrid is regulated locally using one of the shunt regulators, the main topic of this talk. The underlying concept is that the hybrids are electrically similar, drawing similar currents, so the current overhead needed to ensure that the correct voltage may be maintained on each hybrid in the chain can be kept very low. Only one voltage is obtained directly from the shunt and any other required voltages must either be obtained from it or provided separately. High voltage for biasing the sensor shall most likely be provided common to two detector hybrids (as they will sit on the same sensor). Even though the hybrids in a serially powered chain sit at different potentials with respect to ground, communication to and from the outside world does not require especially advanced coupling. Protection circuitry is also required to maintain the integrity of the chain in the event of open loop failures. Just a few '0402' sized capacitors and perhaps 15 mm<sup>2</sup> of silicon can be the total mass overhead of serial powering.
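The voltage vs. current trade-off can be made concrete with a small sketch comparing cable losses for independent and serial powering. All numerical values (24 hybrids, 2.5 V and 4 A per hybrid, 2 Ω of cable resistance, 10 % current overhead) are illustrative assumptions and are not taken from the measurements reported here; the sketch also assumes that every cable has the same resistance.

```python
def independent_cable_loss(n_hybrids, hybrid_current_a, cable_resistance_ohm):
    """Total loss when every hybrid has its own cable carrying its own current."""
    return n_hybrids * hybrid_current_a ** 2 * cable_resistance_ohm


def serial_chain(n_hybrids, hybrid_voltage_v, hybrid_current_a,
                 cable_resistance_ohm, current_overhead=0.1):
    """Chain current, source voltage and cable loss for one serially powered chain."""
    chain_current = hybrid_current_a * (1.0 + current_overhead)
    cable_loss = chain_current ** 2 * cable_resistance_ohm
    source_voltage = n_hybrids * hybrid_voltage_v + chain_current * cable_resistance_ohm
    return chain_current, source_voltage, cable_loss


# Illustrative values only (not taken from the measurements in this paper).
N, V_HYBRID, I_HYBRID, R_CABLE = 24, 2.5, 4.0, 2.0

loss_independent = independent_cable_loss(N, I_HYBRID, R_CABLE)
chain_current, source_voltage, loss_serial = serial_chain(N, V_HYBRID, I_HYBRID, R_CABLE)
print(f"independent: {loss_independent:.0f} W, serial: {loss_serial:.1f} W "
      f"at {source_voltage:.1f} V and {chain_current:.2f} A")
```

Under these assumptions the cable loss falls roughly in proportion to the number of hybrids in the chain, at the cost of a higher source voltage and a small current overhead for the shunt regulators.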



Figure 1: System overview with serial powering [3]

#### *C. Serially powered chain of detector hybrids*

In order to look at the chain of serially powered detector hybrids a little more analytically, it is convenient to introduce a few simplifying conditions such as linearity, time invariance and, for example, a Norton representation of the power supply. These are acceptable conditions in the small signal region, and the model of the chain then reduces to its dynamic impedances as illustrated in Figure 2.



Figure 2: A Generic Chain of Serially Powered Devices

It is useful to quantify how much of a small signal voltage generated on one hybrid in a serially powered chain is transferred onto the others ($A_{V_{noise}}$). This coupling is described by Formula 1.

$$
A_{V_{noise}} = \frac{Z(\omega)}{(n-1) \cdot Z(\omega) + Z_0(\omega)}\tag{1}
$$

One can immediately see that the choice of a current source not only simplifies the DC conditions of the chain but also prevents individual detector hybrids from "seeing" each other. The impedance of the detector hybrid should be as low as possible and the output impedance of the current source should be as high as possible. In the low frequency range this is important to prevent oscillation modes in the chain, while in the high frequency range it is important to minimize the spreading of any noise to which the detector system may be sensitive. Two important remarks should be made at this point. Firstly, no oscillation modes have ever been observed during our studies, even with obsolete shunt regulator designs and current limited laboratory power supplies. Secondly, the circuit designs of the shunt regulators and current source matter only up to a few MHz; above this frequency the impedances are dominated by other factors such as the hybrid layout, decoupling and power cable impedances.
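Formula 1 can be evaluated numerically to see the effect of the source impedance. The sketch below assumes that $Z$ is the dynamic impedance of one hybrid, $Z_0$ the output impedance of the source and $n$ the number of hybrids in the chain; the impedance values are hypothetical and frequency independent, chosen only to contrast a current source with an ideal voltage source.

```python
def noise_coupling(z_hybrid, z_source, n_hybrids):
    """Formula 1: fraction of a voltage on one hybrid seen on each other hybrid."""
    return z_hybrid / ((n_hybrids - 1) * z_hybrid + z_source)


# Illustrative small-signal impedances in ohms (hypothetical values).
Z_HYBRID = 0.1 + 0.0j
N_HYBRIDS = 30

for label, z_source in (("current source", 10e3 + 0.0j),
                        ("voltage source", 0.0 + 0.0j)):
    gain = abs(noise_coupling(Z_HYBRID, z_source, N_HYBRIDS))
    print(f"{label}: |A_V| = {gain:.2e}")
```

With a high source impedance the coupling is suppressed by several orders of magnitude, whereas with a zero-impedance source the disturbance is simply divided among the remaining hybrids of the chain.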

#### II. SERIAL POWERING AND ABCX CHIPS

This section will only present the information useful for further elaboration in the text.

#### *A. Serial Powering and ABCD chip*

To give a "proof of principle" result, several serially powered staves were constructed using the ABCD chip (the chip used in the current ATLAS SemiConductor Tracker). The largest of these staves operated a chain of 30 hybrids using serial powering interface boards such as the one shown in Figure 3, based on commercial components. All staves were successful and any initial worries about the concept of serial powering were dispelled.



Figure 3: Serial Powering Interface Board, comprising a shunt regulator and AC coupled LVDS buffers for communication.

#### *B. Description of the current ABCn chip/hybrid*

The ABCN-25 readout chip incorporates several functional blocks for serial powering which will now be discussed in detail. The chip requires digital and analogue supply voltages. The nominal digital voltage (2.5 V) is only slightly higher than the nominal analogue voltage (2.2 V); hence a linear regulator is present on the chip to derive the analogue voltage from the digital voltage. When considering the performance of a shunt regulator it should be taken into account that the overall power supply rejection ratio (the PSRR of the linear regulator convoluted with that of the analogue front end) of the ABCN-25 chip is rather large. If any increase in equivalent noise charge is observed, the disturbance to the digital voltage must be enormous. The likely cause would then be that the digital voltage is too low, with fast changes.

The current consumption of the ABCN-25 chip was expected to be largely dependent on whether the clock signals were present and upon how its internal registers had been configured. Only a minimal, short increase in digital current consumption was expected each time a L1A trigger is received by the chip, corresponding to increased switching activity as the readout cycle begins. Under such circumstances the task of a shunt regulator would be to maintain the correct DC voltage and to provide some small signal filtering, as large changes in current should not occur.

Unfortunately, in this version of the chip the digital current consumption exhibits strong variations occurring long after (∼400 µs) the L1 accept. The size of this current "bump" depends strongly and monotonically on the discriminator threshold and weakly on many other parameters, but it is always there. If another L1 accept arrives before the previous "bump" has ended, the whole process resets and the current consumption drops very quickly to the original level.

This effect produces very sharp, large, time structured peaks in the current consumption of the chip (up to 1.4 A for a 20 chip hybrid, but also observed using single chip ABCN-25 test PCBs).

This is a much more demanding test for a shunt regulator than the expected rare and accidental "worst case" of turning the clock off. Some examples of this are shown in Figure 4. Since the L1 accept trigger rate is much higher than 1 per 400 µs, this effect does not pose any problems during standard data taking. Efforts to understand this effect continue. It is likely to disappear with the next version of the chip.



Figure 4: The Current Bump. Red triangles indicate L1A triggers. Left: The bump after single trigger at threshold set to zero; Middle: The bump after single trigger at threshold set to 200; Right: Multiple triggers. The scales are: time - 1 ms/div, digital voltage - 200 mV/div, digital current - 500 mA/div

#### III. THE THREE SHUNT REGULATORS

In this section the three main shunt regulator architectures will be presented. Each is now available in fully custom circuitry. The two distributed options are named after their designers. The 'M' scheme, designed by Mitch Newcomer, is a distributed shunt with external feedback. The 'W' scheme, designed by Wladyslaw Dabrowski, is a distributed shunt with internal feedback. The stand-alone shunt regulator option employs the Serial Powering Interface chip (SPi), a chip which provides additional functionality as will be discussed later. The key differences between the schemes are emphasized in Figure 5.



Figure 5: The Three Shunt Regulators Schemes [3]

#### *A. 'M'*

As can be seen in Figure 5 this scheme consists of two parts. The shunt transistor-like components are integrated in the ABCN-25 chip while the control scheme is to be somewhere else on the hybrid. First, let's have a look at what is inside of each ABCN-25 chip. There are two sets of current mirrors which can be seen in the schematic shown in Figure 6.



Figure 6: The shunt device present in ABCN-25 (two per chip) - current mirrors [2]

The two sets can be driven separately from a current limited dual output op-amp for improved reliability. This choice of shunting element has many advantages. The input capacitance is small. Charge injected into the control node of the chip is not directly transferred onto the digital voltage. Small changes in the control voltage result in small changes in shunting current; this addresses the matching problem caused by voltage drops across the hybrid, and it means that small noise on the control bus can result only in small noise on the digital voltage. The linearity is very high when all the transistors are in the strong inversion region. Finally, the current mirrors are very fast and can go from zero to full current (∼140 mA) in less than 50 ns. The shunt element is therefore very well suited to its task, and how it is used can be decided later. The two shunts require very little silicon area and the scheme can be tuned without a new submission of the ABCN-25 chip. The digital voltage on the hybrid with respect to the control voltage exhibits all the properties of a plant (process gain, time constant, dead time), so it is no surprise that the basic control scheme (by M.N.) shown in Figure 7 (left) is reminiscent of an analogue PID regulator.
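The three plant properties named above correspond to the textbook first-order-plus-dead-time model. The sketch below evaluates the open-loop step response of such a model; it is not a model of the actual control circuit. The time constant is the ∼17 µs quoted for the hybrid in this section, while the normalised gain and the 1 µs dead time are hypothetical values.

```python
import math


def fopdt_step(t_s, gain, tau_s, dead_time_s, step=1.0):
    """Response of a first-order-plus-dead-time plant to a step applied at t = 0."""
    if t_s < dead_time_s:
        return 0.0
    return gain * step * (1.0 - math.exp(-(t_s - dead_time_s) / tau_s))


TAU = 17e-6        # time constant quoted for the hybrid in this section
GAIN = 1.0         # hypothetical process gain (normalised)
DEAD_TIME = 1e-6   # hypothetical dead time

for t_us in (0, 1, 18, 35, 86):
    value = fopdt_step(t_us * 1e-6, GAIN, TAU, DEAD_TIME)
    print(f"t = {t_us:3d} us: response = {value:.3f}")
```

One time constant after the dead time has elapsed the response reaches about 63 % of its final value, which is the usual way of reading a time constant from a measured step response.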



Figure 7: Left: Basic control circuit for the 'M' scheme, Right: Photograph of the current implementation of the control scheme with the hybrid

Intuitively, it can be seen that at higher frequencies the transfer function of the circuit tends to the value set by the two resistors in the negative feedback path. Figure 7 (right) shows the current implementation of the control scheme with the hybrid. It makes no sense to fine-tune the circuit at this stage, as the dead time in particular depends on the position of the control circuit with respect to the hybrid. The next iteration of the Liverpool hybrid will have the control scheme incorporated. The 'M' architecture does, however, already provide excellent results. Figure 8 shows the transient response of the system and its control voltage to a step in input current. It can be seen that the control voltage reacts almost immediately and there is no overshoot on the digital voltage. The time constant of the hybrid is ∼17 µs.

Figure 8: Transient response of the system in 'M' scheme (digital and control voltages as measured at the control circuit) to a step in current

This is extremely encouraging considering the improvised nature of the connection between the control circuit and the hybrid as used for these tests. If there is sufficient current in the chain, the bumps and slopes of the hybrid current consumption no longer appear on the digital voltage rails, as seen in oscilloscope traces. Even if there is not enough current to cover the bumps, the superior step response of the circuit "softens" the digital voltage time profile so that no increase in ENC (equivalent noise charge) is observed. Figure 9 shows a typical ENC chart for a hybrid operated using this scheme. The ENC value is just below 400 electrons, in good agreement with the design value for the ABCN-25 chip, and the same as is obtained for a hybrid powered from a voltage source. The hybrid was not trimmed.

Figure 9: Typical ENC plot obtained using the 'M' scheme. For clarity, only half the channels are shown.

## *B. 'W'*

As suggested in Figure 5, this scheme utilizes one complete shunt regulator within each read-out chip. The scheme is tempting because it does not require any external shunt regulation components. A classical design of a shunt regulator consists of a voltage reference, an op-amp and a shunt transistor. This design cannot work at all in the case of many shunt regulators connected in parallel. There are some IR drops across the hybrid, and the voltage references cannot be perfectly matched in manufacture; both effects would result in a small number of chips shunting all the shunt current. The special design to overcome this difficulty is best explained by the conceptual diagram shown in Figure 10.

Figure 10: Conceptual diagram of the 'W' scheme shunt regulator present in ABCN-25. [1]

The shunt transistor is a P-MOS and its current is sensed and compared with six different current references. A trans-resistance amplifier adjusts the reference voltage of the shunt op-amp. If the shunt current goes above one of the reference currents, the corresponding correction current source is connected to the input of the trans-resistance amplifier, thus adjusting the set-point voltage of the shunt regulator and the shunt current. One of the reference currents provides over-current functionality while the five others serve for shunt current redistribution within the hybrid during start-up. The 'W' scheme treats a large drop in the current consumption of the ABCN-25 as an accidental situation in which over-current protection should be activated. The shunt transistor is rather large for improved reliability.



Figure 11: Infrared pictures of the hybrid. Left: Over-current protection activated simultaneously after turning off the clock in the 10 chip hybrid, Right: ABCN-25 shunting extreme current without damage

Figure 11 (left) shows an infrared image of a Liverpool hybrid fitted with 10 ABCN-25 chips with over-current protection activated on all the chips after the clock was turned off. For hybrids fitted with 20 chips, the over-current protection does not work as well as expected. The shunt regulators in this scheme were expected to shunt rather small currents, so increasing the supply current in order to cover the ABCN-25's current bumps is not really possible. During tests with the DAQ system of the hybrid supplied with increased current, the sharp, time dependent peaks of the ABCN-25's current requirements sometimes make one or a few chips shunt much more current than the other chips. Such a hot-spot is shown in Figure 11 (right). This chip was likely to be shunting a high current at the time (∼1 A) but the situation has never damaged any chips. As stated earlier, the bumps in power consumption cannot be observed at high trigger rates such as may be expected during SLHC running, but what about the calibration? The DAQ software can be modified to accommodate the bump by separating the L1 accept triggers in time. In that case the performance of this scheme in terms of ENC plots is as good as with the 'M' scheme and no hot-spots are observed.

#### *C. 'SPi'*

The Serial Powering Interface chip (SPi) is a versatile chip designed by Marcel Trimpl (FNAL), M. Newcomer and N. Dressnandt (Penn). The idea of the chip is to provide a universal solution for serial powering. It contains linear regulators, LVDS buffers, a dual output current limited op-amp for the 'M' control scheme, its own shunt regulator with selectable output voltage, etc. Its block diagram is shown in Figure 12.



Figure 12: Serial Powering Interface chip block diagram. [3]

A whole talk dedicated to SPi was given in the Power Working Group session by Richard Holt (RAL). Here we simply emphasize that SPi has been tested thoroughly in a test stand and has also been used with the hybrid.

#### IV. CURRENT DEVELOPMENT

Both short and long staves will be constructed with the ABCN-25 chip in the year 2010. These staves will bring together all parts of a serially powered system (including protection etc.) for the first time. The individual shunt regulator architectures will use plug-in boards so that a variety of schemes may be studied. The next iteration of the Liverpool hybrid is specifically designed to accommodate serial powering. The next version of the ABCN chip, ABCN-13, will be built in 130 nm technology, and the development of new powering blocks for this ASIC is in progress.

#### V. SUMMARY

The development of serial powering and its shunt regulators and other powering blocks goes hand in hand with the development of the ABCx ASICs. Previously shunt regulator circuitry based on commercial electronics had been used to build several demonstrator staves based on the ABCD ASIC, and these were seen to perform well. Currently three main shunt regulator options have been implemented in full custom silicon and they are all functional. The characterization of these blocks has provided useful feedback to refine future designs. Several new serially powered stavelets and a full stave will soon be constructed with the ABCN-25 chip. Future ASICs, such as the ABCN-13 and MCC chips, will contain new powering blocks in 130 nm technologies.

#### **REFERENCES**

- [1] W.Dabrowski, Design of the distributed shunt regulator integrated in the ABC-N ASIC, ABC-N Final Design Review (31.1.2008).
- [2] M.Newcomer, Distributed Slave Shunt Approach: Description and Simulations (11.12.2007).
- [3] The Figures were kindly provided by Richard Holt (RAL).

## Power and Submarine Cable Systems for the KM3NeT kilometre cube Neutrino Telescope.

M. Sedita<sup>a</sup>, R. Cocimano<sup>a</sup>, G. Hallewell<sup>b</sup>. Representing the KM3NeT Consortium

<sup>a</sup> INFN-LNS, Via S. Sofia 62, Catania, Italy; <sup>b</sup> Centre de Physique des Particules de Marseille, 163 Avenue de Luminy, Case 902, 13288 Marseille Cedex 09, France

[sedita@lns.infn.it](mailto:sedita@lns.infn.it)

#### *Abstract*

The KM3NeT EU-funded consortium, pursuing a cubic kilometre scale neutrino telescope in the Mediterranean Sea, is developing technical solutions for the construction of this challenging project, to be realized several kilometres below the sea level.

In this framework a proposed DC/DC power system has been designed, maximizing reliability and minimizing difficulties and expensive underwater activities.

The power conversion, delivery, transmission and distribution network will be described with particular attention to: the main electro-optical cable, on shore and deep sea power conversion, the subsea distribution network and connection systems, together with installation and maintenance issues.

## I. INTRODUCTION

The KM3NeT consortium [1], including members of the ANTARES, NeMO and NESTOR collaborations, is developing a kilometre cube-scale neutrino telescope for the Mediterranean sea with associated nodes for deep sea sciences.

The construction of such a detector will require the solution of technological problems common to many deep submarine installations.

Several hundred vertical detection units (DUs) containing photomultipliers will be deployed on a seafloor site up to 100 km from the shore and several kilometres below sea level.

The power system is composed of an AC/DC shore power feeding station, a management and control system, a standard, single conductor 10 kV DC-rated electro-optical telecommunications cable with sea-water current return and a distribution network to deliver power to the neutrino telescope. On the seabed, specially developed DC/DC converters will reduce the transmission voltage to 400 V for distribution to the DUs. The estimated total power is about 50 kW. The estimated bandwidth for the full data transport system is of the order of 100 Gb/s. For the associated deep sea sciences infrastructure the equivalent numbers are less well defined but are estimated to be less than 10 kW and 100 Mb/s.

The sea-floor network will consist of several junction boxes linked by electro-optical cables to the telescope DUs and to the deep sea sciences nodes. The final design of the network is still under development and will incorporate extensive redundancy to mitigate single point failures.

The design requirements for an ocean observatory site-to-shore cable are compatible with standard capabilities of telecommunications cables, for which a wide range of industry-approved standard connection boxes, couplings and penetrators exists, and which can be adapted to interface with scientific equipment.

Underwater connection technologies, available in the telecommunications, oil and gas markets - including deep-sea wet-mateable optical, electric and hybrid electro-optic connectors - have been adapted and developed to fulfil the project requirements.

The installation and maintenance operations for such detectors are difficult and expensive. In the deep-sea system design special attention is being paid to maximizing reliability and minimizing underwater operations. All components must both survive the mechanical rigours of installation (torsion, tension due to self-weight and ship movement) and have high reliability and a long lifetime under the extreme seabed conditions (high ambient pressure of 250-400 bar, an aggressive and corrosive environment, lateral and torsional forces due to deep sea currents etc.).

The various technical aspects of this unusual power supply system are discussed in the following sections.

#### II. CABLE POWER TRANSMISSION CONCEPTS

For undersea observatories, both AC and DC power systems are viable and each has its particular advantages and disadvantages. Although cable shunt capacitance requires inductive compensation even at conventional AC frequencies (50 Hz), an AC power system allows for the use of transformers in the shore and deep sea nodes and for efficient high voltage cable transmission. Power interruption is simpler than in a DC system. Furthermore, DC systems have insulation problems that have no counterpart in AC systems; long-term high voltage DC excitation can cause eventual breakdown of solid cable insulation. Therefore, although DC is conventionally used on long-haul undersea telecommunications cables, AC alternatives are also being considered.

For a qualified decision, the power system must be evaluated taking into account the cables, transformers, DC/DC converters, rectifiers, the required voltage stability and the level of short circuit capability. Each item is likely to have a significant impact on the total price of the power transmission network. Considering the power required and the distance over which it must be delivered, the use of 10 kV nominal voltage is unavoidable. The maximum voltage that can be applied to a cable is limited by insulation breakdown. A maximum of 10 kV is typical for undersea telecommunication cables and is considered as an upper limit.

The current-carrying capability depends on the conductor heating and the voltage drop. The resistance of a typical telecommunication cable is around 1 Ω/km, so that over the distances typical for KM3NeT the current is limited to around 10 A. Power can be delivered in the following ways:

- Three-phase AC (multi-conductor cable)
- DC with cable current return (multi-conductor cable)
- DC with current return through the sea (conventional single conductor telecom cable)
- AC mono-phase with current return through the sea (conventional single conductor telecom cable).
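The limits quoted above can be checked with a few lines of arithmetic. The sketch below assumes a purely resistive cable at 1 Ω/km, a 100 km run (the length quoted later for the NeMO backbone cable, used here only as a representative distance), a 10 kV shore voltage and a loss-free sea return. The function name and its parameters are illustrative and not part of any KM3NeT tool.

```python
def delivered_power(v_shore, current, ohm_per_km=1.0, length_km=100.0):
    """Return (voltage at far end, power delivered, power lost in cable)."""
    r_cable = ohm_per_km * length_km      # total conductor resistance, ohm
    v_drop = current * r_cable            # resistive voltage drop, V
    v_far = v_shore - v_drop              # voltage left at the seabed end, V
    return v_far, v_far * current, v_drop * current

# Sweep a few load currents at a 10 kV shore voltage (assumed values).
for amps in (2.5, 5.0, 10.0):
    v_far, p_out, p_loss = delivered_power(10_000.0, amps)
    print(f"{amps:4.1f} A: {v_far:7.0f} V at seabed, "
          f"{p_out / 1e3:5.1f} kW delivered, {p_loss / 1e3:4.1f} kW lost")
```

Under these assumptions a 10 A load already drops 1 kV (10% of the shore voltage) along the cable, and the cable loss grows with the square of the current.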

#### III. MAIN ELECTRO-OPTICAL BACKBONE CABLE

The design requirements for an ocean observatory site-to-shore cable are compatible with the standard capabilities of telecommunications industry components, which can be readily adapted to interface with scientific equipment. The low failure rate among the large number of such components in service suggests mean times between failures of several thousand years. As standard, a submarine telecommunications cable has to provide a service life of at least 25 years. It must be easy to deploy and repair at sea. The longevity of the installed cable depends on minimising the strain induced on the optical fibres during the dynamics of installation and by the long-term seabed environment (high ambient pressure, abrasion risks, unsupported spans, etc.).

The cost of a submarine cable repair at sea is substantial. However, since 1999, under the Mediterranean Cable Maintenance Agreement (MECMA), cable ships fully equipped with Remotely Operated Vehicles (ROVs) have been kept in constant readiness at Catania (Italy) and La Seyne-sur-Mer (France) (Figure 1). These ships provide repair services for subsea cables owned by member organisations (cable operators: around 44 as of 2009). The insurance character of this agreement offers members a repair capability for an affordable yearly contribution in proportion to the relevant cable mileage. Two of the pilot projects are members of MECMA.

The five major submarine cable manufacturing companies have formed the Universal Jointing Consortium which offers qualified and proven jointing techniques for a wide range of cable types ("Universal Joint" (UJ) and "Universal Quick Joint" (UQJ)). MECMA ships support universal jointing.

Virtually all reported submarine cable failures are due to human activity, notably fishing and anchor falls in shallow water, although natural chafing, abrasion and earthquakes in the deep ocean also occur, as shown in Figure 2. To mitigate these risks, careful route planning is essential, and sea-bed burial is used where circumstances require it.



Fig. 1. The MECMA consortium with the two cable-ship operating bases and storage depots.



Fig. 2. Submarine cables: causes of fault. (MECMA 2008).

Submarine cable armouring is selected to be compatible with the specific route; therefore the cable mechanical characteristics are an integral component of the overall system design. Submarine telecommunications cables can be equipped with virtually any fibre type and any reasonable number of fibres. At present all the major cable manufacturers deliver telecommunications cables with a number of fibres that does not routinely exceed 48. This is mainly due to the advent of Dense Wavelength Division Multiplexing (DWDM) technology and to the requirement of simplifying the cable mechanics. The fibre types used for submarine transmission are optimised for minimum attenuation over the full C-band (1530-1570 nm) with dispersion characteristics that depend on the application. The cable optical properties are an integral part of the optical communications system specification.
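To illustrate why DWDM reduces the need for high fibre counts, the usable optical bandwidth of the quoted window can be estimated. The sketch below converts 1530-1570 nm to frequency and counts whole intervals on a 100 GHz grid. The grid spacing is an assumption (a commonly used DWDM spacing), not a figure from this paper, and the function name is illustrative.

```python
C = 299_792_458.0            # speed of light in vacuum, m/s

def band_channels(lambda_min_nm, lambda_max_nm, spacing_ghz=100.0):
    """Return (bandwidth in THz, number of whole grid intervals in the band)."""
    f_high = C / (lambda_min_nm * 1e-9)   # shortest wavelength -> highest frequency
    f_low = C / (lambda_max_nm * 1e-9)    # longest wavelength -> lowest frequency
    span_hz = f_high - f_low
    return span_hz / 1e12, int(span_hz // (spacing_ghz * 1e9))

span_thz, n_channels = band_channels(1530.0, 1570.0)
print(f"{span_thz:.2f} THz wide, about {n_channels} channels per fibre")
```

With several tens of wavelength channels available on each fibre under this assumption, a modest fibre count can serve a large number of independent data links.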



Fig. 3. Examples of different armouring on submarine cables.

Many types of submarine telecommunication cables are commercially available. The design varies depending on manufacturer, fibre count, power requirements, and the external protection. Figure 3 shows a range of mechanical configurations of telecommunications cables.

The armouring is strongly related to the characteristics of soil, water, marine current, depth and installation methods.

The interface between a cable and the submerged infrastructure is complex. Not only must the connection provide load transfer through a mechanical discontinuity in the cable, but it must also maintain electrical insulation relative to the sea potential, while supporting the safe connectivity of both optical fibres and electrical conductors. Any submerged component, such as a telecom repeater, is connected to the cable through so-called extremity boxes, each effectively forming one half of a cable-to-cable joint.

## a. Cable Design Examples

The design is likely to be driven by availability from telecommunications cable suppliers. In the following sections some presently available cable designs are discussed, together with the different power options. These should be seen as examples of what is possible.

#### b. Monopolar Power Delivery

A monopolar system incorporates a current return via the seawater and will generally result in the smallest cable dimension and weight. Due to the extremely small resistance in the sea return this system has low power losses. Cables usable for this system are in fact the most commonly used in the telecommunications industry. To allow for the current return via the sea this system must incorporate sea electrodes both at the shore and in the deep sea. An example of such a cable is shown in Figure 4. The most significant technical problem with a DC monopolar system is the danger of corrosion of neighbouring structures and installations. Due to electrochemical reactions on the sea-return electrodes chlorine gas may be generated. Where such a system is used these issues must be addressed.

## c. Bipolar Power Delivery

In a bipolar system a return conductor is required. This can be achieved by incorporating a return conductor in a single cable or by having a separate return cable. The choice will be driven by the relative cost. Figure 5 illustrates an example of a submarine cable [3] which contains four conductors: two for supply and two for return.

## d. Three-Phase AC Power Delivery

In this system three conductors, which share the current, are required in the cable. An example of a cable usable for this system [3] is shown in Figure 6. Such a system requires the loads on each conductor to be balanced. If this is not fully achieved, extra power losses are incurred.



Fig. 4. Standard monopolar submarine cable – internal structure.



Fig. 5. Bipolar submarine cable example.



Fig. 6. Three-Phase submarine cable example.

#### IV. A POWER TRANSMISSION SYSTEM EXAMPLE FROM THE NEMO PHASE-2 PILOT PROJECT

A site on a 3500 m deep abyssal plateau approximately 40 NM south-east of Capo Passero, Sicily ($36^{\circ}\,20'$ N; $16^{\circ}\,05'$ E), has been proposed by the NeMO collaboration for the installation of a km$^3$-scale detector. The oceanographic and environmental properties of the site have been measured in more than 30 sea campaigns over nine years. The NeMO Phase-2 project is being realized at this site and will allow the installation of prototypes of km$^3$ detector components at 3500 m, while also providing continuous on-line monitoring of the water properties.

## a. The backbone cable

The backbone cable is a DC cable, manufactured by Alcatel-Lucent [2] and deployed in July 2007. It carries a single electrical conductor that can be operated at up to 10 kV DC, allowing a power transport of more than 50 kW, and 20 single-mode ITU-T G.655-compatible optical fibres for data transmission. The total cable length is about 100 km.

## b. On shore power feeding equipment

The shore Power Feeding Equipment (PFE) is an AC-DC converter providing 50 kW at 10 kV DC with sea current return. The PFE, from HEINZINGER Electronic GmbH, has the following main characteristics:



#### c. Submerged plant

At the end of the submarine cable a mechanical frame hosts the CTA (Cable Termination Assembly), which splits the power and fibre-optic functions, and an MVC (Medium Voltage Converter: 10 kV $\rightarrow$ 400 V DC), together with a splitter box providing three electro-optic ROV-mateable connectors (400 V DC and 4 optical fibres), as shown in Figure 6 [3, 4].

A NeMO tower prototype will be connected and powered through the frame to validate the proposed technologies of the NeMO project, and to provide a continuous on-line monitoring of the deep sea site.



Fig. 6. NeMO Submerged plant: mechanical frame with cable termination, power conversion and power/signal distribution.

#### d. Deep-sea power conversion

The MVC is based on a design developed by JPL NASA for the NEPTUNE Project [5, 6], and was deployed in the MARS [7] and Neptune Canada [8] projects. It is built from a number of low-power sub-converter blocks arranged in a series-parallel configuration (Fig. 7) to share the load and provide redundancy [9].



Fig. 7. Medium Voltage Converter: DC/DC Converter layout.

The converter has an input of up to 10 kV DC and an output of 375 V DC at 28 A. The measured efficiency exceeds 87% at full load. The converter configuration contains 48 Power Converter Building Blocks (PCBBs) arranged as a matrix of 6 parallel legs with 8 in series in each leg. This arrangement tolerates faults within some PCBBs without a failure of the full converter.
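The heat that the pressure vessel has to dissipate follows directly from these figures. The minimal sketch below takes the quoted 375 V / 28 A output and treats 87% as the efficiency at full load; since the text gives this value as a lower bound, the results are upper bounds. The variable names are illustrative, and the 10 kV input voltage is taken as nominal.

```python
V_OUT = 375.0       # V, rated output voltage
I_OUT = 28.0        # A, rated output current
EFFICIENCY = 0.87   # quoted as "exceeds 87%", so treated as a worst case
V_IN = 10_000.0     # V, nominal input voltage (assumed for the current estimate)

p_out = V_OUT * I_OUT        # power delivered to the load, W
p_in = p_out / EFFICIENCY    # power drawn from the cable, W
p_heat = p_in - p_out        # power dissipated inside the converter, W
i_in = p_in / V_IN           # input current at nominal voltage, A

print(f"output {p_out / 1e3:.2f} kW, input {p_in / 1e3:.2f} kW, "
      f"heat {p_heat / 1e3:.2f} kW, input current {i_in:.2f} A")
```

On these assumptions the converter delivers 10.5 kW and has to shed at most about 1.6 kW of heat at full load.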

The PCBB is a pulse-width modulated switching forward converter with an input of 200 V and an output of 50 V at around 200 W. Each block has four MOSFETs, two working as a primary switch and two on the secondary side as a synchronous rectifier. A block diagram of the circuit is shown in Figure 7. The various transformers are able to withstand continuous 10 kV operation in a dielectric fluid.
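The nominal PCBB ratings can be combined to see how the matrix scales. This sketch assumes that the outputs of the 8 blocks in a leg add in series and that the 6 legs share the load current equally. How the block inputs are stacked across the 10 kV line is not stated here, so the series-input figure below is an assumption, and all names are hypothetical.

```python
N_LEGS, N_SERIES = 6, 8                 # 6 parallel legs, 8 blocks per leg
V_IN_BLOCK, V_OUT_BLOCK = 200.0, 50.0   # V, nominal per-block input and output
P_BLOCK = 200.0                         # W, "around 200 W" per block

n_blocks = N_LEGS * N_SERIES            # total number of building blocks
v_out = N_SERIES * V_OUT_BLOCK          # series outputs add along a leg, V
i_leg = P_BLOCK / V_OUT_BLOCK           # current carried by one leg, A
i_out = N_LEGS * i_leg                  # parallel legs add their currents, A
p_total = n_blocks * P_BLOCK            # total nominal power, W

# Assumption (not stated in the text): all block inputs stacked in series.
v_in_stack = n_blocks * V_IN_BLOCK

print(f"{n_blocks} blocks: {v_out:.0f} V, {i_out:.0f} A, "
      f"{p_total / 1e3:.1f} kW, input stack {v_in_stack / 1e3:.1f} kV")
```

These nominal figures (400 V, 24 A, 9.6 kW) are close to, but not identical with, the rated 375 V / 28 A output, which is to be expected since the per-block values are quoted only approximately.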



Fig. 7. Medium Voltage Converter: PCBB block diagram.

The entire power converter is housed in a pressure vessel, filled with Fluorinert® dielectric cooling fluid. A parallel stack, containing eight PCBBs on four boards, together with a control board is shown in Figure 8. Its final complete arrangement is shown in Figure 9.



Fig. 8. Medium Voltage Converter: complete parallel 'stack'.



Fig. 9. Medium Voltage Converter: complete assembly.



Fig. 10. Final assembly of CTA, MVC and ROV connectors.

#### V. THE DEEP-SEA CONNECTION SYSTEM

Connectivity issues present particular challenges when there is a practical need for wet-mate connections.

The technical challenges associated with current and planned seabed observatories include:

- *Water Depth:* Down to 4,500 meters
- *High Voltages:* 10,000 VDC
- *High Bandwidth:* The desire to bring real-time science data from individual experiments directly to the shore drives up bandwidth requirements to several Gbits/sec per optical fibre.

During the last two decades wet-mate connectivity and maintainability on the seafloor have benefited from the use of Remotely Operated Vehicles (ROVs). Prior to this time, cabled systems were hardwired and had to be recovered from the seafloor for maintenance or re-configuration.



Fig. 11. Wet-mateable ROV connectors in use in ANTARES (top left) and NeMO (bottom).

The enabling technology of wet-mate connectivity is well known throughout the telecommunication and oil & gas industries and the ocean research community. Wet-mate connectivity encompasses not only low-power electrical transmission and all-optical connectors, but also electro-optical hybrid configurations (optics and electrics in one connector) and high-power electrical connectivity. Figure 11 shows examples of this technology in use for NeMO and ANTARES.

#### VI. THE SEABED POWER DISTRIBUTION

The distribution system is a network that carries power and data between each DU and the main cable. The distribution geometry is under investigation and two possible solutions are under consideration: the Star solution (Fig. 12) and the Ring solution (Fig. 13). The main difference is in the location of the power conversion system, concentrated in the centre in the first case, or distributed circumferentially in the second. The chosen layout must allow for easy deployment and connection operations as well as for post-installation maintenance operations, which can be difficult and expensive. Special attention must be paid to techniques for maximizing reliability and minimizing underwater operations.



Fig. 12. A possible sea-floor layout using star distribution.



Fig. 13. A possible sea-floor layout using ring distribution.

#### VII. ACKNOWLEDGEMENTS

This work is supported through the EU-funded FP6 KM3NeT Design Study Contract No. 011937.

#### VIII. REFERENCES

- [1] <http://www.km3net.org/>
- [2] <http://www.alcatel-lucent.com/>
- [3] M. Sedita, "Electro-optical cable and power feeding system for the NEMO Phase-2 project", Proc. 2nd Intl. Proc. VLVnT05, Catania, Italy, Nov. 8-11, 2005; Nucl. Instr. & Meth. A567 (2006) 531.
- [4] M. Sedita, "The NeMO Project Technical aspects present and future operations", IEEE Fourth International Workshop on Scientific Use of Submarine Cables & Related Technologies, Dublin, 07-10/02/2006. 284
- [5] B. Howe et al., IEEE J. Oceans Eng. 27 (2002) 267.
- [6] <http://www.neptune.washington.edu/>
- [7] <http://www.mbari.org/mars/>
- [8] <http://neptunecanada.ca/>
- [9] A. Lecroart et al. (Alcatel-Lucent Submarine Networks, Centre de Villarceaux, 91620 Nozay, France), "Power and optical communications for long tie-backs", Proc. VLVnT08, Toulon, France, April 22-24, 2008; Nucl. Instr. & Meth. 602 (2009) 246.