# Irradiation tests of the complete ALICE TPC Front-End Electronics chain

K. Røed<sup>1,2</sup>, J. Alme<sup>1</sup>, R. Campagnolo<sup>5</sup>, C.G. Guitierrez<sup>5</sup>, H. Helstrup<sup>2</sup>, D. Larsen<sup>1</sup>, V. Lindenstruth<sup>3</sup>, L. Musa<sup>5</sup>, E. Olsen<sup>4</sup>, A. Prokofiev<sup>6</sup>, M. Richter<sup>1</sup>, D. Röhrich<sup>1</sup>, B. Skaali<sup>4</sup>, G.Tröger<sup>3</sup>, K. Ullaland<sup>1</sup>, J. Wikne<sup>4</sup>

- 1. Department of Physics and Technology, University of Bergen, Norway
- 2. Bergen University College, Norway
- 3. Kirchhoff Institute for Physics, University of Heidelberg, Germany
- 4. Department of Physics, University of Oslo, Norway
- 5. CERN, Geneva, Switzerland
- 6. The Svedberg Laboratory, Uppsala University, Sweden

#### Abstract

The ALICE TPC Front-End Electronics will be operated in a radiation field of up to 800 hadrons/cm<sup>2</sup> s. SRAM-based FPGAs are used on the Front-End Cards (FEC) and the Readout Control Units (RCU). Several irradiation tests of all components on the cards have ensured that the selected components are able to withstand the radiation environment, but have also shown that single event upsets will occur in the FPGAs. As system stability and endurance are major concerns, effort has been put into reducing the implications of radiation effects to a minimum, for instance by using the active reconfiguration option offered by the Xilinx Virtex-II Pro FPGA.


#### I. Introduction

The ALICE experiment, described in [1], will investigate Pb-Pb collisions at a center-of-mass energy of about 5.5 TeV per nucleon pair and p-p collisions at 14 TeV. The detectors are optimized for charged-particle multiplicities $dN_{ch}/d\eta$ of up to 8000 in the central rapidity region. The Time Projection Chamber (TPC) is one of the main tracking detectors of the ALICE experiment. Charged particles ionize the gas volume on their way through the detector, and the produced electrons drift in an electric field towards the readout chambers at the end-caps, where the charge is amplified and collected by a 2-dimensional readout system. Together with the drift time, this provides 3-dimensional spatial information. The TPC consists of 36 sectors, which are read out by 4356 Front-End Cards (FECs) serving roughly 560000 channels. All FECs have to be configured and monitored. Furthermore, programmable logic devices are widely used in all hardware devices of the TPC Front-End Electronics to keep the system open and flexible. The configuration must therefore include the upload of firmware to the FPGAs. Fig. 1 shows an overview of the Front-End Electronics (FEE).

Figure 1: Components of the TPC Front-end electronics and dataflow.

One single node of the TPC FEE chain consists of up to 26 FECs connected to a Readout Control Unit (RCU). The RCU comprises the RCU motherboard, which hosts two additional interface boards customized for the ALICE experiment. The first, the Detector Data Link Source Interface Unit (DDL SIU), is the ALICE standard interface to the DAQ. The second card, the DCS board, is an embedded computer in charge of monitoring and controlling the system. More details about the electronics can be found in [2].

The RCU, shown in Fig. 2, relies on Commercial Off The Shelf (COTS) components such as Field Programmable Gate Arrays (FPGAs). SRAM-based FPGAs were chosen because this technology offers great flexibility, but as the Front-End Electronics will be operated in a radiation environment of up to 800 hadrons/cm<sup>2</sup> s, single event upsets in the configuration memory are a major concern.

In order to qualify the components for the TPC radiation environment, proton beams at the Oslo Cyclotron Laboratory (OCL) and at The Svedberg Laboratory (TSL) in Uppsala have been used to test Altera, Xilinx and Actel FPGAs. The complete setup, including FECs, RCU, DCS board, SIU and trigger system, has also been tested using a lower intensity neutron beam at TSL.

Figure 2: Conceptual schematic of the Readout Control Unit

#### II. FEE and radiation concern

SRAM-based devices have been shown to be sensitive to flux-related radiation effects such as Single Event Effects (SEEs). These are statistical in nature and are therefore treated in terms of their probability of occurrence, which is device specific and depends on the nature of the incident particles. Single event effects are of great concern to the ALICE readout electronics since they can cause the electronics to fail at any time during operation, potentially leading to the loss of important data.

#### A. Single Event Upset, SEU

A single event upset, a type of SEE, is a soft error appearing in a device due to the energy deposited in the silicon by an ionizing particle [3]. The main concern is high-energy (E > 20 MeV) particles (protons, neutrons, pions), which induce complex nuclear reactions in the silicon. The heavy recoil ion created in these reactions in turn ionizes the device material through which it travels and leaves behind a track of electron-hole pairs. If this happens close to, for instance, a CMOS transistor, the newly created carriers will drift in the electric field in the material and be collected at a nearby node. If the collected charge exceeds the critical charge needed for the transistor to change its logic state, a Single Event Upset occurs. An SEU is non-destructive, and rewriting or reprogramming the device returns it to normal behavior.

A large part of the FPGA consists of a comprehensive routing network. Depending on the firmware implemented, only parts of the routing resources might be used. Thus an SEU induced in one of the SRAM bits controlling unused resources will not necessarily induce a functional failure in the firmware. If, on the other hand, an SEU does cause a functional failure in the firmware, this is referred to as a Single Event Functional Interrupt (SEFI).

#### B. Cumulative effects

Cumulative effects are due to radiation damage accumulating over time. Total Ionising Dose (TID) and displacement damage can ultimately lead to device failure. These effects can alter the electrical properties of the device and thus increase, for instance, the SEU susceptibility. However, due to the relatively low dose rates expected for the TPC front-end electronics, cumulative effects are not the main concern.

#### C. ALICE TPC radiation levels

In order to estimate an expected SEFI rate for the FEE, it is necessary to map the radiation environment to which the FEE will be exposed. Simulations of the ALICE radiation levels have been performed, and Fig. 3 shows examples of the kinetic energy spectra of protons and neutrons. With distribution peaks between 100 and 200 MeV, both protons and neutrons will contribute to the nuclear interactions in the silicon. Furthermore, this favours the use of the 180 MeV proton and neutron beams at The Svedberg Laboratory. The expected hadron fluence for the TPC FEE is estimated to be approximately $10^{11}\ \mathrm{hadrons/cm^2}$ for $E_{\mathrm{kin}} > 20\ \mathrm{MeV}$, corresponding to a dose of 0.6 kRad [4].

Figure 3: Energy spectra of produced particles for one central event [4].

#### III. Experimental setup

Irradiation tests have been carried out at both the Oslo Cyclotron, University of Oslo, and The Svedberg Laboratory, Uppsala University. The Device Under Test (DUT) is fixed in the beam line at a position chosen according to the desired beam properties, for instance the beam profile. It is further connected to a shielded DAQ PC, which in turn is supervised over the local area network from a remotely placed counting room.

Figure 4: Experiment area for proton irradiation at TSL Blue Hall. 1. DAQ PC, 2. DUT, 3. Alignment laser, 4. Graphite collimator, 5. Scintillation telescope, 6. TFBC, 7. Beam exit point, 8. Beam line/path

#### A. Proton beam at OCL

The Oslo Cyclotron can deliver an external proton beam of up to 29 MeV. A setup consisting of a laser, a mirror, a uranium fission target and a Thin Film Breakdown Counter (TFBC) [5] is used to align the beam and to measure the beam profile and flux. Using fluxes ranging from $10^6$ to $10^7\ \mathrm{p/cm^2\,s}$ and a beam profile of 2 x 2 cm, the Flash-based ACTEL ProASIC APA075 and the SRAM-based ALTERA APEX20K400E and Xilinx Virtex-II Pro VP7 FPGAs have been tested at OCL.

#### B. Proton beam at TSL

At TSL we have been provided with 38 MeV and 180 MeV proton beams with fluxes of the order of $10^6\ \mathrm{p/cm^2\,s}$ and a beam profile of 2 x 2 cm. Figure 4 shows the setup for the proton runs in the Blue Hall at TSL. A uranium fission target and a TFBC, together with a relative flux monitor (scintillator) measuring scattered protons, are the main components of the beam line configuration. The TSL proton beams have been used to investigate the SEFI cross-sections of the ALTERA APEX20K400E and Xilinx VP7 FPGAs, and of the DCS board embedded computer.

#### C. Neutron beam at TSL

During the irradiation test of the TPC Front-End Electronics at The Svedberg Laboratory in May 2005, the control system was tested extensively with a neutron beam. The purpose of this full system irradiation test was to qualify the radiation tolerance of the system as a whole under normal data-taking conditions.

The setup area was placed several meters upstream from the beam outlet in order to get a beam profile large enough to cover the full setup. As a secondary customer we had no influence on the beam energy, flux or beam time. Beam tuning and alignment were provided by the TSL personnel.

Two setups were tested during the neutron run. The first consisted of an RCU motherboard, a DCS board, a SIU board and 9 Front-End Cards attached to the RCU. The second used one RCU board equipped with a DCS board, running without any FEC attached.

#### IV. Results

Irradiation tests have been carried out over several periods and shifts at both OCL and TSL. For the SRAM-based FPGAs the main test scheme is a matrix of flip-flops organised as a shift register. A bit pattern is synchronously shifted through the register and compared at the output to detect any bit flips. This is a dynamic test in which the SEFI cross-section is measured. The beam tests are performed at much higher intensities than the system will be exposed to in real operation, which also tests the endurance of the components.
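The principle of the shift register test can be illustrated with a small software model. This is only a sketch under stated assumptions: the register length, the alternating bit pattern and the `upset_at` injection hook are hypothetical and are not taken from the test firmware. The model flips a single stored bit to show how the output comparison detects an error; an upset in the configuration memory of a real device could instead produce a persistent failure.

```python
from collections import deque


def run_shift_register_test(length, cycles, upset_at=None):
    """Clock an alternating 0/1 pattern through a shift register and
    count mismatches between the output bit and the expected bit.

    upset_at: optional dict {cycle: stage}. At the given cycle the bit
              stored in that stage is flipped (hypothetical stand-in
              for a radiation-induced upset).
    """
    upset_at = upset_at or {}
    register = deque([0] * length, maxlen=length)
    errors = 0
    for cycle in range(cycles):
        # Inject an upset into one stage, if requested for this cycle.
        if cycle in upset_at:
            register[upset_at[cycle]] ^= 1
        out_bit = register[-1]  # oldest bit, about to leave the register
        # The bit leaving now was shifted in `length` cycles ago.
        if cycle >= length:
            expected = (cycle - length) % 2
            if out_bit != expected:
                errors += 1
        # Shift in the next pattern bit; the oldest bit drops out.
        register.appendleft(cycle % 2)
    return errors


if __name__ == "__main__":
    print(run_shift_register_test(8, 100))           # no upset injected
    print(run_shift_register_test(8, 100, {20: 3}))  # one upset injected
```

In this model every injected flip of a pattern bit shows up as exactly one mismatch when the bit reaches the output, which is the quantity the comparator counts.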

#### A. ALTERA APEX20K400E

Table 1: SEFI cross-section vs. energy for the ALTERA APEX20K400E

| Energy [MeV] | SEFI cross-section [cm<sup>2</sup>]        |
|--------------|--------------------------------------------|
| 7            | 0                                          |
| 25           | $3.4 \cdot 10^{-9} \pm 9.6 \cdot 10^{-10}$ |
| 28           | $3.3 \cdot 10^{-9} \pm 1.9 \cdot 10^{-9}$  |
| 38           | $6.9 \cdot 10^{-9} \pm 2.1 \cdot 10^{-9}$  |
| 180          | $6.0 \cdot 10^{-9} \pm 1.1 \cdot 10^{-9}$  |

Table 1 summarises the SEFI cross-section, caused by configuration SEUs, for the shift register design in the SRAM-based Altera FPGA. Using simulated values for the hadron flux at the different sectors of the TPC, the total expected number of SEFIs for the whole TPC detector (216 ALTERA FPGAs) is of the order of 3-4 per 4 hour run.

#### B. Xilinx Virtex-II Pro VP7

Table 2: SEFI cross-section vs. energy for the Virtex-II Pro VP7

| Energy [MeV] | SEFI cross-section [cm<sup>2</sup>]       |
|--------------|-------------------------------------------|
| 26           | $9.4 \cdot 10^{-9} \pm 5.5 \cdot 10^{-9}$ |
| 180          | $2.8 \cdot 10^{-9} \pm 1.3 \cdot 10^{-9}$ |

The SRAM-based Xilinx Virtex-II Pro VP7 was tested at both OCL and TSL, giving cross-sections of the same order of magnitude for 26 MeV and 180 MeV protons, as was also the case for the ALTERA FPGA. In Fig. 5 the SEFI cross-section is plotted as a function of energy for both the Altera and the Xilinx Virtex-II Pro FPGA. The SEU cross-section of the Xilinx Virtex-II Pro was measured to be a factor of 10-20 higher than the SEFI cross-section.

Figure 5: Cross-section plotted versus energy for the ALTERA APEX20K400E FPGA

#### C. ACTEL APA075

The ACTEL ProASIC APA075 is a Flash-based FPGA, and its configuration memory is therefore expected to be SEU resistant compared to that of the SRAM-based FPGAs, even though SEUs can still appear in sequential logic such as state machines. For Flash-based FPGAs, however, it is the total dose over the ALICE lifetime that is the critical quantity, and the ACTEL APA075 has been tested with this in mind. ACTEL software (FlashPro) was used to periodically read back and verify the content of the configuration memory, and the total dose was determined at the first sign of failure.

Table 3: 26 MeV proton irradiation results for the ACTEL APA075

| Sample | Typ. flux [p/cm<sup>2</sup> s] | Fluence [p/cm<sup>2</sup>] | Dose [kRad] |
|--------|--------------------------------|----------------------------|-------------|
| 1      | $(4 - 13) \cdot 10^7$          | $1.7 \cdot 10^{11}$        | 44.7        |
| 2      | $1.3 \cdot 10^{8}$             | $1.4 \cdot 10^{11}$        | 36.8        |
| 3      | $5.2 \cdot 10^7$               | $4.7 \cdot 10^{10}$        | 12.0        |
| 4      | $5.2 \cdot 10^7$               | $2.6 \cdot 10^{10}$        | 6.8         |

Table 3 summarizes the proton irradiation results for the ACTEL FPGA at OCL. It shows the total fluence and dose at the first occurrence of failure, with a lower limit of 6.8 kRad. It is interesting to note that the readback and verification procedure had a definite influence on the radiation sensitivity of the device. There was a clear increase in current consumption during each readback and verification cycle, which eventually resulted in device failure. The difference in total dose between the samples in Table 3 can be explained by the larger number of readback and verification cycles for the last two samples. Readback and verification of the ACTEL configuration memory will not be used during normal operation. With further irradiation tests that omit the readback procedure and focus on the first occurrence of a SEFI in the firmware, we expect that a higher value for the lower limit of the total dose can be achieved. It is therefore expected that the ACTEL FPGA will survive at least 10 ALICE lifetimes.
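The fluence and dose columns of Table 3 are linked by the standard relation $D\,[\mathrm{rad}] = 1.602 \cdot 10^{-8} \cdot S \cdot \Phi$, where $\Phi$ is the fluence in particles/cm<sup>2</sup> and $S$ is the mass stopping power in MeV cm<sup>2</sup>/g. The sketch below inverts this relation for each row as a consistency check. The stopping power it prints is an implied value derived from the table, not a number quoted by the authors, and the function name is chosen here for illustration only.

```python
# 1 MeV/g = 1.602e-13 J / 1e-3 kg = 1.602e-10 Gy = 1.602e-8 rad
MEV_PER_G_TO_RAD = 1.602e-8

# (fluence [p/cm^2], dose [kRad]) for samples 1-4, copied from Table 3
TABLE_3 = [(1.7e11, 44.7), (1.4e11, 36.8), (4.7e10, 12.0), (2.6e10, 6.8)]


def implied_stopping_power(fluence_per_cm2, dose_krad):
    """Mass stopping power [MeV cm^2/g] implied by dose = k * S * fluence."""
    return dose_krad * 1e3 / (MEV_PER_G_TO_RAD * fluence_per_cm2)


if __name__ == "__main__":
    for sample, (fluence, dose) in enumerate(TABLE_3, start=1):
        s = implied_stopping_power(fluence, dose)
        print(f"sample {sample}: S = {s:.1f} MeV cm^2/g")
```

All four rows imply nearly the same stopping power, so the tabulated doses scale linearly with the fluence, as expected for a fixed beam energy.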

#### D. DCS card system test

The DCS board embedded computer, which runs on an Altera EPXA1F484C1-ARM SRAM-based FPGA, has been tested as a stand-alone system with a nominal 180 MeV proton beam at TSL in October 2004. After being irradiated with a total dose of 1.5 kRad, corresponding to 2 ALICE lifetimes, it was still fully functional and was removed from the beam without any permanent damage. During irradiation the typical failure signature of the DCS embedded computer was the loss of the Ethernet connection. This is expected, since the Ethernet interface makes up the larger part of the firmware design. The mean time until first error was 316 seconds at a flux of $1.5 \cdot 10^6\ \mathrm{protons/cm^2\,s}$, which gives a cross-section for the DCS board embedded computer of $2.1 \cdot 10^{-9}\ \mathrm{cm^2}$. For the 216 DCS boards in the TPC this gives a mean time between failures of 0.8 - 1.6 hours (worst case of 800 $\mathrm{hadrons/cm^2\,s}$), where the range reflects the uncertainty in the simulations. Since the DCS board is not in the direct data path, this result is satisfactory.
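The arithmetic behind these numbers can be retraced with a few lines of code. This is a minimal sketch using only the values quoted in the text; the function names are chosen here for illustration. It reproduces the cross-section and the lower end of the quoted range at the worst-case flux. The upper end reflects the simulation uncertainty, which is not modelled here.

```python
def cross_section(flux_per_cm2_s, mean_time_to_error_s):
    """Cross-section [cm^2] = 1 / (flux * mean time to first error)."""
    return 1.0 / (flux_per_cm2_s * mean_time_to_error_s)


def mtbf_hours(sigma_cm2, flux_per_cm2_s, n_boards):
    """Mean time between failures [h] for n_boards identical boards."""
    rate_per_s = sigma_cm2 * flux_per_cm2_s * n_boards
    return 1.0 / rate_per_s / 3600.0


# Beam test: 316 s mean time to first error at 1.5e6 protons/cm^2 s
sigma = cross_section(1.5e6, 316.0)

# TPC: worst-case flux of 800 hadrons/cm^2 s, 216 DCS boards
mtbf = mtbf_hours(sigma, 800.0, 216)

if __name__ == "__main__":
    print(f"sigma = {sigma:.2e} cm^2")
    print(f"MTBF  = {mtbf:.2f} h")
```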

#### E. Full system integration test

The full system integration test was carried out at The Svedberg Laboratory in Uppsala. In this beam test the system was operated in normal data-taking mode, and the software and firmware used were fully functional prototype versions of the final design. During approximately 25 hours of 95 MeV neutron beam only a few errors were observed, of which 5 were related to the embedded computer on the 2 DCS boards and 4 to the data readout chain. The system was exposed to a typical flux of $10^3$ to $10^4$ neutrons/cm<sup>2</sup> s, leading to a total fluence of $1.5 \cdot 10^9\ \mathrm{neutrons/cm^2}$. Based on the cross-section result from Section IV.D, the number of errors for one DCS card per 25 hour run with a flux of $1.7 \cdot 10^4\ \mathrm{n/cm^2\,s}$ is calculated to be 2-3. The 5 errors observed, all connected to the loss of the Ethernet connection, are therefore consistent with previous tests. Of the 4 errors observed in the data readout chain, 3 were cleared when a reset and reconfiguration of the Xilinx VP7 was carried out, and can thus be attributed to the VP7 firmware. The fourth error was most likely induced somewhere in the data path between the Xilinx FPGA and the DAQ PC. Again, based on the 180 MeV proton cross-section in Table 2, the expected number of errors for the Xilinx VP7 during a 25 hour run is calculated to be approximately 4. A 14 hour reference run without beam was also carried out, and no errors were observed.
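The expected error counts follow from multiplying the cross-section by the accumulated fluence. The sketch below retraces this with the values quoted in the text, assuming, as the text does, that the proton cross-sections can be applied to the neutron beam. The function name is illustrative.

```python
def expected_errors(sigma_cm2, flux_per_cm2_s, hours):
    """Expected number of errors = cross-section * flux * exposure time."""
    return sigma_cm2 * flux_per_cm2_s * hours * 3600.0


FLUX = 1.7e4   # neutrons/cm^2 s, as quoted for the 25 hour run
HOURS = 25.0

fluence = FLUX * HOURS * 3600.0                 # total fluence [n/cm^2]
dcs_errors = expected_errors(2.1e-9, FLUX, HOURS)  # per DCS board
vp7_errors = expected_errors(2.8e-9, FLUX, HOURS)  # Xilinx VP7, 180 MeV value

if __name__ == "__main__":
    print(f"fluence    = {fluence:.2e} n/cm^2")
    print(f"DCS board  = {dcs_errors:.1f} errors")
    print(f"Xilinx VP7 = {vp7_errors:.1f} errors")
```

With these inputs the fluence comes out at about $1.5 \cdot 10^9\ \mathrm{n/cm^2}$, the DCS estimate at roughly 3 errors per board and the VP7 estimate at roughly 4, in line with the figures given above.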

#### V. Xilinx reconfiguration scheme

The functionality of both the DCS board and the RCU motherboard is based on FPGAs, which can experience errors at some point due to single event upsets. By reloading the firmware into the configuration memory of the FPGA, these errors can be corrected immediately. The firmware files are stored in Flash memory on the different boards. However, reloading the configuration and rebooting the DCS board embedded computer causes downtime of the specific node. Since the data path is independent of the status of the DCS board, occasional downtime of a DCS board node can be tolerated. This simple approach is not satisfactory for the FPGA on the RCU board, as it would interrupt the data flow. To overcome this problem, automatic checking and refreshing of the firmware has been implemented. The scheme is sketched in Fig. 6. It is based on the Xilinx Virtex-II Pro, which allows the firmware to be refreshed without interrupting operation. This technique is called Active Partial Reconfiguration [6]. The configuration bits in the Virtex-II Pro are organized in columns of data frames, which can cover RAM elements, Configurable Logic Blocks (CLBs), IOs and similar resources. When doing Active Partial Reconfiguration, these frames are addressed using a major address and a minor address. The major address is the column number and the minor address is the frame number within that column. In this way it is possible to refresh a single frame only. The configuration data stream is first transmitted to a shadow frame. When the data in the shadow frame is ready, all registers in the addressed frame are updated in parallel. These operations are carried out using the SelectMAP bus interface of the Virtex-II Pro, an 8-bit parallel bus with 6 control lines and 1 clock line used for configuring the FPGA.

Figure 6: Reconfiguration scheme for the Xilinx FPGA on the RCU board.

A Flash-based ACTEL FPGA makes both the Flash memory on the RCU board and the configuration memory of the Virtex-II Pro available to the DCS board computer. This allows the high-level control of the configuration scheme to be implemented in software on the DCS board computer, leaving only the radiation-critical operations to the ACTEL FPGA firmware. The main task of the ACTEL FPGA will therefore be to read back the configuration memory of the Xilinx FPGA frame by frame and verify it against the original data frame stored in the Flash memory. If an error is detected, the read process will be interrupted to correct the error by refreshing the complete frame. When the error has been removed, the read process will continue. In addition, simple error detection and correction techniques, such as Hamming coding of state machines, will be implemented in critical parts of both the ACTEL and Xilinx firmware. This setup makes it possible to remove errors in the configuration memory of the Xilinx FPGA before they can affect the behavior of the system.
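The readback, verify and refresh cycle described above can be sketched as a simple loop over frames. This is a software model under stated assumptions, not the ACTEL firmware: the dictionaries standing in for the Flash memory and the Xilinx configuration memory, the 8-byte frame size and the function name `scrub` are all hypothetical. In the real system the frames are accessed through the SelectMAP interface.

```python
def scrub(config_memory, golden_frames):
    """Read back every frame, compare it with the golden copy and
    refresh the complete frame on mismatch.

    Both arguments are dicts keyed by (major, minor) frame address with
    bytes values. config_memory is modified in place. Returns the list
    of addresses that were refreshed.
    """
    refreshed = []
    for address in sorted(golden_frames):
        readback = config_memory[address]
        if readback != golden_frames[address]:
            # Interrupt the readback, rewrite the whole frame, then
            # continue with the next frame.
            config_memory[address] = golden_frames[address]
            refreshed.append(address)
    return refreshed


# Hypothetical golden image: 3 columns (major) with 4 frames (minor) each.
golden = {(major, minor): bytes([major, minor] * 4)
          for major in range(3) for minor in range(4)}

# Live configuration memory with one bit flipped in frame (1, 2).
live = dict(golden)
corrupted = bytearray(live[(1, 2)])
corrupted[0] ^= 0x04
live[(1, 2)] = bytes(corrupted)

refreshed = scrub(live, golden)

if __name__ == "__main__":
    print("refreshed frames:", refreshed)
```

Only the corrupted frame is rewritten, which mirrors the point made above that a single frame can be refreshed without touching the rest of the configuration.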

#### VI. Summary and Conclusion

The Xilinx, Altera and Actel FPGAs have been irradiated with proton and neutron beams, both as single components and on a system level in the integration test of the RCU. As the DCS board embedded computer is not in the main data path, occasional downtime can be tolerated. 1-2 errors during a 4 hour ALICE run, leading to the same number of reboots of DCS embedded computers for the whole TPC, are therefore acceptable. A similar consideration does not apply to the Xilinx Virtex-II Pro, since it is directly involved in the data path. The results for the Altera APEX20K400E originally intended for the RCU motherboard were at the limit of what could be accepted. Even though the replacement device, the Xilinx Virtex-II Pro, gives a cross-section of the same order of magnitude as the Altera FPGA, it does offer the possibility of active partial reconfiguration. This option, together with different error detecting and correcting coding techniques, is expected to reduce the influence of SEUs to a negligible level. Preliminary irradiation tests implementing these measures are showing promising results.

#### REFERENCES

- [1] ALICE Collaboration, CERN/LHCC/1995-71.
- [2] L. Musa *et al.*, The ALICE TPC Front End Electronics, *Proc. of the IEEE Nuclear Science Symposium*, Portland, October 2003.
- [3] F. Faccio, COTS for the LHC radiation environment: the rules of the game.
- [4] A. Fasso, P. Foka, A. Morsch, A. Sandoval, G. Tsiledakis, Radiation in the ALICE TPC detector, ALICE Internal Note-TRD, October 2003.
- [5] A.V. Prokofiev, A.N. Smirnov, P.-U. Renberg, A Monitor of Intermediate-Energy Neutrons Based on Thin Film Breakdown Counters, TSL/ISV-99-0203, January 1999.
- [6] Virtex Series Configuration Architecture User Guide, *Xilinx XAPP151* (v1.6), March 2003.