## **Data Flow Simulations through the ATLAS Muon Front-End Electronics**

J. Chapman, R. Ball, J. Kuah, J. Mann, M. Schneider, J. Uzelac, and L. Hu

University of Michigan (email: umjwc@umich.edu)

29/10/1999

Abstract

A VerilogHDL simulation of the data flow along the readout chain of the ATLAS MDT front-end is presented. The input rates for this simulation are taken from the chamber occupancies as provided by the ATLAS physics Monte Carlo. The chamber hit-rates include backgrounds as well as hits for collisions of interest. The program has been used to study various trigger tower groupings and to examine the buffer occupancies at a range of luminosities.

## 1. INTRODUCTION

The ATLAS muon precision chambers are instrumented with Amplifier Shaper Discriminator (ASD) circuits and Time-to-Digital Converters (TDC) mounted directly on the chamber ends. The ASD units convert the track ionization signals to digital form; the TDC units digitize, store, and transmit time data along serial LVDS lines to the readout system. This study begins with this digital information and examines the performance of various designs up through the final on-chamber module, called the CSM. Future studies will examine the data flow up to the readout buffers, called ROBs.

Since the performance of the ATLAS muon TDC has previously been examined in simulation [1], the first step in this simulation is to verify that the TDC is appropriately modeled for the hit rates of interest. This is done by matching the output buffer occupancies seen in the full TDC simulation to those of the model used for this work. A simple model of the TDC is needed here since this simulation requires the emulation of 18 TDC units, and a full TDC simulation would be prohibitively slow. Eventually this simulation is expected to be extended to a full ROB group, for which 108 TDC models would have to be run simultaneously.

With a suitable TDC model for emulation of the trigger and data rates complete and tested, a design for the next module along the readout chain, the CSM, was undertaken. This unit is required to accept serial data from 18 TDC chips, buffer them to avoid data loss, and multiplex them into a single output path. That path is also buffered while awaiting transmission to the next unit, the Muon ReadOut Driver (MROD), which is located in a Tower Summary Crate (TSC). To date this simulation has been completed up through the CSM module.

A preliminary version of the CSM is also to be fabricated for chamber testing. This version, called the CSM-0, is being designed in VME format and will not be chamber mounted as in the final design. It will also not rely on the MROD units for event assembly. The simulations described in this note are for the CSM-0, which does contain event assembly logic. The VME card implementation will also be discussed briefly.

## 2. DATA RATES AND TDC MODELING

### 2.1 Predicted Rates in the MDT Chambers

Data from the MDT tubes flows along a data path as indicated schematically in Figure 1. The individual tube sense wires attach to ASD inputs within a Faraday cage covering the tube ends. The mezzanine card that holds three 8-channel ASD chips also contains a 24-channel TDC. The single tube shown in Figure 1 is thus 1 of 24 inputs to the TDC. Data from all 24 channels is directed to the output upon receipt of a trigger signal by the TDC.

Figure 1 also shows the TDC serial data entering the next module, the CSM. This connection is one of 18 such serial links from 18 distinct TDC chips. The CSM must merge these 18 sources into a single line to the TSC-based MROD. The data path to the MROD is expected to be a fibre link running at 640 Mb/s or greater. The MROD is required to handle six CSM outputs and must therefore deal with the data from 108 TDC chips or 2592 tube channels.
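The channel counts quoted above follow directly from the fan-in at each stage. A minimal sketch of that bookkeeping is given here; the constant names are illustrative and not taken from the simulation code.

```python
# Fan-in at each stage of the readout chain, using only counts quoted in the text.
CHANNELS_PER_ASD = 8      # each ASD chip serves 8 tubes
ASDS_PER_MEZZANINE = 3    # three ASD chips feed one TDC
TDCS_PER_CSM = 18         # serial links into one CSM
CSMS_PER_MROD = 6         # CSM outputs handled by one MROD

channels_per_tdc = CHANNELS_PER_ASD * ASDS_PER_MEZZANINE
channels_per_csm = channels_per_tdc * TDCS_PER_CSM
tdcs_per_mrod = TDCS_PER_CSM * CSMS_PER_MROD
channels_per_mrod = channels_per_tdc * tdcs_per_mrod

print(channels_per_tdc, channels_per_csm, tdcs_per_mrod, channels_per_mrod)
# prints: 24 432 108 2592
```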

The MROD is designed to accept data from a full trigger group, which in the ATLAS MDT requires it to accept up to 6 chamber units or 6 CSM outputs. In order to examine the data rates from the MDT up to the ROB, a preliminary grouping of chambers into towers has been chosen for the chamber groups feeding each MROD. Although this grouping is not final, it is representative of the choices that are likely. The rates in Table 1 are for this preliminary grouping. The table has been truncated to a section of the barrel for simplicity.

The physics Monte Carlo designated TP43 was used to calculate the hit rates given in Table 1 for each chamber of the MDT. This Monte Carlo contains hits from events of interest and hits from background processes. All backgrounds are included except halo muons, which are expected to be negligible compared to those included. All hits estimated by the physics Monte Carlo must be handled by the ASD and stored within the TDC. Not all edges, however, are transmitted by the TDC to the CSM. Only those found to be within the drift time window are processed and sent to the output FIFO within the TDC upon receipt of an external trigger signal. These hits are serialized and sent along the data path to the CSM. Table 1 shows the number of tubes for each chamber (#Chn), the average tube rates (KHz/Chn), the composite rate of all channel hits accepted by the TDC within the drift interval (MHz), the number of mega-bits sent from the TDC each second (Mb/s/TDC), the number of mega-bits sent from the CSM each second (Mb/s/CSM), the number of CSM units attached to each MROD (CSM/TSC), and the number of giga-bits sent each second (Gb/s/ROB) from each MROD. Clearly, handling these rates will be a challenge. The rates from Table 1 represent the range of values the CSM and MROD designs must accept.



Figure 1 A block diagram of the units through which data flow from the MDT tubes up to the ROB.

| Chamber | #Chn | KHz/Chn | MHz | Mb/s/TDC | Mb/s/CSM | CSM/TSC | Gb/s/ROB |
|---------|------|---------|-----|----------|----------|---------|----------|
| group 1 |      |            |     |      |       |     |       |
| BIL 1   | 240  | 36         | 0.9 | 2.3  | 22.7  | 1   | 22.7  |
| BIL 2   | 288  | 41         | 1.0 | 2.5  | 29.9  | 1   | 29.9  |
| BML 1   | 336  | 107        | 2.6 | 6.0  | 84.5  | 1   | 84.5  |
| BML 2   | 288  | 107        | 2.6 | 6.0  | 72.5  | 1   | 72.5  |
| BOL 1   | 432  | 75         | 1.8 | 4.3  | 77.6  | 2   | 38.8  |
| BOL 2   | 432  | 75         | 1.8 | 4.3  | 77.6  | 2   | 38.8  |
| group 2 |      |            |     |      |       |     |       |
| BIL 3   | 288  | 405        | 9.7 | 22.0 | 263.6 | 2   | 131.8 |
| BML 3   | 288  | 107        | 2.6 | 6.0  | 72.5  | 1   | 72.5  |
| BML 4   | 288  | 107        | 2.6 | 6.0  | 72.5  | 1   | 72.5  |
| BOL 3   | 432  | 75         | 1.8 | 4.3  | 77.6  | 2   | 38.8  |
| BOL 4   | 336  | 75         | 1.8 | 4.3  | 60.4  | 2   | 30.2  |
| group 3 |      |            |     |      |       |     |       |
| BIL 4   | 288  | 41         | 1.0 | 2.5  | 29.9  | 1   | 29.9  |
| BIL 5   | 288  | 41         | 1.0 | 2.5  | 29.9  | 1   | 29.9  |
| BIL 6   | 288  | 41         | 1.0 | 2.5  | 29.9  | 1   | 29.9  |
| BML 5   | 288  | 107        | 2.6 | 6.0  | 72.5  | 1   | 72.5  |
| BML 6   | 288  | 107        | 2.6 | 6.0  | 72.5  | 1   | 72.5  |

 Table 1: Average Data Rates
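Several columns of Table 1 appear to be related by simple scaling with the 24 channels served by each TDC. The sketch below checks two such relations on a few rows copied from the table: the MHz column against KHz/Chn times 24, and the Mb/s/CSM column against Mb/s/TDC times the number of TDCs on the chamber. These relations are inferred from the tabulated numbers and are not stated in the text; the function names are illustrative.

```python
CHANNELS_PER_TDC = 24  # three 8-channel ASD chips feed one TDC

# (chamber, #Chn, KHz/Chn, Mb/s/TDC, Mb/s/CSM), copied from Table 1
ROWS = [
    ("BIL 1", 240, 36, 2.3, 22.7),
    ("BML 1", 336, 107, 6.0, 84.5),
    ("BOL 1", 432, 75, 4.3, 77.6),
    ("BIL 3", 288, 405, 22.0, 263.6),
]

def tdc_hit_rate_mhz(khz_per_channel):
    """Summed hit rate of the 24 channels of one TDC, in MHz."""
    return khz_per_channel * CHANNELS_PER_TDC / 1000.0

def tdcs_on_chamber(n_channels):
    """Number of 24-channel TDCs needed to instrument a chamber."""
    return n_channels // CHANNELS_PER_TDC

def chamber_output_mbps(n_channels, mbps_per_tdc):
    """Total serial output of all TDCs on a chamber, in Mb/s."""
    return tdcs_on_chamber(n_channels) * mbps_per_tdc

for name, n_chn, khz, mbps_tdc, mbps_csm in ROWS:
    print(f"{name}: {tdcs_on_chamber(n_chn)} TDCs, "
          f"{tdc_hit_rate_mhz(khz):.2f} MHz/TDC, "
          f"{chamber_output_mbps(n_chn, mbps_tdc):.1f} Mb/s "
          f"(table: {mbps_csm} Mb/s)")
```

For the rows shown, the computed values agree with the tabulated ones to within the rounding of the Mb/s/TDC column.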



Figure 2a The TDC output FIFO occupancy from the full TDC simulation.



Figure 2b The output FIFO occupancy for the simplified TDC model for various rates from 1x to 9x.

### 2.2 TDC Modeling

The TDC design was formulated in VerilogHDL and a full simulation of its performance exists. Since the character of the simulation is most critical at high rates, the full TDC simulation at the highest rate has been compared with the simplified version used in this data-flow simulation. Also, since the serialization unit of the TDC processes data from the output FIFO only, only this unit has been simulated. Thus, for each trigger, the number of hits for each TDC is generated and injected into the output FIFO. The process begins with a trigger, defined to occur a randomly selected time (exponential distribution) after the previous one, with the average chosen to produce the desired rate. For each trigger, a number of hits is selected randomly from a Poisson distribution with the mean appropriate to the average hit rate.
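The stimulus generation described above can be sketched in a few lines. The example below is written in Python rather than VerilogHDL and is not the simulation code itself; the function names, the seed, and the use of Knuth's multiplication method for the Poisson draw are assumptions made for illustration. The trigger rate and mean hit count are the values quoted later in the text (75 kHz, 5 Hits/TDC).

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def generate_triggers(rng, trigger_rate_hz, mean_hits_per_tdc, n_tdcs, n_triggers):
    """Yield (trigger_time_s, hits_per_tdc) for successive triggers."""
    t = 0.0
    for _ in range(n_triggers):
        # Exponentially distributed gap since the previous trigger.
        t += rng.expovariate(trigger_rate_hz)
        # Independent Poisson hit count for each emulated TDC.
        yield t, [poisson(rng, mean_hits_per_tdc) for _ in range(n_tdcs)]

rng = random.Random(1)  # fixed seed so the run is repeatable
events = list(generate_triggers(rng, 75e3, 5.0, 18, 2000))

mean_gap_us = events[-1][0] / len(events) * 1e6
mean_hits = sum(sum(hits) for _, hits in events) / (len(events) * 18)
print(f"mean trigger spacing {mean_gap_us:.2f} us, mean hits/TDC {mean_hits:.2f}")
```

With a 75 kHz trigger rate the mean spacing should come out near 13.3 µs, and the mean hit count near 5.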

Figures 2a and 2b show the TDC output buffer occupancy for the full TDC simulation at 9 times the TP43 value (2a) and for the simplified simulation for values from 1x to 9x (2b). The occupancies match well for the 9x situation. One difference is clearly observed. The TDC has 32 locations in its output FIFO, whereas the simplified simulation has 64. Since the TDC has other internal buffers for data, the final position of its output FIFO is seen to be occupied in cases where the data in the simplified simulation extends beyond 32. This difference is not important, since when the FIFO is highly occupied the serial unit operates continuously, unloading the output FIFO. The simplified version must have 64 positions in order to avoid losing data, since it has no other place to hold the hits.

## 3. THE DATA FLOW SIMULATION

### 3.1 The Components

The part of the simulation concerned with modeling the TDC has already been described. It is shown in the bubble of Figure 3 labelled "Emulate 18 TDCs". For the results shown, the simulation was performed at 5 Hits/TDC and a trigger rate of 75kHz. Other rates have also been studied.



Figure 3 The components of the VerilogHDL simulation including those that provide the input specification, module definition, and performance monitoring.

A second bubble, labelled "Storage and VME I/O", is also shown. This part of the simulation is used for initialization of the TDC and CSM but is not described here, since it does not function during data flow and is not timing critical.

A third bubble labelled TTCem represents the simulation code for emulating the trigger, timing, and control in accordance with the LHC design. This code is needed for development of the control signals to the TDC and CSM modules but is not specific to the MDT system and is not described further. The actual VerilogHDL code for the TTCem is synthesized so that the CSM-0 module performs the appropriate trigger, timing, and control.

The primary unit studied in this report is the CSM module. It contains the core of the data flow functions. It deserializes the data from the 18 TDC chips and buffers them in FIFOs to await acceptance by a scanning multiplexer. Data from the multiplexer is accepted if it represents input for the current event being sought. The CSM-0 must therefore have an event FIFO that is loaded from the TTCem event stream and unloaded in turn as events are sought from the multiplexer. The CSM includes an output FIFO that accumulates data for the event. The final CSM will send data from its output FIFO to the MROD within the TSC. The CSM-0, however, sends its output FIFO data to a deep FIFO on the VME card. It also sends a word count for each event, along with the event ID, to a second FIFO. For the CSM-0 the readout sequence includes a VME read of the word count followed by a block transfer of the complete event from the data FIFO.
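The event assembly described above can be illustrated with a small behavioural sketch. It is written in Python, not VerilogHDL, ignores all timing, and is not the CSM-0 code. The word format (one header and one trailer per TDC per event, each word tagged with its event ID) and all names are assumptions made for illustration.

```python
from collections import deque

HEADER, HIT, TRAILER = "header", "hit", "trailer"

def tdc_words(event_id, n_hits):
    """Words one TDC sends for one trigger (assumed framing: header, hits, trailer)."""
    return [(HEADER, event_id)] + [(HIT, event_id)] * n_hits + [(TRAILER, event_id)]

def build_events(input_fifos, event_fifo):
    """Scan the input FIFOs, accepting only words that belong to the event sought."""
    data_fifo, count_fifo = deque(), deque()
    while event_fifo:
        event_id = event_fifo.popleft()       # next event ID from the trigger stream
        done = [False] * len(input_fifos)
        n_words = 0
        while not all(done):
            progressed = False
            for i, fifo in enumerate(input_fifos):   # one pass of the multiplexer
                if done[i] or not fifo or fifo[0][1] != event_id:
                    continue                  # nothing for the current event here
                kind, _ = fifo.popleft()
                data_fifo.append((i, kind, event_id))
                n_words += 1
                done[i] = kind == TRAILER     # this TDC has finished the event
                progressed = True
            if not progressed:
                # Hardware would wait for more data; here all data is preloaded.
                raise RuntimeError(f"input data missing for event {event_id}")
        count_fifo.append((event_id, n_words))
    return data_fifo, count_fifo

def read_event(data_fifo, count_fifo):
    """Readout sequence: read the word count, then block-transfer that many words."""
    event_id, n_words = count_fifo.popleft()
    return event_id, [data_fifo.popleft() for _ in range(n_words)]

# Two triggers: 5 hits on every TDC for event 0, 3 hits for event 1.
hits = {0: [5] * 18, 1: [3] * 18}
fifos = [deque() for _ in range(18)]
for event_id, per_tdc in hits.items():
    for fifo, n in zip(fifos, per_tdc):
        fifo.extend(tdc_words(event_id, n))

data, counts = build_events(fifos, deque(hits))
print(list(counts))                 # [(0, 126), (1, 90)]
first_id, first_block = read_event(data, counts)
print(first_id, len(first_block))   # 0 126
```

Because the multiplexer only accepts words for the event at the head of the event FIFO, words for later events stay in the input FIFOs until the last trailer of the current event has been seen.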

A final bubble in Figure 3 represents the performance monitoring code of the simulation. This code forms histograms of the FIFO occupancies, word counts/event, and processing time/event.

## 4. THE RESULTS

A representative simulation is shown in Figures 4a through 4d. The TDC output FIFO occupancy is shown in Figure 4a for 5 Hits/TDC at a 75kHz trigger rate.



Figure 4a The TDC output FIFO occupancy at 5 Hits/TDC and 75kHz trigger rate.



Figure 4b The CSM input FIFO occupancy.

Figure 4b displays the occupancy of the input FIFO buffer of the CSM module. This is the buffer that holds TDC data awaiting acceptance by the multiplexer. Although the multiplexer scans the incoming data rapidly, this buffer holds appreciable data while the CSM-0 builds an event. The event assembly logic holds off processing data for the next event while it awaits data from the last TDC for the current event. During this time the other TDC chips continue to send new data.

Figure 4c shows the word count per event. For events with 5 Hits/TDC plus headers and trailers, one expects 128 words on average. The peak of the distribution lies slightly below this value, which remains to be investigated. It may be due to round-down in the Poisson generation, since the number generation is integer based.
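The expected word count can be checked with simple arithmetic. The sketch below assumes one header and one trailer word per TDC per event, which the text does not specify. Under that assumption the 18 TDCs account for 126 words, so the quoted 128 would imply two further words per event. These are entered here as a hypothetical event-level overhead, chosen only to reproduce the quoted figure.

```python
N_TDCS = 18
MEAN_HITS_PER_TDC = 5
FRAMING_WORDS_PER_TDC = 2   # assumption: one header plus one trailer per TDC
EVENT_OVERHEAD_WORDS = 2    # hypothetical: chosen to match the 128 quoted in the text

words_from_tdcs = N_TDCS * (MEAN_HITS_PER_TDC + FRAMING_WORDS_PER_TDC)
expected_words = words_from_tdcs + EVENT_OVERHEAD_WORDS

print(words_from_tdcs, expected_words)
# prints: 126 128
```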



Figure 4c The words/event calculated by the CSM-0 as it processes an event.

The final plot, Figure 4d, shows the processing time (latency) of the CSM-0. The longest time comes from events caught behind a burst of data from previous events. The shortest time represents the minimum transmission time of events with few hits and without contention from previous events. The largest latency is about $50\,\mu$s.

## SUMMARY

Modeling of hardware with VerilogHDL offers the advantage of performance determination for critical designs. It also provides the source for the development of actual components. If the synthesis of the HDL code into either FPGA or ASIC devices can be shown to meet the clocking specifications, the simulated performance can be delivered by the actual hardware. We expect to commit the CSM-0 code to a Xilinx FPGA and construct the module within weeks.



Figure 4d The distribution of processing time for events assembled by the CSM-0 at the 75kHz trigger rate and with 5 Hits/TDC.

## 5. REFERENCES

[1] Requirements and Specifications of the TDC for ATLAS Precision Muon Tracker, Yasuo Arai and Jorgen Christensen, ATLAS Internal Note, MUON-NO-179, 14 May 1997