

#### The Digital Algorithm Processors for the ATLAS Level-1 Calorimeter Trigger

Samuel Silverstein, Stockholm University

For the ATLAS TDAQ collaboration





# Acknowledgements

- The work presented here is a major collaborative effort among the member institutes in L1Calo:
  - UK: Rutherford Appleton Laboratory, University of Birmingham, Queen Mary University of London
  - Germany: University of Heidelberg, Johannes Gutenberg University of Mainz
  - Sweden: Stockholm University
- Thanks to many collaborators whose work is included in this talk.
- NB: Much more information and detail in the conference proceedings and TNS submission



# ATLAS Trigger/DAQ







# Sliding window algorithms

#### Jet algorithm:

#### EM cluster algorithm:





# Digital algorithm processors

- Two independent subsystems
  - Cluster Processor (CP)
  - Jet/Energy-sum Processor (JEP)
- These have evolved together
  - Common architecture, technology choices
  - In many cases, common hardware





- Common architecture
  - Input data links
  - Readout scheme
  - Slow control
- Hardware
- Production/commissioning experience
- Outlook (upgrade)



### Input data links

- Technology: 480 MBaud serial links
  - National Semiconductor Bus-LVDS
  - 10 bits data @ 40 MHz + 2 sync bits
  - 11 m shielded parallel-pair cables for all links
- Jet/Energy-sum processor (2 crates)
  - 0.2×0.2 "jet elements" (9 bits + parity)
  - ~ 1400 links per crate
- Cluster processor (4 crates)
  - 0.1×0.1 "trigger towers" from PreProcessor (8 bits + parity)
  - ~2240 EM and hadronic trigger towers per crate!
  - We exploit a feature of PreProcessor to halve the number of links needed in CP (to ~1120)
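The link figures above follow from simple arithmetic, sketched here in Python. The constant and function names are illustrative; the numbers themselves come from the bullets above.

```python
# Link-budget arithmetic for the L1Calo input links (names are illustrative).
BC_CLOCK_MHZ = 40   # one word per bunch crossing
DATA_BITS = 10      # data bits per word
SYNC_BITS = 2       # sync bits added per word


def line_rate_mbaud() -> int:
    """Serial line rate: all transmitted bits per word times the word rate."""
    return (DATA_BITS + SYNC_BITS) * BC_CLOCK_MHZ


def payload_rate_mbit() -> int:
    """Useful data rate: data bits only, without the sync bits."""
    return DATA_BITS * BC_CLOCK_MHZ


def cp_links_per_crate(towers: int = 2240, towers_per_link: int = 2) -> int:
    """Links needed per CP crate when two towers share one link."""
    return towers // towers_per_link


if __name__ == "__main__":
    print(line_rate_mbaud(), "MBaud on the wire")
    print(payload_rate_mbit(), "Mbit/s of data")
    print(cp_links_per_crate(), "links per CP crate")
```

The 480 MBaud line rate and the 400 Mbit/s payload rate describe the same link, with and without the two sync bits.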





# **BC** Multiplexing (BCMUX)

- Two neighboring trigger towers paired on one link
- On **non-zero tower**, **two** consecutive words
  - energy (8b)
  - parity
  - BCMUX disambiguation
- JEP cannot use BCMUX
  - 4-tower sums
  - full BC not necessarily followed by an empty one
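A minimal Python sketch of how bunch-crossing multiplexing could work, assuming (as the JEP bullet implies) that a non-zero tower is always followed by an empty bunch crossing on the same tower. The function names, the `(energy, flag)` word layout, and the flag meanings (first word: which tower; second word: which bunch crossing) are illustrative assumptions, not the PreProcessor's actual encoding.

```python
def bcmux_encode(a, b):
    """Multiplex two tower streams onto one link, one (energy, flag) word per BC.

    Assumes each stream obeys the rule "non-zero is followed by zero" and
    that the final bunch crossing is empty.
    """
    assert len(a) == len(b)
    words = []
    n = 0
    while n < len(a):
        if a[n] == 0 and b[n] == 0:
            words.append((0, 0))            # nothing to send in this BC
            n += 1
            continue
        first_is_b = a[n] == 0              # tower A goes first if it fired
        first, other = (b, a) if first_is_b else (a, b)
        words.append((first[n], int(first_is_b)))    # flag = which tower
        if n + 1 < len(a):
            if other[n] != 0:
                words.append((other[n], 0))          # flag 0: same BC as first word
            else:
                words.append((other[n + 1], 1))      # flag 1: following BC (may be 0)
        n += 2
    return words


def bcmux_decode(words):
    """Recover both tower streams from the multiplexed words."""
    a = [0] * len(words)
    b = [0] * len(words)
    n = 0
    while n < len(words):
        energy, flag = words[n]
        if energy == 0:                     # idle word: both towers empty
            n += 1
            continue
        first, other = (b, a) if flag else (a, b)
        first[n] = energy
        if n + 1 < len(words):
            energy2, flag2 = words[n + 1]
            other[n + flag2] = energy2      # flag2 selects BC n or BC n+1
        n += 2
    return a, b


if __name__ == "__main__":
    tower_a = [0, 5, 0, 0, 7, 0, 0, 0]
    tower_b = [0, 3, 0, 9, 0, 2, 0, 0]
    link = bcmux_encode(tower_a, tower_b)
    assert bcmux_decode(link) == (tower_a, tower_b)
    print(link)
```

The round trip only works because the tower that fired first is known to be empty in the next bunch crossing, which frees that slot for the neighbouring tower. Jet elements are sums of several towers, so that guarantee is lost and the scheme does not apply to the JEP.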



#### Readout

- Optical fiber with HP/Agilent G-Links
  - Up to 20 parallel bits of user data at 40 MHz
  - Readout encoded in parallel bitstreams
  - Data Available (DAV) bit flags new readout frame arriving at ROD
  - Example: 84-bit CPM DAQ readout frame:

| G-Link bit | Bitstream | Fields, in the order shown in the source |
|------------|-----------|------------------------------------------|
| D0 | Serialiser 0 + Threshold 0 count + Parity | GP, 3b Thr0, 10b TTD2, 10b TTC2, 10b TTB2, 10b TTA2, 10b TTD1, 10b TTC1, 10b TTB1, 10b TTA1 |
| ... | ... | ... |
| D15 | Serialiser 15 + Threshold 15 count + Parity | GP, 3b Thr15, 10b TTD2, 10b TTC2, 10b TTB2, 10b TTA2, 10b TTD1, 10b TTC1, 10b TTB1, 10b TTA1 |
| D16 | Serialiser 16 + BC + Parity | GP, 3b BC, 10b TTD2, 10b TTC2, 10b TTB2, 10b TTA2, 10b TTD1, 10b TTC1, 10b TTB1, 10b TTA1 |
| ... | ... | ... |
| D19 | Serialiser 19 + BC + Parity | GP, 3b BC, 10b TTD2, 10b TTC2, 10b TTB2, 10b TTA2, 10b TTD1, 10b TTC1, 10b TTB1, 10b TTA1 |

Expanded trigger tower datum (10b): LD, SE, trigger tower data d7 to d0, where LD = LVDS Link Down and SE = LVDS Serialiser Error.

Each row is one G-Link bitstream, read along the time axis. Source: Glinkfmt.xls/CPM DAQ.
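The frame size in the example can be checked with a short Python sketch. It assumes the parity field is a single bit (which is what makes the total come to 84), that each bitstream carries one bit per 40 MHz clock tick, and that LD and SE sit above d7 in the 10-bit datum. The bit positions and all names are illustrative assumptions.

```python
BC_CLOCK_MHZ = 40
TOWERS_PER_STREAM = 8      # TTA1..TTD1 and TTA2..TTD2
TOWER_DATUM_BITS = 10      # LD + SE + d7..d0
COUNT_OR_BC_BITS = 3       # threshold count (D0-D15) or BC bits (D16-D19)
PARITY_BITS = 1            # assumption: one parity bit per bitstream


def frame_bits() -> int:
    """Length of one bitstream of the CPM DAQ readout frame."""
    return TOWERS_PER_STREAM * TOWER_DATUM_BITS + COUNT_OR_BC_BITS + PARITY_BITS


def frame_time_ns() -> float:
    """Time to shift out one frame at one bit per clock tick (assumption)."""
    return frame_bits() * 1000 / BC_CLOCK_MHZ


def pack_tower_datum(link_down: bool, serialiser_error: bool, energy: int) -> int:
    """Build a 10-bit datum; the bit order LD, SE, d7..d0 is assumed."""
    if not 0 <= energy <= 0xFF:
        raise ValueError("energy must fit in 8 bits")
    return (int(link_down) << 9) | (int(serialiser_error) << 8) | energy


def unpack_tower_datum(word: int):
    """Inverse of pack_tower_datum: returns (link_down, serialiser_error, energy)."""
    return bool(word >> 9 & 1), bool(word >> 8 & 1), word & 0xFF


if __name__ == "__main__":
    print(frame_bits(), "bits per bitstream")
    print(frame_time_ns(), "ns per frame")
    print(hex(pack_tower_datum(True, False, 0xA5)))
```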



### Slow control

- ATLAS DCS uses CANbus to monitor crate temperatures, voltages, etc.
- Module-level monitoring in L1Calo
  - Internal CANbus on crate backplane
  - Fujitsu microcontroller with 10-bit ADC on each board
    - FPGA temperatures
    - voltages, currents, etc
- One microcontroller provides bridge between internal CANbus and DCS
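As an illustration of the module-level monitoring, the sketch below converts a 10-bit ADC reading to a voltage and checks it against limits. The reference voltage, the divider ratio, the limits, and the function names are hypothetical placeholders, not values from the L1Calo hardware.

```python
ADC_BITS = 10
ADC_FULL_SCALE = (1 << ADC_BITS) - 1   # 1023 counts


def adc_to_volts(counts: int, vref: float = 5.0, divider: float = 1.0) -> float:
    """Convert raw ADC counts to volts.

    vref and divider are hypothetical: the real reference voltage and any
    input scaling depend on the board design.
    """
    if not 0 <= counts <= ADC_FULL_SCALE:
        raise ValueError("counts out of range for a 10-bit ADC")
    return counts / ADC_FULL_SCALE * vref * divider


def within_limits(value: float, low: float, high: float) -> bool:
    """True if a monitored quantity is inside its allowed window."""
    return low <= value <= high


if __name__ == "__main__":
    reading = adc_to_volts(675)
    # Hypothetical window for a 3.3 V rail
    print(round(reading, 3), within_limits(reading, 3.1, 3.5))
```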





#### Cluster Processor Module (CPM)





#### CPM real time data path











### **Control and Monitoring**





# Jet/Energy Module (JEM)

- Input Stage:
  - 16 6-channel LVDS deserialisers (SCAN921260)
  - 20 FPGAs (XC2V1500)
- Processing Stage:
  - 2 FPGAs (XC2V3000)
- 14-layer main board with most functionality on daughter modules





# Jet/Energy Module (JEM)

(conceptual drawing)





# **Results merging**

- Two modules in each crate
  - EM, hadron clusters
  - Energy sums, Jets
- One crate performs system level merging





### **Common Merger Module**



- One multi-purpose module does cluster, jet or energy-sum merging
- Xilinx System ACE used to automatically load correct firmware based on geographic address
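The selection logic can be pictured as a lookup from geographic address to firmware type. The sketch below is purely illustrative: the crate numbering, the slot labels, and the function name are hypothetical, and the real mapping lives in the System ACE configuration, not in software like this.

```python
from enum import Enum


class Firmware(Enum):
    CLUSTER = "cluster merging"
    JET = "jet merging"
    ENERGY_SUM = "energy-sum merging"


# Hypothetical geographic addressing: crate numbers and slot labels are
# placeholders chosen for this example only.
CP_CRATES = {0, 1, 2, 3}    # four Cluster Processor crates
JEP_CRATES = {4, 5}         # two Jet/Energy-sum Processor crates
JET_SLOT = 0                # hypothetical label of the jet CMM position
ENERGY_SLOT = 1             # hypothetical label of the energy-sum CMM position


def select_firmware(crate: int, slot: int) -> Firmware:
    """Pick the merger firmware from (crate, slot), as a geographic address would."""
    if crate in CP_CRATES:
        return Firmware.CLUSTER
    if crate in JEP_CRATES:
        if slot == JET_SLOT:
            return Firmware.JET
        if slot == ENERGY_SLOT:
            return Firmware.ENERGY_SUM
    raise ValueError(f"no firmware defined for crate {crate}, slot {slot}")


if __name__ == "__main__":
    print(select_firmware(2, 0).value)
    print(select_firmware(4, 1).value)
```

Keying the choice on position means one module type can serve as a spare for every merger role.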





### **Processor backplane**



18 layers, 8 signal layers:





# Backplane hardware

- Vertical stiffening bars provide
  - Rigidity against inject/eject forces (~450N)
  - Mount points for cable retention, LV bus bar hardware
- Rear transition modules for CMM-to-CMM cabling





# Other backplane features

- Reduced VME bus (VME--)
  - Constrained by limited pin count
  - A24D16 slave cycles
  - 3.3V levels
- Extended geographic addressing
  - Position in crate
  - Which crate in system (set by rotary switch)



### Fully cabled rack (JEP)





### Other modules

- Common ROD for DAQ and ROI readout
  - FPGA based (XC2VP20)
  - Many different firmwares to process different readout formats
  - System ACE + geographic addressing used
- Timing and Control Module (TCM)
  - Distribute TTC signal
  - Bridge between DCS and internal CANbus
  - VME display
- Clock Alignment Module (CAM)
  - Phase measurement between crate and module clocks
  - important for timing-in of CP



# Summary - things learned

- Common architecture / hardware:
  - Saved much engineering/development/testing effort
  - Fewer module types and spares
  - Simplified software development
- Even conservative solutions come with costs
  - Many compelling reasons to use 400 Mbit/s LVDS, but resulting cable plant near limits of what was feasible
  - Parallel backplane has a conservative design, but high pin counts caused their own problems (insertion forces, vulnerable male backplane pins)
- Count on production delays!
  - Manufacturer-related errors on many major system components



# Outlook: LHC upgrade

- Two-phase LHC luminosity upgrade expected
  - Phase 1 (2014): 2-3×10<sup>34</sup> cm<sup>-2</sup>s<sup>-1</sup>
  - Phase 2 (2018): 10<sup>35</sup> cm<sup>-2</sup>s<sup>-1</sup>
- Phase 1 has short time scale
  - No fundamental change to system
  - But would like to expand L1Calo capabilities
    - Topological triggers?
    - Identify overlapping features in CP and JEP?
- Q: Can we put feature positions in Level-1 real time data path for Phase 1?



# Data merger transmission

- Transmission tests of longest merger lines
  - 2.5V CMOS levels
  - Parallel destination termination (~JEM)
  - Series source termination (~CPM)





### L1Calo Phase-1 upgrade





#### Thank you for your attention!