# DEVELOPMENT OF A SECOND-GENERATION SYSTEM FOR THE RELIABLE DISTRIBUTION OF MACHINE PROTECTION PARAMETERS

S. Bolton<sup>\*</sup>, M. Blaszkiewicz, A. Colinet, L. Felsberger, J. Guasch-Martinez, C. Martin, I. Romera, R. Secondo, J. Uythoven, CERN, Geneva, Switzerland

### Abstract

The Safe Machine Parameter (SMP) system is an electronic hardware-based system which has been an integral part of the LHC's machine protection strategy since it started operation. Its primary objective is to provide several parameters and interlock signals to critical machine protection users across the LHC and SPS accelerators, whilst prioritising high reliability and availability. After almost two decades of operation, it is necessary to upgrade the SMP hardware electronics. In the High Luminosity LHC era the requirements of connected systems have changed, leading to new system functions and operational requirements which must be integrated into the new design. This paper details the electronic design considerations of developing the second-generation SMP. The general distribution of parameters relies on the CERN WhiteRabbit timing network renovation, for which dedicated high-precision clock components were selected and tested on a prototype board. Details of the hardware design and validation are discussed, along with the comprehensive upgrades aimed at delivering an SMP system with expanded monitoring and diagnostic features.

# **INTRODUCTION**

The SMP distributes critical protection parameters widely across the SPS and LHC accelerators using a combination of direct links and broadcasting over the CERN timing network. In the current SMP, parameters are broadcast over the General Machine Timing (GMT) [1], whereas the upgraded SMP will use the WhiteRabbit timing network [2]. The parameters are calculated primarily from measurements of machine energy and beam intensity, which are provided by several different sources connected redundantly to maximise reliability. A key user of the SMP is the Beam Interlock System [3]. It is connected directly and redundantly to the SMP, and uses SMP flags to mask interlocks during beam setup conditions and to direct the SPS beam extraction.

It is necessary to upgrade the SMP hardware because of ageing and obsolete components and the expanded requirements resulting from the High Luminosity LHC upgrade [4]. In particular, the FPGAs used in the current hardware are at their capacity limit and cannot accommodate the parameter expansions. Compatibility with the WhiteRabbit network also requires additional components, which further necessitates the upgrade.

### System Overview

The new SMP Controller consists of a number of modules in a VME64X [5] crate interconnected using a passive backplane. Separate controllers exist for the SPS and LHC, which maintain the same overall structure. The functional modules of the SMP controller use the 'Control Interlock SMP' (CIS) name space. They consist of Receivers (CISR), Generators (CISG) and the Arbiter (CISA). A fourth module, the User Interface (CISU), is used to decode the Safe Machine Parameters and output them to safety-critical users.



Figure 1: Functional block diagram of the SMP system. Boxes  $R_X$  represent receiver boards,  $G_X$  generators, A is the arbiter and U is the user board.

The interconnections of the SMP are shown in Fig. 1. The functions of each module are as follows:

- **CISR** Receives measurements of machine energy and beam intensity from sources. Up to 4 sources connect to each CISR, and 4 CISR are present in each controller resulting in up to 16 possible source inputs. After verification the data is forwarded to two duplicate CISG.
- **CISG** Calculates the critical parameters using the source measurements and software inputs. Failsafe values are applied to the parameters when certain sources are unavailable or inconsistent according to the outcomes of reliability analysis [6]. Some critical parameters are output directly from the generators to critical users maintaining redundant A and B paths. All the parameters are also output to the CISA.
- **CISA** Selects the 'safest', or most conservative, of the two CISG inputs to transmit to a general distribution network. If either CISG input is unavailable, failsafe values are applied.
- **CISU** Decodes parameters received over the GMT/WhiteRabbit networks and provides them to critical users using outputs on the backplane or frontpanel. If no parameters are received within a defined timeout period, failsafe values are applied.

<sup>\*</sup> samuel.lyndon.bolton@cern.ch
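The arbitration rule described for the CISA can be illustrated with a minimal Python sketch. It is not the SMP gateware: the flag names, the convention that `True` is permissive and `False` is the failsafe state, and the use of `None` for an unavailable generator input are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional

# Assumed convention (not from the paper): True = permissive, False = failsafe.
Flags = Dict[str, bool]


def failsafe_flags(names: Iterable[str]) -> Flags:
    """Return every flag in its (assumed) failsafe state."""
    return {name: False for name in names}


def arbitrate(gen_a: Optional[Flags], gen_b: Optional[Flags],
              names: Iterable[str]) -> Flags:
    """Pick the most conservative value of each flag from two generator inputs.

    None stands for an unavailable generator input (hypothetical encoding).
    """
    names = list(names)
    if gen_a is None or gen_b is None:
        # Either input missing: apply failsafe values to everything.
        return failsafe_flags(names)
    # A flag is permissive only if both generators say so;
    # a flag missing from an input is treated as failsafe.
    return {n: gen_a.get(n, False) and gen_b.get(n, False) for n in names}
```

With this convention the "safest" choice reduces to a logical AND per flag. Numeric parameters would need their own ordering of what counts as conservative, which this sketch does not attempt to model.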

# HARDWARE DESIGN

The base module of the SMP controller is the CISX, shown in Fig. 2, a general-purpose hardware upon which each of the functional modules can be programmed. It was designed to fill all the roles of the SMP depending on the gateware programmed on the FPGA. This results in design considerations to maximise adaptability, versatility and compatibility. The CISX is a VME form factor PCB containing two Xilinx FPGAs. The Critical FPGA implements the logic of the critical path, calculating and outputting the Safe Machine Parameters. The Monitor FPGA performs all monitoring, diagnostics, logging, and is the point of software interaction. The hardware design of the CISX reflects detailed analysis of the system requirements, evaluation of prototypes and the outcomes of reliability studies of the components incorporating failure modes, effects and criticality analysis [6].



Figure 2: The CISX PCB with key interfaces and the FPGAs highlighted.

### WhiteRabbit Interface

A key method of Parameter distribution in the SMP upgrade is the CERN WhiteRabbit Timing, upon which the Arbiter will distribute the parameters. The CISU will receive Parameters from the network and output to safety-critical users. Other non-critical users will also be able to decode the Safe Machine Parameters from the network using their own hardware.

WhiteRabbit is a Gigabit Ethernet (GbE) based network built on the Precision Time Protocol [7]. For full WhiteRabbit compatibility the CISX needs to be GbE compatible and to include several additional components for generating a precise transceiver clock frequency using a high-precision phase-locked loop (PLL). These consist of a voltage-output DAC, a Voltage Controlled Temperature Compensated Crystal Oscillator (VCTCXO) to supply a highly stable reference clock, and a frequency synthesizer to multiply the reference clock to 125 MHz, a value suitable for the FPGA transceivers at GbE line speeds.



Figure 3: Altium Designer Schematic showing the high precision components of one of the two CISX PLLs. FPGA Logic uses the DAC to set the control voltage to finely tune the frequency of the VCTCXO, which is multiplied by the Clock Synthesizer up to 125 MHz.
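The clock chain can be put into numbers with a short Python sketch. The 125 MHz transceiver reference comes from the text, and the 1.25 GBd line rate follows from the 8b/10b coding used by 1000BASE-X. The oscillator frequency, pull range and DAC width are hypothetical example values rather than the components chosen for the CISX, and the tuning curve is assumed to be linear.

```python
GBE_PAYLOAD_RATE_BPS = 1_000_000_000
# 8b/10b coding sends 10 line bits for every 8 payload bits.
LINE_RATE_BAUD = GBE_PAYLOAD_RATE_BPS * 10 // 8
TRANSCEIVER_REF_HZ = 125_000_000  # reference clock frequency quoted in the text


def synthesizer_ratio(vctcxo_hz: float) -> float:
    """Multiplication factor the synthesizer must apply to reach 125 MHz."""
    return TRANSCEIVER_REF_HZ / vctcxo_hz


def dac_step_ppb(pull_range_ppm: float, dac_bits: int) -> float:
    """Frequency change per DAC code, in parts per billion.

    Assumes the VCTCXO can be pulled linearly by +/- pull_range_ppm
    over the full DAC output range.
    """
    total_span_ppb = 2 * pull_range_ppm * 1000
    return total_span_ppb / (2 ** dac_bits)


# Hypothetical example values, NOT the CISX component parameters.
EXAMPLE_VCTCXO_HZ = 25_000_000
EXAMPLE_PULL_RANGE_PPM = 10
EXAMPLE_DAC_BITS = 16

if __name__ == "__main__":
    print("line rate / reference:", LINE_RATE_BAUD // TRANSCEIVER_REF_HZ)
    print("synthesizer ratio:", synthesizer_ratio(EXAMPLE_VCTCXO_HZ))
    print("DAC step [ppb]:", dac_step_ppb(EXAMPLE_PULL_RANGE_PPM, EXAMPLE_DAC_BITS))
```

The last figure shows why the DAC resolution matters: the finer the frequency step per DAC code, the more closely the control loop can hold the local oscillator to the network reference.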

Two alternative PLL implementations were included on the prototype so that they could be evaluated and compared. The schematic for one of these PLLs is shown in Fig. 3. Additional SMA connectors were added to allow the generated reference clocks to be inspected with an oscilloscope. CISX boards programmed to use either PLL reference clock were connected to a WR Ethernet switch and tested by sending and receiving data. Both PLL outputs were found to be suitable, so the choice for the production version was made based on component availability and price.

### Other Interfaces and Expansion Capacity

The SMP is due to be installed in LHC Long Shutdown 3 (LS3) and will remain in operation for over 15 years. It is therefore prudent to leave room for expansion in the design, to accommodate unforeseen requirements which may arise after High-Luminosity operation has started. Additionally, the SMP interfaces with several other systems, and careful consideration was given to compatibility when designing the CISX.

An overview of the additional interfaces of the CISX is listed below.

- Single ended TTL transceivers are used to connect between boards in the SMP Controller Crates.
- The CISR connect to Sources, sometimes over large distances, which favours optical fibre links. The connection is made using SFP modules and integrated high-speed FPGA transceivers connected to the Sources in a direct point-to-point fashion. Spare SFP slots are available in both SMP controllers.
- The CISG receive and output Direct Flags through the backplane. For this, differential RS-485 transceivers were used, and spare transceivers included in the design. Each can be configured as an input or output to adapt to future demands.
- The CISA requires connections to both the GMT and WhiteRabbit networks.
- The CISU requires the same network connections as the Arbiter for reception. It also outputs direct Flags and serialised Parameters in several formats to a variety of Users with different interface requirements. The parameters are output using the TTL and RS-485 transceivers on the backplane and several TTL frontpanel outputs. A frontpanel mezzanine slot is included and is used to output directional RS-485 Flags used by experiments in the LHC. The mezzanine slot also serves as general-purpose I/O.

# GATEWARE DEVELOPMENT

The SMP logic is built on 16 Gateware projects targeting FPGAs across the SPS and LHC. It is important that critical paths are designed in a way which maximises safety, implementing failsafe timeouts and error detection in each design.

### Monitoring

Effective monitoring and diagnostics are essential to the successful operation of the SMP. The monitoring of the SMP falls under two categories.

- **Registers:** Memory mapped registers display real-time status updates of the boards and ongoing operations and serve as a method for configuring board settings.
- **Events:** Key events and changes on the board are defined and recorded in a looping memory block, referred to as the History Buffer. Each event has a trigger and a timestamp and potentially additional details. The History Buffer is read, and a record of the event is stored externally, providing a detailed history of the boards.
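The looping behaviour of the History Buffer can be sketched as a ring buffer in Python. The event fields, the buffer depth and the readout method are hypothetical; the real buffer is implemented in the Monitor FPGA and its layout is not described here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    event_id: int       # hypothetical identifier of the trigger
    timestamp_ns: int   # time at which the trigger fired
    detail: int = 0     # optional additional details


class HistoryBuffer:
    """Fixed-depth looping memory: the newest event overwrites the oldest."""

    def __init__(self, depth: int) -> None:
        if depth <= 0:
            raise ValueError("depth must be positive")
        self._slots: List[Optional[Event]] = [None] * depth
        self._count = 0  # total number of events ever recorded

    def record(self, event: Event) -> None:
        self._slots[self._count % len(self._slots)] = event
        self._count += 1

    def read_new(self, last_seen: int) -> Tuple[List[Event], int, int]:
        """Return (events since last_seen, new read position, events lost).

        Events are lost when more than `depth` were recorded between reads.
        """
        depth = len(self._slots)
        lost = max(0, self._count - last_seen - depth)
        first = max(last_seen, self._count - depth)
        events = [self._slots[i % depth] for i in range(first, self._count)]
        return events, self._count, lost
```

A reader that polls `read_new` with the position returned by its previous call receives each event once, oldest first, and can tell when the buffer has wrapped before it was read.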

The registers and events are defined in master description files. Scripts are used to automatically generate VHDL code, documentation, and software helper files. A checksum and timestamp are calculated upon each generation to allow version checks to ensure cohesion between the VHDL and software readout tools.
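A version check of this kind might look as follows in Python. This is a sketch under assumptions: the description is modelled as a plain dictionary, CRC-32 stands in for whatever checksum the real scripts compute, the constant name `C_MEMMAP_CHECKSUM` is invented, and the timestamp is left out for brevity.

```python
import json
import zlib


def description_checksum(description: dict) -> int:
    """CRC-32 over a canonical serialisation of the master description."""
    # Sorting keys makes the checksum independent of the order of entries.
    canonical = json.dumps(description, sort_keys=True, separators=(",", ":"))
    return zlib.crc32(canonical.encode("utf-8")) & 0xFFFFFFFF


def vhdl_constant(checksum: int) -> str:
    """Emit the checksum as a VHDL constant (hypothetical name)."""
    return ('constant C_MEMMAP_CHECKSUM : std_logic_vector(31 downto 0) '
            f':= x"{checksum:08X}";')


def versions_match(checksum_read_from_board: int, description: dict) -> bool:
    """True if the gateware was generated from the same description."""
    return checksum_read_from_board == description_checksum(description)
```

Because the VHDL constant and the software helper files are generated from the same description in one pass, a mismatch at readout indicates that the gateware and the tools come from different generations of the description.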

### Software Inputs and Test Modes

In the complex environment of SPS and LHC accelerator operation, software inputs are occasionally necessary for SMP generation. For example, setting the current 'Beam Mode' in the Generators affects the calculation of certain flags. However, reliance on software introduces additional risk through possible network or software failures. Where software inputs affect critical parameters, this risk is mitigated by requiring periodic refresh writes and incorporating timeouts. Any software setting reverts to the safest, most conservative option in case of communication failures or errors.
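The refresh-and-timeout mitigation can be sketched in Python as follows. The class name, the timeout value and the explicit `now_s` time argument are illustrative assumptions; in the SMP this logic lives in gateware.

```python
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class RefreshedSetting(Generic[T]):
    """A software-written value that falls back to a safe default
    unless it is rewritten within `timeout_s` seconds."""

    def __init__(self, safe_value: T, timeout_s: float) -> None:
        self._safe_value = safe_value
        self._timeout_s = timeout_s
        self._value: T = safe_value
        self._last_write_s: Optional[float] = None  # None = never written

    def write(self, value: T, now_s: float) -> None:
        """A refresh write: store the value and restart the timeout."""
        self._value = value
        self._last_write_s = now_s

    def read(self, now_s: float) -> T:
        """Return the written value, or the safe value once the refresh is overdue."""
        if self._last_write_s is None:
            return self._safe_value
        if now_s - self._last_write_s > self._timeout_s:
            return self._safe_value
        return self._value
```

Time is passed in explicitly so that the behaviour is deterministic and easy to test. If the software stops refreshing, whether through a crash or a network fault, the setting falls back to the conservative default on its own.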

The incorporation of test modes in the SMP is useful for commissioning and system health checks. Certain values can be temporarily forced at SMP output points to observe logic transitions and validate the function of interfacing electronics. The risk introduced by forcing parameters is reduced by implementing timeouts on the test mode commands, ensuring that any forcing lasts for a short time only. Where flags are output in redundant A and B paths, only one flag can be forced to a non-failsafe state at a time, ensuring that the overall logic remains in the safe state. In addition, access to the test registers is restricted to select users.
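The two test-mode safeguards, a timeout on forcing and mutual exclusion between the A and B paths, can be sketched in Python. The class, its method names and the convention that `True` is the non-failsafe state are assumptions for illustration only.

```python
from typing import Optional, Tuple


class RedundantFlagForcer:
    """Allows one of two redundant flag paths to be forced for a limited time."""

    def __init__(self, max_force_s: float) -> None:
        self._max_force_s = max_force_s
        self._forced_path: Optional[str] = None  # "A", "B" or None
        self._expires_at_s = 0.0

    def _expire(self, now_s: float) -> None:
        if self._forced_path is not None and now_s >= self._expires_at_s:
            self._forced_path = None

    def request_force(self, path: str, now_s: float) -> bool:
        """Try to force `path` to the non-failsafe state; return True if accepted."""
        if path not in ("A", "B"):
            raise ValueError("path must be 'A' or 'B'")
        self._expire(now_s)
        if self._forced_path is not None:
            return False  # a force is already active: never two at once
        self._forced_path = path
        self._expires_at_s = now_s + self._max_force_s
        return True

    def outputs(self, logic_a: bool, logic_b: bool, now_s: float) -> Tuple[bool, bool]:
        """Flag values on paths A and B (assumed: True = non-failsafe)."""
        self._expire(now_s)
        out_a = True if self._forced_path == "A" else logic_a
        out_b = True if self._forced_path == "B" else logic_b
        return out_a, out_b
```

As long as the real logic holds both paths in the failsafe state, at most one output can ever read as non-failsafe, so a consumer that requires both paths to agree remains safe while the forced path is exercised.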

### Simulation and Verification

Thorough verification of Gateware logic is essential for the function of the SMP, necessitating extensive simulation tests. Tests were created using the VUnit framework, with separate tests written to target a specific area, parameter or function of each FPGA.

For the Critical FPGAs, tests verify critical aspects such as parameter generation, logic validation, failsafe functionality and outputs. For the Monitor FPGAs dedicated tests verify each history buffer event, and memory map registers alongside their read/write permissions. Through systematic testing, potential discrepancies are unearthed and rectified, culminating in safe and verified designs.

### System Testing

For the full system test a sequential validation approach was employed, where each board serves as a test platform for the preceding module. For example, the generators verify the outputs of the receivers, and are in turn verified by the arbiter. This sequential validation ensures the integrity of the entire system. The source inputs were modelled, and the parameter outputs recorded to complete the test loop. A separate board was programmed to model the GMT network and a WhiteRabbit compatible switch was used to model the WhiteRabbit network. Python-based tools were developed, enabling dynamic monitoring and validation of the full SMP Controllers to allow real-time verification.

# CONCLUSION

The SMP upgrade is reaching the final stages of its development cycle. The design process consisted of electronics PCB design, gateware development including new functions for both the SPS and LHC, infrastructure layout, and extensive testing of each element. Attention was paid at each stage to prioritising machine safety whilst accommodating the diverse needs of connecting systems. The latest CISX is currently being validated before full production is launched. Installation of the full system will occur in 2026.

# REFERENCES

- [1] B. Todd, "Safe Machine Parameters Interface to General Machine Timing", CERN, Geneva, Switzerland, EDMS 901688, 2008.
- [2] T. Włostowski *et al.*, "White Rabbit and MTCA.4 Use in the LLRF Upgrade for CERN's SPS", in *Proc. ICALEPCS'21*, Shanghai, China, Oct. 2021, pp. 847–852. doi:10.18429/JACoW-ICALEPCS2021-THBR02
- [3] I. Romera *et al.*, "Design considerations for CERN's second-generation Beam Interlock System", in *Proc. IPAC'23*, Venice, Italy, May 2023, pp. 4105–4108. doi:10.18429/JACoW-IPAC2023-THPA062
- [4] R. Secondo, "Safe Machine Parameter (SMP) v2 Functional Specification", CERN, Geneva, Switzerland, EDMS 2517245.
- [5] ANSI, "American National Standard for VME64 Extensions", 1997.
- [6] M. Blaszkiewicz *et al.*, "Reliability studies for CERN's new safe machine parameter system", in *Proc. IPAC'23*, Venice, Italy, May 2023, pp. 4085–4088. doi:10.18429/JACoW-IPAC2023-THPA057
- [7] "IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems", IEEE Std 1588-2008, 2008.