Abstract
New-generation particle accelerators are increasingly complex and have more stringent requirements on parameter quality and efficiency. Current control algorithms for accelerators mostly rely on linearization of the accelerator dynamics and/or manual tuning of parameters. In order to meet the expectations of future particle accelerators, control systems are evolving and increasingly embrace machine learning techniques. Deep Reinforcement Learning (DRL) is a promising approach to developing smart and efficient controllers that can potentially run particle accelerator processes autonomously. In this thesis, an autonomous controller based on Hierarchical Deep Reinforcement Learning was developed and applied to learn how to correct the AWAKE electron line trajectory. The proposed algorithm consists of a goal-based, two-level hierarchy of policies that learn optimal control without a model of the control problem. It uses so-called off-policy experience replay to be sample-efficient. Studies have shown that, by abstracting decision making into two layers of policies, the controller manages to solve more complex tasks with longer time horizons. The algorithm, developed in the course of this master's project, is evaluated on the AWAKE electron line steering task, which is popular for testing novel control methods due to its high repetition rate and low risk of damage to the machine. The final implementation of the hierarchical controller achieves an average success rate of 99.85% for AWAKE electron line trajectory correction and solves the task in two steps on average. The possibility of using pretrained agents as part of a hierarchical control structure is particularly appealing, as it allows learned low-level "skills" to be re-used. When a classically pretrained controller was used as the lower-level agent, the resulting algorithm was able to make good use of the available "skills", reaching a success rate of 80.3%.
This thesis gives a brief introduction to the concepts of Reinforcement Learning before describing the AWAKE trajectory steering problem in detail. The implementation of the chosen Hierarchical Reinforcement Learning algorithm, HIRO, for the AWAKE steering problem is also summarised. The performance achieved with this RL algorithm is presented at the end of this thesis.