Abstract
New-generation particle accelerators are increasingly complex and have more stringent requirements on parameter quality and efficiency. Current control algorithms for accelerators mostly rely on linearization of the accelerator dynamics and/or manual tuning of parameters. In order to meet the expectations of future particle accelerators, control systems are evolving and increasingly embrace machine learning techniques. Deep Reinforcement Learning (DRL) is a promising approach to developing smart and efficient controllers that can potentially run particle accelerator processes autonomously. In this thesis, an autonomous controller based on Hierarchical Deep Reinforcement Learning was developed and applied to learn how to correct the AWAKE electron line trajectory. The proposed algorithm consists of a goal-based, two-level hierarchy of policies that learn optimal control without a model of the control problem. It uses so-called off-policy experience replay to be sample-efficient. Studies have shown that, by abstracting decision making into two layers of policies, the controller manages to solve more complex tasks with longer time horizons. The algorithm, developed in the course of this master's project, is evaluated on the AWAKE electron line steering task, which is popular for testing novel control methods due to its high repetition rate and low risk of damage to the machine. The final implementation of the hierarchical controller achieves an average success rate of 99.85% for AWAKE electron line trajectory correction and solves the task in two steps on average. The possibility of using pretrained agents as part of a hierarchical control structure is particularly appealing, as it allows learned low-level "skills" to be re-used. When a classically pretrained controller was used as the lower-level agent, the resulting algorithm was able to make good use of the available "skills", reaching a success rate of 80.3%.
This thesis gives a brief introduction to the concepts of Reinforcement Learning before describing the AWAKE trajectory steering problem in detail. The implementation of the chosen Hierarchical Reinforcement Learning algorithm, HIRO, for the AWAKE steering problem is also summarised. The performance achieved with this RL algorithm is presented at the end of this thesis.