Abstract
One of the primary limitations on the achievable accelerating gradient in normal-conducting accelerator cavities is the occurrence of vacuum arcs, also known as RF breakdowns. A recent study of experimental data from the CLIC XBOX2 test stand at CERN proposes the use of supervised machine learning methods for predicting RF breakdowns. Because RF breakdowns occur relatively infrequently during operation, the majority of the data consists of non-breakdown pulses. This situation is known in machine learning as class imbalance, and it is problematic for training the models. This paper proposes the use of data augmentation methods to generate synthetic data to counteract this problem. Different data augmentation methods, such as random transformations and pattern mixing, are applied to the experimental data from the XBOX2 test stand, and their efficiency is compared.
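
To make the two families of augmentation methods named above concrete, the following is a minimal sketch and not the implementation used in this work. It assumes each pulse is a plain list of floats of equal length. The function names (`jitter`, `scale`, `mix`, `oversample_minority`) and the parameter values are hypothetical choices made for illustration. Jittering and scaling stand in for random transformations, and a pointwise weighted average of two minority-class signals stands in for pattern mixing.

```python
import random


def jitter(signal, sigma=0.01, rng=random):
    """Random transformation: add zero-mean Gaussian noise to each sample."""
    return [x + rng.gauss(0.0, sigma) for x in signal]


def scale(signal, sigma=0.1, rng=random):
    """Random transformation: multiply the whole signal by one factor near 1."""
    factor = rng.gauss(1.0, sigma)
    return [x * factor for x in signal]


def mix(signal_a, signal_b, weight=0.5):
    """Pattern mixing: pointwise weighted average of two same-class signals."""
    if len(signal_a) != len(signal_b):
        raise ValueError("signals must have equal length")
    return [weight * a + (1.0 - weight) * b for a, b in zip(signal_a, signal_b)]


def oversample_minority(minority, target_count, rng=random):
    """Grow the minority class to target_count signals.

    The original signals are kept; synthetic ones are appended, each produced
    by one randomly chosen augmentation (illustrative policy, an assumption).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority signals for mixing")
    augmented = list(minority)
    while len(augmented) < target_count:
        method = rng.choice(("jitter", "scale", "mix"))
        if method == "mix":
            a, b = rng.sample(minority, 2)  # two distinct source signals
            augmented.append(mix(a, b, rng.random()))
        elif method == "jitter":
            augmented.append(jitter(rng.choice(minority), rng=rng))
        else:
            augmented.append(scale(rng.choice(minority), rng=rng))
    return augmented


# Usage: three toy "breakdown" pulses oversampled to ten signals.
rng = random.Random(0)
breakdown_pulses = [
    [0.0, 1.0, 0.5, 0.0],
    [0.0, 0.8, 0.6, 0.1],
    [0.1, 0.9, 0.4, 0.0],
]
balanced = oversample_minority(breakdown_pulses, 10, rng)
```

In a sketch like this, only the minority (breakdown) class of the training split would be augmented, so that the evaluation data still consists solely of measured pulses.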