Adversarial Attacks On Machine Learning Cybersecurity Defences in Industrial Control Systems
Keywords: Industrial Control Systems; Supervised machine learning; Adversarial Machine Learning; Attack detection; Intrusion Detection System

The proliferation and application of machine learning-based Intrusion Detection Systems (IDS) have allowed for more flexibility and efficiency in the automated detection of cyber attacks in Industrial Control Systems (ICS). However, the introduction of such IDSs has also created an additional attack vector; the learning models may also be subject to cyber attacks, otherwise referred to as Adversarial Machine Learning (AML). Such attacks may have severe consequences in ICS systems, as adversaries could potentially bypass the IDS. This could lead to delayed attack detection which may result in infrastructure damages, financial loss, and even loss of life. This paper explores how adversarial learning can be used to target supervised models by generating adversarial samples using the Jacobian-based Saliency Map attack and exploring classification behaviours. The analysis also includes the exploration of how such samples can support the robustness of supervised models using adversarial training. An authentic power system dataset was used to support the experiments presented herein. Overall, the classification performance of two widely used classifiers, Random Forest and J48, decreased by 6 and 11 percentage points when adversarial samples were present. Their performances improved following adversarial training, demonstrating their robustness towards such attacks.
1. Introduction

Industrial Control Systems (ICS) play a key role in Critical National Infrastructure (CNI) concepts such as manufacturing, power/smart grids, water treatment plants, gas and oil refineries, and health-care. Historically, ICS networks and their components were protected from cyber attacks as they ran on proprietary hardware and software and were connected in isolated networks with no external connection to the Internet [1]. However, as the world is becoming more interconnected, there has been a need to connect ICS components to other networks, allowing remote access and monitoring functionalities. As a result, ICSs are now subject to a range of security vulnerabilities [1].

Given the importance of these systems, they have become an attractive target for attackers. As these systems control operations in the physical world, cyber attacks against them may have major consequences for the environment they operate in and, subsequently, its users. It is therefore understandable that the security issues surrounding such systems have become a global issue. Thus, designing robust, secure, and efficient mechanisms for detecting and defending against cyber attacks in ICS networks is more important than ever [2].

Although there exist several security mechanisms for traditional IT systems, their integration into ICS systems is challenging mainly for two reasons: (a) ICS devices are resource-constrained, and (b) they include legacy systems and devices that do not support modern security measures. Subsequently, complementary security solutions, such as passive process data monitoring, are promising [3]. This has led to a substantial increase in research focusing on ICS-tailored Intrusion Detection Systems (IDS). Such intrusion systems operate by observing the network or sensor data to detect attacks and anomalies that may affect ICS.

Due to their efficiency in detecting attacks, there has been a substantial increase in the application and integration of machine learning within IDSs (e.g. [1,4–10]). However, the introduction of such systems has introduced an additional attack vector; the trained models may also be subject to attacks. The act of deploying attacks towards machine learning-based systems is known as Adversarial Machine Learning (AML). The aim is to exploit the weaknesses of the pre-trained model, which has "blind spots" between the data points it has seen during training. More specifically, by automatically introducing slight perturbations to the unseen data points, the model may cross a decision boundary and classify the data as a different class. As a result, the model's effectiveness can be reduced as it is presented with unseen data points
∗ Corresponding author.
E-mail address: [email protected] (E. Anthi).
https://doi.org/10.1016/j.jisa.2020.102717
The study was designed as follows (see Fig. 1): (1) randomly split the power system dataset into training and testing sets containing 60% and 40% of the data points respectively, (2) evaluate a range of supervised machine learning models and identify which are the best performing, (3) generate adversarial samples using the Jacobian-based Saliency Map method, (4) evaluate the performance of the trained models from step 2 on the adversarial samples generated in step 3, (5) include a percentage of the adversarial samples from step 3 in the training data, then re-train and evaluate the models.

Fig. 1. An overview of the study design.

The remainder of this paper is structured as follows: Section 2 discusses the relevant work in this research area, Section 3 discusses the power system testbed and the generated dataset which is used to support the experiments in this paper, Section 4 evaluates the performance of a range of supervised classifiers, Section 5 discusses AML and the methodology followed to generate adversarial samples, Section 7 investigates the effectiveness of adversarial training as a defence mechanism, and finally Section 8 concludes the paper.

2. Related work

There has been a substantial increase in machine learning-based IDSs for a range of ICS systems. Table 1 presents a summary of the existing ICS systems and associated supervised learning approaches to attack detection and classification in these contexts. To date, there has been less focus on AML in this context. Such research has mainly focused on email spam classifiers, malware detection, and very recently there has been interest in AML against network IDSs for traditional networks (e.g. [11–13]).

More specifically, both Nelson et al. [14] and Zhou et al. [15] demonstrate that an adversary can exploit and successfully bypass the machine learning methods employed in spam filters by modifying a small percentage of the original training data. Moreover, Grosse et al. [16] evaluate the robustness of a neural network trained on the DREBIN Android malware dataset. They report that it is possible to confuse the model by perturbing a small number of the features in the training set. Such an attack is considered to be a white box attack because, to be successful, the adversary needs to have access to or knowledge of the dataset and the features it includes. Additionally, Pierazzi et al. [17] evaluated 170K Android apps between 2017 and 2018 to demonstrate the practical feasibility of evading state-of-the-art malware classifiers. Their results showed that "adversarial-malware as a service" is a realistic threat, as it was possible to automatically generate thousands of realistic and inconspicuous adversarial applications at scale, where on average it took only a few minutes to generate an adversarial app. Furthermore, Hu and Tan [18] proposed a more advanced adversarial technique which uses the concept of Generative Adversarial Networks (GAN) to successfully attack malware classifiers without requiring any knowledge of the data and the system. This is known as a black box attack. Finally, Apruzzese, Colajanni, and Marchetti [19] deploy realistic adversarial attacks against network intrusion detection systems that focus on identifying botnet traffic through machine learning classifiers. The results showed that such attacks are effective.

In the context of ICSs, there exist only a handful of investigations into AML attacks. Specifically, Zizzo et al. [20] showcased a simple AML attack against a Long Short-Term Memory (LSTM) classifier which was applied on an ICS dataset. However, this work is at a preliminary stage as the adversarial samples were generated by manually selecting the feature values to be perturbed. Yaghoubi and Fainekos [21] proposed a gradient-based search approach which was evaluated on a Simulink model of a steam condenser. However, this approach is efficient only against a handful of systems that specifically employ Recurrent Neural Networks (RNN) with smooth activation functions. Finally, Erba et al. [3] demonstrated two types of real-time evasion attacks, again using Recurrent Neural Network models, and used an autoencoder to generate adversarial samples. None of these works investigates defence methods against AML.

In conclusion, there is room to investigate AML and the defence against such attacks for current IDSs in ICS systems that are supported by supervised learning. Moreover, as Table 1 shows, Recurrent Neural Networks are yet to gain prominence in attack detection in an ICS context, with algorithms such as Naive Bayes, Random Forest, SVM, and J48 being much more widely used. The experiments, therefore, focus on defending against AML on these methods as the state of the art in ML-driven attack detection methods for ICS.
E. Anthi et al. Journal of Information Security and Applications 58 (2021) 102717
Table 1
Summary of current work on Intrusion Detection Systems in Industrial Control Systems.
Citation | Publication date | Dataset | Machine learning models
[22] | 2019 | Power System | Random Forest
[23] | 2019 | Wind Turbines | SVM
[24] | 2019 | SCADA Testbed | Long Short Term Memory (RNN)
[25] | 2018 | Power system (synthetic) | Naive Bayes, Random Forests, SVM
[26] | 2018 | SWaT | SVM, J48, Random Forest
[27] | 2018 | Gas Pipeline | SVM, Random Forest
[6] | 2018 | SCADA Testbed | Random Forest, J48, Logistic Regression, Naive Bayes
[28] | 2018 | SCADA Testbed | SVM, Decision Tree, and Random Forest
[29] | 2018 | Power System | SVM, J48, Neural Network
[30] | 2018 | Wind Turbine | Decision Trees (J48, Random Forest, CART, Ripper, etc.)
[1] | 2018 | SWaT | 1D Convolutional Networks
[31] | 2017 | SCADA/ICS | J48, Naive Bayes
[32] | 2017 | SCADA/Modbus | Decision Tree, K-Nearest Neighbour, SVM, OCSVM
[33] | 2017 | Power Grid, Water Plant, Gas Plant | J48, Random Forest, Naive Bayes, SVM, JRipper + Adaboost
[34] | 2017 | SCADA Testbed | Decision Tree, Random Forest
[35] | 2017 | ICS Testbed | Long Short Term Memory (RNN)
[8] | 2017 | SWaT | Long Short Term Memory (RNN)
[36] | 2015 | Power System | OneR, Random Forest, Naive Bayes, SVM, JRipper + Adaboost
[37] | 2014 | SCADA | Naive Bayes, BayesNet, J48
[9] | 2014 | SCADA Network Traffic | One-Class SVM
[4] | 2013 | Gas Pipeline | Naive Bayes, Random Forest, SVM, J48, OneR
[10] | 2009 | ICS Testbed | Neural Network (Error-back propagation and Levenberg–Marquardt)
[5] | 2003 | SCADA Testbed | Bayesian Network
3. Industrial control system case study: Power system

Mississippi State University and Oak Ridge National Laboratory implemented a scaled-down version of a power system framework. Although this system is relatively small, it captures the core function and is considered a representative example of a larger power system [38]. Fig. 2 illustrates in more detail the power system framework configuration and the components used for generating the datasets which support the experiments in this paper.

More specifically, the components of the power system as shown in Fig. 2 include:

• G1 and G2 are the main generators.
• R1, R2, R3, and R4 are the Intelligent Electronic Devices (IEDs) responsible for switching the breakers (BR1, BR2, BR3, BR4) on and off. The breakers are automatically operated electrical switches designed to protect electrical circuits from damage caused by excess current from an overload or short circuit.
• Each IED automatically controls one breaker (e.g. R1 controls BR1, R2 controls BR2, etc.)
• The IEDs use a distance protection scheme which trips the breaker on detected faults (whether they are valid or invalid) since they have no internal validation to detect the difference.
• Operators can also manually issue commands to the IEDs to manually trip the breakers. The manual override is used when performing maintenance on the lines or other system components.
• There are also other network monitoring devices connected on the testbed, such as SNORT and Syslog servers.

A dataset containing both benign and malicious data points was generated from the power system testbed by [38]. These data points have been further categorised into three main classes: 'no event' instances, 'natural event' instances, and 'attack event' instances. Both the 'no event' and 'natural event' instances are grouped to represent benign activity. To generate the malicious data, attacks from 5 scenarios were deployed on the power system. These attacks are described as follows:

(1) Short-circuit fault. This is a short in a power line and can occur in various locations along the line. The location is indicated by the percentage range.
(2) Line maintenance. One or more relays are disabled on a specific line to do maintenance for that line.
Table 2
Features included as part of the power system dataset.
Feature | Description
PA1:VH–PA3:VH | Phase A–C Voltage Phase Angle
PM1:V–PM3:V | Phase A–C Voltage Phase Magnitude
PA4:IH–PA6:IH | Phase A–C Current Phase Angle
PM4:I–PM6:I | Phase A–C Current Phase Magnitude
PA7:VH–PA9:VH | Pos.–Neg.–Zero Voltage Phase Angle
PM7:V–PM9:V | Pos.–Neg.–Zero Voltage Phase Magnitude
PA10:VH–PA12:VH | Pos.–Neg.–Zero Current Phase Angle
PM10:V–PM12:V | Pos.–Neg.–Zero Current Phase Magnitude
F | Frequency for relays
DF | Frequency Delta (dF/dt) for relays
PA:Z | Appearance Impedance for relays
PA:ZH | Appearance Impedance Angle for relays
S | Status Flag for relays
Fig. 3. Distribution of data points across both training and testing datasets.
(3) Remote tripping command injection attack. This is an attack that sends a command to a relay which causes a breaker to open. It can only be done once an attacker has penetrated outside defences.
(4) Relay setting change attack. Relays are configured with a distance protection scheme. The attacker changes the setting to disable the relay function so that the relay will not trip for a valid fault or a valid command.
(5) Data injection attack. A valid fault is imitated by changing values of parameters such as the current, voltage, and sequence components. This attack aims to blind the operator and cause a blackout.

The final dataset consisted of 55,663 malicious and 22,714 benign data points.

4.2. Feature selection

To perform machine learning classification experiments, it is essential to identify which attributes best describe the dataset. In this case, the data points within the power system dataset contain attributes associated with synchrophasor measurements and basic network security mechanisms. A synchrophasor measurement unit (PMU) is a device which measures the electrical waves on an electricity grid, using a common time source for synchronisation. The dataset contains a total of 128 features [39]. These features are described in more detail as follows:

• 29 types of measurements from each synchrophasor measurement unit. In this specific power system testbed, there are 4 PMUs. Therefore, the dataset contains a total of 116 synchrophasor measurement columns.
• 12 types of measurements of control panel logs, snort alerts, and relay logs of the 4 synchrophasor measurement units and relays.

Table 2 summarises the features included in the dataset, as well as their corresponding descriptions. More specifically, the index of each feature is in the form of "R#-Signal Reference". The "R#" specifies the type of measurement from the synchrophasor measurement unit. For instance, "R1-PA1:VH" corresponds to the "Phase A voltage phase angle" measured by "PMU R1".

4.3. Model training

To explore how well supervised machine learning algorithms can detect cyber attacks in an ICS environment, the corresponding power system dataset was used to evaluate a range of state-of-the-art classifiers.

The "no free lunch" theorem suggests that there is no universally best learning algorithm [40]. In other words, the choice of an appropriate algorithm should be based on its performance for that particular problem and the properties of data that characterise the problem. In this case, a variety of classifiers distributed as part of Weka [41] were evaluated using 10-fold cross-validation with their default hyper-parameters.

To conform to other comparable IDSs in ICS systems in Table 1, classifiers were also selected based on their ability to support a high-dimensional feature space. The classifiers included:

• Generative models that consider conditional dependencies in the dataset or assume conditional independence (e.g. Bayesian Network, Naive Bayes).
• Discriminative models that aim to maximise information gain or directly map data to their respective classes without modelling any underlying probability or structure of the data (e.g. J48 Decision Tree, Support Vector Machine).

To support classification experiments, a random subset of approximately 60% of the dataset described in Section 4.1 was selected for training, with the remaining 40% selected for testing. Fig. 3 reports the distributions of data points across the target values in both the training and testing datasets.

An uneven balance of class labels across the training dataset has the potential to negatively affect or bias classification performance. Given the significant imbalance across the dataset, the class balancing filter available in Weka was applied to balance the distribution of classes within the sample. In this case, the training dataset was balanced so that there were 13,725 samples of both malicious and benign data points. In order to generate a representative testing dataset and comply with relevant work [42,43], where the benign samples outnumber the malicious ones, a random sample of 40% of the malicious packets was selected. Subsequently, the final distribution of class labels in the testing dataset was 3560 malicious and 8989 benign data points.

Previous works which have used a very small sample of this power system dataset to support their classification experiments have shown that the ensemble classifier which combines both the Adaboost and JRipper models was found to be the best performing [44]. Conversely, in this study the classifiers with the highest performance were Random Forest and Weka's implementation of the J48 decision tree method with no pruning (see Table 3).

5. Adversarial machine learning

To reiterate, AML aims to automatically introduce perturbations to the unseen data points to confuse the pre-trained model. The following sections introduce the types of AML attacks, as well as the methods used to automatically generate adversarial samples.
Table 4
An example of how features are perturbed using JSMA.
Dataset | R1-PA1:VH | R1-PM1:V | R1-PM4:I | R3-PM6:I | R3-PA8:VH
Original test data | 0.7645 | 0.8710 | 0.1756 | 0.0261 | 0.5027
𝜃 = 0.1, 𝛾 = 0.1 | 0.7650 | 0.8710 | 0.1756 | 0.0261 | 0.5030
𝜃 = 0.5, 𝛾 = 0.5 | 0.7650 | 0.8710 | 0.1756 | 0.0261 | 0.5030
𝜃 = 0.9, 𝛾 = 0.9 | 1.0000 | 0.8770 | 0.1756 | 0.0261 | 0.5070

Table 5
Confusion matrices for the original test set (Benign = 0, Malicious = 1)
Model | Actual | Predicted 0 | Predicted 1
Random Forest | 0 | 5556 | 3433
Random Forest | 1 | 1666 | 1930
J48 | 0 | 5253 | 3736
J48 | 1 | 1583 | 2013
most relevant to the model decision being one class or another. These
features, if altered, are most likely to affect the classification of the target
values. More specifically, an initial percentage of features (𝜃) is chosen
to be perturbed by a (𝛾) amount of noise. Thirdly, the model establishes
whether the added noise has caused the targeted model to misclassify or
not. If the noise has not affected the model’s performance, another set
of features is selected and a new iteration occurs until a saliency map
appears which can be used to generate an adversarial sample. For the
adversarial sample generation herein, we utilised the JSMA algorithm
as described in [46].
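The iterative procedure described above can be illustrated with a minimal Python sketch. It is a simplified, saliency-guided perturbation loop and not the exact JSMA algorithm of [46]. It assumes a differentiable surrogate model (here a hypothetical logistic model with weights `w` and bias `b`), features scaled to [0, 1], and it follows this section's reading of the parameters: `theta` is the fraction of features perturbed per iteration and `gamma` is the amount of noise added. All names and values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def saliency_perturb(x, w, b, theta=0.1, gamma=0.1, max_iter=20):
    """Push a malicious sample towards 'benign' under a logistic surrogate.

    theta: fraction of features perturbed per iteration (assumed meaning).
    gamma: size of the perturbation added to each chosen feature.
    """
    x_adv = x.astype(float).copy()
    k = max(1, int(round(theta * x_adv.size)))   # features changed per step
    for _ in range(max_iter):
        p = sigmoid(w @ x_adv + b)               # P(malicious | x_adv)
        if p < 0.5:                              # surrogate now says benign
            break
        grad = p * (1.0 - p) * w                 # d P(malicious) / d x
        direction = -np.sign(grad)               # move that lowers P(malicious)
        # Remaining room inside [0, 1] in the chosen direction.
        room = np.where(direction > 0, 1.0 - x_adv, x_adv)
        # Saliency score: gradient magnitude, zeroed for saturated features.
        score = np.abs(grad) * (room > 1e-12)
        top = np.argsort(score)[-k:]             # the k most salient features
        x_adv[top] = np.clip(x_adv[top] + gamma * direction[top], 0.0, 1.0)
    return x_adv

# Hypothetical surrogate parameters and one malicious sample.
w = np.array([2.0, -1.0, 0.5, 0.0])
b = -0.2
x = np.array([0.9, 0.1, 0.8, 0.5])
x_adv = saliency_perturb(x, w, b, theta=0.25, gamma=0.3)
p_before = sigmoid(w @ x + b)
p_after = sigmoid(w @ x_adv + b)
```

In this toy run only the most salient features are altered, while the feature with zero gradient is left untouched, which mirrors the point that a small set of features is changed at a time.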
Because the JSMA method may take several iterations to generate adversarial samples, FGSM is computationally faster [46]. More specifically, its time complexity is O(N). However, as opposed to FGSM, which alters each feature, JSMA is a more complex and elaborate approach which represents more realistic attacks as it progressively alters a small percentage of features at a time. The complexity of JSMA heavily depends on the number of input features. The larger the feature space, the more iterations it requires to establish whether the approach is successful in generating adversarial samples which affect a model's performance. Nevertheless, this approach allows for more realistic and finer-grained AML attacks, as adversaries can define both

Fig. 4. Random Forest classification performance (F1-score) on adversarial samples generated using JSMA.
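For contrast with JSMA, the single-step behaviour of FGSM can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the logistic surrogate (`w`, `b`), the step size `eps`, and the sample values are all hypothetical, and features are assumed to be scaled to [0, 1].

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps=0.1):
    """One FGSM step against a logistic surrogate with true label y."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # P(malicious | x)
    grad_loss = (p - y) * w                  # gradient of cross-entropy w.r.t. x
    # Every feature with a non-zero gradient moves by eps in one pass.
    return np.clip(x + eps * np.sign(grad_loss), 0.0, 1.0)

w = np.array([2.0, -1.0, 0.5, 0.0])          # hypothetical surrogate weights
b = -0.2
x = np.array([0.9, 0.1, 0.8, 0.5])           # one malicious sample (y = 1)
x_fgsm = fgsm_perturb(x, w, b, y=1, eps=0.1)
changed = int(np.sum(~np.isclose(x_fgsm, x)))
```

A single pass over the features is enough here, which is consistent with the linear cost noted above, but every feature with a non-zero gradient is modified at once rather than a small selected subset.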
6. Evaluating supervised models on adversarial samples

Both the trained Random Forest and J48 models presented in Section 4.3 were first evaluated against the original testing dataset. The F1-scores achieved by the two classifiers were 0.61 and 0.60 respectively. The confusion matrix in Table 5 shows how the predicted classes for each data point in the original testing dataset compare against the actual ones. In comparison to the Random Forest model, J48 demonstrated a higher percentage of correct predictions, and thus misclassified the data points less often.

To explore how different combinations of the JSMA parameters affect the performance of the trained classifiers, adversarial samples were generated from all malicious data points present in the testing data by using a range of combinations of 𝜃 and 𝛾. The adversarial samples were joined with the benign testing data points and subsequently presented to the trained models. Figs. 4 and 5 report the overall weighted-averaged Recall for all adversarial combinations of JSMA's 𝜃 and 𝛾 parameters.

In comparison to Random Forest, the J48 model achieved a decrease in Recall across the majority of the 𝜃 and 𝛾 parameters. This may indicate that J48 is more sensitive, subsequently misclassifying malicious data points as benign. However, when (𝜃 = 0.1, 𝛾 = 0.2), (𝜃 = 0.3, 𝛾 = 0.2) and (𝜃 = 0.6, 𝛾 = 0.1), the model achieves a higher F1-score of 0.63 (an increase of 3 percentage points). This may indicate that the
Table 6
Confusion matrices after applying Random Forest to adversarial testing samples (Benign = 0, Malicious = 1)
Parameters | Actual | Predicted 0 | Predicted 1
𝜃 = 0.2, 𝛾 = 0.4 | 0 | 5253 | 3736
𝜃 = 0.2, 𝛾 = 0.4 | 1 | 2390 | 1206
𝜃 = 0.5, 𝛾 = 0.9 | 0 | 5253 | 3736
𝜃 = 0.5, 𝛾 = 0.9 | 1 | 1390 | 2206

Table 7
Confusion matrices after applying J48 to adversarial testing samples (Benign = 0, Malicious = 1)
Parameters | Actual | Predicted 0 | Predicted 1
𝜃 = 0.1, 𝛾 = 0.5 | 0 | 5556 | 3433
𝜃 = 0.1, 𝛾 = 0.5 | 1 | 2612 | 984
𝜃 = 0.6, 𝛾 = 0.1 | 0 | 5556 | 3433
𝜃 = 0.6, 𝛾 = 0.1 | 1 | 1141 | 2455
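The confusion matrices in Tables 5 to 7 can be turned into the reported scores with a few lines of arithmetic. The sketch below assumes that the scores are support-weighted averages over the two classes; the averaging used for the F1-score is not stated in this section, so this is an assumption. The function name is illustrative.

```python
def weighted_scores(tn, fp, fn, tp):
    """Return (weighted recall, weighted F1) for a 2x2 confusion matrix.

    tn, fp: actual benign predicted as benign / malicious.
    fn, tp: actual malicious predicted as benign / malicious.
    """
    n0, n1 = tn + fp, fn + tp                 # actual benign / malicious counts
    total = n0 + n1
    rec0, rec1 = tn / n0, tp / n1             # per-class recall
    f1_0 = 2 * tn / (2 * tn + fn + fp)        # F1 with benign as the positive class
    f1_1 = 2 * tp / (2 * tp + fp + fn)        # F1 with malicious as the positive class
    w_recall = (n0 * rec0 + n1 * rec1) / total  # equals overall accuracy
    w_f1 = (n0 * f1_0 + n1 * f1_1) / total
    return w_recall, w_f1

# The two matrices of Table 5, in the order in which they are listed.
recall_a, f1_a = weighted_scores(tn=5556, fp=3433, fn=1666, tp=1930)
recall_b, f1_b = weighted_scores(tn=5253, fp=3736, fn=1583, tp=2013)
```

Under that assumption, the two Table 5 matrices give weighted F1-scores of about 0.61 and 0.60, in line with the values quoted in Section 6. The same function can be applied to the rows of Tables 6 and 7.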
Acknowledgments
[19] Apruzzese G, Colajanni M, Marchetti M. Evaluating the effectiveness of adversarial attacks against botnet detectors. In: 2019 IEEE 18th international symposium on network computing and applications. 2019. p. 1–8.
[20] Zizzo G, Hankin C, Maffeis S, Jones K. Adversarial machine learning beyond the image domain. In: 2019 56th ACM/IEEE conference on design automation. IEEE; 2019, p. 1–4.
[21] Yaghoubi S, Fainekos G. Gray-box adversarial testing for control systems with machine learning components. In: 22nd ACM international conference on hybrid systems: Computation and control. 2019. p. 179–84.
[22] Wang D, Wang X, Zhang Y, Jin L. Detection of power grid disturbances and cyber-attacks based on machine learning. J. Inf. Secur. Appl. 2019;46:42–52.
[23] Hoxha E, Vidal Seguí Y, Pozo Montero F. Supervised classification with SCADA data for condition monitoring of wind turbines. In: 9th ECCOMAS thematic conference on smart structures and materials. 2019. p. 263–73.
[24] Gao J, Gan L, Buschendorf F, Zhang L, Liu H, Li P, et al. LSTM for SCADA intrusion detection. In: 2019 IEEE pacific rim conference on communications, computers and signal processing. IEEE; 2019, p. 1–5.
[25] Anton SD, Kanoor S, Fraunholz D, Schotten HD. Evaluation of machine learning-based anomaly detection algorithms on an industrial modbus/tcp data set. In: 13th international conference on availability, reliability and security. 2018. p. 1–9.
[26] Robles-Durazno A, Moradpoor N, McWhinnie J, Russell G. A supervised energy monitoring-based machine learning approach for anomaly detection in a clean water supply system. In: 2018 international conference on cyber security and protection of digital services. IEEE; 2018, p. 1–8.
[27] Perez RL, Adamsky F, Soua R, Engel T. Machine learning for reliable network attack detection in scada systems. In: 2018 17th IEEE international conference on trust, security and privacy in computing and communications/12th IEEE international conference on big data science and engineering. IEEE; 2018, p. 633–8.
[28] Frazão I, Abreu PH, Cruz T, Araújo H, Simões P. Denial of service attacks: detecting the frailties of machine learning algorithms in the classification process. In: International conference on critical information infrastructures security. Springer; 2018, p. 230–5.
[29] Lahza H, Radke K, Foo E. Applying domain-specific knowledge to construct features for detecting distributed denial-of-service attacks on the GOOSE and MMS protocols. Int J Crit Infrastruct Prot 2018;20:48–67.
[30] Abdallah I, Dertimanis V, Mylonas H, Tatsis K, Chatzi E, Dervilis N, et al. Fault diagnosis of wind turbine structures using decision tree learning algorithms with big data. Saf Reliab–Safe Soc Chang World 2018;3053–61.
[31] Ullah I, Mahmoud QH. A hybrid model for anomaly-based intrusion detection in SCADA networks. In: 2017 IEEE international conference on big data. IEEE; 2017, p. 2160–7.
[32] Qu H, Qin J, Liu W, Chen H. Instruction detection in SCADA/Modbus network based on machine learning. In: International conference on machine learning and intelligent communications. Springer; 2017, p. 437–54.
[33] Yeckle J, Abdelwahed S. An evaluation of selection method in the classification of scada datasets based on the characteristics of the data and priority of performance. In: International conference on compute and data analysis. 2017. p. 98–103.
[34] Siddavatam IA, Satish S, Mahesh W, Kazi F. An ensemble learning for anomaly identification in SCADA system. In: 2017 7th international conference on power systems. IEEE; 2017, p. 457–62.
[35] Feng C, Li T, Chana D. Multi-level anomaly detection in industrial control systems via package signatures and lstm networks. In: 2017 47th annual IEEE/IFIP international conference on dependable systems and networks. IEEE; 2017, p. 261–72.
[36] Morris TH, Thornton Z, Turnipseed I. Industrial control system simulation and data logging for intrusion detection system research. In: 7th annual southeastern cyber security summit. 2015. p. 3–4.
[37] Werling JR. Behavioral profiling of SCADA network traffic using machine learning algorithms. Tech. rep., Air Force Institute Of Technology; 2014.
[38] Pan S, Morris T, Adhikari U. Classification of disturbances and cyber-attacks in power systems using heterogeneous time-synchronized data. IEEE Trans Ind Inf 2015;11(3):650–62.
[39] Powersystem_dataset_readme.pdf. 2020, [Accessed 18 March 2020].
[40] Wolpert DH. The supervised learning no-free-lunch theorems. In: Soft computing and industry. Springer; 2002, p. 25–42.
[41] Weka 3 - data mining with open source machine learning software in java. 2018, https://www.cs.waikato.ac.nz/ml/weka/. [Accessed 03 June 2018].
[42] Stevanovic M, Pedersen JM. An efficient flow-based botnet detection using supervised machine learning. In: 2014 international conference on computing, networking and communications. IEEE; 2014, p. 797–801.
[43] Kirubavathi G, Anitha R. Botnet detection via mining of traffic flow characteristics. Comput Electr Eng 2016;50:91–101.
[44] Hink RCB, Beaver JM, Buckner MA, Morris T, Adhikari U, Pan S. Machine learning for power system disturbance and cyber-attack discrimination. In: 2014 7th international symposium on resilient control systems. IEEE; 2014, p. 1–8.
[45] Barreno M, Nelson B, Sears R, Joseph AD, Tygar JD. Can machine learning be secure? In: Proceedings of the 2006 ACM symposium on information, computer and communications security. ACM; 2006, p. 16–25.
[46] Papernot N, McDaniel P, Jha S, Fredrikson M, Celik ZB, Swami A. The limitations of deep learning in adversarial settings. In: 2016 IEEE European symposium on security and privacy. IEEE; 2016, p. 372–87.
[47] Gollmann D. From insider threats to business processes that are secure-by-design. In: INCoS. Citeseer; 2011, p. 627.
[48] Liu L, De Vel O, Han Q, Zhang J, Xiang Y. Detecting and preventing cyber insider threats: A survey. IEEE Commun Surv Tutor 2018;20(2):1397–417.
[49] Industrial control system security - insider threat. 2020, https://www.allianz-fuer-cybersicherheit.de. [Accessed 02 September 2020].
[50] Apruzzese G, Colajanni M, Ferretti L, Marchetti M. Addressing adversarial attacks against security systems based on machine learning. In: 2019 11th international conference on cyber conflict, vol. 900. IEEE; 2019, p. 1–18.
[51] Martins N, Cruz JM, Cruz T, Abreu PH. Adversarial machine learning applied to intrusion and malware scenarios: a systematic review. IEEE Access 2020;8:35403–19.
[52] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial nets. In: Advances in neural information processing systems. 2014, p. 2672–80.
[53] Papernot N, McDaniel P, Goodfellow I. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. 2016, arXiv preprint arXiv:1605.07277.
[54] Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. 2014, arXiv preprint arXiv:1412.6572.
[55] Athalye A, Carlini N, Wagner D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In: International conference on machine learning. 2018. p. 274–83.
[56] Refaeilzadeh P, Tang L, Liu H. Cross-validation. Encyclopedia Database Syst 2009;5:532–8.