
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2021.3067667, IEEE Internet of Things Journal

Toward Detection and Attribution of Cyber-Attacks in IoT-enabled Cyber-physical Systems
Amir Namavar Jahromi, Hadis Karimipour, Senior Member, IEEE, Ali Dehghantanha, Senior Member, IEEE,
and Kim-Kwang Raymond Choo, Senior Member, IEEE

Amir Namavar Jahromi and Hadis Karimipour are with the School of Engineering, University of Guelph, Ontario, Canada (email: ana-[email protected] and [email protected]).
Ali Dehghantanha is with the School of Computer Science, University of Guelph, Ontario, Canada (email: [email protected]).
Kim-Kwang Raymond Choo is with the Department of Information Systems and Cyber Security and the Department of Electrical and Computer Engineering, University of Texas at San Antonio (UTSA), San Antonio, TX 78249, USA. He also has courtesy appointments with UTSA’s Department of Electrical and Computer Engineering and Department of Computer Science, and UniSA STEM at the University of South Australia, Adelaide, SA 5095, Australia (email: [email protected]).

Abstract—Securing Internet of Things (IoT)-enabled cyber-physical systems (CPS) can be challenging, as security solutions developed for general information / operational technology (IT / OT) systems may not be as effective in a CPS setting. Thus, this paper presents a two-level ensemble attack detection and attribution framework designed for CPS, and more specifically for an industrial control system (ICS). At the first level, a decision tree combined with a novel ensemble deep representation-learning model is developed for detecting attacks in imbalanced ICS environments. At the second level, an ensemble deep neural network is designed for attack attribution. The proposed model is evaluated using real-world datasets from gas pipeline and water treatment systems. Findings demonstrate that the proposed model outperforms other competing approaches with similar computational complexity.

Index Terms—Cyber-attacks, Deep representation learning, Cyber threat detection, Cyber threat attribution, Industrial Control System, ICS, Cyber-physical systems, Industrial Internet of Things (IIoT)

I. INTRODUCTION
Internet of Things (IoT) devices are increasingly integrated in cyber-physical systems (CPS), including in critical infrastructure sectors such as dams and utility plants. In these settings, IoT devices (also referred to as Industrial IoT or IIoT) are often part of an Industrial Control System (ICS), tasked with the reliable operation of the infrastructure. ICS can be broadly defined to include supervisory control and data acquisition (SCADA) systems, distributed control systems (DCS), and systems that comprise programmable logic controllers (PLC) and Modbus protocols.
The connection between ICS or IIoT-based systems and public networks, however, increases their attack surfaces and the risk of being targeted by cyber criminals. One high-profile example is the Stuxnet campaign, which reportedly targeted Iranian centrifuges for nuclear enrichment in 2010, causing severe damage to the equipment [1], [2]. Another example is the incident targeting a pump that resulted in the failure of an Illinois water plant in 2011 [3]. BlackEnergy3 was another campaign that targeted Ukraine power grids in 2015, resulting in a power outage that affected approximately 230,000 people [4]. In April 2018, there were also reports of successful cyber-attacks affecting three U.S. gas pipeline firms, which resulted in the shutdown of electronic customer communication systems for several days [1]. Although security solutions developed for information technology (IT) and operational technology (OT) systems are relatively mature, they may not be directly applicable to ICSs. For example, this could be the case due to the tight integration between the controlled physical environment and the cyber systems.
Therefore, system-level security methods are necessary to analyze physical behaviour and maintain system operation availability [1]. ICS security goals are prioritized in the order of availability, integrity, and confidentiality, unlike most IT/OT systems (generally prioritized in the order of confidentiality, integrity, and availability) [5]. Due to the close coupling between variables of the feedback control loop and physical processes, (successful) cyber-attacks on ICS can result in severe and potentially fatal consequences for society and our environment. This reinforces the importance of designing extremely robust safety and security measures to detect and prevent intrusions targeting ICS [1].
Popular attack detection and attribution approaches include those based on signatures and anomalies. To mitigate the known limitations in both signature-based and anomaly-based detection and attribution approaches, there have been attempts to introduce hybrid-based approaches [6]. Although hybrid-based approaches are effective at detecting unusual activities, they are not reliable due to frequent network upgrades, resulting in different Intrusion Detection System (IDS) typologies [7]. Beyond this, conventional attack detection and attribution techniques mainly rely on network metadata analysis (e.g. IP addresses, transmission ports, traffic duration, and packet intervals). Therefore, there has been renewed interest in utilizing attack detection and attribution solutions based on Machine Learning (ML) or Deep Neural Networks (DNN) in recent times.
In addition, attack detection approaches can be categorized into network-based or host-based approaches. Supervised clustering, single-class or multi-class Support Vector Machine (SVM), fuzzy logic, Artificial Neural Network (ANN), and DNN are commonly used techniques for attack detection in network traffic. These techniques analyze real-time traffic data to detect malicious attacks in a timely manner. However, attack detection that considers only network and host data may fail to detect sophisticated attacks or insider attacks.

2327-4662 (c) 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Guelph. Downloaded on April 07, 2022 at 22:22:55 UTC from IEEE Xplore. Restrictions apply.

Unsupervised models that incorporate process/physical data can complement a system’s monitoring since they do not rely on detailed knowledge of the cyber-threats. In general, a sophisticated attacker with sufficient knowledge and time, such as a nation state advanced persistent threat actor, can potentially circumvent robust security solutions. Furthermore, most of the existing approaches ignore the imbalanced property of ICS data by modeling only a system’s normal behavior and reporting deviations from normal behavior as anomalies. This is, perhaps, due to limited attack samples in existing datasets and real-world scenarios. Although using only majority class samples is a good solution to avoid issues due to imbalanced datasets, the trained model will have no view of the attack samples’ patterns. In other words, such an approach fails to detect unseen attacks and suffers from a high false-positive rate [8]. Thus, there have been attempts to utilize deep learning (DL) approaches, for example, to facilitate automated feature (representation) learning to model complex concepts from simpler ones [9] without depending on human-crafted features [10].
Motivated by the above observations, this paper presents our proposed novel two-stage ensemble deep learning-based attack detection and attack attribution framework for imbalanced ICS datasets. In the first stage, an ensemble representation learning model combined with a Decision Tree (DT) is designed to detect attacks in an imbalanced environment. Once an attack is detected, several one-vs-all classifiers are ensembled to form a larger DNN that classifies the attack attributes with a confidence interval during the second stage. Moreover, the proposed framework is capable of detecting unseen attack samples. A summary of our approach in this study is as follows:
1) We develop a novel two-phase ensemble ICS attack detection method capable of detecting both previously seen and unseen attacks. We will also demonstrate that the proposed method outperforms other competing approaches in terms of accuracy and f-measure. The proposed deep representation learning makes this method robust to imbalanced data.
2) We propose a novel self-tuning two-phase attack attribution method that ensembles several deep one-vs-all classifiers using a DNN architecture for reducing false alarm rates. The proposed method can accurately attribute attacks with high similarity. This is the first ML-based attack attribution method in ICS/IIoT at the time of this research.
3) We analyze the computational complexity of the proposed attack detection and attack attribution framework, demonstrating that despite its superior performance, its computational complexity is similar to that of other DNN-based methods in the literature.
The rest of the paper is organized as follows. Section II introduces the relevant background and related literature. Section III describes the proposed framework, followed by the experimental setup in Section IV. In Section V, the evaluation findings based on two real-world ICS datasets demonstrate that the proposed framework outperforms several other systems. Finally, Section VI concludes this paper.

II. RELATED WORK
ML-based attack detection techniques are generally designed to detect moving targets that constantly evolve by learning new vulnerabilities and not relying on known attack signatures or normal network patterns [6]. We will now discuss the related literature as follows.

A. Conventional Machine Learning
In [11], ML algorithms, such as K-Nearest Neighbor (KNN), Random Forest (RF), DT, Logistic Regression (LR), ANN, Naïve Bayes (NB), and SVM were compared in terms of their effectiveness in detecting backdoor, command, and SQL injection attacks in water storage systems. The comparative summary suggested that the RF algorithm has the best attack detection, with a recall of 0.9744; the ANN is the fifth-best algorithm, with a recall of 0.8718; and the LR is the worst-performing algorithm, with a recall of 0.4744. The authors also reported that the ANN could not detect 12.82% of the attacks and considered 0.03% of the normal samples to be attacks. In addition, LR, SVM, and KNN considered many attack samples as normal samples, and these ML algorithms are sensitive to imbalanced data. In other words, they are not suitable for attack detection in ICS. In [12], the authors presented a KNN algorithm to detect cyber-attacks on gas pipelines. To minimize the effect of using an imbalanced dataset in the algorithm, they performed oversampling on the dataset to achieve balance. Using the KNN on the balanced dataset, they reported an accuracy of 97%, a precision of 0.98, a recall of 0.92, and an f-measure of 0.95. In [13], the authors presented a Logical Analysis of Data (LAD) method to extract patterns/rules from the sensor data and use these patterns/rules to design a two-step anomaly detection system. In the first step, a system is classified as stable or unstable, and in the second one, the presence of an attack is determined. They compared the performance of the proposed LAD method with the DNN, SVM, and CNN methods. Based on these experiments, the DNN outperformed the LAD method in the precision metric; however, the LAD performed better in recall and f-measure.

B. Deep Learning
In [14], the authors used the DNN algorithm to detect false data injection attacks in power systems. Findings of their evaluation using two datasets suggested 91.80% accuracy. In [15], the authors proposed an autoencoder-based method to detect false data injection attacks and clean them using denoising autoencoders. Their experiments showed that these methods outperformed the SVM-based method. To handle the effect of imbalanced data on the algorithm, they ignored attack data in training the autoencoder. In [16], the authors presented a technique based on Extreme Learning Machine (ELM) for attack detection in CPS. To address the imbalance challenge of neural networks, training was conducted using only normal data. Based on these experiments, the proposed ELM-based method outperformed the SVM attack detection method.

Despite promising results in both conventional ML and deep learning-based techniques, most existing ML algorithms suffer from the curse of dimensionality due to the large data volume generated in real-world ICS. Therefore, feature engineering must reduce the number of features or generate a new representation of the features to reduce computational overhead. Moreover, the imbalanced nature of ICS datasets is another challenge that should be considered. Researchers have attempted to resolve this issue using oversampling/undersampling, as well as by ignoring attack samples and building algorithms using only normal samples.
Attack attribution seeks to answer the question of “What kind of attack was it?” and this is generally more challenging to answer in ICS than in typical IT/OT systems due to the different network structures, industry-specific protocols, and so forth [17], [18]. While there have been a small number of ML-based malware attack attributions [19], [20], designing robust and effective ML-based attack attribution for ICS and IIoT systems appears to be understudied. Thus, this paper proposes a two-stage ensemble deep learning-based attack detection and attack attribution framework for ICS. Our approach incorporates both process and physical data to solve the imbalanced data problem without subsampling or oversampling. The proposed framework utilizes an unsupervised ensemble of learned representations from normal and attack instances for attack detection. Next, using an ensemble of several one-vs-all classifiers trained on each attack attribute, it forms a two-part DNN to attribute the samples to their corresponding attack attributes.

III. THE PROPOSED FRAMEWORK
Figure 1 shows the architecture of the proposed framework. In this framework, the attack detection method detects the attacks by analyzing the ICS input features using the combination of ensembled unsupervised DNNs and a decision tree. If an attack is detected, the sample is passed to several DNNs for detailed analysis. If the attack was previously unseen/unknown, the unseen attack detection module detects it and labels it as an unseen attack, which is passed on for detailed security analysis. Otherwise, the attack attribution method detects the attribute of the attack.

Fig. 1. Proposed attack detection and attribution framework (panels: Attack Detection Method; Attack Attribution Method).

A. Proposed Ensemble Attack Detection Method
The proposed attack detection consists of two phases, namely a representation learning phase and a detection phase. Using a conventional unsupervised DNN on an imbalanced dataset yielded a DNN model that mainly learned majority class patterns and missed minority class characteristics. Most researchers have tried to address this challenge by generating new samples or removing certain samples to make the dataset balanced and then passing the data to a DNN. However, in ICS/IIoT security applications, generating or removing samples is not a reasonable solution. Due to the ICS/IIoT systems’ sensitivity, generated samples should be validated in a real network, which is impossible since the generated attack samples may be harmful to the network and cause severe impacts on the environment or human life. In addition, validation of the generated samples is time-consuming. Moreover, removing the normal data from a dataset is not the right solution since the number of attack samples in ICS/IIoT datasets is usually less than 10% of the dataset, and most of the dataset knowledge is discarded by removing 80% of the dataset.
To avoid the above-mentioned problems in handling imbalanced datasets, this study proposed a new deep representation learning method that enables the DNN to handle imbalanced datasets without changing, generating, or removing samples. This model consisted of two unsupervised stacked autoencoders, each responsible for finding patterns from one class. Since each model tries to extract abstract patterns of one class without considering the other, the output of that model represented its inputs well. The stacked autoencoders had three encoder and three decoder layers, along with input and final representation layers. The encoder layers mapped the input representation to a higher, 800-dimensional space, a 400-dimensional space, and the final 16-dimensional space. Equation 1 shows the encoder function of an autoencoder. The decoder layers did the opposite and tried to reconstruct the input representation by starting from the 16-dimensional new representation and mapping it to the 400-dimensional, 800-dimensional, and input representations. Equation 2 shows the decoder function of an autoencoder. These hyperparameters were selected using trial-and-error to obtain the best f-measure with the lowest architectural complexity.

$$ h_i = \sigma(w_i x_i + b_i) \tag{1} $$

In the above equation, σ denotes an activation function, w is the weight matrix of the encoder, x is a vector of sample features, b is the encoder’s bias, h is the encoded representation, and i ∈ {Normal, Attack}.

$$ \hat{x}_i = \sigma'(w'_i h_i + b'_i) \tag{2} $$

In the above equation, σ′ is the decoder’s activation function, w′ is the weight matrix of the decoder, h is the encoded representation, b′ is the decoder’s bias, x̂ is the reconstruction of input x, and i ∈ {Normal, Attack}.
Each autoencoder was trained individually using the loss function indicated in Equation 3.

$$ L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2 = \lVert x - \sigma'(w' \sigma(w x + b) + b') \rVert^2 \tag{3} $$

In the above equation, L(x, x̂) denotes the loss between the input x and its reconstruction x̂.
After training the autoencoders, all observations were passed through both autoencoders, and the final representations were fused to form a super-vector for each instance to build a new dataset.

$$ X_{new} = [H_{normal}, H_{attack}] \tag{4} $$

In the above equation, X_new is the new dataset, consisting of a super-vector of the learned representations from the normal and attack autoencoder models for each sample. H_normal is a matrix of h_normal, the part of the features that shows how the sample x can represent a normal sample, while H_attack is a matrix of h_attack, which shows how sample x can represent an attack sample.
In the second phase, to make a decision based on the hybrid representation, the super-vector was passed through the Principal Component Analysis (PCA) algorithm [21], and the extracted features were given to a DT classifier to facilitate detection. Using the PCA increases a DT classifier’s speed in training and testing (see Equations 5 and 6). Moreover, DT is a simple, powerful model that can be trained faster than more complex models like DNNs, specifically for small feature sets. In addition, our previous experiments [22] and certain other studies [11] have shown that DT works well on ICS and CPS data. The Gini function was used to train the DT (see Equation 7).

$$ X_{new}^{T} X_{new} = P D P^{-1} \tag{5} $$

In the above equation, P is the eigenvector matrix, and D is a diagonal matrix with the eigenvalues on its main diagonal and zeros elsewhere. The eigenvectors were sorted based on the eigenvalues, and the first k (number of extracted features) vectors were called P*. Equation 6 shows the process of extracting k features from dataset X_new.

$$ X^{*} = X_{new} P^{*} \tag{6} $$

In the above equation, X* is the result of dimensionality reduction using PCA.

$$ gini = 1 - \sum_{i=1}^{c} p_i^2 \tag{7} $$

In the above equation, c is the number of classes, and p_i is the probability of class i in the current branch of the tree.
To detect previously unseen attacks, a One-Class SVM (OCSVM) was used to make a boundary around normal samples and to report the others as previously unseen attacks. Algorithm 1 shows the algorithm of the proposed attack detection component.

Algorithm 1: The proposed two-phase attack detection component
Data: Dataset including Normal and Attack samples (X) and their labels (y ∈ {0, 1})
Training Phase:
    X = z(X), where z = (x − min(x)) / (max(x) − min(x));
    X_attack = X[y == 1];
    X_normal = X[y == 0];
    // Training Representation Learning Models:
    for number of epochs do
        for number of batches in Normal set do
            Train the Normal autoencoder (AE_normal): min L(X_normal, X̂_normal);
        end
        for number of batches in Attack set do
            Train the Attack autoencoder (AE_attack): min L(X_attack, X̂_attack);
        end
    end
    // Fusion Layer:
    newRep_normal = AE_normal.predict(X);
    newRep_attack = AE_attack.predict(X);
    X_superVector = concat(newRep_normal, newRep_attack);
    // Detection Model:
    Feature selection using PCA: SelectedFeatures(X_superVector) = PCA(X_superVector);
    Train a DT using the new features: DT = TrainDT(SelectedFeatures)
Testing Phase:
    x_test = z(x_test);
    newRep_normal = AE_normal(x_test);
    newRep_attack = AE_attack(x_test);
    superVector = concat(newRep_normal, newRep_attack);
    x̂_test = SelectedFeatures(superVector);
    ŷ = DT(x̂_test);
Output: Normal/Attack Label (ŷ)
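The encoder, decoder, loss, and fusion steps in Equations 1 to 4 can be illustrated with a minimal sketch. It assumes a single dense layer per encoder and decoder, a sigmoid for both σ and σ′, and the squared-norm reading of Equation 3. The helper names (`dense`, `encode`, `decode`, `reconstruction_loss`, `super_vector`) and all weights are hypothetical and hand-picked for illustration; the autoencoders described in the paper are stacked (800, 400, and 16 units) and trained, not fixed.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(weights, bias, x):
    """One layer: sigmoid(W x + b), with W given as a list of rows."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def encode(params, x):
    """Equation 1: h = sigma(w x + b)."""
    return dense(params["w"], params["b"], x)


def decode(params, h):
    """Equation 2: x_hat = sigma'(w' h + b')."""
    return dense(params["w_dec"], params["b_dec"], h)


def reconstruction_loss(x, x_hat):
    """Equation 3, read as the squared Euclidean distance (assumption)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def super_vector(ae_normal, ae_attack, x):
    """Equation 4 for one sample: concatenate both encoded representations."""
    return encode(ae_normal, x) + encode(ae_attack, x)


# Hypothetical, untrained parameters: 2 input features, 1 hidden unit each.
AE_NORMAL = {"w": [[0.5, -0.5]], "b": [0.0],
             "w_dec": [[1.0], [-1.0]], "b_dec": [0.0, 0.0]}
AE_ATTACK = {"w": [[-1.0, 1.0]], "b": [0.1],
             "w_dec": [[-0.5], [0.5]], "b_dec": [0.0, 0.0]}

sample = [0.2, 0.8]  # already min-max normalized
fused = super_vector(AE_NORMAL, AE_ATTACK, sample)
loss = reconstruction_loss(sample, decode(AE_NORMAL, encode(AE_NORMAL, sample)))
print(len(fused), loss)
```

In this sketch the fused vector has one entry per hidden unit of each autoencoder, so the downstream PCA and DT see both the "normal" and the "attack" view of every sample regardless of how many samples each class contributed during training.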

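The Gini function in Equation 7 can be worked through on small label lists. The sketch below is illustrative only: `gini_impurity` follows Equation 7 directly, while `split_impurity` adds the size-weighted average commonly used by CART-style trees to score a candidate split, which is an assumption about the DT implementation rather than something the paper states.

```python
from collections import Counter


def gini_impurity(labels):
    """Equation 7: 1 - sum_i p_i^2 over the classes present in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of a two-way split (assumed CART criterion)."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) * gini_impurity(left)
            + len(right) * gini_impurity(right)) / n


# 0 = normal, 1 = attack; an imbalanced branch with three normal samples
# and one attack sample.
branch = [0, 0, 0, 1]
print(gini_impurity(branch))
print(split_impurity([0, 0, 0], [1]))
```

A branch containing a single class has impurity 0, and a perfectly mixed two-class branch has impurity 0.5, so a split that isolates the attack sample scores better than leaving the branch mixed.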
B. Proposed Self-Tuning Attack Attribution Method
The proposed self-tuning attack attribution method consists of two phases. In the first phase, a one-vs-all classifier is trained for each attribute. To train these classifiers, a dataset’s attack samples are split into several subsets based on their attributes, and one DNN model is trained for every set. The Rectified Linear Unit (ReLU) function is used as the activation function for the hidden layers, and the Sigmoid function is used as the output-layer activation function. Next, the outputs of all of the first-phase DNNs are passed to the second phase to attribute the instances based on the one-vs-all DNNs.
In the second phase, the one-vs-all classifiers and a DNN ensemble model are combined to compose a more complex DNN. This DNN is constructed from two components: a partially-connected element consisting of several one-vs-all classifiers, and a fully-connected element that fuses the first part’s results and attributes the samples to different classes. The ReLU activation function is used for the hidden layers of the ensemble DNN, and the softmax function is used as its output activation function (Equation 8). The Categorical Cross-Entropy (CE) is used as the loss function of the final DNN (Equation 9). In addition, the outputs of this DNN are the two most probable attributions for the given sample. This model is called the primary attack attribution method. A DT classifier is trained for each pair of attack attributes and is used for the final attack attribution from the two candidates; this is referred to as the secondary attack attribution method.

$$ \sigma(s)_i = \frac{e^{s_i}}{\sum_{j=1}^{K} e^{s_j}} \tag{8} $$

where K is the number of classes, and s = (s_1, ..., s_K) ∈ R^K.

$$ CE = -\sum_{i=1}^{K} y_i \log(\sigma(s)_i) \tag{9} $$

where y_i is the label of the i-th class, and σ(s)_i is the output of the softmax function.
This method is self-tuning since it can tune itself to changing attack patterns without needing pre-processing. This results from using the gradient descent technique to simultaneously update the weights of all one-vs-all classifiers and the ensemble model. This feature is useful when a new attack attribute is discovered and then added to the attack attribution method. This is done by passing the new dataset, including the new attack attribute, through the proposed attack attribution method. Algorithm 2 shows the algorithm of the proposed attack attribution component.

Algorithm 2: The proposed two-phase attack attribution component
Data: Dataset including Attack samples from various families (X) and the labels (y ∈ [1, c])
Training Phase:
    X = z(X), where z = (x − min(x)) / (max(x) − min(x));
    foreach attack type i do
        foreach sample x ∈ X do
            if y[x] = i then
                y_i = 1
            end
            else
                y_i = 0
            end
        end
    end
    // Training the binary DTs:
    foreach pair of attack classes do
        Train a DT
    end
    // Training one-vs-all classifiers:
    foreach attack type i do
        for number of epochs do
            for number of batches in the Attack type i do
                Train the one-vs-all classifier (classifier_i): min L(y_i, ŷ_i);
            end
        end
    end
    // Ensemble model:
    DNN = new neural network;
    foreach classifier i do
        DNN.add(classifier_i);
    end
    DNN.add(fully-connected neural network);
    for number of epochs do
        for number of batches in training data do
            Train the whole network: min L(y, ŷ);
        end
    end
Testing Phase:
    x_test = z(x_test);
    DNN.predict2bests(x_test);
    Pass x_test to the DT;
Output: Attack type (ŷ)
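The primary and secondary attribution steps can be sketched together with Equations 8 and 9. This is a minimal illustration under stated assumptions: the ensemble DNN is replaced by a plain list of output scores, and each pairwise DT is replaced by a hypothetical callable stored in `pair_classifiers`. The names `two_best`, `attribute`, and `pair_classifiers`, the threshold rule, and all numbers are invented for the example.

```python
import math


def softmax(scores):
    """Equation 8, shifted by max(scores) for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, probs):
    """Equation 9: -sum_i y_i * log(sigma(s)_i); zero labels are skipped."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y)


def two_best(probs):
    """Primary attribution: indices of the two most probable attributes."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return tuple(sorted(ranked[:2]))


def attribute(scores, features, pair_classifiers):
    """Secondary attribution: the classifier for the candidate pair decides."""
    pair = two_best(softmax(scores))
    return pair_classifiers[pair](features)


# Hypothetical stand-in for one trained pairwise decision tree.
pair_classifiers = {(0, 2): lambda f: 0 if f[0] < 0.5 else 2}

scores = [2.0, 0.1, 1.5]      # made-up ensemble outputs for 3 attributes
probs = softmax(scores)
print(two_best(probs))
print(attribute(scores, [0.7], pair_classifiers))
print(cross_entropy([1, 0, 0], probs))
```

Restricting the final decision to two candidates means each secondary classifier only has to separate one pair of attributes, which is one plausible reason the approach helps with attacks that look very similar to each other.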
Na¨ıve Malicious Response Injection (NMRI), Complex Mali-
IV. EXPERIMENTAL SETUP cious Response Injection (CMRI), Malicious State Command
Injection (MSCI), Malicious Parameter Command Injection
A. Dataset (MPCI), Malicious Function Code Injection (MFCI), Denial
As previously discussed, we evaluated the proposed frame- of Service (DoS), and Reconnaissance (Recon) attacks. It
work using two real-world ICS datasets. The first dataset reportedly contained 274,628 observations, in which 214,580
was collected at the Mississippi State University [23] from (78.14%) were normal samples, and the remaining 60,048
a gas pipeline system consisting of sensors and actuators, a (21.86%) samples were attack samples. This dataset also
communication network, and supervisory control. This dataset consisted of 17 features of network and field states.
consists of normal samples and seven attack types, including The second dataset was the Secure Water Treatment (SWaT)

dataset [24], collected at Singapore University of Technology • Recall indicates the number of samples that are detected
from a water treatment system, consisting of 449,920 samples. as attack correctly over the total samples of the attack in
In this dataset, 87.9% and 12.1% were normal and attack the dataset (see Equation 12).
samples, respectively. Each dataset sample was formed by 51 • F-measure is the harmonic value of precision and recall
features that were the physical measurements of the systems. (see Equation 13).
In addition, this dataset consisted of 31 different attack sce- In the detection task, the desired class is the attack one. The
narios that could be used for attack attribution. attack class is considered as the positive class for precision,
recall, and f-measure metrics.
B. Pre-Processing
As shown in Figure 1, the proposed framework consists of D. Feature Extraction
several DNNs that accept the raw features as input and map
PCA was chosen for dimensionality reduction and also to
them to new representations for attack detection and attack
extract the best features from super-vectors. It also improve
attribution. Similar to some other approaches [25], [26], [27],
the performance of the DT classifier by extracting independent
the data was normalized using the min-max technique before
features in an unsupervised manner.
passing it through the methods to make them unbiased against
To extract the best features using the PCA, 10-fold cross-
the features. This was the only pre-processing for the proposed
validation was performed on each dataset’s possible number
framework. Moreover, 10-fold cross-validation was performed
of features. The dataset’s principal components were extracted
to obtain the results.
in each run, and the model was trained and tested using the
principal components. To make the PCA unbiased to the test
C. Evaluation Metrics
data, training was performed on the training data . The number
To ensure fairness in comparison, this study evaluated the of principal components with the best f-measure over ten runs
performance of the proposed attack attribution method using was then selected as the number of PCA components.
the DT classifier on the original representation and approaches
that used the same dataset(s) in their original articles. However, V. DISCUSSIONS
for the proposed self-tuning attack attribution method, we were
not able to find similar approaches. A comparison with the The proposed attack detection and attack attribution meth-
Fuzzy C-Mean (FCM) clustering [25] verified that FCM could ods form a framework that can keep ICS/IIoT systems secure.
detect only four out of eight classes in the gas pipeline dataset This framework is proposed to address the challenge of
(while our model attributes all eight classes). This suggested that the attacks were very similar and hard to classify.

Similar to other approaches, this study used standard metrics to evaluate the performance of machine learning algorithms. Specifically, it used True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) to represent the number of samples correctly classified as attacks, correctly classified as normal, wrongly classified as attacks, and wrongly classified as normal, respectively. Using these metrics, it is possible to define Accuracy (ACC), Precision (Pre), Recall (Rec), F-measure, Receiver Operating Characteristics (ROC) curve, and Area Under Curve (AUC) to quantify the performance of ML algorithms in performing malware detection.

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$

$$\mathrm{Pre} = \frac{TP}{TP + FP} \tag{11}$$

$$\mathrm{Rec} = \frac{TP}{TP + FN} \tag{12}$$

$$\text{f-measure} = \frac{2 \times \mathrm{Pre} \times \mathrm{Rec}}{\mathrm{Pre} + \mathrm{Rec}} \tag{13}$$

• Accuracy indicates the number of samples that are correctly classified over the entire dataset. Since ICS datasets are imbalanced, this metric is not a good one for evaluation (see Equation 10).
• Precision indicates the number of samples that are detected correctly as attack over total samples detected as an attack (see Equation 11).

ICS imbalanced data without ignoring the minority class or balancing the dataset. The proposed framework should be deployed on the physical layer to passively monitor the sensor data and give an alert when an attack happens. In such a case, the data is sent to the attribution model to detect the attack attribute. Finally, security experts and incident response teams can handle attacks and prevent potential damages using the proposed framework’s efficient, accurate information.

A. The Proposed Attack Detection Method

The proposed attack detection method consists of a deep representation learning model with two unsupervised stacked autoencoders, feature extraction using the PCA, and a DT classification.

Due to the consideration of both attack and normal data in the training step, the proposed attack detection method can detect previously seen attacks with better f-measures than the other methods, as can be seen in Table I. To enhance the method’s ability to face the previously unseen attacks, an anomaly detection module was added to the system trained on the normal data to capture the normal data structure and detect anomalies. The OCSVM model was used in this module.

The proposed attack detection component is scalable to larger ICS with more features and larger data sets. The only part of the system that depends on the ICS architecture is the representation-learning step, which needs more training time by increasing the size of the system and/or the data’s size. However, it will not affect the performance of the proposed framework in real implementation.
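The two-level decision described for the detection method (a DT for previously seen attacks, backed by an OCSVM trained on normal data for previously unseen ones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `dt_is_attack` and `ocsvm_is_anomaly` are hypothetical callables standing in for the trained DT (operating on the learned representation) and the OCSVM module, and the toy rules below are invented purely so the example runs.

```python
from typing import Callable, Sequence

Sample = Sequence[float]


def detect(sample: Sample,
           dt_is_attack: Callable[[Sample], bool],
           ocsvm_is_anomaly: Callable[[Sample], bool]) -> str:
    """Return 'known_attack', 'unseen_attack', or 'normal' for one sample."""
    # Stage 1: the supervised DT flags attacks that resemble the training data.
    if dt_is_attack(sample):
        return "known_attack"
    # Stage 2: samples the DT considers normal are re-checked against the
    # profile of normal data; deviations are treated as unseen attacks.
    if ocsvm_is_anomaly(sample):
        return "unseen_attack"
    return "normal"


# Toy stand-ins (assumptions for illustration only, not trained models).
def toy_dt(sample: Sample) -> bool:
    return sample[0] > 0.9  # flags a large first feature


def toy_ocsvm(sample: Sample) -> bool:
    # flags any feature far from a nominal value of 0.5
    return any(abs(x - 0.5) > 0.4 for x in sample)


for s in ([0.95, 0.5], [0.5, 0.05], [0.5, 0.6]):
    print(s, "->", detect(s, toy_dt, toy_ocsvm))
```

Samples labelled `known_attack` would then be passed on to the attribution component, while `unseen_attack` samples have no known attribute to assign.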

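Equations 10–13 translate directly into a few lines of code. The sketch below is illustrative only: the function name and the confusion-matrix counts are hypothetical and are not taken from the paper's experiments. The second call shows why accuracy alone is misleading on imbalanced data: a classifier that labels every sample as normal still scores high accuracy while its recall and f-measure are zero.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute ACC, Pre, Rec and f-measure from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0 (a convention chosen
    for this sketch, not something specified in the paper).
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    pre = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = 2 * pre * rec / (pre + rec) if (pre + rec) else 0.0
    return {"acc": acc, "pre": pre, "rec": rec, "f_measure": f_measure}


# Hypothetical detector on 1000 samples, 100 of which are attacks.
detector = classification_metrics(tp=80, tn=890, fp=10, fn=20)

# Hypothetical "always normal" baseline on the same data.
majority = classification_metrics(tp=0, tn=900, fp=0, fn=100)

print(detector)
print(majority)
```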
1) General Performance: As observed in Table I, the proposed method outperformed the base DT model on the original representation in all metrics. Moreover, it outperformed other techniques in the f-measure metric (i.e., the harmonic mean of precision and recall, and an important metric for evaluating imbalanced datasets). In addition, the proposed attack detection method outperformed all other techniques on the SWaT dataset. In other words, the proposed attack detection method achieved good precision without affecting the recall metric on the data. As discussed earlier, accuracy is not a useful metric by which to evaluate models’ performances using imbalanced datasets; in this case, by labeling all of the samples with the majority class label, a model would achieve high accuracy (78.14% in gas and 87.9% in the SWaT dataset).

Moreover, as shown in Table II, the proposed attack detection method has a higher recall (true-positive rate) than other techniques for each attack attribute. In other words, the proposed method detects more attacks than the others when trained on only one attack type.

Table I reinforces the importance of the representation learning models to ICS datasets. The proposed deep representation learning step enables the method to develop new features separately for normal and attack data in an unsupervised manner based on their patterns. In turn, these new features allow the DT to perform a more effective classification than was facilitated using the original features.

2) Imbalanced Testing: The reported higher f-measure in Table I shows that the proposed attack detection method achieved better performance on the imbalanced datasets. To evaluate the robustness of the proposed ensemble two-phase attack detection method for imbalanced ICS data, this study generated different sets of data with different imbalance ratios by varying the number of attack samples in the original dataset. These sets were obtained from the original datasets and generated randomly. Next, the new datasets were fed into the proposed attack detection method and compared with several base classifiers, including DT, Logistic Regression (LR), Gradient Boosting (GB), AdaBoost M1 (AB), and Random Forest (RF). The new imbalanced sets were used for training to ensure a fair comparison, and the evaluation was performed using a predefined test set. In addition to achieving better performance for the proposed attack detection method in all metrics, the proposed model resulted in a robust, consistent performance in all metrics for both datasets (see Figure 2). Robustness refers to the low variance of the changes in the performance of the model. It indicates that the proposed attack detection method achieves high accuracy, low false positives, and high f-measures simultaneously, thereby outperforming the competing approaches. More specifically, the high f-measure of the proposed method is significant in performance evaluation for imbalanced datasets.

Beyond this, the findings suggested that the proposed method mitigates the challenge of the imbalanced problem in DNNs by separating the attack and normal samples and running separate, unsupervised stacked autoencoders on each of them. Using this technique, the majority class samples’ effects on the gradient descent algorithm are avoided, enabling the autoencoders to extract more useful features from the minority set. Furthermore, the fusion layer consists of useful representations from both majority (normal) and minority (attack) data.

3) Previously Unseen Attack Detection: To detect previously unseen attacks, the OCSVM model was added to the proposed framework. OCSVM, a type of SVM, attempts to maximize the decision boundary’s margin to yield better generalization. Based on the evaluations, we observed that this method correctly detected 86.14% of previously unseen attacks in the gas pipeline dataset. Moreover, 94.53% of the previously unseen attacks were detected correctly in the SWaT dataset.

4) Execution Time Comparison: Table III compares the proposed attack detection component’s execution time with other proposed methods in the literature. As illustrated in Table III, it takes 1200 seconds to train the whole model on the SWaT dataset, while applying the trained model over testing samples takes 2.98 seconds, which means around 0.03 milliseconds for each sample. Moreover, training the proposed method on the Gas Pipeline dataset takes 1115 seconds, while the test takes around 1.1 seconds, which means around 0.02 milliseconds for each sample. As can be seen from Table III, the proposed model is faster than most DNN-based techniques due to its simpler architecture combined with the PCA method, which makes the DT faster. Besides, the proposed attack detection component’s execution time illustrates that it can detect attack samples in almost real-time (0.02 milliseconds for the Gas Pipeline dataset and 0.03 milliseconds for the SWaT dataset).

B. The Proposed Attack Attribution Method

In the proposed attack attribution method, a one-vs-all DNN classifier was responsible for extracting each attribute’s pattern and assigning a membership confidence to each observation. These confidences from all DNNs were passed to another DNN, which was responsible for attack attribution. Due to the close patterns of the attacks [25], this DNN did not perform well. However, it can detect attributes better than FCM. To improve the attack attribution method’s performance, this study defined a two-step method. In the first step, the aforementioned DNN determined the two best attribute candidates for the observed sample. In the second step, the observed sample was sent to a DT pre-trained on the samples of the two candidate attributes to detect the best attribute.

Using one-vs-all classifiers for each attack attribute guarantees that each classifier passes the best result to the ensemble DNN model, which yields better performance, as this paper will show here. These classifiers were connected to a DNN fusion model to pass their extracted features and fuse them into the fusion model to attribute the samples. Each one-vs-all classifier was a supervised DNN that encoded the input features within an 8-dimensional space and then into a 128-dimensional space using the ReLU activation function. Based on the final representation, the output layer classified it. The fusion model is another DNN; its inputs were the outputs of the one-vs-all classifiers. This fusion model decoded the input features in the 128-dimensional space, followed by a

TABLE I
COMPARISON OF THE PROPOSED ATTACK DETECTION METHOD WITH OTHER TECHNIQUES ON THE GAS PIPELINE AND SWAT DATASETS

SWaT Dataset

| Method | Pre | Rec | f-measure |
|---|---|---|---|
| Proposed method | 0.9999 | 0.9999 | 0.9998 |
| DT | 0.8411 | 0.8284 | 0.8346 |
| LAD-ADS [13] | 0.936 | 0.891 | 0.914 |
| DNN [26] | 0.9829 | 0.6785 | 0.8028 |
| 1D CNN [29] | 0.868 | 0.854 | 0.861 |
| MADGAN [30] | 0.9897 | 0.6374 | 0.77 |
| Tabor [31] | 0.8617 | 0.7880 | 0.8232 |
| LSTM [33] | 0.951 | 0.627 | 0.756 |
| ST-ED [33] | 0.949 | 0.705 | 0.809 |

Gas Pipeline Dataset

| Method | ACC (%) | Pre | Rec | f-measure |
|---|---|---|---|---|
| Proposed method | 96.20 | 0.9617 | 0.9620 | 0.9618 |
| DT | 91.11 | 0.9092 | 0.9111 | 0.9099 |
| SVM [28] | 92.50 | 0.782 | 0.936 | 0.852 |
| K-means [25] | 56.80 | 0.8319 | 0.5728 | 0.6751 |
| NB [25] | 90.36 | 0.8195 | 0.7692 | 0.8595 |
| AllKNN [12] | 97 | 0.98 | 0.92 | 0.95 |
| LSTM [32] | 92 | 0.94 | 0.78 | 0.85 |

TABLE II
COMPARISON BETWEEN THE RECALL OF THE PROPOSED ATTACK DETECTION METHOD AND OTHER TECHNIQUES ON THE GAS PIPELINE DATASET ATTACK ATTRIBUTES

| Model | NMRI | CMRI | MSCI | MPCI | MFCI | DoS | Recon. |
|---|---|---|---|---|---|---|---|
| Proposed attack detection method | 0.97 | 0.95 | 0.97 | 0.95 | 1 | 1 | 1 |
| AllKNN [12] | 0.93 | 0.76 | 0.68 | 0.85 | 1 | 0.98 | 1 |
| LSTM [32] | 0.88 | 0.67 | 0.62 | 0.80 | 1 | 0.94 | 1 |
| K-means [25] | 0.19 | 0.20 | 0.73 | 0.66 | 0.52 | 0.56 | 0.75 |
| NB [25] | 0.81 | 0.84 | 0.73 | 0.67 | 0.52 | 0.79 | 0.50 |

Fig. 2. Comparison of accuracy, AUC, and f-measure of the proposed attack detection method and other basic classifiers on the original representation for different attack imbalance ratios (IR): (A), (B), and (C) on the gas pipeline dataset and (D), (E), and (F) on the SWaT dataset. In the figures, PM is the proposed attack detection method, DT is the Decision Tree, LR is the Logistic Regression, GB is the Gradient Boosting, AB is the AdaBoost M1, and RF is the Random Forest.

64-dimensional space using the ReLU activation function. The output layer used the softmax activation function to attribute the observation to the given attributes (31 for the SWaT dataset and seven for the gas pipeline dataset).

As discussed in [25], running the FCM algorithm on the gas pipeline dataset with the eight clusters resulted in four clusters. This implies that the attacks are very similar and share so many common features that the FCM algorithm considers them one group. To overcome this problem, this study detected the two most probable attack attributes for each sample using the ensemble model. These samples were fed into the DT classifier, which was trained on the two most probable attributes to obtain the final attack attribute. This was labelled the secondary attack attribution method. As observed in Table IV, all of the metrics improved significantly by using the final DT model (secondary attack attribution) compared with the primary attack attribution method (using the output of the DNN model) on both datasets. Thus, the attack attribution method can attribute all attacks with reasonable confidence (as a best or second-best result). Figure 3 compares the confusion matrices for the performance of the proposed primary and secondary attack attribution methods for the gas pipeline dataset. The confusion matrix for the SWaT dataset is not reported due to page limitations since it includes 36 different

TABLE III
COMPARISON OF THE TRAIN AND TEST EXECUTION TIME OF THE PROPOSED ATTACK DETECTION METHOD WITH OTHER TECHNIQUES ON THE GAS PIPELINE AND SWAT DATASETS. IN THIS TABLE, S STANDS FOR SECONDS AND W STANDS FOR WEEKS.

SWaT Dataset

| Method | Train | Test |
|---|---|---|
| Proposed method | 1200s | 2.98s |
| LAD-ADS [13] | 8820s | 2s |
| DNN [26] | 2w | 28800s |
| Tabor [31] | 214s | 33s |
| LSTM [33] | 57s (per epoch) | 13s |
| ST-ED [33] | 692s | 217.50s |

Gas Pipeline Dataset

| Method | Train | Test |
|---|---|---|
| Proposed method | 1115s | 1.10s |
| SVM [28] | 11712 | - |
| AllKNN [12] | - | 5.99s |
| LSTM [32] | 2100s | 1.65s |

attack attributes. Despite the strong evaluation results of the secondary attack attribution method, it cannot discriminate between DoS and MPCI samples due to the similar impacts of these attacks on its features.

The proposed attack attribution component is scalable to larger ICS with more features and larger data sets. However, its execution time depends on the number of attack classes and is almost independent of the system’s size (features).

1) Execution Time: Training of the proposed attack attribution component on the Gas Pipeline dataset took 1155 seconds, while the attribution over test data took 0.65 seconds, which means around 0.05 milliseconds for each sample. Moreover, training of the proposed attack attribution component on the SWaT dataset took 3452 seconds, and it classified the test data in 2.87 seconds, which means around 0.27 milliseconds for each sample. The proposed model’s training and testing execution time depends on the number of attribute classes (seven classes for the Gas Pipeline dataset vs. 31 classes for the SWaT dataset).

C. Computational Complexity

In this section, the computational complexity of the proposed attack detection and attribution methods will be analyzed.

The computational complexities of training and testing the used algorithms are shown in Table V [34], [35]. In this table, n is the number of training samples, and the computational complexities were calculated for the worst-case scenario, in which the number of input features, number of neurons in each layer, number of selected support vectors, and depth of the DT is considered to be n.

1) The Proposed Attack Detection Method: As mentioned before, the proposed attack detection method consists of a novel form of deep representation learning, PCA feature extraction, and a DT classification. Each deep representation learning model has three encoding and three decoding layers. Based on Table V, the computational complexity of training the proposed deep representation learning in the worst-case scenario is $O(n^4)$, where n is the number of training samples.

The other parts of this method are the PCA and DT algorithms. As mentioned in Table V, in the worst-case scenario, the PCA and DT algorithms’ computational complexity is equal to $O(n^3)$. Equation 14 shows the computational complexity of training of the proposed attack detection method,

$$O(n^4) + O(n^3) + O(n^3) = O(n^4) \tag{14}$$

which is similar to the other DNN-based detection methods in the literature.

Moreover, the testing computational complexity of the proposed attack detection method is shown in Equation 15,

$$O(n^2) + O(n) + O(1) = O(n^2) \tag{15}$$

which is similar to all other DNN-based methods (except the recurrent neural network-based methods) in the literature.

Adding the previously unseen attack detection module did not change the computational complexity of training and testing the proposed attack detection technique since the OCSVM’s training computational complexity is $O(n^3)$. In addition, its testing computational complexity is $O(n^2)$, which cannot affect the proposed attack detection method’s computational complexity.

2) The Proposed Attack Attribution Method: The proposed attack attribution method includes several one-vs-all DNNs connected using another DNN to make a deeper DNN model. The best two attribution candidates were selected using this DNN model, and a pre-trained DT on the candidate attributes was used to detect the final attributes. As a DT should be trained for every two attributes, $\frac{c \times (c-1)}{2}$ DTs should be trained, where c is the number of attributes; each has a computational complexity of $O(n^3)$. Thus, the computational complexity of training all of the DTs is $O(c^2 \times n^3)$, where c is the number of attributes, and n is the number of training samples.

In addition to the DTs, the proposed attack attribution method used DNNs with the training computational complexity of $O(n^4)$. Combining the DTs’ and the DNN model’s training, the computational complexity of training the proposed attack attribution model is shown in Equation 16,

$$O(c^2 \times n^3) + O(n^4) = O(n^4) \tag{16}$$

where c is the number of attributes, and n is the number of training samples. Since the number of training samples is significantly larger than the number of attributes, the number of attributes is ignored in the computational complexity analysis. As seen in Equation 16, the computational complexity of training the proposed attack attribution method is similar to that of the other DNN methods.

The proposed attack attribution’s testing computational complexity is $O(n^2)$, similar to the computational complexities of the other DNN-based techniques in the literature.
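The step from the pairwise DTs to the overall attribution training bound can be written out explicitly. The derivation below is a worked restatement under the text's own assumption that the number of attributes $c$ is a fixed constant that does not grow with the number of training samples $n$:

$$
\underbrace{\frac{c(c-1)}{2}}_{\text{pairwise DTs}} \cdot O(n^3) = O(c^2 n^3),
\qquad
O(c^2 n^3) + O(n^4) = O(n^4) \quad \text{for constant } c .
$$

With the attribute counts quoted in the paper, this corresponds to $\frac{7 \cdot 6}{2} = 21$ pairwise DTs for the seven gas pipeline attributes and $\frac{31 \cdot 30}{2} = 465$ for the 31 SWaT attributes.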

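The two-step attribution logic (pick the two most probable attributes from the fusion DNN's output, then let a DT trained on just that pair decide) can be sketched in a few lines. This is only an illustration under stated assumptions: `confidences` stands in for the fusion DNN's softmax output, and each entry of `pairwise_dts` is a hypothetical callable standing in for a pre-trained pairwise DT; the toy decision rule is invented so the example runs.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Sequence, Tuple

Sample = Sequence[float]
PairwiseDT = Callable[[Sample], int]


def top_two(confidences: Sequence[float]) -> Tuple[int, int]:
    """Indices of the two highest-confidence attributes, best first."""
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    return order[0], order[1]


def attribute(sample: Sample,
              confidences: Sequence[float],
              pairwise_dts: Dict[FrozenSet[int], PairwiseDT]) -> int:
    """Primary step: top-2 candidates. Secondary step: pairwise DT decides."""
    a, b = top_two(confidences)
    return pairwise_dts[frozenset((a, b))](sample)


# Toy setup with c = 4 attributes: one stand-in "DT" per unordered pair.
# Each stand-in picks whichever candidate has the larger matching feature.
c = 4
pairwise_dts = {
    frozenset((i, j)): (lambda s, i=i, j=j: i if s[i] >= s[j] else j)
    for i, j in combinations(range(c), 2)
}

sample = [0.1, 0.7, 0.9, 0.2]
confidences = [0.05, 0.50, 0.40, 0.05]  # hypothetical fusion-DNN output

print("primary (argmax):", top_two(confidences)[0])
print("secondary (pairwise DT):", attribute(sample, confidences, pairwise_dts))
```

In this toy case the primary step alone would choose attribute 1, while the pairwise model for the candidates {1, 2} selects attribute 2, which mirrors how the secondary step can correct a close call between similar attacks.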
TABLE IV
RESULTS OF THE PROPOSED SELF-TUNING TWO-PHASE ATTACK ATTRIBUTION METHOD ON BOTH GAS PIPELINE AND SWAT DATASETS

| Model | Accuracy (Gas) | Accuracy (SWaT) | Precision (Gas) | Precision (SWaT) | Recall (Gas) | Recall (SWaT) | f-measure (Gas) | f-measure (SWaT) |
|---|---|---|---|---|---|---|---|---|
| Proposed primary attack attribution method | 78.08 | 99.53 | 0.7906 | 0.9959 | 0.7808 | 0.9953 | 0.7857 | 0.9956 |
| Proposed secondary attack attribution method | 98.14 | 99.71 | 0.9822 | 0.9974 | 0.9814 | 0.9971 | 0.9818 | 0.9972 |

Fig. 3. Confusion matrices of the proposed attack attribution method on the gas pipeline dataset for (A) the proposed primary attack attribution method and (B) the proposed secondary attack attribution method.

TABLE V
COMPUTATIONAL COMPLEXITY OF THE USED ALGORITHMS

| Algorithm | Training | Testing |
|---|---|---|
| DT | $O(n^3)$ | $O(n)$ |
| PCA | $O(n^3)$ | $O(1)$ |
| OCSVM | $O(n^3)$ | $O(n^2)$ |
| DNN | $O(n^4)$ | $O(n^2)$ |

D. Implementation in Real-World Environment

The proposed framework can be implemented in the same network layer as the Human Machine Interface (HMI) to observe the sensor data from field devices and detect and attribute attacks. It also can be connected to the monitoring system in the control center to inform the security experts about the presence of the attack and help them choose preventive actions in a timely manner. Moreover, the provided information helps the incident response team understand the attack and its impacts, based on the attribution information, to revive the damaged assets.

As shown in Figure 4, at first the input sensor data are fed into the detection component. The detection component classifies each sample as normal or attack based on its previous experience (training data). If the entered sample is detected as normal, it will pass to an OCSVM module for further investigation by comparing it to normal samples’ profiles. However, if the detection component detects the sample as an attack, it will go to the attribution component to extract its attribution. All the outputs are then passed to a monitoring system.

VI. CONCLUSION

This paper proposed a novel two-stage ensemble deep learning-based attack detection and attack attribution framework for imbalanced ICS data. The attack detection stage uses deep representation learning to map the samples to the new higher dimensional space and applies a DT to detect the attack samples. This stage is robust to imbalanced datasets and capable of detecting previously unseen attacks. The attack attribution stage is an ensemble of several one-vs-all classifiers, each trained on a specific attack attribute. The entire model forms a complex DNN with a partially connected and fully connected component that can accurately attribute cyber-attacks, as demonstrated. Despite the complex architecture of the proposed framework, the computational complexities of the training and testing phases are respectively $O(n^4)$ and $O(n^2)$ (n is the number of training samples), which are similar to those of other DNN-based techniques in the literature. Moreover, the proposed framework can detect and attribute the samples in a timely manner with a better recall and f-measure than previous works.

Future extension includes the design of a cyber-threat hunting component to facilitate the identification of anomalies invisible to the detection component, for example, by building a normal profile over the entire system and the assets.

REFERENCES
[1] F. Zhang, H. A. D. E. Kodituwakku, J. W. Hines, and J. Coble, “Multilayer Data-Driven Cyber-Attack Detection System for Industrial Control Systems Based on Network, System, and Process Data,” IEEE Transactions on Industrial Informatics, vol. 15, no. 7, pp. 4362–4369, 2019.
[2] R. Ma, P. Cheng, Z. Zhang, W. Liu, Q. Wang, and Q. Wei, “Stealthy Attack Against Redundant Controller Architecture of Industrial Cyber-Physical System,” IEEE Internet of Things Journal, vol. 6, no. 6, pp. 9783–9793, 2019.
[3] E. Nakashima, “Foreign hackers targeted U.S. water plant in apparent malicious cyber attack, expert says.” [Online]. Available: https://www.washingtonpost.com/blogs/checkpoint-washington/post/foreign-hackers-broke-into-illinois-water-plant-control-system-industry-expert-says/2011/11/18/gIQAgmTZYN_blog.html
[4] G. Falco, C. Caldera, and H. Shrobe, “IIoT Cybersecurity Risk Modeling for SCADA Systems,” IEEE Internet of Things Journal, vol. 5, no. 6, pp. 4486–4495, 2018.
[5] J. Yang, C. Zhou, S. Yang, H. Xu, and B. Hu, “Anomaly Detection Based on Zone Partition for Security Protection of Industrial Cyber-Physical Systems,” IEEE Transactions on Industrial Electronics, vol. 65, no. 5, pp. 4257–4267, 2018.
[6] S. Ponomarev and T. Atkison, “Industrial control system network intrusion detection by telemetry analysis,” IEEE Transactions on Dependable and Secure Computing, vol. 13, no. 2, pp. 252–260, 2016.

[Fig. 4 block diagram: Input Data → Detection Component; Attack → Attribution Component; Normal → OCSVM module; all outputs → Security Monitoring System.]

Fig. 4. Interaction of the proposed framework in real environment.

[7] J. F. Clemente, “No cyber security for critical energy infrastructure,” Ph.D. dissertation, Naval Postgraduate School, 2018.
[8] C. Bellinger, S. Sharma, and N. Japkowicz, “One-class versus binary classification: Which and when?” in 2012 11th International Conference on Machine Learning and Applications, vol. 2, 2012, pp. 102–106.
[9] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT Press, 2016. [Online]. Available: http://www.deeplearningbook.org
[10] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A review and new perspectives,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798–1828, 2013.
[11] M. Zolanvari, M. A. Teixeira, L. Gupta, K. M. Khan, and R. Jain, “Machine Learning-Based Network Vulnerability Analysis of Industrial Internet of Things,” IEEE Internet of Things Journal, vol. 6, no. 4, pp. 6822–6834, 2019.
[12] I. A. Khan, D. Pi, Z. U. Khan, Y. Hussain, and A. Nawaz, “HML-IDS: A hybrid-multilevel anomaly prediction approach for intrusion detection in SCADA systems,” IEEE Access, vol. 7, pp. 89 507–89 521, 2019.
[13] T. K. Das, S. Adepu, and J. Zhou, “Anomaly detection in industrial control systems using logical analysis of data,” Computers & Security, vol. 96, p. 101935, 2020.
[14] J. J. Q. Yu, Y. Hou, and V. O. K. Li, “Online False Data Injection Attack Detection With Wavelet Transform and Deep Neural Networks,” IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3271–3280, 2018.
[15] M. M. N. Aboelwafa, K. G. Seddik, M. H. Eldefrawy, Y. Gadallah, and M. Gidlund, “A machine-learning-based technique for false data injection attacks detection in industrial iot,” IEEE Internet of Things Journal, vol. 7, no. 9, pp. 8462–8471, 2020.
[16] W. Yan, L. K. Mestha, and M. Abbaszadeh, “Attack detection for securing cyber physical systems,” IEEE Internet of Things Journal, vol. 6, no. 5, pp. 8471–8481, 2019.
[17] A. Cook, A. Nicholson, H. Janicke, L. Maglaras, and R. Smith, “Attribution of Cyber Attacks on Industrial Control Systems,” EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, vol. 3, no. 7, p. 151158, 2016.
[18] L. Maglaras, M. Ferrag, A. Derhab, M. Mukherjee, H. Janicke, and S. Rallis, “Threats, Countermeasures and Attribution of Cyber Attacks on Critical Infrastructures,” ICST Transactions on Security and Safety, vol. 5, no. 16, p. 155856, 2018.
[19] M. Alaeiyan, A. Dehghantanha, T. Dargahi, M. Conti, and S. Parsa, “A Multilabel Fuzzy Relevance Clustering System for Malware Attack Attribution in the Edge Layer of Cyber-Physical Networks,” ACM Transactions on Cyber-Physical Systems, vol. 4, no. 3, pp. 1–22, 2020.
[20] U. Noor, Z. Anwar, T. Amjad, and K.-K. R. Choo, “A machine learning-based FinTech cyber threat attribution framework using high-level indicators of compromise,” Future Generation Computer Systems, vol. 96, pp. 227–242, 2019.
[21] S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 2, no. 1, pp. 37–52, 1987, Proceedings of the Multivariate Statistical Workshop for Geologists and Geochemists.
[22] A. N. Jahromi, J. Sakhnini, H. Karimipour, and A. Dehghantanha, “A deep unsupervised representation learning approach for effective cyber-physical attack detection and identification on highly imbalanced data,” in Proceedings of the 29th Annual International Conference on Computer Science and Software Engineering, ser. CASCON ’19. USA: IBM Corp., 2019, pp. 14–23.
[23] T. Morris, Z. Thornton, and I. Turnipseed, “Industrial control system simulation and data logging for intrusion detection system research,” in 7th Annual Southeastern Cyber Security Summit, 2015.
[24] J. Goh, S. Adepu, K. N. Junejo, and A. Mathur, “A dataset to support research in the design of secure water treatment systems,” in Critical Information Infrastructures Security, G. Havarneanu, R. Setola, H. Nassopoulos, and S. Wolthusen, Eds. Cham: Springer International Publishing, 2017, pp. 88–99.
[25] S. N. Shirazi, A. Gouglidis, K. N. Syeda, S. Simpson, A. Mauthe, I. M. Stephanakis, and D. Hutchison, “Evaluation of anomaly detection techniques for scada communication resilience,” in 2016 Resilience Week (RWS), 2016, pp. 140–145.
[26] J. Inoue, Y. Yamagata, Y. Chen, C. M. Poskitt, and J. Sun, “Anomaly detection for a water treatment system using unsupervised machine learning,” IEEE International Conference on Data Mining Workshops, ICDMW, vol. 2017-November, pp. 1058–1065, 2017.
[27] M. Kravchik and A. Shabtai, “Detecting cyber attacks in industrial control systems using convolutional neural networks,” Proceedings of the ACM Conference on Computer and Communications Security, no. 1, pp. 72–83, 2018.
[28] S. D. Anton, A. Hafner, S. Sinha, and H. Schotten, “Anomaly-based intrusion detection in industrial data with SVM and random forests,” in the 27th International Conference on Software, Telecommunications and Computer Networks (SoftCOM). IEEE, 2019.
[29] M. Kravchik and A. Shabtai, “Efficient cyber attack detection in industrial control systems using lightweight neural networks and pca,” IEEE Transactions on Dependable and Secure Computing, pp. 1–1, 2021.
[30] D. Li, D. Chen, B. Jin, L. Shi, J. Goh, and S. K. Ng, “MAD-GAN: Multivariate anomaly detection for time series data with generative adversarial networks,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11730 LNCS, pp. 703–716, 2019.
[31] Q. Lin, S. Verwer, S. Adepu, and A. Mathur, “TABOR: A graphical model-based approach for anomaly detection in industrial control systems,” ASIACCS 2018 - Proceedings of the 2018 ACM Asia Conference on Computer and Communications Security, pp. 525–536, 2018.
[32] C. Feng, T. Li, and D. Chana, “Multi-level anomaly detection in industrial control systems via package signatures and lstm networks,” in 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2017, pp. 261–272.
[33] M. Macas and W. Chunming, “Enhanced cyber-physical security through deep learning techniques,” CEUR Workshop Proceedings, vol. 2457, no. 38, 2019.
[34] C.-t. Chu, S. Kim, Y.-a. Lin, Y. Yu, G. Bradski, K. Olukotun, and A. Ng, “Map-reduce for machine learning on multicore,” in Advances in Neural Information Processing Systems, B. Schölkopf, J. Platt, and T. Hoffman, Eds., vol. 19. MIT Press, 2007, pp. 281–288.
[35] J. Su and H. Zhang, “A fast decision tree learning algorithm,” in Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1, ser. AAAI’06. AAAI Press, 2006, pp. 500–505.

2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
No ratings yet
Securing Industrial Control Systems Components Cyber Threats, and Machine Learning-Driven Defense Strategies
78 pages
A Deep and Scalable Unsupervised Machine Learning System For Cyber-Attack Detection in Large-Scale Smart Grids
No ratings yet
A Deep and Scalable Unsupervised Machine Learning System For Cyber-Attack Detection in Large-Scale Smart Grids
11 pages
Cyber-Critical Infrastructure Protection Using Rea
No ratings yet
Cyber-Critical Infrastructure Protection Using Rea
13 pages
1 s2.0 S016740482300007X Main
No ratings yet
1 s2.0 S016740482300007X Main
18 pages
Computer Networks: Muhammad Rizwan Asghar, Qinwen Hu, Sherali Zeadally
No ratings yet
Computer Networks: Muhammad Rizwan Asghar, Qinwen Hu, Sherali Zeadally
16 pages
A Smart Digital Twin Enabled Security Framework For Vehicle-to-Grid Cyber-Physical Systems
No ratings yet
A Smart Digital Twin Enabled Security Framework For Vehicle-to-Grid Cyber-Physical Systems
14 pages
IoT Network Attack Detection Using Supervised Machine Learning
No ratings yet
IoT Network Attack Detection Using Supervised Machine Learning
15 pages
Adaptive Hierarchical Cyber Attack Detection and Localization in Active Distribution Systems
No ratings yet
Adaptive Hierarchical Cyber Attack Detection and Localization in Active Distribution Systems
12 pages
Looking Back To Look Forward - Lessons Learnt From Cyber-Attacks On Industrial
No ratings yet
Looking Back To Look Forward - Lessons Learnt From Cyber-Attacks On Industrial
14 pages
Resilient Control of Cyber-Physical Systems Against Denial-of-Service Attacks
No ratings yet
Resilient Control of Cyber-Physical Systems Against Denial-of-Service Attacks
6 pages
A Novel Approach For Enhancing Cyber Resiliency in Distance Relay Using PCA and Random Forest
No ratings yet
A Novel Approach For Enhancing Cyber Resiliency in Distance Relay Using PCA and Random Forest
5 pages
Sensors 23 02415
No ratings yet
Sensors 23 02415
18 pages
Formato de Excel Modelo para Revision de Literatura
No ratings yet
Formato de Excel Modelo para Revision de Literatura
11 pages
1 s2.0 S2452414X2300122X Main
No ratings yet
1 s2.0 S2452414X2300122X Main
13 pages
Theories of Evolution
No ratings yet
Theories of Evolution
17 pages
Shayri
No ratings yet
Shayri
15 pages
USAF Squadron Facilities Design Guide
No ratings yet
USAF Squadron Facilities Design Guide
27 pages
CS 2336 Discrete Mathematics: Counting: Permutations and Combinations
No ratings yet
CS 2336 Discrete Mathematics: Counting: Permutations and Combinations
29 pages
ASD 89 The Sheet Is Designed For Calculating The Unity Ratio of A Member Subjecting To Moment and Axial Load.
No ratings yet
ASD 89 The Sheet Is Designed For Calculating The Unity Ratio of A Member Subjecting To Moment and Axial Load.
7 pages
MOS Installation of PUBLIC ADDRES Systems
No ratings yet
MOS Installation of PUBLIC ADDRES Systems
15 pages
03 - G01 Voltage Supply and Bus Systems
100% (1)
03 - G01 Voltage Supply and Bus Systems
52 pages
Statistik English
No ratings yet
Statistik English
16 pages
Final + Sol - Spring 2023
No ratings yet
Final + Sol - Spring 2023
11 pages
2 - My Favourite Season
No ratings yet
2 - My Favourite Season
3 pages
Online Notebook - by Slidesgo
No ratings yet
Online Notebook - by Slidesgo
9 pages
The Book of Love and Creation A Channeled Text Multiformat Download
100% (17)
The Book of Love and Creation A Channeled Text Multiformat Download
17 pages
Surface Area
No ratings yet
Surface Area
1 page
FANUC PICTURE Specification (Edition 8.0 or Later)
No ratings yet
FANUC PICTURE Specification (Edition 8.0 or Later)
758 pages
Timestamp Enrollment No Name Father Name Gender
No ratings yet
Timestamp Enrollment No Name Father Name Gender
12 pages
19xr Impeller
No ratings yet
19xr Impeller
1 page
Installation Instructions LLC7813
No ratings yet
Installation Instructions LLC7813
2 pages
Unit 1 Assessment Template
No ratings yet
Unit 1 Assessment Template
2 pages
GE2 - Exercise 2.1 Juvine Ramos
No ratings yet
GE2 - Exercise 2.1 Juvine Ramos
4 pages
Navier Solution
No ratings yet
Navier Solution
81 pages
Sindhu Rudianto - PDF - Wiratman Wangsadinata .PDF - Ellen M. Rathje - Makalah
No ratings yet
Sindhu Rudianto - PDF - Wiratman Wangsadinata .PDF - Ellen M. Rathje - Makalah
10 pages
Physics Grade 10 Unit 4 Summarized Note
No ratings yet
Physics Grade 10 Unit 4 Summarized Note
24 pages
Q3 - Periodical Test MUSIC
No ratings yet
Q3 - Periodical Test MUSIC
2 pages
Utf-8virtual 20lesson 20plan 20template
No ratings yet
Utf-8virtual 20lesson 20plan 20template
3 pages
(Ebook PDF) Retailing: Integrated Retail Management, 3rd Edition Instant Download
100% (2)
(Ebook PDF) Retailing: Integrated Retail Management, 3rd Edition Instant Download
45 pages
Sripura GP
No ratings yet
Sripura GP
6 pages
Workbook Unit 11 and 12 by Cristofer
No ratings yet
Workbook Unit 11 and 12 by Cristofer
22 pages
Quiz 2
No ratings yet
Quiz 2
8 pages
The First Quarterly Assessment Results of Grade 2
No ratings yet
The First Quarterly Assessment Results of Grade 2
13 pages

Toward Detection and Attribution of Cyber-Attacks in Iot-Enabled Cyber-Physical Systems


Fig. 1. Proposed attack detection and attribution framework (blocks: Attack Detection Method, Attack Attribution Method).

matrix of hnormal which is part of the features show how the

SWaT Dataset Pipeline Dataset

Model NMRI CMRI MSCI MPCI MFCI DoS Recon.


SWaT Dataset Pipeline Dataset

Accuracy Precision Recall f-measure

TABLE V

and capable of detecting previously unseen attacks. The attack

Fig. 4. Interaction of the proposed framework in a real environment.
