Received 13 January 2024, accepted 18 March 2024, date of publication 22 March 2024, date of current version 2 April 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3380816

A Self-Attention-Based Deep Convolutional Neural Networks for IIoT Networks Intrusion Detection
MOHAMMED S. ALSHEHRI 1, OUMAIMA SAIDANI 2, FATMA S. ALRAYES 2, SAADULLAH FAROOQ ABBASI 3, AND JAWAD AHMAD 4 (Senior Member, IEEE)
1 Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia
2 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
3 Department of Electronic, Electrical and Systems Engineering, University of Birmingham, B15 2TT Birmingham, U.K.
4 School of Computing, Engineering and the Built Environment, Edinburgh Napier University, EH10 5DT Edinburgh, U.K.

Corresponding author: Saadullah Farooq Abbasi ([email protected])

The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the research groups
funding program grant code (NU/RG/SERC/12/3). Princess Nourah bint Abdulrahman University Researchers Supporting Project number
(PNURSP2024R319), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

ABSTRACT The Industrial Internet of Things (IIoT) comprises a variety of systems, smart devices, and
an extensive range of communication protocols. Hence, these systems face susceptibility to privacy and
security challenges, making them prime targets for malicious attacks that can result in harm to the overall
system. Privacy breach issues are a notable concern within the realm of IIoT. Various intrusion detection
systems based on machine learning (ML) and deep learning (DL) have been introduced to detect malicious
activities within these networks and identify attacks. However, traditional ML and DL models encounter
significant hurdles when faced with highly imbalanced training data and repetitive patterns within network
datasets, hampering their performance in distinguishing between various classes of attacks. To overcome
the challenges inherent in existing systems, this paper presents a self-attention-based deep convolutional
neural network (SA-DCNN) model designed for monitoring the IIoT networks and detecting malicious
activities. The SA mechanism computes the significance value for each input feature, and the DCNN
processes these parameters to detect IIoT network behavior. Additionally, a two-step cleaning method has
been implemented to eliminate redundancy within the training data, considering both intra-class and cross-
class samples. Furthermore, to tackle the issue of underfitting, we have employed a mutual information-
based feature filtering method. This method ranks all the features in descending order based on their mutual
information and subsequently removes the features with negative impact from the dataset. The performance
of the SA-DCNN model is assessed using IoTID20 and Edge-IIoTset datasets. Moreover, the proposed study
is demonstrated through a comprehensive comparison with other ML and DL models, as well as against
relevant studies, showcasing the superior performance and efficacy of the proposed model.

INDEX TERMS Attention mechanism, CNN, deep learning, IIoT, intrusion detection.

The associate editor coordinating the review of this manuscript and approving it for publication was Rupak Kharel.
© 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 12, 2024, page 45762
M. S. Alshehri et al.: Self-Attention-Based DCNN for IIoT Networks Intrusion Detection

I. INTRODUCTION
The Industrial Internet of Things (IIoT) is an interlinked network of smart devices, sensors, and machines employed within industrial environments for gathering, sharing, and analyzing information [1], [2]. Its primary objectives include elevating operational efficiency, facilitating predictive maintenance, optimizing processes, and enhancing overall productivity across various industries, including manufacturing, energy, transportation, and healthcare [3], [4], [5], [6]. The environment of the IIoT is characterized by a variety of systems, smart devices, and an extensive range of
communication protocols [7], [8], [9]. Hence, these systems face susceptibility to privacy and security challenges, making them prime targets for malicious attacks that can result in harm to the overall system. Privacy breach issues are a notable concern within the realm of IIoT [10], [11], [12]. Figure 1 illustrates a cyberattack scenario wherein a botnet is employed to initiate a distributed denial of service (DDoS) attack on an industrial IoT network, with a specific focus on industrial servers.

FIGURE 1. A cyber attack scenario on an IIoT network.

For the security of the IIoT network, numerous researchers and experts have introduced various intrusion detection systems (IDS) designed to identify cyber-attacks within these networks [13], [14], [15], [16]. Machine learning (ML) and deep learning (DL)-based IDSs play a crucial role in identifying malicious attacks due to their generalization capabilities, enabling them to learn from network datasets and recognize previously unseen patterns [17], [18], [19], [20]. The existing ML and DL-based models demonstrate satisfactory performance when only a limited number of attack types must be identified [21]. However, their effectiveness diminishes as the number of classes increases, especially when confronted with highly imbalanced training data. Additionally, certain network datasets contain repetitive data, leading to inflated model performance on those specific datasets, as the model has encountered much of the test set data during training. Moreover, the performance of ML and DL-based models decreases when they are confronted with datasets that include repetitions of similar data across various classes, where only the class labels are different.

To address these challenges, this study proposes a self-attention DL method for the prediction of intrusions in IIoT networks, along with preprocessing steps to prepare data for the model. The proposed model consists of self-attention (SA) and deep convolutional neural networks (DCNN). The SA computes the significance value for each input attribute [22], and DCNN processes these parameters to detect IIoT network behavior. The primary advantage of DCNN is its ability to converge inputs toward the most impactful parameters and reduce the overall number of parameters [23], [24]. This process enhances detection performance while minimizing time consumption. Additionally, the preprocessing steps involve cleaning, numericalization, feature filtering, and normalization. The cleaning step encompasses two sub-processes. Firstly, instances with undefined and missing values are removed. Next, duplication is removed from the datasets. The dataset is scanned for duplication within an attack class, and such duplicates are eliminated. Furthermore, the dataset is examined across all classes to identify duplicate instances where only the attack label is changed, and these duplications are removed from all the attack classes. Moreover, we employ the mutual information method for feature filtering. This method ranks features in descending order and removes those features that negatively impact the model, leading to underfitting.

The performance of the SA-DCNN model is assessed using two real-time IoT and IIoT network intrusion detection datasets, namely IoTID20 and Edge-IIoTset. Various evaluation metrics, including precision, recall, F1-score, and accuracy, are employed to assess the performance. Furthermore, to validate the proposed method's performance, it is compared with several other machine learning (ML) and deep learning (DL) models, as well as with findings from related articles. The major contributions of this article are outlined as follows:

• A novel DL-based IDS called SA-DCNN is introduced for the prediction of intrusions in IIoT networks. This model comprises a self-attention mechanism and the DCNN model. The self-attention mechanism is utilized to compute the significance of each input value, while DCNN processes these values to detect network behaviors.
• In this study, a two-step cleaning process is implemented. The first step involves removing instances with empty and undefined values, while the second step aims to eliminate duplications from the dataset. During the removal of duplications, both intra-class and cross-class duplications are addressed in the datasets.
• A feature filtering method is employed to rank all features in descending order and eliminate those that adversely affect the model's performance, potentially leading to underfitting. Specifically, the mutual information technique is employed for feature filtering, retaining only those features that positively impact the model.
• The effectiveness of the SA-DCNN method has been validated by comparing the outcomes with other ML and DL models. The other methods were implemented under the same experimental environment as the proposed model, and the preprocessing steps were consistent for all models, including the proposed SA-DCNN.

The remainder of the paper is structured as follows: Section II provides an overview of existing works. Section III delves into a detailed presentation of the proposed model. The methodology behind the proposal is expounded upon in Section IV. Section V encompasses a comprehensive discussion of the results, accompanied by a comparison of the SA-DCNN model with other methods. Lastly, Section VI serves as the concluding section for the entire paper.


II. RELATED WORK

The rapid expansion of the IIoT in industrial sectors
brings numerous benefits but also exposes vulnerabilities to
malicious attackers. Many researchers and experts have been
diligently working on improving security and have proposed
various methods for identifying malicious attacks within
these networks.
Authors in [25] present a DCNN model designed for
IoT network monitoring and the identification of malicious
activities. The DCNN model was applied to both category
and sub-category scenarios to discern the sub-class of attacks.
To evaluate the model’s performance, the authors utilized the
IoTID20 dataset. From the experiments, they achieved an
accuracy of 77.55% in detecting malicious activities.
In [26], the authors proposed a hybrid model combining
Convolutional Neural Network (CNN) and Long Short-Term
Memory (LSTM) for detecting intrusions in IoT network
scenarios. The primary emphasis of the authors is on enhanc-
ing the model’s performance, specifically concentrating on
identifying sub-categories of IoT network attacks. They
employed the Edge-IIoTset dataset to assess the performance
of the CNN-LSTM model. The results showcase a remarkable
98.69% accuracy in classifying attacks.
In [27], the authors introduced an Extreme Learning Machine, Support Vector Machine models, and a rule-based intrusion detection system similar to SNORT for IIoT networks. The performance of the proposed model was evaluated using the KDD99, UNSW-NB15, CSE-CIC-IDS-2018, and Edge-IIoTset datasets. They achieved accuracy rates of 97.83%, 96.59%, 92.54%, and 97.27% for the respective datasets.
In [28], the authors employed the LSTM model for monitoring Software-Defined Networking (SDN)-enabled IoT networks and detecting cyberattacks. The authors specifically concentrated on enhancing the accuracy of Low-Rate Distributed Denial of Service (LDDoS) detection. They utilized the Edge-IIoTset dataset to evaluate the models. The results presented in the paper demonstrate an impressive 98.88% accuracy in multi-class sub-category classification.
In [29], the authors introduced a hybrid model that combines a bidirectional gated recurrent unit (B-GRU) and LSTM for identifying cyber attacks in edge-envisioned smart agriculture networks. The authors focused specifically on enhancing the detection of DDoS attacks in these networks. They assessed the model using the Edge-IIoTset dataset, and the experimental outcomes revealed an impressive 98.32% accuracy.
The related studies predominantly concentrate on improving the performance of intrusion detection in IoT and IIoT networks. However, a common limitation in these studies is the oversight of data-cleaning procedures before the training phase. Specifically, they pay little attention to redundancies within the data belonging to the same class (intra-class) and neglect mixed data that recurs across different classes (inter-class). Some network datasets used in these studies exhibit repetitive data patterns, which can result in inflated model performance on those particular datasets. This is because the model may encounter much of the test set data during the training phase, leading to an overestimation of its effectiveness. Furthermore, ML and DL models tend to experience a decline in performance when confronted with datasets containing repetitions of similar data across various classes, even if only the class labels differ.

III. THE PROPOSED SA-DCNN MODEL
This study proposes a novel DL model called SA-DCNN for IIoT network traffic monitoring and detection of cyber-attacks. The SA-DCNN consists of a self-attention mechanism and deep convolutional neural networks (DCNN), as depicted in Fig 2. The self-attention mechanism computes the significance value for each input feature, and the DCNN processes these parameters to detect IIoT network behavior. The primary advantage of DCNN is its ability to converge inputs toward the most impactful parameters and reduce the overall number of parameters. This process enhances detection performance while minimizing time consumption.

FIGURE 2. Basic architecture of SA-DCNN.

In the proposed model, the self-attention mechanism is used to compute attention scores and highlight the importance of each input feature. This mechanism calculates the attention score based on queries (Q), keys (K), and values (V). Q, K, and V are computed using Eq 1, Eq 2, and Eq 3, respectively,
where X is the input and W is the learned weight.

$$Q = W_q \cdot X \tag{1}$$
$$K = W_k \cdot X \tag{2}$$
$$V = W_v \cdot X \tag{3}$$

Eq 4 is used to compute the attention score (AS), where $d_q$ is the length of Q. Subsequently, the attention value (AV) is computed using Eq 5.

$$AS = \frac{Q \cdot K^{T}}{\sqrt{d_q}} \tag{4}$$
$$AV = \mathrm{softmax}(AS) \cdot V \tag{5}$$

Once the attention value for each input feature is calculated by the SA mechanism, it is then fed into the DCNN layers. The primary advantage of a CNN model lies in its capability to effectively capture the significance of input parameters. Furthermore, CNN operates with fewer parameters compared to recurrent algorithms in deep learning, resulting in improved processing speed [30]. A typical CNN architecture consists of convolutional layers, pooling layers, and fully connected layers [31]. In the proposed SA-DCNN model, we used four convolutional layers, two max-pooling layers, a flattening layer, and three fully connected feedforward neural network (FFNN) layers. The convolutional layers are utilized to emphasize each parameter using a kernel, where the size of the kernel is three. Within this layer, the ReLU activation function is used. The convolutional operation is represented in Eq 6 and 7.

$$x_k = b_k + \sum_{i=1}^{N} (P_i, w_{ik}) \tag{6}$$
$$y_k = \max(0, x_k) \tag{7}$$

where $x_k$ denotes the input in the convolutional layer, while $P_i$ signifies the output of the preceding layer. $w_{ik}$ corresponds to the kernel spanning from index i to k, and $b_k$ denotes the bias associated with the neuron in the convolutional layer. The output of the convolutional layer is passed into the max-pooling layer, which selects the most significant parameters as expressed in Eq 8, where $M_k$ is the output of the max-pooling layer.

$$M_k = \max_{i \in \Re} y_k \tag{8}$$

The output of the max-pooling layer is forwarded to the flattening layer, which transforms it into a one-dimensional array. This array is then passed to the fully connected FFNN layers. The FFNN comprises three layers, with the first two layers being hidden layers utilizing the ReLU activation function. The final layer is dedicated to producing output probabilities, and for this purpose, the softmax activation function is employed, as expressed in Eq 9.

$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{K} e^{x_j}} \tag{9}$$

IV. THE PROPOSED APPROACH
This section offers a thorough exploration of the implemented approach as depicted in Figure 3, which highlights its key stages. The framework initiates with an in-depth analysis of the employed dataset, covering various preprocessing stages. Subsequently, the data undergoes stratified splitting into training and testing sets. Following these stages, the model proceeds through training and testing processes.

A. DATASETS
The IoTID20 and Edge-IIoTset are widely recognized and extensively used datasets in the research community. The IoTID20 dataset has been collected from home IoT networks to facilitate the detection of cyber attacks [32]. Its primary advantage stems from the inclusion of up-to-date communication data and innovative samples, enhancing the capability to detect network intrusions [33]. The dataset comprises a total of 625,783 samples, with 40,073 classified as normal and the remaining 585,710 categorized into four types of attacks. Furthermore, these four types of attacks are subdivided into eight sub-types. The Edge-IIoTset includes samples of IoT and IIoT network traffic collected from a testbed consisting of seven layers. Covering fourteen attacks associated with IoT and IIoT communication protocols [34], the Edge-IIoTset comprises a total of 2,219,201 samples. Among these, 1,615,643 samples are classified as normal, while the remaining 603,558 samples are related to 14 different attacks.

B. DATA PREPROCESSING
The preprocessing steps are important for readying the dataset for optimal compatibility with ML and DL models. This paper employs several preprocessing steps, encompassing data preparation, feature filtering, normalization, and the division of the dataset into train and test sets.

1) DATA PREPARATION
Data preparation is the initial step of preprocessing, involving two main methods: the first is cleaning, and the second is the conversion of categorical attributes into numerical format.

a: CLEANING
The data cleaning process consists of two sub-steps. In the first sub-step, we eliminate instances with undefined and 'Null' values from the dataset using the Pandas library in Python. In the second sub-step, we address duplication in the dataset by employing two methods. Initially, we remove duplications within the same class using the drop_duplicates function from the Pandas library. Subsequently, we eliminate duplications across different classes by comparing instances on all attributes apart from the classification label, so that records differing only in their label are caught. For this, we utilize the drop and duplicated functions of Pandas, leveraging indexes to facilitate the process.
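The two-step cleaning procedure can be illustrated with a small, dependency-free sketch. The paper performs these steps with Pandas; the plain-Python version below is only an assumption made to keep the example self-contained, and the names `clean_records`, `_is_missing`, `_features`, and the `label` key are hypothetical. It further assumes that cross-class duplicates are records whose attributes match while their labels differ, and that every such conflicting copy is dropped.

```python
import math


def _is_missing(value):
    """True for None, NaN floats, and empty or 'null'-like strings."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip().lower() in {"", "null", "nan"}


def _features(record, label_key):
    """Hashable view of a record that ignores the class label."""
    return tuple(sorted((k, v) for k, v in record.items() if k != label_key))


def clean_records(records, label_key="label"):
    """Apply missing-value removal, then intra-class and cross-class de-duplication."""
    # Sub-step 1: drop instances that contain undefined or missing values.
    complete = [r for r in records if not any(_is_missing(v) for v in r.values())]

    # Sub-step 2a: intra-class duplicates (same attributes, same label); keep the first.
    seen, unique = set(), []
    for r in complete:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # Sub-step 2b: cross-class duplicates (same attributes, different labels);
    # assumed here to be removed from every class they appear in.
    labels_by_features = {}
    for r in unique:
        labels_by_features.setdefault(_features(r, label_key), set()).add(r[label_key])
    return [r for r in unique if len(labels_by_features[_features(r, label_key)]) == 1]
```

Dropping all copies of a cross-class duplicate, rather than keeping one, reflects the reading that a record seen under two labels carries no usable class signal; keeping one copy would be an equally simple variant.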

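Returning to the model of Section III, the building blocks in Eqs. (1) to (9) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not give tensor shapes, so the sketch treats the input as a matrix with one row per token and uses the row convention Q = X·W_q; the softmax in Eq. (5) is assumed to be applied row-wise; and the convolution is shown for a single channel with one kernel and 'same' zero padding, whereas the paper uses 64 filters per layer. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(xs):
    """Eq. (9): numerically stable softmax over a list of numbers."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Eqs. (1)-(5), with x of shape (n_tokens, d_in) and weights (d_in, d)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)  # Eqs. (1)-(3)
    d_q = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_q) for s in row] for row in matmul(q, k_t)]  # Eq. (4)
    weights = [softmax(row) for row in scores]  # row-wise softmax (assumption)
    return matmul(weights, v)  # Eq. (5)


def conv1d_relu(signal, kernel, bias=0.0):
    """Eqs. (6)-(7): single-channel 'same'-padded convolution followed by ReLU."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out = []
    for i in range(len(signal)):
        x_k = bias + sum(w * padded[i + j] for j, w in enumerate(kernel))
        out.append(max(0.0, x_k))
    return out


def max_pool(signal, size=2):
    """Eq. (8): keep the largest activation in each non-overlapping window."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]
```

With zero query and key weights every attention score is equal, so each output row becomes the mean of the value rows, which gives a quick sanity check of the attention step.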

FIGURE 3. The proposed architecture block diagram.

b: FEATURES ENCODING
The datasets we employed contain numerous features in categorical form, necessitating conversion into numerical format for compatibility with DL models aimed at predicting network activity behaviors. To accomplish this, we opted for the label encoder method. This method assigns a unique numerical value to each category of values within an attribute, following an alphabetic order. We chose this approach due to its efficiency in terms of memory usage and processing power, as opposed to the one-hot encoder. The one-hot encoder, while effective, demands additional memory for the conversion of categorical features.

2) FEATURES FILTERING
In this experiment, we employed a feature filtering method to identify influential attributes within the dataset, while excluding features that have a negative impact on the classifier. The negative impact of certain attributes arises due to the amalgamation of data from different classes without providing discernible patterns. To filter the attributes in the utilized datasets, we employed the mutual information method, which demonstrates the impact of each feature and ranks them in descending order based on entropy. We selected all attributes with a value greater than 0.1, eliminating those with zero or near-zero impact values. Out of 83 attributes, 56 were chosen for the IoTID20 dataset, and for the Edge-IIoTset dataset, 29 out of 62 features were selected for the experiment.

3) NORMALIZATION
Normalization involves rescaling data to a standardized range. The performance of classifiers is impacted by features with diverse ranges. The utilized datasets encompass attributes with varying scales, necessitating normalization. In this experiment, we employ the min-max normalization technique to normalize features within the range of 0 to 1, as presented in Equation 10.

$$X_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{10}$$

4) STRATIFIED SPLIT
The stratified method is utilized to divide the data into training and testing sets while preserving the proportion of each class in both splits. In this instance, we applied the stratified approach to allocate 80% of the data to the training set and 20% to the test set.
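The remaining preprocessing steps (label encoding, mutual-information filtering, min-max normalization, and the stratified split) can be sketched without external libraries. This is an illustrative sketch under assumptions, not the authors' code: the function names are hypothetical, the mutual information is computed here for discrete feature values only (the paper does not say which estimator it used for continuous attributes), and the 0.1 threshold and 80/20 split are the values stated in the text.

```python
import math
import random
from collections import Counter, defaultdict


def label_encode(values):
    """Map each category to an integer, following alphabetical order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    count_x, count_y = Counter(feature), Counter(labels)
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def select_features(columns, labels, threshold=0.1):
    """Rank features by mutual information (descending) and keep those above threshold."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score > threshold]


def min_max(column):
    """Eq. (10): rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling the test share from each class separately."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)
```

Splitting per class is what keeps rare attack types represented in the test set; a plain random 80/20 split could leave the smallest classes out of one side entirely.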


TABLE 1. Performance assessment with various layers combination on IoTID20 category.

TABLE 2. Performance assessment with various layers combination on IoTID20 sub-category.

C. THE PROPOSED SA-DCNN HYPERPARAMETERS
In this experiment, we utilize various hyperparameters to achieve optimal performance. In all convolutional layers, 64 filters, a kernel size of 3, 'same' padding, and the ReLU activation function are utilized. The max-pooling layers employ a pool size of 2. The feedforward neural network comprises three layers. The first two layers are hidden layers with 64 and 32 hidden units, respectively, employing the ReLU activation function. The final layer of the feedforward neural network is the output layer of the model, where the softmax function is employed to produce probabilities for multi-class classification.
The sparse categorical cross-entropy function is employed for loss calculation, and the Adam optimizer is utilized to optimize weights during training. A batch size of 32 and 100 epochs have been selected for the IoTID20 dataset, aiming to facilitate an efficient and effective training process. For the Edge-IIoTset dataset, a batch size of 32 and 20 epochs have been chosen to achieve optimal training performance.

V. EXPERIMENTATION AND FINDINGS
This section primarily focuses on the experimental findings. Initially, it presents the evaluation metrics used in the experiments. Following that, a brief overview of the experimental system environment where all the experiments were conducted is provided. Subsequently, a detailed presentation of the outcomes of the proposed model is given, along with a comparison with other models and state-of-the-art articles. To assess the effectiveness of the proposed SA-DCNN model, we employed four evaluation metrics, namely accuracy, precision, recall, and F1-score.

A. IMPLEMENTATION ENVIRONMENT
Experiments were conducted on an HP desktop system equipped with a ninth-generation Core i9 CPU, a GeForce RTX 2080 GPU, and 32 GB of RAM. The Python 3.11 programming language, along with Jupyter Notebook, was employed for the implementation of classifiers. Various libraries, such as TensorFlow, Pandas, scikit-learn, and NumPy, were leveraged to support the implementation. It is noteworthy that all these tools were run on a Windows 11 Pro 64-bit operating system to ensure consistency and compatibility.

B. THE PROPOSED SA-DCNN OUTCOMES
In this section, we present the experimental outcomes of the proposed SA-DCNN with various hyperparameter variations for two scenarios: multi-class category and multi-class sub-category classifications, utilizing both datasets. Furthermore, to assess the efficacy of the SA-DCNN model, we conducted experiments with several other traditional ML and DL models in the same environment and compared the results with those of the proposed model. Additionally,
we compare the performance achieved by the proposed model with state-of-the-art articles on the same datasets to demonstrate the efficacy of the proposed model. The outcomes are validated through a fivefold cross-validation process.

FIGURE 4. Performance of the proposed SA-DCNN on IoTID20 category.
FIGURE 5. Performance of the proposed SA-DCNN on IoTID20 sub-category.
TABLE 3. Performance assessment with various layers combination on Edge-IIoTset category.

1) OUTCOMES WITH VARIOUS HIDDEN LAYERS ON IoTID20
As mentioned earlier, the datasets were divided into training and testing sets, with proportions of 80% and 20%, respectively. Following that, the model underwent training on the training set using various configurations of hidden layers. The analysis covered two scenarios: multi-class category and sub-category classification. Tables 1 and 2 provide a comprehensive analysis of the testing results for the proposed SA-DCNN model across various layer combinations. Upon evaluating the results, it becomes evident that the proposed SA-DCNN demonstrated optimal performance with four convolutional, two max-pooling, and three fully connected FFNN layers.
Additionally, Figures 4 and 5 depict the training and validation performance, serving as an evaluation of the proposed SA-DCNN model for potential overfitting issues.

Accuracy and loss during each epoch were scrutinized for both the training and validation results. The visual analysis of the training and validation results reveals closely aligned performance, indicating that the proposed model did not demonstrate signs of overfitting.

2) OUTCOMES WITH VARIOUS HIDDEN LAYERS ON EDGE-IIoTSET
For the Edge-IIoTset dataset, a similar experimental setup was employed as for the IoTID20 dataset, with the data partitioned into 80% for training and 20% for testing. The SA-DCNN model underwent training on the training set with varying hidden layer configurations to explore its performance. The evaluation focused on multi-class category and sub-category classification scenarios. Tables 3 and 4 present a detailed examination of the testing outcomes across different layer combinations for the Edge-IIoTset dataset. Notably, the SA-DCNN model exhibited optimal performance when configured with four convolutional layers, two max-pooling layers, and three fully connected FFNN layers. To further assess the model's generalization capability, Figures 6 and 7 illustrate the training and validation performance, which show no sign of overfitting. The alignment of accuracy and loss trends across epochs for both training and validation sets indicates the robustness of the proposed SA-DCNN architecture in handling the Edge-IIoTset dataset.

TABLE 4. Performance assessment with various layers combination on Edge-IIoTset sub-category.
TABLE 5. Results comparison with other models on IoTID20 category.
TABLE 6. Results comparison with other models on IoTID20 sub-category.
TABLE 7. Results comparison with other models on Edge-IIoTset category.

3) PERFORMANCE COMPARISON WITH OTHER ML AND DL MODELS
The effectiveness of the SA-DCNN model was affirmed through a comprehensive validation process, which involved comparing its outcomes with those of various cutting-edge methods. For comparison, traditional ML and sophisticated DL models were employed, encompassing the multi-layer perceptron (MLP), Gaussian naive Bayes (GNB), linear regression (LR), deep-autoencoder (DAE), LSTM, GRU, and CNN. It is noteworthy that these models were executed within the same environment, incorporating identical preprocessing steps as the proposed model. This approach ensured an equitable and meaningful assessment of their respective performances. All the implemented DL models utilized the sparse categorical cross-entropy loss function, employed the Adam optimizer, and were trained with a batch size of 32. The training phase of each model was iterated for 100 epochs
on the IoTID20 dataset and 20 epochs on the Edge-IIoTset dataset.

FIGURE 6. Performance of the proposed SA-DCNN on Edge-IIoTset category.
FIGURE 7. Performance of the proposed SA-DCNN on Edge-IIoTset sub-category.
TABLE 8. Results comparison with other models on Edge-IIoTset sub-category.
TABLE 9. Performance comparison with related articles.

a: PERFORMANCE COMPARISON ON IoTID20 DATASET
The comparative analysis of testing performance between the proposed SA-DCNN and alternative models is outlined in Table 5 for category classification and Table 6 for sub-category classification on the IoTID20 dataset. The examination of test results highlights the superior performance of the proposed model over other models.

b: PERFORMANCE COMPARISON ON EDGE-IIoTset DATASET
The comparison on the Edge-IIoTset dataset is presented in Table 7 and Table 8 for category and sub-category classification, respectively. Evaluation of the test results shows that the proposed model outperforms the other algorithms in detecting malicious activities within IIoT networks.

c: PERFORMANCE COMPARISON WITH RELATED ARTICLES
To evaluate the enhancement in the detection performance of the proposed study, encompassing both preprocessing and model performance, we conducted a comparative analysis with state-of-the-art articles related to the same dataset. Detailed results comparisons are presented in Table 9, showcasing an in-depth examination of the outcomes from other related articles and our study. The analysis of results demonstrates an improvement compared to existing studies,
45770 VOLUME 12, 2024

M. S. Alshehri et al.: Self-Attention-Based DCNN for IIoT Networks Intrusion Detection

highlighting its excellent capabilities in efficiently detecting malicious activities within IIoT networks.

VI. CONCLUSION
This paper introduces a self-attention-based deep convolutional neural network (SA-DCNN) model designed for monitoring IIoT networks and detecting malicious activities. Additionally, a two-step cleaning method has been implemented to eliminate redundancy within the training data, considering both intra-class and cross-class samples. The proposed method addresses the challenges of existing DL-based models and improves the detection of cyberattacks in IIoT networks. The performance of the SA-DCNN model is assessed using the IoTID20 and Edge-IIoTset datasets. The proposed model achieves 96.89% accuracy, 92.39% precision, 87.83% recall, and a 90.05% F1 score on the IoTID20 dataset. Additionally, it achieves 99.95% accuracy, 99.46% precision, 99.61% recall, and a 99.53% F1 score on the Edge-IIoTset dataset. These results are the best among the traditional ML and DL models compared. The other models were implemented in the same experimental environment, and the preprocessing steps were consistent for all models, including the proposed SA-DCNN. Furthermore, the outcomes of this study were compared with the results of other related articles, indicating the improved performance of the proposed study. In the future, the number of attack classes is expected to increase further as additional sub-categories of attacks are considered.

ACKNOWLEDGMENT
The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the research groups funding program grant code (NU/RG/SERC/12/3). This work was also supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R319), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

REFERENCES
[1] M. Abdel-Basset, V. Chang, H. Hawash, R. K. Chakrabortty, and M. Ryan, “Deep-IFS: Intrusion detection approach for industrial Internet of Things traffic in fog environment,” IEEE Trans. Ind. Informat., vol. 17, no. 11, pp. 7704–7715, Nov. 2021.
[2] M. M. Alani, “An explainable efficient flow-based industrial IoT intrusion detection system,” Comput. Electr. Eng., vol. 108, May 2023, Art. no. 108732.
[3] O. Friha, M. A. Ferrag, M. Benbouzid, T. Berghout, B. Kantarci, and K.-K.-R. Choo, “2DF-IDS: Decentralized and differentially private federated learning-based intrusion detection system for industrial IoT,” Comput. Secur., vol. 127, Apr. 2023, Art. no. 103097.
[4] P. Ruzafa-Alcázar, P. Fernández-Saura, E. Mármol-Campos, A. González-Vidal, J. L. Hernández-Ramos, J. Bernal-Bernabe, and A. F. Skarmeta, “Intrusion detection based on privacy-preserving federated learning for the industrial IoT,” IEEE Trans. Ind. Informat., vol. 19, no. 2, pp. 1145–1154, Feb. 2023.
[5] M. H. Jamal, M. A. Khan, S. Ullah, M. S. Alshehri, S. Almakdi, U. Rashid, A. Alazeb, and J. Ahmad, “Multi-step attack detection in industrial networks using a hybrid deep learning architecture,” Math. Biosci. Eng., vol. 20, no. 8, pp. 13824–13848, 2023.
[6] S. Li, G. Chai, Y. Wang, G. Zhou, Z. Li, D. Yu, and R. Gao, “CRSF: An intrusion detection framework for industrial Internet of Things based on pretrained CNN2D-RNN and SVM,” IEEE Access, vol. 11, pp. 92041–92054, 2023.
[7] Y. Zhang, C. Yang, K. Huang, and Y. Li, “Intrusion detection of industrial Internet-of-Things based on reconstructed graph neural networks,” IEEE Trans. Netw. Sci. Eng., vol. 10, no. 5, pp. 2894–2905, Sep./Oct. 2023.
[8] M. Mohy-eddine, A. Guezzaz, S. Benkirane, and M. Azrour, “An effective intrusion detection approach based on ensemble learning for IIoT edge computing,” J. Comput. Virol. Hacking Techn., vol. 19, no. 4, pp. 469–481, Dec. 2022.
[9] M. Nuaimi, L. C. Fourati, and B. B. Hamed, “Intelligent approaches toward intrusion detection systems for industrial Internet of Things: A systematic comprehensive review,” J. Netw. Comput. Appl., vol. 215, Jun. 2023, Art. no. 103637.
[10] M. Tanveer and S. Shabala, “Entangling the interaction between essential and nonessential nutrients: Implications for global food security,” in Plant Nutrition and Food Security in the Era of Climate Change. Amsterdam, The Netherlands: Elsevier, 2022, pp. 1–25.
[11] M. Tanveer, A. Badshah, A. U. Khan, H. Alasmary, and S. A. Chaudhry, “CMAF-IIoT: Chaotic map-based authentication framework for industrial Internet of Things,” Internet Things, vol. 23, Oct. 2023, Art. no. 100902.
[12] Q. Wang, X. Zhu, Y. Ni, L. Gu, and H. Zhu, “Blockchain for the IoT and industrial IoT: A review,” Internet Things, vol. 10, Jun. 2020, Art. no. 100081.
[13] E. Gyamfi and A. D. Jurcut, “Novel online network intrusion detection system for industrial IoT based on OI-SVDD and AS-ELM,” IEEE Internet Things J., vol. 10, no. 5, pp. 3827–3839, Mar. 2023.
[14] S. Ullah, W. Boulila, A. Koubaa, Z. Khan, and J. Ahmad, “ABDNN-IDS: Attention-based deep neural networks for intrusion detection in industrial IoT,” in Proc. IEEE 98th Veh. Technol. Conf. (VTC-Fall), Oct. 2023, pp. 1–5.
[15] N. W. Khan, M. S. Alshehri, M. A. Khan, S. Almakdi, N. Moradpoor, A. Alazeb, S. Ullah, N. Naz, and J. Ahmad, “A hybrid deep learning-based intrusion detection system for IoT networks,” Math. Biosci. Eng., vol. 20, no. 8, pp. 13491–13520, 2023.
[16] S. Aldhaheri and A. Alhuzali, “SGAN-IDS: Self-Attention-Based generative adversarial network against intrusion detection systems,” Sensors, vol. 23, no. 18, p. 7796, Sep. 2023.
[17] B. B. Zarpelão, R. S. Miani, C. T. Kawakani, and S. C. de Alvarenga, “A survey of intrusion detection in Internet of Things,” J. Netw. Comput. Appl., vol. 84, pp. 25–37, Apr. 2017.
[18] S. Ullah, M. A. Khan, J. Ahmad, S. S. Jamal, Z. E. Huma, M. T. Hassan, N. Pitropakis, and W. J. Buchanan, “HDL-IDS: A hybrid deep learning architecture for intrusion detection in the Internet of Vehicles,” Sensors, vol. 22, no. 4, p. 1340, Feb. 2022.
[19] T. Wang, J. Li, W. Wei, W. Wang, and K. Fang, “Deep-Learning-Based weak electromagnetic intrusion detection method for zero touch networks on industrial IoT,” IEEE Netw., vol. 36, no. 6, pp. 236–242, Nov. 2022.
[20] A. Heidari and M. A. Jabraeil Jamali, “Internet of Things intrusion detection systems: A comprehensive review and future directions,” Cluster Comput., vol. 26, no. 6, pp. 3753–3780, Dec. 2023.
[21] S. Ullah, W. Boulila, A. Koubâa, and J. Ahmad, “MAGRU-IDS: A multi-head attention-based gated recurrent unit for intrusion detection in IIoT networks,” IEEE Access, vol. 11, pp. 114590–114601, 2023.
[22] S. Ullah, J. Ahmad, M. A. Khan, M. S. Alshehri, W. Boulila, A. Koubaa, S. U. Jan, and M. M. I. Ch, “TNN-IDS: Transformer neural network-based intrusion detection system for MQTT-enabled IoT networks,” Comput. Netw., vol. 237, Dec. 2023, Art. no. 110072.
[23] I. Al-Turaiki and N. Altwaijry, “A convolutional neural network for improved anomaly-based network intrusion detection,” Big Data, vol. 9, no. 3, pp. 233–252, Jun. 2021.
[24] A. Aldweesh, A. Derhab, and A. Z. Emam, “Deep learning approaches for anomaly-based intrusion detection systems: A survey, taxonomy, and open issues,” Knowl.-Based Syst., vol. 189, Feb. 2020, Art. no. 105124.
[25] S. Ullah, J. Ahmad, M. A. Khan, E. H. Alkhammash, M. Hadjouni, Y. Y. Ghadi, F. Saeed, and N. Pitropakis, “A new intrusion detection system for the Internet of Things via deep convolutional neural network and feature engineering,” Sensors, vol. 22, no. 10, p. 3607, May 2022.
[26] A. Khacha, R. Saadouni, Y. Harbi, and Z. Aliouat, “Hybrid deep learning-based intrusion detection system for industrial Internet of Things,” in Proc. 5th Int. Symp. Informat. its Appl. (ISIA), Nov. 2022, pp. 1–6.
[27] P. Dini, A. Begni, S. Ciavarella, E. De Paoli, G. Fiorelli, C. Silvestro, and S. Saponara, “Design and testing novel one-class classifier based on polynomial interpolation with application to networking security,” IEEE Access, vol. 10, pp. 67910–67924, 2022.


[28] A. A. Alashhab, M. S. M. Zahid, A. Muneer, and M. Abdukkahi, “Low-rate DDoS attack detection using deep learning for SDN-enabled IoT networks,” Int. J. Adv. Comput. Sci. Appl., vol. 13, no. 11, pp. 1–7, 2022.
[29] D. Javeed, T. Gao, M. S. Saeed, and P. Kumar, “An intrusion detection system for edge-envisioned smart agriculture in extreme environment,” IEEE Internet Things J., early access, 2023, doi: 10.1109/JIOT.2023.3288544.
[30] G. Scarpa, M. Gargiulo, A. Mazza, and R. Gaetano, “A CNN-based fusion method for feature extraction from sentinel data,” Remote Sens., vol. 10, no. 2, p. 236, Feb. 2018.
[31] B. Riyaz and S. Ganapathy, “A deep learning approach for effective intrusion detection in wireless networks using CNN,” Soft Comput., vol. 24, no. 22, pp. 17265–17278, Nov. 2020.
[32] H. Kang, D. H. Ahn, G. M. Lee, J. D. Yoo, K. H. Park, and H. K. Kim, “IoT network intrusion dataset,” IEEE Dataport, Sep. 2019, doi: 10.21227/q70p-q449.
[33] I. Ullah and Q. H. Mahmoud, “A technique for generating a botnet dataset for anomalous activity detection in IoT networks,” in Proc. IEEE Int. Conf. Syst., Man, Cybern. (SMC), Ottawa, ON, Canada. Cham, Switzerland: Springer, Oct. 2020, pp. 508–520.
[34] M. A. Ferrag, O. Friha, D. Hamouda, L. Maglaras, and H. Janicke, “Edge-IIoTset: A new comprehensive realistic cyber security dataset of IoT and IIoT applications for centralized and federated learning,” IEEE Access, vol. 10, pp. 40281–40306, 2022.
[35] T. Shen, L. Ding, J. Sun, C. Jing, F. Guo, and C. Wu, “Edge computing for IoT security: Integrating machine learning with key agreement,” in Proc. 3rd Int. Conf. Consum. Electron. Comput. Eng. (ICCECE), Jan. 2023, pp. 474–483.

MOHAMMED S. ALSHEHRI received the B.S. degree in computer science from King Khalid University, Abha, Saudi Arabia, in 2010, the M.S. degree in computer science from the University of Colorado Denver, Denver, CO, USA, in 2014, and the Ph.D. degree in computer science with a concentration on information security from the University of Arkansas, Fayetteville, AR, USA, in 2021. He received a Graduate Certificate in cybersecurity from the University of Arkansas in 2020. His research interests include cybersecurity, computer networks, blockchain, machine learning, and deep learning.

OUMAIMA SAIDANI received the M.Sc. degree in computer sciences from Paris Dauphine University, France, and the Ph.D. degree in computer sciences from Paris 1-Panthéon Sorbonne University, France. She is currently an Assistant Professor with the Information Systems Department, College of Computer and Information Sciences (CCIS-IS), Princess Nourah bint Abdulrahman University (PNU), Saudi Arabia. Her research interests include information systems engineering, business process engineering, the IoT, context-aware computing, deep learning, and artificial intelligence.

FATMA S. ALRAYES received the M.Sc. degree in e-business and information systems from Newcastle University and the Ph.D. degree in computer science and informatics from Cardiff University. She is currently an Associate Professor with the Information Systems Department, College of Computer and Information Sciences (CCIS-IS), Princess Nourah bint Abdulrahman University (PNU), Saudi Arabia. Her research interests include privacy protection, usable privacy and security, privacy awareness, data science and analytics, and social web.

SAADULLAH FAROOQ ABBASI is currently an experienced Researcher with more than ten years of cutting-edge research and teaching experience in prestigious institutes, including the University of Birmingham, U.K.; the University of Glasgow, U.K.; Fudan University, Shanghai, China; and the National University of Science and Technology Islamabad, Pakistan. He has taught various courses at both undergraduate (UG) and postgraduate (PG) levels during his career. He has coauthored more than 20 research papers in international journals and peer-reviewed international conference proceedings. His research interests include machine learning, artificial intelligence, security, and biomedical engineering. He is an invited reviewer for numerous world-leading high-impact journals (reviewed more than 100 journal articles to date).

JAWAD AHMAD (Senior Member, IEEE) is a highly experienced Teacher with more than 13 years of teaching and research experience in prestigious institutes. He has taught at renowned institutions such as Edinburgh Napier University (U.K.), Glasgow Caledonian University (U.K.), and HITEC University Taxila (Pakistan). He has also served as a supervisor for several Ph.D., M.Sc., and bachelor’s students, providing guidance and support for their dissertations. He has published in renowned journals including IEEE Transactions, ACM Transactions, Elsevier, and Springer with over 180 research papers and 5000 citations (H-Index 40). For the past three years, his name has appeared on the list of the world’s top 2% scientists in Cybersecurity, as published by Clarivate (a list endorsed by Stanford University, USA). Furthermore, in 2020, he received the U.K. exceptional talent endorsement (‘Emerging Leader’) for pioneering work in the field of Cybersecurity and AI. To date, he has secured research and funding grants of more than £200K as a Principal Investigator (PI) and a Co-Investigator (Co-I). In terms of academic achievements, he has earned a Gold medal for his outstanding performance in M.S. and a Bronze medal for his achievements in B.S.