
Federated Deep Learning-Based Intrusion Detection Approach For Enhancing Privacy in Fog-IoT Networks

The document presents a federated deep learning-based intrusion detection system (FDL-IDS) designed to enhance security in fog-IoT networks, addressing vulnerabilities of IoT devices. The proposed system utilizes a decentralized approach to allow local model training while preserving data privacy, thus improving detection capabilities against cyber-attacks. Performance evaluations demonstrate the effectiveness of the FDL-IDS in accurately classifying network traffic while minimizing communication overhead and ensuring scalability.


2023 10th International Conference on Internet of Things: Systems, Management and Security (IOTSMS)

Federated Deep Learning-based Intrusion Detection Approach for Enhancing Privacy in Fog-IoT Networks

2023 10th International Conference on Internet of Things: Systems, Management and Security (IOTSMS) | 979-8-3503-9389-7/23/$31.00 ©2023 IEEE | DOI: 10.1109/IOTSMS59855.2023.10325826

Bensaid Radjaa (1), Labraoui Nabila (2), Haythem Bany Salameh (3)
(1) STIC Laboratory, Abou Bekr Belkaid University, Tlemcen, Algeria
(2) LRI Laboratory, Abou Bekr Belkaid University, Tlemcen, Algeria
(3) College of Engineering, Al Ain University, Al Ain, UAE

Abstract: The Internet of Things (IoT) revolution has led to a proliferation of connected devices. However, these IoT devices face inherent limitations, such as limited computing power, storage capacity, and battery life, which make them susceptible to misuse and exploitation. Attackers exploit these vulnerabilities to compromise IoT devices and create botnets that threaten fog-IoT networks. Therefore, developing effective cyber-attack detection mechanisms, such as Machine Learning (ML)-based Intrusion Detection Systems (IDSs), is imperative to safeguard fog-IoT infrastructures. However, conventional ML approaches often require centralized data storage on a single server or in the cloud, leading to concerns regarding data confidentiality, communication overhead, and energy consumption. This paper addresses this issue by leveraging IDS-based anomaly detection to prevent cyber attacks on IoT networks. Specifically, we propose using Federated Deep Learning (FDL) across a fog-based IDS architecture that utilizes the Long Short-Term Memory (LSTM) model and the BoT-IoT dataset. Our solution adopts a local learning approach, allowing devices to acquire knowledge from others by sharing only model updates without exposing their data. By adopting the FDL approach, the detection model demonstrates comparable (slightly improved) performance relative to existing centralized deep learning while preserving data privacy.

Index Terms: Fog-IoT network, IDS-based anomaly detection, Federated Deep Learning.

I. INTRODUCTION
The rapid growth of wireless communication technology has accelerated the massive deployment of the Internet of Things (IoT), resulting in a vast number of interconnected physical devices and sensors forming IoT applications [1]. This network enables seamless information sharing with cloud or fog computing systems over the Internet, improving data exchange and connectivity [2]. Fog computing involves a diverse range of interconnected intelligent objects capable of sensing, collecting, and processing data for tasks such as analysis, control, monitoring, and real-time decision-making [3]. However, integrating fog computing with IoT raises security and privacy concerns due to IoT device resource limitations within the fog network. Consequently, these vulnerabilities expose the network to attacks, including hacking techniques that compromise sensitive data or disrupt fog services [4]. Implementing robust security measures becomes imperative to address these security concerns. In this regard, Intrusion Detection Systems (IDS) play a crucial role in mitigating these risks by actively identifying and stopping potential attacks, thereby enhancing the overall security of fog computing environments [5].

There are two primary types of IDS: signature-based and anomaly-based. Signature-based IDSs rely on predefined attack patterns or signatures. A main limitation of these methods is their inability to detect zero-day attacks, which exploit new or previously unknown vulnerabilities [6]. Anomaly-based methods, on the other hand, leverage Machine Learning (ML) techniques, including Deep Learning (DL), to create a model by analyzing the usual behavior and characteristics of the system, identifying deviations as abnormalities [7]. However, conventional ML approaches often require centralized data storage on a single server or in the cloud, leading to concerns regarding data confidentiality, communication overhead, and energy consumption [8]. Federated Learning (FL) is a promising technique to address these limitations, enabling knowledge exchange while preserving privacy and reducing expenses [9]. By empowering devices to train a unified model collaboratively while keeping training data on the device, FL separates machine learning capabilities from the need for centralized data storage [10]. As a result, this technique can significantly mitigate privacy and security risks.

This paper proposes a federated deep learning-based Intrusion Detection System (FDL-IDS) for the Fog-IoT Network (FIN), facilitating collaboration across edge devices to securely exchange data and deliver robust attack detection in IoT-based smart city applications. Furthermore, the proposed detection system detects anomalies by classifying the network traffic as benign or malicious. It attempts to achieve high accuracy, low false positives, and low communication costs while remaining flexible and scalable for IoT environments. The main contributions of this work are as follows.

• This paper proposes a fog-IoT intrusion detection system (IDS) model to mitigate attacks on fog-IoT systems. A lightweight detection model addresses the challenges of limited memory and computational resource constraints on edge devices. Additionally, Long Short-Term Memory (LSTM) networks are chosen due to their ability to handle diverse datasets and distributions without restrictions
and their strong performance on the heterogeneous data commonly found in IoT devices.
• A federated deep learning-based fog-IoT IDS architecture is introduced, providing a solution to issues such as single points of failure in centralized architectures. This architecture also ensures the privacy of locally trained data, an essential requirement for many IoT applications. Furthermore, deploying the detection model at the fog layer, in proximity to potential attack origins, leads to faster detection response times and decreases the overhead on the cloud infrastructure.
• The proposed framework is evaluated on the real-world BoT-IoT dataset [11], which contains various cyber-attacks, including DoS and DDoS. Evaluation metrics such as accuracy, precision, recall, and F1 score are used to assess the effectiveness of the distributed framework for detecting cyber-attacks. The framework is also compared with the centralized architecture to demonstrate the superiority of the proposed approach.

The remainder of this paper is structured as follows. Section II introduces the proposed cyber security intrusion detection model for FIN, utilizing federated knowledge. Section III presents the performance evaluation results. Finally, in Section IV, we provide concluding remarks.

II. PROPOSED MODEL
This section provides an overview of FL-based IDS for the fog-IoT network. Subsequently, we outline our proposed network architecture, followed by a description of our proposed model.

A. Federated Learning-based IDS for Fog-IoT Network
Federated Learning (FL) is a distributed machine learning method that builds a global model by combining insights from multiple devices through communication rounds. Its primary objective is to enhance the detection of anomalies or attacks at their source [12]. FL, coupled with distributed fog computing, allows for the containment of infected areas without disrupting the overall system, resulting in improved performance. It enables edge devices to collaborate, optimizing anomaly detection even with limited local data, particularly for extensive datasets. This approach can enhance system responsiveness during attacks. FL's decentralized nature aligns well with IoT device architecture, effectively addressing resource constraints. IoT sensors can perform local model training, conserving bandwidth and reducing latency in resource-constrained environments [13]. FL achieves scalability by aggregating model updates centrally, accommodating a large number of IoT devices without overloading the central server [14]. Furthermore, FL's collaborative model training preserves local data distributions, enhancing detection performance, especially in diverse IoT settings [15].

B. Network Model Architecture
To effectively address cyber-attack concerns in IoT networks, it is essential to incorporate attack prevention into IoT systems through fog computing. Therefore, this paper introduces a distributed fog-based defensive intrusion detection system (IDS) for cyberattacks. The method uses FDL to detect anomalies by classifying network traffic as benign or malicious. The proposed architecture for cyberattack prevention includes three layers: the Cloud layer, the Fog network layer, and the Things layer:
• The Cloud layer provides infrastructure and resources but does not directly participate in federated learning.
• The Fog network layer implements an intelligent IDS using fog computing and comprises two sub-layers:
– The Centralized Fog-Server processes data at the network edge, utilizing FDL for enhanced IDS capabilities: the global model is sent to selected clients, and their updates, computed on local data, are aggregated to update the global model.
– The Distributed Fog-Nodes, located at the network edge, act as intermediaries between the IoT devices and the fog server. They participate in the federated learning process as clients, training the model locally using their own data to prevent sensitive information exposure to a centralized server. Furthermore, fog nodes, including routers, gateways, switches, etc., possess higher processing power, memory, and connectivity than individual IoT devices, making them well-suited for local model training. In addition, their proximity to the network's edge facilitates efficient data processing and enhances the overall security performance.
• The Things layer consists of the IoT devices responsible for collecting and uploading data. This contributes to the FL process, enabling better anomaly detection within the distributed fog-based IDS.

C. The Proposed Model
This study presents an FDL-IDS to detect cyber attacks in fog-IoT networks effectively. Specifically, fog computing is well suited to deploy training intelligence, given its abundant data and improved communication [16]. Our FDL-IDS is integrated into a fog server, which oversees client registrations, computes the federated model, and stores it securely. Moreover, we improve intrusion detection capabilities through collaborative learning between the fog nodes ("clients") and the fog server ("central server") to classify IoT network traffic as malicious or normal, as demonstrated in Fig. 1. Furthermore, leveraging fog nodes for local anomaly detection utilizes their computational resources efficiently. In addition, FL enables collaborative, decentralized model training without sharing local data, only aggregated updates. This allows the nodes to improve the global model while retaining privacy. The fog server broadcasts a model to selected clients, which train it on their local data and return only the updates. The server aggregates these to improve the global model iteratively. Our goal is an efficient, accurate, and scalable IDS using federated learning's adaptable, privacy-preserving, and collaborative capabilities.


Fig. 1: The proposed model.

TABLE I: Simulation parameters.

Classifier  Parameter                Configuration value
LSTM        Input layer              7
LSTM        Hidden node              64
LSTM        Hidden layer             2
LSTM        Output                   1
LSTM        Batch size               32
LSTM        Local epoch              20
LSTM        Global epoch             5
LSTM        Learning rate            0.001
LSTM        Activation function      ReLU
LSTM        Classification function  Sigmoid
LSTM        Optimizer                Adam
LSTM        Loss function            BinaryCrossentropy
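As an illustration, the configuration in Table I could be expressed in Keras roughly as follows. This is a sketch under stated assumptions, not the authors' code: "Hidden layer 2" and "Hidden node 64" are read as two stacked LSTM layers of 64 units each, and each flow record is treated as a sequence of one time step (`TIMESTEPS = 1`), since the paper does not state the sequence length. The 10% dropout and the early-stopping settings come from Section III-A rather than from the table.

```python
import tensorflow as tf

N_FEATURES = 7   # "Input layer" in Table I
TIMESTEPS = 1    # assumption: one time step per flow record (not stated in the paper)


def build_lstm_ids() -> tf.keras.Model:
    """Binary LSTM classifier following the values listed in Table I."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(TIMESTEPS, N_FEATURES)),
        # Assumption: two hidden LSTM layers with 64 units each.
        tf.keras.layers.LSTM(64, activation="relu", dropout=0.1,
                             return_sequences=True),
        tf.keras.layers.LSTM(64, activation="relu", dropout=0.1),
        # Single sigmoid output: probability that the traffic is an attack.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss=tf.keras.losses.BinaryCrossentropy(),
        metrics=["accuracy"],
    )
    return model


# Early stopping as described in Section III-A (patience of 3 on validation loss).
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3)

model = build_lstm_ids()
```

Training would then call `model.fit(...)` with `batch_size=32`, `epochs=20`, and `callbacks=[early_stop]` on arrays shaped `(samples, TIMESTEPS, N_FEATURES)`. The variable names for those arrays and the size of the validation split are left open here because the paper does not specify them.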

• For the federated learning approach, we assume that all communication and aggregation occur securely within the fog server and the fog nodes, ensuring the data's confidentiality, integrity, and availability while facilitating secure data sharing among authorized entities.

III. PERFORMANCE EVALUATION
To investigate the effectiveness of our proposed approach, the following section details the experimental setup and methodologies used for rigorous evaluations.

A. Experiments Setup
This section discusses the tools used to establish experiments for FDL-based IDS on fog-IoT network systems. The experiments are carried out on a Windows 10 operating system, utilizing a system with an Intel i3 processor and 8 GB of RAM. The Python programming language is employed, along with deep learning libraries such as Keras [17] and TensorFlow [18], for training and testing the FDL model. Our evaluation considers two distinct use cases, the Centralized learning approach and the FDL approach, both applied to binary classification tasks. In the Centralized learning approach, data is centralized at a single location, and we employ an RNN DL classifier, specifically LSTM. Conversely, to attain high detection accuracy while safeguarding users' privacy in the FDL approach, we use an IID (Independent and Identically Distributed) data distribution among 10 clients (k=10), employing communication rounds and an aggregation server that combines the models from these clients. We utilize the same LSTM classifier as in the Centralized approach. The parameters utilized in the DL models for the centralized and FDL approaches are presented in Table I. Additionally, to prevent overfitting and obtain a reliable performance estimate, we use the following methods:
• Dropout: During training, 10% of the input units in the LSTM layer are randomly set to 0 at each update.
• Early stopping: Training is stopped if the performance on the validation set stops improving (patience=3, monitor=val_loss).
• K-fold cross-validation: We use k = 5 to evaluate the model's performance by splitting the available data into k subsets or folds.

B. Experiment Analysis
In this section, we present an analysis of the dataset that we have used and discuss the preprocessing of these data.
1) BoT-IoT dataset: In our experimental assessments, we adopt the BoT-IoT dataset, which originates from the Cyber Range Lab at UNSW Canberra [11]. This dataset was created within an authentic network environment, encompassing both normal and botnet traffic. To simplify the training and testing procedures, we used a smaller subset consisting of 5% of the original dataset, which the dataset's creators provided. This subset comprises 19 features and includes various cyberattacks such as DDoS, DoS, reconnaissance, and theft attacks. We excluded theft attacks due to their limited number of samples, which could lead to an imbalanced dataset.
2) Data Preprocessing: We present the data processing and analysis framework for our realistic cyber-security dataset encompassing IoT applications. This framework involves the following steps. We utilize two datasets: one for training and the other for testing. For binary 'attack' classification, we perform label encoding. Furthermore, to mitigate overfitting, we discard certain extraneous flow features, specifically 'pkSeqID', 'seq', 'saddr', 'sport', and 'daddr'. Additionally, we incorporate feature correlation analysis and remove 'max'. To tackle class imbalance, we implement synthetic minority over-sampling and under-sampling techniques. The feature importance obtained from the random forest analysis used in our experiment is presented in Table II.
We utilize StandardScaler as a preprocessing technique to transform the data, ensuring a mean of 0 and a standard deviation of 1. This process is commonly called 'standardization' or 'z-score' normalization.

C. Evaluation Metrics
When analyzing IDS performance, the most commonly employed metrics include the following:


• True Negatives (TN): Legitimate traffic correctly identified as normal.
• True Positives (TP): Malicious traffic accurately recognized as an attack.
• False Positives (FP): Normal traffic mistakenly classified as malicious traffic.
• False Negatives (FN): Malicious traffic incorrectly labeled as normal traffic.

TABLE II: Description of features selected using random forest.

Feature             Description
Mean                Average duration of aggregated records
Min                 Minimum duration of aggregated records
Stddev              Standard deviation of aggregated records
State number        Numerical representation of feature state
Srate               Source-to-destination packets per second
N_IN_Conn_P_DstIP   Number of inbound connections per destination IP

Furthermore, we utilize various measures to assess our proposed model's effectiveness, including precision, recall, F-score, and accuracy. These metrics facilitate a systematic comparative analysis with other relevant approaches commonly used in IDS. The performance metrics are summarized below:
1) Accuracy is a measure that quantifies the proportion of instances correctly classified among the total number of observed samples. It is computed as follows:
$$\mathrm{accuracy} = \frac{tp + tn}{tp + tn + fp + fn} \tag{1}$$
2) Precision is a measure that quantifies the proportion of instances classified as positive (attack) that are actually positive. It is calculated as follows:
$$\mathrm{precision} = \frac{tp}{tp + fp} \tag{2}$$
3) Recall refers to the proportion of actual positive samples that are correctly identified, which is determined using the equation:
$$\mathrm{recall} = \frac{tp}{tp + fn} \tag{3}$$
4) The F1-score is the harmonic mean of precision and recall, which is determined using the following formula:
$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{4}$$

Fig. 2: Confusion matrix of the binary classification. (a) Deep Learning. (b) Federated Deep Learning.

D. Evaluation Results
The classification performance of our model is depicted by the confusion matrix in Fig. 2, which summarizes the accurate and erroneous predictions achieved through the proposed approach. The primary goal is to reduce the false positive and false negative rates. The proposed model has effectively achieved this objective, resulting in false positive rates of 0.37% and 0.35% for DL and FDL, respectively, and false negative rates of 0.41% for DL and 0.17% for FDL. It is noteworthy that FDL outperforms the centralized DL model in these metrics.
The Receiver Operating Characteristic (ROC) curve provides a means to assess and compare the performance of classification models. It presents a visual representation and a comprehensive evaluation of their ability to differentiate between positive and negative classes. The ROC curve depicts the trade-off between the False Positive Rate (1 − specificity) and the True Positive Rate (sensitivity). In our case, Fig. 3 shows that our model achieves a ROC score of 99.03% with FDL, followed by 98.8% with DL.
Fig. 4 shows that FDL achieves an accuracy of 99.47%, with higher accuracy, precision, and recall than regular DL. Furthermore, FDL achieves an F1 score of 99.66%, combining precision and recall and showing its solid predictive capability as an IDS.
Simulation results demonstrate that under IID data distribution, FDL can achieve comparable accuracy to centralized DL models, as local devices see similar patterns and global aggregation approximates full dataset training. However, FDL maintains significant advantages over centralized deep learning regarding privacy protection, efficiency, robustness, and flexibility, addressing limitations like privacy risks and network latency faced by centralized DL approaches.
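The metrics of Section III-C, together with the false positive and false negative rates reported in Section III-D, can be computed directly from the four confusion-matrix counts. The sketch below follows Eqs. (1) to (4). The function name `ids_metrics`, the zero-denominator guard, and the counts in the usage note are illustrative assumptions and are not taken from the paper's experiments.

```python
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def ids_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute IDS evaluation metrics from confusion-matrix counts."""
    accuracy = _ratio(tp + tn, tp + tn + fp + fn)            # Eq. (1)
    precision = _ratio(tp, tp + fp)                          # Eq. (2)
    recall = _ratio(tp, tp + fn)                             # Eq. (3)
    f1 = _ratio(2 * precision * recall, precision + recall)  # Eq. (4)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Rates discussed in Section III-D.
        "false_positive_rate": _ratio(fp, fp + tn),
        "false_negative_rate": _ratio(fn, fn + tp),
    }
```

For example, with made-up counts `tp=90, tn=80, fp=20, fn=10`, the function gives an accuracy of 0.85, a recall of 0.9, a false positive rate of 0.2, and a false negative rate of 0.1. Note that the false negative rate is always `1 - recall`, so a model cannot improve one without the other.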


Fig. 3: ROC curve of the binary classification. (a) Deep Learning. (b) Federated Deep Learning.

Fig. 4: Evaluation metrics.

IV. CONCLUSION
This paper presented the implementation of an intelligent IDS based on anomaly detection. The primary goal is to enhance security within the fog-IoT environment by introducing binary classification in the IDS to distinguish between legitimate and malicious network traffic. The system utilizes an FDL model to actively monitor the fog-IoT network traffic. This paper assessed the effectiveness of FDL compared to centralized DL (non-federated learning) methods. Specifically, we used Long Short-Term Memory (LSTM) and the IID data distribution within the BoT-IoT dataset. This investigation's findings highlighted the FDL approach's superiority over the centralized DL method. It demonstrated significant advancements in preserving the privacy of IoT device data and achieving high accuracy in detecting potential attacks.

REFERENCES
[1] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, “Internet of things (iot): A vision, architectural elements, and future directions,” Future generation computer systems, vol. 29, no. 7, pp. 1645–1660, 2013.
[2] K. C. Okafor, I. E. Achumba, G. A. Chukwudebe, G. C. Ononiwu et al., “Leveraging fog computing for scalable iot datacenter using spine-leaf network topology,” Journal of Electrical and Computer Engineering, vol. 2017, 2017.
[3] A. V. Dastjerdi, H. Gupta, R. N. Calheiros, S. K. Ghosh, and R. Buyya, “Fog computing: Principles, architectures, and applications,” in Internet of things. Elsevier, 2016, pp. 61–75.
[4] Y. I. Alzoubi, V. H. Osmanaj, A. Jaradat, and A. Al-Ahmad, “Fog computing security and privacy for the internet of thing applications: State-of-the-art,” Security and Privacy, vol. 4, no. 2, p. e145, 2021.
[5] A. Khraisat, I. Gondal, P. Vamplew, and J. Kamruzzaman, “Survey of intrusion detection systems: techniques, datasets and challenges,” Cybersecurity, vol. 2, no. 1, pp. 1–22, 2019.
[6] P. Garcia-Teodoro, J. Diaz-Verdejo, G. Maciá-Fernández, and E. Vázquez, “Anomaly-based network intrusion detection: Techniques, systems and challenges,” computers & security, vol. 28, no. 1-2, pp. 18–28, 2009.
[7] S. Naseer, Y. Saleem, S. Khalid, M. K. Bashir, J. Han, M. M. Iqbal, and K. Han, “Enhanced network anomaly detection based on deep neural networks,” IEEE access, vol. 6, pp. 48231–48246, 2018.
[8] O. A. Wahab, A. Mourad, H. Otrok, and T. Taleb, “Federated machine learning: Survey, multi-level classification, desirable criteria and future directions in communication and networking systems,” IEEE Communications Surveys & Tutorials, vol. 23, no. 2, pp. 1342–1397, 2021.
[9] Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 2, pp. 1–19, 2019.
[10] D. C. Nguyen, M. Ding, Q.-V. Pham, P. N. Pathirana, L. B. Le, A. Seneviratne, J. Li, D. Niyato, and H. V. Poor, “Federated learning meets blockchain in edge computing: Opportunities and challenges,” IEEE Internet of Things Journal, vol. 8, no. 16, pp. 12806–12825, 2021.
[11] N. Koroniotis, N. Moustafa, E. Sitnikova, and B. Turnbull, “Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset,” Future Generation Computer Systems, vol. 100, pp. 779–796, 2019.
[12] S. Agrawal, S. Sarkar, O. Aouedi, G. Yenduri, K. Piamrat, M. Alazab, S. Bhattacharya, P. K. R. Maddikunta, and T. R. Gadekallu, “Federated learning for intrusion detection system: Concepts, challenges and future directions,” Computer Communications, 2022.
[13] A. Imteaj, K. Mamun Ahmed, U. Thakker, S. Wang, J. Li, and M. H. Amini, “Federated learning for resource-constrained iot devices: panoramas and state of the art,” Federated and Transfer Learning, pp. 7–27, 2022.
[14] D. Huba, J. Nguyen, K. Malik, R. Zhu, M. Rabbat, A. Yousefpour, C.-J. Wu, H. Zhan, P. Ustinov, H. Srinivas et al., “Papaya: Practical, private, and scalable federated learning,” Proceedings of Machine Learning and Systems, vol. 4, pp. 814–832, 2022.
[15] S. AbdulRahman, H. Tout, H. Ould-Slimane, A. Mourad, C. Talhi, and M. Guizani, “A survey on federated learning: The journey from centralized to distributed on-site learning and beyond,” IEEE Internet of Things Journal, vol. 8, no. 7, pp. 5476–5497, 2020.
[16] M. Mukherjee, M. Guo, J. Lloret, and Q. Zhang, “Leveraging intelligent computation offloading with fog/edge computing for tactile internet: Advantages and limitations,” IEEE Network, vol. 34, no. 5, pp. 322–329, 2020.
[17] “Keras.io,” https://keras.io, 2023.
[18] M. Abadi, “Tensorflow: learning functions at scale,” in Proceedings of the 21st ACM SIGPLAN international conference on functional programming, 2016, pp. 1–1.
