Federated Deep Learning-Based Intrusion Detection Approach For Enhancing Privacy in Fog-IoT Networks
2023 10th International Conference on Internet of Things: Systems, Management and Security (IOTSMS)
979-8-3503-9389-7/23/$31.00 ©2023 IEEE

Abstract—The Internet of Things (IoT) revolution has led to a proliferation of connected devices. However, these IoT devices face inherent limitations, such as limited computing power, storage capacity, and battery life. This makes them susceptible to misuse and exploitation. Attackers exploit these vulnerabilities to compromise IoT devices and create botnets that threaten fog-IoT networks. Therefore, developing effective cyber-attack detection mechanisms, such as Machine Learning (ML)-based Intrusion Detection Systems (IDSs), is imperative to safeguard fog-IoT infrastructures. However, conventional ML approaches often require centralized data storage on a single server or in the cloud, leading to concerns regarding data confidentiality, communication overhead, and energy consumption. This paper addresses this issue by leveraging IDS-based anomaly detection to prevent cyber attacks on IoT networks. Specifically, we propose using Federated Deep Learning (FDL) across a fog-based IDS architecture that utilizes the Long Short-Term Memory (LSTM) model and the BoT-IoT dataset. Our solution adopts a local learning approach, allowing devices to acquire knowledge from others by sharing only model updates without exposing their data. By adopting the FDL approach, the detection model achieves performance comparable to (and slightly better than) existing centralized deep learning while preserving data privacy.
Index Terms—Fog-IoT network, IDS-based anomaly detection, Federated Deep Learning.

I. INTRODUCTION
The rapid growth of wireless communication technology has accelerated the massive deployment of the Internet of Things (IoT), resulting in a vast number of interconnected physical devices and sensors forming IoT applications [1]. This network enables seamless information sharing with cloud or fog computing systems over the Internet, improving data exchange and connectivity [2]. Fog computing involves a diverse range of interconnected intelligent objects capable of sensing, collecting, and processing data for tasks such as analysis, control, monitoring, and real-time decision-making [3]. However, integrating fog computing with IoT raises security and privacy concerns due to IoT device resource limitations within the fog network. Consequently, these vulnerabilities expose the network to attacks, including hacking techniques that compromise sensitive data or disrupt fog services [4]. Implementing robust security measures becomes imperative to address these security concerns. In this regard, Intrusion Detection Systems (IDSs) play a crucial role in mitigating these risks by actively identifying and stopping potential attacks, thereby enhancing the overall security of fog computing environments [5]. Moreover, there are two primary types of IDS: signature-based and anomaly-based. Signature-based IDSs rely on predefined attack patterns or signatures. However, a main limitation of these methods is their inability to detect zero-day attacks, which exploit new or previously unknown vulnerabilities [6]. On the other hand, anomaly-based methods leverage Machine Learning (ML) techniques, including Deep Learning (DL), to create a model by analyzing the usual behavior and characteristics of the system, identifying deviations as abnormalities [7]. However, conventional ML approaches often require centralized data storage on a single server or in the cloud, leading to concerns regarding data confidentiality, communication overhead, and energy consumption [8]. Federated Learning (FL) is a promising technique to address these limitations, enabling knowledge exchange while preserving privacy and reducing costs [9]. By empowering devices to train a unified model collaboratively while keeping training data on the device, FL separates machine learning capabilities from the need for centralized data storage [10]. As a result, this technique can significantly mitigate privacy and security risks.
This paper proposes a federated deep learning-based Intrusion Detection System (FDL-IDS) for the Fog-IoT Network (FIN), facilitating collaboration across edge devices to securely exchange data and deliver robust attack detection in IoT-based smart city applications. Furthermore, the proposed detection system detects anomalies by classifying the network traffic as benign or malicious. It attempts to achieve high accuracy, low false positives, and low communication costs while remaining flexible and scalable for IoT environments. The main contributions of this work are as follows.
• This paper proposes a fog-IoT intrusion detection system (IDS) model to mitigate attacks on fog-IoT systems. A lightweight detection model addresses the challenges of limited memory and computational resources on edge devices. Additionally, long short-term memory (LSTM) networks are chosen due to their ability to handle diverse datasets and distributions without restrictions
and their strong performance on the heterogeneous data commonly found in IoT devices.
• A federated deep learning-based fog-IoT IDS architecture is introduced, providing a solution to issues such as single points of failure in centralized architectures. This architecture also ensures the privacy of locally trained data, an essential requirement for many IoT applications. Furthermore, deploying the detection model at the fog layer, in proximity to potential attack origins, leads to faster detection response times and decreases the overhead on the cloud infrastructure.
• The proposed framework is evaluated on the real-world BoT-IoT dataset [11], which contains various cyber-attacks, including DoS and DDoS. Evaluation metrics such as accuracy, precision, recall, and F1 score are used to assess the effectiveness of the distributed framework for detecting cyber-attacks. The framework is also compared with the centralized architecture to demonstrate the superiority of the proposed approach.
The remainder of this paper is structured as follows. Section II introduces the proposed cyber security intrusion detection model for FIN, utilizing federated knowledge. Section III presents the performance evaluation results. Finally, in Section IV, we provide concluding remarks.

II. PROPOSED MODEL
This section provides an overview of FL-based IDS for the fog-IoT network. Subsequently, we outline our proposed network architecture, followed by a description of our proposed model.

A. Federated Learning-based IDS for Fog-IoT Network
Federated Learning (FL) is a distributed machine learning method that builds a global model by combining insights from multiple devices through communication rounds. Its primary objective is to enhance the detection of anomalies or attacks at their source [12]. FL, coupled with distributed fog computing, allows for the containment of infected areas without disrupting the overall system, resulting in improved performance. It enables edge devices to collaborate, optimizing anomaly detection even with limited local data, particularly for extensive datasets. This approach can enhance system responsiveness during attacks. FL's decentralized nature aligns well with IoT device architecture, effectively addressing resource constraints. IoT sensors can perform local model training, conserving bandwidth and reducing latency in resource-constrained environments [13]. FL achieves scalability by aggregating model updates centrally, accommodating a large number of IoT devices without overloading the central server [14]. Furthermore, FL's collaborative model training preserves local data distributions, enhancing detection performance, especially in diverse IoT settings [15].

B. Network Model Architecture
To effectively address cyber-attack concerns in IoT networks, it is essential to incorporate attack prevention into IoT systems through fog computing. Therefore, this paper introduces a novel approach, which presents a distributed fog-based defensive intrusion detection system (IDS) for cyberattacks. The method uses FDL to detect anomalies by classifying network traffic as benign or malicious. The proposed architecture for cyberattack prevention includes three layers: the Cloud layer, the Fog network layer, and the Things (edge) layer:
• The Cloud layer provides infrastructure and resources but does not directly participate in federated learning.
• The Fog network layer implements an intelligent IDS using fog computing and comprises two sub-layers:
– The Centralized Fog-Server processes data at the network edge, utilizing FDL for enhanced IDS capabilities. It sends the global model to selected clients, aggregates the updates they compute on their local data, and updates the global model.
– The Distributed Fog-Nodes, located at the network edge, act as intermediaries between the IoT devices and the fog server. They participate in the federated learning process as clients, training the model locally using their own data to prevent sensitive information exposure to a centralized server. Furthermore, fog nodes, including routers, gateways, switches, etc., possess higher processing power, memory, and connectivity than individual IoT devices, making them well-suited for local model training. In addition, their proximity to the network's edge facilitates efficient data processing and enhances the overall security performance.
• The Things layer consists of the IoT devices responsible for collecting and uploading data. This contributes to the FL process, enabling better anomaly detection within the distributed fog-based IDS.

C. The Proposed Model
This study presents an FDL-IDS to detect cyber attacks in fog-IoT networks effectively. Specifically, fog computing is well suited to deploy training intelligence, given its abundant data and improved communication [16]. Our FDL-IDS is integrated into a fog server, which oversees client registrations, computes the federated model, and stores it securely. Moreover, we improve intrusion detection capabilities through collaborative learning between the fog nodes (clients) and the fog server (central server) to classify IoT network traffic as malicious or normal, as demonstrated in Fig. 1. Furthermore, leveraging fog nodes for local anomaly detection utilizes their computational resources efficiently. In addition, FL enables collaborative, decentralized model training without sharing local data, only aggregated updates. This allows the nodes to improve the global model while retaining privacy. The fog server broadcasts a model to selected clients, which train it on their local data and return only the updates. The server aggregates these to improve the global model iteratively. Our goal is an efficient, accurate, and scalable IDS using federated learning's adaptable, privacy-preserving, and collaborative capabilities.
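The broadcast, local-training, and aggregation loop described in Section II-C can be sketched as follows. This is a minimal illustration, not the paper's implementation: the source does not name its aggregation rule, so the sketch assumes FedAvg-style weighting by local sample count. The function names (`fed_avg`, `local_update`, `run_round`), the toy client data, and the tiny linear model standing in for the LSTM are all hypothetical.

```python
def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample count
    (assumed FedAvg-style rule; the paper does not specify one)."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def local_update(global_weights, local_data, lr=0.1):
    """Hypothetical stand-in for local LSTM training: one gradient step of a
    least-squares fit y ~ w0 + w1 * x, using only this client's data."""
    w0, w1 = global_weights
    g0 = g1 = 0.0
    for x, y in local_data:
        err = (w0 + w1 * x) - y
        g0 += err
        g1 += err * x
    n = len(local_data)
    return [w0 - lr * g0 / n, w1 - lr * g1 / n]


def run_round(global_weights, clients):
    """One communication round: broadcast, train locally, aggregate."""
    updates = [local_update(global_weights, data) for data in clients]
    sizes = [len(data) for data in clients]
    return fed_avg(updates, sizes)  # raw data never leaves the clients


# Toy fog nodes holding private samples of y = 1 + 2x (illustrative only).
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
weights = [0.0, 0.0]
for _ in range(500):
    weights = run_round(weights, clients)
```

Only the weight vectors cross the network in this sketch; the server sees each client's update and sample count, never the samples themselves.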
• For the federated learning approach, we assume that all communication and aggregation occur securely within the fog server and the fog nodes, ensuring the data's confidentiality, integrity, and availability while facilitating secure data sharing among authorized entities.

III. PERFORMANCE EVALUATION
To investigate the effectiveness of our proposed approach, the following section details the experimental setup and methodologies used for rigorous evaluations.

A. Experiments Setup
This section discusses the tools used to establish experiments for FDL-based IDS on fog-IoT network systems. The experiments are carried out on a Windows 10 machine with an Intel i3 processor and 8 GB of RAM. The Python programming language is employed, along with deep learning libraries such as Keras [17] and TensorFlow [18], for training and testing the FDL model. The evaluation covers two use cases, the Centralized Learning approach and the FDL approach, both applied to binary classification. In the Centralized learning approach, data is centralized at a single location, and we employ an RNN-based DL classifier, specifically LSTM. Conversely, to attain high detection accuracy while safeguarding users' privacy in the FDL approach, we use an IID (Independent and Identically Distributed) data distribution among 10 clients (k=10), employing communication rounds and an aggregation server that combines the models from these clients. We utilize the same LSTM classifier as in the Centralized approach. The parameters used in the DL models for the centralized and FDL approaches are presented in Table I. Additionally, to prevent overfitting, we use the following methods:
• Dropout: During training, 10% of the input units in the LSTM layer are randomly set to 0 at each update.
• Early stopping: With patience=3 and monitor=val_loss, training is stopped if the performance on the validation set stops improving.
• K-fold cross-validation: We use 5-fold cross-validation, splitting the available data into five subsets (folds), to evaluate the model's performance.

B. Experiment Analysis
In this section, we present an analysis of the dataset that we have used and discuss the preprocessing of these data.
1) BoT-IoT dataset: In our experimental assessments, we adopt the BoT-IoT dataset, which originates from the Cyber Range Lab at UNSW Canberra [11]. This dataset was created within an authentic network environment, encompassing both normal and botnet traffic. To simplify the training and testing procedures, we used a smaller subset consisting of 5% of the original dataset, which the dataset's creators provided. This subset comprises 19 features and includes various cyberattacks such as DDoS, DoS, reconnaissance, and theft attacks. We excluded theft attacks due to their limited number of samples, which could lead to an imbalanced dataset.
2) Data Preprocessing: We present the data processing and analysis framework for our realistic cyber-security dataset encompassing IoT applications. We utilize two datasets: one for training and the other for testing. For binary 'attack' classification, we perform label encoding. Furthermore, to mitigate overfitting, we discard certain extraneous flow features, specifically 'pkSeqID', 'seq', 'saddr', 'sport', and 'daddr'. Additionally, we incorporate feature correlation analysis and remove 'max'. To tackle class imbalance, we implement synthetic minority over-sampling and under-sampling techniques. The feature importance obtained from the random forest analysis used in our experiment is presented in Table II.
We utilize StandardScaler as a preprocessing technique to transform the data, ensuring a mean of 0 and a standard deviation of 1. This process is commonly called 'standardization' or 'z-score' normalization.

C. Evaluation Metrics
When analyzing IDS performance, the most commonly employed metrics include the following:
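The paper names accuracy, precision, recall, and F1 score as its evaluation metrics (Section I). Their standard definitions are given below as a reference sketch. The notation is an assumption rather than the paper's own: TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, with the attack class taken as positive.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
\text{F1}        &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```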
Features             Description
Mean                 Average duration of aggregated records
Min                  Minimum duration of aggregated records
Stddev               Standard deviation of aggregated records
State_number         Numerical representation of feature state
Srate                Source-to-destination packets per second
N_IN_Conn_P_DstIP    Number of inbound connections per destination IP
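The feature-dropping and z-score standardization steps of the data preprocessing (Section III-B) can be illustrated with a short sketch. The paper uses scikit-learn's StandardScaler; the pure-Python version below is a simplified stand-in under that assumption, and the function names and the sample records are hypothetical, not taken from the BoT-IoT data.

```python
import statistics

# Flow features the paper discards, plus 'max' (removed after correlation analysis).
DROPPED = {"pkSeqID", "seq", "saddr", "sport", "daddr", "max"}


def drop_features(rows):
    """Return copies of the records without the discarded columns."""
    return [{k: v for k, v in row.items() if k not in DROPPED} for row in rows]


def standardize(rows, label_key="attack"):
    """Z-score every feature column, (x - mean) / std, leaving the label as-is.
    Expects numeric features only, so call drop_features first."""
    features = [k for k in rows[0] if k != label_key]
    stats = {}
    for f in features:
        values = [row[f] for row in rows]
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)  # population std, as StandardScaler uses
        stats[f] = (mean, std if std > 0 else 1.0)  # constant column: scale by 1
    return [
        {k: (v - stats[k][0]) / stats[k][1] if k in stats else v
         for k, v in row.items()}
        for row in rows
    ]


# Hypothetical toy records, for illustration only.
sample = [
    {"pkSeqID": 1, "saddr": "10.0.0.1", "mean": 1.0, "srate": 5.0, "attack": 0},
    {"pkSeqID": 2, "saddr": "10.0.0.2", "mean": 3.0, "srate": 5.0, "attack": 1},
]
cleaned = standardize(drop_features(sample))
```

In a federated setting the scaling statistics would normally be fitted on training data only and then reused for the test set; the sketch fits and transforms one list for brevity.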