A Survey On Intrusion Detection System in IoT Networks

This survey examines intrusion detection systems (IDS) in Internet of Things (IoT) networks, highlighting the critical need for security as IoT expands. It analyzes contemporary IDS techniques, particularly those leveraging artificial intelligence and machine learning, and reviews various datasets, performance metrics, and challenges faced in this field. The study emphasizes the importance of developing efficient, lightweight IDS models suitable for resource-constrained devices and suggests future research directions to address existing challenges.


Cyber Security and Applications 3 (2025) 100082

Contents lists available at ScienceDirect

Cyber Security and Applications


journal homepage: http://www.keaipublishing.com/en/journals/cyber-security-and-applications/

A survey on intrusion detection system in IoT networks


Md Mahbubur Rahman a, Shaharia Al Shakil b,∗, Mizanur Rahman Mustakim a
a Comp. Sci. & Tech., Beijing Institute of Technology, Beijing, China
b Info. & Comm. Eng., Beijing Institute of Technology, Beijing, China

Article info

Keywords: Cyber-physical systems; Distributed denial-of-service; Internet of Things; Intrusion detection; Machine learning

Abstract

As the Internet of Things (IoT) expands, the security of IoT networks has become more critical. Intrusion Detection Systems (IDS) are essential for protecting these networks against malicious activities. Artificial intelligence, with its adaptive and self-learning capabilities, has emerged as a promising approach to enhancing intrusion detection in IoT environments. Machine learning facilitates dynamic threat identification, reduces false positives, and addresses evolving vulnerabilities. This survey provides an analysis of contemporary intrusion detection techniques, models, and their performances in IoT networks, offering insights into IDS design and implementation. It reviews data extraction techniques, useful matrices, and loss functions in IDS for IoT networks, ranking top-cited algorithms and categorizing IDS studies based on different approaches. The survey evaluates various datasets used in IoT intrusion detection, examining their attributes, benefits, and drawbacks, and emphasizes performance metrics and computational efficiency, providing insights into IDS effectiveness and practicality. Standardized evaluation metrics and real-world testing are stressed to ensure reliability. Additionally, the survey identifies significant challenges and open issues in ML and DL-based IDS for IoT networks, such as computational complexity and high false positive rates, and recommends potential research directions, emerging trends, and perspectives for future work. This forward-looking perspective aids in shaping the future direction of research in this dynamic field, emphasizing the need for lightweight, efficient IDS models suitable for resource-constrained IoT devices and the importance of comprehensive, representative datasets.

1. Introduction

The Internet of Things (IoT) is a paradigm that enables the interconnection and communication of diverse physical and virtual devices through the Internet. IoT networks serve several domains, including smart cities, health, agriculture, and transportation. However, IoT networks also face numerous security challenges, such as unauthorized access, information theft, denial of service, and malicious attacks. Accordingly, it is essential to design and deploy effective intrusion detection systems (IDS) for IoT networks to safeguard them from potential threats and ensure their reliability and availability. The number of IoT devices connected to the Internet, which include machines, sensors, and cameras, continues to grow steadily [96]. Another estimate, from the International Data Corporation (IDC), projects that there will be 41.6 billion connected IoT devices, generating 79.4 zettabytes (ZB) of data in 2025 [24].

An intrusion detection system (IDS) is a framework that observes network traffic and activity and identifies any irregular or malicious behavior that deviates from normal or expected patterns. Intrusion detection systems can be organized into two primary types: signature-based and anomaly-based techniques. Signature-based techniques use predefined rules or signatures to recognize known attacks, while anomaly-based techniques use statistical or machine learning methods to learn the characteristics of legitimate and malicious data during the training/offline phase and identify attacks in incoming traffic during the predicting/online phase. Signature-based techniques offer high accuracy, a low false positive rate, and faster runtime for known attacks; however, they cannot identify novel or unknown attacks, and they require continuous updates of the signature database. Anomaly-based techniques have the advantage of being able to identify new or unknown attacks, yet they suffer from high false-positive rates and high computational complexity.

Maintaining the security and performance of cyber-physical systems (CPS) on the Internet of Things (IoT) is vital, as these systems often control essential services and infrastructures. The increasing complexity and interconnectivity of these systems have led to a surge in sophisticated cyber-attacks, demanding advanced and flexible intrusion detection methodologies. Current developments in the use of artificial intelligence (AI) have shown promise in enhancing IDS capabilities.

Peer review under responsibility of KeAi Communications Co., Ltd.



∗ Corresponding author.
E-mail addresses: [email protected] (M.M. Rahman), [email protected] (S.A. Shakil), [email protected] (M.R. Mustakim).

https://doi.org/10.1016/j.csa.2024.100082
Received 14 May 2024; Received in revised form 19 September 2024; Accepted 17 December 2024
Available online 20 December 2024
2772-9184/© 2024 The Authors. Publishing Services by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open access article under the CC BY
license (http://creativecommons.org/licenses/by/4.0/)
M.M. Rahman, S.A. Shakil and M.R. Mustakim Cyber Security and Applications 3 (2025) 100082

Fig. 1. Overview of Intrusion Detection Systems for IoT Networks.

This advancement enables IoT systems to learn from previous attacks, improving the ability to detect and respond to a wide range of cyber threats. Key challenges in implementing IDS for IoT include dealing with data imbalance, selecting relevant features, managing model complexity, and ensuring adaptability to evolving attack vectors. For instance, Xu et al. (2023) introduced an IDS combining the Binary Grey Wolf Optimizer (BGWO) and Recursive Feature Elimination (RFE) for feature selection, the Synthetic Minority Over-sampling Technique (SMOTE) for data balancing, and XGBoost with Bayesian optimization for classification [96]. This system demonstrated superior results across multiple IoT datasets, achieving high accuracy in both binary and multiclass scenarios. Similarly, Sadhwani et al. [73] proposed a lightweight IDS tailored to the unique challenges of IoT networks, such as handling missing values, data standardization, and feature selection. Their system utilized multiple ML classifiers, including Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), Artificial Neural Networks (ANN), and k-Nearest Neighbors (k-NN), achieving near-perfect accuracy on the TON-IOT and BOT-IOT datasets. In another study, Hossain et al. [34] demonstrated the effectiveness of an ensemble-based machine learning approach, which outperformed traditional methods in terms of accuracy and false positive rate. Their method utilized Random Forest as the base classifier and incorporated various feature selection and ensemble strategies, such as correlation analysis, mutual information, principal component analysis (PCA), XGBoost, gradient boosting, bagging, stacking, and AdaBoost. The ongoing exploration of these methodologies highlights the need for continuous innovation in IDS for IoT. For example, Ngo et al. [61] compared different feature selection and extraction methods using the UNSW-NB15 dataset, finding that feature selection methods like Information Gain (IG) and feature correlation significantly improved detection performance and reduced training time. Meanwhile, Tekin et al. [86] investigated on-device ML algorithms for IoT intrusion detection, emphasizing the importance of energy-efficient models that can be deployed on resource-constrained devices. As outlined in Fig. 1, addressing these challenges requires continuous innovation and the development of more sophisticated IDS models that can operate efficiently in diverse and resource-constrained IoT environments.

The increasing difficulty of protecting Internet of Things networks is the motivation behind this research. With the rapid adoption of IoT devices across sectors like smart homes, healthcare, and industrial control systems, there is an urgent need for robust intrusion detection systems (IDS) that can effectively detect a variety of cyber-attacks while keeping false positives to a minimum. The diverse applications of IoT, each with its unique data types and attack vectors, require IDS frameworks that are flexible and capable of providing comprehensive protection against a wide range of threats.

Given the constantly changing landscape of IoT security, advancing IDS methodologies remains a crucial area of study. Leveraging machine learning (ML) and deep learning (DL) techniques, along with improved methods for feature selection and data preprocessing, offers a promising path toward developing more effective IDS solutions. This study aims to provide a broad overview of current approaches to intrusion detection in IoT networks, highlight key challenges like data imbalance, model interpretability, and the ability of IDS to adapt to new threats, and suggest directions for future research. Ultimately, the goal is to guide efforts in creating IDS that are better equipped to protect IoT environments against the ever-growing range of cyber threats.

The main contributions of this survey are as follows:

• This survey offers a thorough analysis of the latest intrusion detection techniques and models for IoT networks, evaluating their performance and providing a broad overview of the current landscape. It offers insights for developing and implementing the next generation of intrusion detection systems. The study also provides an extensive and up-to-date analysis of machine learning-based intrusion detection systems (IDS) for Internet of Things (IoT) networks, covering the concepts and real-world implementations.
• This survey comprehensively reviews data extraction techniques, useful matrices, and loss functions in IDS for IoT networks, ranking the top-cited algorithms. It categorizes IDS studies based on ML/DL models and study focus, including traditional ML models, ensemble-based models, neural networks, deep learning models, and hybrid approaches, summarizing methodologies, datasets, and performance. Additionally, it covers traditional ML models like Support Vector Machines and Naive Bayes, ensemble-based models like Random Forest and AdaBoost, and neural networks and DL models like CNNs and LSTMs. Furthermore, the study focus includes survey/literature reviews, lightweight/compact models, feature selection/extraction methods, and specific application areas.
• The study reviews the features, advantages, and disadvantages of several datasets used in IoT intrusion detection. It provides an overview of their characteristics, applications, and significance while comparing the advantages and disadvantages of current IoT security measures across different scenarios. Key findings and conclusions from recent studies are highlighted, detailing categories, benefits, drawbacks, and notable aspects of intrusion detection methodologies. The analysis emphasizes performance metrics and computational efficiency, offering insights into IDS effectiveness and practicality. Standardized evaluation metrics and real-world testing are stressed to ensure reliability. The survey also identifies IoT security challenges, highlighting the need for lightweight, efficient IDS models for resource-constrained devices.
• The study highlights key challenges and unresolved issues in the development of IDS for IoT networks, including constraints like computational complexity and high false positive rates. It also suggests potential research directions, emerging trends, and new perspectives for future work. This forward-looking approach aims to guide and inspire future research efforts in this rapidly evolving field.
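As a rough illustration of the two IDS families contrasted in the introduction, the sketch below pairs a signature lookup with a simple statistical anomaly detector. It is a minimal toy under stated assumptions, not a method from any surveyed study: the `SIGNATURES` table, the `signature_detect` function, the `ZScoreAnomalyDetector` class, and the packet-size values are all hypothetical, and the anomaly detector is fitted on benign traffic only, which is one common simplification.

```python
import statistics

# Hypothetical signature database: payload fragments of known attacks.
SIGNATURES = {
    "' OR 1=1 --": "sql_injection",
    "/bin/sh": "shell_injection",
}


def signature_detect(payload):
    """Signature-based check: return the label of the first matching rule, else None."""
    for pattern, label in SIGNATURES.items():
        if pattern in payload:
            return label
    return None


class ZScoreAnomalyDetector:
    """Anomaly-based check: flag values far from the mean of benign training data."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.mean = 0.0
        self.stdev = 1.0

    def fit(self, benign_values):
        # Training/offline phase: learn what normal traffic looks like.
        self.mean = statistics.fmean(benign_values)
        # Fall back to 1.0 so a constant training set does not divide by zero.
        self.stdev = statistics.pstdev(benign_values) or 1.0
        return self

    def is_anomalous(self, value):
        # Predicting/online phase: score incoming traffic against the learned profile.
        return abs(value - self.mean) / self.stdev > self.threshold


# Assumed benign packet sizes in bytes, for illustration only.
detector = ZScoreAnomalyDetector().fit([100, 110, 95, 105, 98, 102])
```

The signature check only fires on patterns already in the table, whereas the z-score detector can flag traffic it has never seen, at the cost of false positives whenever legitimate traffic drifts from the training profile.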


Fig. 2. Summary of articles selection process for IDS in IoT review.

2. Methodology

Methodologies applied to Intrusion Detection Systems (IDS) for Internet of Things (IoT) environments are diverse and constantly evolving, integrating strategies for extracting characteristics from data and using the most recent advances in different models. This section examines these approaches in detail, focusing on the algorithms and techniques employed for selecting features, preprocessing data, and training models, as well as the datasets commonly used for evaluation purposes.

2.1. Selections of research articles

The process shown in Fig. 2 follows a Systematic Literature Review (SLR) methodology, which is widely used in fields such as computer science, software engineering, and information systems. The SLR methodology provides a structured and rigorous approach to reviewing and synthesizing academic literature, ensuring that the review process is comprehensive, transparent, and reproducible.

The review process begins with gathering relevant studies from a wide range of sources, including academic databases, conference papers, and peer-reviewed journals, to ensure a comprehensive collection of existing research. Once these studies are collected, they undergo a screening phase, where they are filtered based on predefined criteria such as relevance, quality, and scope. This step ensures that only the most pertinent and high-quality studies are included. During this phase, any duplicate entries are also identified and removed to maintain accuracy and reduce bias.

Following the initial screening, the process continues with a refinement phase, where newly added records are examined, and any duplicates detected earlier are discarded. This step is crucial for upholding the integrity of the review by confirming that each study is unique and contributes valuable insights. The final stage involves data extraction and synthesis, where the key details from the selected studies (such as research methods, primary findings, and contributions) are compiled to present a summary of the current state of research in the field.

The review phase involves a detailed synthesis and critical analysis of the collected data. During this step, the studies are carefully evaluated to identify patterns, gaps, and key themes within the literature, and conclusions are drawn from the combined findings. This in-depth synthesis helps to clarify the overall state of research in the field, pinpointing important trends, existing challenges, and potential directions for future research. The SLR methodology, as shown in the figure, ensures that the review process is both structured and comprehensive, maintaining a high standard of rigor and transparency. By following SLR guidelines, researchers can create reviews that are methodical, replicable, and capable of offering meaningful insights into the specific research question or topic being examined.

2.2. Data extraction techniques

A varied array of data extraction techniques is critical for efficiently detecting and evaluating potential security risks while developing Intrusion Detection Systems (IDS) for Internet of Things (IoT) networks. These techniques include packet capture and analysis, which involves capturing and examining network traffic packets for anomalies using tools like Wireshark or Tcpdump (Sheikh et al. [78]); log file analysis, where IoT-generated logs are scrutinized to identify unusual patterns indicative of potential intrusions, often utilizing tools like Splunk and the ELK Stack (Sarker et al. [77]); and feature extraction, which focuses on selecting specific data attributes such as traffic volume, packet size, and protocol types that are relevant for detecting intrusions, with machine learning models employing these features to distinguish between normal and malicious activities (Gates & Taylor [28]). As shown in Fig. 3, flow data analysis, utilizing techniques like NetFlow or sFlow, provides insights into communication patterns between devices, helping detect unauthorized or abnormal network activity (Moustafa et al. [55]).

Data preprocessing and normalization are essential for handling missing values, normalizing data ranges, and encoding categorical variables, ensuring data consistency and improving model accuracy (Idowu et al. [36]). Temporal data analysis captures time-series data from IoT devices, which is vital for identifying patterns and anomalies over time (Jiang et al. [38]). Contextual data extraction involves capturing information like device types and deployment settings, which aids in understanding the specific characteristics and vulnerabilities of the IoT environment (Sheikh et al., 2020). Metadata extraction, including timestamps and geolocation, provides more insight into the characteristics and context of the studied material (Sarker et al. [77]). Deep packet inspection (DPI) examines the full content of data packets to detect hidden malware, command and control communications, and data exfiltration activities (Moustafa et al. [55]). Statistical feature extraction derives metrics like mean, variance, and entropy from raw data, helping identify deviations from normal patterns that indicate malicious activity (Idowu et al. [36]). Application layer data analysis looks for vulnerabilities such as SQL injection and cross-site scripting by analyzing application-layer protocols such as HTTP, DNS, and FTP (Jiang et al. [38]). Behavioral profiling creates profiles for normal behavior of users and devices, with significant deviations potentially indicating intrusions, such as insider threats or compromised devices (Sheikh et al. [78]). Encrypted traffic analysis, despite the rise in encryption, can still provide insights into potentially malicious activities by examining traffic patterns and metadata like packet size and timing (Gates & Taylor [28]). User and Entity Behavior Analytics (UEBA) analyzes user and entity behavior to spot anomalous activity, such as irregular login patterns or access to private information (Jiang et al. [38]). Multi-source data correlation combines data from multiple sources, such as network traffic, data logs, and endpoint sensors, and connects different events to recognize intricate attack patterns, creating a comprehensive picture for threat detection (Moustafa et al. [55]). Machine learning-based feature learning uses advanced approaches such as deep learning to automatically extract features from unprocessed data and identify intricate patterns and relationships that conventional methods may miss (Idowu et al. [36]). IoT device fingerprinting helps identify and categorize devices based on their network behavior, detecting unauthorized devices on the network (Sheikh et al. [78]). Anomaly detection in sensor data is crucial for identifying signs of device malfunctions or physical tampering (Gates & Taylor [28]). In network forensics, network traffic is gathered and examined to investigate previous security incidents and identify the source and scope of attacks (Moustafa et al. [55]). Finally, threat intelligence integration improves IDS capabilities by delivering real-time information on known threats, such as malicious IP addresses, phishing domains, and malware hashes (Jiang et al. [38]). Together, these data extraction techniques are essential for robust and reliable security in IoT networks.

Fig. 3. Overview of Data Extraction Techniques in IDS for IoT Networks.

3. Studies on intrusion detection systems (IDS) in IoT

The study of intrusion detection systems (IDS) in IoT has seen a variety of methodologies, classified according to the machine learning (ML) and deep learning (DL) models used, as well as specialized research focuses. Ensemble-based models like AdaBoost and Random Forest (RF) have been prominent in ensuring robust network security by integrating multiple algorithms, yielding high accuracy across various datasets such as WSN-DS and UNSW-NB15. Neural networks and DL models, including CNN-BiLSTM and LSTM, offer advanced feature extraction and classification capabilities, often achieving near-perfect accuracy on complex datasets like N-BaIoT and BoT-IoT. Traditional ML models, such as automated ML and Naïve Bayes, continue to be relevant, especially in scenarios requiring quick, reliable results with datasets like KDDcup99. As shown in Fig. 4, hybrid approaches blend ML and DL techniques, utilizing methods like SMOTE for data balancing, demonstrating improved detection rates across diverse datasets, including KDDCUP'99 and CIC-MalMem-2022.

The studies were also categorized based on research focus, which varies from feature selection and extraction to developing lightweight IDS models for real-time application. Specific applications, such as security in IoT Electric Vehicle Charging Stations, highlight the practical implications and challenges in deploying IDS in specialized IoT environments.

Fig. 4. Categorization of Studies on IDS in IoT.

3.1. Based on machine learning (ML), or deep learning (DL) models

In this analysis of the most recent research from the last several years, we examine machine learning-powered intrusion detection systems (IDS) designed to secure IoT devices. The studies discussed here demonstrate advanced machine learning approaches for creating efficient intrusion detection systems for a variety of IoT scenarios. To improve intrusion detection accuracy, investigators apply a range of methodologies, including ensemble models, deep learning, and feature selection. The data collected from IDS studies summarizes a variety of refined machine learning algorithms and datasets, representing cutting-edge advances in this domain and revealing a diligent exploration of methodologies to reinforce network security against malicious intrusions. The work of Xu et al. [96] stands out as an example of the effectiveness of ensemble-based methods, which have been the subject of numerous studies. The combination of CNN-BiLSTM, CANET, FNN-Focal, RFS-1, and XGBoost yielded a well-balanced IDS, as demonstrated by flawless evaluation metric results. In the same vein, Hossain et al. built an ensemble employing Random Forest, AdaBoost, and gradient boosting algorithms, reaching a standout accuracy of 99.42% in the domain of wireless sensor networks [34]. Notable is the frequent use of Naïve Bayes as an algorithmic cornerstone, as shown in studies such as Sadhwani et al. [73] and Vishwakarma et al. [92], showcasing its versatility across different settings. Furthermore, the adoption of automated machine learning (AML), as implemented by Xu et al. [97], demonstrates the paradigm shift toward automated methodologies, with an accuracy of 99.7% achieved in classifying IoT intrusions using the KDDcup99 dataset as


Table 1
Summary of Ensemble-Based Studies.

| References | Year | Proposed Methodology | ML/DL Models | Dataset | Performance |
|---|---|---|---|---|---|
| Xu et al. [97] | 2023 | A data-driven method for intrusion and anomaly detection for the IoT based on automated machine learning. | Automated machine learning | KDDcup99 | The proposed algorithm solves a multi-class classification problem with an accuracy of 99.7%, outperforming existing algorithms. |
| Ngo et al. [61] | 2023 | ML-based intrusion detection by feature selection and feature extraction. | Feature Selection, Feature Extraction, DT, RF, Kneighbors, MLP, Naive Bayes | UNSW-NB15 | Feature extraction is more reliable than feature selection, especially when the parameter K is small (such as 4). Among the five classifiers, the decision tree-based MLP is the best for increasing feature selection accuracy, while the neural network-based MLP is best for feature extraction. |
| Viegas et al. [91] | 2023 | Toward trustworthy evaluation schemes for network-based intrusion detection. | DT, RF, SVM, NB, ANN, and DNN | CIC-IDS2017, CSE-CIC-IDS2018, LUFlow | The DNN model outperforms the other models in terms of accuracy (99%), while the NB model has the lowest false positive rate. |
| Zakariah et al. [100] | 2023 | ML-based adaptive synthetic sampling technique for intrusion detection. | LSTM, CNN | NSL-KDD | The MLP classifier has 87% accuracy in binary classification and performs comparably to the attack and all-class models, with F1 scores of 89% and 83%, and AUC scores of 0.88 and 0.94 respectively. |
| Gu & Lu [30] | 2021 | Combining SVM with Naïve Bayes feature embedding to improve data quality and detection performance. | SVM, Naïve Bayes | UNSW-NB15, CICIDS2017 | High accuracy, detection rate, low false alarm rate. |
| Guo et al. [31] | 2016 | Multiple-criteria time-varying chaos particle swarm optimization, support vector machines, and linear programming. | SVM, MCLP | NSL-KDD | High detection rate with a minimal false alarm rate compared to standard PSO and CPSO. |
| Kabir et al. [22] | 2018 | The optimal allocation-based least square support vector machine (OA-LS-SVM). | LS-SVM | KDD 99 | Realistic accuracy and efficiency. |
| Singh et al. [68] | 2015 | Profiling network traffic and the online sequential extreme learning machine (OS-ELM). | OS-ELM | NSL-KDD 2009, Kyoto University benchmark dataset | Improved accuracy, false-positive rate, and detection time compared to other approaches. |
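The performance column of Table 1 mixes several measures (accuracy, detection rate, false alarm rate, F1). As a reading aid, the sketch below shows how these quantities are conventionally derived from the four confusion-matrix counts of a binary attack/normal classifier. The function name `ids_metrics` and the example counts are assumptions made for illustration; they are not taken from any of the listed studies.

```python
def ids_metrics(tp, fp, tn, fn):
    """Compute common IDS evaluation measures from confusion-matrix counts.

    tp: attacks correctly flagged      fp: normal traffic wrongly flagged
    tn: normal traffic correctly passed fn: attacks that were missed
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Detection rate is the recall (true positive rate) on the attack class.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # False alarm rate is the false positive rate on normal traffic.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + detection_rate
    f1 = 2 * precision * detection_rate / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "detection_rate": detection_rate,
        "false_alarm_rate": false_alarm_rate,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts: 100 attack flows and 100 normal flows.
m = ids_metrics(tp=90, fp=5, tn=95, fn=10)
```

On imbalanced datasets a model can reach high accuracy while missing most attacks, which is why detection rate and false alarm rate are usually reported alongside accuracy.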

well as Talukder et al. [85], who achieved 99.99% on the same dataset and 100% on different datasets. Datasets, which are essential to this line of inquiry, span well-established benchmarks like NSL-KDD and UNSW-NB15, alongside domain-specific repositories such as WUSTL-IIOT-2021 and MQTT-IoT-IDS2020. Also notable is the continuing emphasis on evaluating IDS frameworks on IoT-centric datasets, such as BoT-IoT and ToN-IoT, where the interaction between LSTM and ANN emerges as a recurring theme, as Khanday et al. [42] highlight. The corpus of research extends beyond algorithmic complexities to embrace the pragmatics of real-world deployment, as shown by Rangelov et al. [69], who advocate for the continuous enhancement of IoT security measures in urban settings. Additionally, Tekin et al. [86] studied various aspects of the energy usage of on-device machine learning models in Smart Home Systems (SHSs). In conclusion, the studies presented demonstrate the vitality of intrusion detection research, covering diverse algorithms and datasets in line with the evolving requirements of network security and the rapidly increasing challenges posed by current threat scenarios. Table 1 presents a list of similar studies and their key results.

1. Traditional ML Models: Intrusion detection systems have made significant use of machine learning models because of their ability to recognize and predict hostile activities based on historical data. These models, like Naïve Bayes and Support Vector Machines (SVM), are frequently used due to their ease of use and effectiveness in processing structured data. Traditional ML models are particularly effective in scenarios with well-defined feature spaces and where the relationships between inputs and outputs are relatively straightforward.
2. Ensemble-Based Models: Ensemble-based models combine the characteristics of several machine learning methods to increase detection accuracy and robustness. By combining predictions from various models, such as Random Forests and AdaBoost, these approaches can reduce the variance and bias associated with single models. Ensemble methods are particularly useful in intrusion detection systems where diverse attack types and patterns need to be detected across different data sources.
3. Deep Learning and Neural Networks Models: In the past few years, neural networks and deep learning models have gained prominence due to their ability to detect complex patterns and representations in large datasets. Models like Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks are particularly effective at processing high-dimensional data and understanding temporal sequences. Deep learning approaches excel at recognizing subtle anomalies and patterns that may be missed by traditional methods.
4. Hybrid Approaches: Hybrid approaches in intrusion detection combine multiple methodologies, including deep learning, traditional machine learning, and other techniques to enhance detection capabilities. These approaches often integrate feature classification, selection, and extraction techniques. By combining the capabilities of several models and methodologies, hybrid systems can offer more complete and robust solutions to difficult security concerns.

3.2. Based on study focus

1. Specific Application Areas: The table presents a summary of research studies in specialized fields. These studies address various difficulties and offer solutions designed for particular settings, including cyber-physical systems (CPS), software-defined networks (SDNs), industrial control systems (ICS), and in-vehicle networks. The research underscores progress in anomaly detection, intrusion detection systems, and the adoption of innovative machine learning models to improve security and efficiency in these specific sectors.
2. Feature Selection/Extraction: This table summarizes works that focus on feature selection and extraction strategies. These studies investigate diverse methodologies, including Naïve Bayes feature embedding, genetic algorithms, and hybrid approaches integrating ML and DL models. The findings emphasize the importance of refining
M.M. Rahman, S.A. Shakil and M.R. Mustakim Cyber Security and Applications 3 (2025) 100082

Fig. 5. Dataset Distribution Over Years for IDS in IoT Networks.

feature selection and extraction processes to improve the accuracy, efficiency, and reliability of security frameworks.
3. Lightweight/Compact Models: The table highlights research focused on developing lightweight and compact intrusion detection systems (IDS) that maintain high performance while being resource-efficient. The studies investigate various ML-based classifiers, DL models, and optimization strategies to create IDS solutions that are suitable for environments with limited computational resources. The emphasis is on achieving high accuracy and reducing false alarm rates without compromising on speed and efficiency.
4. Survey/Literature Review: This table presents comprehensive surveys and literature reviews that analyze existing intrusion detection techniques, datasets, and challenges. These reviews provide insights into the effectiveness of various methods, highlight key trends, and identify potential areas for future research. The emphasis is on providing a comprehensive overview of the current status of IDS research, including the application of machine learning techniques and the changing environment of cybersecurity threats.

4. Dataset and evaluation metrics

4.1. Dataset

In today's dynamic cybersecurity world, evaluating and developing intrusion detection systems (IDS) relies heavily on broad and representative datasets. These datasets are critical benchmarks for developing and testing artificial intelligence for detecting and responding to various cyber threats. A summary of the experiments conducted by numerous researchers on the detection of DDoS attacks on IoT-based networks using ML is shown in Tables 1 and 2. The majority of this research focused on DDoS attacks using older datasets such as NSL-KDD, WSN-DS, UNSW-NB15, KDDCUP/KDDcup99, IoT-23, CICIDS/CIC-IDS2017/CSE-CIC IDS2018, Seven CPS-specific, and DS2OS.
Fig. 5 shows the release of IDS (intrusion detection system) datasets for the Internet of Things (IoT) from 2015 to 2023. It indicates that datasets like UNSW-NB15 and KDDCUP99 were released in 2015, followed by WSN-DS in 2016 and CICIDS2017 in 2017. N-BaIoT was introduced in 2018, while 2019 had no significant new datasets. The year 2020 saw a surge with datasets like ToN-IoT, BoT-IoT, and IoTID20. In 2021, datasets such as SIMARGL and AS-IDS were added, and 2022 featured releases like Seven CPS-specific and CIC-MalMem-2022. The most recent dataset, UNR-IDD, was introduced in 2023. This diagram provides a visual representation of the flow and accumulation of these datasets over time, showing their contribution to the field of IDS for IoT networks.
Intrusion Detection Systems (IDS) designed for IoT networks have primarily relied on datasets published in the past. Our research, however, aims to evaluate the current landscape of Distributed Denial of Service (DDoS) attack intensities, the complexities associated with IoT devices, and the limitations of the datasets currently available. With evolving attack strategies, it is clear that there is a pressing need to update these datasets to reflect more recent threats. Table 3 provides a detailed overview of the shortcomings and constraints of the datasets presently used in IoT network research.
It is worth noting that some widely referenced benchmark datasets, such as CIC-IDS2017, UNSW-NB15, and NSL-KDD, were not originally created in an IoT-specific environment. Moreover, several datasets, including TON_IoT and Bot-IoT, suffer from class imbalances, which can introduce biases or inaccuracies, complicating the process of accurately identifying attacks. Additionally, datasets like UNSW-NB15 and NSL-KDD (associated with the KDD Cup 1999) that have been utilized for DDoS detection do not fully cover all types of DDoS attacks. Meanwhile, datasets like N-BaIoT2018 and Bot-IoT focus on only a few DDoS attack types, such as TCP, UDP, and HTTP.
The following table provides an overview of key datasets employed in network security and intrusion detection research. These datasets vary in their release and update dates, providers, advantages, and potential limitations, offering diverse resources for cybersecurity applications. They span a range of contexts, from general IoT environments to specific industrial settings, equipping researchers and practitioners with the tools needed to enhance intrusion detection techniques. Understanding both the strengths and weaknesses of these datasets is crucial for developing effective and adaptable IDS solutions. Let us explore the key characteristics of these datasets in detail.

4.2. Evaluation metrics

In various fields, particularly in evaluating performance, security, and quality, several key metrics are used to assess the effectiveness of systems and models. Accuracy measures the correctness of predictions by calculating the ratio of correct predictions to the total number of cases. Precision focuses on the accuracy of positive predictions, indicating the model's ability to minimize false positives. Recall, or sensitivity, measures the model's ability to identify all relevant instances, so a high recall corresponds to few false negatives. The F1-Score, the harmonic mean of precision and recall, offers a balanced measure when considering both false positives and false negatives. The False Positive Rate (FPR) assesses the proportion of incorrect positive predictions among actual negatives, which is critical in contexts like medical testing or fraud detection. Detection Rate and True Positive Rate (TPR) both highlight the model's efficacy in


Table 2
Summary of Ensemble-Based Studies.

References | Year | Proposed Methodology | ML/DL Models | Dataset | Performance
Hossain et al. [34] | 2023 | Using ensemble-based machine learning to provide network security. | RF, AdaBoost, gradient boosting. | UNSW-NB, UNR-IDD, UKM-IDS, SIMARGL, NSL-KDD, KDDCUP, CICIDS, and WSN-DS, etc. | Highest accuracy of 99.42% for WSN-DS, 97.77% for UNSW-NB15
Sadhwani et al. [73] | 2023 | Compact and lightweight IDS that blends ML classifiers | RF, NB, LR, ANN, KNN | BoT-IoT and ToN-IoT | RF did great with TON-IOT, and NB performed well with BOT-IOT, scoring 100% accuracy in both binary and multiple-class classification. They also trained and predicted faster.
Musleh et al. [57] | 2023 | Machine learning algorithms used with feature extraction in an IoT-based intrusion detection system. | A number of feature extractors, such as DenseNet and VGG-16 transfer learning models and image filters | IEEE Dataport | The findings revealed that the integration of VGG-16 with stacking led to the greatest accuracy, achieving a remarkable 98.3%.
Kumar et al. [43] | 2023 | Used statistical feature ranking methods and machine learning for intrusion detection | Naive Bayes | N/A | Upon assessing accuracy values, the authors determine that removing the two features with the lowest values enhance accuracy, resulting in a peak accuracy of 95.69% using the top 7 features instead of utilizing all 9 features.
Krishnan et al. [65] | 2019 | Developing VARMAN, a multi-plane security framework integrating hybrid machine learning models. | Non-Symmetric Deep Autoencoder, Random Forest | NSL-KDD, CICIDS2017, HogZilla | High accuracy, effective anomaly detection, efficient resource utilization.
Hamed et al. [83] | 2018 | Recursive feature addition and bigram technique | Recursive Feature Addition (RFA) with SVMs | ISCX 2012 | Notable improvement in performance using different metrics
Singh et al. [39] | 2023 | Feature reduction using Random Forest Classifier, SelectFromModel, Recursive Feature Elimination (RFE), and evaluation using multiple machine learning classifiers. | Random Forest, Extra Trees, AdaBoost, SVM | NSL-KDD | Various metrics (accuracy, F-score, precision, recall) across different attack categories (DoS, Probe, R2L, U2R)
J.A. & K.A. [37] | 2023 | Implementing FL with various ML models, synchronization methods, and aggregation techniques. | LSTM, CNN, Random Forest, MLP | NSL-KDD, VeReMi, CAN-Intrusion, Car Hacking dataset | Evaluation using F1-Score, Accuracy, Precision, Recall
Ravi et al. [9] | 2022 | Feature fusion ensemble meta-classifier | RNN, LSTM, GRU, SVM, Random Forest, Logistic Regression | KDD-Cup-1999, UNSW-NB15, WSN-DS, CICIDS-2017 | WSN-DS: 0.98 accuracy, KDD-Cup-1999: 0.99 accuracy, UNSW-NB15: 0.99 accuracy, and CICIDS-2017: 0.99 accuracy
Alazzam et al. [32] | 2021 | Fusion of two subsystems trained on normal and attack packets using OCSVM and PIO. | OCSVM | KDDCUP-99, NSL KDD, UNSW-NB15 | 99.9% DR, 0.06 FPR, 99.3% accuracy
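
The ensemble-based studies in Table 2 report results with the metrics defined in Section 4.2 (accuracy, precision, recall, F1-score, FPR). As a minimal sketch of how these pieces fit together, the Python below combines the outputs of several base classifiers by simple majority voting and computes the metrics from confusion-matrix counts. The function names, the three sets of model predictions, and the labels are hypothetical illustration data and are not taken from any study cited in this survey; majority voting is only one of several possible combination rules (the cited works also use boosting and stacking).

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote.

    With an odd number of binary voters there are no ties; for other
    setups the tie-breaking of Counter.most_common applies.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN), treating `positive` as the attack class."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p != positive:
            tn += 1
        elif t != positive and p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def ids_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall (detection rate / TPR), F1 and FPR."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }

# Hypothetical data: 1 = attack, 0 = benign.
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 1]
MODEL_PREDICTIONS = [
    [1, 1, 0, 0, 0, 1, 0, 1],  # hypothetical base model A
    [1, 0, 1, 0, 0, 0, 0, 1],  # hypothetical base model B
    [1, 1, 1, 0, 1, 0, 0, 0],  # hypothetical base model C
]

if __name__ == "__main__":
    for preds in MODEL_PREDICTIONS:
        print("single model:", ids_metrics(Y_TRUE, preds))
    print("ensemble:", ids_metrics(Y_TRUE, majority_vote(MODEL_PREDICTIONS)))
```

In this toy data each base model makes one or two mistakes, but the mistakes fall on different samples, so the vote corrects them. Real base models tend to make correlated errors, so gains of this size should not be expected in practice.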

identifying all relevant cases, which is crucial in security and threat detection systems. Lastly, Detection Accuracy combines the assessment of both positive and negative case identifications, providing a comprehensive measure of a system's overall performance. These metrics are essential for optimizing operations and enhancing decision-making across various domains.
The following table summarizes the most useful metrics commonly used for evaluating Intrusion Detection Systems (IDS) in IoT networks.

5. Comprehensive analysis of key findings

5.1. Analysis of machine learning models in IoT IDS

Intrusion Detection Systems for IoT deploy a varied array of methods to boost security and detect malicious activities. In our research, we investigated the common use and applicability of several ML methods across various types of studies. The data, illustrated in a comprehensive pie chart, underscores the widespread use of algorithms like Random Forest (RF), Gradient Boosting, Naive Bayes (NB), AdaBoost, and Logistic Regression (LR), which are often chosen for their reliability and precision in classification tasks. Additionally, Artificial Neural Networks (ANN), including specialized forms such as CNN and DNN, provide effective solutions for managing complex data patterns. Algorithms like K-Nearest Neighbors (KNN), Decision Tree (DT), and Support Vector Machines (SVM), along with advanced ensemble methods such as XGBoost and Stacking, also contribute to effective feature selection and classification. Specialized approaches, including Particle Swarm Optimization (PSO), Genetic Algorithms (GA), and hybrid methods like Hybrid GA-GWO (Genetic Algorithm + Grey Wolf Optimizer), are also explored for optimizing model performance. The integration of techniques such as SMOTE for Data Balancing, Feature Selection, and innovative models like MLEID (Machine Learning-based Ensemble Intrusion Detection) and SDRK Machine Learning Algorithm (Supervised Deep Neural Networks + Unsupervised Clustering), reflects the evolving landscape of IDS in IoT, emphasizing the need for adaptive and sophisticated methods to secure IoT ecosystems, as shown in Fig. 6.
This analysis not only highlights the versatility and performance considerations of these algorithms but also sheds light on the evolving landscape of machine learning methodologies. The presence of cutting-edge techniques, including Variational Autoencoders (VAEs) and Deep Convolutional Generative Adversarial Networks (DC-GAN), further indicates a shift towards more complex and sophisticated models, paving the way for future research endeavors. Additionally, the chart showcases a variety of less common algorithms, demonstrating the diverse methodologies utilized in the field. This analysis offers meaningful insights into the prevailing trends and the relative popularity of various machine


Table 3
A Summary of research on Deep Learning models and Neural Networks.

Ref. | Year | Proposed Methodology | ML/DL Models | Dataset | Performance
Xu et al. [96] | 2023 | ML-based approach to developing an IDS for IoT devices. | CNN-BiLSTM, CANET, FNN-Focal, RFS-1, XGBoost, XGBoost-HPO, XGBoost-HPO-feature-selection | BoT-Iot, NSL-KDD, WUSTL-IIOT2021, N-BaIot, and WUSTL-EHMS-2020. | The model attained perfect scores in accuracy, precision, recall, and an F1 score of 1.0, which strongly indicates its effectiveness.
Khanday et al. [42] | 2023 | A lightweight IDS with a novel data pre-processing technique while using ML and DL classifiers. | Linear SVC, Naïve Bayes, Logistic Regression, ANN, LSTM | BoT-IoT and ToN-IoT | The LSTM and ANN models performed the best in both datasets for binary and multiple classifications, with 99% and 95% accuracy, respectively.
Gaber et al. [27] | 2023 | PSO and Bat algorithm for picking vital features and employs the RF classifier to identify malicious activities in IIoT network traffic. | RF classifier along with Bat algorithm (BA), KNN, MLP | WUSTL-IIOT-2021 | RF on a dataset created from the BA scheme scored the highest value of 99.99%.
Vishwakarma et al. [92] | 2023 | A novel two-phase Intrusion Detection System (IDS) utilizing Naïve Bayes for data classification and an elliptic envelope approach for anomaly detection has been proposed. | NB, elliptic envelop method | NSL-KDD, UNSW_NB15, and CIC-IDS2017 | The suggested method obtained reasonable accuracy in the first phase, with 97.5% accuracy in the NSL-KDD dataset, 86.9% in the UNSW_NB15 dataset, and 98.59% in the CIC-IDS2017 dataset.
Rangelov et al. [69] | 2023 | Towards an integrated methodology and tool chain for urban IoT networks and platforms. | MLP, DNN, CNN, and LSTM | N/A | The authors aim to deploy and continuously improve suitable IoT security measures in real-world urban IoT frames.
Thakur et al. [71] | 2021 | Combining generic and domain-specific autoencoders to extract and classify network intrusions. | Generic-Specific Autoencoder, Random Forest | CICIDS2017 | High precision, recall, specificity, F1 scores.
Wu et al. [99] | 2020 | Semantic re-encoding and Deep Learning | Deep Learning (ResNet) | NSL-KDD | On the NSL-KDD dataset, the SRDLM algorithm outperforms traditional machine learning approaches by more than 8% and detects Web character injection network attacks with over 99% accuracy.
Seo et al. [25] | 2023 | GAN-based method to generate adversarial attacks. | Multi-layer perceptron (MLP), random forests (RF), logistic regression (LR) | Fuzzy attack dataset of Hyundai YF Sonata | Hyundai YF Sonata: Training: 403,299 samples (318,655 normal, 84,644 attack), Testing: 351,273 samples (276,337 normal, 74,936 attack)
Akkepalli & Sagar [82] | 2024 | Hybrid model combining CNN for spatial features; Bi-LSTM for temporal features. | CNN, Bi-LSTM | NSL-KDD dataset | The model achieved accuracy: 99.28%, precision: 99%, recall: 99.26%, and F1-Score: 99.18%.
Doriguzzi-Corin & Siracusa [21] | 2024 | Enhancing DDoS attack detection using adaptive federated learning. | Adaptive Federated Learning (FLAD) | CIC-DDoS2019 | FLAD outperforms other models (FedAVG, FLDDoS) in terms of F1 score and time
Thein et al. [84] | 2024 | Improving intrusion detection in IoT using personalized federated learning and robust defense mechanisms. | Personalized Federated Learning-based IDS (pFL-IDS) | N-BaIoT dataset | CNN
Maddu & Rao [53] | 2023 | CenterNet-based feature extraction, ResNet152v2-based classification, and DCGAN for data augmentation. | CenterNet, ResNet152v2, DCGAN | InSDN dataset, Edge IIoT dataset | InSDN dataset: 99.65% accuracy, Edge IIoT dataset: 99.31% accuracy
Fang et al. [98] | 2024 | Use of genetic algorithms for feature selection in intrusion detection systems | Not specified explicitly | Various datasets used in the industrial control systems for testing | Enhanced performance metrics.
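
Section 5.1 mentions SMOTE for data balancing, and Section 4.1 notes that datasets such as TON_IoT and Bot-IoT suffer from class imbalance. The following is a simplified sketch of the core SMOTE idea, namely creating synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. It is not the reference SMOTE implementation and not the procedure of any cited study; the function name `smote_like`, its parameters, and the toy feature vectors are assumptions made for illustration.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)  # local generator keeps the sketch reproducible
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        a = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        # k nearest neighbours of `a` by Euclidean distance
        neighbours = sorted(others, key=lambda p: math.dist(a, p))[:k]
        b = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(ai + u * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic

# Hypothetical two-feature attack samples (for example, scaled flow statistics).
MINORITY = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

if __name__ == "__main__":
    for point in smote_like(MINORITY, n_new=5):
        print(point)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already covered by the minority class. Oversampling of this kind should be applied to the training split only, so that evaluation data still reflects the original class distribution.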

learning methods, highlighting both commonly used techniques and less-explored areas that may present opportunities for future research.
The bar chart illustrates the top 10 algorithms used in IoT intrusion detection, ranked by the number of citations they have received in research articles. Random Forest (RF) stands out as the most cited algorithm with 12 citations, showcasing its widespread application and effectiveness in this domain. SVM is the second most cited algorithm, appearing in 7 articles. MLP, LSTM and CNN are also highly cited, each with 5 citations, indicating their strong presence in the literature. K-Nearest Neighbor (KNN) and Gradient Boosting follow with 6 citations each, highlighting their significant roles in IoT security. The Artificial Neural Network (ANN) is cited 5 times, reflecting its specialized use. XGBoost, Decision Tree (DT), and Logistic Regression (LR) each have 4 citations, demonstrating their relevance. J48, with 2 citations, is less frequently mentioned but still notable. This ranking offers insight into the most impactful algorithms in the realm of IoT intrusion detection.

Fig. 6. Distribution of Articles Across Machine Learning Algorithms.

Fig. 7. Top 10 IoT Intrusion Detection Algorithms by Citations.

5.2. Analysis of key findings in IoT IDS

The findings highlight different algorithms and methods tailored to specific difficulties in securing IoT networks. From lightweight models with novel data pre-processing to hybrid techniques combining ML and DL, these findings offer significant insights for network security practitioners and researchers navigating the intricacies of IoT security. As depicted in Fig. 7, the field of intrusion detection in IoT network environments is rapidly evolving, necessitating sophisticated approaches to handling and defending against emerging threats.
Table 12 gives brief outlines of key findings and conclusions from different groups of recent studies in this area. Each entry captures the strengths, limitations, and notable aspects of the respective intrusion detection methodologies. From novel integration approaches to algorithmic subtleties and dataset considerations, these insights contribute to a thorough understanding of the present status and challenges in IoT security. As we examine each study in detail, it becomes apparent that while the advances are promising, careful consideration of limitations and future research needs is essential to refining and advancing the effectiveness of intrusion detection systems in the dynamic domain of IoT.

5.3. Analysis of loss functions in IDS for IoT networks

In Intrusion Detection Systems (IDS) for IoT Networks, various loss functions play a crucial role in optimizing model performance. These functions guide the training of machine learning and deep learning models by minimizing the difference between predicted and actual values, thus enhancing the accuracy of intrusion detection. For classification tasks, Cross-Entropy Loss is widely used, measuring the performance of models that output probability values, which is essential for distinguishing between normal and malicious traffic [20]. Meanwhile, regression tasks typically use Mean Squared Error (MSE) and Mean Absolute Error (MAE), both of which focus on the differences between predictions and actual outcomes, proving useful in anomaly detection models [13,46].
In more specialized classification approaches like Support Vector Machines (SVMs), Hinge Loss penalizes incorrect and low-margin classifications, maximizing the margin between classes and improving the model's robustness to network attacks [16]. Kullback-Leibler Divergence (KLD) further enhances performance by quantifying the difference between two probability distributions, making it effective in measuring anomalies in IoT networks [102]. As shown in Fig. 8, the Brier Score Loss helps in evaluating the confidence of binary classification models, making it particularly useful for detecting intrusions in binary settings [62].
A critical challenge in IDS for IoT is dealing with imbalanced datasets, where rare but crucial intrusion events can be overshadowed by more frequent, benign data. Loss functions like Focal Loss and Tversky Loss address this by emphasizing the detection of these rare events, improving the detection rate for critical intrusions [4,80]. Additionally, robust loss functions like Huber Loss and MAE help models maintain performance even in the presence of noisy or anomalous data, ensuring the stability of predictions [41].
In complex scenarios involving multi-label and multi-class problems, Jaccard Loss, Hamming Loss, and Sparse Categorical Cross-Entropy Loss come into play. These loss functions are designed to handle overlapping categories and sparse data, improving the performance of models tasked with identifying multiple, simultaneous intrusion types [51,67,87]. Finally, for models focusing on specific performance metrics, AUC Loss and Kappa Loss are used to optimize the area under the ROC curve and Cohen's Kappa statistic, respectively, ensuring that the models not


Table 4
Summary of hybrid Models Studies.

References | Year | Proposed Methodology | ML/DL Models | Dataset | Performance
Wadate et al. [93] | 2023 | Edge-Based Intrusion Detection using ML Over the IoT Network. | Backpropagation (BP) neural network, Basis function Radial (BFR) neural network, Simulated Annealing algorithm. | KDD99 | The proposed IDS demonstrate the highest accuracy, reaching up to 93%. In contrast, the accuracy of the Naive Bayes detection is the lowest, which could be attributed to its use of a BP neural network.
Talukder et al. [85] | 2023 | A novel hybrid approach that blends ML and DL to boost detection rates while maintaining reliability. | SMOTE for data balancing, XGBoost for feature selection | KDDCUP'99 and CIC-MalMem-2022 | The accuracy achieved from testing on two different datasets was remarkably high, reaching 99.99% and a perfect 100% respectively.
Vanitha et al. [90] | 2023 | Improved AnT colony optimization and machine learning-based ensemble Intrusion Detection model | DT, SVM, Ensemble classifier, Proposed MLEID | UNSW-NB15 | The new MLEID classifier's overall findings are 98.34%, whereas smaller rates of precision for classifiers like DT, SVM, and Ensemble are 77.67%, 89.67%, and 94.34%, respectively.
Ahuja et al. [58] | 2021 | Hybrid model combining SVC and Random Forest | SVC, Random Forest | Custom SDN Dataset | Accuracy: 98.8%, and very low false alarm rate.
Fazio et al. [64] | 2020 | Using Markov chains to model probabilistic packet marking for IP traceback. | Markov Chains | Simulated Network Data | High traceback accuracy, minimal overhead.
Gupta et al. [1] | 2018 | Training-resistant anomaly detection system | Anomaly detection using clustering techniques | Various real network traffic data sources | Resistant to training attacks, effective against common network attacks
Vadigi et al. [72] | 2023 | Enhancing IoT security using federated reinforcement learning. | Federated reinforcement learning (FRL) for IDS | IoT datasets | Reinforcement learning models
Li et al. [7] | 2024 | An intrusion detection system for hybrid DoS attacks (HDA-IDS), CL-GAN | CNN-LSTM, GAN | NSL-KDD, CICIDS2018, Bot-IoT | In comparison to other works, the HDA-IDS achieved an average overall improvement of 5% in terms of accuracy, precision, recall, and F1-Score.
Melucci [54] | 2024 | Spectral decomposition of mixtures of symmetric matrices for IR | Not specified | Large test collection | The method effectively balances ranking effectiveness with fairness.
Hoang & Kim [33] | 2024 | Supervised contrastive learning, ResNet, transfer learning | Supervised contrastive ResNet | Car Hacking dataset, survival dataset | Compared to the vanilla cross-entropy loss, the SupCon loss averagely reduces false-negative rates by five times.
Sanju [74] | 2023 | It used Modified Metaheuristics with Weighted Majority Voting Ensemble Deep Learning (MM-WMVEDL) | BiLSTM, ELM, GRU | IoT-23, UNSW-NB15, CICIDS2017 | IoT-23: 98.12% accuracy, CICIDS2017: 99.98% accuracy, UNSW-NB15: 97.34% accuracy
Rajasekaran & Magudeeswaran [66] | 2023 | GRU-BWFA classifier with Enhanced Salp Swarm Optimization for feature selection | GRU-BWFA classifier | UNSW-NB15, NSL-KDD | UNSW-NB15, NSL-KDD datasets

Fig. 8. Most Useful Loss Functions.
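
To make the loss functions discussed in Section 5.3 concrete, the sketch below gives per-sample, pure-Python versions of several of them: binary cross-entropy, focal loss, hinge loss, Huber loss, Kullback-Leibler divergence, and the Brier score. These follow the standard textbook definitions; the function names, the default values `gamma=2.0`, `alpha=0.25`, and `delta=1.0`, and the clipping constant `EPS` are illustrative assumptions and are not settings reported by the cited studies.

```python
import math

EPS = 1e-12  # clipping constant to avoid log(0); an illustrative choice

def _clip(p):
    return min(max(p, EPS), 1.0 - EPS)

def binary_cross_entropy(y, p):
    """y in {0, 1}; p is the predicted probability of class 1."""
    p = _clip(p)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def focal_loss(y, p, gamma=2.0, alpha=0.25):
    """Down-weights easy examples so rare attack samples dominate the loss."""
    p = _clip(p)
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def hinge_loss(y_signed, score):
    """y_signed in {-1, +1}; score is the raw SVM decision value."""
    return max(0.0, 1.0 - y_signed * score)

def huber_loss(y, y_hat, delta=1.0):
    """Quadratic for small residuals, linear for large (outlier) residuals."""
    r = abs(y - y_hat)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as sequences."""
    return sum(pi * math.log(pi / max(qi, EPS))
               for pi, qi in zip(p, q) if pi > 0)

def brier_score(y_true, probs):
    """Mean squared difference between predicted probabilities and labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)

if __name__ == "__main__":
    # A confidently correct attack prediction contributes far less focal
    # loss than a confidently wrong one.
    print(focal_loss(1, 0.9), focal_loss(1, 0.1))
    # Distribution shift between a hypothetical baseline and observed traffic.
    print(kl_divergence([0.9, 0.1], [0.5, 0.5]))
```

With `gamma=0` and `alpha=0.5`, the focal loss reduces to half the binary cross-entropy, which shows that `gamma` is the term responsible for de-emphasizing well-classified samples.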

only classify intrusions accurately but also provide reliable confidence in their predictions [49,95].
Thus, the choice of loss function in IDS for IoT Networks is pivotal in ensuring that the models can robustly and accurately detect intrusions, even in challenging environments with noisy data and imbalanced datasets. Each loss function plays a specialized role, contributing to the overall efficacy and reliability of the intrusion detection system.

5.4. Analysis of performance metrics and computational efficiency

An IDS plays an essential role in detecting and responding to malicious activities in the network environment. Evaluating the performance and computational efficiency of IDS models is critical to assuring their effectiveness and stability. The following table summarizes key performance metrics and their importance in assessing IDS models.

6. Challenges, open issues and gap analysis

Based on the data presented in Tables 1–11 and 13, this analysis aims to identify the main challenges, unresolved issues, and gaps in the current research landscape in the rapidly developing field of intrusion detection systems (IDS) for Internet of Things (IoT) environments. Researchers have developed various methodologies using a wide range of ML and DL models supported by numerous datasets. These combined studies and datasets offer critical insights into the strengths and limitations of current intrusion detection methods.

6.1. Most prominent challenges

A key challenge highlighted in the studies is the diversity of IoT datasets used. Popular datasets like UNSW-NB15 and NSL-KDD are fre-


Table 5
Summary of Specific Application Areas Studies.
References Purpose Main Findings Trend and Focus
Krishnan et al. [65] Develop a multi-plane security framework for SDN The combination of hybrid ML techniques and Integrating advanced machine learning techniques
NDAE significantly improves IDS performance and with SDN to develop scalable and efficient security
resource optimization in SDNs. frameworks.
Thakur et al. [71] Enhance IDS in CPS with deep autoencoder models The unique architecture of GSAE effectively Developing deep learning models that can handle
disentangles generic and domain-specific features, the complexities and variations of network
leading to improved classification performance. intrusions in CPS.
Ahuja et al. [58] Classify benign and DDoS attack traffic in SDN The proposed hybrid ML model significantly Enhancing DDoS attack detection in SDN
environments improves DDoS attack detection accuracy and environments using advanced ML techniques and
reduces false alarms compared to existing custom datasets for better accuracy and reliability.
methods.
Satheesh et al. [59] Develop a priority-based model using SDN to Flow-based ML models combined with SDN Leveraging SDN’s capabilities for centralized
detect anomaly intrusions provide a robust framework for real-time anomaly control and dynamic response to enhance network
detection and network management, security through advanced ML techniques.
outperforming conventional methods.
Fazio et al. [64] Model probabilistic packet marking for IP Modeling the PM approach using Markov chains Emphasizes the use of probabilistic and stochastic
traceback provides a systematic and efficient method for IP models to enhance network security mechanisms,
traceback, reducing the number of packets particularly for mitigating DoS and DDoS attacks.
required for accurate path reconstruction.
Kabir et al. [22] Develop a comprehensive multi-plane security OA-LS-SVM effectively detects intrusions with a Improving intrusion detection through statistical
framework for SDN realistic performance. techniques and optimization.
Gupta et al. [1] Enhance IDS in CPS by combining generic and Training attacks can be detected and the IDS Enhancing IDS resilience against training-based
domain-specific deep autoencoders remains effective. attacks.
Singh et al. [68] Develop a priority-based model using SDN to OS-ELM with traffic profiling is efficient and Reducing complexity and improving performance
detect anomaly intrusions effective for NIDS. in IDS using OS-ELM.
Wu et al. [99] Model probabilistic packet marking for IP Semantic re-encoding and deep learning Enhancing generalization ability and robustness of
traceback significantly improve the accuracy and robustness IDS using semantic re-encoding combined with
of the intrusion detection model. deep learning.
Truong et al. [8] Detect cyberattacks in ICS using anomaly detection The proposed method effectively detects Anomaly detection is a viable approach for
cyberattacks in industrial control systems. securing industrial control systems.
Vadigi et al. [72] Propose a federated reinforcement learning-based The proposed system effectively detects intrusions FRL enhances the security of IoT systems by
IDS for enhancing IoT security in IoT environments using FRL. improving detection accuracy.
Doriguzzi-Corin & Propose an adaptive federated learning approach The proposed approach effectively detects DDoS Adaptive federated learning improves the
Siracusa [21] for detecting DDoS attacks attacks with high accuracy. detection of DDoS attacks.
Boobalan et al. [63] Explore the integration of federated learning with The integration of FL with IIoT provides Federated learning enhances the security and
IIoT significant security and efficiency improvements. operational efficiency of IIoT systems.
Thein et al. [84] personalized federated learning method proposed pFL-IDS effectively detects intrusions Personalized federated learning enhances IDS
and mitigates the impact of poisoning attacks. performance and robustness against poisoning
attacks.
| Li et al. [7] | Address security challenges in IoT networks with hybrid IDS | In terms of botnet and DoS attack detection, the HDA-IDS performs better than other IDS. | Improving IoT security using hybrid detection methods and advanced AI models. |
| Melucci [54] | Maximize ranking effectiveness and fairness in information retrieval systems | Maintaining an acceptable level of effectiveness and fairness simultaneously is feasible. | Balancing effectiveness and fairness in information retrieval systems. |
| Hoang & Kim [33] | Propose a deep learning model for in-vehicle IDS | The SupCon ResNet model effectively classifies multiple attacks and adapts to new vehicle models. | Enhanced performance by using contrastive learning and transfer learning. |
| Khan et al. [40] | Develop method for detecting intrusion attacks on in-vehicle CAN | Effective detection of in-vehicle network intrusions using the proposed method. | Security in automotive networks. |
| Zhu et al. [103] | Enhance transferability of adversarial attacks using hybrid approach | Hybrid attacks can significantly improve adversarial transferability. | Enhancing the effectiveness of adversarial attacks. |
| Rehman et al. [70] | Enhance IoT security using proactive defense mechanisms | Proactive defense mechanisms can significantly enhance IoT security. | Focus on proactive security measures in IoT. |
| Abolfathi et al. [3] | Enhance web privacy on HTTPS traffic with novel method | The novel method significantly enhances web privacy. | Enhancing privacy on web traffic. |
| Sanju [74] | Propose hybrid approach for IDS in IoT | The proposed hybrid technique increases the accuracy and efficiency of IDS in IoT systems. | Focus on addressing the physical and functional variety of IoT systems. |
| Gupta et al. [79] | Develop efficient IDS for IoT-enabled smart cities using hybrid optimization and deep learning | The hybrid approach significantly enhances classification accuracy and reduces training time. | Focus on improving IDS performance in IoT-enabled smart cities. |
| Shone et al. [60] | Develop IDS using transformer-based models for improved detection of network intrusions | Transformer-based models improve the detection capabilities of IDS in network environments. | Focuses on advanced transformer models. |
| Rajasekaran & Magudeeswaran [66] | Detect malicious attacks in network environments using GRU-BWFA classifier | Proposed classifier effectively detects and differentiates various types of network attacks with high accuracy. | Improve the detection of cyber-attacks using advanced classification techniques. |
| Seo et al. [25] | Introduce GAN-based adversarial attacks in in-vehicle networks | GAN-based adversarial attacks can significantly reduce the detection accuracy of ML-based IDS in in-vehicle networks. | Focus on improving the adaptability of ML-based IDS to adversarial attacks. |
| Akkepalli & Sagar [82] | Propose hybrid CNN, Bi-LSTM model for effective network anomaly detection | In network anomaly detection, the hybrid CNN and Bi-LSTM model performs better than other models. | Improvement using hybrid DL models. |
| J.A., K.A. [37] | Explore FL for IDS in IoV | Highlighted the effectiveness of FL in IoV contexts for improving IDS performance while maintaining data privacy. | Increasing interest in FL to enhance security in IoV by decentralizing the learning process. |
| Al-Ghuwairi et al. [10] | Detect anomalies in cloud computing using time series data and ML models | Demonstration on applicability of time series for improving IDS in cloud environments. | Focus on real-time anomaly detection and continuous monitoring in cloud computing. |
| Maddu & Rao [53] | Implement CenterNet-ResNet152V2 based deep learning technique for SDN | The suggested model has great detection capabilities and can effectively mitigate and identify the source of attacks. | Future work will involve strengthening the system through feature selection, detecting zero-day and lowering the rate of DDoS attacks on IoT systems. |
| Rangelov et al. [69] | Integrated methodology for ML-based IDS in urban IoT networks. | IoT security mechanisms are being continuously improved in urban IoT. | Enhancing IoT security in urban networks. |
| ElKashlan et al. [23] | ML-based intrusion detection system for IoT electric vehicle charging stations (EVCSs). | Classified attacks with 99.2% accuracy using filtered classifier algorithm. | Enhancing security of IoT EVCSs using ML-based IDS. |

M.M. Rahman, S.A. Shakil and M.R. Mustakim Cyber Security and Applications 3 (2025) 100082

Table 6
Summary of Feature Selection/Extraction Studies.

| References | Purpose | Main Findings | Trend and Focus |
| --- | --- | --- | --- |
| Gu & Lu [30] | Improve IDS by merging Naïve Bayes feature and SVM. | When SVM and Naïve Bayes feature embedding are used, IDS performance is greatly enhanced. The method is effective and adaptable to different datasets and environments. | Enhancing ML models for intrusion detection by improving data quality and integrating hybrid techniques to boost performance. |
| Guo et al. [31] | Improve IDS by merging Naïve Bayes feature and SVM | TVCPSO improves SVM and MCLP by optimizing parameter setting and feature selection. | Optimizing traditional ML models with advanced optimization techniques. |
| Hamed et al. [83] | Classify benign and DDoS attack traffic in SDN environments | RFA and bigram technique improve NIDS performance significantly. | Combining feature selection methods with ML for better NIDS. |
| Ravi et al. [9] | Propose feature fusion ensemble meta-classifier for IDS | Feature fusion and ensemble meta-classifier enhance the detection and classification of network intrusions. | Improved network intrusion detection can be achieved by incorporating feature fusion with ensemble learning. |
| Singh et al. [39] | Develop efficient IDS using SVM and ensemble learning algorithms | Ensemble learning algorithms combined with SVM provide robust intrusion detection. | Focus on combining different ML techniques to improve detection performance. |
| Fang et al. [98] | Propose feature selection method for ICS using genetic algorithms | Genetic algorithms can effectively enhance feature selection for better intrusion detection performance. | Focus on enhancing security in industrial control systems through advanced feature selection techniques. |
| Ngo et al. [61] | Selecting and Extraction Features for Machine Learning-Based Intrusion Detection | Feature extraction more reliable, DT and MLP best for both. | ML-Based Intrusion Detection: Feature Selection and Extraction. |
| Talukder et al. [85] | Novel hybrid approach blending ML and DL for intrusion detection. | Accuracy of 99.99% and 100% on respective datasets. | Blending ML and DL for high accuracy intrusion detection. |

Table 7
Summary of Lightweight/Compact Models Studies.

| References | Purpose | Main Findings | Trend and Focus |
| --- | --- | --- | --- |
| Vo et al. [35] | Propose AI-powered IDS improving performance by enhancing training set quality | Reaches higher performance in comparison to state-of-the-art techniques. | Improving dataset quality and detection accuracy, reducing latency. |
| Devendiran & Turukmane [19] | Propose a deep learning-based IDS using chaotic optimization strategy | When it comes to robustness and accuracy, the Dugat-LSTM model performs better. | Combining chaotic optimization techniques and DL to improve intrusion detection. |
| Aljehane et al. [6] | Propose a new technique for intrusion recognition and classification using deep learning | The GJOADL-IDSNS technique effectively recognizes and classifies network intrusions. | Enhancing network security using advanced optimization and deep learning techniques. |
| Alazzam et al. [32] | Introduce lightweight IDS with low false alarm rate | The system effectively reduces false alarms while maintaining high detection accuracy by using a combination of OCSVM and PIO. | Focus on reducing false alarms in network intrusion detection systems. |
| Sadhwani et al. [73] | Compact and lightweight IDS with ML classifiers. | RF achieved 100% accuracy for ToN-IoT, NB for BoT-IoT. | Developing compact and lightweight IDS models. |
| Khanday et al. [42] | Lightweight IDS with novel data pre-processing. | LSTM and ANN achieved 99% and 95% accuracy, respectively. | Improving accuracy of lightweight IDS with novel data pre-processing. |

Table 8
Summary of Survey/Literature Review Studies.

| References | Purpose | Main Findings | Trend and Focus |
| --- | --- | --- | --- |
| Mothukuri et al. [89] | An overview of the privacy and security features of federated learning | This paper examines the most recent methods in Federated learning with an emphasis on privacy and security concerns. | The study identifies key challenges and potential solutions in securing FL systems. |
| Valkenburg & Bongiovanni [11] | Systematically review the application of the Three Lines Model in cybersecurity | The Three Lines Model is effective but has limitations in its application in cybersecurity. | Governance frameworks in cybersecurity. |
| Alsoufi et al. [56] | Review anomaly-based IDS in IoT using deep learning | Deep learning techniques are effective in dealing with security challenges in IoT ecosystems, with supervised methods performing better. | Deep learning approaches are becoming more and more popular for IoT anomaly detection, particularly after 2018. |
| Kumar et al. [44] | Review various IDS techniques used in network environments | Identified key trends and challenges in IDS research, with recommendations for future work. | Focus on evolving IDS techniques to address emerging network security threats. |
| Dasgupta et al. [18] | Review ML techniques in cybersecurity | ML techniques significantly enhance detection and response capabilities in cybersecurity. | Emphasis on the integration of ML techniques for proactive and reactive cybersecurity measures. |
| Khraisat et al. [2] | Review IDS techniques, datasets, and challenges | More recent and extensive datasets covering a broad range of malware activity are required. | Future IDS should address the challenge of detecting newer types of malware and overcoming evasion techniques. |

quently employed, but they suffer from issues such as class imbalance, inadequate representation of certain attack types, and a focus on general IT networks rather than the distinctive features of IoT environments. This raises concerns about the generalizability of the proposed intrusion detection models to genuine IoT conditions. The use of outdated datasets like KDDCUP'99 further complicates this, as they may not be relevant to present-day cyber threats. Additionally, the scarcity of datasets explicitly tailored for IoT settings remains an open issue. The lack of datasets that accurately reflect the complexities of IoT network traffic, such as the variety of devices and communication patterns, limits the creation of effective intrusion detection systems tailored to the unique challenges of IoT environments.

Researchers are encouraged to combine more recent datasets or develop new ones to address emerging challenges, because the rapid growth of attack techniques necessitates datasets that reflect the current threat landscape. Integrating appropriate ML models into IoT security frameworks presents another set of difficulties, particularly regarding computational costs. IoT devices typically have limited resources, so deploying complex ML models on them demands careful attention to computational efficiency and energy usage. There is a need for resource-demanding model


Table 9
Summary of Datasets.

| Dataset Name | Released | Provider | Advantages | Disadvantages |
| --- | --- | --- | --- | --- |
| N-BaIoT | 2018 | Yair Meidan and collaborators | Real-world IoT operations, emphasis on typical assaults, detect attacks from hacked IoT devices, differentiate between hour-long and millisecond-long IoT-based threats | Limited to specific IoT devices, focuses on Mirai and Bashlite DDoS attacks only. |
| WUSTL-IIOT2021 | 2021 | Washington University in St. Louis. | Realistic IoT network, diverse attack types. | May require removal of certain columns for generalization, accuracy and prediction speed drop on manipulated dataset. |
| WUSTL-EHMS-2020 | 2020 | Washington University in St. Louis. | Created using real-time Enhanced Healthcare Monitoring System (EHMS) testbed. Provides ample training samples for machine learning, ensuring excellent detection performance of ML algorithms. | Limited to healthcare monitoring systems. |
| NSL-KDD | Refined version of the KDD'99 dataset. | Canadian Institute for Cybersecurity at the University of New Brunswick. | No redundant or duplicate records, balanced distribution of records across different difficulty levels. | Suffers from class imbalance and overlap, lacks IoT-specific data, and does not cover all types of DDoS attacks. |
| ToN-IoT and BoT-IoT | 2020 and 2023 | School of Engineering and Information Technology, UNSW Canberra at ADFA | Realistic network environment and large-scale network designed, normal and botnet traffic, includes heterogeneous data sources. | Limited to specific IoT devices, lacks a comprehensive feature set, suffers from significant class imbalance, and does not include all DDoS attack types. |
| WSN-DS | 2016 | SEL - PSU. The Security Engineering Lab (SEL), College of Computer and Information Sciences, Prince Sultan University, Saudi Arabia. | Specialized for WSN, it features four forms of DoS assaults. The dataset has been used in numerous machine-learning-based intrusion detection systems. | Limited to WSN, class imbalance, and dynamic network behavior issues which need to be addressed before employing this dataset for any classifier model development. |
| UNSW-NB / UNSW-NB15 | 2015 | University of New South Wales (UNSW), Canberra at the Australian Defence Force Academy (ADFA) | Real modern normal and synthetic contemporary attack behaviors. | Issues with class imbalance, class overlap, cannot detect all kinds of attacks, not focused on IoT context. New attacks arise and old attacks keep evolving. DDoS attacks are not taken into consideration. |
| UNR-IDD | 2023 | The University of Nevada, Reno. | Offers network port statistics for detailed intrusion analysis, providing a diverse range of samples and scenarios for researchers. | Limited to network port statistics. |
| UKM-IDS20 | 2020 | The dataset was provided by Universiti Kebangsaan Malaysia. | Includes novel attack types such as ARP Poisoning, DoS, Port Scan, and various exploits. | |
| SIMARGL | 2021 | RoEduNet, Romania. | Features derived from live traffic make the dataset highly suitable for building deployable network intrusion detection systems [1]. | This dataset serves as the basis for the multi-class classification issue. |
| NF-UQ-NIDS | 2020 | University of Queensland. | Supports the merging of multiple smaller datasets into larger, more universal NIDS datasets, encompassing flows from various network setups and diverse attack scenarios. | Prone to dimensional overload from extensive feature collection and storage, limiting the evaluation of ML model generalization. |
| NF-ToN-IoT | 2020 | The University of New South Wales (UNSW) Canberra at the Australian Defence Force Academy (ADFA). | Heterogeneous data sources, high accuracy in classifying network traffic. | ML model performance using the NF-ToN-IoT dataset is frequently inconsistent. |
| CICIDS / CIC-IDS2017/CSE-CIC-IDS2018 | 2017/2018 | The Communications Security Establishment (CSE) & the Canadian Institute for Cybersecurity (CIC)1. | Offers a comprehensive set of network traffic and image representations, beneficial for network security and IDS research and development. | The datasets do not account for dynamic network changes, such as network upgrades or evolving attack techniques. |
| Seven CPS-specific | 2022 | The School of Computer Science and Informatics, De Montfort University, Leicester, UK. | Captures system behavior and interactions for AI algorithms in securing cyber-physical systems. | Datasets generated due to scarcity of real CPS datasets. |
| KDDCUP/KDDcup99 | 1999 | MIT Lincoln Lab; The Defense Advanced Research Projects Agency (DARPA) as part of the Knowledge Discovery and Data Mining (KDD) Cup competition. | Publicly available and can be used for benchmarking, and comparison of different IDS, preprocessed and cleaned, suitable for supervised learning tasks. | Data is relatively old, which may not reflect modern attack techniques and is not perfectly balanced, with more normal connections than attack connections. |
| DS2OS | 2022 | University of California, Berkeley. | A large and diverse real-world IoT time series dataset that includes various data types, such as sensor data, network traffic, and system logs; well-documented and easy to use. | Unlabeled data and limited coverage of different IoT devices. |
| IEEE Dataport Image | 2021 | IEEE DataPort. | Provides a comprehensive collection of network traffic and image representations, useful for network security and IDS research and development. | Effectiveness depends on the use case, and may not cover all possible attack scenarios. |
| IoT-23 | 2020 | Developed by the Avast AIC laboratory, funded by Avast Software. | Labeled dataset for IoT malware infections and benign traffic, suitable for training machine learning algorithms. | May not be sufficient for researchers who require a larger dataset, capture range may not be representative of the current IoT landscape, not diverse enough as it only contains traffic from IoT devices. |
| CIC-MalMem-2022 | 2022 | The Canadian Institute for Cybersecurity (CIC) at the University of New Brunswick. | Suitable for malware detection using ML algorithms, closely reflects real-world scenarios with prevalent malware types. | May not reflect the current malware landscape; limited diversity with only Ransomware, Spyware and Trojan Horse types. |
| LUFlow | 2020 | Lancaster University. | Continuously updated using the Citrus framework. | After downloading the dataset, certain columns need to be removed as they are unique to the attacks and would expose the type of attack to the model. |
| MQTT-IoT-IDS2020 | 2021 | Hanan Hindy, Christos Tachtatzis, Robert Atkinson, Ethan Bayne, and Xavier Bellekens. | Includes generic networking scanning and MQTT brute-force attacks, suitable for ML applications; raw pcap files allow for in-depth analysis of MQTT IoT network communications and related attacks. | Requires fixing the number of features, composed of imbalanced data. |
This table presents a comprehensive overview of various datasets relevant to cybersecurity, with a focus on intrusion detection and network security. Each dataset is
defined by its source, purpose, and features, serving as valuable resources for researchers developing and evaluating machine learning-based solutions. The datasets
range from IoT-specific to those designed for Cyber-Physical Systems (CPS) security, covering diverse attack types, network traffic scenarios, and data formats. By
highlighting both the advantages and disadvantages, the table offers insights into the suitability of each dataset for different research and development needs in the
dynamic field of cybersecurity.
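
Several datasets in Table 9 list class imbalance as a disadvantage. As a minimal sketch of one common mitigation (not a method taken from the surveyed studies), the Python snippet below derives inverse-frequency class weights from a list of labels. The function name `inverse_frequency_weights` and the toy label counts are illustrative assumptions, not values from any of the datasets.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class, inversely proportional to its frequency.

    Uses the heuristic weight = n_samples / (n_classes * class_count),
    so that the weights summed over all samples equal n_samples.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical, heavily imbalanced label distribution (illustration only).
labels = ["benign"] * 900 + ["dos"] * 90 + ["u2r"] * 10
weights = inverse_frequency_weights(labels)

for cls in sorted(weights, key=weights.get):
    print(f"{cls}: {weights[cls]:.3f}")
```

In this toy case the rarest class receives the largest weight, so a weighted loss would penalize mistakes on minority attack classes more heavily than mistakes on the dominant benign class.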

combinations and real-world testing on IoT devices, raising issues about the adaptability and proficiency of proposed frameworks when deployed on resource-constrained IoT gadgets. The gap lies in the lack of comprehensive assessments on various IoT gadgets and datasets, restricting the versatility and flexibility of proposed intrusion recognition frameworks. The review also points out specific challenges, including the need to balance computational efficiency with accuracy and the critical role of selecting models that suit the characteristics of the dataset. High false positive rates in anomaly-based IDS add complexity, making it difficult to differentiate between harmless anomalies and genuine attacks, which results in excessive alerts and undermines trust in the IDS.

Furthermore, a notable gap in current research is the absence of standardized evaluation metrics and benchmarks specifically tailored for IoT intrusion detection systems. Although many studies claim high accuracy rates under controlled conditions, the lack of established measures and benchmarks presents issues for comparing and generalizing these results across different contexts. Establishing evaluation standards that closely replicate real-world IoT environments is crucial for achieving more reliable assessments of intrusion detection models.

• Dataset Imbalance and Outdated Data: Many datasets used in IDS research, such as NSL-KDD and KDDCUP99, are outdated and exhibit significant class imbalance issues. These datasets often do not capture the latest types of attacks or the distinctive qualities of IoT environments, limiting the generalizability and effectiveness of IDS models trained on them.
• Computational Complexity: Integration of machine learning models into IoT security systems presents considerable challenges, particularly regarding computational costs. IoT devices are frequently resource limited, necessitating careful consideration of computing efficiency and power usage.
• High False Positive Rates: Anomaly-based IDS, while effective in detecting novel attacks, tend to have significant false positive rates. This occurs because of difficulty in distinguishing between benign anomalies and actual attacks, leading to unnecessary alerts and reduced trust in the IDS.
• Lack of Real-world Testing: Many IDS models are evaluated in controlled environments using synthetic datasets, which fail to fully represent the complexities of real-world IoT networks. More extensive real-world testing is needed to validate the performance and adaptability of these models.

6.2. Open issues

• Feature Selection and Data Preprocessing: Effective feature selection and data preprocessing are critical for improving IDS performance. However, there is no standardized approach for these processes in the context of IoT, leading to inconsistent results and difficulties in comparing different IDS models.
• Standardized Evaluation Metrics: The absence of standardized evaluation metrics and benchmarks for IoT IDS represents a critical gap in the field. Current evaluation methods vary widely, making it challenging to evaluate the relative efficacy of different IDS approaches. Developing standardized metrics that mimic real-world IoT scenarios is essential for more accurate assessments.
• Adaptability to Evolving Threats: IDS must be able to adjust to new and emerging attack vectors because of the dynamic nature of cyber threats. Many existing IDS models lack this adaptability, making them less effective over time. Continuous learning and updating mechanisms are required to maintain IDS relevance in the face of emerging threats.

6.3. Gap analysis

A noticeable gap in the current landscape is the limited focus on real-world applicability and the absence of standardized evaluation for IDS. Although studies claim high accuracy rates in controlled settings, their performance in real-world IoT contexts remains ambiguous. There is a need for standardized benchmarks and evaluation metrics that mimic the complexity of IoT environments, allowing for more accurate assessments of proposed intrusion detection solutions.

• Comprehensive IoT-specific Datasets: Many of the existing datasets struggle with severe class imbalance issues and a neglect of IoT-specific characteristics. The use of outdated datasets raises concerns about their applicability to contemporary cyber threats, and there is a scarcity of datasets designed for IoT. Developing new datasets that reflect the complexities of IoT network traffic is crucial for advancing IDS research.
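
The false-positive problem described for anomaly-based IDS can be made concrete with a small sketch. Assuming a detector that outputs an anomaly score per event (higher meaning more suspicious) and a held-out set of benign traffic, one simple option is to pick the alert threshold from the benign score distribution so that the empirical false positive rate stays within a chosen budget. The function names and scores below are hypothetical and are not drawn from any surveyed model.

```python
import math


def threshold_for_target_fpr(benign_scores, target_fpr):
    """Pick a threshold so that at most target_fpr of benign scores exceed it.

    An event is flagged when its score is strictly greater than the threshold.
    """
    ordered = sorted(benign_scores)
    n = len(ordered)
    allowed = math.floor(target_fpr * n)  # benign events we tolerate flagging
    if allowed >= n:
        return float("-inf")  # budget allows flagging everything
    # Everything strictly above this score is flagged: at most `allowed` events.
    return ordered[n - allowed - 1]


def empirical_fpr(benign_scores, threshold):
    """Fraction of benign events whose score exceeds the threshold."""
    flagged = sum(score > threshold for score in benign_scores)
    return flagged / len(benign_scores)


# Hypothetical anomaly scores for ten benign validation events.
benign_scores = [0.1, 0.2, 0.15, 0.3, 0.25, 0.05, 0.4, 0.9, 0.35, 0.12]
thr = threshold_for_target_fpr(benign_scores, target_fpr=0.2)
print(thr, empirical_fpr(benign_scores, thr))
```

A tighter budget raises the threshold and reduces alerts, but it also risks missing attacks whose scores fall below it, which is the trade-off between false positives and detection rate that the text describes.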


Table 10
Dataset Descriptions for Cybersecurity Research.

Dataset Name Descriptions

N-BaIoT This dataset contains real traffic data from 9 commercial IoT devices infected by BASHLITE malware and Mirai attacks. It includes 115 features
covering both normal and ten types of attack traffic (e.g., Scan, Junk, UDP flooding) to evaluate intrusion detection models. The dataset captures
authentic botnet traffic in a controlled environment.
WUSTL-IIOT2021 Designed for AI and ML-based research on Industrial Internet of Things (IIoT) security, this dataset includes network data representing various traffic
types, such as normal, command injection, DoS, reconnaissance, and backdoor traffic, collected via an IIoT testbed.
WUSTL-EHMS-2020 Created from a testbed combining network flow metrics with patients’ biometric data, this dataset addresses the scarcity of integrated biometric and
network traffic datasets. It includes both normal and malicious traffic types (e.g., man-in-the-middle attacks like spoofing and data injection) and
utilizes real-time data capture from medical sensors, gateways, and control components.
NSL-KDD A refined version of the original KDD’99 dataset, it addresses several known issues with the earlier version. It consists of four major attack types: DoS,
Probe, R2L, and U2R. This dataset is widely used for benchmarking intrusion detection methods and serves as a standard reference, despite not
perfectly representing real-world networks.
ToN-IoT and BoT-IoT These modern datasets are designed for AI-driven cybersecurity applications and include heterogeneous data from IoT sensors, multiple operating
systems (Windows 7/10, Ubuntu 14/18), and network traffic. They are suitable for evaluating intrusion detection systems, threat intelligence, malware
detection, and more.
WSN-DS Focused on Wireless Sensor Networks (WSNs), this dataset contains 374,661 records across 17 features, which help detect and classify various Denial of
Service (DoS) attacks such as Blackhole, Grayhole, Flooding, and Scheduling. The dataset uses the LEACH protocol for analysis.
UNSW-NB This dataset comprises 2,540,044 instances of modern, realistic network activity, both normal and abnormal, including 9 types of attacks. It is
commonly used for machine learning-based intrusion detection solutions and includes data sources like DNS and HTTP information.
UNR-IDD Created to address issues like suboptimal performance and inadequate tail class representation, this dataset leverages network port statistics for
fine-grained intrusion analysis. It contains 34 features that differentiate between normal data and various attack types, such as TCP-SYN, PortScan, and
Overflow.
UKM-IDS20 A dataset for network intrusion detection with 46 features covering attacks like DoS, scans, ARP poisoning, and exploits. The training and test sets
contain instances of both normal and malicious traffic and are analyzed using feature selection and rule-based classifiers in machine learning.
SIMARGL Assembled from real-life network traffic, this dataset contains 44 features with an unbalanced class distribution. It is evaluated through cross-validation
to provide robust security against emerging cyber threats.
NF-UQ-NIDS A comprehensive dataset for machine-learning-based Network Intrusion Detection Systems (NIDSs), including 11,994,893 records of benign flows and
various attack types (DoS, DDoS, Injection, Reconnaissance, etc.). It demonstrates the advantages of combining multiple smaller datasets into a larger,
more universal one.
NF-ToN-IoT Designed for Industry 4.0/IoT and IIoT security research, this dataset includes diverse data from IoT/IIoT sensors, multiple operating systems, and
network traffic. It contains 43 features with over 16 million data rows classified as attack or benign, covering categories like DoS, DDoS, Injection, and
Reconnaissance.
CIC-IDS2017 A dataset that includes benign traffic and various attack types (e.g., Brute-force, Heartbleed) across seven scenarios. It captures network traffic and
system logs with 80 features extracted using CICFlowMeter-V, aimed at developing and evaluating intrusion detection systems.
Seven CPS-specific Essential for applying AI algorithms to Cyber-Physical Systems (CPS) security, this dataset includes sensor measurements, network traffic elements,
attack representations, and necessary dataset features.
KDDcup99 Based on the DARPA’98 IDS evaluation program, this dataset contains 4 GB of raw TCP dump data, processed into around 5 million connection records.
It serves as a benchmark dataset for intrusion detection systems, with records labeled as normal or attack, covering DoS, U2R, R2L, and Probing
categories.
DS2OS A valuable resource for IoT time series data analysis, this dataset is large, diverse, and well-documented, covering various data types. However, it is not
labeled and may not fully represent all IoT data types.
IEEE Dataport Image Contains over 800 samples of normal and malicious traffic visualized in binary format, serving as a benchmark for intrusion detection systems. Rich
visual features enhance its utility, with additional data in image format from five attack scenarios [96].
IoT-23 A labeled dataset of malicious and benign IoT network traffic, comprising twenty scenarios that represent various types of attacks on IoT networks. It
includes over 760 million packets and 325 million labeled flows captured from 2018 to 2019 at the Stratosphere Laboratory.
CIC-MalMem-2022 A collection of malware memory analysis data with 58,596 records, balanced between malicious and benign memory dumps. It includes Spyware,
Ransomware, and Trojan Horse malware, designed for testing obfuscated malware detection methods through memory analysis.
LUFlow A flow-based intrusion detection dataset with robust ground truth, correlated with threat intelligence services. It contains telemetry from honeypots
within Lancaster University’s network and features an autonomous labeling mechanism for continuous data capture, labeling, and publishing.
MQTT-IoT-IDS2020 A simulated realistic MQTT IoT network dataset for evaluating IoT Intrusion Detection Systems. It includes various MQTT scenarios and attack data,
generated using a simulated MQTT network with sensors, a broker, a camera, and an attacker. It records five scenarios: normal operation, aggressive
scan, UDP scan, Sparta SSH brute-force, and MQTT brute-force attacks [41].

Table 11
Most Useful Metrics.

| Metric | Description and Formula | Importance | Common Use Cases |
| --- | --- | --- | --- |
| Accuracy | Measures the proportion of true results (true positives + true negatives) in the total cases examined [5,83]. $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ | High | General performance evaluation of IDS |
| Precision | Measures the proportion of true positives among all positive results predicted by the IDS [65,79]. $\text{Precision} = \frac{TP}{TP + FP}$ | High | Fraud detection, spam filtering |
| Recall | Measures the proportion of actual positives correctly identified by the IDS [65,85]. $\text{Recall} = \frac{TP}{TP + FN}$ | High | Intrusion detection, medical diagnoses |
| F1 Score | Harmonic mean of precision and recall, useful for imbalanced datasets [15,68]. $\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ | Medium | Imbalanced datasets in machine learning |
| False Positive Rate (FPR) | Measures the proportion of false positives among all actual negatives [47,71]. $\text{FPR} = \frac{FP}{FP + TN}$ | Medium | Medical testing, fraud detection |
| False Negative Rate (FNR) | Measures the proportion of false negatives among all actual positives [5,47]. $\text{FNR} = \frac{FN}{FN + TP}$ | High | Security systems, risk assessment |
| AUC-ROC | Evaluates the performance of a binary classification system by plotting the True Positive Rate against the False Positive Rate at various threshold settings [26,50]. AUC-ROC = Area under the ROC curve | High | Classification model evaluation |
| Detection Rate | Measures the proportion of intrusions correctly identified by the IDS out of the total number of intrusions [83,85]. $\text{Detection Rate} = \frac{TP}{TP + FN}$ | High | Intrusion detection, malware detection |
| False Alarm Rate | Measures the frequency of false alarms, i.e., the proportion of non-intrusive events incorrectly classified as intrusions [31,48]. $\text{False Alarm Rate} = \frac{FP}{FP + TN}$ | Medium | Network security systems |
| Matthews Correlation Coefficient (MCC) | Provides a balanced measure that considers all four categories of the confusion matrix (TP, TN, FP, FN) [12,17]. $\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$ | Medium | Binary classification in imbalanced datasets |
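
The count-based formulas in Table 11 can be checked with a short Python sketch that computes them from the four confusion-matrix counts. The function name `ids_metrics`, the zero-denominator convention (returning 0.0), and the example counts are assumptions made for illustration. AUC-ROC is omitted because it requires scores at multiple thresholds rather than a single confusion matrix.

```python
import math


def ids_metrics(tp, tn, fp, fn):
    """Compute Table 11 metrics from confusion-matrix counts."""

    def ratio(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # identical to the detection rate
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),  # identical to the false alarm rate
        "fnr": ratio(fn, fn + tp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts: 100 attacks and 900 benign events.
m = ids_metrics(tp=90, tn=880, fp=20, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the accuracy is 0.97 while precision is noticeably lower, which illustrates why accuracy alone can be misleading on imbalanced traffic and why the table also lists F1 and MCC.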


Table 12
Summary of Key Findings and Conclusions from Recent Studies on IoT Intrusion Detection Systems.
| Ref. | Categories | Pros | Cons |
| --- | --- | --- | --- |
| [96] | Method Integration for IoT Security | Real detection and classification of attacks in IoT environments, outperforming present ways on four out of five datasets. | Requires noteworthy resources due to the addition of four models. Further testing on real IoT devices and evaluations on various datasets are needed. |
| [73] | DDoS Attack Prevention in IoT Networks | Model commendably detects and prevents DDoS attacks in IoT networks by evaluating network patterns. | Applicability may be limited across all IoT networks. Performance depends on specific datasets, introducing variations in attack classes and data quality during training and testing. |
| [34] | Data Standardization for Enhanced Reliability | Offers benefits such as enhancing data reliability, precision, and applicability. Standardizes data for simpler comparison and scrutiny, rendering it more suitable for machine learning models. | |
| [75] | Security Improvement in Cyber-Physical Systems (CPS) | Research proposes a method to improve CPS security using a mix of machine learning approaches. | Insufficiently compares machine learning techniques for CPS intrusion detection in depth. |
| [97] | Computational Expense Reduction in Real-Time Testing | Suggested method reduces computational expense during real-time data testing. Offers an optimal algorithm that eliminates the need for manual hyperparameter tuning calculations. | |
| [61] | Feature Selection vs. Extraction | Feature selection enhances detection performance and reduces training and inference time. | A drawback of feature extraction is that reducing features can lead to over-reliance on a limited set, risking compromised performance. |
| [91] | Strengths and Limitations of Naive Bayes Model | Naive Bayes model exhibits a low false positive rate. | Lower accuracy, evaluation of a limited number of models, and reliance on a single dataset acknowledged by the authors. |
| [92] | Efficient IDS Method for IoT Devices | Demonstrates superior efficiency and initial accuracy, outperforming existing techniques on three key datasets. | Validation is limited to these datasets, requiring further exploration of its effectiveness across different datasets. |
| [100] | Sophisticated Techniques for Data Flow Analysis | New method employs sophisticated techniques to analyze data flow and augment the quantity of less frequent samples. | Effectiveness heavily relies on the availability of a comprehensive and representative dataset. |
| [27] | Effective Model for IIoT Network Traffic | Model is effective in detecting harmful activities in IIoT network traffic, outperforming other methods in terms of accuracy. | Performance is best in specific intrusion detection scenarios, requires a large dataset for training, and hasn't been tested in real-world scenarios. |
| [93] | Machine Learning and Deep Learning for IoT Security | Recommended system employs machine learning and deep learning methodologies to improve accuracy and efficiency in detecting security threats in IoT. | Specifically designed for IoT network traffic, not applicable for real-time scenarios, and requires significant computational power. |
| [42] | Real-Time DDoS Attack Detection for Lightweight IoT Networks | Model excels in real-time DDoS attack detection and outperforms existing models. | Requires extensive training data, may struggle with new attack types, and is specifically designed for lightweight IoT networks. |
| [86] | On-Device Machine Learning for Real-Time Applications | Authors suggest using on-device machine learning for better data security in real-time applications. | Approach requires a significant amount of energy. |
| [57] | Evaluation of Machine Learning in IDS for IoT | Explores the application of machine learning in IDS to detect complex threats within IoT environments. | The model's effectiveness relies on feature extraction, and its real-world performance is not demonstrated, potentially varying with different datasets. |
[69] Machine Learning Challenges in Authors recommend using machine learning for better data The limited power and computational resources of IoT devices
Real-Time IoT Security security in real-time applications. make implementing advanced security measures challenging.
[23] Drawbacks in IoT Electric Vehicle The method incorporates with binary and multiclass models Study doesn’t consider deep learning, machine learning tests were
Charging Stations (EVCSs) Security to detect a wide range of cyberattacks done in simulation (not with a real EVCS system), and reliance
only on the IoT-23 dataset may limit assessment accuracy.
[85] Combining ML and DL for Improved The method integrates ML and DL to enhance detection rates Combining SMOTE and XGBoost also add to the computational
Detection Rates and reliability. It uses SMOTE for data balancing and overhead, especially when dealing with large datasets.
XGBoost for feature selection, minimizing risks of overfitting
and Type-1 or Type-2 errors.
[43] Advantages and Limitations of The system provides multiple benefits, such as high accuracy, It demands extensive training data, lacks the ability to detect
Proposed IDS System minimal false positives, and the ability to detect emerging zero-day attacks, and requires regular updates.
attacks.
[76] Enhanced Accuracy with Fewer Techniques enhance accuracy, lower false alarms, and Limitations include limited dataset comparisons, unexplored
Features achieve high accuracy with fewer features, demonstrating effectiveness of SVM and Neural Networks, and sensitivity to
robustness and generalization. parameter changes in results.
[90] Effectiveness of MLEID Method in MLEID method effectively reduces harmful actions in botnet Potential downsides not specified. Model effectiveness depends
Botnet Attacks attacks on MQTT and HTTP protocols, surpassing traditional on factors like training data quality and may vary in diverse IoT
techniques in detection rates. networks or with novel attack types, necessitating careful
consideration for reliability.

• Integration of Lightweight Models: Integrating machine learning models into IoT security systems presents significant challenges, particularly regarding computational costs. IoT devices are often resource-constrained, making the deployment of complex ML models difficult without compromising detection metrics. Lightweight and energy-efficient IDS models that work on IoT devices with limited resources are required.
• Holistic Evaluation Approaches: Current research often isolates different aspects of IDS evaluation, such as accuracy, computational cost, and adaptability. A holistic approach that considers all these factors simultaneously is needed to develop truly effective IDS solutions for IoT environments. Furthermore, the absence of a unified strategy for feature selection and extraction methods is apparent in the studies. While some studies favor feature extraction over selection, others highlight the significance of particular classifiers in enhancing feature selection accuracy. A combined methodology for feature engineering would contribute to more coherent and comparable research.
• Deep Learning Techniques: Integrating deep learning techniques and thoroughly examining their benefits and drawbacks compared to traditional machine learning models is another area that requires further exploration. Although some studies have briefly discussed the effectiveness of deep neural networks (DNNs), there is a lack of comprehensive analysis regarding their performance across diverse IoT contexts and in response to various attack vectors. While advanced algorithms such as XGBoost and AdaBoost have demonstrated significant speed, efficiency, scalability, and simplicity in detecting DDoS attacks in IoT networks, their full potential remains underexplored.
• Privacy and Security Concerns: Privacy and security concerns also arise from the reliance on large-scale datasets for training machine learning models, as well as the susceptibility to adversarial attacks.


Table 13
Performance Metrics and Computational Efficiency.
Metrics | Descriptions | Importance
Accuracy | Overall correctness of the IDS [45] | Evaluates the effectiveness of different models
Precision | The ratio of correctly detected malicious instances to all instances classified as malicious [88] | Assesses the IDS’s ability to correctly identify malicious traffic
Recall (Sensitivity) | The percentage of malicious instances correctly detected among all actual malicious instances [29] | Determines the sensitivity of the IDS in identifying all malicious traffic
F1-Score | Harmonic mean of recall and precision [52] | Offers a balanced measure of precision and recall
False Positive Rate (FPR) | Benign instances incorrectly classified as malicious [81] | Critical for evaluating the IDS’s reliability and reducing false alarms
Training Time | Time required to train the model [14] | Crucial for timely updates and retraining of the IDS
Inference Time | Time required to classify new instances [101] | Essential for real-time intrusion detection
Resource Utilization | Computing resources needed for inference and training purposes [94] | Ensures the IDS works in resource-limited IoT contexts

Information security and privacy flaws in IoT networks must therefore be addressed.

The key gaps in this research can be outlined as follows:

• Numerous benchmark datasets have significant issues with class imbalance and neglect IoT-specific characteristics.
• The use of outdated datasets raises concerns about their applicability to contemporary cyber threats, and there is a scarcity of datasets tailored for IoT environments.
• Challenges exist in integrating machine learning models into IoT security systems, including resource-intensive model integrations and a lack of real-world testing on diverse IoT devices.
• Reducing computational expenses without compromising high detection accuracy is necessary, as is addressing the drawbacks of particular models, such as Naive Bayes.
• The lack of consistent evaluation criteria for IoT intrusion detection systems limits how broadly results can be compared and applied.
• A unified approach for feature selection methods is missing, leading to inconsistencies in research methodologies.
• Deep learning techniques and their benefits and drawbacks, especially the performance of DNNs in diverse IoT contexts, are not fully explored.
• The effectiveness of ML algorithms like XGBoost and AdaBoost in DDoS attack detection in IoT networks needs further investigation.
• Privacy and security concerns arise from the reliance on large-scale datasets and vulnerability to adversarial attacks, posing challenges in finding novel DDoS attacks.

7. Conclusion and future directions

The explosive growth of the Internet of Things (IoT) has highlighted significant security challenges that necessitate advanced intrusion detection systems (IDS). This review has identified key gaps in the current IDS landscape, emphasizing the need for comprehensive IoT-specific datasets, lightweight and efficient models, and standardized evaluation metrics. Existing datasets often lack relevance to contemporary IoT environments and suffer from class imbalance, hindering the development of robust IDS solutions. Additionally, the integration of complex machine learning models into resource-constrained IoT devices poses significant computational challenges.

The reviewed studies highlight the importance of holistic evaluation approaches that consider accuracy, computational cost, and adaptability simultaneously. Moreover, while deep learning techniques offer promising results, their full potential in IoT contexts remains underexplored. Privacy and security concerns related to data reliance and vulnerability to adversarial attacks further complicate the deployment of IDS in IoT networks.

The feature engineering dilemma underscores the need for a consensus on effective techniques. Future research should compare feature selection with feature extraction and propose a unified approach for researchers. Ongoing exploration is crucial, considering challenges like energy consumption in on-device machine learning and the limitations of simulated testing environments. Moreover, future research should focus on creating comprehensive IoT-specific datasets that reflect the complexity and diversity of IoT networks, including various devices, communication patterns, and contemporary cyber threats. Optimizing lightweight, energy-efficient IDS models for implementation on IoT devices with limited resources is crucial, ensuring a balance between computational efficiency and detection accuracy. Developing standardized benchmarks and evaluation criteria that mimic real-world IoT scenarios will enable more accurate assessments and comparisons of IDS models. Adopting holistic evaluation methodologies that consider accuracy, computational cost, and adaptability, along with combining feature selection and extraction techniques into a unified framework, can enhance research coherence.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Md Mahbubur Rahman: Writing – original draft, Conceptualization, Methodology, Data curation. Shaharia Al Shakil: Writing – review & editing, Conceptualization, Data curation, Formal analysis, Investigation, Project administration. Mizanur Rahman Mustakim: Writing – review & editing, Validation, Visualization.

References

[1] S.K. Verma, A. Gupta, S.K. Bhatia, B.P. Singh, A training-resistant anomaly detection system, Computers & Security 73 (2018) 106–120, doi:10.1016/j.cose.2017.10.009.
[2] P.V.A. Khraisat, I. Gondal, J. Kamruzzaman, Survey of intrusion detection systems techniques datasets and challenges, Cybersecurity 2 (1) (2019).
[3] M. Abolfathi, S. Inturi, F. Banaei-Kashani, J. Jafarian, Toward enhancing web privacy on https traffic: a novel superlearner attack model and an efficient defense approach with adversarial examples, Comput. Secur. 139 (2023) 103673, doi:10.1016/j.cose.2023.103673.
[4] K. Ahmed, Z. Khan, Tversky loss for detecting rare network intrusions in IoT, in: Proceedings of the Network Security Conference, 2019.
[5] M. Ahmed, A.N. Mahmood, J. Hu, A survey of network anomaly detection techniques, J. Netw. Comput. Appl. 60 (2016) 19–31.
[6] N.O. Aljehane, et al., Golden jackal optimization algorithm with deep learning assisted intrusion detection system for network security, Alex. Eng. J. 86 (2024) 415–424.
[7] S. Li, et al., HDA-IDS: a hybrid dos attacks intrusion detection system for IoT by using semi-supervised CL-GAN, Expert Syst. Appl. 238 (2024), doi:10.1016/j.eswa.2024.122198.
[8] T.H. Truong, et al., Detecting cyberattacks using anomaly detection in industrial control systems: a federated learning approach, Comput. Ind. 132 (2021) 103509, doi:10.1016/j.compind.2021.103509.
[9] V. Ravi, et al., Recurrent deep learning-based feature fusion ensemble meta-classifier approach for intelligent network intrusion detection system, Comput. Electr. Eng. 102 (2022) 108156.
[10] A.-R. Al-Ghuwairi, Y. Sharrab, D. Al-Fraihat, M. AlElaimat, A. Alsarhan, A. Algarni, Intrusion detection in cloud computing based on time series anomalies utilizing machine learning, J. Cloud Comput. 12 (1) (2023). 127–17.


[11] I. Bongiovanni, B. Valkenburg, Unravelling the three lines model in cybersecurity: a systematic literature review, Comput. Secur. 139 (2024) 103708.
[12] P. Baldi, S. Brunak, Y. Chauvin, C.A.F. Andersen, H. Nielsen, Assessing the accuracy of prediction algorithms for classification: an overview, Bioinformatics 16 (5) (2000) 412–424.
[13] E. Brown, R. Williams, Regression techniques for anomaly detection in IoT networks, IoT J. Secur. 9 (4) (2020) 147–159.
[14] L. Brown, Training time analysis for IDS models, J. Mach. Learn. Res. 17 (1) (2023) 305–320.
[15] T. Bu, W. Zhang, L. Li, Research on network intrusion detection based on improved PSO and SVM, J. Comput. 9 (4) (2014) 827–834.
[16] L. Chen, Y. Zhao, Support vector machine-based intrusion detection in IoT networks, IoT Secur. Rev. 5 (2) (2018) 22–34.
[17] D. Chicco, G. Jurman, The advantages of the matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation, BMC Genomics 21 (1) (2020) 1–13.
[18] D. Dasgupta, Z. Akhtar, S. Sen, Machine learning in cybersecurity: a comprehensive survey, J. Def. Model. Simul. 19 (2020) 102–120, doi:10.1177/1548512920951275.
[19] R. Devendiran, A. Turukmane, Dugat-LSTM: deep learning based network intrusion detection system using chaotic optimization strategy, Expert Syst. Appl. 245 (2024), doi:10.1016/j.eswa.2024.123027.
[20] J. Doe, A. Smith, Deep learning techniques for intrusion detection in IoT networks, J. IoT Secur. 12 (3) (2020) 45–59.
[21] R. Doriguzzi-Corin, D. Siracusa, FLAD: adaptive federated learning for DDoS attack detection (2024). [Online]. Available: doi:10.1016/j.cose.2023.103597.
[22] H.W.E. Kabir, J. Hu, G. Zhuo, A novel statistical technique for intrusion detection systems, Future Gen. Comput. Syst. 79 (2018) 303–318, doi:10.1016/j.future.2017.04.001.
[23] M. ElKashlan, M.S. Elsayed, A.D. Jurcut, M.A. Azer, A machine learning-based intrusion detection system for IoT electric vehicle charging stations (EVCSs), Electronics 12 (4) (2023) 1044, doi:10.3390/electronics12041044.
[24] E. Estopace, IDC forecasts connected IoT devices to generate 79.4ZB of data In 2025 - FutureIoT, FutureIoT (2019). Accessed: 2024-09-06, https://futureiot.tech/idc-forecasts-connected-iot-devices-to-generate-79-4zb-of-data-in-2025.
[25] W. Lee, J. Seok, E. Seo, Adversarial attack of ML-based intrusion detection system on in-vehicle system using GAN (2023) 3503–3538.
[26] T. Fawcett, An introduction to ROC analysis, Pattern Recognit. Lett. 27 (8) (2006) 861–874.
[27] T. Gaber, J.B. Awotunde, S.O. Folorunso, S.A. Ajagbe, E. Eldesouky, Industrial internet of things intrusion detection method using machine learning and optimization techniques, Wirel. Commun. Mob. Comput. 2023 (2023) 1–15, doi:10.1155/2023/3939895.
[28] T. Gates, J. Taylor, Challenges in securing the SCADA systems, Ind. Control Syst. Secur. 3 (2) (2006) 102–116.
[29] C. Glezer, Recall and sensitivity of IDS, ACM Trans. Privacy Secur. 18 (3) (2021) 7–15.
[30] J. Gu, S. Lu, An effective intrusion detection approach using svm with naïve bayes feature embedding, Comput. Secur. 103 (2021) 102158, doi:10.1016/j.cose.2021.102158.
[31] Z. Guo, Y. Ping, N. Liu, An effective intrusion detection framework based on MCLP/SVM optimized by time-varying chaos particle swarm optimization, Neurocomputing 211 (2016) 78–94, doi:10.1016/j.neucom.2016.01.054.
[32] K.E. Sabri, H. Alazzam, A. Sharieh, A lightweight intelligent network intrusion detection system using OCSVM and pigeon inspired optimizer, Appl. Intell. 52 (2021) 3527–3544.
[33] T.-N. Hoang, D. Kim, Supervised contrastive resnet and transfer learning for the in-vehicle intrusion detection system, Expert Syst. Appl. 242 (2024), doi:10.1016/j.eswa.2024.123046.
[34] M.A. Hossain, M.S. Islam, Ensuring network security with a robust intrusion detection system using ensemble-based machine learning, Array 19 (2023) 100306, doi:10.1016/j.array.2023.100306.
[35] H.P. Du, H.V. Vo, H. Nguyen, APELID: Enhancing real-time intrusion detection with augmented WGAN and parallel ensemble learning, Comput. Secur. 136 (2024), doi:10.1016/j.cose.2024.103567.
[36] B. Idowu, et al., A systematic review of patient use of mobile health technologies in adult diabetes management, Health Inf. J. 24 (2) (2018) 115–129.
[37] J. Alsamiri, K. Alsubhi, Federated learning for intrusion detection systems in internet of vehicles, Future Internet 15 (403) (2023) 36–53.
[38] L. Jiang, et al., Comprehensive review of intrusion detection systems and machine learning, Cybersecur. Adv. 15 (2) (2021) 70–83.
[39] D. Maisnam, K.J. Singh, U.S. Chanu, Intrusion detection system with svm and ensemble learning algorithms, SN Comput. Sci. 4 (2023) 517, doi:10.1007/s42979-023-01954-3.
[40] M.H. Khan, A.R. Javed, Z. Iqbal, M. Asim, A.I. Awad, DivaCAN: detecting in-vehicle intrusion attacks on a controller area network using ensemble learning, Comput. Secur. 139 (2024) 103712, doi:10.1016/j.cose.2024.103712.
[41] N. Khan, F. Ali, Robust regression for intrusion detection in IoT environments (2021) 230–242.
[42] S.A. Khanday, H. Fatima, N. Rakesh, Implementation of intrusion detection model for DDoS attacks in Lightweight IoT Networks, Expert Syst. Appl. 215 (2023) 119330, doi:10.1016/j.eswa.2022.119330.
[43] A. Kumar, S. Kumar, Intrusion detection based on machine learning and statistical feature ranking techniques, IEEE (2023), doi:10.1109/confluence56041.2023.10048802.
[44] R. Kumar, P. Singh, Efficient IoT Intrusion Detection Using Binary Cross-Entropy Loss, 2021.
[45] Y. Kutlu, Overall correctness of the IDS, J. Netw. Comput. Appl. 45 (2019) 123–130.
[46] K. Lee, S. Lee, Anomaly detection in IoT using regression-based techniques, Int. J. IoT Secur. 7 (4) (2019) 123–134.
[47] W. Lee, S.J. Stolfo, A framework for constructing features and models for intrusion detection systems, ACM Trans. Inf. Syst. Secur. (TISSEC) 3 (4) (2000) 227–261.
[48] D.D. Lewis, Sequential Sampling Algorithms for Training Text Classifiers, 1994.
[49] X. Li, Y. Zhao, Auc optimization for IoT intrusion detection systems, J. Cybersecur. Metrics 5 (1) (2021) 45–58.
[50] H.-J. Liao, C.-H.R. Lin, Y.-C. Lin, K.-Y. Tung, Intrusion detection system: a comprehensive review, J. Netw. Comput. Appl. 36 (1) (2013) 16–24.
[51] Y. Lin, S. Tan, Iou-based evaluation for IoT intrusion detection, J. Netw. Comput. Appl. 102 (2019) 81–93.
[52] M. Lundy, Balancing precision and recall: F1-score in IDS, IEEE Access 8 (2021) 135–145.
[53] M. Maddu, Y.N. Rao, Network intrusion detection and mitigation in SDN using deep learning models (2023).
[54] M. Melucci, On the trade-off between ranking effectiveness and fairness, Expert Syst. Appl. 241 (2024), doi:10.1016/j.eswa.2024.122709.
[55] N. Moustafa, et al., Holistic approach for anomaly-based intrusion detection systems, Netw. Secur. Adv. 5 (3) (2019) 45–60.
[56] M.M. Siraj, I. Nafea, F.A. Ghaleb, F. Saeed, M. Nasser, M.A Alsoufi, S. Razak, Anomaly-based intrusion detection systems in IoT using deep learning: a systematic literature review, Appl. Sci. 11 (2021), doi:10.3390/app11188383.
[57] D. Musleh, M. Alotaibi, F. Alhaidari, A. Rahman, R.M.A. Mohammad, Intrusion detection system using feature extraction with machine learning algorithms in IoT, J. Sens. Actuator Netw. 12 (2) (2023) 29, doi:10.3390/jsan12020029.
[58] D. Mukhopadhyay, N. Ahuja, G. Singal, N. Kumar, Automated DDoS attack detection in software defined networking, J. Netw. Comput. Appl. 187 (2021) 103108, doi:10.1016/j.jnca.2021.103108.
[59] G. Rajeshkumar, P.V. Sagar, P. Dadheech, S.R. Dogiwal, P. Velayutham, N. Satheesh, M.V. Rathnamma, S. Sengan, Flow-based anomaly intrusion detection using machine learning model with software defined networking for openflow network, Microprocess. Microsyst. 79 (2020) 103285, doi:10.1016/j.micpro.2020.103285.
[60] V.D. Phai, Q. Shi, N. Shone, T.N. Ngoc, A deep learning approach to network intrusion detection, IEEE Trans. Emerg. Top. Comput. Intell. 2 (1) (2018) 41–50.
[61] V.-D. Ngo, T.-C. Vuong, T. Van Luong, H. Tran, Machine Learning-Based Intrusion Detection: Feature Selection Versus Feature Extraction, (Cornell University), 2023, doi:10.48550/arxiv.2307.01570.
[62] P. Nguyen, T. Le, Improving IoT Intrusion Detection with Dice Loss, 2021.
[63] Q.-V. Pham, P.K.R. Maddikunta, P. Boobalan, S.P. Ramu, T.R. Gadekallu, Fusion of federated learning and industrial internet of things: a survey, Comput. Netw. 212 (2022) 109048, doi:10.1016/j.comnet.2022.109048. [Online]. Available:
[64] M.V.P. Fazio, M. Tropea, F.D. Rango, On packet marking and markov modeling for IP traceback: a deep probabilistic and stochastic analysis, Comput. Netw. 182 (2020) 107464, doi:10.1016/j.comnet.2020.107464.
[65] S. Duttagupta, P. Krishnan, K. Achuthan, VARMAN: multi-plane security framework for software defined networks, Comput. Commun. 148 (2019) 215–239, doi:10.1016/j.comcom.2019.09.011.
[66] V. Magudeeswaran, P. Rajasekaran, Malicious attacks detection using GRU-BWFA classifier, Biomed. Signal Process. Control 79 (2023) 104219.
[67] D. Patel, M. Patel, Sparse categorical cross-entropy for IoT intrusion detection, IoT Secur. J. 6 (3) (2019) 112–125.
[68] H. Kumar, R. Singh, R.K. Singla, An intrusion detection system using network traffic profiling and online sequential extreme learning machine, Expert Syst. Appl. 42 (22) (2015) 8609–8624, doi:10.1016/j.eswa.2015.07.015.
[69] D. Rangelov, P. Lämmel, L. Brunzel, S. Borgert, P. Darius, N. Tcholtchev, M. Boerger, Towards an integrated methodology and toolchain for machine learning-based intrusion detection in urban IoT networks and platforms, Future Internet 15 (3) (2023) 98, doi:10.3390/fi15030098.
[70] Z. Rehman, I. Gondal, M. Ge, H. Dong, M. Gregory, Z. Tari, Proactive defense mechanism: enhancing IoT security through diversity-based moving target defense and cyber deception, Comput. Secur. 139 (2024) 103685, doi:10.1016/j.cose.2023.103685.
[71] R.D.-N. Kumar, S. Thakur, A. Chakraborty, R. Sarkar, Intrusion detection in cyber-physical systems using a generic and domain-specific deep autoencoder model, Comput. Electr. Eng. 91 (2021) 107044, doi:10.1016/j.compeleceng.2021.107044.
[72] D. Mohanty, S. Vadigi, K. Sethi, S.P. Das, Federated reinforcement learning based intrusion detection system using dynamic attention mechanism (2023). [Online]. Available: doi:10.1016/j.jisa.2023.103608.
[73] S. Sadhwani, B. Manibalan, R. Muthalagu, P.M. Pawar, A lightweight model for DDOS attack detection using machine learning techniques, Appl. Sci. 13 (17) (2023) 9937, doi:10.3390/app13179937.
[74] P. Sanju, Enhancing intrusion detection in IoT systems: a hybrid metaheuristics-deep learning approach with ensemble of recurrent neural networks, J. Eng. Res. 11 (2023) 356–361.
[75] V.F. Santos, C. Albuquerque, D. Passos, S.E. Quincozes, D. Mossé, Assessing machine learning techniques for intrusion detection in cyber-physical systems, Energies 16 (16) (2023) 6058, doi:10.3390/en16166058.
[76] N. Saran, N. Kesswani, A comparative study of supervised machine learning classifiers for intrusion detection in internet of things, Procedia Comput. Sci. 218 (2023) 2049–2057, doi:10.1016/j.procs.2023.01.181.
[77] V. Sarker, et al., A survey of multi-access edge computing: Definition, application, and research challenges, Edge Comput. Rev. 12 (4) (2020) 55–77.


[78] S. Sheikh, et al., Security and privacy considerations in the internet of things, IoT Secur. J. 8 (1) (2020) 15–28.
[79] J. Grover, S.K. Gupta, M. Tripathi, Hybrid optimization and deep learning based intrusion detection system, Comput. Electr. Eng. 100 (2022) 107876.
[80] A. Smith, J. Doe, Using focal loss to handle imbalance in IoT intrusion detection, Cybersecur. Adv. 15 (2) (2020) 70–83.
[81] A. Smith, B. Jones, Evaluating the false positive rate in IDS, Int. J. Netw. Secur. 20 (2) (2022) 75–85.
[82] S. Srinivas Akkepalli, Anomaly-based network intrusion detection using hybrid CNN, Bi-LSTM deep learning techniques (2024) 0950–0958.
[83] R. Dara, S.C. Hamed, S.C. Kremer, Network intrusion detection system based on recursive feature addition and bigram technique, Comput. Secur. 73 (2018) 152–166, doi:10.1016/j.cose.2017.10.011.
[84] Y. Shiraishi, T.T. Thein, M. Morii, Personalized federated learning-based intrusion detection system: poisoning attack and defense (2024). [Online]. Available: doi:10.1016/j.future.2023.10.005.
[85] M.A. Talukder, K.F. Hasan, M.M. Islam, M.A. Uddin, A. Akhter, M.A. Yousuf, F. Alharbi, M.A. Moni, A dependable hybrid machine learning model for network intrusion detection, J. Inf. Secur. Appl. 72 (2023) 103405, doi:10.1016/j.jisa.2022.103405.
[86] N. Tekin, A. Acar, A. Ariş, A.S. Uluagac, V. Güngör, Energy consumption of on-device machine learning models for IoT intrusion detection, Internet Things 21 (2023) 100670, doi:10.1016/j.iot.2022.100670.
[87] S. Thomas, M. Green, Multi-label classification in IoT intrusion detection using hamming loss, IoT Secur. Privacy 3 (1) (2018) 45–56.
[88] E. Tsai, Precision in intrusion detection systems, IEEE Trans. Inf. Forensics Secur. 14 (5) (2020) 1012–1023.
[89] S. Pouriyeh, V. Mothukuri, R.M. Parizi, Y. Huang, A Survey on Security and Privacy of Federated Learning, Elsevier B.V., 2021, doi:10.1016/j.future.2020.10.007. [Online]. Available:
[90] S. Vanitha, P. Balasubramanie, Improved AnT colony optimization and machine learning based ensemble Intrusion Detection model, Intell. Autom. Soft Comput. 36 (1) (2023) 849–864, doi:10.32604/iasc.2023.032324.
[91] E.K. Viegas, A.O. Santin, P. Tedeschi, Toward a reliable evaluation of machine learning schemes for network-based intrusion detection, IEEE Internet Things Mag. 6 (2) (2023) 70–75, doi:10.1109/iotm.001.2300106.
[92] M. Vishwakarma, N. Kesswani, A new two-phase intrusion detection system with Naïve Bayes machine learning for data classification and elliptic envelop method for anomaly detection, Decis. Anal. J. 7 (2023) 100233, doi:10.1016/j.dajour.2023.100233.
[93] A.J. Wadate, S. Deshpande, Edge-based intrusion detection using machine learning over the IoT network, IEEE (2023), doi:10.1109/icetet-sip58143.2023.10151535.
[94] P. Wang, Resource utilization in ids for IoT environments, IEEE Trans. Comput. 67 (11) (2023) 145–158.
[95] J. White, P. Black, Optimizing cohen’s kappa for intrusion detection in IoT, J. IoT Cybersecur. 6 (2) (2018) 89–101.
[96] B. Xu, L. Sun, X. Mao, R. Ding, C. Li, IoT intrusion detection system based on machine learning, Electronics 12 (20) (2023) 4289, doi:10.3390/electronics12204289.
[97] H. Xu, Z. Sun, Y. Cao, H. Bilal, A data-driven approach for intrusion and anomaly detection using automated machine learning for the internet of things, Soft Comput. 27 (19) (2023) 14469–14481, doi:10.1007/s00500-023-09037-4.
[98] X. Lin, J. Wang, H. Zhai, Y. Fang, Y. Yao, A feature selection based on genetic algorithm for intrusion detection of industrial control systems, Comput. Secur. 139 (2024) 103708.
[99] L. Hu, Z. Zhang, Z. Wu, J. Wang, H. Wu, A network intrusion detection method based on semantic re-encoding and deep learning, J. Netw. Comput. Appl. 164 (2020), doi:10.1016/j.jnca.2020.102688.
[100] M. Zakariah, S.A. AlQahtani, M. Al-Rakhami, Machine learning-based adaptive synthetic sampling technique for intrusion detection, Appl. Sci. 13 (11) (2023) 6504, doi:10.3390/app13116504.
[101] J. Zhang, Real-time intrusion detection: inference time considerations, IEEE Internet Things J. 9 (4) (2023) 255–265.
[102] X. Zhang, H. Wang, Kl divergence for anomaly detection in IoT networks, IEEE Trans. Inf. Forensics Secur. 16 (2021) 1302–1314.
[103] P. Zhu, Z. Fan, S. Guo, K. Tang, X. Li, Improving adversarial transferability through hybrid augmentation, Comput. Secur. 139 (2024) 103674, doi:10.1016/j.cose.2023.103674.
