
Internet of Things 22 (2023) 100780

Contents lists available at ScienceDirect

Internet of Things
journal homepage: www.elsevier.com/locate/iot

Review article

Internet of Things (IoT) security dataset evolution: Challenges and future directions

Barjinder Kaur a , Sajjad Dadkhah a ,∗, Farzaneh Shoeleh a ,
Euclides Carlos Pinto Neto a ,∗, Pulei Xiong b , Shahrear Iqbal b , Philippe Lamontagne b ,
Suprio Ray a , Ali A. Ghorbani a
a Canadian Institute for Cybersecurity - University of New Brunswick (UNB), Fredericton, New Brunswick, Canada
b National Research Council Canada, Ottawa, Ontario, Canada

ARTICLE INFO

Keywords: Internet of Things (IoT); Intrusion detection system (IDS); IoT attacks; IoT security; IoT datasets; Machine learning

ABSTRACT

The evolution of mobile technologies has introduced smarter and more connected objects into our day-to-day lives. This trend,
known as the Internet of Things (IoT), has applications in smart homes, smart cities, industrial automation, and health monitoring
systems, and has become an essential component of the communication and networking industry. However, different communication
and protocol standards, weak security defaults and the difficulty of distributing updates have exacerbated cybersecurity threats to
critical applications that employ IoT. To mitigate the threats and counter these attacks, a promising approach is to develop a robust
intrusion detection framework specifically aimed at securing IoT. This paper presents our efforts to catalogue and compare attacks,
datasets and machine learning algorithms and architectures for intrusion detection systems for IoT devices. We classify attacks
aimed at IoT devices at different layers and protocols. This work also highlights potential features that can be used by machine
learning-based intrusion detection systems to detect different types of attacks. We provide a comparative study of IoT datasets used
for model training and identify key properties that help in assessing their suitability in particular scenarios. Finally, we discuss
our observations and propose research directions for building a robust IoT intrusion detection system.

1. Introduction

Advancements in mobile technologies have paved the way for the emergence of the Internet of Things (IoT), revolutionizing,
for example, healthcare, homes, and cities with smart infrastructure [1]. These smart devices can be seen as interconnected parts
and are built with network interface cards and lightweight processors and managed through various interface services (e.g., web
pages, graphical user interface, and remote login) [2]. With an ever-increasing demand for IoT devices, new and unique threats
are affecting these devices. In this sense, an important step is to develop a forensic IoT methodology to capture and investigate the
adversarial behavior and identify the role of devices in performing cyberattacks [3]. With IoT networks generating an enormous
amount of data, analytical techniques are generally incapable of handling this data in real-time.

∗ Corresponding author.

https://doi.org/10.1016/j.iot.2023.100780
Received 25 October 2022; Received in revised form 31 March 2023; Accepted 1 April 2023
Available online 6 April 2023
2542-6605/Crown Copyright © 2023 Published by Elsevier B.V. All rights reserved.
B. Kaur et al. Internet of Things 22 (2023) 100780

Fig. 1. A stair-step model of this article showing different steps and goals achieved after completing each step.

Table 1
A comparison of the state-of-the-art surveys.
Survey | IoT layered attacks | IoT attacks in-depth | IoT protocols | IoT vulnerabilities | IoT features | IoT detection techniques: deep learning methods | IoT detection techniques: traditional methods | Accuracy | IoT datasets
H. Hindy et al. [2018] [10] | ✓ | ✓ | ∙ | ∙ | ∙ | ⊚ | ✓ | ∙ | ∙
Ring et al. [2019] [11] | ⊚ | ⊚ | ∙ | ∙ | ∙ | ∙ | ∙ | ∙ | ∙
Khraisat et al. [2019] [12] | ⊚ | ∙ | ∙ | ∙ | ∙ | ∙ | ⊚ | ∙ | ⊚
MA Ferrag et al. [2020] [13] | ∙ | ∙ | ∙ | ∙ | ∙ | ✓ | ∙ | ∙ | ⊚
J Asharf et al. [2020] [14] | ✓ | ∙ | ✓ | ∙ | ∙ | ✓ | ✓ | ∙ | ⊚
Dilara et al. [2020] [15] | ∙ | ∙ | ∙ | ∙ | ∙ | ✓ | ∙ | ∙ | ⊚
Khraisat et al. [2021] [16] | ⊚ | ⊚ | ∙ | ∙ | ∙ | ✓ | ✓ | ∙ | ⊚
Hamid et al. [2021] [17] | ∙ | ∙ | ∙ | ∙ | ∙ | ∙ | ∙ | ∙ | ⊚
Aversano et al. [2021] [18] | ∙ | ✓ | ∙ | ⊚ | ⊚ | ✓ | ✓ | ∙ | ∙
Lohiya et al. [2021] [19] | ∙ | ∙ | ⊚ | ∙ | ∙ | ✓ | ✓ | ∙ | ∙
This survey | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓

(✓) Fully discussed, (⊚) Partially discussed, (∙) Not discussed.

Traditional IoT systems have static defense mechanisms in the form of firewalls, anti-viruses, or software patches, which have
become a constraint for device manufacturers due to the heterogeneity of such devices [4]. Furthermore, the multi-functionality
of these devices poses the challenge of developing and deploying a security mechanism capable of detecting attacks over a wide range
of devices. This has made robust security measures infeasible [5]. The widespread deployment of IoT applications
means that their security has become a vital element for secure communications [6]. Also, with the advancement in Artificial
Intelligence (AI), many deep and machine learning approaches are being proposed to improve intrusion detection systems (IDS).
Several survey papers have highlighted the traditional benchmark IDSs that have been proposed [7,8].
However, the existing studies do not offer insight into the latest developments. Current survey papers also evaluate the
threats at different layers in IoT systems by comparing tools available for simulations and their roles in the existing scope of IoT
architecture layers. However, with new protocols being proposed and used at different layers, the role of attack and data recording
tools was not discussed in such studies [9].
We stress that access to complete information is crucial to building a robust IDS for IoT environments. New IoT devices are
attached to the internet every second, and existing devices and methodologies are becoming obsolete. Therefore, security researchers
should have access to an investigation that touches every aspect of the latest developments related to IoT security, which could help
detect zero-day exploits and defend against new threats. Fig. 1 depicts a stair-step visualization of our contributions where each
‘‘step’’ highlights the topics discussed in the paper, and the ‘‘goals’’ define what the reader will learn before going to the next step.


1.1. Review of previous surveys

Recent research shows that numerous IDS initiatives were developed and evaluated using public datasets or datasets provided
on request. Several published studies highlight the security of IoT across different domains.
The survey presented in [10] focuses mainly on the various types of intrusions that an IDS must handle and describes them in detail.
The authors also provided a taxonomy of the attacks, including the layers which are most impacted by these attacks. The attack tools are
explained in great detail. However, the article neither gives direction about the algorithms that can be used to detect the attacks
nor discusses available benchmark datasets in enough detail to check a system's efficacy. An exhaustive literature survey of NIDS
datasets, covering a variety of features, formats, and relationships between datasets, is performed in [11]. The paper also discusses attack
scenarios, traffic generators, recording environments, and data repositories. It provides recommendations that help researchers
choose datasets based on their scenarios; in total, the study discussed 34 datasets and identified 15 different characteristics. However,
the number of samples in each attack for every dataset has not been discussed in detail. Furthermore, the protocols on which the
attacks are performed and the models used to detect those attacks are also not presented.
A survey presenting contemporary IDSs in the form of a taxonomy is given in [12], citing notable works while discussing the
datasets commonly used for evaluation purposes. That paper also surveys the evasion techniques attackers use to avoid detection,
along with future research challenges. The same authors [16] presented a comprehensive survey paper discussing recent notable works
related to various intrusion detection datasets. They also presented a taxonomy of IDS, mostly highlighting types of attacks based
on intrusion methods. Different performance evaluation metrics and the importance of feature selection are also mentioned while
discussing the challenges of evasion techniques. Although their study discusses the datasets related to attacks in the network and
partially covers the IoT environment, there is a lack of details about the features used to detect those attacks and the accuracy
achieved by the models for detecting those attacks.
The survey in [13] broadly categorizes deep learning approaches for intrusion detection into two different models: deep
discriminative and generative/unsupervised. The authors also categorized 35 different cybersecurity datasets into seven categories. Two real
traffic datasets, CSE-CICIDS2018 and Bot-IoT, have been used to evaluate the binary and multi-class performance of the proposed
models. However, there is a lack of discussion about other available IoT datasets.
The authors in [14] discussed different IoT architecture layer attacks with threat dimensions and attack surfaces, including
the technology, frequency bands, and respective data rates, and highlighting various available network and IoT-related datasets.
However, they do not discuss the features and approaches which help precisely detect these attacks.
A review of recent deep learning methods for intrusion detection is presented in [15]. A detailed
analysis of the intrusion detection benchmark datasets has also been conducted, which shows the usage of deep learning models
for training. The authors partially discussed the IoT-related datasets but did not provide any information regarding the attacks
performed.
A recent review article by Hamid et al. [17], based on eleven different contributions, investigated several relevant IoT-based botnet
attacks and the characterization techniques used to detect and mitigate such attacks. The paper partially discussed the available IoT
datasets. However, it did not discuss the names and types of the IoT devices on which the attacks were performed,
or the algorithms used to detect the attacks. Finally, the works presented in [18] and [19] bring insights on the application of ML
and DL in IoT security, but do not focus on IoT vulnerabilities and layered attacks.
The review papers discussed so far consider only specific aspects of the datasets, i.e., they do not cover detailed requirements
about IoT-based environments. A solid foundation is required before proposing IDS for IoT. This paper offers a comparative study
of the available IoT datasets and the techniques used to detect attacks, and highlights the need for an up-to-date and robust
IDS for IoT environments. Table 1 compares the state-of-the-art surveys concerning IoT datasets, features, models, and accuracy
achieved.

1.2. Contributions

This paper helps researchers identify the current IoT datasets for intrusion detection and their characteristics. The following are
the main contributions of this paper:

• We analyze some of the current state-of-the-art surveys providing information regarding which survey covers IoT dataset
information fully or partially;
• We investigate different security attacks in the IoT environment by highlighting the taxonomy of each attack. Security aspects
(e.g., confidentiality, integrity, availability, the impact of the threats, and privacy) are considered. Details of different attacks
on protocols are discussed in-depth alongside the exploited vulnerabilities from which IoT suffers;
• Several well-known publicly available IoT datasets are analyzed and compared by identifying fundamental properties, i.e.,
domain, availability, kind of traffic, data format, labeled/non-labeled, duration, network type, country, number of devices,
number of samples for observation and evaluation purposes;
• An extensive discussion on future directions is conducted throughout the paper and summarized in the final section. The
motivation for this initiative relies on fostering the development of new initiatives to benefit the scientific and industrial
communities. In this sense, gaps in the current state-of-the-art are presented covering different IoT security aspects.


Fig. 2. Taxonomy of attacks on IoT performed at different layers.

Overall, this paper is intended to support researchers interested in proposing or developing a robust IoT-based intrusion detection
system and to contribute to this emerging research area. The rest of the paper is organized as follows: In Section 2, well-known
attacks at different layers are discussed with an in-depth attack taxonomy; attacks on protocols and exploited vulnerabilities
are also highlighted. In Section 3, various available IoT-related datasets are presented with their properties in detail. We discuss the
techniques used with the datasets to detect the attacks in Section 4. In Section 5, the list of potential features used by the datasets
is provided in detail. Finally, a conclusion of the work with future research directions is presented in Section 6.

2. Security attacks in IoT

In the past decade, IoT devices have been facing different kinds of attacks, making users conscious of using these devices more
carefully. The interest of the attackers is either information gathering, data theft or denial of service to the legitimate users. The
main objective of this section is to determine actual attacks and examine respective targets by dividing the attacks into two broad
categories, active and passive attacks. We further highlight different layers of network models that are affected by these attacks. We
also discuss different protocol attacks and exploited vulnerabilities related to the IoT environment. Fig. 2 illustrates our proposed
taxonomy of attacks on IoT.

2.1. Active attacks

In this type of attack, the intruder breaks into the system and alters the corresponding information. The intruder attempts to control
resources by introducing malicious code, disrupts normal operations by transmitting bulk information to saturate the system, or
disseminates the information [20]. For example, a Denial of Service (DoS) attack can cause data packets to be dropped and
resources to be consumed [21]. The active attacks are described layer by layer in the following sections.

2.1.1. Physical layer attacks


This layer is also called the sensing layer; here, various channels are used for communication among the devices. The
physical attacks range from unauthorized access to the device to the removal or blocking of the connections of legitimate users. An
attacker can also cause physical damage to the device by preventing the system from shutting down, which ultimately leads to overheating [22].


2.1.2. MAC/network/transport layer attacks


As MAC addresses are coded into network interface cards, the attacker launches a spoofing attack by changing its own address to a
legitimate user's MAC address. Other threats like Man-in-the-Middle (MITM) attacks can prevent specific nodes on a
network from working [23].
At the network layer, a large number of packets is injected, congesting the network and depleting
resources throughout it. Also, most attacks are performed on the IP protocol because it efficiently delivers packets
from one point to another, thus providing easy access to the services at this layer.

2.1.3. Application layer attacks


This layer supports several protocols with different functionality; for instance, SMTP is used for mail transfer, while HTTP enables
web services. The use of IoT devices in different domains, such as smart homes and offices, exposes application layer protocols and
provides an easy and accessible ground for cyber attackers.

2.2. Passive attacks

This type of attack usually gathers information without consent from the user and further exploits the information by decrypting
it. Passive attacks are further categorized into the following attacks:

2.2.1. Software
Usually, malicious software is used to infect the system to steal data or even to launch DoS attacks.
This software can take such forms as viruses, worms, and spyware. The SCADA systems are the most vulnerable to attacks due to
their connectivity to the IoT network. Due to the exponential growth in the IoT technology sector, conventional SCADA systems are
integrated with cooperating business networks and IT systems.

2.2.2. Cryptanalysis
Several lightweight cryptographic algorithms have been developed to meet the increasing demand for IoT devices that use
extremely low resources without human intervention. A vast amount of raw information is recorded daily from IoT-enabled
devices. As the most commonly used algorithms use significantly fewer computing resources, IoT devices using lightweight
cryptographic algorithms are more vulnerable to cryptanalysis techniques such as differential cryptanalysis or slide attacks. Securing
IoT devices, which are connected at different layers, against these attacks has become crucial. Some techniques or
strategies must be used to counter the attacks when working with the IoT model.

2.2.3. Volume based and bandwidth/resource depletion attack


Here, the attacker consumes all the bandwidth by sending bots over the network, preventing legitimate users from accessing
allocated services. The attacker also exhausts resources by forcing their allocation. Different layers of network
architecture are targeted. This attack is severe, causing systems to crash due to factors such as memory and CPU usage overflow.

2.2.4. Web based attack


The attacker forwards numerous requests to a particular website, making it unavailable for many hours. Also,
eavesdropping is performed by intruders to capture personal data of the users.

2.3. Attacks on different types of protocols

The protocols are designed to provide effective communication from source to destination. Moreover, these protocols usually
work in a simple yet collaborative environment that is not concerned with integrity and privacy mechanisms. For these reasons, the
protocols working at different layers are more vulnerable to attacks [25]. In Fig. 3, a fishbone diagram, commonly used in
security and safety analysis, categorizes individual attacks on protocols. Such diagrams provide a thoughtful analysis of
the possible contributing factors (or sub-causes) of a particular problem [24]. Different protocols and threats are considered.

2.3.1. Attacks on internet protocol (IP)


The IP protocol works at the network layer, whose functionality is to route the packets from the source to the destination
following the router rules. Here every next hop is determined by the router, which ensures that packets reach their destinations.
Several attacks can occur against the IP protocol.

2.3.2. Attacks on TCP


The function of this protocol is to provide guaranteed delivery of packets over a network. The TCP protocol usually
follows a mechanism of transmitting a packet [26]. However, an attacker can intrude into the system by initiating many half-open
connections [27] (a TCP-SYN flood attack), by bypassing firewalls that do not block FIN packets [28], or by session hijacking [29].
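
The half-open connection pattern behind a TCP-SYN flood can be illustrated with a minimal sketch. The simplified packet records, the helper names `half_open_counts` and `suspected_flooders`, and the threshold value are hypothetical and chosen only for illustration; they are not taken from the surveyed works, and a practical detector would also need time windows and connection timeouts.

```python
from collections import defaultdict


def half_open_counts(packets):
    """Count, per source, handshakes that were started (SYN) but never completed (ACK).

    `packets` is an iterable of (src, src_port, dst, dst_port, flag) tuples,
    a hypothetical simplified record format where flag is "SYN" or "ACK".
    """
    pending = defaultdict(set)  # src -> set of connections still awaiting the final ACK
    for src, src_port, dst, dst_port, flag in packets:
        conn = (src_port, dst, dst_port)
        if flag == "SYN":
            pending[src].add(conn)
        elif flag == "ACK":
            pending[src].discard(conn)  # handshake completed by the client
    return {src: len(conns) for src, conns in pending.items()}


def suspected_flooders(packets, threshold=100):
    """Return sources whose number of half-open connections exceeds `threshold`.

    The default threshold is an arbitrary illustrative value.
    """
    counts = half_open_counts(packets)
    return sorted(src for src, n in counts.items() if n > threshold)


if __name__ == "__main__":
    # One source opens 150 connections and never acknowledges any of them.
    flood = [("10.0.0.9", 40000 + i, "10.0.0.1", 80, "SYN") for i in range(150)]
    # A well-behaved client completes its handshake.
    normal = [
        ("10.0.0.5", 5000, "10.0.0.1", 80, "SYN"),
        ("10.0.0.5", 5000, "10.0.0.1", 80, "ACK"),
    ]
    print(suspected_flooders(flood + normal))
```

Counting unanswered SYN packets per source is only one possible indicator; it would miss floods that use spoofed, constantly changing source addresses.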


Fig. 3. A fishbone model for categorizing attacks concerning protocols [24].

2.3.3. Attacks on UDP


The User Datagram Protocol (UDP) is an interface to IP whose function is to transmit a datagram from one application to
another with low overhead. However, due to its nature, UDP is always vulnerable to different attacks [30]; other attacks include
port loopback and the fraggle attack [31].

2.3.4. Attacks on ICMP


This protocol sends a one-way message used by the routers in the network, mainly information regarding errors related to data
processing by the destination hosts. During this process, many attacks (e.g., address sweep, smurf attack, and OS fingerprinting
attack) are based on sending malformed ICMP packets to the destination [32].
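
As an illustration of how one of these attacks might be recognized, the sketch below flags a possible ICMP address sweep by counting how many distinct hosts each source sends echo requests to. The record format, the function names and the `min_hosts` threshold are assumptions made for this example only; the only protocol fact relied on is that ICMP type 8 denotes an echo request.

```python
from collections import defaultdict

ICMP_ECHO_REQUEST = 8  # ICMP message type for an echo request ("ping")


def echo_targets_per_source(packets):
    """Map each source address to the set of hosts it sent echo requests to.

    `packets` is an iterable of (src, dst, icmp_type) tuples, a hypothetical
    simplified record format.
    """
    targets = defaultdict(set)
    for src, dst, icmp_type in packets:
        if icmp_type == ICMP_ECHO_REQUEST:
            targets[src].add(dst)
    return targets


def suspected_sweepers(packets, min_hosts=20):
    """Return sources that probed at least `min_hosts` distinct hosts.

    The default threshold is an arbitrary illustrative value.
    """
    targets = echo_targets_per_source(packets)
    return sorted(src for src, hosts in targets.items() if len(hosts) >= min_hosts)


if __name__ == "__main__":
    # One host pings 30 different addresses; another pings its gateway repeatedly.
    sweep = [("192.168.1.50", f"192.168.1.{i}", 8) for i in range(1, 31)]
    normal = [("192.168.1.7", "192.168.1.1", 8)] * 5
    print(suspected_sweepers(sweep + normal))
```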

2.4. Type of exploited vulnerability

A vulnerability is a weak area where an attacker finds an easy way to access the system. The increasing variety of IoT devices
led to more vulnerabilities and cyber attacks. Common vulnerabilities related to IoT devices are weak passwords [33], open
ports [34], insecure application interfaces [35], exposed network interface [36], exposed software [34], device memory, insecure
cloud interface [36], password cracking, underlying infrastructure security, and virtualization threats [37,38]. Table 3 highlights
the types of vulnerability and attack types.
Moreover, IoT-based networks are different from traditional networks in several ways, as shown in Table 2. In fact, there is an
open research problem regarding how the internal communication pattern can differ and how Machine Learning (ML) can be used
to underpin such patterns. For example, IoT profiling [6,39] is a current research problem that is focused on the communication and
exchange of IoT devices, involving how Machine Learning (e.g., Deep Neural Networks and Random Forest) [40,41] and anomaly
detection (e.g., Isolation Forest) [42] methods can be made more accurate through interpretability efforts [43]. Finally, the granular
analysis of the intersection between features of IoT-based networks, IoT profiling and behavioral aspects in IoT-based attacks, and
how ML techniques underpin single and multiple threats, is included in the future directions of this research.
Different machine learning methods can play a critical role in IoT security by providing enhanced threat detection, anomaly
detection, and predictive analytics capabilities. The following are some of the important ML techniques for detecting different
threats in an IoT environment:

• Anomaly detection: Anomaly detection is a technique that involves identifying abnormal or unexpected behavior in an IoT-based
network. In IoT, this can include monitoring sensors and devices for unusual behavior that could indicate a security breach or
threat. Machine learning algorithms can be trained to recognize these anomalies and alert security personnel to investigate
further. Nearest-neighbor, cluster-based, classification-based, or statistical anomaly detection algorithms, such as k-NN,
Local Outlier Factor (LOF), and Connectivity-based Outlier Factor (COF), can be trained on the normal behavior of
the devices, such as the time interval between connections to a sensor, the connection time for sending ARP requests, or other network
protocol behavior. This shows how vital machine learning can be in an IoT environment, especially in preventing
an attack. Furthermore, installing a lightweight anomaly detection algorithm on an IoT-based network can protect the organization from
attacks in a timely manner;


Table 2
Comparison of traditional and IoT-based networks [44–46].
Aspect | Traditional networks | IoT-based networks
Data generation | Interaction of multiple systems | Automated procedures triggered by sensors and actuators
Topology and Services | Centralized | Decentralized
Security | Robust Security Mechanism | Lightweight Security Mechanisms
Adaptability | Stable and Rigid Topologies | Highly flexible and adaptable
Computation | Processing Power Available at end nodes | Processing power provided by remote systems
Communication | Supports complex protocols | Prioritizes Lightweight Protocols

Table 3
Common type of attacks along with vulnerability.
Type of attack
Vulnerabilities (columns, in order): Weak passwords, Open port, Data confidentiality, Insecure application interfaces, Missing privileges, Exposed network interfaces, Exposed software, Node tampering, Insecure cloud interface, Device memory, Infrastructure security, Virtualization threat, Password cracking

Aidra [47] ✓
Backdoor [48] ✓ ✓
Bashlite [49] ✓
Botnet [49] ✓
Brute Force [50] ✓ ✓ ✓ ✓ ✓
DDoS [51] ✓ ✓
Doflo [52] ✓
DoS [47] ✓ ✓ ✓
Data Theft [52] ✓ ✓ ✓ ✓
Hajime [47] ✓
Mirai [49] ✓ ✓ ✓
Info. Gather [52] ✓ ✓ ✓ ✓ ✓
MITM [53] ✓ ✓ ✓
Reconnaissance [51] ✓ ✓
Ransomware [48] ✓ ✓ ✓ ✓ ✓ ✓
SlowITe [50] ✓
Smurf [54]
Spoofing [23] ✓ ✓ ✓
Scanning [52] ✓
Sniffing [55] ✓ ✓ ✓
Tsunami [47] ✓
Trojan [56] ✓ ✓
Torii [47] ✓
Virus [57] ✓ ✓ ✓
Web [48] ✓ ✓ ✓ ✓ ✓ ✓ ✓
Worm [58] ✓ ✓ ✓

• Predictive analytics: Predictive analytics uses historical data to identify patterns and trends that can help predict future
behavior. In IoT security, predictive analytics can be used to anticipate potential threats and take preventive measures before
they occur;
• Deep learning: In IoT security, deep learning can be used to analyze large amounts of data generated by sensors and devices
in real-time, detecting security threats and anomalies more quickly and accurately.
• Reinforcement learning: In IoT security, reinforcement learning can help devices learn to identify and respond to security
threats in real-time.

Machine learning is vital in IoT security by providing enhanced threat detection and prevention capabilities. Using the mentioned
methods, organizations can analyze vast amounts of data generated by IoT devices and identify potential risks more quickly and
accurately.
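
To make the nearest-neighbor flavor of anomaly detection mentioned above more concrete, the following is a minimal sketch that scores an observation by its mean distance to its k nearest "normal" training points. The two features (seconds between connections and ARP requests per minute), the sample values, and the threshold are hypothetical; in practice the features would normally be scaled first, and a library implementation of k-NN or LOF would be preferred over this hand-written version.

```python
import math


def knn_anomaly_scores(train, queries, k=3):
    """Score each query by its mean Euclidean distance to its k nearest training points.

    `train` holds feature vectors observed during normal device behavior;
    larger scores mean the query is farther from anything seen as normal.
    """
    scores = []
    for q in queries:
        dists = sorted(math.dist(q, t) for t in train)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_anomalies(train, queries, k=3, threshold=5.0):
    """Return True for each query whose score exceeds `threshold` (an illustrative value)."""
    return [s > threshold for s in knn_anomaly_scores(train, queries, k)]


if __name__ == "__main__":
    # Hypothetical normal behavior: (seconds between connections, ARP requests per minute)
    normal = [(60.0, 2.0), (61.0, 2.0), (59.0, 3.0), (60.5, 2.0), (60.0, 3.0)]
    # One observation close to normal, one with rapid reconnects and heavy ARP traffic
    observed = [(60.2, 2.0), (1.0, 40.0)]
    print(flag_anomalies(normal, observed))
```

The design choice here is deliberately simple: only benign behavior is needed for training, which fits the scenario described above where a device's normal timing and protocol behavior are profiled in advance.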

3. IoT attack datasets

In the past few years, many datasets in the IoT security domain have been assembled, each with pros and cons. Due to the
exponential adoption of IoT devices, researchers are increasingly working towards IoT-related datasets because of the upsurge in
unknown vulnerabilities and threats. IoT devices’ effectiveness, security, and applicability against normal or abnormal behavior are
tested by capturing the data in a simulated or realistic environment. Therefore, dataset quality plays a vital role in providing an
effective model for detecting intrusion in a real environment. This section discusses publicly available datasets proposed for IDS
in IoT environments. Fig. 4 presents the year-wise distribution of IoT datasets proposed, which shows a sharp increase in the year
2020. In addition, Table 4 depicts different datasets with technical details and properties which reflect general information about
them.

3.1. Search criteria

In order to find IoT datasets available to support security solutions, three main steps were considered, as illustrated in Fig. 5.
With the intention of identifying recent datasets published in the past 5 years, we did a search through Google Scholar using the
following search strings: “IoT security + Dataset”, “IoT security + Data”, “IoT dataset + Machine Learning”, “IoT cybersecurity +
Dataset”, “IoT IDS + Dataset”, and “IoT DDoS + Dataset”.

Fig. 4. Number of IoT datasets proposed year-wise.

Fig. 5. Search criteria adopted in this research.

In order to expand our search for datasets available, we also focussed on technical papers that were leveraged by the use of IoT
datasets. In this sense, we searched for technical contributions in the following areas: IoT IDS, IoT DDoS Detection, IoT Intrusion
Prevention, IoT Spoofing, IoT Reconnaissance, IoT Flooding, IoT Mirai, IoT Privacy.
Finally, we also searched for new datasets of research groups that have published widely used cybersecurity datasets in the
past. Our main focus was on searching datasets in the following areas: Malware (e.g., Android and Adware), DNS, Dark Web, IDS
(e.g., DDoS, IPS, and DoS).

3.2. Features

Domain: Before capturing data, the target domain needs to be selected. As IoT devices are most common in the home
environment and provide an easy testbed for cyberattacks, most recorded data focuses on this domain.
Availability: The viability and quality of the proposed datasets for intrusion detection can only be checked if it is made publicly
available for research. The property is marked if it is publicly available or available upon request.
Kind of Traffic: This is an important property, as it helps to evaluate the system on an array of IoT devices and attacks.
‘‘Simulated’’ means data captured within a testbed, and ‘‘real’’ means network traffic captured in the real-world scenario with real
devices.
Data Format: The format in which data is captured plays an important role, as it determines how network traffic can be analyzed.
The data can be recorded in PCAP format, which includes payload information, or in feature format, which contains processed data.
Another type is the log file, which collects system-generated records such as operating system or server events.
Evaluation: The label field defines whether the instance collected is normal or malicious.
Total Duration: The time span over which the data is captured is defined.
Network Type: This property is used for evaluation to check adaptability to a specific environment.
Country: This field defines the name of the countries where the data is captured.
Number of Devices: To evaluate the viability of the intrusion detection system, different IoT devices are used in the testbed.
This field shows the number of IoT devices included in each dataset.


Table 4
Overview of IoT based Dataset properties.
Dataset | Domain | Availability | Kind of traffic | Data format: pcap | Data format: feature | Data format: log | Evaluation: labeled | Total duration | Network type | Country | #Devices | #Samples: benign | #Samples: malicious | Reproducibility
N-BaIoT (2018) [59] | Home | Yes | Real | n.s. | Yes | No | Yes | n.s. | Small | Negev, Israel | 9 | 502595 | 6506674 | PR
BoT-IoT (2019) [52] | Home | Yes | Simulated | Yes | Yes | No | Yes | n.s. | Small | Canberra, Australia | 5 | 9543 | 73360900 | R
Kitsune (2019) [51] | Home | Yes | Simulated | Yes | Yes | No | Yes | n.s. | Small | San Diego, USA | 9 | 700000 | 26472754 | R
IoTNIDS (2019) [60] | Home | Yes | Real | Yes | No | No | Yes | 1 week | Small | n.s. | 2 | 1756276 | 1219254 | R
MedBIoT (2020) [49] | Home | Yes | Real/simulated | Yes | Yes | No | Yes | 6 days | Medium | Estonia, Europe | 7 | 12540478 | 5305089 | R
IoT-23 (2020) [61] | Home | Yes | Real | Yes | No | Yes | Yes | 1 year | Small | Czech Republic, Prague | 3 | 30858735 | 710098146 | R
IoTIDS (2020) [23] | Home | Yes | Real | No | Yes | No | Yes | 1 week | Small | Oshawa, Canada | 2 | 40073 | 585710 | PR
MQTT (2020) [50] | Home/office | Yes | Simulated | Yes | Yes | No | Yes | 1 week | Small | Genoa, Italy | 8 | 11915716 | 165463 | R
MQTT-IoT-IDS (2020) [62] | Home | Yes | Simulated | Yes | Yes | No | Yes | n.s. | Small | Dundee, Scotland | 1 | 1056230 | 352355 | R
TON_IoT (2020) [48] | Home/cities/industry | Yes | Simulated | Yes | Yes | Yes | Yes | n.s. | Small | Canberra, Australia | 10 | 245000 | 124619 | R
IoTHIDS (2018) [47] | Home | On request | Simulated | No | No | No | No | n.s. | Small | n.s. | n.s. | n.s. | n.s. | HR
IoT-SH (2019) [53] | Home | On request | Simulated | No | No | No | No | 5 weeks | Small | Cardiff, Wales | 8 | 170785 | 50000 | HR
Edge-IIoTSet (2022) [63] | Industry | Yes | Real | Yes | Yes | No | Yes | n.a. | Medium | Guelma, Algeria | 10 | 11223940 | 9728708 | R


Number of Samples (Benign/Malicious): This field describes the number of samples captured for each type, which helps choose
algorithms according to the available data.
Reproducibility: describes how reproducible the research is based on several aspects. In terms of reproducibility, datasets can
be categorized as Reproducible (R), Partially Reproducible (PR), and Hardly Reproducible (HR). This classification is based on how
detailed the data collection is in each contribution as well as how clear it is for other researchers to generate a similar dataset
following the described approach and is evaluated as follows:

$$R_s = A + T_r + P_c + N_t \tag{1}$$

where $R_s$ is the reproducibility score, with $0 \le R_s \le 1$; $A$ represents the availability of the dataset (0.25); $T_r$ is the
type of traffic used (0.25 for simulated, 0.12 for real); and $P_c$ and $N_t$ represent, respectively, the availability of .pcap files (0.25)
and the network type (0.25 for small, 0.12 for medium). In this context, all four components are assigned the same maximum weight since
they are complementary to each other and fundamental to validating proposed solutions in the IoT security ecosystem. For example,
in order to be used, a dataset must be available to download. Also, if the traffic is simulated, it is simpler to reproduce in any
system. Furthermore, the availability of pcap files enables researchers to access the real traffic and engineer their own features. The
network type has direct implications for how the production of a dataset can be reproduced. Finally, these weights can also be configured
in a different way to evaluate reproducibility from different perspectives [64]. Each dataset is then classified as R, PR, or HR as
follows:
$$R_t = \begin{cases} R, & R_s \ge 0.75 \\ PR, & 0.5 < R_s < 0.75 \\ HR, & \text{otherwise} \end{cases} \tag{2}$$
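
As a minimal sketch of how Eqs. (1) and (2) fit together, the following Python functions compute the score and the resulting class from the four dataset properties. The function names, the argument encoding, and the choice to treat an "on request" dataset or an unspecified (n.s.) pcap entry as unavailable are assumptions made for illustration; the weights are the ones printed in the text.

```python
# Illustrative sketch only: names and input encoding are assumptions.
TRAFFIC_WEIGHT = {"simulated": 0.25, "real": 0.12}
NETWORK_WEIGHT = {"small": 0.25, "medium": 0.12}


def reproducibility_score(available, traffic, has_pcap, network):
    """Eq. (1): R_s = A + T_r + P_c + N_t, with the weights printed in the text."""
    a = 0.25 if available else 0.0
    t_r = TRAFFIC_WEIGHT.get(traffic, 0.0)
    p_c = 0.25 if has_pcap else 0.0
    n_t = NETWORK_WEIGHT.get(network, 0.0)
    return round(a + t_r + p_c + n_t, 2)


def reproducibility_class(score):
    """Eq. (2): R for scores of at least 0.75, PR strictly between 0.5 and 0.75, else HR."""
    if score >= 0.75:
        return "R"
    if score > 0.5:
        return "PR"
    return "HR"


# Property values read from three rows of Table 4.
n_baiot = reproducibility_score(True, "real", False, "small")
iotnids = reproducibility_score(True, "real", True, "small")
iothids = reproducibility_score(False, "simulated", False, "small")

for name, score in [("N-BaIoT", n_baiot), ("IoTNIDS", iotnids), ("IoTHIDS", iothids)]:
    print(name, score, reproducibility_class(score))
```

Under these assumptions the three rows come out as PR, R, and HR, which agrees with the Reproducibility column of Table 4 for those datasets.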

3.3. Datasets

N-BaIoT (2018) [59]: A network-based dataset proposed for detecting botnet attacks in the IoT environment.
The authors used two widely known botnets, Mirai and BASHLITE, to attack nine commercial IoT devices. Different
statistical features were extracted over several temporal windows and then used by a deep learning autoencoder for attack detection.
IoTHIDS (2018) [47]: A host-based IoT dataset which captures data from real IoT devices. The experiment was conducted in a
laboratory and consisted of three devices infected by the Mirai, Hajime, Aidra, BASHLITE, Doflo, Tsunami, and Wroba malware botnets.
IoT-SH (2019) [53]: This dataset was proposed using 8 different smart home devices, on which 12 different attacks grouped
into 4 main categories were performed. The authors followed a three-layer IDS to detect the attacks and used various
combinations of rule-based and machine learning approaches to classify the attacks.
BoT-IoT (2019) [52]: This dataset represents realistic traffic and contains heterogeneous
network profiles. DDoS, DoS, data theft, and scan attacks are performed on five devices. A set of new features was extracted
using the Correlation Coefficient and Joint Entropy techniques. The features are then fed into various machine learning and deep learning
methods, and the attack detection accuracy is reported.

Fig. 6. Various machine learning and deep learning approaches used with IoT datasets.

Kitsune (2019) [51]: An IoT dataset composed of four different categories of attacks performed on nine IoT devices. Among them,
one IoT device (i.e., security camera) was infected by a real Mirai botnet sample. The authors presented Kitsune as a plug-and-play
network intrusion detection system (NIDS) to detect normal and malicious samples.
IoTNIDS (2019) [60]: A dataset collected in a real environment recorded from two IoT devices (speaker and camera). The Mirai,
MITM, DoS, and scanning attacks were performed on these devices.
MedBIoT (2020) [49]: A novel medium-sized IoT network architecture dataset is provided with labeled behavior. The dataset
comprises seven IoT devices, some of which are real, and others are emulated. The authors used 100 statistical features passed into
different machine learning techniques to detect the attacks.
IoT-23 (2020) [65]: A botnet dataset captured over 1 year that contains real-environment captures of benign and malicious
samples. This dataset was created as part of the Aposemat project, which focuses on researching attacks on IoT. During the period,
the authors used 23 different botnet attacks on the devices.
IoTIDS (2020) [23]: A dataset proposed with a set of IoT-related flow-based features, which were selected using the Correlation
Coefficient technique and ranked with the Shapiro–Wilk algorithm. For the study, the authors performed four different attacks on
two different devices (speaker and camera) and recorded the data. They used SVM, G-NB, LDA, and LR for attack detection and
classification.
MQTT (2020) [50]: The authors' objective is to provide a dataset that uses a protocol dedicated to IoT network scenarios.
Eight IoT devices were connected to the MQTT broker to represent a real scenario. A set of 33 different features was explored
and fed into various machine learning algorithms to report the accuracy.
MQTT-IoT-IDS (2020) [62]: A dataset based on MQTT, a lightweight protocol used in IoT networks. The authors used a camera
feed, 12 MQTT sensors, and a broker in the network. Their dataset consists of five scenarios, which include normal and attack
data. The authors used packet-based, uni-flow, and bi-flow features as input to six different machine learning algorithms to check the
system’s effectiveness.
TON-IoT (2020) [48]: A telemetry-based, data-driven IoT/IIoT dataset that is heterogeneous in nature. The captured data
consists of both normal and attack samples with ground truth. The dataset, which represents realistic traffic, also includes
attack sub-categories. Apart from seven IoT devices, the dataset includes data recorded from operating system logs and network
traffic. The authors apply various machine learning and deep learning algorithms and report the accuracy achieved.
Edge-IIoTSet (2022) [63]: A realistic cybersecurity dataset of IoT and IIoT applications to enable the development of intrusion
detection systems in centralized and distributed applications. The authors present an in-depth description of the testbed used, as well
as the dataset generation framework and considerations on centralized and federated learning. The dataset also aims to address the
limitations of existing datasets.

4. Methodologies to detect IoT attacks

There is always a need to provide a robust environment to secure the IoT devices connected over a network. Various detection
approaches like tree-based, classical supervised, rule-based and deep learning methodologies have been proposed over the years
to detect these attacks. These algorithms provide promising solutions to the issues that arise [66]. Furthermore, different ML and
DL techniques have been applied to IoT datasets. Fig. 6 depicts techniques categorized into tree-based, classical supervised, deep
learning, and rule-based models. We have also provided information in Table 5 showing the number of features used, the type of
features extracted, techniques used for feature selection and algorithms used to detect the attacks.


Table 5
Performance comparison of IoT datasets with features type, selection technique and methodologies.
Datasets | # of Features | Feature type | Feature Selection | Algorithms | k-fold | Accuracy | Precision | Recall | FAR | FPR | F1-Score
BoT-IoT | 10 | Flow | corr coeff, entropy | SVM, RNN, LSTM | 5 | 0.88 (SVM) | 1 | 0.88 | – | 0 | –
Kitsune | 115 | Statistics | – | IF, GMM | – | – | – | – | – | – | –
N-BaIoT | 115 | Statistics | – | IF, LOF, SVM, AE | 4 | – | – | – | – | 0.007 ± 0.01 (AE) | –
Med-BIoT | 100 | Statistics | – | k-NN, DT, RF | 10 | 0.97 (RF) | 0.98 | 0.97 | – | – | 0.96
IoTID | 83 | Flow | corr coeff | SVM, LR, GNB, LDA, RF, DT | 3, 5, 10 | 88 (DT) | 88 | 88 | – | – | 88
IoT-SH | 10 | Packet | corr coeff | SVM, MLP, NB, RF, OneR, ZeroR, LR, J48, BN | 10 | – | 0.99 (J48) | 0.99 | – | – | 0.99
TON-IoT | 44 | Flow | – | LR, LDA, k-NN, CART, RF, NB, SVM, LSTM | 4 | 0.77 (CART) | 0.77 | 0.77 | – | – | 0.75
MQTT | 33 | Packet | – | NN, GB, NB, DT, MLP, RF | – | 0.91 (RF) | – | – | – | – | 0.91
MQTT-IoT-IDS | 43 | Packet & flow | – | LR, k-NN, SVM, RF, NB, DT | 5 | 88.55 (DT) | 88.55 | 88.55 | – | – | 88.54
Edge-IIoTset | 63 | Packet & flow | Feature Importance | DT, RF, SVM, KNN, NN | – | 0.9467 (NN) | 0.8019 | 0.7699 | – | – | 0.7733
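
The metric columns of Table 5 follow the usual confusion-matrix definitions. As a hedged illustration, the Python sketch below derives accuracy, precision, recall, FPR, and F1-score from the four confusion-matrix counts. The function name and the toy counts are assumptions, not values from any surveyed paper, and FAR is left out because the surveyed works do not share a single definition of it here.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn count attack samples that were detected/missed;
    tn/fp count benign samples that were passed/flagged.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    fpr = fp / (fp + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "fpr": fpr,
        "f1": f1,
    }


# Toy counts chosen for easy arithmetic (hypothetical, not from any dataset).
metrics = detection_metrics(tp=8, fp=2, tn=8, fn=2)
print(metrics)
```

Note that the sketch does not guard against empty denominators; a production version would need to handle a classifier that never predicts the positive class.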

4.1. Tree-based machine learning (ML) models

Tree-based ML models, which consist of nodes and branches, are among the most commonly used techniques. Generally, a tree is built
recursively: at each node, the training samples are split on the feature that separates the data most effectively, and an
information criterion governs the selection of that feature. To predict the label of a given test/query sample, the model compares the
sample's feature values starting at the root node and traverses branches and nodes until it reaches a terminal node.
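
The node-splitting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses Gini impurity as the information criterion (one common choice, not necessarily the one used in the surveyed works), and the function names and the toy flow records are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must send samples to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Toy flows: (packets per second, mean packet size in bytes).
rows = [(5, 500), (7, 480), (6, 520), (900, 60), (1200, 64), (1000, 58)]
labels = ["benign", "benign", "benign", "attack", "attack", "attack"]

feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

A full tree would apply `best_split` recursively to each child until the nodes are pure or a depth limit is reached.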
Decision Tree (DT) [67] is a technique widely used in detecting, classifying, and mitigating IoT attacks. Regarding Intrusion
Detection Systems (IDS) for IoT, some contributions focus on a combination of DT with rule-based approaches [67] or hybrid
models [68]. DT has been used in both simple and challenging problems since it is efficient and easy to train and interpret. Similarly,
Random Forest (RF) [69] has presented outstanding results in several initiatives. In both hybrid [70] and direct applications [71], RF
has been adopted to classify and detect IoT threats in several systems. RF is an ensemble of multiple DTs in which each
tree is built from randomly selected data points.
Isolation Forest [72] is an unsupervised outlier detection algorithm based on the isolation of outliers and is useful in many cases,
such as intrusion detection [73] and anomaly detection [74]. The random partitioning of features performed by this algorithm
results in shorter tree paths for outliers than for normal data points. Finally, Classification and Regression Trees
(CART) [75] is a non-parametric, decision tree-based algorithm that constructs a binary tree structure from the training set. This
technique has been used in several IoT-based security applications, e.g., IDS in smart grids [68] and anomaly detection in SDN-based
applications [76].
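
To illustrate why random partitioning yields shorter paths for outliers, the Python sketch below follows one point through repeated random axis-aligned splits and averages the depth at which it becomes isolated. It is a simplification rather than the full Isolation Forest algorithm: it does not build or store trees, does not subsample, and does not compute the normalized anomaly score. All names and the toy data are assumptions.

```python
import random


def path_length(point, data, rng, depth=0, max_depth=10):
    """Depth at which `point` is separated from `data` by random splits."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    feature = rng.randrange(len(point))
    low = min(row[feature] for row in data)
    high = max(row[feature] for row in data)
    if low == high:
        return depth  # no split possible on this feature
    split = rng.uniform(low, high)
    # Keep only the samples that fall on the same side of the split as `point`.
    same_side = [row for row in data if (row[feature] < split) == (point[feature] < split)]
    return path_length(point, same_side, rng, depth + 1, max_depth)


def mean_path_length(point, data, n_trees=100, seed=0):
    """Average isolation depth of `point` over `n_trees` random partitionings."""
    rng = random.Random(seed)
    return sum(path_length(point, data, rng) for _ in range(n_trees)) / n_trees


# Toy data: a tight cluster of 20 "normal" points plus one distant outlier.
normal = [(10 + i % 5, 100 + i // 5) for i in range(20)]
outlier = (200, 900)
data = normal + [outlier]

outlier_depth = mean_path_length(outlier, data)
normal_depth = mean_path_length(normal[0], data)
print(outlier_depth, normal_depth)
```

The outlier is usually isolated by the very first split, whereas a point inside the cluster needs several more splits, so its average depth is larger.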

4.2. Classical supervised machine learning models

Supervised learning is the most common branch of ML and is widely used for intrusion detection in IoT. Here, we describe some of the most commonly
used methodologies in IoT applications. Among them, SVM, k-NN, and NB are most often selected for reporting results.
Support Vector Machine (SVM) [77] has been used in IoT privacy preservation [78] and in secure smart home applications [79].
This technique finds a hyperplane in an N-dimensional feature space that optimally separates the classes, and it can be resource
demanding in some cases. Similarly, Naive Bayes (NB) [80] has been successfully employed in several IoT security solutions since
it does not require extensive datasets in order to produce good results. Examples of IoT applications based on NB are IoT-Fog-Cloud
anomaly detection [81] and attack detection in wireless sensor networks [82]. This scalable technique works with the a priori
assumption that the features are conditionally independent of each other given the class.
IoT security applications can be supported by Linear Discriminant Analysis (LDA) [83] in different cases, such as in improving
the performance of classifiers in intrusion detection [84] and IoT terminal recognition [85]. This method finds a new feature space
that maximizes the class separability while minimizing the within-class variability. In addition to classification, LDA is popularly used
as a dimensionality reduction technique. Logistic Regression (LR) [71] is another popular approach used for binary classification
problems, in which a non-linear sigmoid/logistic function governs the decision. It has been used in problems such as trust-based
misbehavior detection [41] and intrusion detection [86]. LR maps an input to a value between 0 and 1, and a threshold on that value
assigns the query sample to one of the two classes.
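
The thresholded sigmoid decision used by LR can be sketched as follows. The weights, bias, and feature values are hypothetical placeholders rather than a trained model, and the function names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lr_predict(features, weights, bias, threshold=0.5):
    """Return (probability, label); label is 1 when probability >= threshold."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(z)
    return probability, int(probability >= threshold)


# Hypothetical model over two scaled flow features.
weights, bias = [2.0, -1.0], -0.5

p_attack, label_attack = lr_predict([3.0, 0.5], weights, bias)
p_benign, label_benign = lr_predict([0.1, 2.0], weights, bias)
print(p_attack, label_attack, p_benign, label_benign)
```

Raising the threshold trades recall for a lower false positive rate, which is often the relevant design choice in an IDS.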
Multilayer Perceptron (MLP) [87] refers to a shallow variant of feed-forward neural networks with high performance in several
problems. It has been used in solutions related to, e.g., trustworthiness [88] and attack detection [89]. k-NN [90] is a non-parametric
supervised machine learning approach commonly used for multi-class classification problems, and it has presented remarkable performance
in Intrusion Detection Systems (IDS) for Wireless Sensor Networks (WSN) [91,92]. To be robust against noise and effective
on large datasets, k-NN predicts the output value for a query point from the distances to its k nearest neighbors.
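
A minimal k-NN sketch, assuming Euclidean distance and a simple majority vote (both are common defaults, not choices confirmed by the cited works), could look like this. The training points and labels are toy values.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_tuple, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy two-feature samples (hypothetical values).
train = [
    ((1, 1), "benign"), ((1, 2), "benign"), ((2, 1), "benign"),
    ((8, 8), "attack"), ((9, 8), "attack"), ((8, 9), "attack"),
]

near_benign = knn_predict(train, (2, 2))
near_attack = knn_predict(train, (7, 7))
print(near_benign, near_attack)
```

Because every prediction scans the whole training set, practical deployments usually add an index structure or approximate search when the dataset grows.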

4.3. Deep learning based approaches

Recently, with the availability of the required computing capability and huge datasets for model training, deep learning algorithms have been heavily
used across different applications because of their superior accuracy. This section discusses variants of deep learning algorithms
popularly used in the literature.
Convolutional Neural Networks (CNNs) [93] are an extension of the multilayer perceptron composed of an input layer, many
hidden layers, and an output layer. The hidden layers consist of convolution, non-linear, down-sampling (i.e., pooling), and fully
connected layers. This model has been successfully used in problems such as near real-time security [40] and malware analysis [94].
Furthermore, Recurrent Neural Networks (RNNs) [95] are a popular neural network architecture for modeling sequential data
due to their ability to remember inputs with the help of their internal memory. In IoT security, RNNs have been used for several
purposes, such as IoT IDS [96] and Networking Malware Threat Detection [97].
The Long Short-Term Memory (LSTM) [98,99] is an advanced variant of RNNs that helps in learning long-term data
dependencies. This is achieved with memory cells controlled by 3 gates: the input, output, and forget
gates. The gates control the flow of information, deciding what to keep and what to delete when predicting the output of the network. Finally, the Autoencoder
[100] is a specific type of feedforward neural network trained to reproduce its input at the output:
the input is first compressed into a low-dimensional form and then reconstructed at the output. This technique has been used to detect
specific attacks, such as impersonation attack detection [101] and anomalous communication detection [102] in IoT.
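
The gating mechanism of the LSTM can be made concrete with a single scalar cell step. The sketch below uses the standard LSTM update equations with one-dimensional inputs and states; the parameter names and values are hypothetical and chosen only to show how the forget gate keeps or erases the cell state.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step; `p` holds input (w), recurrent (u) and bias (b) terms."""
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate value
    c = f * c_prev + i * g  # keep part of the old state, add part of the candidate
    h = o * math.tanh(c)    # expose part of the new state as output
    return h, c


names = [f"{kind}_{gate}" for gate in "ifog" for kind in "wub"]
zeros = dict.fromkeys(names, 0.0)

# Forget gate saturated near 1, input gate near 0: the cell state is kept.
keep = {**zeros, "b_f": 20.0, "b_i": -20.0}
_, c_keep = lstm_step(1.0, 0.0, 0.7, keep)

# Forget gate saturated near 0: the previous cell state is erased.
erase = {**zeros, "b_f": -20.0, "b_i": -20.0}
_, c_erase = lstm_step(1.0, 0.0, 0.7, erase)
print(c_keep, c_erase)
```

In a trained network the gate parameters are learned, so the cell decides per time step how much history to retain.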

4.4. Rule based machine learning approaches

In this type of machine learning, specific rules are defined to detect malicious activity. Whenever data is processed, the
system checks it against the attack behavior defined in the rule book and then reports what it detects.
OneR [103] is a depth-1 decision tree, which is simple and often used in intrusion detection. This algorithm generates a
set of rules based on a single attribute: each value of that attribute forms a branch, and every branch is assigned its most frequent class. Similarly, ZeroR [104] is a
simple rule-based classifier that relies only on the target class and ignores all the predictor values. From the frequency
table it maintains, it always predicts the most frequent class. Therefore, this classifier works best when baseline performance needs to
be determined for other classifiers.
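
Both classifiers are small enough to sketch directly. The Python functions below are an illustrative implementation for categorical attributes; the attribute names and the toy records are hypothetical.

```python
from collections import Counter, defaultdict


def zero_r(labels):
    """ZeroR: ignore all predictors and return the most frequent class."""
    return Counter(labels).most_common(1)[0][0]


def one_r(rows, labels):
    """OneR: pick the single attribute whose value-to-class rule makes the fewest errors.

    `rows` is a list of dicts of categorical attributes.
    Returns (attribute, rule, training_errors).
    """
    best = None
    for attribute in rows[0]:
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attribute]][label] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attribute]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (attribute, rule, errors)
    return best


rows = [
    {"protocol": "tcp", "flag": "SYN"},
    {"protocol": "tcp", "flag": "SYN"},
    {"protocol": "tcp", "flag": "ACK"},
    {"protocol": "udp", "flag": "none"},
    {"protocol": "udp", "flag": "none"},
    {"protocol": "tcp", "flag": "ACK"},
]
labels = ["attack", "attack", "benign", "benign", "benign", "benign"]

baseline = zero_r(labels)
attribute, rule, errors = one_r(rows, labels)
print(baseline, attribute, rule, errors)
```

On this toy data ZeroR always answers with the majority class, while OneR selects the attribute that separates the classes without training errors.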

5. Potential features used in IoT datasets for attack detection

Over the years, various frameworks have been proposed for detecting network intrusions, but they prove shallow if a proper
set of features is not provided. Although researchers have proposed different techniques to select appropriate features to recognize
the type of attack, we still lack the tools to determine what makes a good set of features. Host-based IDS (HIDS) detect attacks on
individual hosts, while network-based IDS (NIDS) examine abnormal behavior of the network. The features are categorized into
different groups that are described next:

5.1. Connectivity features

These features carry all the information related to source and destination. They are further categorized according
to their usability in extracting information related to the attacks, as follows.
Basic: These features usually contain information regarding the source and destination of the flow, such as source and destination IP and protocol type. Also defined
as coarse-grained or network identifier features, they are usually required for any attack detection. Examples of IoT applications based
on these features are IoT device identification [105], intrusion detection [106], and malicious traffic detection [107].
Time: These features describe in detail the duration of connectivity, activeness, idleness, or direction of the two flows.
Bytes: These features record the number of bytes sent from source to destination and the total count of bytes sent in the forward
or backward direction. They give essential information regarding attacks.
Flag: These features give insights into how protocols like
TCP and MQTT are subject to numerous attacks. They provide information regarding the setting of flags in packets (either
forward or backward) and the connection state.
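
As a rough illustration of how the basic, time, bytes, and flag groups can be derived from captured packets, the following Python sketch aggregates one flow. The packet schema (`ts`, `src`, `dst`, `proto`, `size`, `flags`) and the feature names are assumptions made for this example and do not correspond to the output of any specific capture tool.

```python
def connectivity_features(packets):
    """Aggregate one flow; the first packet's sender defines the forward direction."""
    first = packets[0]
    forward = [p for p in packets if p["src"] == first["src"]]
    backward = [p for p in packets if p["src"] != first["src"]]
    times = [p["ts"] for p in packets]
    return {
        # Basic features
        "src_ip": first["src"],
        "dst_ip": first["dst"],
        "protocol": first["proto"],
        # Time feature
        "duration": max(times) - min(times),
        # Bytes features
        "fwd_bytes": sum(p["size"] for p in forward),
        "bwd_bytes": sum(p["size"] for p in backward),
        # Flag features
        "syn_count": sum("SYN" in p["flags"] for p in packets),
        "ack_count": sum("ACK" in p["flags"] for p in packets),
    }


# Toy TCP handshake (hypothetical addresses and sizes).
packets = [
    {"ts": 0.0, "src": "10.0.0.5", "dst": "10.0.0.1", "proto": "TCP", "size": 60, "flags": {"SYN"}},
    {"ts": 0.25, "src": "10.0.0.1", "dst": "10.0.0.5", "proto": "TCP", "size": 60, "flags": {"SYN", "ACK"}},
    {"ts": 0.5, "src": "10.0.0.5", "dst": "10.0.0.1", "proto": "TCP", "size": 52, "flags": {"ACK"}},
]

flow = connectivity_features(packets)
print(flow)
```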

5.2. Layered features

The growing number of cyber attacks on IoT-related devices has focused on the communication protocols used at different
layers. As protocols are the primary communication medium over the network, they are more vulnerable to attacks. To identify the
features, protocols at each layer need to be identified and appropriately analyzed. Examples of applications that use these features
are fingerprint-based [108], anomaly-based [109], and traffic-based [110] solutions. The layers on which the intrusion happens
include L4: Application Layer, L3: Transport Layer, L2: Network Layer, L1: Network Interface or Sensing Layer.


5.3. Communication features

IoT has become an extended concept, as it expresses a framework through which diverse objects, such as smart meters,
smartphones, or wireless sensor nodes, are blended and connected via the internet through unique addressing. These devices interact
and work in a coordinated way to achieve specific goals in a stipulated time frame. This diversity in technology has provided a fertile
ground for threats to manifest [111]. The feature set includes features related to: Wi-Fi, where they should help achieve intrusion detection
accuracy while minimizing processing and storage requirements [112]; Bluetooth Low Energy (BLE), a well-designed connectivity
protocol used by lower-powered devices that is an emerging choice in many IoT device applications and is becoming easy prey
for cyber attackers; and Zigbee, which is popular for use in IoT devices such as smart meters and home automation. This popularity
has resulted in the discovery of security vulnerabilities. The Zigbee frame provides crucial features that can help counter attacks on IoT
devices, and several works use machine learning to this end [113–115].

5.4. Dynamic features

These features are essential for analyzing the behavior of the information flow. They include traffic statistics, such as the packet
size, original IP addresses, and inter-arrival time. The dynamic features are useful, as they reveal deviations from
the norm of traffic patterns that help identify attacks [116]. In [117], the authors used calculations of flow features, such as mean, max, and
std, to propose an intrusion detection system. Similarly, a novel software-defined network-based IDS framework has been developed in which
statistical features are used for attack categorization. The connectivity and layered features used by IoT datasets for attack detection
are presented in Fig. 7(a), flag features are depicted in Fig. 7(b), and dynamic features are shown in Fig. 7(c). These
features are vital for ML models since they provide information about the underlying characteristics of the traffic and enable such
techniques to identify patterns of normal and malicious network activities.
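
The mean, max, and std statistics mentioned above can be computed per flow with the Python standard library. In this sketch the function and key names are assumptions, the population standard deviation is used (the surveyed works do not state which variant they apply), and the sizes and timestamps are toy values.

```python
import statistics


def dynamic_features(sizes, timestamps):
    """Mean, max and std of packet sizes and of packet inter-arrival times."""
    inter_arrival = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return {
        "size_mean": statistics.mean(sizes),
        "size_max": max(sizes),
        "size_std": statistics.pstdev(sizes),
        "iat_mean": statistics.mean(inter_arrival),
        "iat_max": max(inter_arrival),
        "iat_std": statistics.pstdev(inter_arrival),
    }


# Packet sizes in bytes and arrival times in seconds for one toy flow.
features = dynamic_features([60, 60, 1500, 1500], [0.0, 0.5, 1.0, 2.0])
print(features)
```

A sudden shift in these values between time windows, for example a drop in mean inter-arrival time during a flood, is the kind of deviation such features are meant to expose.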

6. Key challenges & conclusion

IoT has become an essential commodity in our day-to-day lives. With its multifaceted capability to cater to users’ needs, it has
become a personal assistant that everyone is interested in using. However, the popularity of IoT has attracted intruders who hack
the system and take control by stealing user credentials or by sending multiple requests to break down the system. Consequently, the
research community is working towards proposing realistic/simulated datasets and providing suitable solutions based on specific
scenarios. This work has identified ten parameters for performing comparative analysis among IoT datasets. However, some critical
aspects of the publicly available datasets still need to be discussed.

• Device Selection and Functionality: The datasets proposed include smart IoT devices, such as baby monitors, doorbells, and
security cameras. This technology is changing quickly, and the embedded techniques, functionalities, criteria, and specifications
are obsolete or will become obsolete with time. During this study, we also observed that the devices used in
the proposed datasets have either a single functionality (camera) or multiple functionalities (baby monitor). This raises a concern that an
attacker has an advantage in multi-functional systems because of the many ways of entering the system without being noticed.
Therefore, we hope that future datasets will include up-to-date devices so that any new attack can be easily identified.
• Traffic Format: Among the datasets discussed, only a few include real-world data captured in a simulated environment. Also,
the tools used to collect such data vary, i.e., no particular standard format is followed, which affects the usefulness
of these intrusion detection systems.
• Attacks and Variants: During this survey, we noticed that some of the proposed datasets used a main category of attacks
while others used subcategories of attacks. This creates a dilemma about whether to provide a solution for the main class or
a subcategory of an attack. As the proposed systems need to maintain a view of real-world scenarios, it is best to know both
categories and circumstances.
• Perfect Methodology: New approaches are being proposed to detect attacks on the system. During our study, we found that
no single approach is perfect for identifying a particular attack. Because one attack can be detected by more than one approach or
by a combination of machine or deep learning approaches, one or more approaches must be used to detect one or more types
of attacks. Therefore, we recommend a deeper investigation towards proposing better solutions.
• Security updates: IoT manufacturers are eager to produce and deliver their devices as fast as they can, without giving
security much thought. Also, most manufacturers offer only firmware updates.

With the ever-increasing demand for IoT devices in various domains, such as health, industry, and education, IoT environments
offer potential ground for intruder attacks, and the vulnerabilities of devices threaten user security and privacy. A robust
solution for IoT must be developed to cover the properties of a wide range of attacks, including zero-day attacks. This paper
presents an extensive survey of state-of-the-art approaches to intrusion detection and of IoT datasets, covering protocol description, attacks,
vulnerabilities, feature description, detection methodologies, and accuracy, and showing whether the compared surveys review
IoT datasets. We have discussed security attacks in IoT that threaten various layers in the network framework, highlighting in depth
the attacks performed at each layer. Because these attacks are performed on different protocols working at multiple layers, we also
discussed the attacks on the TCP/IP protocol suite together with the exploited vulnerabilities. Further, we have presented publicly available IoT
datasets in detail, exploring the type of data format used, whether the data is labeled or not, the duration of the data recorded,
the type of network, the country in which the experiment was performed, the number of devices used in the experiment, and the number of
benign and malicious samples collected. We then discussed various deep and machine learning methodologies used to detect attacks
on IoT devices. We also listed potential features which would help detect unexpected behavior in IoT devices.


Fig. 7. A Venn diagram representing various features used by different IoT datasets.

7. Additional tables

In order to provide further insights on the characteristics of the datasets investigated in this research, we present a description
of attacks on IoT at different layers (Table 6), a detailed summary of IoT dataset properties (Table 7), a list of potential features
(Table 8), and a list of features used in IoT datasets (Table 9).

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared
to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

Acknowledgments

The authors would like to thank the Canadian Institute for Cybersecurity (CIC) for financial and educational support. This project
was also supported partly by collaborative research funding from the National Research Council of Canada’s Artificial Intelligence
for Logistics Program.


Table 6
Description of attacks on IoT at different layers.
Layer name | Attack type | Attack name | Description | Ref | Violation | Level of impact | Threat type
Physical/Sensing | | Tampering | These attacks generally focused on hardware components | [118] | A | H | S
Physical/Sensing | DoS | Jamming | Occupies the communication channel between the nodes thus preventing them from communicating with each other | [118] | A | H | S
Physical/Sensing | | Malicious Code injection | Constantly sending the malicious messages to block or overload the network | [118] | C | M | P
Physical/Sensing | | Battery Exhaustion | Attack by sending large number of request | [34] | A | H | S
Physical/Sensing | Physical Damage | Power loss | Uninterrupted power loss | [58] | A | H | S
Physical/Sensing | Personal Identification | Social Engineering | Using personal information from IoT gadgets to gain illegal access | [119] | C | H | P
MAC/Network/Transport | Data transit | Eavesdropping | Reading messages by unintended receiver | [120] | I | M | S
MAC/Network/Transport | | Selective Forwarding | To block the messages from forwarding and malicious user acting as a black hole this will prevent from messages to further propagate | [121] | I | H | P
MAC/Network/Transport | DoS/DDoS | POD | Attack performed by making the packet size invalid ultimately destroying the target device. | [122] | I | M | S
MAC/Network/Transport | DoS/DDoS | Flooding | Blocking a machine or network by sending invalid information. | [123] | A | H | S
MAC/Network/Transport | DoS/DDoS | TCP Flood | Sending overwhelming number of ping requests | [59] | A | H | S
MAC/Network/Transport | DoS/DDoS | UDP Flood | A numerous packets sent to random ports in this attack | [23] | A | H | S
MAC/Network/Transport | DoS/DDoS | Smurf | Consume enough server resources to and making unresponsive system to legitimate traffic | [124] | A | H | S
MAC/Network/Transport | DoS/DDoS | ICMP | A large number of ICMP message are sent to crash the system. | [125] | A | H | S
MAC/Network/Transport | DoS/DDoS | Telnet | Unavailability of Telnet services for remote administration | [51] | A | H | S
MAC/Network/Transport | | Sybil | Replicating the single node and presenting it using multiple identities | [55] | I | H | P
MAC/Network/Transport | Spoofing | ARP spoofing | Changing the MAC and IP address | [126] | I | M | S
MAC/Network/Transport | Spoofing | MAC spoofing | Used by the attacker to present as legitimate IoT device | [127] | I | H | S
MAC/Network/Transport | | MAC flooding | Interrupt normal flow of traffic between sender and receiver | [20] | I | H | S
MAC/Network/Transport | MITM | Eavesdropping | Attacker collect information from a network by snooping on transmitted data | [128] | I | M | P
MAC/Network/Transport | Hole attacks | blackhole | Attack is to drop the incoming data packets | [129] | I | M | S
MAC/Network/Transport | Hole attacks | sinkhole | Making the victim node attractive so that all flow goes towards victim node thus making it prone to attack | [55] | A | M | S
MAC/Network/Transport | Hole attacks | wormhole | Entraps the packets and send it to far off node inside or outside the network | [130] | A | H | S
MAC/Network/Transport | Hole attacks | greyhole | Depletion of the resources and causing the overhead to the network is the motive of the attacker | [131] | A | H | S
Application | Phishing | social engineering | Disclosure of username, password and confidential Information | [132] | C | H | P
Application | Injection | SQL Injection | Gaining unauthorized access to several websites | [132] | C | H | P
Application | Malware | virus, worms | Disrupt or intercept the legitimate confidential data. | [58] | C | H | P
Application | DoS | SMTP | Resource starvation | [133] | A | H | P
Application | | Buffer overflow | An attack performed by writing a long sequence of malicious code | [134] | C | M | P
Application | DDoS | Amplification | The attacker forge IP addresses of the CoAP nodes to perform amplification | [135] | C | H | P
Application | MITM | Replay | Here the attacker store the information about the user without its consent or knowledge | [136] | C | M | P
Data Layer | SCADA | SQL Injection | Most common attack of injecting the SQL code into the system thus slowing the services | [137] | C | H | P
Data Layer | Malware | Virus, Worms, Trojan, spyware | An adversary can damage the system by using malicious code which spreads to attached emails | [58] | C | H | P
Data Layer | | Eavesdropping | Reading messages by unintended receiver | [120] | C | M | P
Data Layer | Cryptanalysis | Ciphertext | The ciphertext is easily accessible also the corresponding plaintext | [58] | C | L | P
Data Layer | Cryptanalysis | Plaintext | Some parts of the plaintext is known to the attacker and it modifies it | [58] | C | H | P

Violations (C), (I), (A) mean Confidentiality, Integrity and Availability.
Level of Impacts (H), (M), (L) mean High, Medium and Low.
Threat Types (S), (P) mean Security, Privacy.


Table 7
A detailed summary of IoT dataset properties.
Dataset Device type Simulation method Commu. type Data capturing tool Attack type Attack name Attack tool Attack samples

N-BaIoT doorbell-Danmini n.s. wifi wireshark Bashlite Scan binaries 255,111


(2018) doorbell-Ennio Junk 261,789
Thermostat-Ecobee TCP 859,850
babymonitor-Philips B120N/10 UDP Flooding 946,366
security camera-Provision PT-737E COMBO 515,156
security camera-Provision PT-838 Mirai Scan binaries 537,979
security camera-SimpleHome XCS7-1002-WHT Ack 643,821
security camera-SimpleHome XCS7-1003-WHT Syn 733,299
webcam-Samsung SNH 1011 N UDP Flooding 1,229,999
UDP Plain 523,304

BoT-IoT weather station node-red n.s. tshark, Information Gathering service scanning nmap, hping3 1463364
(2019) smart fridge argus OS fingerprinting xprobe3,nmap 358275
Motion activated lights Denial of service DDoS hping3,golden eye 38532480
garage door DoS hping3,golden eye 33005194
smart thermostat Information Theft Keylogging Metasploitable3 1469
Data Theft Metasploitable3 118

Kitsune Thermostat n.s. wifi NFQueue, Reconnaissance OS Scan nmap 1697851


(2019) baby monitor afpacket, Fuzzing sfuzz 2244139
web cam tshark Man in the Middle Video Injection video jack 2472401
doorbell ARP MitM ettercap 2504267
doorbell Active Wiretap raspberry pi 3B 4554925
security camera - SNCEM602RC Denial of Service SSDP Flood saddam 4077266
security camera -SNCEM600 SYN DoS hping3 2771276
security camera-SNCEB600 SSL Renegotiation THC 6084492
security camera-SNCEB602R Botnet malware Mirai Telnet 764137

IoTNIDS speaker-SKT NGU (NU 100) n.s. wifi n.s. Mirai UDP Flooding binaries 949284
(2019) Camera-EZVIZ WiFi (C2C Mini O Plus 1080P) HTTP Flooding 10646
ACK Flooding 75632
Telnet Bruteforce 1924
Host discovery 673
MITM ARP Spoofing Nmap 101885
DoS SYN flooding Nmap 64646
Scanning port Nmap 20939
os 1817
host discovery 2454

MedBIoT Smart Switch-Sonoff Tasmota raspberry PI n.s. Tcpdump Botnet Mirai binaries 842674
(2020) Smart Switch-TPLink Bashlite 4143276
Light Bulb-TPLink Torii 319139
Lock
Switch
Fan
light

IoT-23 smart LED lamp-Philips HUE n.s. n.s. wireshark Botnet Mirai binaries 235,332,146
(2020) Personal Assistant-Amazon echo home Torii 100,000
Doorlock-Smofy Trojan 24000
Gagfyt 271000000
Kenjiro 109,000,000
Okiru 13,000,000
Hakai 23,000
IRCBot 73,000,000
Linix_Hajime 6,437,000
Muhstik 496,000
Hide & Seek 1,686,000

IoTID speaker-SKT NGU (NU 100) n.s. wifi CIC DoS SYN flooding n.s 59391
(2020) Camera-EZVIZ WiFi (C2C Mini O Plus 1080P) Flowmeter Mirai Host Bruteforce 121181
ACK Flooding 55124
HTTP Flooding 55818
MITM UDP flooding 183554
ARP Spoofing 35377
Scan Host Port 22192
OS 53073

MQTT Temperature IoT-Flock MQTT broker wireshark DoS SlowITe n.s. 9202
(2020) Light intensity Flooding MQTT-malaria 613
Humidity MQTT Publish flood IoT-Flock 130223
Motion sensor Malformed Data MQTTSA 10924
CO-Gas MITM Bruteforce MQTTSA 14501
Smoke
Fan speed controller
Door lock

MQTT-IoT-IDS camera VLC n.s. tcpdump Scan Aggressive nmap 70768
(2020) UDP 210819
Bruteforce Sparta MQTT-PWN 947177


Table 7 (continued).
Dataset Device type Simulation method Communication type Data capturing tool Attack type Attack name Attack tool Attack samples

TON_IoT Mobile-iPhone 7 Node-RED wireshark, Scan Port Nmap 25000
(2020) Smart TV bro DoS DDoS UFONet 3973
Kali VM Backdoor Metasploitable3 35000
Fridge Malware Ransomware Metasploitable3 16030
Garage door Web Data injection bash scripts 35000
GPS tracker XSS (XSSer) toolkit 6116
Modbus Information theft Password cracking CeWL,Hydra toolkit 35000
Motion Light MITM Ettercap toolkit 1403
Thermostat
weather

Edge-IIoTset Smart TV N.A. Wireshark DoS/DDoS TCP SYN Flood hping-3 2020120
(2022) Wireless Router UDP Flood hping-3 3201626
Raspberry Pi 4 Model B HTTP Flood GET request 229022
ESP32 ICMP Flood hping-3 2914354
Arduino Uno Information Gathering Port Scanning Nmap and Netcat 22564
Sensors & Actuators OS Fingerprinting xprobe2 1001
Desktop Computers Vulnerability Scan Attack Nikto 145869
Smart Phone Man in the Middle (MITM) DNS/ARP Spoofing Ettercap 1403
Injection Attacks XSS XSSer 15915
SQL sqlmap 51203
Uploading Metasploit 37634
Malware Backdoor Metasploit 24862
Password Cracking CeWL 1053385
Ransomware OpenSSL 10925
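
The attack-sample counts in Table 7 differ by several orders of magnitude within a single dataset. As a minimal sketch, the snippet below computes per-class shares and a max/min imbalance ratio from the BoT-IoT counts listed in the table. The function names `class_shares` and `imbalance_ratio` are illustrative choices, not taken from the paper or from any dataset tooling.

```python
# Attack sample counts copied from the BoT-IoT rows of Table 7.
BOT_IOT_COUNTS = {
    "Service scanning": 1_463_364,
    "OS fingerprinting": 358_275,
    "DDoS": 38_532_480,
    "DoS": 33_005_194,
    "Keylogging": 1_469,
    "Data theft": 118,
}


def class_shares(counts):
    """Return each class's fraction of the total number of attack samples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def imbalance_ratio(counts):
    """Ratio of the largest class to the smallest class (hypothetical helper)."""
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    shares = class_shares(BOT_IOT_COUNTS)
    # Print classes from most to least frequent.
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:18s} {share:9.4%}")
    print(f"max/min imbalance ratio: {imbalance_ratio(BOT_IOT_COUNTS):,.0f}")
```

A ratio this large suggests that a classifier trained on the raw counts may need resampling or class weighting to learn the rare information-theft classes.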

Table 8
List of potential features.
Category Name Description Profile Effectiveness

CF Basic Source IP Indicates the source IP address H N
Destination IP Indicates the destination IP address H N
Source Port Stores the source port number H H
Destination Port Stores the destination port number available in the TCP stack H H
Protocol Type Shows the protocol type, e.g., TCP or UDP H N,H

Time Duration Records the total flow duration H N,H
Jitter Source and destination jitter time in msec L N,H
Inter Arrival Time (avg, std, max, min) inter-arrival time between two flows in either the backward or forward direction L N,H
Active Time Time during which the flow was active before becoming idle L N,H
Idle Time Time during which the flow was idle before becoming active L N,H

Flag ACK, SYN, FIN, URG, RST (fwd/bwd) Flag setting in the packets traveling in either direction H N,H

Bytes Count Indicates the total number of unique source or destination IP bytes M N
Header Length Total bytes used for headers in the forward or backward direction H N

LF L4 MQTT A lightweight protocol built on TCP that supports various authentication mechanisms L N,H
CoAP An open web transfer protocol built on UDP and designed especially for constrained devices L N,H
HTTP Used to deliver data on the World Wide Web H N,H
HTTPS Extension of HTTP, used for secure communication H N,H
DNS Translation of domain names into IP addresses M N,H
Telnet To provide bidirectional interactive communication L N,H
SMTP Suitable for mail transfer L N,H
SSH Used for remote logins L N,H
IRC Provides communication in the text format M N,H

L3 TCP Connection-oriented protocol that uses a 3-way handshake to establish communication H N
UDP Connectionless protocol used for low-latency communication among devices H N

L2 DHCP Used to dynamically assign IP addresses to the devices on the network L N
ARP Used for discovering link-layer addresses such as MAC addresses M N
RARP The reverse of ARP L N
ICMP Used by networking devices to send error messages H N
IGMP Allows the network to direct multicast traffic M N
IPv(4,6) Core communication protocol, in versions 4 and 6 H N

L1 MAC Provides a channel access and addressing mechanism M H
LLC Can be used for packets on Ethernet networks M H

Cm F wifi Type Field indicates 0-management, 1-control, or 2-data H N
Subtype Indicates the type of management, control, or data frame H N
DS status Indicates the directionality of the frame H N
Fragments When set to 1, the frame has been fragmented and has more fragments to transmit H N
Duration Reserves the medium for the given amount of time H N
Protocol Version Describes which version is used H N
Src/dst MAC addresses of the node that initiated the frame and of the final destination H N
receiver addr. MAC address that identifies the wireless device that is the immediate recipient of the frame H N
Sequence number Indicates the sequence number assigned to the frame M N


Table 8 (continued).
Category Name Description Profile Effectiveness

BLE Type Integer value corresponding to the packet type H N
Direction Direction of the packet: 0 if sent by the gateway, 1 if received H N
Timestamp The packet's timestamp in epoch format H N
Length The packet's length in bytes H N
Taxonomy Indicates whether the packet is management or data related M N
Parameter Length Extracted for HCI command packets; total length of all parameters in bytes L N
Data length The length of the packet's data in bytes, for ACL data packets L N
Event code The code identifying the type of event described in HCI event packets L N

Zigbee Source address The source address H N
Destination address The target address H N
Destination PAN ID PAN ID information H N
Extended PAN ID By default, the coordinator's MAC address M N
Timestamp Packet timestamp H N
Packet length Packet length in bytes H N
Data length Payload length in bytes H N
Data Payload data in raw form H N

DF Mean Average packet length H N,H
Min Minimum packet length H N,H
Max Maximum packet length H N,H
Std Standard deviation of packet length H N,H
IAT Inter-arrival time between packets M N,H
Tot size Size of the packet in bytes or bits M N,H
Tot sum Total byte count of the packets L N,H
Avg Average size of a packet L N,H
Magnitude The root squared sum of the two streams’ means M N
Radius The root squared sum of the two streams’ variances L N,H
Co-variance An approximated co-variance between two streams H N,H
Variance Ratio between outgoing and incoming traffic H N,H
Number Defines packet count L N,H
Pearson Correlation An approximated correlation coefficient between two streams H N
Weight Number of items observed in recent history L N,H

Connectivity Features (CF), Layered Features (LF), Communication Features (Cm F), Dynamic Features (DF)
Profiles (H), (M), (L) mean High, Medium and Low respectively.
Effectiveness (N), (H) mean Network Intrusion Detection System and Host Intrusion Detection System, respectively.
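
The two-stream dynamic features in Table 8 (Magnitude, Radius, Co-variance, Pearson Correlation) can be illustrated with a short sketch that follows the table's descriptions literally. It assumes two equal-length windows of per-packet sizes, one per traffic direction, and computes exact population statistics over the window, whereas the table describes the co-variance and correlation as approximations. The function name `pairwise_stream_stats` and the sample values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def pairwise_stream_stats(stream_a, stream_b):
    """Exact two-stream statistics over equal-length windows (illustrative only)."""
    if len(stream_a) != len(stream_b) or len(stream_a) < 2:
        raise ValueError("streams must have the same length (at least 2)")

    mean_a, mean_b = fmean(stream_a), fmean(stream_b)
    var_a = pvariance(stream_a, mean_a)
    var_b = pvariance(stream_b, mean_b)

    # Magnitude: root of the squared sum of the two streams' means.
    magnitude = math.sqrt(mean_a ** 2 + mean_b ** 2)
    # Radius: root of the squared sum of the two streams' variances.
    radius = math.sqrt(var_a ** 2 + var_b ** 2)
    # Population covariance between the two streams.
    covariance = sum(
        (x - mean_a) * (y - mean_b) for x, y in zip(stream_a, stream_b)
    ) / len(stream_a)
    # Pearson correlation; defined as 0.0 here when either stream is constant.
    denom = math.sqrt(var_a * var_b)
    correlation = covariance / denom if denom > 0 else 0.0

    return {
        "magnitude": magnitude,
        "radius": radius,
        "covariance": covariance,
        "correlation": correlation,
    }


if __name__ == "__main__":
    outgoing = [60.0, 120.0, 180.0, 240.0]  # hypothetical packet sizes in bytes
    incoming = [40.0, 80.0, 120.0, 160.0]
    for name, value in pairwise_stream_stats(outgoing, incoming).items():
        print(f"{name}: {value:.3f}")
```

An online detector would typically maintain these statistics incrementally rather than recomputing them per window; the batch form above is only meant to make the definitions concrete.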

Table 9
List of Features used in IoT datasets.
Category Name BoT-IoT Kitsune N-BaIoT MedBIoT IoTID IoT-23 IoT-SH TON-IoT MQTT MQTT-IoTIDS Edge-IIoTset

CF Basic Source IP ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Destination IP ✓ ✓ ✓ ✓ ✓ ✓
Source Port ✓ ✓ ✓ ✓ ✓ ✓
Destination Port ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Protocol Type ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
State ✓ ✓ ✓ ✓

Service ✓ ✓

Time Duration ✓ ✓ ✓ ✓ ✓ ✓
Jitter ✓ ✓ ✓ ✓
Inter Arrival Time ✓ ✓
TTL ✓ ✓

Flag ACK, SYN, FIN, URG, RST ✓ ✓ ✓ ✓

count ✓

Bytes Count ✓ ✓ ✓ ✓
Header Length ✓ ✓

LF L4 MQTT ✓ ✓ ✓
CoAP ✓
HTTP ✓ ✓ ✓
HTTPS
DNS ✓ ✓ ✓
BOOTP ✓

NTP ✓

L3 TCP ✓ ✓
UDP ✓ ✓
SSL ✓ ✓

L2 DHCP ✓
ICMP ✓ ✓
IGMP ✓
IPv(4,6) ✓

L1 MAC ✓ ✓ ✓ ✓
LLC


Table 9 (continued).
Category Name BoT-IoT Kitsune N-BaIoT MedBIoT IoTID IoT-23 IoT-SH TON-IoT MQTT MQTT-IoTIDS Edge-IIoTset

DF Mean ✓ ✓ ✓ ✓ ✓ ✓
Min ✓ ✓ ✓
Max ✓ ✓ ✓
Std ✓ ✓ ✓ ✓
IAT ✓ ✓
Tot size ✓
Tot sum
Tot num of bytes ✓
Tot num of packets ✓
Avg ✓ ✓
Magnitude ✓ ✓ ✓
Radius ✓ ✓ ✓
Covariance ✓
Variance ✓ ✓
Number
Pearson Correlation ✓ ✓ ✓
Rate ✓
Weight ✓

Labeling Category ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Sub-category ✓ ✓ ✓ ✓ ✓ ✓ ✓
Label ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓

Connectivity Features (CF), Layered Features (LF), Dynamic Features (DF).

References

[1] M. Alaa, A.A. Zaidan, B.B. Zaidan, M. Talal, M.L.M. Kiah, A review of smart home applications based on Internet of Things, J. Netw. Comput. Appl. 97
(2017) 48–65.
[2] P. Kumar, R. Saini, P.P. Roy, D.P. Dogra, A bio-signal based framework to secure mobile devices, J. Netw. Comput. Appl. 89 (2017) 62–71.
[3] G.A. Pimenta Rodrigues, R. de Oliveira Albuquerque, F.E. Gomes de Deus, G.A. De Oliveira Júnior, L.J. García Villalba, T.-H. Kim, et al., Cybersecurity
and network forensics: Analysis of malicious traffic towards a honeynet with deep packet inspection, Appl. Sci. 7 (10) (2017) 1082.
[4] M. Vögler, J. Schleicher, C. Inzinger, S. Nastic, S. Sehic, S. Dustdar, LEONORE–Large-scale provisioning of resource-constrained IoT deployments, in:
2015 IEEE Symposium on Service-Oriented System Engineering, IEEE, 2015, pp. 78–87.
[5] E. Anthi, S. Ahmad, O. Rana, G. Theodorakopoulos, P. Burnap, EclipseIoT: A secure and adaptive hub for the Internet of Things, Comput. Secur. 78
(2018) 477–490.
[6] S. Dadkhah, H. Mahdikhani, P.K. Danso, A. Zohourian, K.A. Truong, A.A. Ghorbani, Towards the development of a realistic multidimensional IoT profiling
dataset, in: 2022 19th Annual International Conference on Privacy, Security & Trust, PST, IEEE Computer Society, 2022, pp. 1–11.
[7] I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, Toward generating a new intrusion detection dataset and intrusion traffic characterization, in: ICISSp, 2018,
pp. 108–116.
[8] M. Ring, S. Wunderlich, D. Grüdl, D. Landes, A. Hotho, Creation of flow-based data sets for intrusion detection, J. Inf. Warf. 16 (4) (2017) 41–54.
[9] W.H. Hassan, et al., Current research on Internet of Things (IoT) security: A survey, Comput. Netw. 148 (2019) 283–294.
[10] H. Hindy, D. Brosset, E. Bayne, A. Seeam, C. Tachtatzis, R. Atkinson, X. Bellekens, A taxonomy and survey of intrusion detection system design techniques,
network threats and datasets, 2018, arXiv. org.
[11] M. Ring, S. Wunderlich, D. Scheuring, D. Landes, A. Hotho, A survey of network-based intrusion detection data sets, Comput. Secur. 86 (2019) 147–167.
[12] A. Khraisat, I. Gondal, P. Vamplew, J. Kamruzzaman, Survey of intrusion detection systems: Techniques, datasets and challenges, Cybersecurity 2 (1)
(2019) 1–22.
[13] M.A. Ferrag, L. Maglaras, S. Moschoyiannis, H. Janicke, Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative
study, J. Inf. Secur. Appl. 50 (2020) 102419.
[14] J. Asharf, N. Moustafa, H. Khurshid, E. Debie, W. Haider, A. Wahab, A review of intrusion detection systems using machine and deep learning in Internet
of Things: Challenges, solutions and future directions, Electronics 9 (7) (2020) 1177.
[15] D. Gümüşbaş, T. Yıldırım, A. Genovese, F. Scotti, A comprehensive survey of databases and deep learning methods for cybersecurity and intrusion
detection systems, IEEE Syst. J. (2020).
[16] A. Khraisat, A. Alazab, A critical review of intrusion detection systems in the Internet of Things: Techniques, deployment strategy, validation strategy,
attacks, public datasets and challenges, Cybersecurity 4 (2021) 1–27.
[17] H. Hamid, R.M. Noor, S.N. Omar, I. Ahmedy, S.S. Anjum, S.A.A. Shah, S. Kaur, F. Othman, E.M. Tamil, IoT-based botnet attacks systematic mapping
study of literature, Scientometrics 126 (4) (2021) 2759–2800.
[18] L. Aversano, M.L. Bernardi, M. Cimitile, R. Pecori, A systematic review on deep learning approaches for IoT security, Comp. Sci. Rev. 40 (2021) 100389.
[19] R. Lohiya, A. Thakkar, Application domains, evaluation data sets, and research challenges of IoT: A systematic review, IEEE Internet Things J. 8 (11)
(2020) 8774–8798.
[20] A. Sundararajan, A. Chavan, D. Saleem, A.I. Sarwat, A survey of protocol-level challenges and solutions for distributed energy resource cyber-physical
security, Energies 11 (9) (2018) 2360.
[21] H. Çepik, Ö. Aydın, G. Dalkılıç, Security vulnerability assessment of Google home connection with an Internet of Things device, in: Multidisciplinary
Digital Publishing Institute Proceedings, Vol. 74, no. 1, 2021, p. 1.
[22] H.A. Abdul-Ghani, D. Konstantas, M. Mahyoub, A comprehensive IoT attacks survey based on a building-blocked reference model, Int. J. Adv. Comput.
Sci. Appl. 9 (3) (2018) 355–373.
[23] I. Ullah, Q.H. Mahmoud, A scheme for generating a dataset for anomalous activity detection in IoT networks, in: Canadian Conference on Artificial
Intelligence, 2020, pp. 508–520.
[24] S. Chockalingam, W. Pieters, A. Teixeira, N. Khakzad, P.v. Gelder, Combining Bayesian networks and fishbone diagrams to distinguish between intentional
attacks and accidental technical failures, in: International Workshop on Graphical Models for Security, Springer, 2018, pp. 31–50.
[25] B. Leiner, Y. Rekhter, The multiprotocol internet, in: RFC 1560, USRA, IBM, Citeseer, 1993.
[26] S. Echeverría, G. Lewis, C. Mazzotta, C. Grabowski, K. O’Meara, A. Vasudevan, M. Novakouski, M. McCormack, V. Sekar, KalKi: A software-defined IoT
security platform, in: 2020 IEEE 6th World Forum on Internet of Things, WF-IoT, IEEE, 2020, pp. 1–6.
[27] P. Kumar, M. Tripathi, A. Nehra, M. Conti, C. Lal, SAFETY: Early detection and mitigation of TCP SYN flood utilizing entropy in SDN, IEEE Trans. Netw.
Serv. Manag. 15 (4) (2018) 1545–1559.

19
B. Kaur et al. Internet of Things 22 (2023) 100780

[28] D. Stiawan, D. Wahyudi, A. Heryanto, M. Idris, F. Muchtar, M.A. Alzahrani, R. Budiarto, et al., TCP FIN flood attack pattern recognition on Internet of
Things with rule based signature analysis, Int. J. Online Biomed. Eng. 15 (7) (2019).
[29] N.N. Thilakarathne, Security and privacy issues in IoT environment, Int. J. Eng. Manag. Res. 10 (2020).
[30] V. Cozzolino, N. Schwellnus, J. Ott, A.Y. Ding, UIDS: Unikernel-based intrusion detection system for the Internet of Things, in: DISS 2020-Workshop on
Decentralized IoT Systems and Security, 2020.
[31] M.M. Salim, S. Rathore, J.H. Park, Distributed denial of service attacks and its defenses in IoT: A survey, J. Supercomput. (2019) 1–44.
[32] K. Gurulakshmi, A. Nesarani, Analysis of IoT bots against DDOS attack using machine learning algorithm, in: 2018 2nd International Conference on
Trends in Electronics and Informatics, ICOEI, IEEE, 2018, pp. 1052–1057.
[33] E.H. Spafford, Opus: Preventing weak password choices, Comput. Secur. 11 (3) (1992) 273–278.
[34] T. Alladi, V. Chamola, B. Sikdar, K.-K.R. Choo, Consumer IoT: Security vulnerability case studies and solutions, IEEE Consum. Electron. Mag. 9 (2) (2020)
17–25.
[35] K. Nirmal, B. Janet, R. Kumar, Analyzing and eliminating phishing threats in IoT, network and other web applications using iterative intersection,
Peer-To-Peer Netw. Appl. (2020) 1–13.
[36] S. Rizvi, R. Orr, A. Cox, P. Ashokkumar, M.R. Rizvi, Identifying the attack surface for IoT network, Internet Things 9 (2020) 100162.
[37] M. Pearce, S. Zeadally, R. Hunt, Virtualization: Issues, security threats, and solutions, ACM Comput. Surv. 45 (2) (2013) 1–39.
[38] E.C.P. Neto, S. Dadkhah, A.A. Ghorbani, Collaborative DDoS detection in distributed multi-tenant IoT using federated learning, in: 2022 19th Annual
International Conference on Privacy, Security & Trust, PST, IEEE Computer Society, 2022, pp. 1–10.
[39] M. Safi, S. Dadkhah, F. Shoeleh, H. Mahdikhani, H. Molyneaux, A.A. Ghorbani, A survey on IoT profiling, fingerprinting, and identification, ACM Trans.
Internet Things 3 (4) (2022) 1–39.
[40] M.V. de Assis, L.F. Carvalho, J.J. Rodrigues, J. Lloret, M.L. Proença Jr., Near real-time security system applied to SDN environments in IoT networks
using convolutional neural network, Comput. Electr. Eng. 86 (2020) 106738.
[41] K. Prathapchandran, T. Janani, A trust-based security model to detect misbehaving nodes in Internet of Things (IoT) environment using logistic regression,
1850 (1) (2021) 012031.
[42] O. AbuAlghanam, H. Alazzam, E. Alhenawi, M. Qatawneh, O. Adwan, Fusion-based anomaly detection system using modified isolation forest for Internet
of Things, J. Ambient Intell. Humaniz. Comput. 14 (1) (2023) 131–145.
[43] M.T. Ribeiro, S. Singh, C. Guestrin, Model-agnostic interpretability of machine learning, 2016, arXiv preprint arXiv:1606.05386.
[44] K. Rose, S. Eldridge, L. Chapin, The Internet of Things: An overview, Internet Soc. (ISOC) 80 (2015) 1–50.
[45] S. Li, L.D. Xu, S. Zhao, The Internet of Things: A survey, Inf. Syst. Front. 17 (2015) 243–259.
[46] L. Tan, N. Wang, Future internet: The Internet of Things, in: 2010 3rd International Conference on Advanced Computer Theory and Engineering, Vol. 5,
ICACTE, IEEE, 2010, pp. V5–376.
[47] V.H. Bezerra, V.G.T. da Costa, R.A. Martins, S.B. Junior, R.S. Miani, B.B. Zarpelao, Providing IoT host-based datasets for intrusion detection research, in:
Anais Do XVIII Simpósio Brasileiro Em Segurança Da Informação E De Sistemas Computacionais, 2018, pp. 15–28.
[48] A. Alsaedi, N. Moustafa, Z. Tari, A. Mahmood, A. Anwar, TON_IoT telemetry dataset: A new generation dataset of IoT and IIoT for data-driven intrusion
detection systems, IEEE Access 8 (2020) 165130–165150.
[49] A. Guerra-Manzanares, J. Medina-Galindo, H. Bahsi, S. Nõmm, MedBIoT: Generation of an IoT botnet dataset in a medium-sized IoT network, in: ICISSP,
2020, pp. 207–218.
[50] I. Vaccari, G. Chiola, M. Aiello, M. Mongelli, E. Cambiaso, MQTTset, A new dataset for machine learning techniques on MQTT, Sensors 20 (22) (2020)
6578.
[51] Y. Mirsky, T. Doitshman, Y. Elovici, A. Shabtai, Kitsune: An ensemble of autoencoders for online network intrusion detection, 2018, arXiv preprint
arXiv:1802.09089.
[52] N. Koroniotis, N. Moustafa, E. Sitnikova, B. Turnbull, Towards the development of realistic botnet dataset in the Internet of Things for network forensic
analytics: Bot-IoT dataset, Future Gener. Comput. Syst. 100 (2019) 779–796.
[53] E. Anthi, L. Williams, M. Słowińska, G. Theodorakopoulos, P. Burnap, A supervised intrusion detection system for smart home IoT devices, IEEE Internet
Things J. 6 (5) (2019) 9042–9053.
[54] A. Hamza, H.H. Gharakheili, T.A. Benson, V. Sivaraman, Detecting volumetric attacks on lot devices via SDN-based monitoring of mud activity, in:
Proceedings of the 2019 ACM Symposium on SDN Research, 2019, pp. 36–48.
[55] M.U. Farooq, M. Waseem, A. Khairi, S. Mazhar, A critical analysis on the security concerns of Internet of Things (IoT), Int. J. Comput. Appl. 111 (7)
(2015).
[56] H. Mohammed, S.R. Hasan, F. Awwad, Fusion-on-field security and privacy preservation for IoT edge devices: Concurrent defense against multiple types
of hardware Trojan attacks, IEEE Access 8 (2020) 36847–36862.
[57] R. Kumar, X. Zhang, W. Wang, R.U. Khan, J. Kumar, A. Sharif, A multimodal malware detection technique for android IoT devices using various features,
IEEE Access 7 (2019) 64411–64430.
[58] J. Deogirikar, A. Vidhate, Security attacks in IoT: A survey, in: 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud),
I-SMAC, 2017, pp. 32–37.
[59] Y. Meidan, M. Bohadana, Y. Mathov, Y. Mirsky, A. Shabtai, D. Breitenbacher, Y. Elovici, N-baiot—network-based detection of IoT botnet attacks using
deep autoencoders, IEEE Pervasive Comput. 17 (3) (2018) 12–22.
[60] H. Kang, D.H. Ahn, G.M. Lee, J. Yoo, K.H. Park, H.K. Kim, IoT network intrusion dataset, IEEE Dataport (2019).
[61] A. Parmisano, S. Garcia, M. Erquiaga, Stratosphere laboratory. A labeled dataset with malicious and benign IoT network traffic, 2020, January.
[62] H. Hindy, E. Bayne, M. Bures, R. Atkinson, C. Tachtatzis, X. Bellekens, Machine learning based IoT intrusion detection system: An MQTT case study,
2020, arXiv preprint arXiv:2006.15340.
[63] M.A. Ferrag, O. Friha, D. Hamouda, L. Maglaras, H. Janicke, Edge-IIoTset: A new comprehensive realistic cyber security dataset of IoT and IIoT applications
for centralized and federated learning, IEEE Access 10 (2022) 40281–40306.
[64] G.Z. Papadopoulos, A. Gallais, G. Schreiner, T. Noel, Importance of repeatable setups for reproducible experimental results in IoT, in: Proceedings of the
13th ACM Symposium on Performance Evaluation of Wireless Ad Hoc, Sensor, & Ubiquitous Networks, 2016, pp. 51–59.
[65] A. Parmisano, S. Garcia, M. Erquiaga, A Labeled Dataset with Malicious and Benign IoT Network Traffic, Stratosphere Laboratory, Praha, Czech Republic,
2020.
[66] J. Cui, L. Wang, X. Zhao, H. Zhang, Towards predictive analysis of android vulnerability using statistical codes and machine learning for IoT applications,
Comput. Commun. 155 (2020) 125–131.
[67] M.A. Ferrag, L. Maglaras, A. Ahmim, M. Derdour, H. Janicke, Rdtids: Rules and decision tree-based intrusion detection system for Internet-of-Things
networks, Future Internet 12 (3) (2020) 44.
[68] S.M. Taghavinejad, M. Taghavinejad, L. Shahmiri, M. Zavvar, M.H. Zavvar, Intrusion detection in IoT-based smart grid using hybrid decision tree, in:
2020 6th International Conference on Web Research, ICWR, IEEE, 2020, pp. 152–156.
[69] A. Verma, V. Ranga, Machine learning based intrusion detection systems for IoT applications, Wirel. Pers. Commun. 111 (4) (2020) 2287–2310.
[70] M.G. Karthik, M.M. Krishnan, Hybrid random forest and synthetic minority over sampling technique for detecting Internet of Things attacks, J. Ambient
Intell. Humaniz. Comput. (2021) 1–11.

20
B. Kaur et al. Internet of Things 22 (2023) 100780

[71] M. Hasan, M.M. Islam, M.I.I. Zarif, M. Hashem, Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches, Internet
Things 7 (2019) 100059.
[72] M. Eskandari, Z.H. Janjua, M. Vecchio, F. Antonelli, Passban IDS: An intelligent anomaly-based intrusion detection system for IoT edge devices, IEEE
Internet Things J. 7 (8) (2020) 6882–6897.
[73] K. Sadaf, J. Sultana, Intrusion detection based on autoencoder and isolation forest in fog computing, IEEE Access 8 (2020) 167059–167068.
[74] O. AbuAlghanam, H. Alazzam, E. Alhenawi, M. Qatawneh, O. Adwan, Fusion-based anomaly detection system using modified isolation forest for Internet
of Things, J. Ambient Intell. Humaniz. Comput. (2022) 1–15.
[75] L. Breiman, J. Friedman, C.J. Stone, R.A. Olshen, Classification and Regression Trees, CRC Press, 1984.
[76] P. Amangele, M.J. Reed, M. Al-Naday, N. Thomos, M. Nowak, Hierarchical machine learning for IoT anomaly detection in SDN, in: 2019 International
Conference on Information Technologies, InfoTech, IEEE, 2019, pp. 1–4.
[77] S.U. Jan, S. Ahmed, V. Shakhov, I. Koo, Toward a lightweight intrusion detection system for the Internet of Things, IEEE Access 7 (2019) 42450–42471.
[78] M. Shen, X. Tang, L. Zhu, X. Du, M. Guizani, Privacy-preserving support vector machine training over blockchain-based encrypted IoT data in smart
cities, IEEE Internet Things J. 6 (5) (2019) 7702–7712.
[79] H.-T. Hsu, G.-J. Jong, J.-H. Chen, C.-G. Jhe, Improve IoT security system of smart-home by using support vector machine, in: 2019 IEEE 4th International
Conference on Computer and Communication Systems, ICCCS, IEEE, 2019, pp. 674–677.
[80] Z.A. Baig, S. Sanguanpong, S.N. Firdous, T.G. Nguyen, C. So-In, et al., Averaged dependence estimators for DoS attack detection in IoT networks, Future
Gener. Comput. Syst. 102 (2020) 198–209.
[81] S. Manimurugan, IoT-fog-cloud model for anomaly detection using improved Naïve Bayes and principal component analysis, J. Ambient Intell. Humaniz.
Comput. (2021) 1–10.
[82] S. Ismail, H. Reza, Evaluation of Naïve Bayesian algorithms for cyber-attacks detection in wireless sensor networks, in: 2022 IEEE World AI IoT Congress,
AIIoT, IEEE, 2022, pp. 283–289.
[83] T. Saranya, S. Sridevi, C. Deisy, T.D. Chung, M.A. Khan, Performance analysis of machine learning algorithms in intrusion detection system: A review,
Procedia Comput. Sci. 171 (2020) 1251–1260.
[84] D. Zheng, Z. Hong, N. Wang, P. Chen, An improved LDA-based ELM classification for intrusion detection algorithm in IoT application, Sensors 20 (6)
(2020) 1706.
[85] W. Cheng, C. Xu, M. Zhang, S. Ji, IoT terminal recognition method based on linear discriminant spectral analysis, in: 2022 IEEE 5th International
Electrical and Energy Conference, CIEEC, IEEE, 2022, pp. 1648–1653.
[86] C. Ioannou, V. Vassiliou, An intrusion detection system for constrained WSN and IoT nodes based on binary logistic regression, in: Proceedings of the
21st ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, 2018, pp. 259–263.
[87] M. Roopak, G.Y. Tian, J. Chambers, Deep learning models for cyber security in IoT networks, in: 2019 IEEE 9th Annual Computing and Communication
Workshop and Conference, CCWC, IEEE, 2019, pp. 0452–0457.
[88] I. Garcia-Magarino, R. Muttukrishnan, J. Lloret, Human-centric AI for trustworthy IoT systems with explainable multilayer perceptrons, IEEE Access 7
(2019) 125562–125574.
[89] I.F. Kilincer, F. Ertam, A. Sengur, R.-S. Tan, U.R. Acharya, Automated detection of cybersecurity attacks in healthcare systems with recursive feature
elimination and multilayer perceptron optimization, Biocybern. Biomed. Eng. 43 (1) (2023) 30–41.
[90] A.Y. Khan, R. Latif, S. Latif, S. Tahir, G. Batool, T. Saba, Malicious insider attack detection in IoTs using data analytics, IEEE Access 8 (2019) 11743–11753.
[91] W. Li, P. Yi, Y. Wu, L. Pan, J. Li, A new intrusion detection system based on KNN classification algorithm in wireless sensor network, J. Electr. Comput.
Eng. 2014 (2014).
[92] G. Liu, H. Zhao, F. Fan, G. Liu, Q. Xu, S. Nazir, An enhanced intrusion detection model based on improved KNN in WSNs, Sensors 22 (4) (2022) 1407.
[93] G.D.L.T. Parra, P. Rad, K.-K.R. Choo, N. Beebe, Detecting Internet of Things attacks using distributed deep learning, J. Netw. Comput. Appl. 163 (2020)
102662.
[94] J. Jeon, J.H. Park, Y.-S. Jeong, Dynamic analysis for IoT malware detection with convolution neural network model, IEEE Access 8 (2020) 96899–96911.
[95] S. Smys, A. Basar, H. Wang, Hybrid intrusion detection system for Internet of Things (IoT), J. ISMAC 2 (04) (2020) 190–199.
[96] M. Almiani, A. AbuGhazleh, A. Al-Rahayfeh, S. Atiewi, A. Razaque, Deep recurrent neural network for IoT intrusion detection system, Simul. Model.
Pract. Theory 101 (2020) 102031.
[97] M. Woźniak, J. Siłka, M. Wieczorek, M. Alrashoud, Recurrent neural network model for IoT and networking malware threat detection, IEEE Trans. Ind.
Inform. 17 (8) (2020) 5583–5594.
[98] A. Samy, H. Yu, H. Zhang, Fog-based attack detection framework for Internet of Things using deep learning, IEEE Access 8 (2020) 74571–74585.
[99] S. Gamage, J. Samarabandu, Deep learning methods in network intrusion detection: A survey and an objective comparison, J. Netw. Comput. Appl. 169
(2020) 102767.
[100] M. Yousefi-Azar, V. Varadharajan, L. Hamey, U. Tupakula, Autoencoder-based feature learning for cyber security applications, in: 2017 International
Joint Conference on Neural Networks, IJCNN, IEEE, 2017, pp. 3854–3861.
[101] S.J. Lee, P.D. Yoo, A.T. Asyhari, Y. Jhi, L. Chermak, C.Y. Yeun, K. Taha, IMPACT: Impersonation attack detection via edge computing using deep
autoencoder and feature abstraction, IEEE Access 8 (2020) 65520–65529.
[102] M.R. Shahid, G. Blanc, Z. Zhang, H. Debar, Anomalous communications detection in IoT networks using sparse autoencoders, in: 2019 IEEE 18th
International Symposium on Network Computing and Applications, NCA, IEEE, 2019, pp. 1–5.
[103] Y. Hamid, M. Sugumaran, L. Journaux, Machine learning techniques for intrusion detection: A comparative analysis, in: Proceedings of the International
Conference on Informatics and Analytics, 2016, pp. 1–6.
[104] P. Debnath, P. Chittora, T. Chakrabarti, P. Chakrabarti, Z. Leonowicz, M. Jasinski, R. Gono, Jasi nska, E. Analysis of earthquake forecasting in India using
supervised machine learning classifiers, Sustainability 13 (2021) 971, 2021, s Note: MDPI stays neutral with regard to jurisdictional claims in published.
[105] J. Kotak, Y. Elovici, IoT device identification using deep learning, in: 13th International Conference on Computational Intelligence in Security for
Information Systems, CISIS 2020 12, Springer, 2021, pp. 76–86.
[106] Y. Otoum, D. Liu, A. Nayak, DL-IDS: A deep learning–based intrusion detection framework for securing IoT, Trans. Emerg. Telecommun. Technol. 33 (3)
(2022) e3803.
[107] M. Shafiq, Z. Tian, A.K. Bashir, X. Du, M. Guizani, CorrAUC: A malicious BoT-IoT traffic detection method in IoT network using machine-learning
techniques, IEEE Internet Things J. 8 (5) (2020) 3242–3254.
[108] J. Bassey, D. Adesina, X. Li, L. Qian, A. Aved, T. Kroecker, Intrusion detection for IoT devices based on RF fingerprinting using deep learning, in: 2019
Fourth International Conference on Fog and Mobile Edge Computing, FMEC, IEEE, 2019, pp. 98–104.
[109] T. Saba, A. Rehman, T. Sadad, H. Kolivand, S.A. Bahaj, Anomaly-based intrusion detection system for IoT networks through deep learning model, Comput.
Electr. Eng. 99 (2022) 107810.
[110] L. Ma, Y. Chai, L. Cui, D. Ma, Y. Fu, A. Xiao, A deep learning-based DDoS detection framework for Internet of Things, in: ICC 2020-2020 IEEE International
Conference on Communications, ICC, IEEE, 2020, pp. 1–6.
[111] V.H. La, R. Fuentes, A.R. Cavalli, A novel monitoring solution for 6LoWPAN-based wireless sensor networks, in: 2016 22nd Asia-Pacific Conference on
Communications, APCC, 2016, pp. 230–237.

21
B. Kaur et al. Internet of Things 22 (2023) 100780

[112] M. Usha, P. Kavitha, Anomaly based intrusion detection for 802.11 networks with optimal features using SVM classifier, Wirel. Netw. 23 (8) (2017)
2431–2446.
[113] F. Sadikin, T. Van Deursen, S. Kumar, A ZigBee intrusion detection system for IoT using secure and efficient data collection, Internet Things 12 (2020)
100306.
[114] G.D. O’Mahony, P.J. Harris, C.C. Murphy, Detecting interference in wireless sensor network received samples: A machine learning approach, in: 2020
IEEE 6th World Forum on Internet of Things, WF-IoT, IEEE, 2020, pp. 1–6.
[115] G. Qing, H. Wang, T. Zhang, Radio frequency fingerprinting identification for Zigbee via lightweight CNN, Phys. Commun. 44 (2021) 101250.
[116] M. Anagnostopoulos, G. Spathoulas, B. Viaño, J. Augusto-Gonzalez, Tracing your smart-home devices conversations: A real world IoT traffic data-set,
Sensors 20 (22) (2020) 6600.
[117] I. Ullah, Q.H. Mahmoud, A two-level flow-based anomalous activity detection system for IoT networks, Electronics 9 (3) (2020) 530.
[118] I. Butun, P. Österberg, H. Song, Security of the Internet of Things: Vulnerabilities, attacks, and countermeasures, IEEE Commun. Surv. Tutor. 22 (1)
(2019) 616–644.
[119] Y. Liu, H.-H. Chen, L. Wang, Physical layer security for next generation wireless networks: Theories, technologies, and challenges, IEEE Commun. Surv.
Tutor. 19 (1) (2016) 347–376.
[120] E.M. Ghourab, A. Mansour, M. Azab, M. Rizk, A. Mokhtar, Towards physical layer security in Internet of Things based on reconfigurable multiband
diversification, in: 2017 8th IEEE Annual Information Technology, Electronics and Mobile Communication Conference, IEMCON, 2017, pp. 446–450.
[121] A. Hariri, N. Giannelos, B. Arief, Selective forwarding attack on IoT home security kits, in: Computer Security, 2019, pp. 360–373.
[122] A. Abdollahi, M. Fathi, An intrusion detection system on ping of death attacks in IoT networks, Wirel. Pers. Commun. (2020) 1–14.
[123] R. Rizal, I. Riadi, Y. Prayudi, Network forensics for detecting flooding attack on Internet of Things (IoT) device, Int. J. Cyber-Security Digit. Forensics 7
(4) (2018) 382–390.
[124] K. Sonar, H. Upadhyay, A survey: DDoS attack on Internet of Things, Int. J. Eng. Res. Dev. 10 (11) (2014) 58–63.
[125] B. Kepçeoğlu, A. Murzaeva, S. Demirci, Performing energy consuming attacks on IoT devices, in: 2019 27th Telecommunications Forum, TELFOR, IEEE,
2019, pp. 1–4.
[126] Y. Arslan, A solution for ARP spoofing: Layer-2 MAC and protocol filtering and arpserver, 2017, arXiv preprint arXiv:1708.01302.
[127] N. Wang, L. Jiao, P. Wang, M. Dabaghchian, K. Zeng, Efficient identity spoofing attack detection for IoT in mm-wave and massive MIMO 5G communication,
in: 2018 IEEE Global Communications Conference, GLOBECOM, 2018, pp. 1–6.
[128] Q. Gou, L. Yan, Y. Liu, Y. Li, Construction and strategies in IoT security system, in: 2013 IEEE International Conference on Green Computing and
Communications and IEEE Internet of Things and IEEE Cyber, Physical and Social Computing, IEEE, 2013, pp. 1129–1132.
[129] S. Ali, M.A. Khan, J. Ahmad, A.W. Malik, A. ur Rehman, Detection and prevention of black hole attacks in IoT & WSN, in: 2018 Third International
Conference on Fog and Mobile Edge Computing, FMEC, IEEE, 2018, pp. 217–226.
[130] M. Goyal, M. Dutta, Intrusion detection of wormhole attack in IoT: A review, in: 2018 International Conference on Circuits and Systems in Digital
Enterprise Technology, ICCSDET, IEEE, 2018, pp. 1–5.
[131] R. Mehta, M. Parmar, Trust based mechanism for securing IoT routing protocol RPL against wormhole & grayhole attacks, in: 2018 3rd International
Conference for Convergence in Technology, I2CT, IEEE, 2018, pp. 1–6.
[132] A.N. Shaikh, A.M. Shabut, M.A. Hossain, A literature review on phishing crime, prevention review and investigation of gaps, in: 2016 10th International
Conference on Software, Knowledge, Information Management & Applications, SKIMA, 2016, pp. 9–15.
[133] E. Cambiaso, G. Papaleo, M. Aiello, Slowcomm: Design, development and performance evaluation of a new slow DoS attack, J. Inf. Secur. Appl. 35
(2017) 23–31.
[134] K. Chen, S. Zhang, Z. Li, Y. Zhang, Q. Deng, S. Ray, Y. Jin, Internet-of-Things security and vulnerabilities: Taxonomy, challenges, and practice, J. Hardw.
Syst. Secur. 2 (2) (2018) 97–110.
[135] M.T. Manavi, Defense mechanisms against distributed denial of service attacks: A survey, Comput. Electr. Eng. 72 (2018) 26–38.
[136] P. Rughoobur, L. Nagowah, A lightweight replay attack detection framework for battery depended IoT devices designed for healthcare, in: 2017
International Conference on Infocom Technologies and Unmanned Systems (Trends and Future Directions), ICTUS, 2017, pp. 811–817.
[137] B. Zhu, A. Joseph, S. Sastry, A taxonomy of cyber attacks on SCADA systems, in: 2011 International Conference on Internet of Things and 4th International
Conference on Cyber, Physical and Social Computing, IEEE, 2011, pp. 380–388.

Barjinder Kaur received her PhD in Computer Science & Engineering from DCRUST, India in 2018. Her PhD work focused on
Brain-Computer-Interface (BCI) systems for biometric and emotion recognition problems using machine learning techniques. Currently,
she is a post-doctoral fellow with Prof. Ali Ghorbani at the Canadian Institute for Cybersecurity (CIC) at the University of New Brunswick
(UNB), where she is working on IoT data security and threat prediction problems.

Sajjad Dadkhah is a faculty member, research associate, and cybersecurity team leader at the Canadian Institute for Cybersecurity
(CIC), Faculty of Computer Science, University of New Brunswick (UNB). He has been involved in several security projects as a
research assistant and security consultant in different organizations such as Kyushu University (Japan), Universiti Malaya (UM), and
IRIS Smart Technology Complex. He has more than ten years of experience in digital multimedia security, computer security, IoT
security and machine learning-based detection systems. He has been the managing editor and a board member of the Elsevier journal
Applied Soft Computing (ASOC) since 2016.

Farzaneh Shoeleh received her Ph.D. degree in Machine Intelligence and Robotics from the University of Tehran, Iran, in 2018. She
was a postdoctoral researcher at the Iran Telecommunication Research Center (ITRC), Iran, from July 2015 to April 2019. She is currently
working as the Research Team Lead of the Canadian Institute for Cybersecurity (CIC), University of New Brunswick, Canada. Her
research interests include Machine Learning, Data Science, Cybersecurity, Reinforcement Learning, and Transfer Learning.

Euclides Carlos Pinto Neto graduated as a Computer Scientist (2016) from the Federal Rural University of Pernambuco (UFRPE) and Athlone
Institute of Technology (AIT). He holds an MSc and a PhD in Computer Engineering (2018 and 2021, Digital Systems Department at
Poli-USP, University of São Paulo, Brazil), where he was a member of the Safety Analysis Group. His research is focused on applications
of artificial intelligence techniques to security and safety-critical systems.

Pulei Xiong is a research officer at the National Research Council Canada. He has over a decade of industrial and academic experience in
cyber security. His current research interests include security and privacy of machine learning, privacy-preserving applications, mobile
and IoT security, and security compliance for emerging technologies.

Shahrear Iqbal received his Ph.D. in Cybersecurity from Queen’s University, Kingston, ON, Canada in 2017. He is now a research
officer at the National Research Council (NRC) Canada. Before joining NRC, he was a researcher at Security Compass, Toronto. He is a
specialist in software security, in the area of mobile/IoT/connected devices. Previously, he was a faculty member in the Department
of Computer Science and Engineering (CSE) at Bangladesh University of Engineering and Technology (BUET). He is currently doing
research on the security of connected devices with a focus on context-aware authentication and automated defense mechanisms.

Philippe Lamontagne received his Ph.D. in quantum cryptography from Université de Montréal in 2018. His thesis focused
on the provable security of quantum protocols for secure two-party computation. He is now a research officer at Canada’s National
Research Council (NRC). As a member of the NRC’s cybersecurity team, his research interests include cryptography, quantum-resistant
algorithms and privacy-preserving techniques.

Suprio Ray is an Associate Professor with the Faculty of Computer Science, University of New Brunswick, Fredericton, Canada. He
received a Ph.D. degree from the Department of Computer Science, University of Toronto, Canada. His research interests include big
data and database management systems, run-time systems for scalable data science, provenance and privacy issues in big data and
data management on modern hardware.

Ali A. Ghorbani is currently a Professor of computer science, the Tier 1 Canada Research Chair in cybersecurity, and the Director
of the Canadian Institute for Cybersecurity, which he established in 2016. He is a co-inventor on three awarded patents and one filed
patent in the fields of cybersecurity and web intelligence. He has published over 280 peer-reviewed articles during his career. He
has supervised over 190 research associates, postdoctoral fellows, and students during his career. His book, Intrusion Detection and
Prevention Systems: Concepts and Techniques, was published by Springer in October 2010. He is the Co-Founder of the Privacy, Security, Trust (PST)
Network in Canada and its annual international conference. He served as the Co-Editor-in-Chief for the International Journal of
Computational Intelligence from 2007 to 2017.
