
Received 11 August 2022, accepted 31 August 2022, date of publication 8 September 2022, date of current version 19 September 2022.

Digital Object Identifier 10.1109/ACCESS.2022.3205023

A Survey of Smart Home IoT Device Classification Using Machine Learning-Based Network Traffic Analysis

HOUDA JMILA, GREGORY BLANC, MUSTAFIZUR R. SHAHID, AND MARWAN LAZRAG
SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, 91764 Palaiseau, France
Corresponding author: Houda Jmila ([email protected])
This work was supported in part by the Vulnerability and Attack Repository for IoT (VarIoT) Project under Grant TENtec n.28263632, and
in part by the Connecting Europe Facility of the European Union.

ABSTRACT Smart home IoT devices lack proper security, raising safety and privacy concerns. One-size-fits-all network administration is ineffective because of the diverse QoS requirements of IoT devices. Device classification can improve IoT administration and security. It identifies vulnerable and rogue devices and automates network administration by device type or function. Considering this, a promising research topic focusing on Machine Learning (ML)-based traffic analysis has emerged in order to demystify hidden patterns in IoT traffic and enable automatic device classification. This study analyzes these approaches to understand their potential and limitations. It starts by describing a generic workflow for IoT device classification. It then looks at the methods and solutions for each stage of the workflow. This mainly consists of i) an analysis of IoT traffic data acquisition methodologies and scenarios, as well as a classification of public datasets, ii) a literature evaluation of IoT traffic feature extraction, categorizing and comparing popular features, as well as describing open-source feature extraction tools, and iii) a comparison of ML approaches for IoT device classification and how they have been evaluated. The findings of the analysis are presented in taxonomies with statistics showing literature trends. This study also explores and suggests unexplored or understudied research directions.

INDEX TERMS Classification, security, device, fingerprinting, identification, internet of things, machine
learning, network traffic, survey.

The associate editor coordinating the review of this manuscript and approving it for publication was Taehong Kim.

I. INTRODUCTION
In the last decade, the Internet of Things (IoT) has spread: according to IoT Analytics [1], the IoT market will rise by 18% to 14.4 billion active connections in 2022. Researchers have suggested several definitions of the IoT, but almost all agree that it is a framework of sensors, industrial machines, video cameras, mobile phones, etc., all of which are collectively referred to as IoT devices and can interact directly with one another or over the internet. IoT is used in smart environments (homes, cities, campuses, etc.) to help users understand and control their environment.

Despite its undeniable advantages, IoT expansion raises security and privacy concerns. Most IoT device manufacturers tend to prioritize the three Ps (prototyping, production, and performance) above security [2]. This results in an ineffective security design for IoT devices. As revealed by Wikileaks [3], poorly secured IoT devices are ideal targets for attackers seeking to obtain unauthorized access and infer sensitive information: e.g., smart TVs were converted into listening devices. Attackers can also use compromised devices to inject malicious data and conduct large-scale attacks against third parties or other devices inside the network [4]. Automatically classifying devices is the first step toward securing IoT networks. It enables the detection of vulnerable devices and the enforcement of access control.

The growing diversity and heterogeneity of IoT devices, each with its own QoS requirements (cameras require more bandwidth than smart light bulbs, healthcare device traffic must be prioritized, and so on), makes one-size-fits-all network management ineffective. IoT device classification enables network management automation.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

FIGURE 1. The scope of the survey is highlighted in red. We focus on IoT device classification in
smart homes, also called consumer IoT devices. We analyze approaches using machine
learning-based traffic analysis.

By setting QoS and network management policies based on the type of device, each automatically classified device can be assigned to a class with predetermined policies.

Note that the term device classification is often confused with many similar tasks, namely i) traffic classification, ii) intrusion detection, iii) device identification, and iv) device fingerprinting. Traffic classification is a broad research field that involves classifying network traffic based on various parameters [5] (see Fig. 1). For instance, traffic can be classified as either legitimate or malicious based on attack patterns: this is called intrusion detection. It can also be classified by the device that generates the traffic (device classification). The devices can be categorized into groups of similar devices, such as devices for energy management or devices for health monitoring, or according to their function, such as cameras, hubs, home assistants, etc. Device identification classifies devices more finely according to their model or constructor, such as D-link camera, Nest camera, Alexa home assistant, or Google home mini assistant, etc. Device fingerprinting is the finest level of device classification. It gives each device instance (e.g., camera A and camera B are two instances of the Nest Camera) a distinct fingerprint that is "impossible to forge and independent of environmental changes and mobility" [6]. In this study, we focus on device classification as a specific case of traffic classification, broader than device identification and device fingerprinting.

A simple way to classify IoT devices is to monitor their MAC addresses and DHCP negotiation [7]. Sivanathan et al. [8] outline the shortcomings of this method. First, IP and MAC addresses can be easily spoofed by other devices, making them unreliable identifiers. Furthermore, MAC addresses are not necessarily indicative of device manufacturers, and even if they were, there is no standard for recognizing device brands and types accordingly. To cope with this problem, researchers have examined IoT network traffic and witnessed that IoT devices perform very specific tasks [9]: for example, it is possible to turn on or off a smart bulb or change its brightness and light color; however, a smart bulb can not stream videos or send emails. Therefore, we assume that the IoT network traffic could follow a stable and predictable pattern that may characterize it. Machine learning may reveal hidden network traffic patterns and learn their characteristics, making device classification easier. This study explores IoT device classification using ML-based network traffic analysis. To characterize a device, we focus on all the network traffic it creates, which is device-specific and not application-specific because it comprises all the applications (tasks) executed by the device, which can be distinct.

According to [10], IoT devices can be divided into consumer, commercial, and industrial categories. Consumer IoT devices include personal devices, such as smartphones, and internet-connected home devices like cameras, home assistants, and smart lamps. Larger organizations employ commercial IoT devices for smart city deployments, transportation and electric car monitoring, health monitoring systems, etc. Industrial IoT devices improve process control and productivity, such as sensors, robots, and power plant controllers. Some devices, like cameras and sensors, can belong to multiple categories. This survey focuses on consumer IoT devices, commonly called smart home devices. This choice is motivated by the rich and abundant literature on smart home devices due to i) the availability of data, compared to its confidentiality in the industrial world, and ii) the large number of smart home devices, which represent the largest share of the IoT market (63% according to Gartner [11]). Furthermore, many people, including those unaware of security, use smart home IoT devices, making their protection crucial.


FIGURE 2. Workflow of IoT device classification using ML-based traffic analysis: input includes the
devices to be classified. First, raw traffic data is collected as pcap files and supplied to the feature
extraction procedure, which creates feature vectors (in text-based format) representing the raw
traffic. ML algorithms use these files to classify the originating device of each sample.
Classification results can be used in various contexts, including cyber security enforcement,
network management, and malicious usage.

Other surveys have examined IoT device classification-related tasks, but none have focused on the topic of this study. Tahaei et al. [7] discussed IoT traffic classification, while [12] examined ML-based internet traffic classification, although neither focused on device classification. Sanchez et al. [13] discussed device behavioral fingerprinting but not IoT devices specifically. Yadav et al. [6] provided a taxonomy for IoT device identification approaches. However, they did not focus on ML methods and what data collection, feature extraction, and model learning require. To the best of our knowledge, no recent work studies exhaustive IoT classification datasets, no current work explores feature extraction methodologies and compares the most useful and interesting features for IoT device classification, and no previous work examined each step of the IoT classification process as we do.

For a comprehensive literature review, we analyzed papers from different digital libraries like IEEE Xplore, ResearchGate, Google Scholar, etc. First, we performed a keyword search using terms related to i) IoT devices, like "IoT devices," "wearable devices," and "IoT gadgets," ii) classification, like "classification," "clustering," "identification," and "fingerprinting," iii) traffic analysis, like "traffic analysis," "traffic classification," "communication analysis," "network characteristics," "network packets," and "network flows," and iv) machine learning, like "machine learning," "deep learning," "artificial intelligence," "supervised learning," "unsupervised clustering," "automated," and "intelligent." Our search was limited to 2018-2022 articles to capture recent advancements. Second, we examined the reference lists and citations of the selected articles to find more papers. Third, we scanned titles and abstracts to reject items that did not fit the scope (task: classification, context: smart home, and classification approach: ML-based traffic analysis). Finally, a deep evaluation of the publications was conducted, and articles with insufficient information on all stages of the classification procedure were removed. At the end of this process, 58 papers were deemed pertinent to our investigation.

II. ANALYSIS STEPS AND CONTRIBUTIONS
Fig. 2 shows a general flowchart summarizing the multiple steps and actors that can be involved in IoT device classification using ML-based traffic analysis. The initial step is data acquisition, which consists of collecting raw traffic from devices in pcap files (the pcap file format is the de facto standard for packet captures). The second phase is feature extraction, which aims at representing raw traffic with numerical or categorical information in text-based format (e.g., csv (Comma-Separated Values) or text) files that ML algorithms can use. The final stage is classification using machine learning algorithms. The classification result can be used for cyber security enforcement, network management, as well as malicious activities like cyber attacks.


FIGURE 3. Table of contents and discussed questions.

To help develop more effective solutions for IoT device classification, this study investigates the literature regarding each stage of the process and attempts to provide answers to the following research topics.

• RQ1. How to design a practical data-acquisition method for IoT device classification? Data acquisition is a crucial step that should enable the practical and realistic capture of the most relevant information about the environment. To design an effective and practical solution for the IoT device classification problem, it is essential to know: i) which devices should be used for data-acquisition to represent a realistic smart home environment, ii) when to collect the traffic to capture the diversity of the devices' operational modes, and iii) where to place the collection probe so as to capture traffic in an effective yet privacy-preserving manner.
• RQ2. How to create an efficient feature extraction solution? Feature extraction is a critical step that must describe the collected traffic as accurately as possible to reflect its patterns. To develop an appropriate feature extraction technique for IoT device classification, it is necessary to know: i) how to represent a single data sample, as a packet or as a flow of packets, in other terms, at what level to extract features (packet-level or flow-level), ii) in the latter scenario, how to define a packet flow (by time interval, number of packets, or connection), and iii) which are the most informative and discriminating features, and how to calculate them.
• RQ3. How to build effective machine learning classifiers for IoT device classification? Classification using machine learning algorithms is the last, but not the least important step. To answer this research question, it is essential to decide: 1) the scope of a classifier (one classifier per device type or one multi-class classifier), 2) the learning strategy (supervised, unsupervised, semi-supervised), and 3) the machine learning techniques to use (deep or shallow algorithms).
• RQ4. How to choose the classification granularity? Device classification can be performed at different levels of granularity. It is crucial to understand the pros and cons of each classification level in order to choose the optimal granularity for each context and avoid extra classification costs.

To the best of our knowledge, this is the first paper that covers all of the above-mentioned challenges and explores their impact on IoT device classification. As an attempt to address the above-mentioned research questions, this survey also produces the following contributions (Fig. 3, which provides a table of contents, depicts where and how the above questions are handled in this study):

• An analysis of the various applications for the classification of smart home IoT devices.
• An in-depth examination of IoT traffic data collection strategies. This includes: i) a review of the devices used to represent a smart home setting, ii) a study of IoT traffic types (depending on device operation mode) and their utility for classifying devices, iii) a description of the architecture and different traffic collection points (depending on the traffic probe location) and a debate on how realistic they are, and iv) an evaluation of public datasets for IoT device classification.
• A thorough review of feature extraction approaches. This includes: i) exploring different feature types and comparing their significance and computation methodologies, ii) exploring deep learning-based automatic feature extraction, iii) describing open-source feature extraction tools, and iv) investigating feature dimensionality reduction for better IoT device categorization.


• A comparison of machine learning approaches for IoT device classification and how they were assessed in the literature.
• An examination and assessment of the various classification granularity levels.
• A summary of contributions in the form of taxonomies and statistics to highlight trends. The statistics were calculated based on a thorough review of each research article with respect to taxonomies (see Tables 2 to 4 in the Appendix).1

1 A dynamic version of the taxonomy and websites is available at: https://www-public.telecom-sudparis.eu/~blanc_gr/survey/

This document follows the classification process from bottom to top, except for the applications of IoT device classification, which will be shown first for the sake of clarity.

III. THE DIFFERENT APPLICATIONS OF IoT DEVICE CLASSIFICATION

A. NETWORK AND SECURITY MANAGEMENT
Due to the variety of IoT devices, it is difficult to control them with a single policy. One solution is to describe network and security management rules by device class and assign each device to a class with automated policies. Miettinen et al. [14] describe an interesting use-case where newly introduced devices are categorized and the classification result is used to determine whether the device is vulnerable. The decision is based on a vulnerability assessment of the device type carried out by consulting a vulnerability dataset. Consequently, the device is assigned one of the following isolation levels: i) strict, where the device can only interact with untrusted devices, ii) restricted, where it can communicate with untrusted devices but has limited internet access, and iii) trusted, where the device is allowed to communicate with other trusted devices and has unrestricted internet access. This mitigation approach allows vulnerable devices to cohabit with other devices without compromising their security.

Note that detecting vulnerable devices in a smart home is crucial since most IoT devices suffer from poor security design and can be easily compromised by an attacker to gain unauthorized network access or launch massive attacks. For instance, in 2016, the Mirai malware infected millions of IoT devices to launch DDoS (distributed denial-of-service) attacks [4]. The BYOD (Bring Your Own Device) trend, which allows employees to bring their own personal IoT devices at work and connect them to the corporate network, extends the attack surface of companies as compromised personal devices may inject malware into the corporate network and cross-contaminate other devices. Similarly, remote working has exposed professional devices to a less trustworthy environment where they cohabit with possibly more vulnerable smart home devices.

As described above [14], blacklisting approaches detect vulnerable devices that should be disconnected from the network (blocked). IoT device classification can also be used to establish an automatic whitelisting system to ensure only authorized IoT devices can connect to the network, as proposed by Meidan et al. [15]. If the determined IoT device type is not in the white list, the organization's SIEM system is alerted to take appropriate action (e.g., disconnect the device from the network).

Note that whitelisting is more scalable than blacklisting, which grows with untrusted devices. Moreover, data from authorized (whitelisted) devices is easier to obtain. Nevertheless, using a whitelist would be less robust against adversary attacks, as an attacker may simulate authorized device behavior to avoid the intrusion detection system.

B. MALICIOUS USAGE
IoT device classification can also be exploited by attackers to leak sensitive information about the IoT device and its users. For instance, Hafeez et al. [16] demonstrate that an adversary, with access to upstream traffic from a smart home network, can identify the device types and user interactions with IoT devices, with significant confidence. Dong et al. [17] study the case where an adversary attempts to infer the type of IoT devices behind a smart home network even when the traffic of all devices is merged behind the gateway using VPN (Virtual Private Network) and NAT (Network Address Translator) techniques.

Sensitive information revealing device types and user interactions can be used to infer user activities or home presence [16]: e.g., if the smart lights are in the off state for a long period of time, it means that there is no one at home, opening an opportunity for a break-in. Such passive attacks are hard to identify and mitigate. In this context, Hafeez et al. [16] propose a traffic morphing technique helping to hide the traffic of IoT devices, lowering the occurrence of attacks.

IV. APPROACHES TO DATA ACQUISITION
This section describes the data acquisition methodologies found in the literature. In order to organize the findings, we present them along four axes: first, we examine the devices considered for data collection, second we analyze the IoT traffic types that can be captured, third, we discuss data collection scenarios, and finally, we provide a comparative study of public datasets. A taxonomy in Fig. 4 illustrates the main outcomes of this section.

A. THE CLASSIFIED DEVICES
The input to the IoT device classification process is a list of devices to be classified. They can be both IoT and non-IoT devices, also referred to as single-purpose and multi-purpose devices, since IoT devices are typically intended for a single specific task. An up-to-date list of the most common smart home IoT devices can be found on the website [18]. Examples of non-IoT devices include laptops, cell phones, and Android tablets.

In the literature, some approaches classify only IoT devices, and others classify both IoT and non-IoT devices.

FIGURE 4. Taxonomy of data-acquisition approaches: the approaches are classified according to i) the devices
under consideration: only IoT, or both IoT and non-IoT devices, ii) the operation mode of the devices, iii) the
probe location, and iv) whether a public dataset is utilized or the traffic is collected by the authors.
Percentages show how often each approach is used in the reviewed papers. This highlights the trends
discussed in Sec. IV.

Fig. 4 shows that the majority of reviewed papers (63%) consider the classification of only IoT devices. However, we think that this is not the most realistic scenario since the traffic must be collected from all the devices connected to the smart home network to ensure its security and automatic management. Since IoT and non-IoT devices cohabit in smart homes, they must be considered during the traffic collection process. However, note that classifying both IoT and non-IoT devices is more challenging since IoT traffic is small and sparse compared to non-IoT data. As shown by Dong et al. [17], some IoT devices might be easily confused with non-IoT devices. For example, home assistants have diverse and varied functions (compared to simple single-use devices like light bulbs), making their behavior very similar to non-IoT devices. To address this challenge, we suggest training ML algorithms with mixed (IoT and non-IoT) traffic to boost their generalization capabilities.

B. THE DIFFERENT TYPES OF IoT TRAFFIC
IoT devices generate three types of traffic based on their operation mode, namely: i) setup traffic (also called initial traffic) is generated by an IoT device during installation, also called registration or enrollment, ii) interaction traffic (also called active traffic) is generated when a device interacts with the user or environment (e.g., a home assistant responding to a voice request from the user), and finally, iii) idle traffic represents device activity in the absence of external stimulation. It includes routine communications between the device and the back-end server, as well as keep-alive or heartbeat signals.

1) THE SETUP TRAFFIC
When a new device with a new MAC address connects to the network, it follows a device/provider-specific procedure to connect [14]. In most situations, this operation is assisted by a smartphone, laptop, or PC application. The installation procedure typically involves: i) activating the device, ii) connecting with the provider's app, iii) transmitting WiFi credentials, and iv) resetting and connecting to the user's network using the credentials provided.

To collect the installation traffic, existing approaches record the first packets {p1, p2, p3, ..., pn} exchanged between the device and the gateway. The decrease in packets exchanged marks the end of the installation phase. To generate enough data, the installation process should be performed multiple times for each device, with a hard reset between each save [14].

2) THE INTERACTION AND IDLE TRAFFIC
IoT devices generate mostly interaction and idle traffic. Interaction traffic can be triggered either i) by a direct user request, like adjusting light bulb color and intensity, or ii) by a change in the environment observed by the IoT device, such as a sensor that detects motion or a light bulb that detects an inhabitant [19]. Idle traffic mainly includes device-Cloud service exchanges during standby, such as heartbeat messages, regular status updates or notifications [16]. IoT devices generate more traffic when active compared to background mode [20]. This is reasonable since user and environmental interaction stimulates diverse reactions [20].

3) WHICH TRAFFIC TYPE IS MOST SUITED FOR IoT DEVICE CLASSIFICATION?
Statistics detailed in Fig. 4 show that 86% of reviewed papers use idle and (or) interaction traffic. Only 19% of reviewed papers rely on setup traffic for device classification. The advantage of setup traffic over idle and interaction traffic is its stability, as the IoT device's behavior during configuration is the same regardless of the environment. Moreover, relying on setup traffic allows for rapid recognition once the device is connected to the network. However, as the initialization state may not appear several times during the IoT device life cycle, setup traffic is scarce, sparse, and difficult to collect in real-world network monitoring. On the other hand, idle and interaction traffic is more abundant and easier to collect, making it better suited for machine learning algorithms, especially deep learning.
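As a purely illustrative sketch of the setup-traffic collection described in Sec. IV-B1, the snippet below reads a capture and keeps, for each MAC address seen, the first n packets it is involved in. The pcap file name, the value of n, and the absence of any filtering of gateway or broadcast addresses are simplifying assumptions, not choices made by the surveyed works.

```python
# Sketch: isolate the first n packets exchanged by each device (grouped by
# MAC address) as its setup-traffic fingerprint. File name and n are
# placeholders; real setups filter out gateway/broadcast MACs and repeat
# the installation several times per device.
from collections import defaultdict
from scapy.all import rdpcap, Ether

N_PACKETS = 12                           # fingerprint length, approach-specific
packets = rdpcap("setup_capture.pcap")

setup_traffic = defaultdict(list)        # device MAC -> its first N packets
for pkt in packets:
    if Ether not in pkt:
        continue
    for mac in (pkt[Ether].src, pkt[Ether].dst):
        if len(setup_traffic[mac]) < N_PACKETS:
            setup_traffic[mac].append(pkt)

for mac, pkts in setup_traffic.items():
    print(mac, [len(p) for p in pkts])   # e.g., packet sizes of the first packets
```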


FIGURE 5. A typical network configuration for capturing IoT traffic includes i) IoT and
non-IoT devices connected to the gateway via wireless or wired connections and ii)
packet capture and storage modules for collecting traffic. There are two capture points
discussed in the literature: i) at the gateway and ii) after the gateway.

C. DIFFERENT LOCATIONS FOR TRAFFIC PROBE

1) A TYPICAL NETWORK SETUP FOR CAPTURING IoT TRAFFIC
Fig. 5 shows a typical smart home network architecture. It includes IoT and non-IoT devices connected to an internet gateway using wireless or wired connections. At least two tools are required for traffic collection:
• a Packet capture module to capture the traffic as pcap records comprising entire packets from MAC layer to application layer. Examples include tcpdump [21] or Wireshark [22], and
• a storage module to store the traffic data on a distant server, or within the network.
To label the ground truth, the MAC address in the packet header is used to reveal the identity of the device and label the data accordingly.

2) TRAFFIC CAPTURE SCENARIOS
The literature considers two scenarios for collecting IoT traffic depending on the location of the probe (capture point): i) at the gateway, i.e. from inside the home network, or ii) after the gateway, i.e. from outside the smart home.

At the gateway, the captured traffic is the one flowing between the devices connected to the home network and the gateway and can be separated by IP or MAC address. In contrast, the traffic captured after the gateway contains traffic from all connected devices aggregated using a single public IP address due to the frequent use of NAT at gateways.

3) WHICH PROBE LOCATION IS MORE PRACTICAL?
Approaches that gather traffic at the gateway assume the ability to intercept and sniff the traffic flowing inside the smart home. However, this clean and controlled experimental setup does not reflect most real-world use cases where traffic is only seen from the outside. A typical application is when Internet Service Providers (ISPs) classify IoT traffic to identify devices inside a smart home and then allocate resources and configure appropriate security rules according to their population and vulnerabilities. But ISPs can not intercept traffic inside the home network. It is then more realistic to collect traffic from outside the smart home after the gateway. However, classifying devices based on such traffic is more challenging because the original packet headers, such as source IP and port, are hidden. Moreover, the widely used VPN-enabled gateways encapsulate the original packets in an encrypted tunnel, hiding the traffic characteristics. This makes device classification even more challenging, and new solutions should be investigated.

Although realistic, this scenario is understudied. This scenario is used in only four papers: [17], [23], [24], [25]. It is worth noting that Meidan et al. [25] and Dong et al. [17] made their datasets public so that more research could be done on this topic.

D. PUBLIC DATASETS COMPARISON
57% of reviewed publications use public datasets, either completely or to complement or enrich their data. Most of the datasets we mention in this survey were created for IoT device classification. However, we include other datasets developed for other topics that contain IoT traffic and can be used for IoT device classification.

Table 1 summarizes the datasets listed below. To compare them, we specify for each: i) the devices used to generate the traffic (IoT only, or both IoT and non-IoT), ii) the operation mode of the devices (i.e. setup, interaction, idle), iii) the probe location (i.e., at or after the gateway), iv) the duration of the collection, v) the amount of traffic collected, and we provide vi) a direct access link to the dataset.


TABLE 1. Publicly available datasets for IoT device classification.

1) IoTSentinel DATASET [14]
This dataset was collected to identify IoT devices based on their setup traffic. To generate enough traffic, the typical device configuration process was repeated 20 times for each device. During the setup process, all network traffic between IoT devices and the gateway was recorded. A representative set of 31 IoT smart home devices available on the European market in the first quarter of 2016 was used. There are 27 different device types (4 types are represented by 2 devices each). Most of the devices were connected via WiFi or Ethernet. Some of them utilised ZigBee or Z-Wave.

2) UNSW DATASET [19], [27]
This dataset was published by UNSW researchers and covers various IoT research areas. In addition to traffic for IoT device classification, the dataset includes IoT attack traces, IoT MUD profiles, and IoT IPFIX records that can be useful for other IoT-related research topics (the relevance of MUD profiles to device classification is discussed in Sec. VIII). In this paper, we focus on the traffic for IoT device classification. It was first published in [19] and has since evolved. The first version has been extensively used in the literature. The same authors published an updated and more elaborate version in [27]. Recent articles now use the modified version.

This study focuses on the IoT traffic traces reported in [27]. They were collected over 26 weeks, from October 1st, 2016 to April 13th, 2017, but only two weeks' worth of data is available for download.

3) IoTFinder [31] AND YourThings DATASETS [29]
The IoTFinder dataset was created to explore IoT device identification using DNS fingerprints. Thus, the dataset contains pcap files of DNS responses for 53 IoT devices from different vendors. The data was collected from August 1st, 2019 to September 30th, 2019.

The YourThings dataset was created by the same authors to analyze security properties for home-based IoT devices.

4) SHIoT DATASET [32]
This dataset was created for behavior-based IoT device classification. The test bed was implemented at the Faculty of Transport and Traffic Sciences in Zagreb. The dataset contains 144 pcap files with 24-hour traffic each.

5) DADABox DATASET [34]
This dataset was created to compare some approaches to classifying IoT devices. The testbed was developed at the University of Cambridge, where researchers sporadically interact with IoT devices. The dataset contains 41 different IoT devices, and the data was collected over a period of 27 weeks.

6) HomeMole DATASET [17]
This dataset was created to identify IoT devices behind VPN and NAT-enabled gateways in smart homes. Three collection scenarios were developed: i) a single device environment in which only one device is considered, ii) a noisy environment in which various IoT and non-IoT devices are investigated. Multiple devices may be operating simultaneously at any given time, resulting in traffic aggregation, and iii) a VPN environment where VPN is enabled. In this case, traffic is collected before and after the VPN.

7) IoT-deNAT [25]
The dataset was collected to detect vulnerable IoT devices behind a home NAT. The traffic is captured considering only NetFlow's [42] statistical aggregations (i.e., Netflow is a flow-level aggregation of information, usually a 5-tuple header and some counters) instead of the raw data to reduce processing and storage.

8) THE MON(IOT)R DATASET [38]
This data set examines IoT device information exposure. It contains data from 81 IoT devices deployed in two labs (one at Northeastern University in the United States and the second at Imperial College London in the United Kingdom) over 30 days between September 2018 and February 2019. Different types of traffic are provided: i) power traffic (487 samples), which is traffic generated by IoT devices when they are turned on, ii) interaction traffic (32,030 samples), iii) idle traffic covering an average of 8 hours per night for one week for each lab, and iv) unlabeled traffic, which is generated when 36 participants use the IoT devices in a studio at their leisure during the data collection period. Data labeling includes the name of the device, where it was used (the US or the UK), when and for how long it was used, and whether or not a VPN was used.


FIGURE 6. Taxonomy of feature extraction approaches. The approaches are classified according to i) the use of header or payload
packet level features, ii) the stream definition, iii) the type of used stream level features (volume, protocol, time, or periodicity), iv)
the use of automatic feature extraction (DL based), and v) the use of dimensionality reduction. Percentages show how often each
approach is used in the reviewed papers. This highlights the trends discussed in Sec. V.

9) IoT-23 DATASET [40]
IoT-23 is a dataset containing benign and malicious IoT network activity. The traffic was captured at the Czech Technical University. The dataset contains 20 pcap files from infected IoT devices, labeled by the malware that infected them, and 3 pcap files containing benign network traffic generated by 3 IoT devices: a smart lamp, a voice assistant, and a smart door lock. The packet captures are labeled with the device that generated the traffic. As done in [43], legitimate traffic can be used for IoT device classification.

10) HOW VALUABLE ARE PUBLIC DATASETS?
Public datasets enable comparing different solutions. Unfortunately, the available public datasets for IoT device classification are scarce (only 5 of the surveyed papers shared their datasets publicly) and not diversified: most provide idle and interaction traffic, and capture at the gateway, when this is not the most realistic scenario. Since public datasets are not diverse, researchers must collect their own data when examining new scenarios. For instance, Yu et al. [44] identify IoT devices based on passively receiving broadcast and multicast packets, and had to collect their own data from different WiFi networks. In conclusion, additional datasets exploring new classification scenarios should be released, and more diversified IoT traffic needs to be collected, in order to boost research on IoT device classification. As shown in Fig. 4, the most used datasets are UNSW (30%), IoTSentinel (15%), and YourThings (6%).

V. FEATURE EXTRACTION METHODOLOGIES
This section describes feature extraction methodologies. First, we discuss packet-level feature extraction: we examine the most commonly used header and payload features and compare them. Second, we analyze stream-level feature extraction. Third, we explore deep learning-based automatic feature extraction. Fourth, we provide a list of open-source feature extraction tools, and finally, we highlight the feature dimensionality reduction approaches. Fig. 6 gives a taxonomy summarizing the approaches and trends.

Feature extraction is defined in [45] as "the process of defining a set of features (...) which will most efficiently or meaningfully represent the information that is important for analysis and classification." In our case, the feature extraction step consists of describing the network traffic in the most appropriate way to retrieve the maximum amount of information about the device.

In the majority of examined articles, significant work has been dedicated to the extraction of features. Existing approaches are diverse and heterogeneous. The objective of this section is to summarize them in a logical and consistent manner.

Network traffic is the volume of data flowing over a network. It is divided into packets of data and delivered over a network before being reassembled by the receiving computer or device. Packets can be used to describe the network either individually or as a stream of packets, also called a flow (see Fig. 7). These two approaches are known as packet-level and flow-level feature extraction methods, respectively. The following sections present approaches in each category.

A. APPROACHES TO PACKET-LEVEL FEATURE EXTRACTION
These approaches describe each packet individually. A packet consists of a header and a payload. The header contains protocol information for a given layer, whereas the payload contains the data.

1) THE MOST IMPORTANT PACKET HEADER FEATURES
Extracting features from a packet header is straightforward and has no overhead. One just needs to parse the packet's header fields.
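As a simple illustration of parsing header fields, the sketch below reads a capture with Scapy and derives, for each packet, a few header features of the kind discussed in this subsection (packet length, time to live, protocol, TCP window size). The file name is a placeholder and the selection of fields is only an example, not the feature set of any particular surveyed paper.

```python
# Sketch: packet-level header features (length, TTL, protocol, TCP window).
# "capture.pcap" is a placeholder for a capture produced by tcpdump/Wireshark.
from scapy.all import rdpcap, IP, TCP

rows = []
for pkt in rdpcap("capture.pcap"):
    if IP not in pkt:
        continue
    rows.append({
        "length": len(pkt),                                   # total packet length
        "ttl": pkt[IP].ttl,                                   # time to live
        "proto": pkt[IP].proto,                               # transport protocol number
        "tcp_window": pkt[TCP].window if TCP in pkt else None # TCP window size
    })

print(rows[:5])   # one feature vector per packet
```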


FIGURE 7. The main methods for feature extraction: packet-level and stream-level. For
stream-level approaches, three definitions are proposed for the stream.

Depending on the layer and protocol, several fields can be present in the packet header. For example, the IPv4 header contains essential routing and delivery information and consists of 13 fields, including version, header length, service type, total length, time to live, and protocol, etc. Relying on source and destination IP addresses and ports for classification is not recommended due to potential spoofing issues, as mentioned in Sec. I.

The most important header features include i) the packet length, which is widely used for IoT device classification [46], and ii) the TCP window size, which is very useful for distinguishing between IoT and non-IoT devices as it depends on the memory and processing speed of the device [47]. Small constrained devices, like sensors, have small window sizes, while more powerful devices like video cameras and home assistants have variable and larger window sizes [47].

2) THE MOST IMPORTANT PAYLOAD FEATURES
Typically, payloads consist of the header and payload of the upper layer, which in our case indicates the application payload. It may consist of textual features indicating the device's name, location, manufacturer, type, operating system, services, etc.

The length of the payload transported inside a TCP message can indicate the length of the message sent by a given device, and this is device specific [47]. The entropy of the payload has been used as a discriminative feature [47], [48]. In [49], the distribution of payload bytes per flow is used for IoT device classification. Encrypted packets may make feature extraction from the payload impossible.

Note that processing each packet separately for feature extraction is time-consuming and computationally exhausting, requiring large storage and processing resources. The Google Chromecast generates 2,459,538 packets per day, compared to 11,877 traffic flows [32]. Thus, extracting features from packets is more expensive than from flows. Unsurprisingly, most research concentrates on flow-level features (81% of reviewed papers).

B. STREAM-LEVEL FEATURES EXTRACTION METHODS
In this section, we discuss the different stream definitions, we investigate and categorize the most important features, and we examine the approaches to calculating them.

1) STREAM DEFINITION
Features can be extracted from a set of packets known as a "stream." We have identified three main approaches to defining a stream: i) a stream is a set of N consecutive packets, ordered by arrival time, ii) a stream is a set of packets exchanged within a time window Δ, iii) a stream is a connection between a source and a destination where packets are sent in both directions in a certain order. More information on the approaches using each definition is presented below.

a: A STREAM AS A FINITE SEQUENCE OF N PACKETS
In this category, a fixed number N of consecutive packets generated and received from a single IoT device is used to construct a "signature," also called a "fingerprint" of the IoT device. 33% of surveyed papers use this definition, in particular approaches leveraging setup traffic (cf. Sec. IV-B1) for device classification, because they use the first packets sent by the devices when connecting to the network. For example, in [14] and [50], the authors use the first 12 packets to identify an IoT device, and in [51], 30 packets are used. The authors of [52] extract features from a sequence of 20-21 packets.


Shahid et al. [9] consider N consecutive packets where N varies between 2 and 10. 24% of surveyed papers use this definition.

Note that determining the optimal value of the flow size, N, is challenging. Small flows allow for quick classification but may not be enough to characterize the device, whereas large flows can be time and memory-consuming to analyze. Moreover, the appropriate value of N may vary from device to device since IoT devices generate different quantities of data. A small number of packets may be enough to identify certain device types, while a greater number may be required for others. This is problematic because machine learning algorithms require a fixed size for the input. The authors of [14] added padding for devices that emit fewer packets than the required size. Furthermore, capturing the same number of packets for all devices may take a variable amount of time as IoT objects do not generate traffic at the same rate. For example, it is possible to capture packets generated by a camera in seconds. However, it takes longer to capture the same number of packets generated by a motion sensor [46]. This makes the data collection process complicated and time-consuming.

b: A STREAM AS A SET OF PACKETS EXCHANGED WITHIN A TIME WINDOW Δ
This consists of subdividing the captured traffic into distinct time-windows of an appropriate duration Δ. For example, Fan et al. [53] extract the features every 30 minutes. Pinheiro et al. [46] use a window of one second to enable real-time device classification. Hafeez et al. [16] use a 10-second time window. Le et al. [54] retrieve DNS names requested by a device over a time period ranging from 10 minutes to 24 hours, and found that performance decreases with a decreasing Δ.

Note that as for the previous category, the choice of the time window size is important and challenging. Long time-windows give richer information about the device but risk increasing classification delay and consuming more memory to store traffic attributes [46]. Moreover, it may result in very similar samples with little feature variation. This could also lead to fewer data samples for learning and testing, and thus be unsuitable for deep learning-based classification approaches. Few and redundant samples may also introduce a bias and overfitting. On the other hand, a small time-window may allow real-time classification but may not contain enough information to reflect the characteristics of the device's behavior. Bai et al. [55] showed that a small segmentation window interval degrades the classification results compared to a larger segmentation. In addition, setting the same interval time for all devices can be inappropriate as the devices generate different quantities of traffic. For example, a motion sensor generates close to 140 packets per minute at most, and a camera generates up to 1900 packets per minute on average [55].

c: A STREAM AS A SET OF PACKETS BELONGING TO A CONNECTION
Due to the abovementioned issues, the majority of reviewed papers (50%, see Fig. 6) use this definition of stream. This is based on the RFC 2722 [56] traffic flow definition, stating that a flow is "an artificial logical equivalent to a call or connection." Thus, the flow is the ordered sequence of all packets sent and/or received from a particular source to a particular unicast, anycast, or multicast destination using specific ports and transport protocols.

More concretely, a flow can be defined as a set of packets having in common at least two of the following attributes: i) source IP address, ii) source port number, iii) destination IP address, iv) destination port number, v) protocol, and vi) service type.

Depending on the criteria utilized to define the flow, there are several definition variants. For Marchal et al. [20], the flow is a sequence of network packets sent by a given IoT device using a specified communication protocol. A flow is described by Sun et al. [49] as a 5-tuple of source and destination IP addresses, source and destination port numbers, and protocol. For Meidan et al. [25], the service type is also specified (6-tuplet).

Note that a collection of flows can also be used to describe the traffic. The authors of [49] combine features from several flows to provide a high-level characterization of device activities. Meidan et al. [57] demonstrated that using a set of consecutive flows gives better classification results since it contains more information about the traffic. The different stream definitions are illustrated in the left part of Fig. 7.

2) IMPORTANT STREAM-LEVEL FEATURES
In this section, we review the various stream-level features that are widely used for IoT device classification. To organize them, we divide them into four categories: i) volume features measure the volumetric properties of the stream, ii) protocol characteristics describe the protocols on the stream, iii) temporal characteristics measure the temporal aspects of the stream, and iv) periodicity features reflect the stream's periodicity.

a: VOLUME FEATURES
Examples include packet length statistics, the number of packets or bytes in the entire flow or in a specific direction (incoming or outgoing traffic), the flow rate, etc. For instance, Pinheiro et al. [46] identify devices based on statistics of the packet length and number of bytes generated by each device. Sivanathan et al. [58] use average packet size and average rate per flow as two principal attributes. Volume features are very important and widely used (in 60% of reviewed papers).
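A minimal sketch of how such volume features could be computed over connection-defined flows is given below. The 5-tuple key and the chosen statistics follow the definitions above, while the pcap file name is a placeholder; this is an illustration, not the feature extractor of any surveyed paper.

```python
# Sketch: group packets into 5-tuple flows and compute simple volume features
# (packet count, byte count, mean packet length, throughput).
from collections import defaultdict
from statistics import mean
from scapy.all import rdpcap, IP, TCP, UDP

flows = defaultdict(list)                        # 5-tuple -> list of (timestamp, length)
for pkt in rdpcap("capture.pcap"):               # placeholder file name
    if IP not in pkt or (TCP not in pkt and UDP not in pkt):
        continue
    l4 = pkt[TCP] if TCP in pkt else pkt[UDP]
    key = (pkt[IP].src, pkt[IP].dst, l4.sport, l4.dport, pkt[IP].proto)
    flows[key].append((float(pkt.time), len(pkt)))

for key, pkts in flows.items():
    sizes = [size for _, size in pkts]
    duration = pkts[-1][0] - pkts[0][0]
    print(key, {
        "n_packets": len(pkts),
        "n_bytes": sum(sizes),
        "mean_len": mean(sizes),
        "rate_Bps": sum(sizes) / duration if duration > 0 else 0.0,
    })
```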


b: PROTOCOL FEATURES
Traffic including all protocols and layers, or selected protocols, can be used to extract features. In addition to the widely studied layer-2 to layer-4 protocols, the following application layer protocols have been examined:

• The Domain Name System (DNS) is an essential Internet service, and is therefore important to IoT devices communicating with remote Cloud services. The DNS features differentiate IoT from non-IoT devices. IoT devices connect to limited endpoints, mainly their provider servers. This behavior can be captured by the number of DNS unique queries, as IoT devices have fewer unique DNS queries than non-IoT devices [59], [60]. Moreover, devices can be identified by the domain names they communicate with [27]. The most frequently used DNS characteristics are: i) the number of unique DNS queries, ii) the number of unique domain names, iii) the most frequently queried domain names, iv) the number of DNS packets, and v) the number of DNS errors. The papers [27], [53], [54], [59], [60], [61], and [62] exploited these features.
• TLS features: TLS/SSL is used by many IoT devices to secure internet communication with servers. The TLS protocol consists of two layers: handshake and record protocols. The handshake layer is the most interesting as it comprises "text-in-the-clear" messages exchanged between devices and servers to create a secure channel and negotiate ciphers and encryption keys. Fan et al. [53] use the number of TLS handshakes as a feature. Sun et al. [49] analyze the unencrypted data of the TLS handshake and exploit the plaintext data in the ClientHello, ServerHello, and Certificate messages to derive the following features: the list of proposed ciphersuites, the list of announced extensions, and the length of the public key. The authors noted less fluctuation in the distribution of ciphersuites and TLS extensions in IoT devices, compared to non-IoT devices, because they advertise a limited and fixed number of ciphersuites. Thangavelu et al. [61] used the following TLS features: the minimum, maximum, and mean of the TLS packet length, the flow duration, and the number of TCP keep-alive probes used in the TLS session. Valdez et al. [63] derive features from TLS session initialization messages (ClientHello and ServerHello). Features include negotiated ciphers, proposed cipher suites, server name, and destination end-point.

c: TIME-RELATED FEATURES
They measure the temporal aspects of the flow. Examples include the inter-packet arrival time (IAT), i.e. the time interval between two consecutive packets received, the time a flow was active before becoming inactive, the time the last packet was switched [25] and the flow duration, etc. For instance, in [27] and [59], the authors calculate the sleep time of a device, the average time interval between two consecutive DNS requests, and the NTP interval. Thangavelu et al. [61] consider the flow activity duration. Sun et al. [49] calculate idle time as it reflects device activity frequency.

It is worth noting that the IAT is one of the most useful time-related features as it varies by device depending on the hardware and software configurations [64]. It is, therefore, widely used in the literature ([9], [16], [49], [51], [53], [65], [66], [67], [68]). In particular, we note that the classification of ZigBee, Z-Wave, and Bluetooth IoT devices is often exclusively based on IAT [65], [66].

d: PERIODICITY FEATURES
IoT devices generate background communications that always present relatively constant and periodic patterns. Some researchers [20], [69] extract features from periodic flows. To do this, they first discretize the flow into a binary time series signal representing the existence or not of packets in the traffic each second. Then, they use the Discrete Fourier Transformation to identify the different distinct periods of the signal. Once identified, statistical features are used to describe these periods in detail. Examples include: the number of periods, the maximum and minimum period values, the averages of the occurrence of periods at the minimum period value, and the accuracy and stability of the inferred periods [20], etc. Note that approaches for extracting periodicity features often use the time-window-based stream definition (Fig. 7). Only 2 papers, namely [20], [69], use periodicity features.
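The periodicity extraction described above can be sketched as follows: the flow is discretized into a per-second binary presence signal and the Discrete Fourier Transform reveals its dominant period. The timestamps here are synthetic (a heartbeat-like message every 30 seconds) and the peak-picking is deliberately simplistic; it is an illustration of the idea, not the procedure of [20] or [69].

```python
# Sketch: discretize a flow into a per-second binary signal and use the DFT
# to find its dominant period (synthetic timestamps, simplified peak picking).
import numpy as np

timestamps = np.arange(0, 600, 30.0)            # one packet burst every 30 s, over 10 min
signal = np.zeros(600)                          # one bin per second
signal[timestamps.astype(int)] = 1.0            # 1 = at least one packet in that second

spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(signal.size, d=1.0)     # cycles per second
dominant = freqs[1:][np.argmax(spectrum[1:])]   # skip the zero-frequency bin
print(f"dominant period = {1.0 / dominant:.1f} s")
```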


3) HOW ARE STREAM-LEVEL FEATURES CALCULATED?
We identified two approaches to calculate stream-level features: concatenation and statistics.

a: CONCATENATION
Stream-level features can be calculated by concatenating individual packet features. The authors of [51] define a n × 7 feature matrix with 7 packet header features per packet (n packets). Similarly, Wan et al. [68] describe a stream of p packets defining a device signature using p vector attributes. In general, only approaches defining the stream as a set of N packets (see Sec. V-B1a) use this method because concatenating a small number of packets is unlikely to create large signatures.

b: STATISTICS
The second way is to perform statistical calculations on packet-level features. Depending on whether the measured feature is numerical (e.g. TTL) or categorical (e.g. protocol type), different statistics can be generated, as described below.

For numerical features, researchers often calculate:
• The traditional minimum, maximum, mean, sum, standard deviation, variance, which are widely used in the literature.
• The entropy, which measures the degree of disorder of features. It is a way of describing the nature of the data without focusing on the data itself. For example, the payload entropy indicates the information content of a packet. Packets including text data have less payload entropy than packets carrying audio data [47]. The authors of [48] and [47] categorized IoT devices by payload entropy. Fan et al. [53] calculate the entropy of top DNS requests and packet lengths.
• The skewness [70] and kurtosis [71], which measure the asymmetry and the "tailedness" of the probability distribution, respectively. In [55], the authors use packet length skewness to explore packets' different lengths in a flow.
• The augmented Dickey-Fuller (ADF) test [72], which determines whether or not a given time series is stationary. It was used in [53] to capture how some devices send large packets in a short period of time, causing packet length to shift substantially.
• The spectral density, which characterizes a stationary population time series in the frequency domain. The authors of [73] use spectral analysis of packet length to record device communication patterns, differentiate IoT and non-IoT traffic, and determine the device class generating the packet flows.
• Note that when the stream is defined by a time window, finer granularity statistics can be generated by computing the first quartile, second quartile, and third quartile of numerical packet features, as [53] does for the "packet length."

For categorical features, researchers often:
• List or count feature values. Huang et al. [51] use a binary vector coded according to whether specific protocols exist in the traffic flow. In [49], [69], and [55], the authors count the types of protocols involved in the device's communication traffic.
• Determine the dominant values or their proportion. For example, Msadek et al. [67] identify the set of dominant protocols (the most used). Zhang et al. [69] count the proportion of TCP/UDP/ARP in the device communication flow.

C. WHAT ABOUT AUTOMATIC FEATURE EXTRACTION?
While traditional ML algorithms require costly handcrafted features, deep learning approaches may automatically extract and learn the optimum features for the classification, directly from raw data. As DL requires standardized input data of the same type and size for all samples, researchers first convert pcaps into a suitable model input. To do so, Greis et al. [74] consider the packet captures (in pcap format) collected during the setup phase and transform the first 784 bytes of traffic into a 28 × 28 grey-scale image. Each pixel represents a grey value between 0 (black) and 255 (white). When a setup phase [...] flows is 10. A flow is described using 2,500 bytes of data (first 10 packets × 250 bytes). The first 250 bytes of each packet are concatenated. Streams with fewer than 10 packets employ padding.

Despite the benefits of these approaches, which simplify and automate feature extraction, transforming data into another format (image, vector, etc.) can lead to semantic information loss. Moreover, this strategy does not take into account expert knowledge, which can help find the most important features. A minority of research papers (12% [74], [75], [76]) explored this solution.

D. OPEN-SOURCE FEATURE EXTRACTION TOOLS
This section describes the existing feature extraction tools found in the literature. The input of a feature extraction tool is network traffic in pcap format collected by a packet capture tool (e.g. tcpdump). The output is text-based format files (often csv) containing feature vectors. A feature vector is calculated for each observation.

CICFlowmeter [77] is an open-source feature extractor that produces more than 80 volume- and time-related features per TCP flow. The authors use two methods to measure the attributes. In the first approach, they measure time-related features over the full TCP flow, such as the time between packets or the time the flow remains active. In the second approach, they fix the time (e.g., every 1 second) and measure other volume-related attributes (e.g., bytes per second or packets per second).

Bekerman et al. [78] present a feature extraction tool, which is implemented on top of Wireshark [22] and extracts 972 behavioral features across different protocols and network layers. The features describe different observations of various granularities, namely i) a conversation window, ii) a group of sessions, iii) a session (e.g., a TCP session), and iv) a transaction, i.e., an interaction (request-response) between a client and a server.

Joy [79] extracts features from live network flows with a focus on application layers. The main features are: IP packet arrival lengths and times, the sequence of TLS record arrival lengths and times, other unencrypted TLS data, such as the list of proposed and selected ciphersuites, DNS names, addresses, TTLs and HTTP header elements, etc.

E. FEATURES DIMENSIONALITY REDUCTION FOR BETTER CLASSIFICATION
Feature dimensionality reduction improves classification accuracy and reduces the computational cost. This is a pre-processing phase that identifies relevant features and removes irrelevant or redundant ones. Feature dimensionality reduction is not widely used in IoT device classification.
has less than 784 bytes, the remaining pixel values are set to 0 Only 30% of reviewed papers apply this step. This is because
(black). Similarly, Kotak et al. [75] use TCP payload to create most publications rely on expert knowledge to derive an
greyscale images of the device’s communication pattern. accurate and small set of features, making feature reduction
Yin et al. [76] rely on traffic vectorization. They use unnecessary. On the contrary, articles using feature extraction
the first 10 packets to characterize a flow. This number was tools (see Sec V-D) generate a large number of features and
chosen because the average number of packets in most IoT minimize them using feature reduction.

VOLUME 10, 2022 97129


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

FIGURE 8. Taxonomy of ML based classification approaches. Percentages show how often each
approach is used in the reviewed papers (the percentages do not always sum up to 100 because
some papers use algorithms from multiple categories).

Ghojogh et al. [80] review feature dimensionality reduc- b: APPROACHES APPLYING WRAPPER METHODS
tion approaches. They divide them into two groups: 1) feature Such approaches select the features based on the classifier’s
extraction approaches, where features are projected into a performance. Thus, the selected set can vary from one classi-
lower dimensional subset to extract a new set of features, fier to another. For instance, in [84], the authors use a genetic
and 2) feature selection approaches, where the best subset algorithm based feature selection method. The genetic algo-
of original features is selected. Note that the term ‘‘feature rithm determines the smallest set of packet header features
extraction’’ is also improperly used in the literature to rep- in all network layers that contributes significantly to the
resent the process of describing observations by a vector of classification for a given classifier.
features (cf. Sec V).
VI. CLASSIFICATION
1) APPROACHES USING FEATURE EXTRACTION BASED The aim of the classification step is to predict foreach traffic
DIMENSIONALITY REDUCTION input, represented by a vector of features X = x1 , . . . xf ,
Thangavelu et al. [61] use a common feature extraction the class c of the device that has generated it. Different
method called ‘‘Principal Component Analysis’’ (PCA). classification approaches have been explored in the literature.
Fan et al. [53] use Convolution Neural Network- (CNN) We will classify them according to i) the number of classes
based dimensionality reduction. Similarly, Bao et al. [81] use (multi-class classifier or one-class classifier), ii) supervised
auto encoders for dimensionality reduction. Auto encoders or unsupervised approaches, and iii) shallow or deep learning
learn a mapping from high-dimensional observations to a algorithms. Fig. 8 illustrates the classification results.
lower-dimensional representation space such that the original
A. MULTI-CLASS VS ONE-CLASS CLASSIFIER
observation can be reconstructed from the lower-dimensional
representation [82]. Auto Encoders are widely used for fea- 1) METHODS USING MULTI-CLASS CLASSIFIER
ture learning in general [80]. Similarly, representation learn- Only one classifier is used for the multi-class classification.
ing [83] is a feature extraction method used to learn automatic The trained classification model  outputs a vector of class
discriminative features. It has not been explored yet for IoT membership probabilities Ps = psi 16i≤n denoting the like-
device classification. lihood that the inspected traffic sample s comes from device
class ci . The traffic is labelled as originating from the device
2) APPROACHES USING FEATURE SELECTION BASED having the highest probability. To capture unknown devices,
DIMENSIONALITY REDUCTION a threshold parameter tr can be defined and fine-tuned using
According to Ghojogh et al. [80], there are two feature selec- the validation dataset. If one probability psi exceeds the
tion approaches: i) filter methods, and ii) wrapper methods. threshold parameter tr (psi > tr), the traffic is classified as
originating from the device class ci . Otherwise, it is classified
a: APPROACHES EMPLOYING FILTER METHODS as unknown. A device can also be considered as unknown
Such methods minimize the feature set by selecting the if the feature vector matches more than one class with a
most discriminative ones. The Correlation Criteria is one of low discriminative threshold (0.5 for example). This is the
the most widely used solutions. It is based on calculating most popular method in the state of the art (90% of reviewed
the correlation between each feature and the label vector. papers).
The features with the highest correlation value are selected.
Sivanathan et al. [58] use Correlation-based Feature Subset 2) METHODS USING ONE-CLASS CLASSIFIER
(CFS) and Information Gain (IG). Similarly, Cvitic et al. [32] (A CLASSIFIER PER DEVICE)
use CICFlowmeter for feature extraction (83 features) and A minority of reviewed papers (14%) use this classification
then apply IG. approach. In the following, we describe how this strategy

97130 VOLUME 10, 2022


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

is employed in the literature. This consists of splitting the 1) SUPERVISED CLASSIFICATION


dataset into numerous binary classification problems (focus- In supervised classification, labeled datasets are split into
ing on a single class, regardless of all other classes) and then training, validation, and test datasets. Datasets can be
a binary classifier is trained for each device. Each classifier separated chronologically or randomly. However, temporal
provides either i) a probability pi that the traffic was generated partitioning better matches the real world scenario, when
by a device class ci or ii) a binary decision on whether the a classifier is trained on existing data and then tested on
input matches the device type. In the first case, a threshold t new data. Despite the cost of labeling and the difficulty of
(cutoff value) should be set. If pi > t, the traffic is labelled detecting new devices not included in training, supervised
as originating from the device class ci . t is empirically set classification techniques are commonly employed in IoT
to maximize the classifier’s accuracy [57]. In the second device classification literature (84% of reviewed papers) due
situation, if a device is accepted by multiple classifiers, the to their high accuracy and ease of implementation.
conflict should be resolved, for example, by computing a
distance-based metric between the sample to identify and a 2) UNSUPERVISED CLASSIFICATION
subset of samples from each class that it has a match for [14], Supervised techniques use labeled device class data. Labeling
or by applying majority votes [15] to break the tie between involves significant human effort, which is tedious and not
multiple matches. scalable given the growing number of IoT devices.
Note that using this strategy, classification accuracy can be Unsupervised learning is more scalable since it minimizes
increased by evaluating the classification results of more than human assistance, but it is harder to execute and its accu-
one sample before choosing the device class. For example, racy is likely to be lower than supervised approaches. Thus,
in [57], the authors perform a majority vote on the classifica- only 16% of reviewed papers use unsupervised classification
tion results of several consecutive TCP sessions to determine, approaches. For instance, the authors of [43] propose a clas-
with an accuracy of 100%, if they were generated by a sification method using semi-supervised GANs (generative
certain device. The optimal number of consecutive sessions adversarial networks).
is defined as the minimum number of sessions on which
C. SHALLOW AND DEEP LEARNING
a majority vote provides zero false positives and zero false
negatives on the test dataset. Deep learning uses multiple layers of nonlinear processing
units. All non-deep learning approaches are shallow learning,
3) MULTI-CLASS CLASSIFIER VS ONE-CLASS CLASSIFIER including most machine learning models before 2006 and
Generating a model for multi-class classifiers is challenging neural networks with one hidden layer.
in practice: when a new device type is added to the net- Despite the advantages of deep learning, the majority of
work or the behavior of existing device types legitimately reviewed papers (79%) still use shallow classification algo-
changes (due to firmware upgrades by device manufactur- rithms, probably due to its simplicity and ease of implemen-
ers, for example), the entire model should be re-trained for tation and because some shallow algorithms are intrinsically
all classes [85]. On the contrary, building a classifier per interpretable, like decision trees. Random Forest is a popular
device avoids costly re-learning if a new device type is classifier due to its accuracy and speed, but its classification
added. In addition, building a classifier per device allows time grows linearly with the number of classes, so it may not
for the discovery of new devices: if a sample is rejected by scale to a large number of device types.
the classifiers, it may be identified as a new device type.
Another advantage is its interpretability. When the number D. EVALUATION SCENARIOS
of features is important, one classifier per class gives a set of Accuracy, precision, recall, F1 score, and ROC are classic
interpretable models instead of one complex model. evaluation metrics. Accuracy measures the ratio of correctly
However, the one-class classifier approaches are more predicted observations to the total observations. Precision
computationally expensive since the results of more than one indicates what percentage of positive predictions were cor-
classifier should be computed. Moreover, managing conflicts rect. Recall defines what percentage of positive cases a clas-
might be time-consuming if a sample fits many device types. sifier has caught. F1 score is a harmonic average of precision
As reported in [14], most device type identification time is and recall.
spent on tiebreaks. Moreover, unbalanced training datasets Most of the reviewed research papers (79%) focus on clas-
can affect classifier performance (there are generally fewer sic evaluation metrics. However, traditional evaluation does
samples for one device type compared to the samples of all the not accurately measure the performance and limitations of
remaining samples combined). This issue can be solved by classification algorithms. For instance, accuracy gives equal
utilizing under-sampling and over-sampling approaches [86]. weights to all classes, which is inappropriate if the dataset
is unbalanced (e.g. you can have 90% of total accuracy but,
B. SUPERVISED, UNSUPERVISED in minority classes, most samples are misclassified). The
ML-based classification algorithms are often classified performance of classifiers should be assessed in different
into supervised and unsupervised approaches, with the scenarios and through diverse metrics and measures. Below,
well-known advantages and limitations of each briefly we describe some other metrics found in the literature to
described below. inspire other evaluation methodologies.
VOLUME 10, 2022 97131
H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

1) MEASURING CLASSIFICATION AND LEARNING SPEED device category. For example, Google Home Mini (GHM)
The learning time is significant since classification models and Amazon Alexa are device types within the category of
that learn rapidly are more adapted to real conditions [20]. home assistants. Finally, a device instance is a physical device
The classification time (the time required to classify one sam- instance of a device type. For example, two different GHMs
ple) is critical for instant device identification [14]. In [46], in the same network are two instances of the GHM device
the authors evaluated the training time, the latency, i.e., the type. In the following, we examine how these different levels
time spent performing device identification, and throughput, of classification have been considered in the literature.
i.e. the number of identifications per second.
2) MEASURING CPU, MEMORY CONSUMPTION AND A. CLASSIFYING DEVICES BY CATEGORY
COMPUTATIONAL COMPLEXITY Different definitions of ‘‘category’’ have been proposed in
In [14], the authors measure the CPU used by the secu- the literature. The most used definition relies on ‘‘the main
rity gateway for the classification and for the enforcement functionality (or purpose) of the device,’’ e.g. refrigerator,
mechanism. In [87], the authors calculate the computational TV, watch, or camera, as proposed in [15], [57], and [47]. For
complexity of the different steps of their solution, namely instance, in [55], the devices are classified into hubs, electron-
feature extraction, clustering, and model training. The feature ics, cameras, and switches & triggers. In [92], four categories
extraction cost is estimated to be m × O(n) where m is the are discussed: IP cameras, smart on/off plugs, motion sen-
number of features and n is the number of packets in the sors, and temperature/environmental sensors. A more broader
session. The cost of clustering is calculated based on the steps definition is proposed in [93], where the authors classify the
and loops in the proposed algorithm. The Random Forest devices according to their application domain into healthcare,
training cost depends on the feature vector dimension, the multimedia, hubs, etc. Note that only 22% of papers exam-
number of decision trees, and the number of training samples. ined in this survey use this classification level.
As the number of IoT devices grows, so do their applica-
3) VARYING EVALUATION SCENARIOS tions and features, requiring new device category definitions.
Some papers measure the variation of performance metrics To this end, Cvitic et al. [32], [94] propose classifying devices
in different scenarios. For example, Huang et al. [51] test according to their ‘‘Cu predictability index.’’ Cu measures
the scalability of their approach and show that accuracy the ‘‘level of predictability of behavior’’ of the device. To do
diminishes with many device types. Meidan et al. [15] mea- this, Cu measures the variation in data received and sent
sure the classification accuracy as a function of the number by a device over a period of time. Devices that behave
of consecutive sessions needed for classification. Similarly, in roughly the same way over time are easily predictable,
Song et al. [88] examine the relationship between identi- whereas devices whose usage and interaction with the user
fication accuracy and the number of packets required for modifies their behavior (and consequently the data received
classification. Bai et al. [55] measure the classification results and sent) are more difficult to predict. The authors derive
under different time window sizes and over different ratios of four device categories based on Cu. In doing so, the authors
training and testing datasets. Similarly, Marchal et al. [20] propose a more general definition of the IoT device category.
assess the evolution of accuracy as the number of training
samples changes.
B. CLASSIFYING DEVICE BY TYPE
4) ADDITIONAL EVALUATION METRICS This is the most common approach in the literature (81%
In addition to classic metrics, other evaluation scenarios can of surveyed papers). There are several ways to define the
been explored. We give the following examples: i) robustness device type. For instance, in [14], a device type denotes the
to adversarial attacks [89] to evaluate the classifier quality on ‘‘combination of model and software version’’ of a particular
ambiguous examples, ii) explainability [90], i.e. if the result device. In [44], a device type is defined by three param-
can be simply interpreted, to provide better acceptance of eters: the manufacturer, the manufacturer-type,
ML-based solutions in IoT, iii) transferability [91], that is, and the manufacturer-type-model, e.g., ‘‘amazon-
whether a model learned in one context can be applied in kindle-v2.0.’’ In [81], a device type is defined by the manu-
another, in order to reduce learning costs and provide ‘‘out- facturer and model (e.g. for security cameras: Simple_Home
of-the-box’’ tools. XCS7_1001).

VII. GRANULARITY OF CLASSIFICATION C. CLASSIFYING DEVICES BY INSTANCE


In the literature, IoT devices are classified at different lev- This is the finest level of granularity, where instances of the
els of granularity. Bezawada et al. [47] enumerate three same device type must be distinguished. It is also the most
classification levels: i) category, ii) type, and iii) instance difficult and expensive scenario. It should be noted that, in the
(cf. Fig. 9). A device category is a grouping of similar devices; literature, the use of the term fingerprint does not reflect the
for instance, devices can be grouped by function, e.g., cam- definition of device instance we propose in this survey but
eras, sensors, or home assistants. A device type, however, rather refers to device identification, i.e., the classification
designates a more specific device model within a general of devices based on their type. Therefore, proposed solutions

97132 VOLUME 10, 2022


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

FIGURE 9. IoT devices classification levels. IoT devices can be classified at different levels of
granularity: i) category, ii) type, and iii) instance. In this example, home assistants and cameras are
two different categories of IoT devices. GHM (Google Home Mini) and Amazon Echo Dot are two
types of Home assistants. Finally, Alexa 1 and Alexa 2, are two instances of the Amazon Echo Dot.

similar hardware and software architecture and communicate


with the same remote cloud servers using the same protocols.
Thus, they often share very similar traffic patterns. Note that
this problem is very close to the instance-based classification
FIGURE 10. Taxonomy of the classification granularity: the approaches
are classified by granularity of classification. Percentages show how often problem, which is still an open problem.
each approach is used in the reviewed papers.
VIII. KEY RESEARCH DIRECTIONS
for device fingerprinting do not distinguish between device In what follows, we consider research directions that have
instances. received little or no attention in the literature. Follow-
Instance level classification has not been sufficiently ing the paper’s rationale, we discuss challenges related to
explored in the literature. To the best of our knowledge, data-acquisition, feature extraction, and machine learning.
no solution in the literature exists for this scenario. However, We address unbalanced data sets and provide solutions
is such a classification really necessary? The answer depends in VIII-A, the importance of minimizing feature extraction
on the use case. For example, when detecting vulnerable costs in VIII-B and improving learning quality in VIII-C.
devices, instance-based classification is not necessary since In sections VIII-D, VIII-E, and VIII-F, we discuss challenges
instances of the same type share vulnerabilities. However, related to scalability, deployment in practice and lack of
instance-based classification could be useful in some use- standardization, respectively.
cases. For example, in [95], the authors focus on 5G resource
A. THE PROBLEM OF UNBALANCED DATASETS
allocation and design a solution for automatically selecting
This is a common problem in many ML applications, but
a 5G slice based on the type of IoT device connecting to
it is accentuated in IoT device classification due to the
the network, which is detected through a classification of the
heterogeneous behavior of IoT devices: some devices, like
radio signal shape. An extension could be envisaged based
plugs, generate sparse traffic, while others, like cameras,
on instance device classification, where two instances of the
generate large amounts of traffic. This makes the detection of
same IoT device used by two users with different rights are
minority class devices difficult. Bai et al. [55] report limited
distinguished. This enables better 5G resource management
data for detecting hubs and Hsu et al. [92] remark that it is
based on the user profile. Instance-based classification could
difficult to distinguish smart plug traffic from IP cam traffic.
also be useful to track a unique user’s device.
Thus, having a balanced dataset is more important than the
D. HOW TO SELECT THE BEST CLASSIFICATION size of the dataset.
GRANULARITY? Solutions based on data augmentation can be considered
The granularity of classification should be carefully set during the training phase [48]. However, it is important to
depending on the application scenario. Category level clas- avoid introducing biases when over-representing minority
sification may be sufficient in many situations. For instance, classes. There is therefore a trade-off to consider to avoid
to ensure QoS by giving different priorities to flows (e.g., overfitting the model.
prioritizing traffic from healthcare devices during periods
of high load), it is not necessary to know the manufacturer B. REDUCING THE COST OF FEATURE EXTRACTION
and software of the device. Even though the category level It is essential to consider the cost of extracting features.
classification of IoT devices is not very precise, it has the Chakraborty et al. [97] distinguish three types of feature
advantage of being scalable. extraction costs: i) the computational cost involves computing
Device type classification is the most commonly used resources used to calculate the features, ii) the memory cost
classification level in the literature due to its better ratio measures the memory used to store running feature values
between accuracy and ease of implementation. However, while computing, and iii) the privacy cost is related to privacy
many results [14], [96] have shown that it is difficult to distin- violation, especially for features extracted from the payload
guish devices from the same manufacturer or with the same that may contain sensitive information. Desai et al. [98]
firmware version. This is because these devices usually have propose a framework for ranking features according to their

VOLUME 10, 2022 97133


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

discriminatory power to differentiate between devices. They IoT device classification is no exception [101]. For example,
demonstrate that a small set of highly ranked features is malicious devices may attempt to mimic the traffic of a
sufficient to achieve an accuracy close to that obtained using legitimate device in order to connect to the network. For-
all features. tunately, it is very difficult to do this while preserving the
Note that using a limited number of features limits the fea- intended malicious functionality [102]. As discussed in [15],
ture extraction cost but can make the classification approach the rogue device must be able to generate similar requests to
more vulnerable to adversarial attacks. In fact, it is easier for the manufacturer’s servers and get similar responses, which
an attacker to generate traffic that mimics the distribution of is difficult to achieve if device authentication is required.
values taken by one feature (e.g., packet size, or IAT, etc.) to
imitate the behavior of a particular IoT device and bypass the 4) TRANSFERABILITY OF THE CLASSIFICATION MODEL
classifier. For instance, Shahid et al. [99] generate sequences Kolcun et al. [34] reveal that the accuracy of classifiers
of packet sizes representing bidirectional flows that look as degrades over time when evaluated on data collected outside
if they were generated by a real smart device. However, it is the training set. However, it is desirable that classifiers that
more complex to bypass a classifier that takes into account perform well in one context can be used in another without
the values of several features because it is difficult to generate expensive retraining. Transfer learning [103] is a promising
traffic that matches the values of all these features at the same solution that should be explored. For example, it would allow
time. a manufacturer to build a model that learns the behavior of an
IoT device and use the model in a smart home to identify the
C. IMPROVING THE QUALITY OF LEARNING device with little-retraining.
1) NEED FOR CONTINUOUS LEARNING
The IoT ecosystem and device behavior evolve rapidly. Thus, D. DISCUSSING SCALABILITY
classification models must be updated to reflect recent data Given the exponential growth of the number and types of IoT
trends. Kolcun et al. [34] note that the accuracy of IoT device devices, it is crucial to design scalable solutions. Scalabil-
classification models falls by 40%, a few weeks after learn- ity must be considered at all stages of the solution design,
ing, and argue that to preserve the accuracy of the models, as explained below.
they need to be continuously updated. It is then necessary 1) Traffic collection: the collection must be quick, effi-
to explore continuous learning ML pipelines that keep the cient, and non-exhaustive. For instance, data sam-
machine-learned models up-to-date [100]. As mentioned in pling [104] (i.e., taking sufficiently representative sam-
Sec. VI-A3, techniques that train a classifier model per device ples rather than the entire dataset) can be used to
are more easily re-trained. improve scalability. However, the choice of the sam-
pling solution must be well thought out as it may be
inappropriate for minority and sparse traffic classes,
2) SCARCITY OF LABELED DATA
which brings us back to the unbalanced dataset prob-
Fan et al. [53] note that collecting and labeling data is lem, discussed above in Sec. VIII-A.
costly and time-consuming, which cannot be scaled to the 2) Feature extraction: feature extraction should not be
overgrowing IoT environment. However, when labeled data complex, long, or costly. It is important to choose
is scarce, supervised learning techniques fail. Using semi- a scalable method. For example, packet-level feature
supervised or unsupervised approaches are possible solu- extraction is very time- and computation-consuming,
tions. Fan et al. [53] proposed an IoT identification model and it is therefore not scalable. On the other hand, deep
based on semi-supervised learning. To do so, they i) judi- learning (cf. Sec V-C) could be improved to simplify
ciously choose the features describing the traffic, ii) perform and automate the feature extraction process, and is
a CNN based dimensionality reduction, and then iii) perform therefore more likely to be scalable.
the classification using a two-layer neural network, classify- 3) ML-based classification: the number of classifiers
ing the traffic into IoT and non-IoT, then specifying the class (one-class classifier or multi-class classifier) should
of IoT objects. They managed to get 99% accuracy using only allow for easier extension to new classes and avoid
5% of labeled data. extensive updating of all the models, as discussed
Generating labeled synthetic data is another solution: e.g, in Sec. VI-A3.
generative adversarial networks (GAN) can generate syn- 4) Classification granularity: Bai et al. [55] noticed a
thetic data close to the real distribution of training data by decrease in accuracy with the increase in the number
capturing the hidden class distribution. In addition, training of classes. One solution is to carefully choose the clas-
classifiers with additional synthetic data points gives them sification granularity according to the final application,
better generalization ability [99]. as discussed in Sec. VII-D.
Moreover, with the emergence of edge computing, it is
3) RESILIENCE TO ADVERSARIAL ATTACKS interesting to use the powerful computing and storage capa-
The vulnerability of ML algorithms to adversarial attacks bilities provided by neighboring edge servers to facilitate the
has been demonstrated in several applications, and ML-based IoT device classification and make it more scalable. A first

97134 VOLUME 10, 2022


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

attempt was proposed by Sun et al. [87] who designed an Finally, most public datasets we surveyed suffer from the
edge-based IoT device classification scheme. Transfer learn- above-mentioned biases, which are also found in papers using
ing, discussed above, can also be used for scalability by them.
minimizing learning time. Q2. How to build effective machine learning classifiers for
IoT device classification?
E. DEPLOYMENT IN PRACTICE Sec. V discussed feature extraction approaches and showed
We observe a gap between academic advancements and mar- that it is more scalable and less expensive to extract features
ket implementation since reviewed IoT device categorization from streams rather than packets.
solutions are seldom (if ever) deployed. Traffic can be split by time interval, number of packets,
Indeed, most proposed solutions have not been imple- or connection-wise. Dividing traffic by connection (packet
mented using a realistic case study. Hence, their contribution flows between two endpoints) is natural and straightforward,
to improving the security or management of the IoT system but not always appropriate. For instance, to identify a device
has not been evaluated, making their actual effectiveness upon its connection to a network, it is more suitable to
uncertain. The lack of such evaluation scenarios is due to consider the first generated packets rather than the whole
the difficulties of implementing and mastering realistic and flow.
usually complex ecosystems. In addition, the challenges dis- Setting the optimal time window or packet count is
cussed above need to be addressed to make the solutions more required for flow splitting, yet this is challenging. Small
mature and ready for market implementation. flows allow for rapid classification but may not be enough
to characterize the device. On the contrary, large flows can
be time- and memory-intensive to analyze. Moreover, IoT
F. MUD AND STANDARDIZATION devices generate varying amounts of data at different rates,
Another solution for classifying and identifying IoT devices so the appropriate number of packets may vary.
would be to use the Manufacturer Usage Description We identified the most discriminative features and dis-
(MUD) [105]. The MUD is a standard defined by the cussed how to calculate them (concatenation, statistics).
IETF [106] that allows IoT device manufacturers to publish Q3. How to build effective machine learning classifiers for
device specifications, including intended communication IoT device classification?
patterns. IoT devices generally perform a specific func- Analysis in Sec. VI showed that creating one multi-class
tion [107], and therefore have a recognizable communication classifier is not scalable and evolutive because the entire
pattern, which can be captured formally and concisely as model must be retrained when a new device type is added
a MUD profile [108]. Unfortunately, current IoT manufac- to the network. On the contrary, building a classifier per
turers do not yet support MUD specifications and mecha- device reduces costly re-training, allows discovery of addi-
nism. Hamza et al. [108] publicly share their tool called tional device kinds, and makes decisions more interpretable.
MUDgee [109] to automatically generate MUD profiles of However, one-class classifier techniques are more compu-
IoT devices. tationally demanding since the results of several classifiers
must be computed and managed.
IX. CONCLUSION Unsupervised learning is more scalable and suited for the
Classifying IoT devices has been proposed as a potential rising variety of IoT device types than supervised learning
solution to secure and manage the IoT ecosystem. This paper since it minimizes labeling costs. However, unsupervised
reviewed relevant literature to answer the following research learning is understudied in the literature. Moreover, more
questions: evaluation scenarios and metrics are required for realistic
Q1. How to design a practical data-acquisition method? assessment of classification algorithms.
Sec. IV showed that it is more practical to collect traffic Q4. How to set the classification granularity? In
data from both IoT and non-IoT devices as they co-exist in Sec. VII, we discussed category-, type-, and instance-based
smart homes and can easily be confused. classification.
There are three device operation modes: setup, idle, and Type-instance device classification achieves the best
interaction. Traffic generated by the devices during idle and trade-off between accuracy and ease of implementation.
interaction modes is abundant and widely used in the litera- Despite being imprecise, the category level classification of
ture. The setup traffic is stable and allows for rapid identifi- IoT devices is scalable and thus more adapted to the IoT
cation once the device is connected to the network. However, context where devices’ diversity is growing. To avoid costly
setup traffic is difficult to collect since the initialization phase classification, granularity level should be set depending on
may not appear multiple times in the device’s lifetime. the application context.
It is also more genuine to collect traffic from outside the We analyzed more issues and suggested new study direc-
home (i.e., place the probe after the gateway, rather than at the tions in Sec. VIII. We discussed scalability and implemen-
gateway) because this reflects real IoT device classification tation in practice and recommended looking at additional
use-cases. We found that this scenario is understudied in the challenges like adversarial attack robustness and model trans-
literature. ferability.

VOLUME 10, 2022 97135


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

APPENDIX

TABLE 2. Papers review with respect to the to data-acquisition and classification granularity taxonomies.

97136 VOLUME 10, 2022


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

TABLE 3. Papers review with respect to the feature extraction taxonomy.

VOLUME 10, 2022 97137


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

TABLE 4. Papers review with respect to the machine learning taxonomy.

97138 VOLUME 10, 2022


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

REFERENCES [21] Tcpdump. Accessed: Jul. 30, 2022. [Online]. Available: https://www.
[1] K. L. Lueth, M. Hasan, S. Sinha, S. Annaswamy, P. Wegner, F. Bruegge, tcpdump.org/
and M. Kulezak, ‘‘State of IoT-spring 2022,’’ IoT Anal. GmbH, Ham- [22] Wireshark. Accessed: Jul. 30, 2022. [Online]. Available: https://
burg, Germany, Tech. Rep., Spring 2022. [Online]. Available: https://iot- www.wireshark.org/
analytics.com/number-connected-iot-devices [23] X. Ma, J. Qu, J. Li, J. C. S. Lui, Z. Li, and X. Guan, ‘‘Pinpointing
[2] F. Shaikh, E. Bou-Harb, J. Crichigno, and N. Ghani, ‘‘A machine learn- hidden IoT devices via spatial–temporal traffic fingerprinting,’’ in Proc.
ing model for classifying unsolicited IoT devices by observing network INFOCOM IEEE Conf. Comput. Commun., Jul. 2020, pp. 894–903.
telescopes,’’ in Proc. 14th Int. Wireless Commun. Mobile Comput. Conf. [24] X. Ma, J. Qu, J. Li, J. C. S. Lui, Z. Li, W. Liu, and X. Guan, ‘‘Inferring
(IWCMC), Jun. 2018, pp. 938–943. hidden IoT devices and user interactions via spatial–temporal traffic finger-
[3] S. Biddle. (2017). Wikileaks Dump Shows CIA Could Turn Smart TVS printing,’’ IEEE/ACM Trans. Netw., vol. 30, no. 1, pp. 394–408, Feb. 2022.
Into Listening Devices. [Online]. Available: https://theintercept.com/ [25] Y. Meidan, V. Sachidananda, H. Peng, R. Sagron, Y. Elovici, and
2017/03/07/wikileaks-dump-shows-cia-could-turn-smart-tvs-into- A. Shabtai, ‘‘A novel approach for detecting vulnerable IoT devices
listening-devices/ connected behind a home NAT,’’ Comput. Secur., vol. 97, Oct. 2020,
[4] (2016). Today the Web Was Broken by Countless Hacked Devices—Your Art. no. 101968.
60-Second Summary. [Online]. Available: https://www.theregister.com/ [26] Github—IoT_Sentinel Device Fingerprint. Accessed: Jul. 30, 2022.
2016/10/21/dyn_dns_ddos_explained/ [Online]. Available: https://github.com/andypitcher/IoT_Sentinel
[5] T. T. T. Nguyen and G. Armitage, ‘‘A survey of techniques for internet [27] A. Sivanathan, H. H. Gharakheili, F. Loi, A. Radford, C. Wijenayake,
traffic classification using machine learning,’’ IEEE Commun. Surveys A. Vishwanath, and V. Sivaraman, ‘‘Classifying IoT devices in smart
Tuts., vol. 10, no. 4, pp. 56–76, Apr. 2008. environments using network traffic characteristics,’’ IEEE Trans. Mobile
[6] P. Yadav, A. Feraudo, B. Arief, S. F. Shahandashti, and V. G. Vassilakis, Comput., vol. 18, no. 8, pp. 1745–1759, Aug. 2018.
‘‘Position paper: A systematic framework for categorising IoT device [28] UNSW IoT Analytics Database. Accessed: Jul. 30, 2022. [Online]. Avail-
fingerprinting mechanisms,’’ in Proc. 2nd Int. Workshop Challenges able: https://iotanalytics.unsw.edu.au/
Artif. Intell. Mach. Learn. Internet Things, Nov. 2020, pp. 62–68, doi: [29] O. Alrawi, C. Lever, M. Antonakakis, and F. Monrose, ‘‘SoK: Security
10.1145/3417313.3429384. evaluation of home-based IoT deployments,’’ in Proc. IEEE Symp. Secur.
[7] H. Tahaei, F. Afifi, A. Asemi, F. Zaki, and N. B. Anuar, ‘‘The rise of traffic Privacy (SP), May 2019, pp. 1362–1380.
classification in IoT networks: A survey,’’ J. Netw. Comput. Appl., vol. 154, [30] (2018). YourThings Database. [Online]. Available: https://yourthings.
Mar. 2020, Art. no. 102538. info/data/
[8] A. Sivanathan, H. H. Gharakheili, and V. Sivaraman, ‘‘Can we classify an [31] R. Perdisci, T. Papastergiou, O. Alrawi, and M. Antonakakis, ‘‘IoTFinder:
IoT device using TCP port scan?’’ in Proc. IEEE Int. Conf. Inf. Autom. Efficient large-scale identification of IoT devices via passive DNS traf-
Sustainability (ICIAfS), Dec. 2018, pp. 1–4. fic analysis,’’ in Proc. IEEE Eur. Symp. Secur. Privacy, Sep. 2020,
[9] M. R. Shahid, G. Blanc, Z. Zhang, and H. Debar, ‘‘IoT devices recognition pp. 474–489.
through network traffic analysis,’’ in Proc. IEEE Int. Conf. Big Data, [32] I. Cvitić, D. Peraković, M. Periša, and B. Gupta, ‘‘Ensemble machine
Dec. 2018, pp. 5187–5192. learning approach for classification of IoT devices in smart home,’’ Int.
[10] C. Xenofontos, I. Zografopoulos, C. Konstantinou, A. Jolfaei, M. K. J. Mach. Learn. Cybern., pp. 1–24, 2021.
Khan, and K.-K.-R. Choo, ‘‘Consumer, commercial, and industrial IoT [33] IoT Traffic. Accessed: Jul. 30, 2022. [Online]. Available: https://www.
(in) security: Attack taxonomy and case studies,’’ IEEE Internet Things kaggle.com/datasets/5cae54f093f90a5a0613542573a16ab7c3f5dbf271cac
J., vol. 9, no. 1, pp. 199–221, Jan. 2022. 549578266e7c63-2d7f9
[11] Gartner. (2016). Gartner Says 8.4 Billion Connected ‘Things’ Will [34] R. Kolcun, D. A. Popescu, V. Safronov, P. Yadav, A. M. Mandalari,
be in Use in 2017, Up 31 Percent From 2016. [Online]. Available: R. Mortier, and H. Haddadi, ‘‘Revisiting IoT device identification,’’ 2021,
https://www.gartner.com/en/newsroom/press-releases/2017-02-07- arXiv:2107.07818.
gartner-says-8-billion-connected-things-will-be-in-use-in-2017-up-31-
[35] Dadabox Data-Set. Accessed: Jul. 30, 2022. [Online]. Available:
percent-from-2016
https://github.com/DADABox/revisiting-iot-device-identification
[12] O. Salman, I. Elhajj, A. Kayssi, and C. Ali, ‘‘A review on machine
[36] IoT Traffic Data. Accessed: Jul. 30, 2022. [Online]. Available:
learning–based approaches for internet traffic classification,’’ Ann.
https://github.com/DongShuaike/iot-traffic-dataset
Telecommun., vol. 75, pp. 673–710, Jun. 2020.
[37] IoT-deNAT. Accessed: Jul. 30, 2022. [Online]. Available:https://
[13] P. M. S. Sanchez, J. M. J. Valero, A. H. Celdran, G. Bovet, M. G. Perez, and
zenodo.org/record/3924770#.YUiPs7gzZPY
G. M. Perez, ‘‘A survey on device behavior fingerprinting: Data sources,
techniques, application scenarios, and datasets,’’ IEEE Commun. Surveys [38] J. Ren, D. J. Dubois, D. Choffnes, A. M. Mandalari, R. Kolcun, and
Tuts., vol. 23, no. 2, pp. 1048–1077, 2nd Quart., 2021. H. Haddadi, ‘‘Information exposure from consumer IoT devices: A multi-
[14] M. Miettinen, S. Marchal, I. Hafeez, N. Asokan, A.-R. Sadeghi, and dimensional, network-informed measurement approach,’’ in Proc. Internet
S. Tarkoma, ‘‘IoT SENTINEL: Automated device-type identification for Meas. Conf., Oct. 2019, pp. 267–279.
security enforcement in IoT,’’ in Proc. IEEE 37th Int. Conf. Distrib. [39] The MON(IoT)R Dataset. Accessed: Jul. 30, 2022. [Online]. Available:
Comput. Syst. (ICDCS), Jun. 2017, pp. 2177–2184. https://github.com/NEU-SNS/intl-iot
[15] Y. Meidan, M. Bohadana, A. Shabtai, M. Ochoa, N. O. Tippenhauer, [40] S. Garcia, A. Parmisano, and M. J. Erquiaga. (Jan. 2020). IoT-23:
J. D. Guarnizo, and Y. Elovici, ‘‘Detection of unauthorized IoT devices A Labeled Dataset With Malicious and Benign IoT Network Traffic.
using machine learning techniques,’’ 2017, arXiv:1709.04647. [Online]. Available: https://www.stratosphereips.org /datasets-iot23
[16] I. Hafeez, M. Antikainen, and S. Tarkoma, ‘‘Protecting IoT-environments [41] Iot-23 Data-Set. Accessed: Jul. 30, 2022. [Online]. Available:
against traffic analysis attacks with traffic morphing,’’ in Proc. IEEE https://zenodo.org/record/4743746#.YW_GphpBxPZ
Int. Conf. Pervasive Comput. Commun. Workshops (PerCom Workshops), [42] (2017). CISCO—Netflow. [Online]. Available: https://Cisco.com
Mar. 2019, pp. 196–201. [43] S. K. Nukavarapu and T. Nadeem, ‘‘Securing edge-based IoT networks
[17] S. Dong, Z. Li, D. Tang, J. Chen, M. Sun, and K. Zhang, ‘‘Your smart with semi-supervised GANs,’’ in Proc. IEEE Int. Conf. Pervasive Comput.
home can’t keep a secret: Towards automated fingerprinting of IoT traffic,’’ Commun. Workshops Affiliated Events (PerCom Workshops), Mar. 2021,
in Proc. 15th ACM Asia Conf. Comput. Commun. Secur., Oct. 2020, pp. 579–584.
pp. 47–59. [44] L. Yu, B. Luo, J. Ma, Z. Zhou, and Q. Liu, ‘‘You are what you broadcast:
[18] Most Popular Smart Home Deviecs. Accessed: Jul. 30, 2022. [Online]. Identification of mobile and IoT devices from (public) WiFi,’’ in Proc. 29th
Available: http://iotlineup.com/ USENIX Secur. Symp. Secur., 2020, pp. 55–72.
[19] A. Sivanathan, D. Sherratt, H. H. Gharakheili, A. Radford, C. Wijenayake, [45] P. A. Abhang, B. W. Gawali, and S. C. Mehrotra, ‘‘Chapter 5—Emotion
A. Vishwanath, and V. Sivaraman, ‘‘Characterizing and classifying IoT recognition,’’ in Introduction to EEG- and Speech-Based Emotion Recog-
traffic in smart cities and campuses,’’ in Proc. IEEE Conf. Comput. Com- nition, P. A. Abhang, B. W. Gawali, and S. C. Mehrotra, Eds. Cambridge,
mun. Workshops (INFOCOM WKSHPS), May 2017, pp. 559–564. MA, USA: Academic Press, 2016, pp. 97–112. [Online]. Available: https://
[20] S. Marchal, M. Miettinen, T. D. Nguyen, A.-R. Sadeghi, and N. Asokan, www.sciencedirect.com/science/article/pii/B9780128044902000051
‘‘AuDI: Toward autonomous IoT device-type identification using peri- [46] A. J. Pinheiro, J. de M. Bezerra, C. A. P. Burgardt, and D. R. Campelo,
odic communication,’’ IEEE J. Sel. Areas Commun., vol. 37, no. 6, ‘‘Identifying IoT devices and events based on packet length from encrypted
pp. 1402–1412, Jun. 2019. traffic,’’ Comput. Commun., vol. 144, pp. 8–17, Aug. 2019.

VOLUME 10, 2022 97139


H. Jmila et al.: Survey of Smart Home IoT Device Classification Using ML-Based Network Traffic Analysis

[47] B. Bezawada, M. Bachani, J. Peterson, H. Shirazi, I. Ray, and [71] K. P. Balanda and H. L. Macgillivray, ‘‘Kurtosis: A critical review,’’ Amer.
I. Ray, ‘‘IoTSense: Behavioral fingerprinting of IoT devices,’’ 2018, Statistician, vol. 42, no. 2, pp. 111–119, May 1988.
arXiv:1804.03852. [72] R. Harris, ‘‘Testing for unit roots using the augmented Dickey-fuller test:
[48] K. Kostas, M. Just, and M. A. Lones, ‘‘IoTDevID: A behavior-based device Some issues relating to the size, power and the lag structure of the test,’’
identification method for the IoT,’’ 2021, arXiv:2102.08866. Econ. Lett., vol. 38, no. 4, pp. 381–386, 1992.
[49] J. Sun, K. Sun, and C. Shenefiel, ‘‘Automated IoT device fingerprinting [73] G. Cirillo and R. Passerone, ‘‘Packet length spectral analysis for IoT
through encrypted stream classification,’’ in Proc. Int. Conf. Secur. Privacy flow classification using ensemble learning,’’ IEEE Access, vol. 8,
Commun. Syst. Cham, Switzerland: Springer, 2019, pp. 147–167. pp. 138616–138641, 2020.
[50] N. Ammar, L. Noirie, and S. Tixeuil, ‘‘Autonomous identification of IoT [74] J. Greis, A. Yushchenko, D. Vogel, M. Meier, and V. Steinhage, ‘‘Auto-
device types based on a supervised classification,’’ in Proc. IEEE Int. Conf. mated identification of vulnerable devices in networks using traffic data
Commun. (ICC), Jun. 2020, pp. 1–6. and deep learning,’’ 2021, arXiv:2102.08199.
[51] Q. Huang, Y. Song, J. Yang, M. Fan, and A. Hu, ‘‘A booting fingerprint of [75] J. Kotak and Y. Elovici, ‘‘IoT device identification using deep learning,’’ in
device for network access control,’’ in Proc. 3rd Int. Conf. Circuits, Syst. Proc. Comput. Intell. Secur. Inf. Syst. Conf. Cham, Switzerland: Springer,
Simul. (ICCSS), Jun. 2019, pp. 251–254. 2020, pp. 76–86.
[52] S. A. Hamad, W. E. Zhang, Q. Z. Sheng, and S. Nepal, ‘‘IoT device [76] F. Yin, L. Yang, Y. Wang, and J. Dai, ‘‘IoT ETEI: End-to-end IoT device
identification via network-flow based fingerprinting and learning,’’ in identification method,’’ in Proc. IEEE Conf. Dependable Secure Comput.
Proc. 18th IEEE Int. Conf. Trust, Secur. Privacy Comput. Commun./13th (DSC), Jan. 2021, pp. 1–8.
IEEE Int. Conf. Big Data Sci. Eng., Aug. 2019, pp. 103–111. [77] CICFlowMeter Features. Accessed: Jul. 30, 2022. [Online]. Available:
[53] L. Fan, S. Zhang, Y. Wu, Z. Wang, C. Duan, J. Li, and J. Yang, ‘‘An IoT https://github.com/ahlashkari/CICFlowMeter/blob/master/ReadMe.txt
device identification method based on semi-supervised learning,’’ in Proc. [78] D. Bekerman, B. Shapira, L. Rokach, and A. Bar, ‘‘Unknown malware
16th Int. Conf. Netw. Service Manag. (CNSM), Nov. 2020, pp. 1–7. detection using network traffic classification,’’ in Proc. IEEE Conf. Com-
[54] F. Le, J. Ortiz, D. Verma, and D. Kandlur, ‘‘Policy-based identification mun. Netw. Secur. (CNS), Sep. 2015, pp. 134–142.
of IoT devices vendor and type by DNS traffic analysis,’’ in Policy- [79] Joy Feature Extracture. Accessed: Jul. 30, 2022. [Online]. Available:
Based Autonomic Data Governance. Berlin, Germany: Springer, 2019, https://github.com/cisco/joy
pp. 180–201. [80] B. Ghojogh, M. N. Samad, S. A. Mashhadi, T. Kapoor, W. Ali, F. Karray,
[55] L. Bai, L. Yao, S. S. Kanhere, X. Wang, and Z. Yang, ‘‘Automatic device and M. Crowley, ‘‘Feature selection and feature extraction in pattern
classification from network traffic streams of Internet of Things,’’ in Proc. analysis: A literature review,’’ 2019, arXiv:1905.02845.
IEEE 43rd Conf. Local Comput. Netw. (LCN), Oct. 2018, pp. 1–9. [81] J. Bao, B. Hamdaoui, and W.-K. Wong, ‘‘IoT device type identifica-
[56] N. Brownlee, C. Mills, and G. Ruth, ‘‘Traffic flow measurement: Archi- tion using hybrid deep learning approach for increased IoT security,’’
tecture,’’ IETF, Fremont, CA, USA, Tech. Rep. RFC 2722, Oct. 1999. in Proc. Int. Wireless Commun. Mobile Comput. (IWCMC), Jun. 2020,
[Online]. Available: https://www.rfc-editor.org/rfc/rfc2722.txt pp. 565–570.
[57] Y. Meidan, M. Bohadana, A. Shabtai, J. D. Guarnizo, M. Ochoa, [82] M. Tschannen, O. Bachem, and M. Lucic, ‘‘Recent advances in
N. O. Tippenhauer, and Y. Elovici, ‘‘ProfilIoT: A machine learning autoencoder-based representation learning,’’ 2018, arXiv:1812.05069.
approach for IoT device identification based on network traffic analysis,’’ [83] Y. Bengio, A. Courville, and P. Vincent, ‘‘Representation learning:
in Proc. Symp. Appl. Comput. New York, NY, USA: Association for A review and new perspectives,’’ IEEE Trans. Pattern Anal. Mach. Intell.,
Computing Machinery, Apr. 2017, pp. 506–509. vol. 35, no. 8, pp. 1798–1828, Aug. 2013.
[58] A. Sivanathan, H. H. Gharakheili, and V. Sivaraman, ‘‘Managing IoT [84] A. Aksoy and M. H. Gunes, ‘‘Automated IoT device identification using
cyber-security using programmable telemetry and machine learning,’’ network traffic,’’ in Proc. IEEE Int. Conf. Commun. (ICC), May 2019,
IEEE Trans. Netw. Service Manag., vol. 17, no. 1, pp. 60–74, Mar. 2020. pp. 1–7.
[59] V. Melnyk, P. Haleta, and N. Golphamid, ‘‘Machine learning based network [85] A. Sivanathan, H. H. Gharakheili, and V. Sivaraman, ‘‘Inferring IoT device
HOUDA JMILA received the engineering degree in computer science and the Ph.D. degree in telecommunications and computer science from Télécom SudParis, Institut Polytechnique de Paris, France, in 2011 and 2015, respectively. She is currently a Postdoctoral Researcher with Télécom SudParis. Her research interests include network security, the automated management of resources in virtual networks, and machine learning applications to these domains.

GREGORY BLANC received the Ph.D. degree from NAIST, Japan, in 2012. During his Ph.D., he chaired the Web 2.0 Application Security WG in the WIDE Project. He is an Associate Professor (Maître de Conférences) with the Networks and Telecommunication Services Department, IMT/Télécom SudParis, Institut Polytechnique de Paris, and a Coordinator of the ICT Security Curriculum. He is the Co-Chair of WG7 on Network Security at Cybersecurity France–Japan and a Steering Committee Member of the RESSI Conference. His research interests include network security, network virtualization, and applications of machine learning to cybersecurity.

MUSTAFIZUR R. SHAHID received the M.Sc. degree in computer science and the Ph.D. degree in computer science and artificial intelligence from the Institut Polytechnique de Paris, in 2017 and 2021, respectively. He is currently a Postdoctoral Researcher in machine learning and cybersecurity with Télécom SudParis, Institut Polytechnique de Paris. His research interests include data science, machine learning, artificial intelligence, computer and network security, and the Internet of Things (IoT).

Commun. Conf. (MILCOM), Nov. 2019, pp. 1–6.
[114] X. Wang, Y. Wang, X. Feng, H. Zhu, L. Sun, and Y. Zou, ‘‘IoTTracker:
An enhanced engine for discovering Internet-of-Thing devices,’’ in Proc.
IEEE 20th Int. Symp. World Wireless, Mobile Multimedia Netw. (WoW-
MARWAN LAZRAG received the engineering
MoM), Jun. 2019, pp. 1–9.
[115] K. Yang, Q. Li, and L. Sun, ‘‘Towards automatic fingerprinting of
degree in telecommunications from the Engi-
IoT devices in the cyberspace,’’ Comput. Netw., vol. 148, pp. 318–327, neering School of Communications (Sup’Com),
Jan. 2019. University of Carthage, Tunisia, in 2019. He is
[116] Y. Sun, S. Fu, S. Zhang, H. Zhu, and Y. Li, ‘‘Accurate IoT device a Research and Development Engineer with
identification from merely packet length,’’ in Proc. 16th Int. Conf. Mobility, IMT/Télécom SudParis, Institut Polytechnique de
Sens. Netw. (MSN), Dec. 2020, pp. 774–778. Paris. His research interests include the Internet of
[117] A. Sivanathan, H. H. Gharakheili, and V. Sivaraman, ‘‘Detecting behav- Things and the development of a decision support
ioral change of IoT devices using clustering-based network traffic model- and security automation platform.
ing,’’ IEEE Internet Things J., vol. 7, no. 8, pp. 7295–7309, Aug. 2020.
