IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 18, NO. 8, AUGUST 2019 1745

Classifying IoT Devices in Smart Environments Using Network Traffic Characteristics
Arunan Sivanathan, Hassan Habibi Gharakheili, Franco Loi, Adam Radford, Chamith Wijenayake,
Arun Vishwanath, Senior Member, IEEE, and Vijay Sivaraman

Abstract—The Internet of Things (IoT) is being hailed as the next wave revolutionizing our society, and smart homes, enterprises, and
cities are increasingly being equipped with a plethora of IoT devices. Yet, operators of such smart environments may not even be fully
aware of their IoT assets, let alone whether each IoT device is functioning properly and safe from cyber-attacks. In this paper, we address this
challenge by developing a robust framework for IoT device classification using traffic characteristics obtained at the network level. Our
contributions are fourfold. First, we instrument a smart environment with 28 different IoT devices spanning cameras, lights, plugs, motion
sensors, appliances, and health-monitors. We collect and synthesize traffic traces from this infrastructure for a period of six months, a
subset of which we release as open data for the community to use. Second, we present insights into the underlying network traffic
characteristics using statistical attributes such as activity cycles, port numbers, signalling patterns, and cipher suites. Third, we develop
a multi-stage machine learning based classification algorithm and demonstrate its ability to identify specific IoT devices with over
99 percent accuracy based on their network activity. Finally, we discuss the trade-offs between cost, speed, and performance involved in
deploying the classification framework in real-time. Our study paves the way for operators of smart environments to monitor their IoT
assets for presence, functionality, and cyber-security without requiring any specialized devices or protocols.

Index Terms—IoT, network characteristics, device visibility, classification, machine learning

• A. Sivanathan, H. Habibi Gharakheili, F. Loi, C. Wijenayake, and V. Sivaraman are with the School of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, NSW 2052, Australia. E-mail: {asivanathan, h.habibi, c.wijenayake, vijay}@unsw.edu.au, f.loi@student.unsw.edu.au.
• A. Radford is with Cisco Systems, Sydney, NSW 2060, Australia. E-mail: aradford@cisco.com.
• A. Vishwanath is with IBM Research, Southbank, VIC 3006, Australia. E-mail: [email protected].

Manuscript received 4 Oct. 2017; revised 10 Aug. 2018; accepted 14 Aug. 2018. Date of publication 20 Aug. 2018; date of current version 28 June 2019.
(Corresponding author: Hassan Habibi Gharakheili.)
For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below.
Digital Object Identifier no. 10.1109/TMC.2018.2866249

1536-1233 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: Birla Institute of Technology and Science. Downloaded on October 05,2024 at 14:43:49 UTC from IEEE Xplore. Restrictions apply.

1 INTRODUCTION

THE number of devices connecting to the Internet is ballooning, ushering in the era of the “Internet of Things” (IoT). IoT refers to the tens of billions of low cost devices that communicate with each other and with remote servers on the Internet autonomously. It comprises everyday objects such as lights, cameras, motion sensors, door locks, thermostats, power switches and household appliances, with shipments projected to reach nearly 20 billion by 2020 [1]. Thousands of IoT devices are expected to find their way in homes, enterprises, campuses and cities of the near future, engendering “smart” environments benefiting our society and our lives.

The proliferation of IoT, however, creates an important problem. Operators of smart environments can find it difficult to determine what IoT devices are connected to their network and further to ascertain whether each device is functioning normally. This is mainly attributed to the task of managing assets in an organization, which is typically distributed across different departments. For example, in a local council, lighting sensors may be installed by the facilities team, sewage and garbage sensors by the sanitation department and surveillance cameras by the local police division. Coordinating across various departments to obtain an inventory of IoT assets is time consuming, onerous and error-prone, making it nearly impossible to know precisely what IoT devices are operating on the network at any point in time. Obtaining “visibility” into IoT devices in a timely manner is of paramount importance to the operator, who is tasked with ensuring that devices are in appropriate network security segments, are provisioned for requisite quality of service, and can be quarantined rapidly when breached. The importance of visibility is emphasized in Cisco’s most recent IoT security report [2], and further highlighted by two recent events: sensors of a fishtank that compromised a casino in Jul 2017 [3], and attacks on a University campus network from its own vending machines in Feb 2017 [4]. In both cases, network segmentation could have potentially prevented the attack and better visibility would have allowed rapid quarantining to limit the damage of the cyber-attack on the enterprise network.

One would expect that devices can be identified by their MAC address and DHCP negotiation. However, this faces several challenges: (a) IoT device manufacturers typically use NICs supplied by third-party vendors, and hence the Organizationally Unique Identifier (OUI) prefix of the MAC address may not convey any information about the IoT device; (b) MAC addresses can be spoofed by malicious devices; (c) many IoT devices do not set the Host Name option in their DHCP requests [5]; indeed we found that about half the IoT devices we studied do not reveal their host names, as shown in Table 1; (d) even when the IoT device exposes its host name it may not always be meaningful (e.g., WBP-EE4C for Withings baby monitor in Table 1);

TABLE 1
MAC Address and DHCP Host Name of IoT Devices Used in Our Testbed

IoT device                 MAC address        OUI                            DHCP host name
-------------------------  -----------------  -----------------------------  -----------------------
Amazon Echo                44:65:0d:56:cc:d3  Amazon Technologies Inc.       -
August Doorbell Cam        e0:76:d0:3f:00:ae  AMPAK Technology, Inc.         -
Awair air quality monitor  70:88:6b:10:0f:c6  -                              Awair-4594
Belkin Camera              b4:75:0e:ec:e5:a9  Belkin International Inc.      NetCamHD
Belkin Motion Sensor       ec:1a:59:83:28:11  Belkin International Inc.      -
Belkin Switch              ec:1a:59:79:f4:89  Belkin International Inc.      -
Blipcare BP Meter          74:6a:89:00:2e:25  Rezolt Corporation             -
Canary Camera              7c:70:bc:5d:5e:dc  IEEE Registration Authority    Ambarella/C100F1615229
Dropcam                    30:8c:fb:2f:e4:b2  Dropcam                        -
Google Chromecast          6c:ad:f8:5e:e4:61  AzureWave Technology Inc.      Chromecast
Hello Barbie               28:c2:dd:ff:a5:2d  AzureWave Technology Inc.      Barbie-A52D
HP Printer                 70:5a:0f:e4:9b:c0  Hewlett Packard                HPE49BC0
iHome PowerPlug            74:c6:3b:29:d7:1d  AzureWave Technology Inc.      hap-29D71D
LiFX Bulb                  d0:73:d5:01:83:08  LIFI LABS MANAGEMENT PTY LTD   LIFX Bulb
NEST Smoke Sensor          18:b4:30:25:be:e4  Nest Labs Inc.                 -
Netatmo Camera             70:ee:50:18:34:43  Netatmo                        netatmo-welcome-183443
Netatmo Weather station    70:ee:50:03:b8:ac  Netatmo                        -
Philips Hue Lightbulb      00:17:88:2b:9a:25  Philips Lighting BV            Philips-hue
Pixstart photo frame       e0:76:d0:33:bb:85  AMPAK Technology, Inc.         -
Ring Door Bell             88:4a:ea:31:66:9d  Texas Instruments              -
Samsung Smart Cam          00:16:6c:ab:6b:88  Samsung Electronics Co.,Ltd    -
Smart Things               d0:52:a8:00:67:5e  Physical Graph Corporation     SmartThings
TP-Link Camera             f4:f2:6d:93:51:f1  TP-LINK TECHNOLOGIES CO.,LTD.  Little Cam
TP-Link Plug               50:c7:bf:00:56:39  TP-LINK TECHNOLOGIES CO.,LTD.  HS110(US)
Triby Speaker              18:b7:9e:02:20:44  Invoxia                        -
Withings Baby Monitor      00:24:e4:10:ee:4c  Withings                       WBP-EE4C
Withings Scale             00:24:e4:1b:6f:96  Withings                       -
Withings sleep sensor      00:24:e4:20:28:c6  Withings                       WSD-28C6

(A dash marks a cell that is empty in the original table.)
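
To make the limits of MAC- and DHCP-based identification concrete, the following is a minimal Python sketch (not part of the original study) that groups a handful of rows copied from Table 1 by their OUI prefix and lists the devices that expose no DHCP host name. The helper names `oui_prefix`, `devices_by_oui` and `missing_host_names` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# A few rows copied from Table 1: (device, MAC address, DHCP host name or None).
ROWS = [
    ("August Doorbell Cam", "e0:76:d0:3f:00:ae", None),
    ("Pixstart photo frame", "e0:76:d0:33:bb:85", None),
    ("Withings Baby Monitor", "00:24:e4:10:ee:4c", "WBP-EE4C"),
    ("Withings Scale", "00:24:e4:1b:6f:96", None),
    ("Withings sleep sensor", "00:24:e4:20:28:c6", "WSD-28C6"),
    ("Amazon Echo", "44:65:0d:56:cc:d3", None),
    ("HP Printer", "70:5a:0f:e4:9b:c0", "HPE49BC0"),
]


def oui_prefix(mac: str) -> str:
    """Return the first three octets of a MAC address (the OUI), lower-cased."""
    return ":".join(mac.lower().split(":")[:3])


def devices_by_oui(rows):
    """Map each OUI prefix to the list of devices that carry it."""
    groups = defaultdict(list)
    for device, mac, _host in rows:
        groups[oui_prefix(mac)].append(device)
    return dict(groups)


def missing_host_names(rows):
    """Return the devices that do not reveal a DHCP host name."""
    return [device for device, _mac, host in rows if not host]


if __name__ == "__main__":
    for prefix, devices in devices_by_oui(ROWS).items():
        if len(devices) > 1:
            print(prefix, "is shared by", devices)
    print("no DHCP host name:", missing_host_names(ROWS))
```

In this sample, the prefix e0:76:d0 (AMPAK) is shared by a doorbell camera and a photo frame, so the OUI alone cannot separate the two device types.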

and lastly (e) these host names can be changed by the user (e.g., the HP printer can be given an arbitrary host name). For these reasons, relying on DHCP infrastructure is not a viable solution to correctly identify devices at scale.

In this paper, we address the above problem by developing a robust framework that classifies each IoT device separately in addition to one class of non-IoT devices with high accuracy using statistical attributes derived from network traffic characteristics. Qualitatively, most IoT devices are expected to send short bursts of data sporadically. Quantitatively, our preliminary work in [6] was one of the first attempts to study how much traffic IoT devices send in a burst and how long they idle between activities. We also evaluated how much signaling they perform (e.g., domain lookups using DNS or time synchronization using NTP) in comparison to the data traffic they generate. This paper significantly expands on our prior work by employing a more comprehensive set of attributes on trace data captured over a much longer duration (of 6 months) from a test-bed comprising 28 different IoT devices.

There is no doubt that it is becoming increasingly important to understand the nature of IoT traffic. Doing so helps contain unnecessary multicast/broadcast traffic, reducing the impact it has on other applications. It also enables operators of smart cities and enterprises to dimension their networks for appropriate performance levels in terms of reliability, loss, and latency needed by environmental, health, or safety applications. However, the most compelling reason for characterizing IoT traffic is to detect and mitigate cyber-security attacks. It is widely known that IoT devices are by their nature and design easy to infiltrate [7], [8], [9], [10], [11], [12]. New stories are emerging of how IoT devices have been compromised and used to launch large-scale attacks [13]. The large heterogeneity in IoT devices has led researchers to propose network-level security mechanisms that analyze traffic patterns to identify attacks (see [14] and our recent work [15]); success of these approaches relies on a good understanding of what a “normal” IoT traffic profile looks like.

Our primary focus in this work is to establish a machine learning framework based on various network traffic characteristics to identify and classify the default (i.e., baseline) behavior of IoT devices on a network. Such a framework can potentially be used in the future to detect anomalous behavior of IoT devices (potentially due to cyber-attacks), but such anomaly detection schemes are beyond the scope of this paper. This paper fills an important gap in the literature relating to classification of IoT devices based on their network traffic characteristics. Our contributions are:

1) We instrument a living lab with 28 IoT devices emulating a smart environment. The devices include cameras, lights, plugs, motion sensors, appliances and health-monitors. We collect and synthesize data from this environment for a period of 6 months. A subset of our data is made available for the research community to use.
2) We identify key statistical attributes such as activity cycles, port numbers, signaling patterns and cipher suites, and use them to give insights into the underlying network traffic characteristics.
3) We develop a multi-stage machine learning based classification algorithm and demonstrate its ability to identify specific IoT devices with over 99 percent accuracy based on their network behavior.
4) We evaluate the deployment of the classification framework in real-time, by examining the trade-offs between costs, speed, and accuracy of the classifier.

The rest of this paper is organized as follows: Section 2 describes relevant prior work. We present our IoT setup and data traces in Section 3, and in Section 4 characterize traffic attributes of the various IoT devices. In Section 5 we propose a machine learning based multi-stage device classification method and evaluate its performance, followed by a discussion on the real-time operation of the proposed system in Section 6. The paper is concluded in Section 7.

2 RELATED WORK

There is a large body of work characterizing general Internet traffic [16], [17], [18], [19]. These prior works largely focus on application detection (e.g., Web browsing, Gaming, Mail, Skype VoIP, Peer-to-Peer, etc.). However, studies focusing on characterizing IoT traffic (also referred to as machine-to-machine or M2M traffic) are still in their infancy.

Analysis of Empirical Traces. The work in [20] is one of the first large-scale studies to delve into the nature of M2M traffic. It is motivated by the need to understand whether M2M traffic imposes new challenges for the design and management of cellular networks. The work uses a traffic trace spanning one week from a tier-1 cellular network operator and compares M2M traffic with traditional smart-phone traffic from a number of different perspectives: temporal variations, mobility, network performance, and so on. The study informs network operators to be cognizant of these factors when managing their networks.

In [21], the authors note that the amount of traffic generated by a single M2M device is likely to be small, but the total traffic generated by hundreds or thousands of M2M devices would be substantial. These observations are to some extent corroborated by [22], [23], which note that a remote patient monitoring application is expected to generate about 0.35 MB per day and smart meters roughly 0.07 MB per day.

Aggregated Traffic Model. A Coupled Markov Modulated Poisson Processes (CMMPP) framework to capture the behavior of a single machine-type communication as well as the collective behavior of tens of thousands of M2M devices is proposed in [24]. The complexity of the CMMPP framework is shown to grow linearly with the number of M2M devices, rendering it effective for large-scale synthesis of M2M traffic.

In [25], the authors show that it is possible to split the (traffic) state of an M2M device into three generic categories, namely periodic update, event driven, and payload exchange, and a number of modelling strategies that use these states are developed. An illustration of model fitting is shown via a use-case in fleet management comprising 1000 trucks run by a transportation company. The fitting is based on measured M2M traffic from a 2G/3G network. A simple model to estimate the volume of M2M traffic generated in a wireless sensor network enabled connected home is constructed in [26]. Since behavior of sensors is very application specific, the work identifies certain common communication patterns that can be attributed to any sensor device. Using these attributes, four generalized equations are proposed to estimate the volume of traffic generated by a sensor network enabled connected apartment/home.

Use of Machine Learning. Various machine-learning-based analytical methods have been proposed in the literature to classify traffic applications or identify malware/botnets for typical computer networks. The work in [27] uses deep learning to classify flow types such as HTTP, SMTP, Telnet, QUIC, Office365, and YouTube by considering six features, namely source/destination port number, payload volume, TCP window size, inter-arrival time and direction of traffic, that are extracted from the first 20 packets of a flow. The work carried out in [28] suggests that botnets exhibit identifiable traffic patterns that can be classified by considering features such as average time between successive flows, flow duration, inbound/outbound traffic volume, and Fourier transformation over the flow start times. Detection of malicious activity on the network was enhanced in [29] and [30] by combining these flow-level features with packet-level attributes including packet size, byte distribution of payload, inter-arrival times of packets and TLS handshake metadata (i.e., cipher suite codes). Further, the authors have released an open source libpcap-based tool called Joy [31] to extract these features from the passive capture of network traffic.

In the context of IoT, [32] uses machine learning to classify a single TCP flow from authorized devices on the network. It employs over 300 attributes (packet-level and flow-level), though the most influential ones are the minimum, median and average of packet Time-To-Live (TTL), the ratio of total bytes transmitted and received, the total number of packets with the reset (RST) flag, and the Alexa rank of the server.

While all the above works make important contributions, they do not undertake fine-grained characterization and classification of IoT devices in a smart environment such as a home, city, campus or enterprise. Furthermore, statistical models are not developed that enable IoT device classification based on their network traffic characteristics. Most importantly, prior works do not make any data set publicly available for the research community to use and build upon. Our work overcomes these shortcomings.

3 IOT TRAFFIC COLLECTION AND SYNTHESIS

In this section, we describe our smart environment infrastructure for collecting and synthesizing traffic from various IoT devices.

3.1 Experimental Test-Bed

A real-life architecture of a “smart environment” is depicted in Fig. 1. It serves a wide range of IoT and non-IoT devices over its (wired/wireless) network infrastructure and allows them to communicate with Internet servers via a gateway. Our lab setup, a specialized implementation of this architecture housed at our campus facility, comprises one TP-Link Archer C7 v2 WiFi access point (representing the internal switch) collocated with the Internet gateway. The TP-Link access point, flashed with the OpenWrt firmware release Chaos Calmer (15.05.1, r48532), serves as the gateway to the public Internet. We also installed additional OpenWrt packages on the gateway, namely tcpdump (4.5.1-4) for capturing traffic, bash (4.3.39-1) for scripting, the block-mount package for mounting external USB storage on the gateway, and kmod-usb-core and kmod-usb-storage (3.18.23-1) for storing the traffic trace data on the USB storage.

In our lab setup, the WAN interface of the TP-Link access point is connected to the public Internet via the university network, while the IoT devices are connected to the LAN and WLAN interfaces respectively. Our smart environment has a total of 28 unique IoT devices representing different categories along with several non-IoT devices. Here, IoT refers to specific-purpose Internet connected devices (e.g., cameras and smoke sensors), while general-purpose devices (e.g., phones and laptops) fall into the non-IoT category.

Fig. 1. Testbed architecture showing 28 different connected IoT devices along with several non-IoT devices; telemetry collected across the infrastructure is fed to our classification models.

The IoT devices include cameras (Nest Dropcam, Samsung SmartCam, Netatmo Welcome, Belkin camera, TP-Link Day Night Cloud camera, Withings Smart Baby Monitor, Canary camera, August door bell, Ring door bell), switches and triggers (iHome, TP-Link Smart Plug, Belkin Wemo Motion Sensor, Belkin Wemo Switch), hubs (Smart Things, Amazon Echo), air quality sensors (NEST Protect smoke alarm, Netatmo Weather station, Awair air quality monitor), electronics (Triby speaker, PIXSTAR Photoframe, HP Printer, Hello Barbie, Google Chromecast), healthcare devices (Withings Smart scale, Withings Aura smart sleep sensor, Blipcare blood pressure meter) and light bulbs (Philips Hue and LiFX Smart Bulb). Several non-IoT devices were also connected to the testbed, such as laptops, mobile phones and an Android tablet. The tablet was used to configure the IoT devices as recommended by the respective device manufacturers.

3.2 Trace Data

All the traffic on the LAN side was collected using the tcpdump tool running on OpenWrt [33]. It is important to have a one-to-one mapping between a physical device and a known MAC address (by virtue of being in the same LAN) or IP address (i.e., without NAT) in the traffic trace. Capturing traffic on the LAN allowed us to use the MAC address as the identifier for a device to isolate its traffic from the traffic mix comprising many other devices in the network. We developed a script to automate the process of data collection and storage. The resulting traces were stored as pcap files on an external USB hard drive of 1 TB storage attached to the gateway. This setup permitted continuous logging of the traffic across several months.

We logged the network traffic in our smart environment from 1-Oct-2016 to 13-Apr-2017, i.e., over a period of 26 weeks. The raw trace data contains packet headers and payload information. The process of data collection and storage begins at midnight local time each day using a Cron job on OpenWrt. We wrote a monitoring script on OpenWrt to ensure that data collection/storage was proceeding smoothly. The script checks the processes running on the gateway at 5 second intervals. If the logging process is not running, then the script immediately restarts it, thereby limiting any data loss event to only 5 seconds. To make the trace data publicly available, we set up an Apache server on a virtual machine (VM) in our university data center and wrote a script to periodically transfer the trace data from the previous day, stored on the hard drive, onto the VM. The trace data from two weeks is openly available for download at: http://iotanalytics.unsw.edu.au/. The size of the daily logs varies between 61 MB and 2 GB, with an average of 365 MB.

4 IOT TRAFFIC CHARACTERIZATION

We now present our observations using passive packet-level analysis of traffic from 28 IoT devices over the course of 26 weeks. We study a broad range of IoT traffic characteristics including activity patterns (e.g., distribution of volume/times during active/sleep periods), and signalling (e.g., domain names requested, server-side port numbers used and TLS handshake exchanges).

IoT traffic constitutes (i) traffic generated by the devices autonomously (e.g., DNS and NTP) that is unaffected by human interaction, as well as (ii) traffic generated due to users interacting with the devices (e.g., the Belkin Wemo sensor responding to detection of movement, Amazon Echo responding to voice commands issued by a user, the LiFX lightbulb changing colour and intensity upon user request, the Netatmo Welcome camera detecting an occupant and instructing the LiFX light bulb to turn on with a specific colour, and so on). Our dataset captures both types of IoT traffic from a lab that represents a living smart environment (i.e., covering periods over which humans are present or absent in the environment).

To provide insights into the IoT traffic characteristics, we show in Fig. 2 a Sankey plot of network traffic seen over a 24 hour period for Amazon Echo and LiFX lightbulb. These devices are chosen just for illustrative purposes. Each plot depicts the flow-level information generated by the respective device. Flows are: (a) either unicast or multicast/broadcast, (b) destined to either local hosts (LAN) or Internet servers (WAN), and (c) tied to protocols (TCP, UDP, ICMP or IGMP) and port numbers.

Fig. 2 provides a visual aid depicting the underlying traffic signature exhibited by the two devices. For example, DNS (port number 53) and NTP (port number 123) are used by both Amazon Echo and LiFX lightbulb. While Amazon Echo uses HTTP (port number 80), HTTPS (port number 443) and ICMP (port number 0), LiFX lightbulb does not use any of these applications. Further, each device seems to communicate to a unique port number on a WAN server: TCP 33434 for Amazon Echo and UDP 56700 for LiFX lightbulb, as shown by the top flow in Figs. 2a and 2b. Finally, we observe that Amazon Echo accesses a number of domain names including softwareupdates.amazon.com, device-metrics-su.amazon.com, example.org, pindorama.amazon.com and pool.ntp.org. However, LiFX lightbulb communicates with only two domains, i.e., v2.broker.lifx.co and pool.ntp.org.
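
The three flow dimensions used in the Sankey plots (unicast versus multicast/broadcast, LAN versus WAN destination, and protocol plus port) can be expressed as a small labelling function. The Python sketch below is only an illustration under stated assumptions: the function names `flow_category` and `summarize` are hypothetical, the destination addresses are placeholders rather than addresses from the trace, and a destination is treated as local when Python's `ipaddress` module reports it as private, which will not match every deployment.

```python
import ipaddress
from collections import Counter


def flow_category(dst_ip: str, protocol: str, dst_port: int):
    """Label a flow by cast type, LAN/WAN scope, and protocol/port."""
    addr = ipaddress.ip_address(dst_ip)
    is_broadcast = dst_ip == "255.255.255.255"  # limited broadcast only
    cast = "multicast/broadcast" if (addr.is_multicast or is_broadcast) else "unicast"
    # Assumption: multicast/broadcast and private destinations stay on the LAN.
    scope = "LAN" if (cast != "unicast" or addr.is_private) else "WAN"
    return cast, scope, f"{protocol.upper()} {dst_port}"


def summarize(flows):
    """Count flows per (cast, scope, protocol/port) label."""
    return Counter(flow_category(*flow) for flow in flows)


# Placeholder flows: the port numbers appear in the text, the addresses do not.
EXAMPLE_FLOWS = [
    ("8.8.8.8", "udp", 53),           # DNS to a public server
    ("8.8.8.8", "udp", 123),          # NTP to a public server
    ("8.8.8.8", "udp", 56700),        # device-specific WAN port
    ("239.255.255.250", "udp", 1900), # SSDP multicast
    ("192.168.1.1", "udp", 53),       # DNS to a local resolver
]

if __name__ == "__main__":
    for label, count in summarize(EXAMPLE_FLOWS).items():
        print(label, count)
```

Subnet-directed broadcast addresses (for example 192.168.1.255) are not detected by this sketch and would be labelled as unicast LAN traffic; handling them would need the subnet mask of the capture interface.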

Fig. 2. Sankey diagram of daily network activity for two representative IoT devices, Amazon Echo, and LiFX lightbulb. A clear distinction is observed
in terms of their communication patterns, i.e., the servers they talk to, and the port numbers and protocols used for data exchange.

Fig. 3. Distribution of IoT activity pattern: (a) Flow volume, (b) flow duration, (c) average flow rate, and (d) device sleep time.
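
The four attributes plotted in Fig. 3 are flow volume (total bytes in both directions), flow duration (time between the first and last packet of a flow), average flow rate (volume divided by duration), and device sleep time (an interval with no active flow). As a rough illustration, they could be computed from per-flow packet records as follows. This is a sketch, not the authors' tooling: the function names and the input layout (timestamp, size) are assumptions, and the sample packet sizes are made up.

```python
def flow_attributes(packets):
    """Compute per-flow attributes.

    packets: list of (timestamp_seconds, size_bytes) tuples covering both
    directions of a single flow.
    """
    times = [t for t, _size in packets]
    volume = sum(size for _t, size in packets)   # bytes, upload + download
    duration = max(times) - min(times)           # seconds
    # The rate is undefined for a flow whose packets share one timestamp.
    rate = (8 * volume / duration) if duration > 0 else None  # bits per second
    return {"volume_bytes": volume, "duration_s": duration, "rate_bps": rate}


def sleep_times(flow_intervals):
    """Return the gaps (in seconds) during which a device has no active flow.

    flow_intervals: list of (start, end) timestamps, one pair per flow.
    Overlapping flows are merged before gaps are measured.
    """
    gaps = []
    current_end = None
    for start, end in sorted(flow_intervals):
        if current_end is not None and start > current_end:
            gaps.append(start - current_end)
        current_end = end if current_end is None else max(current_end, end)
    return gaps


if __name__ == "__main__":
    # Made-up flow: three packets spread over one minute.
    print(flow_attributes([(0.0, 60), (30.0, 40), (60.0, 35)]))
    # Made-up flow intervals: two back-to-back flows plus one nested flow.
    print(sleep_times([(0, 60), (61, 121), (50, 55)]))
```

Merging overlapping intervals matters for sleep time: a device with several concurrent flows is only asleep when none of them is active.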

4.1 IoT Activity and Volume Pattern Fig. 3a that each IoT device tends to exchange a small
We start with the activity pattern of IoT devices that is amount of data per flow. For the case of the LiFX lightbulb
defined by the properties of their traffic flows. We define (depicted by red bars), 26 percent of flows transfer between
four key attributes at a per-flow level to characterize IoT [130, 140] bytes and 20 percent between [120, 130] bytes.
devices based on their network activity: flow volume (i.e., The flow volume for the Belkin motion sensor (depicted by
sum total of download and upload bytes), flow duration (i.e., green bars) is slightly higher; over 35 percent of flows trans-
time between the first and the last packet in a flow), average fer between [2800, 3800] bytes. For the Amazon Echo
flow rate (i.e., flow volume divided by the flow duration), (depicted by blue bars), over 95 percent of flows transfer
and device sleep time (i.e., time interval over which the IoT less than 1000 bytes. Though we present the flow volume
device has no active flow). histogram for only a few devices, most of our IoT devices
We plot in Fig. 3 the probability distribution of the above exhibit a similar predictable pattern.
four attributes for a chosen set of IoT devices using the trace A similar pattern emerges for the flow duration as
data collected over 26 weeks. It can be observed from well. Referring to Fig. 3b, we note that the flow duration of
Authorized licensed use limited to: Birla Institute of Technology and Science. Downloaded on October 05,2024 at 14:43:49 UTC from IEEE Xplore. Restrictions apply.
1750 IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 18, NO. 8, AUGUST 2019

Fig. 4. Word-cloud of server ports (total count of unique ports is shown in {sub-captions} next to the device name).

53 seconds is seen in more than 40 percent of flows for in Figs. 4e, 4f, and 4g. We also note that well-known standard
Amazon Echo, while a duration of 60 seconds is seen for the port numbers such as 53 (DNS), 123 (NTP), 0 (ICMP) and
LiFX lightbulb and Belkin motion sensor with a probability 1900 (SSDP) are used by many of the IoT devices as well as
of 50 and 21 percent respectively. the non-IoTs with various frequencies, as shown in Fig. 4.
For the average flow rate attribute, Fig. 3c shows that the Moreover, the server-side port number of 443 (TLS/SSL) is
mean rate is rather small, in the bits-per-second range as also used by many of the IoT devices.
one would qualitatively expect. Quantitatively, the figure
shows that the LiFX lightbulb has an average flow rate of 4.2.2 DNS Queries
18 bits-per-second nearly 60 percent of the time. Nearly
DNS is a common application used by almost all networked
30 percent of Belkin flows have a bit rate in the range 59 to
devices. Since IoT devices are custom-designed for specific
60 bits-per-second while nearly 40 percent Amazon Echo
purposes, they access a limited number of domains corre-
flows have a bit range in the range 70 to 71 bits-per-second.
sponding to their vendor-specific end-point servers. We
Lastly, in terms of the sleep time for the devices Fig. 3d
plot in Fig. 5 the word cloud of domain names accessed by
shows that the Belkin motion sensor and the LiFX lightbulb
several IoT devices as well as non-IoTs. It is seen that IoT
exhibit a distinct sleep pattern. The duration is 1 second and
devices are fairly distinguishable by the domain names they
60 seconds with probability 73 and 48 percent respectively.
communicate with. For example, as depicted in Figs. 5a, 5b,
However, multiple sleep times with small probabilities are
and 5c, domains such as example.com, example.net,
observed for the Amazon Echo. This is because Amazon
and example.org are frequently requested by Amazon
Echo keeps its TCP connections alive and goes to sleep only
Echo; sub-domains of hp.com and hpeprint.com are
when it disconnects from the Internet. Other devices in our test-bed also perform like the Echo and do not seem to have a dominant sleep pattern.

4.2 IoT Signaling Pattern
We now focus on the application layer protocols, inferred using the port numbers, that IoT devices mostly use to communicate locally in the LAN and/or externally with servers on the public Internet.

4.2.1 Server Port Numbers
Fig. 4 shows the word cloud of server-side port numbers of all flows initiated from a variety of IoT devices. For each device, the more frequently a port is used, the larger its font size in the respective word cloud. Sub-captions (i.e., numbers within {}) report the number of unique server ports for each device. It can be seen that each IoT device communicates with only a handful of server ports, whereas non-IoT devices use a much wider range of services (i.e., 2382 unique ports are shown in Fig. 4h and many of them are very infrequent). We observe that non-standard ports 33434, 56700, 8883, and 25050 are prominently seen in traffic originating from Amazon Echo, LiFX lightbulb, Awair air quality monitor, and Netatmo weather station respectively, as shown in the top row of Fig. 4. Further, we note that devices from the same manufacturer share certain ports. For example, port numbers 8443 and 3478 are common between Belkin’s motion sensor, power switch, and camera, as shown

seen in DNS queries from the HP printer. However, we also see that some prominent domain names are shared between the different devices. For example, belkin.com and d3gjecg2uu2faq.cloudfront.net are commonly used by Belkin devices (i.e., camera, motion sensor and power switch) as shown in Figs. 5d, 5e, and 5f; or pool.ntp.org is prominent in traffic flows generated from Google Dropcam, Awair air quality monitor and LiFX lightbulb, as shown in Figs. 5b, 5c, 5d, 5e, 5f, 5g, and 5h. Again considering non-IoTs in Fig. 5i, we see about 12000 unique domains visited, which is far more diverse than for IoT devices, which access only a handful of domains repeatedly.

We also found that IoT devices differ from one another in how often the DNS protocol is used. We have observed from our traffic traces that IoT devices generate DNS queries during different stages of their operation; for example only during the boot-up phase (e.g., Google Dropcam) or when interacting with a user (e.g., Hello Barbie) or periodically (e.g., Amazon Echo). As shown in Fig. 6, certain IoT devices exhibit a characteristic signature in the frequency of their DNS queries. The LiFX lightbulb and Amazon Echo send DNS queries very frequently (i.e., every 5 minutes) but a device like the Belkin motion sensor requests domain names only once every 30 minutes.

4.2.3 NTP Queries
As mentioned earlier, NTP is another popular protocol used by IoT devices because precise and verifiable timing

Fig. 5. Word-cloud of domain names (total count of unique domains is shown in {sub-captions} next to the device name).

is crucial for IoT operations [34]. Many IoT devices tend to use the NTP protocol (UDP port 123) periodically in order to synchronize their time with publicly available NTP servers. For example, Awair air quality monitor, LiFX lightbulb and Google Dropcam obtain the IP address of time servers from pool.ntp.org. We also find that time synchronization occurs repeatedly in our test-bed and many IoT devices exhibit a recognizable pattern in the use of NTP. For example, the Belkin power switch, LiFX lightbulb and SmartThings hub send NTP requests every 60, 300 and 600 seconds respectively, as shown in the histogram plot of Fig. 7.

4.2.4 Cipher Suite
A number of IoT devices use the TLS/SSL protocol (port number 443) to communicate with their respective servers on the Internet [30]. In order to initiate the TLS connection and negotiate the security algorithms with servers, devices start handshaking by sending a “Client Hello” packet with a list of “cipher suites” that they can support, in the order of their preference. For example, Figs. 8a and 8b depict cipher suites that Amazon Echo offers to two different Amazon servers. Each cipher suite (i.e., 4-digit hexadecimal code) can take one of 380 possible values and represents algorithms for key exchange, bulk encryption and message authentication code (MAC). For example, the cipher 002f negotiated by an Amazon server uses RSA, AES_128_CBC, and SHA for key exchange, bulk encryption and message authentication, respectively.

We find that 17 out of the 28 IoT devices in our setup, including the Amazon Echo, August Doorbell Cam, Awair air quality monitor, Belkin Camera, Canary Camera, Dropcam, Google Chromecast, Hello Barbie, HP ENVY Printer, iHome, Netatmo Welcome camera, Philips Hue lightbulb,

Fig. 6. Histogram of DNS interval.
Fig. 7. Histogram of NTP interval.



Fig. 8. Signature of cipher suite.

Pixstar photoframe, Ring Door Bell, Triby, Withings Aura smart sleep sensor and Withings Scale, use TLS/SSL for communication. We find that Amazon Echo uses a total of five different cipher suite strings when communicating over SSL with different servers, the Triby speaker uses two strings, while the Pixstar photoframe uses only one string for all of its SSL communications. We plot unique cipher suite strings from these three devices in Fig. 9 as discrete signals: the x-axis is the order of 4-digit cipher codes that appear in the offered suite, and the y-axis is the index of the individual cipher codes (i.e., a value from {1, 2, ..., 380}). It is seen that the collection of cipher suite signals forms a unique signature for each IoT device. As an exception, we found that the Pixstar photoframe shares its single cipher suite with one of the 18 suites used by the August doorbell; we will see in Section 5.2 that relying only on the cipher suite attribute would not be effective in classifying Pixstar photoframe traffic.

There are however many devices that rarely exchange cipher suites but instead prefer to keep their TLS connections alive for a long period. For example, Google Dropcam establishes a TLS connection to its own server whenever it boots up and maintains this connection as long as it has network connectivity, while Amazon Echo and Pixstar photoframe initiate on average 1 and 2 TLS connections respectively every hour.

Fig. 9. Signature of cipher suite.

Summary. In this section, we have identified 8 key attributes based on the underlying network traffic characteristics of IoT devices. They are flow volume, flow duration, average flow rate, device sleep time, server port numbers, DNS queries, NTP queries and cipher suites. Although some devices (e.g., Amazon Echo, or LiFX lightbulb) can be uniquely identified by considering just one or two traffic attributes such as the list of domain names, port numbers, or cipher suites, these come with challenges. For example, a strong attribute like the list of cipher suites is observed very infrequently in the traffic (e.g., only once a day). As another example, different types of devices from the same vendor visit similar domains and use the same port numbers to access cloud servers. Capturing aspects such as the number of occurrences for these attributes (e.g., number of times a domain is accessed or number of streams that use the port), in combination with other attributes, vastly improves the prediction capability to distinguish between devices from the same manufacturer. In the next section, we develop a multi-stage machine learning based algorithm using combinations of these attributes to help classify IoT devices with high accuracy.

5 MACHINE LEARNING BASED CLASSIFICATION
In order to synthesize the attributes from our trace data, we first convert the raw pcap files into flows on an hourly basis using the Joy tool [31]. Then, for a given IoT device, we compute the traffic activity and signalling attributes defined in the previous section over the hourly instances. The number of instances for each device obtained from the trace spanning 26 weeks varies depending on factors such as the duration for which a device is online, or how a device generates traffic (autonomously or interactively). For example, there were only 13 hourly instances for the Blipcare BP monitor since it generates traffic only when the device is used by a user. On the other hand, we collected 4177 instances for Google Dropcam.

5.1 Multi-Stage Device Classification Architecture
We note that three of our attributes, namely “set of domain names”, “set of remote port numbers” and “set of cipher suites”, are nominal (i.e., are not treated as numeric values) and multi-valued (for example, {"53": 3, "123": 1, "443": 2} represents a set of remote port numbers with three occurrences of port number 53, one occurrence of port 123, and two occurrences of port number 443). Our remaining attributes including flow volume/duration, flow rate, sleep time, and DNS/NTP intervals contain single quantitative and continuous values. We therefore employ a two-stage hierarchical architecture for our IoT classifier as shown in Fig. 10.

In this architecture, we first feed each multi-valued attribute to its corresponding stage-0 classifier in the form of a “bag of words”. A bag of words is a matrix whose rows represent labeled instances, and columns represent unique words. This matrix has M rows (i.e., total number of instances) and N columns (i.e., number of unique words). We observed 356, 421 and 54 unique words for domain names, remote port numbers and cipher suite strings, as shown in Fig. 10. In addition to these unique words, we aggregated all corresponding words for non-IoT devices as “others”: a column called “others” in each Stage-0 matrix represents words not seen



Fig. 10. System architecture of the multi-stage classifier.

in IoT traffic. Each cell of this matrix is the number of occurrences of such unique words in a given instance.

As shown in Fig. 10, each classifier of Stage-0 generates two outputs, namely a tentative class and a confidence level, which together with other single-valued quantitative attributes (i.e., flow volume, duration, rate, sleep time, DNS, NTP intervals) are fed into a Stage-1 classifier that produces the final output (i.e., the device identification with a confidence level).

5.1.1 Stage-0: Bag-of-Words Classifiers
We employ a Naive Bayes Multinomial classifier to analyze each bag of words in the stage-0 of our machine. It has been shown [35] that this classifier performs well in text classification when dealing with a large number of unique words. During the training phase, the classifier takes the distribution of words, e.g., individual unique domain names, and computes the probability of each word given a class using:

$$\Pr(w_j^{\mathrm{train}} \mid c_i) = \frac{1 + \sum_{l=1}^{D} n_{l,c_i,w_j}^{\mathrm{train}}}{N + \sum_{k=1}^{N} \sum_{l=1}^{D} n_{l,c_i,w_k}^{\mathrm{train}}}, \qquad (1)$$

where $w_j$ is a unique word in the training dataset (e.g., port number 56700); $c_i$ is a class label (e.g., LiFX lightbulb); $D$ is the total number of instances; $n_{l,c_i,w_j}^{\mathrm{train}}$ is the number of occurrences of $w_j$ in instance $l$ with class label $c_i$; $N$ is the total number of unique words (e.g., we have $N = 421$ unique port numbers in our dataset).

During the testing phase, the classifier needs to compute the following probability for all possible classes:

$$\Pr(c_i \mid W^{\mathrm{test}}) = \Pr(c_i^{\mathrm{train}}) \prod_{j=1}^{N} \Pr(w_j^{\mathrm{train}} \mid c_i)^{n_j^{\mathrm{test}}}, \qquad (2)$$

where $W^{\mathrm{test}}$ is a set represented by $\{w_1 : n_1^{\mathrm{test}}, w_2 : n_2^{\mathrm{test}}, \ldots, w_N : n_N^{\mathrm{test}}\}$; $n_j^{\mathrm{test}}$ is the number of occurrences of the unique word $w_j$ in a given test instance; $\Pr(c_i^{\mathrm{train}})$ is the presence probability of class $c_i$ in the whole training dataset (i.e., number of $c_i$ training instances divided by total number of all training instances). The classifier finally chooses the class that gives the maximum probability in (2) for a given set of words along with their occurrences. Note that a Naive Bayes Multinomial classifier performs well if training instances are fairly distributed among various classes [35].

5.1.2 Stage-1 Classifier
We have a stage-1 classifier that takes all quantitative attributes along with the pair of outputs from each stage-0 classifier. Since the stage-1 attributes are not linearly separable and the outputs of stage-0 classifiers are nominal values, we use a Random Forest based stage-1 classifier. Another reason for selecting the Random Forest is its high tolerance to over-fitting compared to other decision tree classifiers.

5.2 Performance Evaluation
We use the Weka [36] tool for our IoT device classification. We have collected a total of 50,378 labeled instances from our traffic traces. As mentioned earlier, the number of instances differs across devices: those that generate traffic only when triggered by user interaction have a small number of instances (e.g., 13 for Blipcare BP monitor, 21 for Google Chromecast) and those that autonomously generate traffic have a fairly large number of instances (e.g., 2,868 for Samsung SmartThings or 2,247 for Amazon Echo). We have randomly split instances into two groups, one containing 70 percent of the instances for “training” and another containing 30 percent of the instances for “testing”.

Table 2 shows the performance of our classifier under various scenarios, each captured by a pair of columns. For a given scenario, we measure the true positive rate (i.e., fraction of test instances that are correctly classified) and false positive rate (i.e., fraction of test instances that are incorrectly classified) for every device corresponding to the rows in Table 2. We also obtain the average confidence level (i.e., a number between 0 and 1 depicted within square brackets in each cell) of our classifier for correctly classified and incorrectly classified instances. In addition, we aggregate the performance of individual classes and compute the overall accuracy (i.e., total true positive rate) along with the overall root relative squared error (RRSE) as measures of performance for our classifier. These measures are reported in the top row of each scenario in Table 2. Note that our objective is to achieve a high accuracy (close to 100 percent) with a fairly low error (close to zero).

TABLE 2
Performance of the Proposed IoT Device Classifier under Different Sets of Attributes
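Since the stage-0 results in Table 2 depend on how Equations (1) and (2) interact with class imbalance, a small sketch of the multinomial Naive Bayes scoring may help when reading the per-device analysis. This is an illustrative pure-Python version, not the Weka classifier used in the paper: the class name `BagOfWordsNB` is hypothetical, words absent from the training vocabulary are simply skipped (the paper instead maps them to an “others” column), the confidence is taken as the normalized score of the winning class, and the training counts below are toy values rather than figures from the dataset.

```python
from collections import Counter, defaultdict
from math import exp, log


class BagOfWordsNB:
    """Multinomial Naive Bayes over {word: count} instances (illustrative helper)."""

    def fit(self, instances, labels):
        self.vocab = sorted({w for inst in instances for w in inst})
        self.class_count = Counter(labels)      # training instances per class
        self.n_train = len(labels)
        totals = defaultdict(Counter)           # word occurrences summed per class
        for inst, label in zip(instances, labels):
            totals[label].update(inst)
        n_words = len(self.vocab)               # N in Eq. (1)
        self.word_prob = {}
        for label in self.class_count:
            denom = n_words + sum(totals[label].values())
            # Eq. (1): add-one smoothed probability of each word given the class.
            self.word_prob[label] = {
                w: (1 + totals[label][w]) / denom for w in self.vocab
            }
        return self

    def predict(self, instance):
        """Return (tentative class, confidence) for one {word: count} instance."""
        log_score = {}
        for label, count in self.class_count.items():
            score = log(count / self.n_train)   # log of the class prior in Eq. (2)
            for word, n in instance.items():
                if word in self.word_prob[label]:   # unseen words skipped (assumption)
                    score += n * log(self.word_prob[label][word])
            log_score[label] = score
        best = max(log_score, key=log_score.get)
        peak = log_score[best]
        norm = sum(exp(s - peak) for s in log_score.values())
        return best, 1.0 / norm                 # winning score after normalization


if __name__ == "__main__":
    query = {"53": 1, "8777": 1}
    # Toy, heavily skewed training set: 30 "Ring" instances versus 1 "Blipcare".
    skewed = BagOfWordsNB().fit(
        [{"53": 1}] * 30 + [{"53": 1, "8777": 1}],
        ["Ring"] * 30 + ["Blipcare"],
    )
    print(skewed.predict(query))
    # Same words, but one training instance per class.
    balanced = BagOfWordsNB().fit(
        [{"53": 1}, {"53": 1, "8777": 1}], ["Ring", "Blipcare"]
    )
    print(balanced.predict(query))
```

In the skewed toy set the query is assigned to the majority class even though port 8777 never occurs in that class's training data, because the large prior outweighs the smoothed word probability; with one instance per class the same query is assigned to the class that actually uses port 8777.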

5.2.1 Performance of Stage-0: Port Numbers Attribute
The first three columns correspond to those cases in which we consider only nominal attributes of stage-0 (i.e., bag of words corresponding to port numbers, domain names and cipher suites). The first column shows that when we only use a list of server-side port numbers for device classification, a reasonable accuracy of 92.13 percent is achieved, but RRSE is poor (at 39.93 percent). Inspecting the individual classes, we observe that certain classes highlighted by yellow or light-green (e.g., Ring door bell, Blipcare BP monitor, Hello Barbie, and Google Chromecast) are poorly classified. We explain the reason behind this misclassification next.

Ring Door Bell. Out of 486 instances, 465 contain a single occurrence of the DNS query (i.e., remote port number 53). We see that 95.8 percent of test instances are incorrectly classified as Netatmo weather station. This is because of two reasons: (i) there are 2451 training instances of Netatmo compared to 323 of Ring door bell, which makes $\Pr(c_i^{\mathrm{train}})$ of Netatmo larger than that of Ring door bell, and (ii) many Netatmo instances contain several (on average 4) occurrences of port 53 as opposed to only one for Ring door bell, which also contributes to $\Pr(w_j \mid c_i)$ of Netatmo being greater than that for Ring door bell in (1). Thus, Ring door bell instances get classified as Netatmo weather station, warranting a second stage of classification with additional attributes for improved accuracy.

Blipcare BP Monitor. It uses only two remote port numbers, namely 8777 and 53, in a total of 13 instances; the port numbers appear only once or twice in each instance. Surprisingly, we see that 80 percent of Blipcare test instances are incorrectly classified as Ring door bell though the remote port number of 8777 is unique to the Blipcare BP monitor. This is because

there are only a very small number of Blipcare instances in our dataset, which results in a fairly small value of $\Pr(\text{``53''} \mid \text{Blipcare}) = 0.0203$ and $\Pr(\text{``8777''} \mid \text{Blipcare}) = 0.0294$ in (1), and a negligible value of $\Pr(\text{Blipcare}^{\mathrm{train}}) = 0.0003$ in (2). On the other hand, $\Pr(\text{``8777''} \mid \text{Ring})$ becomes very small as the remote port number 8777 is never used by the Ring door bell in our dataset. However, the probability of $\Pr(\text{``8777''} \mid \text{Ring}) = 0.0011$ in (1) is sufficient to maximize the classifier probability $\Pr(\text{Ring} \mid \{\text{``53''} : 1, \text{``8777''} : 1\})$ in (2), given $\Pr(\text{Ring}^{\mathrm{train}}) = 0.0097$.

Other Devices. Server-side port numbers are empty in 72 percent of instances for Hello Barbie, since it communicates with local devices instead of Internet-based end-points. The same holds for the HP printer (38 percent) and iHome power plug (10 percent). The lack of server-side port number information explains why these devices are classified as Dropcam, which has the highest value of $\Pr(\text{Dropcam}^{\mathrm{train}}) = 0.0828$ in (2). We note that the confidence level of our stage-0 classifier is fairly low (i.e., less than 0.4) in these cases, suggesting that the classifier chooses the most probable class given an empty attribute (i.e., all $n_j^{\mathrm{test}}$ are zero).

5.2.2 Performance of Stage-0: Domain Names Attribute
We now focus on the stage-0 machine that uses only a bag of domain names, which yields an accuracy of 79.48 percent with a fairly high RRSE value of 57.56 percent, as shown in the second column in Table 2. In this scenario, more classes suffer from misclassification (i.e., those with yellow coloured cells) compared to the previous scenario where only remote port numbers were considered. The reasons behind the misclassification are threefold: (i) since devices from the same manufacturer share a collection of domain names, as discussed in Section 4.2.2, 59.8 percent of Belkin camera test instances are misclassified as Belkin motion sensor and 100 percent of Belkin motion sensor instances are misclassified as Belkin switch. Similarly, 56.8 percent of Withings scale instances are incorrectly classified as Withings sleep sensor, and 12 percent of Samsung Smart Cam instances are misclassified as Samsung SmartThings; (ii) a significant number of instances from select devices contain no DNS query entries (e.g., 96.2 percent of HP printer, 73.4 percent of Samsung Smart Cam, 71.4 percent of Hello Barbie, 12.5 percent of iHome power plug, 11 percent of Hue bulb) and are thus incorrectly classified as a Dropcam, which also rarely generates DNS packets; (iii) the low number of training instances with domain names leads to poor performance (e.g., Blipcare BP monitor and Hello Barbie).

5.2.3 Performance of Stage-0: Cipher Suite Attribute
Considering only the cipher suite attribute, this stage-0 classifier results in a fairly low accuracy of 36.15 percent with a high RRSE of 86.73 percent, as shown in the third column in Table 2. Again, the main reason for such poor performance is the scarcity of the cipher suite attribute in the training instances, though this attribute carries a very strong signature to uniquely identify an IoT device. Note that many of the IoT devices do not use secure communication at all and are thus devoid of this attribute (i.e., have an empty field for it). Unsurprisingly, instances of devices that exchange cipher suites fairly frequently, including Amazon Echo, Awair air quality monitor, Canary camera, Google Chromecast and Netatmo camera, are correctly classified, as shown by the dark-green color cells in the corresponding column in Table 2. In addition, we find that the August doorbell cam shares one of its cipher suite strings (out of a total of 18) with the Pixstar photoframe, which has a single cipher suite string. Thus, 21.2 percent of August door bell instances are misclassified as Pixstar photoframe and almost all instances of Pixstar photoframe are classified as August doorbell.

5.2.4 Performance of Stage-0: Combination of Attributes
We expect the combination of the three bags of words (port numbers, domain names, and cipher suites) to significantly enhance the accuracy of our classifier, as indeed shown by the fourth column titled “Combined stage-0” in Table 2. The overall accuracy reaches 97.39 percent with RRSE of 18.24 percent. It can be seen that the majority of test instances are correctly classified, except for Hello Barbie. This is because most of the Hello Barbie attributes are empty in stage-0 and thus it is classified as Dropcam, as mentioned earlier.

Interestingly, we see that all test instances of Blipcare BP monitor are classified correctly though the accuracy of the individual stage-0 classifiers was fairly poor. This is because our decision-tree-based classifier in stage-1 sees a strong correlation between the outputs of stage-0 classifiers and the actual class of the training instance, even though those outputs (tentative class) are incorrect. For example, having the tentative output from the remote port number classifier as Ring door bell, having the tentative output from the cipher suite classifier as Dropcam, and having the confidence level from the domain name classifier less than 0.66 collectively is a strong indication of a Blipcare instance.

5.2.5 Overall Performance
As the last step, we incorporate the outputs from the stage-0 classifiers into stage-1 (without the latter having any notion of the quantitative attributes from the former), and additionally include quantitative attributes (flow volume, duration, rate, sleep time, DNS and NTP intervals). The last column of Table 2 shows the overall performance of the classification framework. In this case, the accuracy reaches a remarkably high value of 99.88 percent, with almost all classes labeled correctly with a very small value of RRSE at 5.06 percent. Fig. 11 shows the full confusion matrix of our classification when all the attributes are used in conjunction, and corroborates that the diagonal entries (corresponding to correct classification) are all at or very close to 100 percent, with just two exceptions: the Google Chromecast and the Hello Barbie. As explained earlier, the Chromecast gets classified as the Dropcam in some instances, while the Hello Barbie gets classified as a Hue bulb.

6 REAL-TIME OPERATION IN A NETWORK
Thus far, we have examined the performance of our multi-stage classifier using off-line analysis on captured traffic traces (i.e., pcap files). In this section, we discuss how one can realize a real-time implementation of our system taking into account the various stages involved in the analysis, namely attribute collection, machine training, and interpreting the classifier’s output.

6.1 Computing Attributes
Extracting the attributes on-the-fly requires infrastructure that has sufficient visibility into the traffic flowing on the

Fig. 11. Confusion matrix of our IoT device classification using all attributes (accuracy: 99.88 percent, RRSE: 5.06 percent).

network. Flow related attributes such as flow volume, flow duration and flow rate can be extracted relatively easily using network switches that are instrumented with special hardware-accelerated flow-level analyzers, e.g., NetFlow capable devices [37]. We therefore deem the extraction cost of flow related attributes to be fairly low, and show them via blue color bars in Fig. 12c that depicts the relative costs and merits of the various attributes.

Attributes including bag of port numbers, sleep-time, and frequency of DNS/NTP requests can be extracted using flow-aware network switches with extra computation and state management. For example, remote port numbers of all flows associated with a given IoT device need to be recorded for the bag of port numbers. However, this specific state is not captured by default in commodity switches. Similarly, time intervals between successive UDP packets of NTP/DNS should be recorded, which requires additional computation. We therefore associate these attributes with medium cost, and show them as yellow color bars in Fig. 12c.

Lastly, two of our attributes, namely bag of domain names and bag of cipher suite strings, can only be extracted by looking inside the payload of the appropriate packets, which imposes considerable cost on processing. Thus, we associate these attributes with high collection cost, and show them via red color bars in Fig. 12c.

Having understood the extraction cost of various attributes, let us now examine the relative importance of the attributes in classifying the IoT devices. We quantify the importance of each attribute by employing the select attributes tool in Weka with InfoGain attribute evaluator and Ranker search method. Fig. 12c shows the attributes in decreasing order of merit score. A high merit score translates to superior strength in identifying the class of an instance. We can see that the “flow-volume” is the most important attribute, followed by “bag of remote port numbers”, “bag of domain names” and “flow duration” respectively. The sleep-time and NTP interval are the attributes with the lowest merit.

Knowing the relative cost and merit of each attribute allows us to evaluate the performance of our classifier using: (a) only low cost attributes, (b) combination of low and medium cost attributes, and (c) all attributes. The classifier accuracy and RRSE are shown in Table 3. It is seen that using only low-cost attributes results in 97.85 percent accuracy with an RRSE value of 18.63 percent; the additional use of medium-cost attributes increases accuracy to 99.68 percent and significantly reduces the RRSE to 7.7 percent; while including all attributes yields an overall accuracy of 99.88 percent and RRSE of 5.06 percent. The method can therefore be tuned to achieve an appropriate balance between attribute collection cost and accuracy/error of classification.

6.2 Training the Machine
The duration of the training data set is another source of cost incurred by our classification. In Fig. 12a, we plot the accuracy of the classifier on the left y-axis and the RRSE on

Fig. 12. Operational insights for real-time implementation of our device classifier: (a) Impact of training, (b) confidence-level for correct/incorrect classification, and (c) importance of attributes.

the right y-axis as a function of the number of days involved in collecting the training data set. Note that the x-axis is in log-scale and each day represents 24 instances.

It can be seen that the classifier achieves an overall accuracy of 99.28 percent with only one day of training and saturates at 99.76 percent when trained over 16 days. On the other hand, RRSE drops from 14.43 to 7.5 percent when the training duration is increased from 1 day to 16 days. It further falls to 5.82 percent when we train using 70 percent of all instances from 128 days. As mentioned in Section 5, the RRSE value is sensitive to the accuracy of individual classes. We therefore believe that if there is a balanced number of instances from various classes, our classifier would perform better in terms of RRSE.

TABLE 3
Impact of Attributes Combination on Performance of Classifier

Attributes used                     Accuracy    RRSE
all attributes                      99.88%      5.06%
low- and medium-cost attributes     99.68%      7.70%
only low-cost attributes            97.85%      18.63%

6.3 Interpreting the Output of Classifier
As discussed in Section 5.1, our classifier generates a confidence level during the testing phase. This can be used as a measure of reliability for our classifier. If adequate information is not provided by a test instance then the classifier will fall back to the most probable class (as discussed in Section 5.2.1) with a low confidence level; this can be interpreted as an “unknown” class. For example, given instances with an empty value for the cipher suite attribute, the corresponding stage-0 classifier will output the Dropcam class with a confidence value of less than 10 percent; even for Dropcam instances that are classified correctly, the confidence level is low, within the same range.

We plot the CCDF of confidence level of our stage-1 classifier in Fig. 12b for instances classified as correct and incorrect. It is clearly seen that the confidence level is always below 80 percent when an instance is incorrectly classified, as shown by the red dotted line; the average confidence level for incorrectly classified instances is 54.22 percent. On the other hand, our classifier has an average 99.74 percent confidence level for instances that are correctly classified. We note that for only a negligible fraction of correctly classified instances (i.e., 0.37 percent) the confidence level is less than 80 percent as shown by the blue dashed line. This suggests that we can comfortably rely on our classifier’s output for a device if it results in a confidence level of greater than 80 percent, otherwise we need to collect more traffic (and richer instances) from that device in order to increase the confidence level.

To demonstrate the ability of our classifier in detecting changes of normal behavior, we have launched UDP reflection and TCP SYN attacks of varying rates on the Samsung camera. When our classifier is fed these attributes during the attack, it incorrectly identifies the device, but its confidence level drops to less than 50 percent. We note that the confidence level is 100 percent for normal traffic from the Samsung camera, as shown in the last column of Table 2. Such a drop is taken as a sign of anomalous behavior that warrants further investigation by the network operator.

7 CONCLUSION
Despite the proliferation of IoT devices in smart homes, enterprises, campuses, and cities around the world, operators of such environments lack visibility into what IoT devices are connected to their networks, what their traffic characteristics are, and whether the devices are functioning appropriately free from security compromises. This work is the first to systematically characterize and classify IoT devices at run-time. We instrumented a smart environment with 28 unique IoT devices and collected traffic traces continuously over 26 weeks. We then statistically characterized the traffic in terms of activity cycles, signalling patterns, communication protocols and cipher suites. We developed a multi-stage machine learning based classification framework that uniquely identifies IoT devices with over 99 percent accuracy. Finally, we evaluated the real-time operational cost, speed, and accuracy trade-offs of our classification method. This paper shows that IoT devices can be identified with high accuracy based on their network behavior, and sets the stage for future work in detecting misbehaviors resulting from security breaches in the smart environment.

ACKNOWLEDGMENTS
Funding for this project was provided by the Australian Research Council (ARC) Linkage Grant LP150100666.

REFERENCES
[1] I. Spectrum, Popular Internet of Things forecast of 50 billion devices by 2020 Is outdated. 2016. [Online]. Available: https://goo.gl/6wSUkk
[2] Cisco, “Cisco 2017 Midyear Cybersecurity Report,” 2017. [Online]. Available: https://www.cisco.com/c/dam/global/es_mx/solutions/security/pdf/cisco-2017-midyear-cybersecurity-report.pdf

[3] A. Schiffer, How a fish tank helped hack a casino, 2017. [Online]. [28] D. Tegeler, et al., “BotFinder: Finding bots in network traffic with-
Available: https://goo.gl/SAHxCX out deep packet inspection,” in Proc. 8th Int. Conf. Emerging Netw.
[4] Ms. Smith, University attacked by its own vending machines, Experiments Technol., Dec. 2012, pp. 349–360.
smart light bulbs & 5,000 IoT devices, 2017. [Online]. Available: [29] D. McGrew and B. Anderson, “Enhanced telemetry for encrypted
https://goo.gl/cdNJnE threat analytics,” in Proc. IEEE 24th Int. Conf. Netw. Protocols,
[5] S. Alexander and R. Droms, “DHCP Options and BOOTP vendor Nov. 2016, pp. 1–6.
extensions,” Internet Requests for Comments, RFC Editor, RFC [30] B. Anderson and D. McGrew, “Identifying encrypted malware
2132, Mar. 1997. [Online]. Available: https://tools.ietf.org/rfc/ traffic with contextual flow data,” in Proc. ACM Workshop Artif.
rfc2132.txt Intell. Security, Oct. 2016, pp. 35–46.
[6] A. Sivanathan, et al., “Characterizing and classifying IoT traffic in [31] Cisco, 2017. [Online]. Available: https://github.com/cisco/joy
smart cities and campuses,” in Proc. IEEE Infocom Workshop Smart [32] Y. Meidan, et al., “Detection of unauthorized IoT devices using
Cities Urban Comput., May 2017, pp. 559–564. machine learning techniques,” arXiv, 2017. [Online]. Available:
[7] S. Notra, et al., “An experimental study of security and privacy http://arxiv.org/abs/1709.04647
risks with emerging household appliances,” in Proc. M2MSec, [33] OpenWrt. 2016. [Online]. Available: https://openwrt.org/
Oct. 2014, pp. 79–84. [34] M. Weiss, et al., Time-aware applications, computers, and com-
[8] F. Loi, et al., “Systematically evaluating security and privacy for munication systems. 2015. [Online]. Available: http://dx.doi.org/
consumer IoT devices,” in Proc. ACM CCS Workshop IoT Security 10.6028/NIST.TN.1867
Privacy, Nov. 2017, pp. 1–6. [35] A. McCallum and K. Nigam, “A comparison of event models for
[9] I. Andrea, et al., “Internet of Things: Security vulnerabilities and naive bayes text classification,” in Proc. Workshop Learn. Text Cate-
challenges,” in Proc. IEEE Symp. Comput. Commun., Jul. 2015, gorization, 1998, pp. 41–48.
pp. 180–187. [36] E. Frank, et al., The WEKA Workbench. Online Appendix for ”Data
[10] K. Moskvitch, “Securing IoT: In your smart home and your con- Mining: Practical Machine Learning Tools and Techniques”, 4th ed.
nected enterprise,” Eng. Technol., vol. 12, no. 3, pp. 40–42, San Mateo, CA, USA: Morgan Kaufmann, 2016.
Apr. 2017. [37] E. Vyncke and C. Paggen, LAN Switch Security: What Hackers Know
[11] N. Dhanjani, Abusing the Internet of Things: Blackouts, Freakouts, and About Your Switches. Indianapolis, IN, USA: Cisco Press, 2008.
Stakeouts. Sebastopol, CA, USA: O’Reilly Media, 2015.
Arunan Sivanathan received the bachelor’s degree from the University of Peradeniya, Sri Lanka, in 2012. He was a lecturer with the University of Jaffna, Sri Lanka, from 2013 to 2016. He is currently working toward the PhD degree in the School of Electrical and Telecommunication Engineering, University of New South Wales (UNSW Sydney). His primary research interests include security of Internet of Things and data analytics on machine-to-machine communication.

Hassan Habibi Gharakheili received the BSc and MSc degrees in electrical engineering from the Sharif University of Technology in Tehran, Iran, in 2001 and 2004, respectively, and the PhD degree in electrical engineering and telecommunications from UNSW in Sydney, Australia, in 2015. He is currently a postdoctoral researcher with UNSW Sydney. His research interests include network architectures, software-defined networking, and Internet of Things.

Franco Loi is currently working toward the bachelor’s degree in electrical engineering and computer science at the University of New South Wales in Sydney. His research interests include the network security of IoT.

Adam Radford received a first class honors degree in science, majoring in computer science, from UNSW. He is a distinguished systems engineer at Cisco Systems in Sydney, Australia. His background is software and automation, having spent 10 years building and automating campus networks. He then joined Cisco and has focused on a variety of technologies including voice, wireless, and data center. In recent times, his focus has been enterprise networks, specifically automation and programmability.


Chamith Wijenayake received the BSc degree in electronic and telecom engineering from the University of Moratuwa, Sri Lanka, in 2007, and the PhD degree in electrical engineering from the University of Akron, Ohio, in 2014. He is currently a lecturer with the School of Electrical Engineering and Telecommunications at UNSW. His research interests include multidimensional space-time signal processing for electronically scanned smart antenna arrays, light field signal processing, local signal approximations, and FPGA based system design for DSP applications.

Arun Vishwanath (SM’15-M’11) received the PhD degree in electrical engineering from the University of New South Wales, Sydney, Australia, in 2011. He is a lead research scientist with IBM Research in Melbourne, Australia, working in the area of IoT for energy optimization in smart buildings and IoT security. He was a visiting PhD scholar in the Department of Computer Science, North Carolina State University, in 2008. His research interests include areas of IoT applications, cybersecurity and software defined networking. He has received several awards from IBM for outstanding technical accomplishments. He is the recipient of the Best Paper Award at the ACM e-Energy 2018 conference, an appointed Distinguished Speaker of the ACM, and a senior member of the IEEE.

Vijay Sivaraman received the BTech degree from the Indian Institute of Technology in Delhi, India, in 1994, the MS degree from North Carolina State University, in 1996, and the PhD degree from the University of California at Los Angeles, in 2000. He has worked at Bell-Labs as a student fellow, in a Silicon Valley start-up manufacturing optical switch-routers, and as a senior research engineer at CSIRO in Australia. He is now a professor with the University of New South Wales in Sydney, Australia. His research interests include software defined networking, network architectures, and cyber-security particularly for IoT networks.
