2022 1st Zimbabwe Conference on Information Communication & Technology(ZCICT) 9th to 10th November

Intrusion Detection System for IoT environments using Machine Learning Techniques
2022 1st Zimbabwe Conference of Information and Communication Technologies (ZCICT) | 978-1-6654-7576-1/22/$31.00 ©2022 IEEE | DOI: 10.1109/ZCICT55726.2022.10045992

Shammah Chishakwe
Dept of Computer Science
National University of Science and Technology
Bulawayo, Zimbabwe

Belinda Mutunhu Ndlovu
Dept of Informatics and Analytics
National University of Science and Technology
Bulawayo, Zimbabwe

Sibusisiwe Dube
Dept of Informatics & Analytics
National University of Science and Technology
Bulawayo, Zimbabwe

0000-0001-6046-3240
[email protected]

Nesisa Moyo
Dept of Computer Science
National University of Science and Technology
Bulawayo, Zimbabwe
[email protected]

Abstract—The Internet of Things (IoT) is fast becoming the new normal in our everyday lives. The communication of connected devices without requiring human intervention has led to the advent of smart ecosystems or environments. Smart ecosystems are environments where smart devices or 'things' try to improve the quality of life for their inhabitants by determining the inhabitant's intent without explicit input. This technological advancement brings with it security concerns regarding confidentiality, integrity, and availability, as large data volumes are processed by smart devices. Mainstream security solutions may not work in IoT environments due to their unique nature, whereby IoT devices use different protocols and have computational resource limitations. This project seeks to develop an intrusion detection system for an IoT network utilizing a machine learning technique, whereby a user is alerted if an anomaly has been detected.

Keywords— IoT, Intrusion Detection System, Smart environments, Machine Learning

[1] INTRODUCTION
The Internet of Things environment is made up of web-enabled devices that employ embedded systems, such as processors, sensors, and actuators, to gather data which will be used in their surroundings. IoT devices exchange the sensor data they gather with other similar gadgets and take action based on the data they exchange. Humans may engage in the setting up of these devices by giving them instructions or retrieving data. However, these devices conduct most of the tasks without human intervention.

Network security is a prevalent topic and a widespread concern as the IoT ecosystem gets bigger. As such, smart ecosystems based on the IoT paradigm must address and emphasize security and privacy [1]. The diversity of IoT protocols and device resource limitations make IoT security a difficult challenge. IoT security systems that actively detect attacks, such as Intrusion Detection Systems (IDS), would be ideal. An intrusion detection system analyses network traffic data to identify and defend against intrusions that jeopardize an information system's confidentiality, integrity, and availability [2].

Confronted with such a challenge, there is a need for an effective security solution that can aid us in the hostile cyber terrain that we find ourselves in today. The advancement of Artificial Intelligence technology brings an opportunity for its incorporation into the cybersecurity sector. The ability of machine learning algorithms to learn and infer from data and then accurately make predictions from the data without needing explicit human input makes them ideal as anomaly detectors in IDSs.

Aims and Objectives

The project aims to create an intrusion detection system model for IoT contexts using machine learning.

The objectives for this project are as follows:

ISBN 978-1-6654-7576-1/22/$31.00 ©2022 IEEE

Authorized licensed use limited to: FLORIDA INTERNATIONAL UNIVERSITY. Downloaded on August 18,2023 at 18:33:12 UTC from IEEE Xplore. Restrictions apply.
• To simulate an IoT environment via an IoT testbed
• To identify anomalies in the IoT environment
• To categorize attacks on the IoT environment
• To issue an alert when an intrusion is discovered

[2] LITERATURE REVIEW
A. Related Works
Physical or cyber-related assaults are common in IoT environments; however, for this study, we focused on cyber or network risks. According to Patel and Aggarwal [3], IoT ecosystems face security risks from several dimensions due to the IoT architecture, which includes the application, network, and perception layers. The most common IoT cyber-attacks according to Asharf et al. [4] are:

• Distributed Denial of Service, which results in service unavailability.
• Worm attacks: software that spreads via a network and targets host systems with harmful malware.
• Hello flood attacks.

Nugroho et al. [5] classify IoT security and privacy needs under client privacy, data authentication, access control, and resistance to attacks. They further state that several approaches are employed in IDSs, and that these are categorized by the detection method, which includes the anomaly method, signature method, specification method, and a hybrid approach, and by the placement strategy, which is either on the network or based on the host device.

Bakhsh et al. [7] developed a system dubbed IDPIoT which takes in packets from the network interface and decodes them before processing and delivering them to the detector agent. To detect anomalies in the packet header, the detector agent examines every packet header for a specific kind of behavior. The system analyzer matches every packet against preestablished detection rules when the alerting and logging system is turned on. Messages are then sent to the output module, which logs alerts and triggers an alarm. The program stores the alarm system's output data in a pre-set location, like a log file or database. Additionally, prevention agents isolate the servers and offer real-time attack mitigation by dropping suspicious packets and blocking the source. Considering the advantages of the system, we noted a few drawbacks concerning the high number of false positives recorded and resource consumption.

Mohamed et al. [8] developed a cloud-based intrusion detection system (IDS) utilizing a Raspberry Pi 3 to gather traffic from smart devices and send it to a cloud-based analyzer. Random Forests and neural networks are used by the intrusion detection system to find intrusions. Data gathering, data processing, and detection and alerting are the three divisions of the system. The device is used to gather IoT communication, extract features, and then classify the collected characteristics. The UNSWB dataset, produced by the CAAA, was used by the authors. The system's biggest flaws, however, are that it has a high rate of false positives, demands a lot more resources, and is highly expensive because it uses cloud technology.

Nwafor, Campbell, and Bloom [9] proposed a provenance graph-based anomaly detection method for detecting unexpected instances of sensor-based incidents in an IoT device. They tested their proposed solution by conducting investigations on an IoT system that mimics a temperature management system. However, we discovered that their current anomaly detection system only supports offline data and does not allow real-time detection.

Summerville et al. [10] developed a deep-packet anomaly detection-based IDS that uses bit-pattern matching for feature selection. Intersecting groups of bytes from network packets are used in the feature selection process to choose features. If bits match in every aspect, there is a pattern match. They achieved minimal false-positive rates based on their testing.

We noted that several datasets are used in the study of IDSs for IoT environments. Popular examples are the KDD CUP 99, NSL-KDD, CICIDS 2017, and UNSW-NB15 datasets. Benchmark datasets are a requirement for the testing and validation of an intrusion detection system, according to Khraisat and Alazab [6]. Furthermore, the use of offline systems and popular aged datasets, for example NSL-KDD and KDD CUP 99, presents research gaps. The old datasets do not truly reflect the nature of today's IoT network environments, which are characterized by large transmitted data volumes and new attack vectors and anomalies.

Thus the complex nature and vast range of IoT devices and protocols, and their resource limitations, make IoT security a difficult task [2]. In light of this problem, the motivation was to come up with an effective security solution for IoT environments.

[3] OBJECTIVES
In this phase, we deduced the objectives from the identified problem and the research gaps presented by existing systems. Thus the objectives of the system are:
• To simulate an IoT environment via an IoT testbed
• To identify anomalies in the IoT environment
• To categorize attacks on the IoT environment
• To issue an alert when an intrusion is discovered

[4] METHODOLOGY

Research Methodology
The design science research technique was used in the build-up of the entire project. It emphasizes the solving of problems through the creation of artefactual solutions. According to vom Brocke et al. [11], the main goal of design science methodology is to advance knowledge through the development of artifacts and the cultivation of design expertise through inventive solutions to real-world problems. This is an ideal methodology because the project's objective, the development of the proposed system, is a solution to a real-world problem at hand.

Software Development Methodology
The system development methodology employed is the Rapid Application Development (RAD) methodology. This methodology is ideal as it emphasizes software development that is gradual and iterative. Parallel system component development in RAD is timeboxed, delivered, and then assembled into a working prototype. Faster client feedback is made possible as a result, guaranteeing on-time project completion as stated by Sommerville [12].

In this stage, we designed the artifact based on our objectives and the research gaps. Unified Modeling Language diagrams were used to achieve this. Below is the use case diagram showing the actors and how they interact with the system.

Fig. 1. Use Case Diagram

Proposed System
The three phases required to create an intrusion detection system are as follows, according to Nugroho et al. [5]:
• data sources,
• the occurrence of intrusion detection utilizing analytical equipment,
• and reactions produced in response.

The suggested system is an anomaly-based intrusion detection system with a web application that sends notifications to authorized users whenever an abnormality is discovered. The machine learning classifier placed on the backend will enable anomaly detection. To collect network traffic, Tshark, a network packet analyzer, will be utilized.

To replicate the IoT environment, an Arduino board, a NodeMCU module, a temperature and humidity sensor, and an ultrasonic sensor will be utilized. The Apache web server with the NodeMCU module enables communication and data transmission between the web application and the IoT sensors.

Software architecture, the basic arrangement of a system, defines an organized solution. It depicts the data and program components needed to construct a computer-based system. Software architecture describes how components of a software system are put together, as well as how they communicate and interact [12]. Therefore, it specifies several things, making the software development process easier.
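The three phases listed under Proposed System (data source, detection, response) can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions: the `TrafficRecord` fields, the `run_pipeline` function, and the length-threshold detector are hypothetical names introduced for illustration, and the threshold rule is a stand-in for a trained classifier, not the detector used in this project.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class TrafficRecord:
    """One observation from the data source (hypothetical fields)."""
    source: str
    destination: str
    length: int


def run_pipeline(
    records: Iterable[TrafficRecord],
    is_anomalous: Callable[[TrafficRecord], bool],
    respond: Callable[[TrafficRecord], None],
) -> int:
    """Phase 1: consume records. Phase 2: analyse each. Phase 3: respond."""
    flagged = 0
    for record in records:
        if is_anomalous(record):
            respond(record)
            flagged += 1
    return flagged


def length_threshold_detector(record: TrafficRecord) -> bool:
    """Stand-in detector: a fixed length threshold, NOT a trained model."""
    return record.length > 1500


alerts: List[str] = []
sample = [
    TrafficRecord("192.168.1.10", "192.168.1.1", 74),
    TrafficRecord("192.168.1.66", "192.168.1.1", 9000),
]
count = run_pipeline(
    sample,
    length_threshold_detector,
    lambda r: alerts.append(f"Possible intrusion from {r.source}"),
)
print(count, alerts)
```

Passing the detector and the response as callables keeps the phases independent, so a trained classifier or a web notification could replace the stand-ins without changing the loop.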

Fig. 2. Proposed system architecture

The user will enter the system, and the database will authenticate and respond to the user. The IDS and the IoT network will create a handshaking connection, and once that connection is formed, the IoT network will transfer data to the IDS module. If the IDS module detects an intrusion, the web app will display a notification and provide details about the incident.

Fig. 3. Sequence Diagram

Machine Learning Model

For this project, after going through several datasets, UNSW Canberra's UNSW-NB15 dataset, created by Moustafa et al. [13], was selected. This publicly available dataset has been extensively reviewed in the literature on machine learning for intrusion detection. We utilized the scikit-learn machine learning library and the classification algorithms it provides.

We trained several machine learning models on the training set and evaluated their performance on both the training and testing sets, as shown in Table I. The model was validated via 5-fold cross-validation (CV). The cross-validation results showed how well the model had been trained on the training data for a given set of hyperparameters in the ML model.

After the selection of the dataset, four machine-learning models were selected as possible classifiers for this project. They are:
• Decision Tree
• Random Forest
• Multilayer Perceptron
• Logistic Regression

The metrics for evaluating our model include:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

Where:

• TP – True Positive
• TN – True Negative
• FP – False Positive
• FN – False Negative

TABLE I. MODEL COMPARISON

Model                  Test Accuracy  Test Precision  Test Recall  Test F1  Test AUC
Decision Tree          0.8626         0.8233          0.9553       0.8844   0.8546
Random Forest          0.8707         0.8180          0.9841       0.8934   0.9770
Logistic Regression    0.7745         0.7283          0.9417       0.8213   0.8865
MultiLayer Perceptron  0.8562         0.8131          0.9594       0.8802   0.9508
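As a worked check of the metric definitions, the sketch below computes accuracy, precision, recall, and F1 from confusion-matrix counts and then recomputes F1 from the precision and recall values reported in Table I. The counts in `tp, tn, fp, fn` are hypothetical numbers chosen for illustration, not results from this study; the Table I values are read directly from the table, and small differences in the fourth decimal place are to be expected because the reported precision and recall are already rounded.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of predicted attacks that were real attacks."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of real attacks that were detected."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical confusion-matrix counts, for illustration only.
tp, tn, fp, fn = 90, 70, 20, 10
p, r = precision(tp, fp), recall(tp, fn)
print(f"accuracy={accuracy(tp, tn, fp, fn):.4f} precision={p:.4f} "
      f"recall={r:.4f} f1={f1_score(p, r):.4f}")

# (precision, recall, reported F1) as listed in Table I.
table_1 = {
    "Decision Tree": (0.8233, 0.9553, 0.8844),
    "Random Forest": (0.8180, 0.9841, 0.8934),
    "Logistic Regression": (0.7283, 0.9417, 0.8213),
    "MultiLayer Perceptron": (0.8131, 0.9594, 0.8802),
}
for model, (prec, rec, reported) in table_1.items():
    print(f"{model}: recomputed F1={f1_score(prec, rec):.4f}, reported={reported}")
```

The recomputed F1 values agree with the reported column to within rounding, which supports reading the F1 column as the harmonic mean of the precision and recall columns.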

After the training and testing process using the scikit-learn library, the Random Forest model surpassed the other classification algorithms with the highest F1 score. We used the F1 score as the assessment metric on test data to demonstrate the trained model's performance.

The matplotlib library was used to aid in the visualization of the different classifier results. We depicted the diagnostic capabilities of the binary classifiers using the receiver operating characteristic curve, which is constructed by plotting the true positive rate against the false positive rate at various threshold levels. The graph in Fig. 4 depicts the different machine learning classifiers with their areas under the curve. The Random Forest classifier achieved the highest area under the curve; therefore, it was selected as the classification algorithm for the solution to be designed.

Fig. 4. Area Under the Receiver Operating Characteristic Curve

For the application development, the main modules are stated below with code snippets.

The network traffic is monitored using Tshark and dumped into a CSV file. TShark is a command line network packet analyzer used to capture packets from a live network and dump the traffic data into file.csv, which is later read by the backend Python script. The code snippet below illustrates the packet information the Tshark script will collect.

Apache backend – server.php. The IoT network consists of an ultrasonic sensor, an Arduino, and a NodeMCU, which sends data to the Apache server using a Wi-Fi module. This PHP script is used to receive distance data from the IoT node and save it in datastorage.txt. This is illustrated in the code snippet below.

Arduino UNO - For reading the inputs of the ultrasonic sensor and turning them into output as distance data. The board controls the inputs and outputs of the sensor devices that are connected to it via connecting wires. The code snippet below shows the UNO board controlling the input from the ultrasonic sensor.

Network Logs - The network logs module is used to show the IoT network's traffic in real time. Similar to the notification module, the web connection is set up and used to start the consumers.py loop that retrieves real-time traffic statistics from file.csv. This information is transmitted to the front end, where it is shown. The user can track IoT network traffic since it updates in real time.
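The capture step that produces file.csv might be assembled as in the sketch below. It is an assumption-laden illustration, not the project's script: the helper `build_tshark_command`, the interface name `wlan0`, and the choice of fields in `DEFAULT_FIELDS` are hypothetical, since the fields actually captured are not listed in the text. The TShark options used (`-i`, `-l`, `-T fields`, `-e`, `-E`) are standard TShark command-line options.

```python
import shlex
from typing import List, Sequence

# Standard Wireshark field names; which fields the project captured is not
# stated, so this selection is an assumption made for illustration.
DEFAULT_FIELDS = ("frame.time_epoch", "ip.src", "ip.dst", "ip.proto", "frame.len")


def build_tshark_command(
    interface: str, fields: Sequence[str] = DEFAULT_FIELDS
) -> List[str]:
    """Build a TShark command that prints the chosen fields as CSV lines."""
    command = ["tshark", "-i", interface, "-l", "-T", "fields"]
    for field in fields:
        command += ["-e", field]  # one -e option per column
    # Emit a header row, comma separators, and double-quoted values.
    command += ["-E", "header=y", "-E", "separator=,", "-E", "quote=d"]
    return command


command = build_tshark_command("wlan0")  # "wlan0" is a placeholder interface
print(shlex.join(command) + " > file.csv")
```

Building the command as a list avoids shell-quoting mistakes; the same list could be handed to `subprocess.Popen` with its standard output redirected to file.csv, so that a reader on the backend sees new rows as packets arrive.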

Notifications - The notification module notifies the user if an attack is detected in the IoT network. The React.js front end and the Django back end are connected by a web socket called attackNotif.ws. The real-time read loop is started by the web socket in consumers.py, which takes the data from file.csv, analyses it, and then continuously feeds it to the ML model.

[5] RESULTS

The web application is made up of different modules: the attack notification module and the network logs module, where the network traffic collected by the network packet analyzer is displayed. When an anomaly is detected in the IoT network by the machine learning classifier, the notification module displays a notification to alert the user. This is achieved by the use of web sockets to convey notification data from the backend to the front end. We noted that on real-time data the classifier can detect an anomaly and send an alert. It is also able to save the packet information on the detected anomaly.

Fig. 5. Notifications Page

The network packet analyzer, Tshark, collects network traffic data, which is passed to the network logs module to display the traffic in the IoT network in real time. The web socket is generated in a manner identical to that of the notifications module, allowing the user to see and keep track of the traffic transmitted in the IoT network as it is collected.

Fig. 6. Network Logs Page

All the objectives listed above were met, and the project artifact produced was a web-based application capable of detecting intrusions in an IoT network. The web-based application can give a classification of the detected attacks or anomalies. It does this by using the Random Forest classifier trained and tested using a dataset sourced from UNSW Canberra. The web application uses a Tshark script to analyze the network in real time and send the network logs. This system works on a Linux-based machine.
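The read loop described for consumers.py, which picks up new rows from file.csv, classifies them, and emits notifications, could be sketched as follows. This is a framework-free illustration under stated assumptions: `read_new_rows`, `build_alerts`, the column names in `FIELDS`, and the length-based rule are all hypothetical, and the rule merely stands in for the trained Random Forest classifier. The first line of the file is assumed to be a header.

```python
import csv
import json
import os
import tempfile
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, str]


def read_new_rows(path: str, offset: int, fieldnames: Sequence[str]) -> Tuple[List[Row], int]:
    """Return the complete CSV rows appended since `offset`, plus the new offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Consume only complete lines; a half-written last line waits for the next poll.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    rows = list(csv.DictReader(lines, fieldnames=list(fieldnames)))
    if offset == 0 and rows:
        rows = rows[1:]  # skip the header line on the first poll
    return rows, offset + end


def build_alerts(rows: List[Row], is_attack: Callable[[Row], bool]) -> List[str]:
    """Return one JSON notification per row that the classifier flags."""
    return [json.dumps({"type": "attack", "packet": row}) for row in rows if is_attack(row)]


FIELDS = ["ip.src", "ip.dst", "frame.len"]  # assumed column layout
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "file.csv")
    with open(path, "w", newline="") as out:
        out.write("ip.src,ip.dst,frame.len\n192.168.1.10,192.168.1.1,74\n")
    rows, offset = read_new_rows(path, 0, FIELDS)
    with open(path, "a", newline="") as out:  # simulate a newly captured packet
        out.write("192.168.1.66,192.168.1.1,9000\n")
    new_rows, offset = read_new_rows(path, offset, FIELDS)

# Stand-in rule, NOT the paper's trained classifier.
alerts = build_alerts(rows + new_rows, lambda row: int(row["frame.len"]) > 1500)
print(alerts)
```

Tracking a byte offset means each poll only parses rows that arrived since the previous one. In a Django Channels consumer, each JSON string could then be passed to the consumer's `send` method; the packet dictionary inside the alert corresponds to the saved packet information mentioned above.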
[6] BUSINESS BENEFITS

The incorporation of IoT technology in everyday life means such a security system will be extremely beneficial. The ability to detect anomalies and attacks in real time will help businesses save time and financial resources in responding to incidents they did not anticipate.

Users can also use the information they gather from the detected anomalies to implement more effective controls in their IoT environments. The detected anomalies will show any weaknesses in the IoT environment. Effectively, users can take a leading role in dealing with the weaknesses rather than responding late when the network is already down.

IoT devices collect huge amounts of data. Data privacy is paramount in today's world. Such technology will aid in data protection and privacy. This in turn will save a business from losing confidential data that its IoT environment collects and exchanges.

[7] CONCLUSION

The intended solution was successful in achieving the predetermined goals. The use of an Arduino Uno board, a NodeMCU module, an ultrasonic sensor, and temperature and humidity sensors allowed for the successful simulation of an IoT ecosystem in the development of an IoT testbed. Using the scikit-learn package, machine learning classifiers were trained and evaluated, and Random Forest performed best, with an accuracy of 87%. Random Forest was therefore selected as the classifier to identify intrusions and notify the user when an anomaly has been discovered in the simulated IoT network environment, and a web application with the classifier functioning in the backend was successfully constructed.

Recommendations

For future work, researchers can look at catering to more attack vectors, since new and deadly attacks against IoT environments are evolving with time. A prevention module can be added to make this an active IDS that detects an anomaly and then takes measures to prevent it from causing damage. Currently, the system is web-based; however, to cater to everyone, a mobile application for Android and iOS can also be produced.

REFERENCES

[1] M. F. Elrawy, A. I. Awad, and H. F. A. Hamed, “Intrusion detection systems for IoT-based smart environments: a survey,” Journal of Cloud Computing, vol. 7, no. 1. Springer Verlag, Dec. 01, 2018. doi: 10.1186/s13677-018-0123-6.
[2] K. V. V. N. L. Sai Kiran, R. N. K. Devisetty, N. P. Kalyan, K. Mukundini, and R. Karthi, “Building a Intrusion Detection System for IoT Environment using Machine Learning Techniques,” in Procedia Computer Science, 2020, vol. 171, pp. 2372–2379. doi: 10.1016/j.procs.2020.04.257.
[3] M. M. Patel and A. Aggarwal, “Security attacks in wireless sensor networks: A survey,” in 2013 International Conference on Intelligent Systems and Signal Processing (ISSP), 2013, pp. 329–333. doi: 10.1109/ISSP.2013.6526929.
[4] J. Asharf, N. Moustafa, H. Khurshid, E. Debie, W. Haider, and A. Wahab, “A review of intrusion detection systems using machine and deep learning in internet of things: Challenges, solutions and future directions,” Electronics (Switzerland), vol. 9, no. 7. MDPI AG, Jul. 01, 2020. doi: 10.3390/electronics9071177.
[5] E. P. Nugroho, T. Djatna, I. S. Sitanggang, A. Buono, and I. Hermadi, “A Review of Intrusion Detection System in IoT with Machine Learning Approach: Current and Future Research,” in 2020 6th International Conference on Science in Information Technology: Embracing Industry 4.0: Towards Innovation in Disaster Management, ICSITech 2020, Oct. 2020, pp. 138–143. doi: 10.1109/ICSITech49800.2020.9392075.
[6] A. Khraisat and A. Alazab, “A critical review of intrusion detection systems in the internet of things: techniques, deployment strategy, validation strategy, attacks, public datasets and challenges”, doi: 10.1186/s42400-021-00077-7.
[7] S. T. Bakhsh, S. Alghamdi, R. A. Alsemmeari, and S. R. Hassan, “An adaptive intrusion detection and prevention system for Internet of Things,” Int J Distrib Sens Netw, vol. 15, no. 11, Nov. 2019, doi: 10.1177/1550147719888109.
[8] T. A. Mohamed, T. Otsuka, and T. Ito, “Towards machine learning based IoT intrusion detection service,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2018, vol. 10868 LNAI, pp. 580–585. doi: 10.1007/978-3-319-92058-0_56.
[9] E. Nwafor, A. Campbell, and G. Bloom, “Anomaly-based Intrusion Detection of IoT Device Sensor Data using Provenance Graphs.”
[10] … 2015 IEEE 34th International Performance Computing and Communications Conference (IPCCC). IEEE.
[11] J. vom Brocke, A. Hevner, and A. Maedche, “Introduction to Design Science Research,” 2020, pp. 1–13. doi: 10.1007/978-3-030-46781-4_1.
[12] I. Sommerville, Software engineering.
[13] N. Moustafa and J. Slay, “UNSW-NB15: A Comprehensive Data Set for Network Intrusion Detection Systems (UNSW-NB15 Network Data Set).” [Online]. Available: https://cve.mitre.org/
