2021 6th International Conference for Convergence in Technology (I2CT)

Pune, India. Apr 02-04, 2021

Real Time Intrusion Detection System For IoT Networks

Rhishabh Hattarki, Shruti Houji, Manisha Dhage
Computer Department, Sinhgad College of Engineering, Pune, India

978-1-7281-8876-8/21/$31.00 ©2021 IEEE | DOI: 10.1109/I2CT51068.2021.9417815

Abstract: The proliferation of IoT devices has piqued the interest of several adversaries looking for a different means to gain unauthorized access to systems or for other illicit reasons. As a result, protecting these devices is essential. An Intrusion Detection System (IDS) acts as a second line of defense after the firewall and can be beneficial in IoT networks. This paper presents a Real Time Intrusion Detection System based on the Machine Learning model Random Forest, set up for an IoT node consisting of an Arduino, a NodeMCU and an ultrasonic sensor. Unlike most systems, which train and test the model only on data from the dataset, this system has been tested with real time network traffic. The dataset used is self made, created by monitoring the network traffic of our IoT network, rather than one of the usual popular datasets that are not IoT specific.

Keywords: Intrusion detection, Internet of Things, Network security, Machine learning, Real Time IDS

I. INTRODUCTION

With the Internet of Things and wireless networks becoming mainstream and ubiquitous, the major concern in IoT systems is handling the security of IoT devices and protecting data from attacks. Protection from these attacks is a challenge due to the heterogeneity of devices and protocols, the direct exposure of devices to the internet and the resource constraints on devices. Security solutions that can provide real time attack detection and mitigation are in demand. The goal of our system is to address this security gap by implementing an IDS to provide a comprehensive security solution for IoT networks.

An Intrusion Detection System is a proactive tool used to detect and classify intrusions, attacks or violations of the security policies automatically at network level, host level or hybrid infrastructure in a timely manner. In order to defend against cyber attacks in IoT, our paper presents a real time IDS for IoT networks using the random forest algorithm to identify 5 different types of attacks: Wrong Setup, Distributed Denial of Service, Data Type Probing, Scan Attack and Man in the Middle.

II. RELATED WORK

A. Literature Survey

Valerio Morfino and Salvatore Rampone have presented a random forest based solution that focuses on the popular SYN DoS attack [12]. They used Apache Spark, which is efficient in handling big data, to create a real time system. However, the performance is measured based on accuracy and not on other important factors like false alarm rates, which tend to be higher in systems like these.

Shiver Chawla and Geethapriya Thamilarasu proposed an independent integrated intrusion detection system which provided security as a service and was integrated into any network [9]. The proposed system was also peer-to-peer and consolidated. Machine learning algorithms were used to detect anomalous network behavior in real time, but higher accuracy was needed for attack detection, and the system did not categorize attacks into different types.

Nilam Upasani and Hari Om came up with a modified neuro-fuzzy classifier and implemented it on a modern GPU [7]. That helped them achieve a considerable speedup in the training, classification and recognition phases. They had a low false positive rate and recall time; however, the dataset used, KDD Cup'99, was not IoT specific, and GPUs are not suitable for most IoT devices.

Muder Almiani, Alia AbuGhazleh, Amer Al-Rahayfeh and Saleh Atiewi proposed a fog computing based deep recurrent neural network for an IoT intrusion detection system [5]. It is trained and tested using the NSL-KDD dataset. It adapted a recurrent neural network trained by an advanced version of the backpropagation algorithm, where each network was adaptively tuned to different parameters to enhance detection of specific attack types. Although this model showed a high sensitivity to DoS attacks, it proved insufficient to detect other types of attacks.

Felipe de Almeida Florencio et al. created an IDS using a Multilayer Perceptron neural network and tested it on an Arduino [3]. They did test the model on a low powered device; however, the dataset used was NSL-KDD, which is not specific to IoT devices. Also, the tests were done on the dataset and not on real time network traffic.

Muhammad Ashfaq Khan, Md. Rezaul Karim and Yangwoo Kim proposed a scalable and hybrid intrusion detection system based on the Convolutional-LSTM network, which acted as a misuse detection module for both local and global latent threat signatures, and Spark ML, which acted as an anomaly detection module, using the ISCX-UNB dataset [6]. ML classification algorithms such as DT, RF, GBT and SVM were used for detecting attacks, which showed better accuracy and less computational complexity. However, attacks were not detected on real-time streaming data.

Tariq Ahamad Ahanger, Usman Tariq and Muneer Nusir have presented a real time system using edge computing, where the processing is transferred to the edge of the network, i.e. to one server/computer of the network [11]. They show lower RAM and CPU consumption than Snort and BroIDS, which is expected since the project is being compared to full blown production systems.

Vinayakumar R, Mamoun Alazab, Soman KP, Prabaharan Poornachandran, Ameer Al-Nemrat, and Sitalakshmi proposed a hybrid based scalable framework

employed using a deep learning model with a DNN, which was chosen by comparing its performance with that of classical machine learning classifiers [13]. The DNN model performed well on KDDCup 99 and was applied to other datasets like NSL-KDD, UNSW-NB15, Kyoto, WSN-DS and CICIDS 2017 to conduct the benchmark. But this hybrid model is not IoT specific and does not give detailed information on the structure and characteristics of malware. Overall, the performance can be further improved by training complex DNN architectures.

K.V.V.N.L Sai Kiran, R.N. Kamakshi Deviesetty, N. Pavan Kalyan, K. Mukundini and R. Karthi proposed an IDS for an IoT network consisting of a Node MCU ESP8266, a DHT11 sensor and a wireless router [4]. In the normal phase, sensor values captured by the Node MCU are transmitted to the ThingSpeak server using a wireless gateway, and in the attacking phase, a Man in the Middle attack is performed in the network using ARP poisoning. Naïve Bayes, SVM, decision tree and Adaboost classifiers are used to categorize the data. The performance of the system could be further enhanced by creating the dataset and detecting the attacks in real time.

One of the few IoT based IDSs that have used raw traffic data is the real time system from Christian Callegari, Elena Bucchianeri, Stefano Giordano and Michele Pagano [2]. The dataset used is the MAWI traffic traces. Since a raw traffic dataset is used, the system can be optimised in preprocessing as well. However, even here the testing has been done on the dataset and not on real time network traffic.

B. Survey Conclusion

Most of the IDSs have been built using popular datasets like NSL-KDD and KDD Cup'99. But these datasets are not IoT specific and can't directly be used for building IoT systems. Many of the systems focus on only one particular attack and do not cover other attacks. They are tested offline without real time traffic data. These drawbacks are overcome with our system.

III. METHODOLOGY

A. System Overview

The system passively monitors the data exchange in the network and looks for malicious activities that can be classified as an intrusion or attack. It then notifies the user on the frontend through the web sockets based real time notification system. The code for this project is available in our GitHub repository [8].

B. Architecture Diagram

The system consists of an IoT node (Arduino UNO, Node MCU and ultrasonic sensor connected together), which sends distance data to the backend on the Apache server. This data is displayed on the GUI of the React.js front end. The network traffic is monitored by the TShark script and dumped in file.csv. The Django Python backend reads this data in real time, feeds it to the ML model based on Random Forest and classifies it either as normal or attack. This data is sent to the front end and displayed in the form of a notification.

Fig. 1. System Overview - Architecture Diagram.

C. Module-wise details

1) IoT Node: The IoT node consists of 3 devices: Arduino UNO, Node MCU and an ultrasonic sensor.

a) Ultrasonic sensor: Its end goal is to measure distance. It emits ultrasonic waves and receives their echoes.

b) Arduino UNO: This controls the ultrasonic sensor and measures the time between sending and receiving the waves, which is used to calculate the distance from the object using the formula

$$ \text{distance} = \frac{\text{duration} \times 0.034}{2} \tag{1} $$

Here 0.034 corresponds to the speed of sound in cm/µs (about 340 m/s), so with the duration in microseconds the distance is in centimeters; the division by 2 accounts for the round trip of the wave.

c) Node MCU: The distance data is sent from the Arduino to the Node MCU using serial communication. It is then sent to the php script on the Apache server by connecting to the wifi network.

2) TShark Script: TShark is a command line network packet analyzer that is used to capture packets from a live network and dump the traffic data into file.csv, which is later read by the backend Python script. It captures the fields frame number, frame time, frame length, source mac address, destination mac address, source IP address, destination IP address, IP protocol, IP length, TCP length, TCP source port, TCP destination port and packet info.

3) Apache Backend: The Apache backend has 1 php script, server.php, and 1 text file, datastorage.txt.

a) server.php: This php script is used to receive distance data from the IoT node and save it in datastorage.txt.

b) datastorage.txt: This text file stores the 10 most recent distance values sent from server.php. The older values are deleted by server.php. This data is used by the front end to display graphics in Dashboard.js.

4) ML Model:

a) Dataset: The dataset needed to create the ML model was made using traffic data in our IoT network retrieved from the TShark script. Multiple attacks (Wrong Setup, DDoS, Data Type Probing, Scan Attack, MITM) were performed while monitoring the traffic. The records were labeled according to known parameters used to identify a particular type of attack.
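The step in which the backend reads the TShark dump can be illustrated with a minimal sketch. It assumes that file.csv carries a header row with the TShark field names listed in Table I and that text fields are quoted (the frame time contains a comma); the paper does not state the exact capture options, so the layout, the `parse_capture` helper and the sample row are assumptions for illustration, not the authors' code.

```python
import csv
import io

# Field names follow Table I; the sample row reuses its example values.
FIELDS = [
    "frame.number", "frame.time", "frame.len", "eth.src", "eth.dst",
    "ip.src", "ip.dst", "ip.proto", "ip.len", "tcp.len",
    "tcp.srcport", "tcp.dstport", "_ws.col.Info",
]

# Hypothetical excerpt of file.csv; frame.time contains a comma, so it is quoted.
SAMPLE_CSV = (
    ",".join(FIELDS) + "\n"
    '31,"Apr 11, 2020 18:57:12.387597272 IST",241,97:21:ea:d4:cc:2a,'
    "61:fe:8b:ac:ef:31,192.168.0.106,129.146.49.110,6,227,175,51246,443,"
    "Application Data\n"
)

# Columns that hold integers in Table I.
NUMERIC = ("frame.number", "frame.len", "ip.proto", "ip.len",
           "tcp.len", "tcp.srcport", "tcp.dstport")


def parse_capture(text):
    """Parse CSV text into a list of dicts, converting numeric fields to int."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        for name in NUMERIC:
            # Non-TCP packets may leave TCP fields empty; store None in that case.
            row[name] = int(row[name]) if row[name] else None
        rows.append(row)
    return rows


packets = parse_capture(SAMPLE_CSV)
print(packets[0]["ip.src"], packets[0]["tcp.dstport"])
```

In a live setting the same function could be applied to newly appended lines of file.csv before the rows are encoded and passed to the classifier; how the original consumers.py does this is not described in the paper.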

TABLE I. DATASET - IOT TRAFFIC
Feature | Description | Example
frame.number | serial number of frame | 31
frame.time | receiving time of frame | Apr 11, 2020 18:57:12.387597272 IST
frame.len | frame length | 241
eth.src | source mac address | 97:21:ea:d4:cc:2a
eth.dst | destination mac address | 61:fe:8b:ac:ef:31
ip.src | source ip address | 192.168.0.106
ip.dst | destination ip address | 129.146.49.110
ip.proto | ip protocol | 6
ip.len | ip length | 227
tcp.len | tcp length | 175
tcp.srcport | tcp port of source | 51246
tcp.dstport | tcp port of destination | 443
_ws.col.Info | summary of information | Application Data

b) Model Creation: The generated dataset was filtered for erroneous data in dataFilteration.py. Resampling was done in resampling.py to deal with the issue of fewer attack data points than normal data. The dataset was then divided into training and testing datasets, and a Random Forest classifier was trained on the training data in model.py. This was tested using the test data and later optimized. This model has been used in consumers.py in the Django backend for real time attack classification.

c) Attack Distribution: A total of 5 attacks were performed during the training and testing phase. The following table describes these attacks along with their frequency.

TABLE II. ATTACK DISTRIBUTION
Id | Attack | Description | Frequency | Oversampled Frequency
0 | Normal | Normal traffic | 79035 | 79035
1 | Wrong Setup | IoT Node wrongly setup | 7691 | 82285
2 | Distributed Denial of Service | Multiple malicious devices blocking the services to deny the legitimate user | 16596 | 79020
3 | Data Type Probing | Sending data with the wrong data type (String instead of Integer) | 209 | 79002
4 | Scan Attack | Reconnaissance of the open system ports before the actual attack | 21612 | 79052
5 | Man in the Middle | Unknowingly intercepting traffic between two nodes | 15 | 79032

5) Notifications: The notification module is responsible for displaying a notification to the user whenever an attack has been detected in the IoT network. A web socket (attackNotif ws) is created between the React.js front end (FE) and the Django backend (BE) to send notification data from BE to FE whenever an attack is detected. The web socket starts the real time read loop in consumers.py, which takes the data from file.csv, processes it and continuously feeds it to the ML model. The model classifies the traffic (normal or attack) in real time and sends a notification if an attack is detected. This notification is displayed in the front end.

6) Network Logs: The network logs module is used to display the traffic in the IoT network in real time. Similar to the notifications module, a web socket is created which initializes the loop in consumers.py that fetches traffic data from file.csv in real time. This data is sent to the FE where it is displayed. It updates in real time so the user can monitor IoT network traffic.

7) Dashboard: The dashboard module displays the distance data received from the IoT node with the help of graphics, thereby simulating the applications of the IoT network. A web socket is created between the FE and the php script websockets.php. An initiation message is sent which triggers the loop in the php script, which reads the data from datastorage.txt and continuously sends this data to Dashboard.js, where the node activity is shown.

8) Visualisations: The visualisations module showcases the data exploration, preprocessing and comparisons in results from the dataset and ML model. It has the visuals ML model comparison, types of attack, types of protocol, correlation matrix, confusion matrix and effects of oversampling. The results were obtained in the python files dataFilteration.py, resampling.py, modelCompare.py and dataVisual.py and displayed in the visualisations module.

D. Algorithm used

Random Forest is a supervised machine learning algorithm that can be used for classification as well as regression [1]. As the name suggests, it is a robust forest made up of many decision trees, in which the process of finding the root node and splitting the feature nodes takes place randomly. It has many advantages compared to other ML classifiers: it mitigates the problem of overfitting, it can handle missing values, it can be modelled for categorical values, it possesses very high accuracy and it is flexible to use in real time. Because of these advantages, we used a random forest classification model on the training set with 20 trees in the forest, Gini impurity as the split criterion, a random state of 3 and a maximum tree depth of 3. The test set results are then predicted using the classifier object. This classifier object is saved as a pickle (.pkl) file, which is then imported in consumers.py to classify the data in real time.

E. Performance Metrics

The metrics used for gauging the performance of our system are as follows [10].

1) Confusion Matrix: This can be used to understand the correctness and accuracy of the model. All the following performance measures are calculated on the basis of the confusion matrix. It has the actual values on the X-axis and the predicted values on the Y-axis. The following table shows the confusion matrix that we used with 6 classes (normal + 5 attacks). The class ids are the encoding from Table II (Attack Distribution).

TABLE III. CONFUSION MATRIX - FOR 3 (DTP ATTACK)
Predicted \ Actual | 0 | 1 | 2 | 3 | 4 | 5
0 | TN | TN | TN | FN | TN | TN
1 | TN | TN | TN | FN | TN | TN
2 | TN | TN | TN | FN | TN | TN
3 | FP | FP | FP | TP | FP | FP
4 | TN | TN | TN | FN | TN | TN
5 | TN | TN | TN | FN | TN | TN

a) TP: True Positives are the packets that were actually the attack, here the Data Type Probing attack {3}, and were predicted as that attack {3} as well.

b) TN: True Negatives are the packets that were actually not the attack {0,1,2,4,5} and were predicted as any other attack {1,2,4,5} or as normal {0}.

c) FP: False Positives are the packets that were not the attack {0,1,2,4,5} but were predicted as that attack {3}.

d) FN: False Negatives are the packets that were actually the attack {3} but were predicted as not the attack {0,1,2,4,5}.

2) Accuracy: is the fraction of correctly predicted values, either attack or normal, out of the total. For multiclass classification the accuracy of the model is the average per class accuracy over the k classes. Accuracy alone can be deceiving, since a highly accurate model can have low recall and overall would be a bad model.

$$ \text{Accuracy (model)} = \frac{1}{k} \sum_{i=1}^{k} \frac{TP_i + TN_i}{TP_i + TN_i + FP_i + FN_i} \tag{2} $$

3) Precision: is the fraction of packets correctly identified as the attack out of all the packets predicted as the attack. This is calculated on a per class basis.

$$ \text{Precision (class)} = \frac{TP}{TP + FP} \tag{3} $$

4) Recall: is the fraction of packets correctly identified as the attack out of all the actual attacks. This is also calculated on a per class basis. This metric is important here, since a low recall would mean missing the identification of potential attacks.

$$ \text{Recall (class)} = \frac{TP}{TP + FN} \tag{4} $$

5) F1-score: is the harmonic mean, rather than the arithmetic mean, of the precision and recall. It gives a combined score that gives importance to recall as well as precision.

$$ \text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5} $$

IV. RESULTS

A. Confusion Matrix

TABLE IV. CONFUSION MATRIX
Predicted \ Actual | 0 | 1 | 2 | 3 | 4 | 5
0 | 23309 | 105 | 0 | 34 | 53 | 62
1 | 0 | 24815 | 0 | 0 | 0 | 0
2 | 0 | 0 | 23576 | 0 | 0 | 0
3 | 0 | 0 | 0 | 23675 | 0 | 0
4 | 0 | 0 | 0 | 0 | 23530 | 0
5 | 0 | 0 | 0 | 0 | 0 | 23947

B. Classification Report

The classification report combines the performance metrics precision, recall, F1-score and accuracy in one table. The report shows the performance of the model before and after oversampling. The measures with 'b' after them are the ones that were obtained before oversampling.

TABLE V. CLASSIFICATION REPORT - DATASET TESTING
Id | Precision b | Precision | Recall b | Recall | F1-score b | F1-score
0 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 | 0.99
1 | 0.85 | 0.99 | 1.00 | 1.00 | 0.92 | 1.00
2 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00
3 | 0.00 | 1.00 | 0.00 | 0.99 | 0.00 | 1.00
4 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00
5 | 0.00 | 1.00 | 0.00 | 1.00 | 0.00 | 1.00
Accuracy b | 0.9892138063279002
Accuracy | 0.9973748149803111

Although the accuracy remains similar, there is a drastic improvement in precision, recall and F1-score for attacks 3 (Data Type Probing) and 5 (Man in the Middle), and to some extent for 1 (Wrong Setup). The classification report in Table V shows the results of the tests conducted on the split dataset, whereas the classification report in Table VI shows the results of tests conducted on real time traffic. The actual values and predicted values were fetched while performing all the attacks one after the other and passed to the classification report function.

TABLE VI. CLASSIFICATION REPORT - REAL TIME TESTING
Id | Precision | Recall | F1-score | Support
0 | 1.00 | 0.91 | 0.95 | 53769
1 | 0.01 | 1.00 | 0.02 | 27
2 | 1.00 | 1.00 | 1.00 | 1224
3 | 0.97 | 1.00 | 0.98 | 700
4 | 0.00 | 1.00 | 0.01 | 5
5 | 0.00 | 0.00 | 0.00 | 0
Accuracy | 0.9117810677433826

It can be observed that the accuracy remained fairly high at 91.18%, but the real picture of the performance of the model can be obtained by looking at the other metrics. The F1-scores for types 0 (normal), 2 (DDoS) and 3 (Data Type Probing) are very high, showing good overall performance; however, the other types did poorly on the F1-score. The recall on 1 (Wrong Setup) and 4 (Scan Attack) is very high, which shows that all the packets that were actually attacks were correctly classified, but since the precision is low, the predicted attacks were far more numerous than the actual attacks, i.e. the model had a high false positive rate.

V. CONCLUSION

An Intrusion Detection System has been developed for IoT networks that can be used in real time to improve security. Using this system, the user can fetch their IoT network data, monitor network traffic and get notified when an intrusion is detected in the network. This has been achieved using the ML model Random Forest. The reduced latency of the system, which makes it real time, is achieved by web sockets. The model suffers from a reduced precision rate for some attacks like Scan attack and MITM, leading to a high rate of false alarms, which is common with these types of models. However, it shows a high level of accuracy, 91.18%, in real time testing and correctly classifies most of the attacks.
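The per-class metrics defined under Performance Metrics (Eqs. 2 to 5) can be computed directly from a multiclass confusion matrix. The following is a minimal sketch under stated assumptions: the matrix is indexed as cm[predicted][actual], matching the row and column labels of Tables III and IV, a zero denominator is reported as 0.0, and the 3-class matrix `toy` is made-up illustration data rather than a result from the paper. The function names are hypothetical.

```python
def per_class_metrics(cm, cls):
    """Return (precision, recall, f1, accuracy) for one class.

    cm[p][a] counts packets predicted as class p whose actual class is a
    (rows = predicted, columns = actual).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[cls][cls]
    fp = sum(cm[cls][a] for a in range(n)) - tp  # predicted cls, actually another class
    fn = sum(cm[p][cls] for p in range(n)) - tp  # actually cls, predicted as another class
    tn = total - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0  # Eq. (3)
    recall = tp / (tp + fn) if tp + fn else 0.0     # Eq. (4)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)           # Eq. (5)
    accuracy = (tp + tn) / total
    return precision, recall, f1, accuracy


def model_accuracy(cm):
    """Average of the per-class accuracies over the k classes, as in Eq. (2)."""
    k = len(cm)
    return sum(per_class_metrics(cm, c)[3] for c in range(k)) / k


# Hypothetical 3-class matrix (not data from the paper).
toy = [
    [8, 1, 0],
    [2, 5, 0],
    [0, 0, 4],
]
p, r, f1, acc = per_class_metrics(toy, 0)
print(round(p, 3), round(r, 3), round(f1, 3), round(acc, 3))
print(round(model_accuracy(toy), 3))
```

A class with no predicted or no actual samples yields 0.0 here instead of raising a division error, which is one possible convention for rows such as class 5 in Table VI, where the support is 0.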

VI. FUTURE WORK

More work can be done in the future to reduce the false alarm rates and make the model more robust. The system can be enhanced by adding more protocols and attacks, which will help in covering more types of IoT devices. New types of attacks can be added while training to make the model more extensive. The system can be tested with a higher number of nodes in the IoT network.

ACKNOWLEDGMENT

We thank Prof. M.R Dhage for her expert guidance and continuous encouragement throughout to see that adequate research had been conducted to approve this project. We would also like to thank our teammates, Sahil Dixit and Sanika Patil, who worked alongside us to finish the project within the required time frame.

REFERENCES

[1] G. Biau, "Analysis of a Random Forests Model," Journal of Machine Learning Research, pp. 1063-1095, 2012.
[2] C. Callegari, E. Bucchianeri, S. Giordano and M. Pagano, "Real Time Attack Detection with Deep Learning," 2019 16th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), Boston, MA, USA, pp. 1-5, 2019.
[3] F. de Almeida Florencio, E. D. Moreno, H. Teixeira Macedo, R. J. P. de Britto Salgueiro, F. Barreto do Nascimento and F. A. Oliveira Santos, "Intrusion Detection via MLP Neural Network Using an Arduino Embedded System," 2018 VIII Brazilian Symposium on Computing Systems Engineering (SBESC), Salvador, Brazil, pp. 190-195, 2018.
[4] K.V.V.N.L Sai Kiran, R.N. Kamakshi Deviesetty, N. Pavan Kalyan, K. Mukundini, and R. Karthi, "Building a Intrusion Detection System for IoT Environment using Machine Learning Techniques," Elsevier B.V., pp. 2372-2379, 2020.
[5] M. Almiani, A. AbuGhazleh, A. Al-Rahayfeh, S. Atiewi and A. Razaque, "Deep recurrent neural network for IoT intrusion detection system," Elsevier B.V., vol. 101, November, pp. 102031, 2019.
[6] M. Ashfaq Khan, Md. R. Karim, and Y. Kim, "A Scalable and Hybrid Intrusion Detection System Based on the Convolutional-LSTM Network," Symmetry, pp. 583, April, 2019.
[7] N. Upasani, H. Om, "A modified neuro-fuzzy classifier and its parallel implementation on modern GPUs for real time intrusion detection," Applied Soft Computing, vol. 82, Elsevier B.V., June 2019.
[8] R. Hattarki, S. Houji, S. Dixit, S. Patil, "Real Time Intrusion Detection System for IoT Networks using Random Forest," GitHub Repository, 2020, https://github.com/s3r-be/be-project
[9] S. Chawla and G. Thamilarasu, "Security as a Service: Real-time Intrusion Detection in Internet of Things," ACM ISB, April, pp. 2-4, 2018.
[10] S. Mohammed, "Performance Metrics for Classification Problems in Machine Learning," Feb. 2019, medium.com/@MohammedS/performance-metrics-for-classification-problems-in-machine-learning-part-i-b085d432082b.
[11] T. A. Ahanger, U. Tariq and M. Nusir, "Real-Time Methodology for Improving Cyber Security in Internet of Things Using Edge Computing During Attack Threats," 2019 International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, pp. 293-297, 2019.
[12] V. Morfino, S. Rampone, "Towards Near-Real-Time Intrusion Detection for IoT Devices Using Supervised Learning and Apache Spark," Electronics, vol. 9, pp. 444, March 2020.
[13] Vinayakumar R, M. Alazab, Soman KP, P. Poornachandran, A. Al-Nemrat, et al., "Deep Learning Approach for Intelligent Intrusion Detection System," IEEE Access, unpublished, pp. 2169-3536, 2018.
