
Wireless Personal Communications (2020) 111:2287–2310

https://doi.org/10.1007/s11277-019-06986-8

Machine Learning Based Intrusion Detection Systems for IoT Applications

Abhishek Verma1 · Virender Ranga1

1 Department of Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra, India

Published online: 30 November 2019


© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
Internet of Things (IoT) and its applications are among the most popular research areas at present. The characteristics of IoT make it easily applicable to real-life applications on one side, whereas on the other side they expose it to cyber threats. Denial of Service (DoS) is one of the most catastrophic attacks against IoT. In this paper, we investigate the prospects of using machine learning classification algorithms for securing IoT against DoS attacks. A comprehensive study is carried out on the classifiers which can advance the development of anomaly-based intrusion detection systems (IDSs). Performance assessment of the classifiers is done in terms of prominent metrics and validation methods. The popular datasets CIDDS-001, UNSW-NB15, and NSL-KDD are used for benchmarking the classifiers. Friedman and Nemenyi tests are employed to statistically analyze the significant differences among the classifiers. In addition, a Raspberry Pi is used to evaluate the response time of the classifiers on IoT-specific hardware. We also discuss a methodology for selecting the best classifier as per application requirements. The main goals of this study are to motivate IoT security researchers to develop IDSs using ensemble learning, and to suggest appropriate methods for statistical assessment of classifier performance.

Keywords Internet of Things · Denial of service · Intrusion detection · Anomaly · Significance test · Performance analysis

1 Introduction

Security and privacy aspects of the Internet of Things (IoT) [5, 7, 45, 64, 65] are the key factors that drive its potential to become one of the globally adopted technologies of the future [34, 58]. However, the self-configuring and open nature of IoT makes it vulnerable to various insider and outsider attackers [37, 49]. Attackers may compromise users' security and privacy in order to gain access to their personal information, cause monetary
losses, or eavesdrop [72]. These factors hinder the global adoption of IoT and consequently slow down its growth [30]. Denial of service (DoS) is one of the most catastrophic attacks, as it prevents legitimate users from accessing the services they have paid for [17, 55]. This violates Service Level Agreement (SLA) terms, which leads to huge monetary losses for firms and organizations. Moreover, DoS also affects the services of small networks, e.g., smart home, healthcare, and smart agriculture [43]. DoS attacks on critical smart applications such as healthcare may even lead to life-threatening situations because normal services are delayed [48]. IoT devices (e.g., smart light bulbs, smart door locks, smart televisions) are easy targets for attackers, who exploit their vulnerabilities in order to perform DoS attacks [20, 50, 56, 59, 68]. Thus, securing these devices is one of the important concerns for researchers nowadays [6, 71]. To address this issue, intrusion detection is being heavily researched worldwide [9, 69]. Intrusion detection systems (IDSs) are categorized into three classes based on the detection method, i.e., signature, anomaly, and specification. Among the three IDS types, our focus is primarily on anomaly-based IDS [32]. A signature-based IDS
matches network traffic patterns with the attack patterns (signatures) already stored in its
database. In case a match is found, an alarm is raised. A signature-based IDS has high
accuracy and low false alarm rate, however, it is incapable of detecting new attacks. A
specification-based network IDS matches traffic behavior (parameters) against a predefined
set of rules and values (specifications) for detecting malicious activities. These specifica-
tions are manually specified by a security expert. In contrast to signature and specification
based IDS, anomaly-based IDS continuously checks network traffic for any deviation from
normal network profile. In case a deviation exceeds the threshold, an alarm is raised to sig-
nify attack detection. The normal network profile is learned using machine learning (ML)
algorithms. Anomaly-based IDSs are preferred over signature- and specification-based IDSs because of their ability to detect new attacks, but this comes at the cost of a high false alarm rate. The effectiveness of an anomaly-based IDS depends on the quality of its detection engine (model or classifier), which in turn depends on the quality of the network traffic patterns (dataset instances) used for the engine's training. Once the detection engine has been
trained, it can detect new attacks effectively. Intrusion detection in IoT networks is characterized as a binary classification problem in which a trained classifier aims to classify network traffic into the normal or attack class with maximum accuracy and a minimum false alarm rate (FAR). The performance of the classifier in terms of accuracy and FAR depends largely on the choice of classification algorithm and training data, so security experts prefer well-performing classifiers for the task of intrusion detection. Many solutions for intrusion detection have been proposed in the literature [8, 14, 18, 32, 44, 47], but most of them are dedicated to traditional networking paradigms only, and comparatively little work has been done towards the development of ML-based intrusion detection for IoT applications. Moreover, we did not find any work in the literature that statistically analyzed the significance of classifier performance for IoT-based intrusion detection. Also, no work in the literature has evaluated the execution of classifiers on IoT hardware. Thus, our focus is on utilizing ML classification algorithms for building IDSs in order to secure IoT against DoS attacks.
In this work, we carry out a performance assessment of ML classifiers for IDSs specific to IoT. The performance of single classifiers, namely CART and MLP, and classifier ensembles, namely Random forest (RF), AdaBoost (AB), Extreme gradient boosting (XGB), Gradient boosted machine (GBM), and Extremely randomized trees (ETC), is measured in terms of prominent metrics, i.e., accuracy, specificity, sensitivity, false positive rate, and area under the receiver operating characteristic curve (AUC). Hyper-tuning (finding the set of optimal parameters) of all the classifiers is done using random search [10]. The significant differences among the classifiers are statistically assessed using well-known statistical tests. Finally, we have tested the performance of the classifiers in terms of average response time on a Raspberry Pi, i.e., an IoT device [70].
Our primary contributions can be summarized as follows.

• Performance assessment of different ML classifiers on the CIDDS-001, UNSW-NB15, and NSL-KDD datasets with repeated hold-out and repeated k-fold cross-validation methods is done.
• Statistical assessment of the performance results using the widely used Friedman test (a non-parametric statistical test) and the Nemenyi post-hoc test is done, i.e., the Friedman test for classifier significance and the Nemenyi test for pairwise comparison among classifiers.
• Implementation and execution of the classifiers on Raspberry Pi hardware is carried out for realizing actual response times on real IoT hardware.

The paper organization is as follows. Section 2 discusses recent works in the concerned
domain. Section 3 provides a brief discussion on classification algorithms, i.e., single clas-
sifiers, and ensembles. Experimental design is discussed in Sect. 4. Discussion related to
the classifier’s performance and statistical tests is done in Sect. 5. Section 6 concludes the
paper.

2 Related Work

A few works in the literature suggest methods for defending IoT against DoS attacks. For example, Misra et al. [46] proposed a specification-based IDS based on Learning Automata for preventing distributed DoS attacks against IoT. The authors considered protecting the IoT middleware layer rather than a particular device. The proposed security system sets a threshold for the number of requests the middleware layer can service. As
soon as the number of incoming requests exceeds the set threshold, an attack is detected.
Kasinathan et al. [39] proposed a signature-based IDS framework for detecting DoS attacks in IoT. The proposed framework consists of monitoring and detection modules. These modules are integrated with the network framework of the European Union (EU) FP7 project 'ebbits' for securing the network against DoS attacks. A DoS protection manager and an IDS are integrated with the 'ebbits' network. A network-based IDS is used for capturing and analyzing the packets sniffed by IDS probe nodes that are spread across the network. The evaluation results show that the proposed framework performs well in terms of
true positive and false positive rate. Kasinathan et al. [38] proposed an IDS to detect DoS
attacks. Suricata [1] (open source IDS) is used for pattern matching and attack detection. A
probe node is used to sniff all the packet transmissions in the network and transfer informa-
tion to IDS for further analysis. Penetration testing tool ‘Scapy’ is used to test the perfor-
mance of the proposed IDS. No simulation study is done in support of IDS performance
and its usability. Moreover, the authors did not mention any details regarding signature
database management (update). Lee et al. [42] proposed a novel IDS for detecting DoS
attacks. The key idea behind the proposed IDS is to analyze the node’s energy consumption
in order to track malicious nodes. The authors proposed various models for normal energy
consumption in mesh routing based networks. The proposed security system requires
nodes to monitor their own energy consumption at a sampling rate of 0.5 s. The proposed
IDS continuously checks the energy consumption of nodes against the defined threshold,
and whenever a deviation is found for any node, such a node is marked as malicious and
removed from the routing table. The proposed approach shows promising results in terms
of accuracy only. The major concern with this proposed approach is that there is no inbuilt
mechanism to verify the integrity of energy consumption values being reported by a node.
Sonar et al. [60] proposed an IDS to detect distributed DoS attacks in IoT networks. The authors implemented the IDS as a software-based manager deployed between the network and the gateway. The proposed IDS maintains a greylist and a blacklist of IP addresses in order to control access to the network. In the proposed IDS, the greylist is updated every 40 s while the blacklist is updated every 300 s. Simulation of the proposed IDS is performed on the Contiki [23] operating system. The proposed IDS does not achieve satisfactory performance in terms of packet delivery ratio, the number of serviced packets, true positives, and false positives. Moreover, the recovery time is larger than the agent learning time, which is one of the major limitations of the proposed IDS. Diro et al. [21] proposed a deep learning based IDS for defending IoT networks against DoS. The proposed model is evaluated using the NSL-
KDD dataset. The authors compared the proposed IDS with the traditional shallow model approach. In addition, the proposed IDS is implemented with both centralized and distributed detection schemes. The comparison results show that the distributed attack detection scheme performs better than the centralized detection scheme in terms of accuracy. Similarly, the deep model shows better results than the shallow model in
terms of accuracy, precision, recall, and F1 measure. Tama et al. [61] proposed an anomaly-based IDS that uses a gradient boosted machine (GBM) as its detection engine. The optimal parameters of GBM are obtained using grid search, and the performance of the proposed IDS is validated using hold-out and cross-validation methods on three different datasets, namely UNSW-NB15, NSL-KDD, and GPRS. The authors show that the proposed IDS outperforms a fuzzy classifier, GAR forest, and tree-based ensembles in terms of accuracy, specificity, sensitivity, and area under the curve (AUC). Primartha et al. [52] studied the performance of an RF-based IDS in terms of accuracy and false alarm rate. The authors employed the NSL-KDD, UNSW-NB15, and GPRS datasets for model training and testing. The proposed IDS is studied with ensembles of different tree sizes, and statistical analysis based on Friedman ranking showed that an ensemble of 800 trees achieves the best results whereas an ensemble of 20 trees shows the worst performance. Moreover, the proposed RF-based IDS outperforms an ensemble of Random tree + Naive Bayes, and single classifiers like NBTree and Multi-layer perceptron.

3 Classification Algorithms

Wolpert et al. [67] stated a theorem, popularly known as the "no free lunch" theorem, that shows the importance of experimenting with different machine learning classifiers for solving classification tasks. The theorem states that "there is no single learning algorithm that universally performs best across all domains" [22]. Thus, different classifiers should be tested for solving domain-specific problems, and in our case the problem is intrusion detection, a classification problem. We consider two types of classification algorithms, i.e., ensembles and single classifiers. Among ensembles, widely studied algorithms [29, 41, 57] like Random forest (RF), AdaBoost (AB), Gradient boosted machine (GBM), Extreme gradient boosting (XGB), and Extremely randomized trees (ETC) are chosen. There are three main reasons for the selection of the mentioned classification algorithms. First, because ensemble-based classification methods can be prone to over-fitting when the number of input features is large, we also choose to study some single
classifiers like Classification and regression trees (CART), and Multi-layer perceptron (MLP).
Second, the performance of ensembles has not been studied in depth for CIDDS-001 and
UNSW-NB15 datasets. Third, the performance of ensembles and single classifiers over real
IoT hardware has not been studied yet, which motivated us to carry out this analytical study.

3.1 Classifier Ensembles

This section discusses various classifier ensembles in brief. Ensembles have been proven to be
good classification and regression algorithms in the literature. Thus, we have used five differ-
ent ensembles in this analytical study.

3.1.1 Random Forest (RF)

RF [12] is a collection of trees, i.e., predictors $\{t(x_{in}, \theta_n),\ n = 1, \ldots\}$, which individually make predictions on a given input $x_{in}$. Each predictor depends on a random set of variables $\{\theta_n\}$ that are sampled independently with the same distribution. The main idea behind RF is that a number of predictors together can achieve better prediction accuracy while avoiding the over-fitting problem. Each predictor in RF grows to a maximum size without getting pruned. Once a large number of trees are created, they make predictions over the input data by voting for the most popular class at input $x_{in}$. For the performance assessment, the number of estimators (trees) is set to 500 and the maximum depth for tree construction is set to 26, as recommended by [12, 61]. The other parameters are obtained using randomized search.
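For illustration, the randomized search mentioned above can be reproduced with scikit-learn's RandomizedSearchCV; a minimal sketch in which only n_estimators = 500 and max_depth = 26 come from the text, while the candidate parameter grid and the data are hypothetical placeholders:

```python
# Minimal sketch of hyper-tuning RF with randomized search (scikit-learn).
# Only n_estimators=500 and max_depth=26 come from the text; the search
# space and the synthetic data below are illustrative placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.RandomState(42)
X = rng.rand(1000, 12)                     # e.g., 12 CIDDS-001-style features
y = rng.randint(0, 2, 1000)                # binary labels: 0 = normal, 1 = attack

param_distributions = {                    # hypothetical candidate values
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2"],
}
base_rf = RandomForestClassifier(n_estimators=500, max_depth=26, random_state=42)
search = RandomizedSearchCV(base_rf, param_distributions, n_iter=5,
                            cv=3, scoring="accuracy", random_state=42)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```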

3.1.2 AdaBoost (AB)

AB [25] is an adaptive meta-estimator that first learns a classifier on the original dataset and then fits additional copies of the classifier on the same dataset, with the weights of incorrectly classified instances adjusted so that subsequent classifiers focus on these difficult cases. In this way, AB improves the performance of learning algorithms by boosting weak learners such that the final model converges to a strong learner. Equation (1) represents a boosted classifier, where $c_p$ represents a weak learner and $x$ an input object.

$$C_P(x) = \sum_{p=1}^{P} c_p(x) \quad (1)$$

$c_p(x)$ returns a value indicating the predicted class. Each $c_p$ generates an output hypothesis $h(x_i)$ for each instance in the training set. At each iteration $p$, a $c_p$ is chosen and assigned a coefficient $\beta_p$ such that the total training error $E_t$ (represented as Eq. 2) of the resulting $p$-stage $C_P(x)$ is minimized. $C_{p-1}(x_i)$ represents the boosted classifier built in the previous training phase, $E(C)$ is the error function to be minimized, and $c_p(x) = \beta_p h(x_i)$ is the weak learner to be added to the final model, i.e., the classifier. The optimal parameters of AB include 50 estimators and a 0.1 learning rate.

$$E_t = \sum_i E\left[C_{p-1}(x_i) + \beta_p h(x_i)\right] \quad (2)$$


3.1.3 Gradient Boosted Machine (GBM)

GBM [26, 27] is a member of the ensemble family which aims to improve the performance of decision trees (DT). Like other boosting methods, it sequentially combines weak classifiers, i.e., DTs, and allows them to optimize an arbitrary differentiable loss function in order to form a strong prediction model. Each new learner (tree) relies on the predictions of previous learners in order to reduce the prediction errors. Formally, let us consider a set of random input variables and a random output represented by $x$ (Eq. 3) and $z$ respectively.

$$x = \{x_1, x_2, \ldots, x_N\} \quad (3)$$

Our aim is to find an estimate (approximation) that maps $x$ to $z$ by using training data $\{z_i, x_i\}_1^N$. Given a dataset $S$ with $p$ samples and $q$ features, as represented by Eq. (4), the ensemble utilizes $M$ additive functions to predict the final output, Eq. (5), where the function space $A$ is represented as Eq. (6).

$$S = \{(x_i, z_i)\} \quad (|S| = p,\ x_i \in \mathbb{R}^q,\ z_i \in \mathbb{R}) \quad (4)$$

$$\hat{z}_i = \phi(x_i) = \sum_{m=1}^{M} f_m(x_i), \quad f_m \in A \quad (5)$$

$$A = \{f(x) = w_{r(x)}\} \quad (r: \mathbb{R}^q \to K,\ w \in \mathbb{R}^K) \quad (6)$$

Here $A$ is the space of DTs, $r$ represents the tree structure that relates an instance to the corresponding leaf index, $K$ indicates the total leaf count, and $f_m$ is a single tree with structure $r$ and leaf weights $w$. The tree ensemble makes the final prediction by summing the scores ($w$) of the leaves reached when classifying a given test sample. Suppose a first classifier (i.e., tree) makes prediction $h_1(x)$ over a sample $\{(x_i, y_i)\}_1^N$. Then $h_1(x)$ is fed as input to the next classifier in order to adjust the weights of previously misclassified instances. Consequently, the next classifier makes prediction $h_2(x)$ over $\{(x_i, y_i - h_1(x_i))\}_1^N$. The final prediction $h(x)$ for a given sample $S$ is the summation of the predictions made by the trees while minimizing the prediction error. The hyper-tuned parameters for GBM are: 500 estimators, a maximum tree construction depth of 3, a minimum of 100 samples required for a split, and a 0.1 learning rate.

3.1.4 Extreme Gradient Boosting (XGB)

XGB [15], also known as regularized gradient boosting, is an improved version of GBM. Like GBM, XGB follows the same principle of gradient boosting; the only key difference between them lies in the modeling details. XGB uses a more regularized model formalization in order to control over-fitting and increase generalization ability, while GBM focuses only on the variance. The regularization parameter ($\zeta$) is mathematically expressed as Eq. (7), where $T_l$ is the number of leaves in the tree, $w_j^2$ is the squared score on the $j$th leaf, and $\lambda$ is a regularization term that controls model complexity. XGB uses gradient boosting for optimizing the loss function during model training. Typically, for binary classification the LogLoss function ($L$) [11] is used, represented as Eq. (8), where $N$ is the total number of observations, $y_i$ is the binary indicator of whether predicted class $c$ is the correct classification for a particular observation $o$, and $p_i$ is the predicted probability that observation $o$ is of class $c$. Most importantly, $L$ controls the predictive power, and $\zeta$ controls the simplicity of the model. The major implementation enhancements of XGB include the usage of sparse matrices (DMatrix) with sparsity-aware algorithms, improved data structures, and parallelization support. Thus, XGB leverages the hardware to achieve high-speed computing with low memory utilization (primary memory and cache). The optimal parameter values obtained for XGB are: 100 estimators, a maximum tree depth of 8, a minimum child weight of 1, the gbtree booster, and a minimum loss reduction and subsample ratio of 2 and 0.6 respectively.
$$\zeta = \gamma T_l + \frac{1}{2}\lambda \sum_{j=1}^{T_l} w_j^2 \quad (7)$$

$$L = -\frac{1}{N}\sum_{i=1}^{N}\left(y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right) \quad (8)$$
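For reference, the reported configuration maps onto the xgboost scikit-learn wrapper roughly as follows; a minimal sketch, assuming the xgboost package, with placeholder training data:

```python
# Minimal sketch of the XGB configuration reported above (xgboost sklearn API).
# n_estimators, max_depth, min_child_weight, booster, gamma (minimum loss
# reduction) and subsample come from the text; the data is a placeholder.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.RandomState(0)
X_train, y_train = rng.rand(500, 20), rng.randint(0, 2, 500)

xgb = XGBClassifier(
    n_estimators=100,
    max_depth=8,
    min_child_weight=1,
    booster="gbtree",
    gamma=2,                       # minimum loss reduction
    subsample=0.6,
    objective="binary:logistic",   # LogLoss objective for binary classification
)
xgb.fit(X_train, y_train)
print(xgb.predict(X_train[:5]))
```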

3.1.5 Extremely Randomized Trees (ETC)

ETC [33], also known as Extra trees, is a tree induction algorithm for performing supervised
classification and regression. To be more specific, ETC builds an ensemble of unpruned
DTs. The key procedure in ETC involves randomly selecting both features and cut-point
irrespective of the target variable. At each tree node, this procedure is followed with totally
or partially selecting a certain number of features among which the optimal one is deter-
mined. In the worst case, the algorithm selects a single feature and cut-point at each node.
In this manner, totally randomized trees are built which are independent of the training
sample’s target attribute values. The classical top-down methodology is followed while
building the ensemble. Unlike other tree-based ensemble algorithms, ETC uses complete
training sample rather than bootstrap replicas in order to grow the trees while minimiz-
ing bias and variance. The ETC splitting procedure for numeric features has three important parameters. The first parameter, $K$, indicates the number of features selected at each node. The second parameter, $n_{min}$, represents the minimum training-set size for splitting a node. The third parameter, $T_{count}$, is the number of trees in the ensemble. All three parameters play a significant role in building the ETC: $K$ determines the strength of the feature-selection procedure, $n_{min}$ governs the averaging of output noise, and $T_{count}$ specifies the reduction in variance. ETC performs almost the same as RF; however, with optimal feature selection, ETC is computationally faster than RF. The hyper-tuned parameters obtained from randomized search for ETC are: 1788 estimators, a maximum tree depth of 10, a minimum sample size of 5 for a split, log2 of the feature count considered for the best split, and the Gini criterion with no bootstrapping.
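The remaining ensembles can be instantiated analogously with the hyperparameter values reported above; a minimal sketch, assuming the standard scikit-learn estimators correspond to the implementations used:

```python
# Minimal sketch instantiating AB, GBM, and ETC with the hyperparameter values
# reported in the respective subsections (scikit-learn estimators assumed).
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              ExtraTreesClassifier)

ab = AdaBoostClassifier(n_estimators=50, learning_rate=0.1)
gbm = GradientBoostingClassifier(n_estimators=500, max_depth=3,
                                 min_samples_split=100, learning_rate=0.1)
etc = ExtraTreesClassifier(n_estimators=1788, max_depth=10, min_samples_split=5,
                           max_features="log2",   # assumed reading of the reported value
                           criterion="gini", bootstrap=False)
```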

3.2 Single Classifiers

This section discusses single classifiers in brief. In this comparative study we have used
classification and regression trees, and multi-layer perceptron.

13
2294 A. Verma, V. Ranga

3.2.1 Classification and Regression Trees (CART)

CART [13] is one of the most widely employed ML methods for predictive modeling problems. It is a non-parametric algorithm with a built-in mechanism to handle missing feature values. CART involves recursive partitioning of training samples and fitting of a simple prediction model within each partition. This partitioning can be represented graphically as a DT. CART employs an exhaustive search technique to identify the splitting variables such that the total impurity of a node's two child nodes is minimized. CART uses the Gini index as its impurity function, which makes it computationally more efficient than entropy-based classification tree algorithms. The number of folds in the internal cross-validation and the minimal number of observations at the terminal nodes are 5 and 2, respectively, as used in [61]. The optimal value of the maximum depth of tree construction obtained is 10.
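A minimal sketch of this configuration, assuming scikit-learn's DecisionTreeClassifier as the CART implementation (it uses the Gini index but has no built-in internal cross-validation, so that step would be handled separately):

```python
# Minimal sketch of a CART-style tree (scikit-learn's DecisionTreeClassifier
# assumed). Gini impurity, max_depth=10 and a minimum of 2 observations at
# terminal nodes come from the text.
from sklearn.tree import DecisionTreeClassifier

cart = DecisionTreeClassifier(criterion="gini", max_depth=10, min_samples_leaf=2)
```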

3.2.2 Multi‑layer Perceptron (MLP)

MLP [35] is a set of connected nodes (artificial neurons) that attempts to mimic biological brain behavior, and is commonly referred to as a feed-forward artificial neural network. It learns its expertise towards a particular task using supervised learning approaches. MLP comprises several layers, i.e., input, middle, and output. Training of MLP involves learning a mathematical function $f(\cdot)$ shown in Eq. (9), where $d$ and $c$ are the number of inputs and outputs respectively. In order to perform any predictive task, MLP learns a non-linear function approximator over a set of input features $I = \{i_1, i_2, \ldots, i_d\}$ and an output variable $O$, i.e., the class. The leftmost layer, also termed the input layer, consists of artificial neurons $\{i_p \mid i_1, i_2, \ldots, i_d\}$, each representing a particular input feature. The second layer, i.e., the middle layer, performs the task of transformation. First, the outputs from the former layer are summed using the weighted linear summation $y$ represented as Eq. (10). Second, a non-linear activation function $g(\cdot)$ is applied to $y$, which results in a value that is forwarded to further layers, typically the output layer in case a single hidden layer is present. The rightmost layer is the output layer, which receives values from the last hidden layer and is responsible for producing the outputs, i.e., the final predictions. The optimal parameter values of MLP obtained are: a hidden layer size of 100, the logistic activation function, the sgd solver, a learning rate of 0.001, and 200 maximum iterations.

$$f(\cdot): \mathbb{R}^d \to \mathbb{R}^c \quad (9)$$

$$y = \sum_{p=1}^{d} w_p i_p \quad (10)$$
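A minimal sketch of the reported MLP configuration, assuming scikit-learn's MLPClassifier:

```python
# Minimal sketch of the MLP configuration reported above (scikit-learn assumed).
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(hidden_layer_sizes=(100,), activation="logistic",
                    solver="sgd", learning_rate_init=0.001, max_iter=200)
```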

4 Experimental Design

4.1 Experimental Setup

The performance assessment has been carried out on a machine running 64-bit Windows 10 Pro and equipped with an Intel i7-7700 four-core CPU with a 3.60 GHz clock speed and 12 GB main memory. The classifiers are implemented in the Python programming language (version 3.6.1). Parameter hyper-tuning is performed on a PARAM Shavak system running 64-bit Ubuntu 14.04 and equipped with an Intel Xeon Gold 6132 twenty-eight-core CPU with a 2.6 GHz clock speed and 96 GB main memory. A Raspberry Pi 3 Model B+ running the Raspbian operating system and equipped with a 64-bit quad-core ARM CPU with a 1.4 GHz clock speed and 1 GB main memory is used for assessing the response time of the classifiers. The popular ML library scikit-learn [51] is utilized for implementing the classifiers. In order to perform statistical tests on the performance results, we used the STAC [54] web platform application.

4.2 Datasets

In this study three different datasets, i.e., CIDDS-001 [2], UNSW-NB15 [4], and NSL-KDD [3], are used. We chose the CIDDS-001 and UNSW-NB15 datasets as they are the most recently generated datasets and contain real traffic data, and hence can be beneficial for building accurate IDSs for monitoring and detecting new types of DoS attacks in IoT networks. The CIDDS-001 dataset has recently been released for facilitating the development of anomaly-based IDSs. The complete dataset contains approximately 32 million records comprising normal and attack traffic. CIDDS-001 possesses 12 features and 2 labeling attributes. Random sampling is employed to extract 100,000 instances from the internal server traffic data, comprising 80,000 normal and 20,000 attack (DoS) records. The extracted sample is used for carrying out the hold-out and cross-validation tests of the classifiers. Our previous works [62, 63] focused on evaluating the performance of various ML classification algorithms on the CIDDS-001 dataset.
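A minimal sketch of how such a sample could be drawn with pandas; the file name and column values are hypothetical placeholders, and only the 80,000 normal / 20,000 attack split comes from the text:

```python
# Minimal sketch of extracting a 100,000-instance sample from the CIDDS-001
# internal server traffic (file and label values are hypothetical placeholders;
# the 80,000 normal / 20,000 attack split comes from the text).
import pandas as pd

traffic = pd.read_csv("cidds001_internal_week1.csv")          # hypothetical file
normal = traffic[traffic["class"] == "normal"].sample(n=80_000, random_state=1)
attack = traffic[traffic["class"] == "attacker"].sample(n=20_000, random_state=1)
sample = pd.concat([normal, attack]).sample(frac=1, random_state=1)   # shuffle
sample.to_csv("cidds001_sample_100k.csv", index=False)
```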
Further, we have conducted our experiments on the newly available public dataset known as UNSW-NB15. The dataset possesses 49 features and 1 class attribute. A part of the dataset is used as train and test sets, i.e., UNSW_NB15_Train and UNSW_NB15_Test. The train set comprises 175,341 instances, and the test set comprises 82,332 instances. The train set includes 56,000 instances of normal traffic and 119,341 instances of attack traffic. Similarly, the test set includes 37,000 instances of normal traffic and 45,332 instances of attack traffic. Hold-out validation is conducted using the complete train and test sets, whereas for the cross-validation test only the train set is employed.
Subsequently, the NSL-KDD dataset is also used for validating the classifiers. The dataset contains 41 features and 1 class attribute. In this study, the KDDTrain+ (training) and KDDTest+ (testing) sets of the NSL-KDD dataset are used. The KDDTrain+ set contains a total of 25,192 instances, comprising 13,499 instances of attack traffic and 11,743 instances of normal traffic, whereas the KDDTest+ set contains a total of 22,544 instances, comprising 9,711 instances of attack traffic and 12,833 instances of normal traffic. Hold-out and cross-validation of the classifiers is done on each dataset individually. These sets are chosen in order to avoid random sampling of instances from the complete NSL-KDD dataset.

4.3 Evaluation Metrics and Validation Methods

The selection of input parameter settings influences the overall performance of the classifiers, thus we follow the random search [10] procedure to find the optimal input parameters of RF, AB, XGB, GBM, and ETC for the different datasets. The RandomizedSearchCV implementation in the scikit-learn package of the Python programming language is used for hyper-tuning of parameters. RandomizedSearchCV finds optimal parameter settings by performing a cross-validated search over candidate parameter values provided by the user. Prominent metrics for evaluating classifier performance have been used in this study. These metrics include accuracy, specificity or true negative rate, sensitivity or true positive rate, FPR, and AUC, mathematically represented as Eqs. (11)-(15) respectively.
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \quad (11)$$

$$Specificity = \frac{TN}{TN + FP} \quad (12)$$

$$Sensitivity = \frac{TP}{TP + FN} \quad (13)$$

$$FPR = \frac{FP}{TN + FP} \quad (14)$$

$$AUC = \int_0^1 \frac{TP}{TP + FN} \; d\!\left(\frac{FP}{FP + TN}\right) \quad (15)$$

where true positive (TP) represents the number of correctly classified attack instances, true negative (TN) represents the number of correctly classified normal instances, false positive (FP) is the number of normal instances wrongly classified as attack, and false negative (FN) is the number of attack instances wrongly classified as normal. Accuracy is defined as the total number of correctly classified instances over the total number of instances in the dataset. Specificity is defined as the number of correctly classified normal instances over the total number of normal instances. Sensitivity is defined as the number of correctly classified attack instances over the total number of attack instances. FPR is defined as the number of normal instances incorrectly classified as attack over the total number of normal instances. AUC refers to the area under the receiver operating characteristic (ROC) curve, where the ROC curve is obtained by plotting TPR against FPR.
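These definitions can be computed directly from a confusion matrix; a minimal sketch, assuming scikit-learn and placeholder predictions (attack is the positive class):

```python
# Minimal sketch computing the metrics of Eqs. (11)-(15) from predictions
# (scikit-learn assumed; labels and scores below are placeholders).
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true  = [0, 0, 1, 1, 1, 0, 1, 0]           # 1 = attack, 0 = normal
y_pred  = [0, 1, 1, 1, 0, 0, 1, 0]
y_score = [0.1, 0.7, 0.9, 0.8, 0.4, 0.2, 0.95, 0.3]   # predicted attack probability

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy    = (tp + tn) / (tp + tn + fp + fn)           # Eq. (11)
specificity = tn / (tn + fp)                            # Eq. (12)
sensitivity = tp / (tp + fn)                            # Eq. (13)
fpr         = fp / (tn + fp)                            # Eq. (14)
auc         = roc_auc_score(y_true, y_score)            # Eq. (15)
print(accuracy, specificity, sensitivity, fpr, auc)
```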
In order to perform a comprehensive performance assessment of the different classifiers, we conducted experiments using the repeated hold-out as well as the repeated k-fold cross-validation (10f) method [40]. As suggested in [53], the repeated version stabilizes the error estimation and minimizes the variance of the validation approach. For hold-out validation, we divided the sample dataset in a 60:40 ratio (60% training instances and 40% testing instances) in order to create the train and test sets. Similarly, for k-fold cross-validation the value of k is taken as 10. We considered 100 rounds of repeated 10f and hold-out validation, as the classification models are observed to be stable, i.e., they give the same prediction for the same test data. 10f is performed in order to assess the classifiers' performance while avoiding the effect of instance sampling (i.e., as in the case of hold-out validation). In order to avoid bias, all the performance results reported in this paper are the mean value of the outputs from 10 iterations of each repeated validation approach. Each experiment is repeated using a different seed (an input to a random number generator) for avoiding biased results.
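A minimal sketch of the two validation schemes, assuming scikit-learn and placeholder data (the classifier and the number of repetitions are illustrative):

```python
# Minimal sketch of repeated hold-out (60:40) and repeated 10-fold cross
# validation (scikit-learn assumed; classifier, data and repeat counts are
# placeholders).
import numpy as np
from sklearn.model_selection import train_test_split, RepeatedKFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
X, y = rng.rand(1000, 10), rng.randint(0, 2, 1000)
clf = DecisionTreeClassifier()

# Repeated hold-out: a fresh 60:40 split (different seed) per round.
holdout_scores = []
for seed in range(10):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=seed)
    holdout_scores.append(accuracy_score(y_te, clf.fit(X_tr, y_tr).predict(X_te)))

# Repeated 10-fold cross validation.
cv_scores = []
for tr_idx, te_idx in RepeatedKFold(n_splits=10, n_repeats=10, random_state=0).split(X):
    clf.fit(X[tr_idx], y[tr_idx])
    cv_scores.append(accuracy_score(y[te_idx], clf.predict(X[te_idx])))

print(np.mean(holdout_scores), np.mean(cv_scores))
```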


4.4 Statistical Significance Tests

In ML studies, the comparison of multiple algorithms over multiple datasets is an essential issue [19]. An algorithm may show better performance on one dataset but fail to achieve a similar result on another dataset. The reason for this may be the presence of outliers, the feature distribution, or algorithm characteristics. Thus, it becomes quite difficult to compare different algorithms among themselves, which consequently makes it challenging to decide which algorithm is better than the others. To address this issue, statistical assessment is needed to validate the performance results. In this study, two statistical significance tests [16] are utilized in order to compare the classifiers in a correct way. The Friedman [28] and Nemenyi [24] tests are selected for this purpose. The significance tests help in finding whether the classifiers are significantly different from each other or not [19, 31]. The null hypothesis (H0) considered in this case is that there is no performance difference among the classifiers, while the alternative hypothesis (H1) is that at least one classifier performs significantly differently from at least one other classifier.
The main reason behind choosing the Friedman test is that it is the most powerful statistical test when the number of entities being compared is greater than five [16, 61]. The Friedman test helps in determining whether at least one classifier performs significantly better than the others across all the datasets. If any such classifier is found, the Nemenyi post-hoc test is performed for pairwise multiple comparisons. As suggested in [19], it is crucial to conduct a post-hoc test so as to identify the performance differences among the classifiers. To be more specific, the Friedman test checks for a significant difference among the classifiers being tested, whereas the Nemenyi test pinpoints where that difference lies. The further discussion assumes $d$ as the number of datasets and $k$ as the number of classifiers. In the Friedman test, initially the performance results ($X_{ij}$) of the classifiers are ranked ($R(X_{ij})$) for all the datasets. Then, the sum of the $R(X_{ij})$ is computed for each classifier in order to obtain $R_j$ (Eq. 16), where $j = 1, 2, \ldots, k$. The Friedman statistic (F-Statistic) is computed as Eq. (17), where $Q$ is calculated as Eq. (18).
$$R_j = \sum_{i=1}^{d} R(X_{ij}) \quad (16)$$

$$\text{F-Statistic} = \frac{(d - 1)\,Q}{d(k - 1) - Q} \quad (17)$$

$$Q = \frac{12}{dk(k + 1)} \sum_{j=1}^{k} \left(R_j - \frac{d(k + 1)}{2}\right)^2 \quad (18)$$

The F-Statistic is tested against the F-quantiles for a given $\alpha$ with degrees of freedom $f_1 = k - 1$ and $f_2 = (d - 1)(k - 1)$, where $\alpha$ is the significance level being considered. In this study the values of $d$ and $k$ are 4 and 7 respectively. The Nemenyi post-hoc test is performed by calculating the test statistic $\gamma_{xy}$ (represented as Eq. 19) for all classifier pairs, where $R_x$ and $R_y$ are the mean ranks of classifiers $x$ and $y$ respectively over all datasets, computed as Eq. (20). After all the $\gamma_{xy}$ are calculated, those which exceed a critical value indicate a significant difference between classifiers $x$ and $y$ at the $\alpha$ significance level. In this study, two values of $\alpha$ are considered, i.e., 0.05 and 0.1. The statistical analysis of both the hold-out and 10f validation results is carried out in this experimental study.


$$\gamma_{xy} = \frac{R_x - R_y}{\sqrt{\frac{k(k + 1)}{6d}}} \quad (19)$$

$$R_j = \frac{1}{d} \sum_{i=1}^{d} R(X_{ij}) \quad (20)$$
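Following Eqs. (16)-(20), the Friedman and Nemenyi statistics can be computed directly from a matrix of per-dataset results; a minimal sketch with a hypothetical score matrix (d = 4 datasets as rows, k = 3 classifiers as columns):

```python
# Minimal sketch of the Friedman statistic (Eqs. 16-18) and the Nemenyi pairwise
# statistic (Eqs. 19-20); the score matrix below is a hypothetical placeholder.
import numpy as np
from scipy.stats import rankdata

scores = np.array([[0.95, 0.93, 0.90],
                   [0.88, 0.91, 0.87],
                   [0.97, 0.96, 0.94],
                   [0.90, 0.92, 0.89]])
d, k = scores.shape

# Rank classifiers within each dataset (the statistic is invariant to rank direction).
ranks = np.apply_along_axis(rankdata, 1, scores)
R = ranks.sum(axis=0)                                                  # Eq. (16)
Q = 12.0 / (d * k * (k + 1)) * np.sum((R - d * (k + 1) / 2.0) ** 2)    # Eq. (18)
f_statistic = (d - 1) * Q / (d * (k - 1) - Q)                          # Eq. (17)

mean_ranks = ranks.mean(axis=0)                                        # Eq. (20)
gamma_01 = (mean_ranks[0] - mean_ranks[1]) / np.sqrt(k * (k + 1) / (6.0 * d))  # Eq. (19)
print(f_statistic, gamma_01)
```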

5 Results and Analysis

In this section, a detailed discussion on performance analysis of ensembles (RF, AB, GBM,
XGB and ETC) and single classifiers (CART and MLP) specific to CIDDS-001, UNSW-
NB15, and NSL-KDD datasets is done. The results are compared and statistically ana-
lyzed. We have shown that the classifiers used in this study are suitable for intrusion detec-
tion in IoT applications. First, we analyze the performance results of hold-out validation.
Figure 1 indicates the average value of all prominent metrics other than FPR, achieved with
hold-out validation across CIDDS-001, UNSW-NB15, KDDTrain+, and KDDTest+ data-
sets. It is observed that RF outperforms other classifiers in terms of accuracy (94.94%) and
specificity (91.6%). GBM performs best in terms of sensitivity (99.53%). In terms of AUC
metric, XGB performs best by achieving 98.76%. MLP is the worst performer in terms of
accuracy (82.76%), whereas AB performs worst in terms of specificity (86.72%) and sen-
sitivity (97.94%). CART achieves lowest AUC value (94.01%). Figure 2 shows the average
FPR values of classifiers across four datasets with hold-out validation. It is observed that
RF performs best whereas AB performs worst among all the classifiers in terms of FPR by
achieving 8.89% and 13.26% respectively. Table 1 lists out model building time (MBT) of
different classifiers across four datasets with hold-out validation. The main reason behind
computing MBT is that it is very important to consider the training time a model takes, as
it would directly impact the resources usage, which is an important criterion for resource-
constrained devices [66]. Thus, MBT helps in making a good trade-off between resource
usage and classification performance of a classifier, i.e., IDS. RF and CART take approxi-
mately 2 s for training on all four datasets. The highest time for model training is taken by
GBM and ETC in case of KDDTrain+ dataset. MBT of all the classifiers is calculated for
hold-out validation only.
Figure 3 shows the average value of all prominent metrics other than FPR, achieved with
10f validation across CIDDS-001, UNSW-NB15, KDDTrain+, and KDDTest+ datasets. It
is observed that the performances of all the used classifiers improve with 10f validation in comparison to their performances with hold-out validation. This is because hold-out validation is affected by instance sampling, where the selection of random instances can lead to poorer classification. This phenomenon advocates the use of 10f validation over hold-out validation. The
10f validation results show promising performance for all the classifiers. However, from
the point of comparison, CART performs best in terms of accuracy (96.74%). AB achieves
the highest average value of specificity (97.5%) metric. RF and XGB perform best in terms
of sensitivity by achieving 97.31% performance measure. For AUC, the best performing
classifier is XGB which achieves 98.77%. Figure 4 shows the average FPR values of clas-
sifiers across four datasets with 10f validation. It is observed that CART performs best


Fig. 1 The average value of prominent metrics (accuracy, specificity, sensitivity, AUC) per classifier across four datasets with hold-out validation

Fig. 2 The average value of FPR per classifier across four datasets with hold-out validation


Table 1 MBT (seconds) of classifiers across four datasets

Dataset      RF      CART    MLP      AB      GBM       XGB      ETC
CIDDS-001    0.4124  0.2353  1.0160   0.9557  20.1139   7.3965   17.5031
UNSW-NB15    1.4657  0.6260  4.3782   7.9092  12.7477   23.7196  44.4775
KDDTrain+    0.4087  0.2653  16.7050  2.6928  318.8914  14.0115  143.0728
KDDTest+     0.0601  0.0337  5.3062   0.4866  22.7327   2.9957   1.5041

Fig. 3 The average value of prominent metrics (accuracy, specificity, sensitivity, AUC) per classifier across four datasets with 10f validation

whereas RF performs worst among all classifiers in terms of FPR by achieving 3.78% and
21.85% respectively.
The performance results are statistically assessed using the Friedman and Nemenyi post-hoc tests. Both statistical tests are performed for two values of the significance level $\alpha$ (i.e., 0.05 and 0.1). For $\alpha = \{0.05, 0.1\}$ the values of $f_1$ and $f_2$ are 6 and 18 respectively, and the F-Statistic and p value for each performance metric are computed. Table 2 shows the Friedman test statistics for the hold-out validation results. From the results it is observed that the performance of the classifiers is significantly different ($p < 0.05$ and $p < 0.1$) in terms of all the considered performance metrics. Thus, it is concluded that there is

Fig. 4 The average value of FPR per classifier across four datasets with 10f validation

Table 2 Friedman test statistics for hold-out validation

             Accuracy  Specificity  Sensitivity  FPR     AUC
F-Statistic  6.7745    7.7091       7.1434       7.7091  4.7020
p value      0.0007    0.0003       0.0005       0.0003  0.0048
α = 0.05     R         R            R            R       R
α = 0.1      R         R            R            R       R

Table 3 Friedman test (mean ranks for hold-out validation)

       Accuracy  Specificity  Sensitivity  FPR    AUC
RF     4.875     5.750        2.875        2.250  3.750
CART   2.250     2.000        3.250        6.000  1.500
MLP    3.250     3.250        2.250        4.750  3.000
AB     1.250     1.500        2.000        6.500  2.875
XGB    6.000     6.375        5.500        1.625  6.000
GBM    5.750     4.625        6.250        3.375  5.875
ETC    4.625     4.500        5.875        3.500  5.000

at least one classifier that performs significantly differently from at least one other classifier. Because the results of the Friedman test are highly significant ($p < 0.05$ and $p < 0.1$), the null hypothesis H0 is rejected (represented by R in Table 2) and the alternative hypothesis H1 is accepted. In Table 3, the mean ranks of all the classifiers for hold-out validation are shown. In order to find which classifier pairs perform significantly differently, the Nemenyi post-hoc test is performed. For this purpose, the p value of each pairwise comparison is tested against the considered significance level $\alpha$.
Table 4 presents the results of the Nemenyi test (pairwise comparison) over accuracy, specificity, and sensitivity. As shown in Table 4, the difference in accuracy is highly significant ($p < 0.05$) in the case of the AB-XGB pair, whereas it is less significant ($p < 0.1$) in the case


Table 4 Nemenyi pairwise comparison (hold-out validation) Part I

A1 versus A2 | Accuracy: F-statistic, p value, α = 0.05, α = 0.1 | Specificity: F-statistic, p value, α = 0.05, α = 0.1 | Sensitivity: F-statistic, p value, α = 0.05, α = 0.1

AB versus XGB 3.1096 0.0393 R R 3.1914 0.0297 R R 2.2912 0.4608 A A


AB versus GBM 2.9459 0.0676 A R 2.0457 0.8563 A A 2.7822 0.1133 A A
AB versus RF 2.3731 0.3704 A A 2.7822 0.1134 A A 0.5728 1.0 A A
AB versus ETC 2.2094 0.57 A A 1.9639 1.0 A A 2.5367 0.2349 A A
AB versus MLP 1.3093 1.0 A A 1.1456 1.0 A A 0.4091 1.0 A A
AB versus CART​ 0.6546 1.0 A A 0.3273 1.0 A A 0.8183 1.0 A A
CART versus ETC 1.5548 1.0 A A 1.6366 1.0 A A 1.7184 1.0 A A
CART versus MLP 0.6546 1.0 A A 0.8183 1.0 A A 0.6546 1.0 A A
ETC versus MLP 0.9001 1.0 A A 0.8183 1.0 A A 2.3731 0.3704 A A
GBM versus CART​ 2.2912 0.4608 A A 1.7184 1.0 A A 1.9639 1.0 A A
GBM versus MLP 1.6366 1.0 A A 0.9001 1.0 A A 2.6186 0.1854 A A
GBM versus ETC 0.7364 1.0 A A 0.0818 1.0 A A 0.2455 1.0 A A
GBM versus RF 0.5728 1.0 A A 0.7364 1.0 A A 2.2094 0.57 A A
GBM versus XGB 0.1636 1.0 A A 1.1456 1.0 A A 0.4909 1.0 A A
RF versus CART​ 1.7184 1.0 A A 2.4549 0.2959 A A 0.2455 1.0 A A
RF versus MLP 1.0638 1.0 A A 1.6366 1.0 A A 0.1636 1.0 A A
RF versus ETC 0.1636 1.0 A A 0.8183 1.0 A A 1.9639 1.0 A A
XGB versus CART​ 2.4549 0.2959 A A 2.8641 0.0878 A R 1.4729 1.0 A A
XGB versus MLP 1.8003 1.0 A A 2.0457 0.8563 A A 2.1276 0.7007 A A
XGB versus ETC 0.9001 1.0 A A 1.2274 1.0 A A 0.2455 1.0 A A
XGB versus RF 0.7364 1.0 A A 0.4091 1.0 A A 1.7184 1.0 A A

Table 5 Nemenyi pairwise comparison (hold-out validation) Part II

A1 versus A2 | FPR: F-statistic, p value, α = 0.05, α = 0.1 | AUC: F-statistic, p value, α = 0.05, α = 0.1

AB versus XGB 3.1914 0.0297 R R 2.0457 0.8563 A A


AB versus GBM 2.0457 0.8563 A A 1.9639 1.0 A A
AB versus RF 2.7822 0.1133 A A 0.5728 1.0 A A
AB versus ETC 1.9639 1.0 A A 1.3911 1.0 A A
AB versus MLP 1.1456 1.0 A A 0.0818 1.0 A A
AB versus CART​ 0.3273 1.0 A A 0.9001 1.0 A A
CART versus ETC 1.6366 1.0 A A 2.2912 0.4608 A A
CART versus MLP 0.8183 1.0 A A 0.9819 1.0 A A
ETC versus MLP 0.8183 1.0 A A 1.3093 1.0 A A
GBM versus CART​ 1.7184 1.0 A A 2.8641 0.0878 A R
GBM versus MLP 0.9001 1.0 A A 1.8821 1.0 A A
GBM versus ETC 0.0818 1.0 A A 0.5728 1.0 A A
GBM versus RF 0.7364 1.0 A A 1.3911 1.0 A A
GBM versus XGB 1.1456 1.0 A A 0.0818 1.0 A A
RF versus CART​ 2.4549 0.2959 A A 1.4729 1.0 A A
RF versus MLP 1.6366 1.0 A A 0.4909 1.0 A A
RF versus ETC 0.8183 1.0 A A 0.8183 1.0 A A
XGB versus CART​ 2.8641 0.0878 A R 2.9459 0.0676 A R
XGB versus MLP 2.0457 0.8563 A A 1.9639 1.0 A A
XGB versus ETC 1.2274 1.0 A A 0.6546 1.0 A A
XGB versus RF 0.4091 1.0 A A 1.4729 1.0 A A

Table 6 Friedman test statistics for 10f validation

             Accuracy  Specificity  Sensitivity  FPR     AUC
F-Statistic  0.1698    0.2346       0.2740       0.4242  4.5294
p value      0.9816    0.9594       0.9418       0.8532  0.0057
α = 0.05     A         A            A            A       R
α = 0.1      A         A            A            A       R

of the AB-GBM pair, while the remaining pairs are not significant. Moreover, in terms of specificity, the highly significant pair is AB-XGB, whereas the less significant pair is XGB-CART, whilst no other pair is found to be significant. Furthermore, no pair is significant in terms of sensitivity. Table 5 shows the results of the Nemenyi test (pairwise comparison) over FPR and AUC. It is observed that for the FPR metric the difference is highly significant in the case of AB-XGB and less significant in the case of XGB-CART, whereas the remaining pairs are not significant. In the case of the AUC metric there are no pairs which are highly significant, whilst GBM-CART and XGB-CART are the only less significant pairs among all the classifier pairs; every other pair is not significant.


Table 7 Friedman test (mean ranks for 10f validation)

       Accuracy  Specificity  Sensitivity  FPR    AUC
RF     4.000     4.000        4.500        4.625  3.625
CART   4.375     4.250        3.375        3.500  1.500
MLP    4.250     5.000        3.375        2.750  3.125
AB     4.625     4.000        4.500        4.750  2.875
XGB    3.125     3.375        4.500        4.250  6.125
GBM    4.000     3.250        4.500        4.625  5.625
ETC    3.625     4.125        3.250        3.500  5.125

Table 8 Nemenyi test (10f validation)

A1 versus A2 | AUC: F-statistic, p value, α = 0.05, α = 0.1

AB versus XGB 2.1276 0.7007 A A


AB versus GBM 1.8003 1.0 A A
AB versus ETC 1.4729 1.0 A A
AB versus CART​ 0.9001 1.0 A A
AB versus RF 0.4909 1.0 A A
AB versus MLP 0.1636 1.0 A A
CART versus GBM 2.7004 0.1454 A A
CART versus ETC 2.3731 0.3704 A A
CART versus MLP 1.0638 1.0 A A
ETC versus MLP 1.3093 1.0 A A
ETC versus GBM 0.3273 1.0 A A
MLP versus GBM 1.6366 1.0 A A
RF versus CART​ 1.3911 1.0 A A
RF versus GBM 1.3093 1.0 A A
RF versus ETC 0.9819 1.0 A A
RF versus MLP 0.3273 1.0 A A
XGB versus CART​ 3.0277 0.0517 A R
XGB versus MLP 1.9639 1.0 A A
XGB versus RF 1.6366 1.0 A A
XGB versus ETC 0.6546 1.0 A A
XGB versus GBM 0.3273 1.0 A A

Table 6 shows the Friedman test statistics for the 10f validation results. From the results it is observed that the performance of the classifiers is significantly different ($p < 0.05$ and $p < 0.1$) in terms of AUC only. Thus, it is concluded that there is at least one classifier that performs significantly differently from at least one other classifier. Because the result of the Friedman test is highly significant ($p < 0.05$ and $p < 0.1$) for the AUC metric, the null hypothesis H0 is rejected for AUC (represented by R in Table 6), whereas it is accepted (A) for the remaining metrics. In Table 7 the mean ranks of all the classifiers for the 10f validation results are shown. Table 8 presents the results of the Nemenyi test (pairwise comparison) over the AUC values. As shown in Table 8, the


difference in AUC is found to be less significant ($p < 0.1$) in the case of the XGB-CART pair, whereas all other pairs are not significant.
In addition, we have analyzed the average response time (seconds) that a classifier takes to classify an instance. The main reason for performing this experiment is that knowledge of a classifier's response time plays an important role in its selection as an intrusion detection system [36]. Classifiers with a quick (small) response time are favored over classifiers with a slow (large) response time. To accomplish this task, all the classifiers were executed with the test data as input on the Raspberry Pi 3 Model B+. The average time is computed by dividing the total time taken by the classifier for classifying all the test instances by the total number of test instances.
$$\text{Average response time} = \frac{\sum_{i=1}^{n_{test}} t_i}{n_{test}} \quad (21)$$

Equation (21) represents the mathematical expression of average response time, where
i represents an instance number, ti represents time taken by a classifier to classify ith test
instance into attack or normal category, and ntest is the total number of test instances.
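A minimal sketch of how the average response time of Eq. (21) can be measured; the classifier and test data are placeholders, and the same script would simply be run on the Raspberry Pi to obtain on-device timings:

```python
# Minimal sketch measuring the average per-instance response time of Eq. (21)
# (classifier and test data are placeholders).
import time
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X_train, y_train = rng.rand(1000, 10), rng.randint(0, 2, 1000)
X_test = rng.rand(200, 10)

clf = DecisionTreeClassifier().fit(X_train, y_train)

start = time.perf_counter()
for instance in X_test:                        # classify one instance at a time
    clf.predict(instance.reshape(1, -1))
total = time.perf_counter() - start

avg_response_time = total / len(X_test)        # Eq. (21)
print(f"average response time: {avg_response_time:.6e} s")
```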

Fig. 5 Average response time (seconds) of classifiers on the CIDDS-001, UNSW-NB15, KDDTrain+, and KDDTest+ datasets


Figure 5 shows the average response time taken by different classifiers for classifying
a single instance. From the experiment results, it is observed that CART takes minimum
time to classify an instance of CIDDS-001, UNSW-NB15, KDDTrain+, and KDDTest+
in comparison to other classifiers. RF and XGB show almost similar results in terms of
average response time for all four datasets. ETC takes maximum time for classifying an
instance of CIDDS-001 and KDDTest+ dataset in comparison to other classifiers. Moreo-
ver, GBM is the worst performer in the case of KDDTrain+ dataset. The experimental
results show promising solutions for the choice of different classifiers suitable for perform-
ing the task of intrusion detection (DoS specific) in IoT applications. The classifiers have
been validated using hold-out and 10f validation methods. Both the methods show prom-
ising performance results in terms of accuracy, specificity, sensitivity, FPR, AUC. These
results can be used to select a suitable classifier as per the requirements of the application. For example, if an application demands high accuracy and low FPR, then CART, MLP, AB, XGB, or ETC can be used, whereas if an application demands a quick response time, then CART, RF, or XGB can be selected. Similar trade-offs can be considered in the selection of the most suitable classifier for an IoT application. The real-time performance of an IDS depends on the dataset selected for model training. Thus, a dataset containing traffic patterns of recent types of DoS attacks must be used for achieving the best real-time results; CIDDS-001 and UNSW-NB15 are suitable choices for this purpose. It can be observed from the experimental results that the classifiers show promising performance results with the CIDDS-001 and UNSW-NB15 datasets; thus, we suggest these datasets for training IDSs to achieve the best classification results.
In this study only supervised learning based ML classifiers are used. Supervised learning is one of the popular ML approaches, in which the classifier uses known target values for training. The results shown by the different ML classifiers demonstrate the effectiveness of using supervised learning for the intrusion detection task. The main reason for choosing supervised learning is that the network characteristics (traffic patterns) can be effectively used to train ML models for further predictions. These patterns can be differentiated into normal and attack classes based on different network-based features. Other ML approaches, such as unsupervised learning, can also be used to perform a similar task; clustering (an unsupervised learning approach), for example, can be used to train detection models. In this paper, the emphasis is placed particularly on the performance assessment of supervised ML algorithms. The performance assessment of unsupervised ML algorithms for intrusion detection in IoT will be considered in our future work.

6 Conclusion

In this paper, a study on anomaly-based IDSs suitable for securing IoT against DoS attacks is carried out. The performance assessment of seven machine learning classification algorithms, namely random forest, AdaBoost, gradient boosted machine, extreme gradient boosting, extremely randomized trees, classification and regression trees, and multi-layer perceptron, is done. The optimal parameters of the classifiers are obtained using a random search algorithm. The performance of all the classifiers is measured in terms of accuracy, specificity, sensitivity, false positive rate, and area under the receiver operating characteristic curve. Benchmarking of all the classifiers is performed on the CIDDS-001, UNSW-NB15, and NSL-KDD datasets. Moreover, in order to find significant differences among the classifiers, statistical analysis of the performance measures is done using the Friedman and Nemenyi post-hoc tests. In addition
to this, the average response time of all the classifiers is evaluated on a Raspberry Pi hardware device. From the performance results and statistical tests, it is concluded that the classification and regression trees and extreme gradient boosting classifiers show the best trade-off between prominent metrics and response time; thus both are suitable choices for building IoT-specific anomaly-based IDSs. Our future goal is to design an IDS for defending against routing attacks in IoT networks.

Acknowledgements This research was supported by the Ministry of Human Resource Development, Gov-
ernment of India.

Compliance with Ethical Standards


Conflict of interest On behalf of all authors, the corresponding author states that there is no conflict of inter-
est.

References
1. (2014). Suricata: Open-source ids/ips/nsm engine. Retrieved November 3, 2019, from https://suricata-ids.org/.
2. (2017). CIDDS-001 dataset. Retrieved November 3, 2019, from https://www.hs-coburg.de/forschung-kooperation/forschungsprojekte-oeffentlich/ingenieurwissenschaften/cidds-coburg-intrusion-detection-data-sets.html.
3. (2017). NSL-KDD dataset. Retrieved November 3, 2019, from http://nsl.cs.unb.ca/nsl-kdd/.
4. (2017). UNSW-NB15 dataset. Retrieved November 3, 2019, from https://www.unsw.adfa.edu.au/australian-centre-for-cyber-security/cybersecurity/ADFA-NB15-Datasets/.
5. Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari, M., & Ayyash, M. (2015). Internet of Things:
A survey on enabling technologies, protocols, and applications. IEEE Communications Surveys Tutori-
als, 17(4), 2347–2376.
6. Arış, A., Oktuğ, S. F., & Yalçın, S. B. Ö. (2015). Internet-of-things security: Denial of service attacks.
In IEEE 23th signal processing and communications applications conference (SIU) (pp. 903–906).
7. Ashton, K. (2009). That ‘Internet of Things’ thing. RFID Journal, 22(7), 97–114.
8. Axelsson, S. (2000). Intrusion detection systems: A survey and taxonomy. Technical report.
9. Baykara, M., & Das, R. (2017). A novel hybrid approach for detection of webbased attacks in intrusion
detection systems. International Journal of Computer Networks and Applications, 4(2), 62–76.
10. Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of
Machine Learning Research, 13(Feb), 281–305.
11. Bishop, C. M. (2006). Pattern recognition and machine learning (Information science and statistics).
Berlin: Springer.
12. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
13. Breiman, L. (2017). Classification and regression trees. London: Routledge.
14. Butun, I., Morgera, S. D., & Sankar, R. (2014). A survey of intrusion detection systems in wireless
sensor networks. IEEE Communications Surveys & Tutorials, 16(1), 266–282.
15. Chen, T., & Guestrin, C. (2016). Xgboost: A scalable tree boosting system. In ACM, proceedings of the
22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 785–794).
16. Conover, W. J. (1980). Practical nonparametric statistics. New York: Wiley.
17. Das, R., Tuna, A., Demirel, S., & Yurdakul, M. K. (2017). A survey on the Internet of Things solutions
for the elderly and disabled: Applications, prospects, and challenges. International Journal of Com-
puter Networks and Applications, 4(3), 84–92.
18. Debar, H., Dacier, M., & Wespi, A. (2000). A revised taxonomy for intrusion-detection systems.
Annales Des Télécommunications, 55(7), 361–378.
19. Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine
Learning Research, 7(Jan), 1–30.
20. Dhanjani, N. (2013). Hacking lightbulbs: Security evaluation of the Philips Hue personal wireless lighting system. Retrieved November 3, 2019, from https://www.dhanjani.com/docs/Hacking.
21. Diro, A. A., & Chilamkurti, N. (2018). Distributed attack detection scheme using deep learning
approach for Internet of Things. Future Generation Computer Systems, 82, 761–768.
22. Douglas, P. K., Harris, S., Yuille, A., & Cohen, M. S. (2011). Performance comparison of machine
learning algorithms and number of independent components used in FMRI decoding of belief vs.
disbelief. Neuroimage, 56(2), 544–553.
23. Dunkels, A., Gronvall, B., & Voigt, T. (2004). Contiki—A lightweight and flexible operating sys-
tem for tiny networked sensors. In 29th annual IEEE international conference on local computer networks (pp. 455–462).
24. Dunn, O. J. (1961). Multiple comparisons among means. Journal of the American Statistical Asso-
ciation, 56(293), 52–64.
25. Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an
application to boosting. Journal of Computer and System Sciences, 55(1), 119–139.
26. Friedman, J. (2001). Greedy function approximation: A gradient boosting machine. The Annals of
Statistics, 29(5), 1189–1232.
27. Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis,
38(4), 367–378.
28. Friedman, M. (1937). The use of ranks to avoid the assumption of normality implicit in the analysis
of variance. Journal of the American Statistical Association, 32(200), 675–701.
29. Galar, M., Fernandez, A., Barrenechea, E., Bustince, H., & Herrera, F. (2011). A review on
ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches.
IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42(4),
463–484.
30. Gao, L., & Bai, X. (2014). A unified perspective on the factors influencing consumer acceptance of
Internet of Things technology. Asia Pacific Journal of Marketing and Logistics, 26(2), 211–231.
31. Garcia, S., & Herrera, F. (2008). An extension on statistical comparisons of classifiers over multiple
data sets for all pairwise comparisons. Journal of Machine Learning Research, 9(Dec), 2677–2694.
32. Garcia-Teodoro, P., Diaz-Verdejo, J., & Maciá-Fernández, G. (2009). Anomaly-based network intru-
sion detection: Techniques, systems and challenges. Computers & Security, 28(1–2), 18–28.
33. Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning,
63(1), 3–42.
34. Granjal, J., Monteiro, E., & Silva, J. S. (2015). Security for the Internet of Things: A survey of existing
protocols and open research issues. IEEE Communications Surveys Tutorials, 17(3), 1294–1312.
35. Haykin, S. (1994). Neural networks: A comprehensive foundation. Englewood Cliffs: Prentice Hall
PTR.
36. Hodo, E., Bellekens, X., Hamilton, A., Dubouilh, P. L., Iorkyase, E., Tachtatzis, C., et al. (2016).
Threat analysis of IoT networks using artificial neural network intrusion detection system. In Inter-
national symposium on networks, computers and communications (ISNCC) (pp. 1–6). IEEE.
37. Hwang, Y. H. (2015). IoT security & privacy: Threats and challenges. In Proceedings of the 1st ACM workshop on IoT privacy, trust, and security (pp. 1–1). New York, NY: ACM.
38. Kasinathan, P., Costamagna, G., Khaleel, H., Pastrone, C., & Spirito, M. A. (2013). Demo: An ids
framework for Internet of Things empowered by 6lowpan. In Proceedings of the 2013 ACM SIG-
SAC conference on computer & communications security (CCS ’13) (pp. 1337–1340). New York,
NY: ACM.
39. Kasinathan, P., Pastrone, C., Spirito, M. A., & Vinkovits, M. (2013). Denial-of-service detection
in 6lowpan based Internet of Things. In IEEE 9th international conference on wireless and mobile
computing, networking and communications (WiMob) (pp. 600–607).
40. Kim, J. H. (2009). Estimating classification error rate: Repeated cross-validation, repeated hold-out
and bootstrap. Computational Statistics & Data Analysis, 53(11), 3735–3745.
41. Krawczyk, B., Minku, L. L., Gama, J., Stefanowski, J., & Woźniak, M. (2017). Ensemble learning
for data stream analysis: A survey. Information Fusion, 37, 132–156.
42. Lee, T. H., Wen, C. H., Chang, L. H., Chiang, H. S., & Hsieh, M. C. (2014). A lightweight intrusion detection scheme based on energy consumption analysis in 6lowpan. In Advanced technologies, embedded and multimedia for human-centric computing (pp. 1205–1213). Dordrecht: Springer.
43. Li, X., Lu, R., Liang, X., Shen, X., Chen, J., & Lin, X. (2011). Smart community: An Internet of
Things application. IEEE Communications Magazine, 49(11), 68–75.
44. Lunt, T. F. (1993). A survey of intrusion detection. Computers & Security, 12, 405–418.
45. Medhat, M., Elshafey, K., & Rashed, A. (2019). Evaluation of optimum NPRACH performance in
NB-IoT systems. International Journal of Computer Networks and Applications, 6(4), 55–64.
46. Misra, S., Krishna, P. V., Agarwal, H., Saxena, A., & Obaidat, M. S. (2011). A learning automata based solution for preventing distributed denial of service in Internet of Things. In IEEE 4th international conference on cyber, physical and social computing, Internet of Things (iThings/CPSCom) (pp. 114–122).
47. Modi, C., Patel, D., Borisaniya, B., Patel, H., Patel, A., & Rajarajan, M. (2013). A survey of intrusion
detection techniques in cloud. Journal of Network and Computer Applications, 36(1), 42–57.
48. Moosavi, S. R., Gia, T. N., Rahmani, A. M., Nigussie, E., Virtanen, S., Isoaho, J., et al. (2015). Sea: A
secure and efficient authentication and authorization architecture for IoT-based healthcare using smart
gateways. Procedia Computer Science, 52, 452–459.
49. Mosenia, A., & Jha, N. K. (2017). A comprehensive study of security of internet-of-things. IEEE
Transactions on Emerging Topics in Computing, 5(4), 586–602.
50. Notra, S., Siddiqi, M., Gharakheili, H. H., Sivaraman, V., & Boreli, R. (2014). An experimental study
of security and privacy risks with emerging household appliances. In 2014 IEEE conference on com-
munications and network security (pp. 79–84). https://doi.org/10.1109/CNS.2014.6997469.
51. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-
learn: Machine learning in python. Journal of Machine Learning Research, 12(Oct), 2825–2830.
52. Primartha, R., & Tama, B. A. (2017). Anomaly detection using random forest: A performance revis-
ited. In 2017 International conference on data and software engineering (ICoDSE) (pp. 1–6). IEEE.
53. Rodriguez, J. D., Perez, A., & Lozano, J. A. (2010). Sensitivity analysis of k-fold cross validation in
prediction error estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3),
569–575.
54. Rodríguez-Fdez, I., Canosa, A., Mucientes, M., & Bugarín, A. (2015). Stac: A web platform for the
comparison of algorithms using statistical tests. In IEEE international conference on fuzzy systems
(FUZZ-IEEE) (pp. 1–8).
55. Roman, R., Zhou, J., & Lopez, J. (2013). On the features and challenges of security and privacy in
distributed Internet of Things. Computer Networks, 57(10), 2266–2279.
56. Ronen, E., & Shamir, A. (2016). Extended functionality attacks on IoT devices: The case of smart
lights. In 2016 IEEE European symposium on security and privacy (EuroS&P) (pp. 3–12). https://doi.org/10.1109/EuroSP.2016.13.
57. Sagi, O., & Rokach, L. (2018). Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data
Mining and Knowledge Discovery, 8(4), e1249.
58. Sfar, A. R., Natalizio, E., Challal, Y., & Chtourou, Z. (2018). A roadmap for security challenges in the
Internet of Things. Digital Communications and Networks, 4(2), 118–137.
59. Sivaraman, V., Gharakheili, H. H., Vishwanath, A., Boreli, R., & Mehani, O. (2015). Network-level
security and privacy control for smart-home IoT devices. In IEEE 11th international conference on
wireless and mobile computing, networking and communications (WiMob) (pp. 163–167). https://doi.org/10.1109/WiMOB.2015.7347956.
60. Sonar, K., & Upadhyay, H. (2016). An approach to secure Internet of Things against DDOS. In
Springer proceedings of international conference on ICT for sustainable development (pp. 367–376).
61. Tama, B. A., & Rhee, K. H. (2019). An in-depth experimental study of anomaly detection using gradi-
ent boosted machine. Neural Computing and Applications, 31(4), 955–965.
62. Verma, A., & Ranga, V. (2018a). On evaluation of network intrusion detection systems: Statistical
analysis of CIDDS-001 dataset using machine learning techniques. Pertanika Journal of Science &
Technology, 26(3), 1307–1332.
63. Verma, A., & Ranga, V. (2018). Statistical analysis of CIDDS-001 dataset for network intrusion detec-
tion systems using distance-based machine learning. Procedia Computer Science, 125, 709–716.
64. Verma, A., & Ranga, V. (2019a). ELNIDS: Ensemble learning based network intrusion detection sys-
tem for RPL based Internet of Things. In 2019 4th International conference on Internet of Things:
Smart innovation and usages (IoT-SIU) (pp. 1–6). IEEE.
65. Verma, A., & Ranga, V. (2019). Evaluation of network intrusion detection systems for RPL based
6LoWPAN networks in IoT. Wireless Personal Communications, 108(3), 1571–1594.
66. Williams, N., Zander, S., & Armitage, G. (2006). A preliminary performance comparison of five
machine learning algorithms for practical IP traffic flow classification. ACM SIGCOMM Computer
Communication Review, 36(5), 5–16.
67. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Trans-
actions on Evolutionary Computation, 1(1), 67–82.
68. Zahoor, S., & Mir, R. N. (2018). Virtualization and IoT resource management: A survey. International
Journal of Computer Networks and Applications, 5(4), 43–51.
69. Zarpelão, B. B., Miani, R. S., Kawakani, C. T., & de Alvarenga, S. C. (2017). A survey of intrusion
detection in Internet of Things. Journal of Network and Computer Applications, 84, 25–37.
70. Zhao, C. W., Jayanand, J., & Son, C. L. (2015). Exploring IoT application using Raspberry Pi. Interna-
tional Journal of Computer Networks and Applications, 2(1), 27–34.

71. Zhao, K., & Ge, L. (2013). A survey on the Internet of Things security. In IEEE 9th international con-
ference on computational intelligence and security (CIS) (pp. 663–667).
72. Ziegeldorf, J. H., Morchon, O. G., & Wehrle, K. (2014). Privacy in the Internet of Things: Threats and
challenges. Security and Communication Networks, 7(12), 2728–2742.

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Abhishek Verma is a Ph.D. research scholar in the Department of Computer Engineering at National Institute of Technology Kurukshetra, Haryana, India. He received his B.Tech. degree (2014) in Computer Science & Engineering from Uttar Pradesh Technical University, Lucknow, India, and his M.Tech. degree (2016) in Computer Engineering from National Institute of Technology Kurukshetra, India. He is an IEEE student member. He has more than 3 years' experience in research and has published several international journal and conference papers. He is an active reviewer for many reputed journals of IEEE, Springer, Wiley, and Bentham Science. His current areas of interest include network security, in particular the Internet of Things and Wireless Sensor Networks, using statistical methods and machine learning mechanisms.

Virender Ranga is working as an Assistant Professor in the Department of Computer Engineering at National Institute of Technology Kurukshetra, India. He received his Ph.D. degree in 2016 from National Institute of Technology Kurukshetra, Haryana, India. He has published more than 60 research papers in various international SCI journals as well as reputed international conferences in the area of computer communications. He was conferred the Young Faculty Award in 2016 for his excellent contributions in the field of computer communications. He has served as a TPC member in various international conferences of repute. He is a member of the editorial board of various reputed journals, such as the Journal of Applied Computer Science & Artificial Intelligence, International Journal of Advances in Computer Science and Information Technology (IJACSIT), Circulation in Computer Science (CCS), International Journal of Bio Based and Modern Engineering (IJBBME), and International Journal of Wireless Networks and Broadband Technologies. Currently, he has been selected as a Guest Editor for a special issue to be published in the International Journal of Sensors, Wireless Communications and Control (Bentham Science Publication). He is an active reviewer for many reputed journals of IEEE, Springer, Elsevier, Taylor & Francis, Wiley, and InderScience. His current areas of interest include Wireless Sensor & Ad hoc Networks, Internet of Things, Flying Ad hoc Networks, and Software Defined Networking.
