Analysis of Support Vector Machine-Based Intrusion Detection Techniques

The document analyzes support vector machine (SVM)-based intrusion detection techniques. It discusses the methodology, which involves data collection, preprocessing, SVM training and testing, and decision making. The performance of various SVM techniques is evaluated using a benchmark dataset based on detection accuracy, ROC curves, and confusion matrices.

Arabian Journal for Science and Engineering (2020) 45:2371–2383

https://doi.org/10.1007/s13369-019-03970-z

RESEARCH ARTICLE - SPECIAL ISSUE - INTELLIGENT COMPUTING AND INTERDISCIPLINARY APPLICATIONS

Analysis of Support Vector Machine-based Intrusion Detection Techniques
Bhoopesh Singh Bhati1 · C. S. Rai1

Received: 31 December 2018 / Accepted: 10 June 2019 / Published online: 2 July 2019
© King Fahd University of Petroleum & Minerals 2019

Abstract
Over the last few decades, people have carried out various transactions such as air ticket reservation, online banking, distance learning and group discussion over the internet. Due to the explosive growth of information exchange and electronic commerce in the recent decade, security mechanisms are needed to protect sensitive information. Detection of intrusive behavior is one of the most important activities for protecting data and assets. Various intrusion detection systems are incorporated in networks for detecting intrusive behavior. In this paper, an analytical study of support vector machine (SVM)-based intrusion detection techniques is presented. The methodology involves four major steps, namely data collection, preprocessing, SVM technique for training and testing, and decision. The simulated results have been analyzed based on overall detection accuracy, Receiver Operating Characteristic (ROC) and Confusion Matrix. The NSL-KDD dataset, a benchmark for intrusion detection techniques that contains a huge amount of network records, is used to analyze the performance of the SVM techniques. The analyzed results show that Linear SVM, Quadratic SVM, Fine Gaussian SVM and Medium Gaussian SVM give 96.1%, 98.6%, 98.7% and 98.5% overall detection accuracy, respectively.

Keywords Information security · Intrusion detection · SVM · Machine learning

Bhoopesh Singh Bhati
[email protected]
C. S. Rai
[email protected]
1 USIC&T, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi 110078, India

1 Introduction

In today's world, network security has become important due to growing computer usage. Every day, users are subjected to new kinds of attacks, so it is essential to alert them about any malicious activities taking place in the network. As the threat becomes more serious day by day, there is a need to implement a system that realizes faster attack detection and response to protect the data that are exchanged on the network. When confidential, private, public or business assets are shared over an internet connection, protection of these assets against intrusive attacks is of prime importance. The intrusion detection system (IDS) plays a significant role in network security, and it is a prime area of research these days. The occurrence of an illegal event in a computer system or in a network is called intrusion, and diagnosing such events as illegal is called detection. Intrusion detection refers to the analysis of events happening in a computer system or in a network and checking whether each event is normal or harmful [1]. IDS is classified into three types: host-based IDS (HBIDS), network-based IDS (NBIDS) and hybrid IDS (HIDS). In HBIDS, the IDS resides on the host. These IDSs generally scan the host system for events. They may scan operating system log records and application log records for event traces. The advantages of HBIDS are its cost effectiveness and direct control over host entities. In NBIDS, the IDS is responsible for detecting malicious or illegal events for a subject network. NBIDS receives all packets on a network segment and analyzes the packets to detect malicious activities. NBIDS uses signatures to alert on intrusion attempts. Some NBIDSs require a set of patterns in the payload, which is very useful for detecting outside intrusions or attacks [2]. HIDSs have features of both NBIDS and HBIDS. In HIDS, monitored events are analyzed at both the host level and the network level.
Further, the paper is organized as follows: In Sect. 2, details about the IDS components and basic techniques are given. In Sect. 3, basic details about SVM and some related
work are presented. Methodology is explained in Sect. 4, Sect. 5 deals with empirical evaluation and results, analysis and discussion are presented in Sect. 6, and finally, the paper is concluded in Sect. 7.

2 Intrusion Detection Systems

2.1 IDS Components

Figure 1 shows the main phases of a generic IDS model. The first phase is the data accumulator, in which events are generated on the basis of log data. These log data are formed with the help of data collected by the target system. The data can be network traffic, operating system logs or application logs. The data stored in the configuration repository are then used in the attack detection phase.

Fig. 1 IDS components [3]

In attack detection, there are multiple analyzers which work simultaneously on the same data. They apply multiple match scripts to match text strings that are unique to various intrusions. This phase is similar to antivirus software, in which malicious code is matched against the signatures present in a database. There are other ways to perform detection, such as detecting anomalous behavior in the network traffic, but such techniques can lead to a high false alarm rate. The detection protocol repository contains the rules and algorithms on how to detect and prevent intrusions. The main information about the anomaly detection that needs to be sent to the response unit is stored in the detection protocol. The reply (response) unit receives the data which are intrusive in nature and need to be handled carefully. The reply is generated for these data with the help of reply statements which are present in the reply protocol repository [3]. The decision on how to respond to multiple events is taken by the main response unit and the reply protocol database. The reply unit may get multiple inputs from different analyzers in a distributed environment, and thus it can handle all incoming alarms jointly. In order to help the administrator, the reply unit can perform certain tasks after detecting an intrusion, such as creating an alarm to notify the controller, shutting down the system or cutting off the connection from which the malicious traffic is coming.
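
The matching step described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory signature database and hypothetical log events (neither is taken from the paper): each analyzer is modelled as a plain substring match, and the reply unit collects one alarm per match.

```python
# Hypothetical signature database: text strings assumed to be unique to an intrusion.
SIGNATURES = {
    "etc-passwd-read": "/etc/passwd",
    "sql-injection": "' OR '1'='1",
}


def match_signatures(event: str, signatures: dict[str, str]) -> list[str]:
    """Return the names of all signatures whose text string occurs in the event."""
    return [name for name, pattern in signatures.items() if pattern in event]


def reply_unit(events: list[str]) -> list[tuple[int, str]]:
    """Collect one alarm (event index, signature name) for every match."""
    alarms = []
    for index, event in enumerate(events):
        for name in match_signatures(event, SIGNATURES):
            alarms.append((index, name))
    return alarms


# Hypothetical events produced by the data accumulator.
events = [
    "GET /index.html HTTP/1.1",
    "GET /download?file=../../etc/passwd HTTP/1.1",
    "POST /login user=admin' OR '1'='1",
]
alarms = reply_unit(events)
```

A real IDS would match on parsed packet or log fields rather than raw substrings; the sketch only shows how signatures, events and alarms relate.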


2.2 Types of Intrusion Detection Techniques

There are basically three types of intrusion detection techniques. These are broadly explained as follows:

2.2.1 Rule-Based Intrusion Detection

This is the earliest scheme for intrusion detection. In this scheme, predefined rules are stored in the IDS knowledge base. When an event occurs, it is checked against the stored rules. If the event follows the rules, it is treated as a normal event; otherwise, it is referred to as an intrusion. In rule-based intrusion detection, the rules must be updated periodically, which is an overhead and can be considered a disadvantage. However, this technique gives a low false alarm rate. It is also called pattern matching-based intrusion detection.

2.2.2 Signature-Based Intrusion Detection

In signature-based intrusion detection, the signatures of malicious activities are stored in the intrusion detection knowledge base. These activities are harmful for the system. When an event occurs, the signatures of the event are analyzed and checked against the database. If a signature matches, the event is referred to as an intrusion; otherwise, it is treated as a normal event. The efficiency of signature-based intrusion detection depends on the signatures in the database. Thus, more and more signatures must be stored in the IDS knowledge base for better efficiency, which is regarded as a drawback of this detection technique. It is also called misuse intrusion detection.

2.2.3 Anomaly-Based Intrusion Detection

In anomaly-based intrusion detection, any deviation is considered for detecting intrusion. In this scheme, the deviation from the normal behavior is analyzed and examined. If the deviation from the normal behavior is large, the occurring event is referred to as an intrusion. This IDS is very useful for unforeseen malicious activities. Anomaly-based intrusion detection is easily configurable and provides acceptable accuracy. The only drawback of this type of IDS is that it produces more false alarms. Anomaly-based intrusion detection is categorized as Behavior Anomaly Detection, Network Behavior Anomaly Detection and Protocol Anomaly Detection. The Behavior Anomaly Detection technique is a characteristic-dependent technique. In this method, the behavior of the user is observed to detect anomalies. In Network Behavior Anomaly Detection, a proprietary network is analyzed for anomalies. This means that the network is monitored to detect unsafe events. It is a statistical technique rather than a characteristic-dependent one. This technique is also called a traffic anomaly system. In the Protocol Anomaly Detection technique, the system is examined for any variation from the set protocol standard. This technique is primarily characteristic dependent.

3 Support Vector Machine

SVM usually involves separating the dataset into two sets: a training set and a testing set. Every instance in the training dataset has a target value and various attributes or features. The main aim of SVM is to develop a model from the training dataset which predicts the target values of the test dataset based on its attributes. SVM is also memory efficient since it uses a subset of the training points. It does not perform well on noisy datasets with overlapping classes. It is used in text classification, image processing and intrusion detection systems. Compared to a neural network, SVM is easier to use. Also, SVM is very effective in high-dimensional spaces. In this paper, a detailed analysis of SVM-based intrusion detection techniques is presented.

3.1 SVM Hyperplane

Support vector machine (SVM) is a set of related supervised machine learning algorithms which is used as a discriminative classifier for distinguishing training datasets with the help of a separating hyperplane. The hyperplane is considered to be weak if it passes close to the points, as it will then be sensitive to noise in those points. SVM is mostly used in classification problems. It minimizes the classification error and maximizes the geometric margin between two classes [4]. Every feature of the data items can be represented in n-dimensional space, and SVM can then be used to find the hyperplane which divides the dataset into two classes; a two-dimensional example is shown in Fig. 2.

Fig. 2 Support vector machine

Support vectors are considered highly critical elements of the dataset because they are the data points nearest to the hyperplane. If these data points are removed, the position of the dividing hyperplane is altered [5]. The advantage of SVM is that it gives better accuracy and works well for smaller datasets.
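
The roles of the separating hyperplane, the geometric margin and the support vectors can be sketched numerically. The weight vector `w`, offset `b` and the four toy points below are assumptions chosen for illustration (they are not fitted by an SVM solver and are not NSL-KDD records); the sketch only evaluates a given hyperplane.

```python
import math

# Hypothetical 2-D separating hyperplane w.x + b = 0 (illustrative values only).
w = (1.0, 1.0)
b = -3.0

# Toy labelled points (+1 / -1); assumed data, not taken from any dataset.
points = [((3.0, 1.0), +1), ((4.0, 4.0), +1), ((1.0, 1.0), -1), ((0.0, 0.5), -1)]


def decision(x):
    """Signed value of w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def distance(x):
    """Geometric distance from point x to the hyperplane."""
    return abs(decision(x)) / math.hypot(*w)


predictions = [1 if decision(x) >= 0 else -1 for x, _ in points]

# The geometric margin is the distance of the closest point to the hyperplane.
geometric_margin = min(distance(x) for x, _ in points)

# The points at that minimum distance play the role of support vectors here.
support_vectors = [x for x, _ in points if math.isclose(distance(x), geometric_margin)]
```

Removing `(4.0, 4.0)` or `(0.0, 0.5)` would leave this margin unchanged, whereas removing one of the two closest points would change it, which is why the nearest points are the critical ones.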


3.2 Optimal Hyperplane

The width of the margin must be maximized in order to obtain an optimal hyperplane, because in SVM a larger margin gives better results. The optimization method using the Karush–Kuhn–Tucker (KKT) [6] conditions is used to achieve the optimal hyperplane. An example of an optimal hyperplane is given in Fig. 3. The optimal hyperplane can be defined mathematically for the linearly separable case as follows. For binary classification, the dataset consists of pairs $(x_i, y_i)$, $i = 1, 2, \ldots, m$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{+1, -1\}$. For the correct classification of the dataset [7],

$$(w \cdot x_i) + b \ge +1 \quad \text{for } y_i = +1 \tag{1}$$

$$(w \cdot x_i) + b \le -1 \quad \text{for } y_i = -1 \tag{2}$$

where $y_i$ is the class label, which takes one of two values, $w$ is a weight vector, $x_i$ is an input vector and $b$ is a threshold value. Combining Eqs. (1) and (2), we get

$$y_i((w \cdot x_i) + b) - 1 \ge 0 \quad \text{for } i = 1, 2, \ldots, m \tag{3}$$

Fig. 3 Optimal hyperplane using SVM

3.3 Kernel Function

Explicit conversion to a high-dimensional feature space is not preferred because it leads to high computational cost. By using a kernel function, the transformation $y \to \Phi(y)$ can be achieved without any major computational cost. Kernel functions are widely used in support vector machines. A set of mathematical functions used by the SVM algorithm is defined as the kernel [8]. In a kernel function, the mapping function $\Phi$ maps the original input data into a higher-dimensional feature space:

$$K(y_s, y_t) = \Phi(y_s) \cdot \Phi(y_t) \tag{4}$$

A kernel function K must follow the conditions [9], i.e., it must be continuous, symmetric and have a positive definite matrix. Some common kernel functions used with SVM are given below:

a. Linear SVM: This kernel is preferred when there are a huge number of features. The linear kernel is suitable for linearly separable data. It is faster to train the classifier with the linear kernel than with other kernels because the linear kernel does not perform any mapping. Linear SVM deals efficiently with huge datasets. Suppose $y_s$ and $y_t$ are data points; then for the linear kernel:

$$K(y_s, y_t) = y_s \cdot y_t \tag{5}$$

b. Quadratic SVM: In the training phase of Quadratic SVM, memory usage is medium for binary classification and large for multiclass classification. The prediction speed is fast for binary classification and slow for multiclass classification. In Quadratic SVM, interpretability is hard and model flexibility is medium. Quadratic SVM may be defined as:

$$K(y_s, y_t) = (1 + y_s \cdot y_t)^2 \tag{6}$$

c. Fine Gaussian SVM: In the training phase of Fine Gaussian SVM, memory usage is medium for binary classification and large for multiclass classification. Interpretability is hard and model flexibility is high; flexibility decreases as the kernel scale increases, and here the kernel scale is set to sqrt(P)/4. It makes finely detailed distinctions between classes. The prediction speed is fast for binary classification and slow for multiclass classification.
d. Medium Gaussian SVM: The prediction speed is fast for binary classification and slow for multiclass classification. In Medium Gaussian SVM, memory usage is medium for binary classification and large for multiclass classification. Interpretability is hard, and the model flexibility of Medium Gaussian SVM is medium.

Fig. 4 ROC of DOS
Fig. 5 ROC of normal
Fig. 6 ROC of probe
Fig. 7 ROC of R2L

3.4 SVM-Based Intrusion Detection Techniques

The SVM training algorithm analyzes the data and accordingly generates a function to classify new data, which in turn improves on new training datasets. In the research work of Matteo Fischetti [10], a new SVM model is presented which uses the Gaussian kernel to train the machine for new datasets. This SVM uses the Gaussian kernel function to measure the similarity among multiple input points. With the help of Mixed-Integer Linear Programming in SVM, it has led to better classification accuracy and a slight increase in the speed of the training process. Moreover, Zhang et al. [11] proposed selecting a useful feature subset from the feature space with the help of the Fisher score. As SVM makes classifications among certain features, the Fisher score is completely based on the criterion of distance among different features.
The highest Fisher score is given to those features for which the data points of different classes are far from each other while the data points of the same class are close to each other. This combination of SVM and Fisher score is applied to KDDcup99 to detect intrusions in the dataset. The research work of Peddabachigari [12] proposes hybrid intrusion detection systems which combine the base classifiers with newer machine learning paradigms such as Decision Trees and SVM to maximize the accuracy and reduce the quadratic computational complexity. These types of machines rely heavily on anomaly-based ID techniques, but it takes a lot of time to train such machines with a huge amount of training data. SVM is one of the most successful techniques for classifying intrusive behaviors. Many techniques, such as random selection or approximation, have been used to reduce the training time to a feasible limit. But such techniques can also have major consequences, such as loss of results and an increase in false positives. Hierarchical clustering of the training dataset can be useful. Wang et al. [13] used the concept of "one class SVM", which includes one set of examples belonging to a particular class. The training time can be reduced with subsampling techniques, in which the classifier trains faster after some random training points are removed. If the probability distributions of the training and testing data are not the same, this subsampling technique is not useful for the training process. Guangping et al. [14] proposed an Improved Harmony Search Algorithm to improve SVM efficiency for network intrusion detection. The authors used the dataset of a certain city of China during 2006–2012. Li et al. [15] combined the fuzzy SVM and multiclass SVM based on a binary tree. In this research work, the authors used the KDDcup99 dataset and showed the superiority of their method. Cuong et al. [16] gave a modified support vector machine and a modified back propagation scheme using covariate shift. In this research work, the authors used the KDD1999 dataset and applied the modified technique to intrusion detection. The authors addressed the negative effects of dataset shift in IDS and discussed how to overcome them. Cuong et al. obtained higher accuracy with the modified technique than with the original technique. In this research work, the authors deployed the Kernel Mean Matching (KMM) and Unconstrained Least-Squares Importance Fitting (ULSIF) techniques to modify the support vector machine and to make this technique work under covariate shift. M-SVM gives a higher result than the best of nine binary classifications.

Fig. 8 ROC of U2R
Fig. 9 Overall Confusion Matrix
Fig. 10 ROC of DOS
Fig. 11 ROC of normal
Fig. 12 ROC of probe
Fig. 13 ROC of R2L
Fig. 14 ROC of U2R

In the paper by Parwekar et al. [17], different issues related to network attacks are discussed. The researchers suggested using hybrid approaches for intrusion detection in both wired and wireless networks. These approaches show many advantages in the experimental results. Horng et al. [18] proposed a scheme for intrusion detection using hierarchical clustering and support vector machine. They combined Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) hierarchical clustering with the SVM technique to propose an SVM-based intrusion detection system. To evaluate the proposed system, they used the KDDcup99 dataset. The authors report the best performance in the detection of DOS and Probe attacks when compared to other intrusion detection systems on the same dataset. BIRCH hierarchical clustering provides the SVM training with an abstracted, high-quality and reduced dataset, which the original large dataset cannot. Using the KDDcup99 dataset without sampling in their experiment, they reached an accuracy of 95.72% with a false positive rate of 0.7% on the proposed system. Azad et al. [19] proposed an IDS with the help of Decision Tree and Genetic Algorithm. KDDcup99 datasets have been used to test the proposed system. Tiwari et al. [20] proposed an IDS using Neural Network, SVM and Neuro Fuzzy. In this research work, the authors analyzed the results with the help of ROC and achieved 97.71% detection accuracy. Xiaoping [21] proposed an SVM-based intrusion detection technique with a Nonlinear Dimensionality Reduction Algorithm. Isometric mapping has been used for dimension reduction in this research work. Li et al. [22] proposed a scheme for intrusion detection based on support vector machine and the gradual feature removal method. In this research work, the authors used the KDDcup99 dataset and a pipeline in IDS. The authors make the data compact by clustering redundant data; the Ant Colony Algorithm is used to select a small training set that captures the key features of the network and to obtain the classifier with SVM. The IDS pipeline has an accuracy of 98.62% with tenfold cross-validation, and the Matthews Correlation Coefficient achieved is 0.861161. Li et al. proposed the gradual feature removal method to make the wrapper-based feature reduction method more precise. In paper [23], Parwekar et al. proposed an idea to implement robotics in military area applications for intrusion detection. Yogita et al. [24] proposed a scheme for IDS using SVM. The NSL-KDDcup99 dataset has been used to conduct some experiments for the proposed system. The classification is performed using SVM. The limitation of SVM is the extensive time it requires to build a model. Their experiments demonstrate that this limitation can be overcome when the proper SVM kernel, i.e., radial basis function (RBF), is selected and the dataset is processed properly. The attack detection accuracy achieved was 94.18%, and the time required to build a model was 77.07 s. This attack detection accuracy increased to 98.57% when they conducted the experiment using the supplied test set with the same RBF SVM kernel function and tenfold cross-validation.

Fig. 15 Overall Confusion Matrix
Fig. 16 ROC of DOS
Fig. 17 ROC of normal
Fig. 18 ROC of probe
Fig. 19 ROC of R2L
Fig. 20 ROC of U2R
Fig. 21 Overall Confusion Matrix
Fig. 22 ROC of DOS
Fig. 23 ROC of normal

4 Methodology

The following methodology is used to analyze the well-known support vector machine techniques for intrusion detection, e.g., Linear SVM, Quadratic SVM, Fine Gaussian SVM and Medium Gaussian SVM. Our methodology has four steps, i.e., data collection, preprocessing, SVM technique and decision. Data collection is needed for the evaluation of any intrusion detection technique. The NSL-KDD dataset is used in this methodology. It is widely used in intrusion detection, and it is freely available. Basically, NSL-KDD [25] is a
benchmark for intrusion detection techniques and contains a huge amount of network records. The NSL-KDD dataset has 41 features; some are basic features, some are content-related features, a few are time-related features and the remaining are host-based traffic features.

Fig. 24 ROC of probe
Fig. 25 ROC of R2L
Fig. 26 ROC of U2R
Fig. 27 Overall Confusion Matrix

The NSL-KDD dataset is a mixture of attack and normal records. The attacks come under four categories, i.e., Denial of Service (DOS), Remote 2 User (R2L), User 2 Root (U2R) and Probing. So, the NSL-KDD dataset has five classes: DOS, R2L, U2R, Probe and Normal. The details of the attack categories are given below [26]:

Denial of Service (DOS): A DOS or Distributed Denial-of-Service (DDOS) attack occurs when an attacker makes memory resources or other network resources unavailable or too full to respond to valid requests, for example Smurf, Ping of death, Neptune, etc.

Remote 2 User (R2L): R2L is an attack in which an attacker who is able to send packets over the network gains unauthorized access. The attacker can then gain local access as a user of the machine, e.g., Guess_password, Xlock, etc.


User 2 Root (U2R): This attack occurs when an attacker tries to get access to the system or account of a normal user, for example Perl, Xterm, etc.

Probing: Probing is an attack in which an unauthorized user scans a system to gain information about the network. This information can be used for future attacks or for bypassing the security controls, for example mscan, Satan, etc.

Preprocessing means refining the collected datasets. Preprocessing is very important because in this phase features can be selected or eliminated for the implementation. It reduces the computation time.

After preprocessing, the SVM technique is applied for training and testing the model. A cross-validation scheme is used for validation. The validation scheme is used to examine the predictive accuracy of the model, and it is chosen before the training of the model. Suppose k-fold is selected; that is, the original sample is divided into k subsamples, among which (k − 1) subsamples are used for training and the remaining subsample is used for testing the model. In this implementation, fivefold cross-validation is used. Finally, the decision comes out as either normal or malicious.

5 Empirical Evaluation and Results

5.1 Linear SVM

In this technique, SVM makes a simple linear separation between classes. This SVM technique uses the linear kernel. It is the simplest SVM method to interpret. Linear SVM gives 96.1% overall accuracy and 3.9% overall error. The model building time using Linear SVM is 74 s. Figures 4, 5, 6, 7 and 8 show the ROC diagram using Linear SVM for DOS, normal, Probe, R2L and U2R, respectively. ROC shows the performance of the SVM technique. Figure 9 shows the overall Confusion Matrix for Linear SVM.

5.2 Quadratic SVM

In this type of SVM, the quadratic kernel is used. Quadratic SVM gives 98.6% overall accuracy and 1.4% overall error. The model building time using Quadratic SVM is 53 s. Figures 10, 11, 12, 13 and 14 show the ROC diagram using Quadratic SVM for DOS, normal, Probe, R2L and U2R, respectively. Figure 15 shows the overall Confusion Matrix for Quadratic SVM.

5.3 Fine Gaussian SVM

This SVM makes finely detailed distinctions between classes, using the Gaussian kernel with the kernel scale set to sqrt(P)/4, where P is the number of predictors. Fine Gaussian SVM gives 98.7% overall accuracy and 1.3% overall error. The model building time using Fine Gaussian SVM is 100 s. Figures 16, 17, 18, 19 and 20 show the ROC diagram using Fine Gaussian SVM for DOS, normal, Probe, R2L and U2R, respectively. Figure 21 shows the overall Confusion Matrix for this type of SVM.

5.4 Medium Gaussian SVM

This SVM makes fewer distinctions than Fine Gaussian SVM, using the Gaussian kernel with the kernel scale set to sqrt(P), where P is the number of predictors. Medium Gaussian SVM gives 98.5% overall accuracy and 1.5% overall error. The model building time using Medium Gaussian SVM is 50 s. Figures 22, 23, 24, 25 and 26 show the ROC diagram using Medium Gaussian SVM for DOS, normal, Probe, R2L and U2R, respectively. Figure 27 shows the overall Confusion Matrix for Medium Gaussian SVM.

6 Analysis and Discussion

This analysis involves the SVM type, kernel function, box constraint level, kernel scale mode and multiclass method. The kernel function is used to compute the Gram matrix. The box constraint value bounds the allowable values of the Lagrange multipliers. The multiclass method is used to reduce the multiclass classification problem to a set of binary classification problems. Multiclass SVM constructs k different classes at the training phase of IDS. Table 1 gives the details about the SVM techniques.

Table 1 Details of SVM techniques
SVM type               Kernel function    Box constraint level    Kernel scale mode    Multiclass method
Linear SVM             Linear             1                       Auto                 One-vs-one
Quadratic SVM          Quadratic          1                       Auto                 One-vs-one
Fine Gaussian SVM      Fine Gaussian      1                       Manual               One-vs-one
Medium Gaussian SVM    Medium Gaussian    1                       Manual               One-vs-one

In this analysis study, the one-versus-one multiclass method is chosen. In the one-versus-one method, all possible pairwise SVM classifiers are constructed [27]. The number of classifiers is k(k − 1)/2. The test phase uses the Max Wins Algorithm [28]; each individual classifier gives a vote for
its preferred class, and the class with the maximum votes wins. The number of classifiers increases superlinearly as k grows.

The simulated results have been analyzed based on the overall detection accuracy in %, ROC and Confusion Matrix. ROC is also called the performance curve. It shows the performance of the selected SVM technique and the relationship between the true positive rate and the false positive rate. In the ROC plot, the x-axis shows the false positive rate and the y-axis shows the true positive rate (detection rate) [29]. In the ideal case, the ROC curve makes a 90° turn at the upper left corner. The Confusion Matrix graphically displays the performance of the selected SVM technique in each attack class. It shows the relationship between well-classified records and misclassified records. The terms used in the analysis stage are defined as follows:

True Positive (TP): Number of records correctly detected as attack class.
False Positive (FP): Number of records incorrectly detected as attack class.
True Negative (TN): Number of records correctly detected as normal class.
False Negative (FN): Number of records incorrectly detected as normal class.

$$\text{True positive rate (TPR)} = \frac{TP}{TP + FN} \tag{7}$$

$$\text{False positive rate (FPR)} = \frac{FP}{FP + TN} \tag{8}$$

$$\text{Detection accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{9}$$

The overall detection accuracy and overall error for all SVM techniques are given in Table 2. Table 3 shows the time (in seconds) taken by every technique for model building. A comparison based on detection accuracy for each individual attack class is presented in Table 4.

Table 2 Comparison based on overall detection accuracy and overall error
SVM technique          Accuracy (%)    Error (%)
Linear SVM             96.1            3.9
Quadratic SVM          98.6            1.4
Fine Gaussian SVM      98.7            1.3
Medium Gaussian SVM    98.5            1.5

Table 3 Comparison based on model building time
SVM technique          Time in seconds
Linear SVM             74
Quadratic SVM          53
Fine Gaussian SVM      100
Medium Gaussian SVM    50

Table 4 Comparison based on detection accuracy for individual type of attack class
Types of attack class    Linear SVM    Quadratic SVM    Fine Gaussian SVM    Medium Gaussian SVM
DOS                      99.7398       99.7945          99.9644              99.8601
Normal                   98.6537       99.5751          99.8343              99.5752
Probe                    98.5939       99.4321          99.7743              99.6813
R2L                      98.2637       97.9137          99.3901              99.5728
U2R                      96.3245       91.8876          87.4841              89.4713

7 Conclusion

In this paper, a detailed analysis of intrusion detection based on support vector machine (SVM) techniques has been presented. The different SVM techniques have been applied to the NSL-KDD dataset, and MATLAB has been used for the simulations. The results have been analyzed with the help of ROC and Confusion Matrix. The model building time for every SVM technique has been observed. The detection accuracy for each individual attack class has been identified and analyzed. The analyzed results show that Linear SVM, Quadratic SVM, Fine Gaussian SVM and Medium Gaussian SVM give 96.1%, 98.6%, 98.7% and 98.5% overall detection accuracy, respectively, with an overall error of 3.9%, 1.4%, 1.3% and 1.5%, respectively. This analysis concludes that Fine Gaussian SVM provides the best accuracy and the least error for intrusion detection.

In future research, a real-time dataset can be used to analyze these techniques, and SVM optimization techniques may be involved. SVM-based intrusion detection techniques can be used for IOT in the smart world.

References

1. Tsai, C.F.; Hsu, Y.F.; Lin, C.Y.; Lin, W.Y.: Intrusion detection by machine learning: a review. Expert Syst. Appl. 36(10), 11994–12000 (2009). https://doi.org/10.1016/j.eswa.2009.05.029
2. Bhati, B.S.; Rai, C.S.: Intrusion detection systems and techniques: a review. Int. J. Crit. Comput.-Based Syst. 6(3), 173–190 (2016). https://doi.org/10.1504/IJCCBS.2016.079077
3. Lundin, E.; Jonsson, E.: Survey of intrusion detection research. Chalmers University of Technology, Gothenburg (2002)
4. Joachims, T.: Text categorization with support vector machines: learning with many relevant features. In: European Conference on Machine Learning, pp. 137–142. Springer, Berlin (1998). https://doi.org/10.1007/bfb0026683
5. www.kdnuggets.com. Accessed 07 Sept 2018


6. Huang, C.L.; Wang, C.J.: A GA-based feature selection and parameters optimization for support vector machines. Expert Syst. Appl. 31(2), 231–240 (2006). https://doi.org/10.1016/j.eswa.2005.09.024
7. Burges, C.J.: A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2(2), 121–167 (1998). https://doi.org/10.1023/A:1009715923555
8. Smola, A.J.; Ovari, Z.L.; Williamson, R.C.: Regularization with dot-product kernels. In: Advances in Neural Information Processing Systems, pp. 308–314 (2001)
9. https://nlp.stanford.edu/IR-book/html/htmledition/nonlinear-svms-1.html. Accessed 09 Aug 2018
10. Fischetti, M.: Fast training of support vector machines with Gaussian kernel. Discrete Optim. 22, 183–194 (2016). https://doi.org/10.1016/j.disopt.2015.03.002
11. Xue-qin, Z.; Chun-hua, G.; Jia-jun, L.: Intrusion detection system based on feature selection and support vector machine. In: 2006 First International Conference on Communications and Networking in China, pp. 1–5. IEEE (2006). https://doi.org/10.1109/chinacom.2006.344739
12. Peddabachigari, S.; Abraham, A.; Grosan, C.; Thomas, J.: Modeling intrusion detection system using hybrid intelligent systems. J. Netw. Comput. Appl. 30(1), 114–132 (2007). https://doi.org/10.1016/j.jnca.2005.06.003
13. Wang, K.; Stolfo, S.J.: One-class training for masquerade detection. In: Workshop on Data Mining for Computer Security, Melbourne, Florida, Nov 19, pp. 10–19 (2003)
14. Zhou, G.; Shrestha, A.: Efficient intrusion detection scheme based on SVM. J. Netw. 8(9), 2128–2134 (2013). https://doi.org/10.4304/jnw.8.9.2128-2134
15. Li, L.; Gao, Z.P.; Ding, W.Y.: Fuzzy multi-class support vector machine based on binary tree in network intrusion detection. In: 2010 International Conference on Electrical and Control Engineering, pp. 1043–1046. IEEE (2010). https://doi.org/10.1108/ics-04-2013-0031
16. Cuong, T.D.; Giang, N.L.: Intrusion detection under covariate shift using modified support vector machine and modified backpropagation. In: Proceedings of the Third Symposium on Information and Communication Technology, pp. 266–271. ACM (2012). https://doi.org/10.1145/2350716.2350756
17. Parwekar, P.; Satapathy, S.C.: Leveraging Bigdata Towards Enabling Analytics Based Intrusion Detection Systems in Wireless Sensor Networks. CSI Communications, 12 (2012)
18. Horng, S.J.; Su, M.Y.; Chen, Y.H.; Kao, T.W.; Chen, R.J.; Lai, J.L.; Perkasa, C.D.: A novel intrusion detection system based on hierarchical clustering and support vector machines. Expert Syst. Appl. 38(1), 306–313 (2011). https://doi.org/10.1016/j.eswa.2010.06.066
19. Azad, C.; Jha, V.K.: Decision tree and genetic algorithm based intrusion detection system. In: Proceeding of the Second International Conference on Microelectronics, Computing & Communication Systems (MCCS 2017), pp. 141–152. Springer, Singapore (2019)
20. Tiwari, A.; Ojha, S.K.: Design and analysis of intrusion detection system via neural network, svm, and neuro-fuzzy. In: Abraham, A., Dutta, P., Mandal, J., Bhattacharya, A., Dutta, S. (eds) Emerging Technologies in Data Mining and Information Security, Advances in Intelligent Systems and Computing, vol. 755, pp. 49–63. Springer, Singapore (2019)
21. Li, X.: Support vector machine based intrusion detection method combined with nonlinear dimensionality reduction algorithm. Sens. Transducers 159(11), 226 (2013)
22. Li, Y.; Xia, J.; Zhang, S.; Yan, J.; Ai, X.; Dai, K.: An efficient intrusion detection system based on support vector machines and gradually feature removal method. Expert Syst. Appl. 39(1), 424–430 (2012). https://doi.org/10.1016/j.eswa.2011.07.032
23. Parwekar, P.; Singhal, R.: Robot assisted emergency intrusion detection and avoidance with a wireless sensor network. In: Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2013, pp. 417–422. Springer, Cham (2014)
24. Bhavsar, Y.B.; Waghmare, K.C.: Intrusion detection system using data mining technique: support vector machine. Int. J. Emerg. Technol. Adv. Eng. 3(3), 581–586 (2013). https://doi.org/10.17485/ijst/2017/v10i14/93690
25. http://nsl.cs.unb.ca/NSL-KDD/. Accessed 09 Jan 2018
26. Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6. IEEE (2009). https://doi.org/10.1109/cisda.2009.5356528
27. Hsu, C.W.; Lin, C.J.: A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 13(2), 415–425 (2002). https://doi.org/10.1109/72.991427
28. Friedman, J.H.: Another approach to polychotomous classification. Technical Report, Statistics Department, Stanford University (1996)
29. Kumar, P.A.R.; Selvakumar, S.: Distributed denial of service attack detection using an ensemble of neural classifier. Comput. Commun. 34(11), 1328–1341 (2011). https://doi.org/10.1016/j.comcom.2011.01.012
