Deep Learning Approach On Network Intrusion Detection System Using NSL-KDD Dataset
Aroj Subedi
Sikkim Manipal Institute of Technology, Sikkim Manipal University, Majitar, Sikkim, India
E-mail: [email protected]
Copyright © 2019 MECS I.J. Computer Network and Information Security, 2019, 3, 8-14

Abstract—The network infrastructure of any organization is under constant threat from a variety of attacks, namely break-ins, security breaches and system misuse. The Network Intrusion Detection System (NIDS) employed in a network detects such penetration attacks and intrusions within the network. Known classes of attacks can be detected easily by pattern matching, while unknown attacks are harder to detect. An attempt has been made to design a system using a deep learning approach for intrusion detection that not only learns but also adjusts itself to patterns not defined earlier. A sparse auto-encoder has been used for unsupervised feature learning. A logistic classifier is then utilized for classification on the NSL-KDD dataset. The performance of the system has been measured with respect to accuracy, precision and recall, and the results have been found to be very promising for future use and modifications.

Index Terms—NIDS, deep learning, Sparse auto-encoder, logistic classifier, NSL-KDD.

I. INTRODUCTION
The network architecture is always vulnerable to various types of security breaches, attempted break-ins, penetration attacks and other similar intrusions by unauthorized and malicious users. The network, being a repository that aims at sharing resources between authorized users, also attracts unwanted users who are interested in exploiting them. In addition, formulations of global protection policies are rare and difficult to implement. A security breach or intrusion is a critical issue for any organization. It is thus important to develop precautionary measures to safeguard the interest of the organization from the various categories of attacks to which it is susceptible.

As defined by Heady et al. [7], "an intrusion is a set of actions that attempt to compromise the integrity, confidentiality or availability of information resources." The system employed to detect such malicious actions in a network is termed a Network Intrusion Detection System (NIDS). It should be able to detect a wide range of attacks and security violations inflicted by outsiders. The system should also be able to check on any malpractice or abuse by insiders.

Intruders can broadly be classified into three different categories. Masqueraders are typically outsiders who are not authorized users but penetrate the system using legitimate user accounts. A Misfeasor is an insider, a legitimate user who misuses the privileges given and accesses resources that they are not authorized to access. A Clandestine user can be either an insider or an outsider who tries to gain supervisory access to the system [1].

NIDS fall into two categories, namely the Signature-based Network Intrusion Detection System (SNIDS) and the Anomaly detection based Network Intrusion Detection System (ADNIDS). An SNIDS raises an alarm for intrusion by performing pattern matching on the features of the information it is aware of. An ADNIDS, on the other hand, raises an alarm for intrusion if there is any significant deviation of the user activity under analysis from the normal traffic pattern. An SNIDS, therefore, has a higher detection rate for the known types of attacks, while an ADNIDS performs better in the case of novel or unknown patterns of attacks. However, due to the variations in the behavior of the intruder, an ADNIDS tends to generate a high number of false alarms. Security violations can be detected by monitoring the system audit record for any abnormal pattern of system usage [2].

Different kinds of machine learning techniques have been employed to develop a Network Intrusion Detection System for anomaly detection [9]. The NIDS model designed can be trained and tested for performance using the NSL-KDD dataset [5], which is a significant upgrade of
the KDD Cup 99 dataset [4]. Different machine learning techniques perform differently based on the input features and on the training and test datasets selected [3]. Similar types of approaches, learning techniques, and input features do not always guarantee the same results for a variety of different classes of possible unknown attacks.

Deep learning techniques are popular as they facilitate the design of robust and efficient NIDS. Deep learning approaches based on Self-taught Learning (STL) [6] and a Non-symmetric Deep Auto-Encoder (NDAE) [8] have been found to be useful for unsupervised feature learning of unlabeled data to understand the intrinsic behavioral patterns of intruders. A classification of the patterns can be performed using suitable classifiers like soft-max regression. In the proposed work, a deep learning approach based on sparse auto-encoders is used to learn the nature of the patterns, and a logistic regression classifier is used to classify the behavior of users learned through the stacked encoders. The related work is discussed in Section II. The proposed work is given in Section III followed by its design in Section IV. The experimental results and discussion are given in Section V. The conclusion and future work are given in Section VI.

II. RELATED WORKS
In most of the work carried out on intrusion detection, the predictive modeling part is performed using similar types of datasets for training and testing. It is difficult to generalize real-time events through these datasets. The performance of the majority of these predictive models thus decreases when they are applied to real network traffic.

Several approaches have been proposed for the classification of normal connections with anomalies to detect intrusions in a network. Shyu et al. [9] proposed a novel scheme using Principal Component Analysis (PCA), treating anomalies as outliers. The anomaly detection scheme performed better with the KDD'99 dataset. The detection rate rose to 99% while the false alarm rate dropped to as low as 1%. Revathi et al. [3] performed a detailed analysis on the NSL-KDD dataset using only relevant features, both with and without feature reduction of the dataset, on different classification algorithms like J48 decision tree, Random Forest, Support Vector Machine, Naive Bayes, etc. Random Forest achieved the highest test accuracy in both cases.

Deep learning techniques facilitate the design of flexible and robust NIDS. Khaled et al. [10] proposed a deep learning approach for intrusion detection using one hidden layer of Restricted Boltzmann Machine for unsupervised feature reduction and logistic regression with multi-class soft-max for classification. The model was tested on the total 10% KDD-Cup'99 test dataset and a detection rate of 97.9% was achieved. The KDD-Cup'99 dataset, however, does not remotely pose as reasonable a challenge as real network traffic. Niyaz et al. [6] proposed Self-Taught Learning (STL), a deep learning technique for unsupervised feature learning, with soft-max regression for classification. The model was evaluated for 2-class, 5-class and 23-class classification against the benchmark NSL-KDD dataset; the results obtained were encouraging and the model showed better performance. Yin et al. [11] proposed a deep learning approach for intrusion detection using Recurrent Neural Networks (RNN). The experimental results showed that the performance of the model was promising in both binary and multiclass classification and that the model was able to classify with high accuracy. Shone et al. [8] proposed a novel deep learning classification model constructed using a stacked Non-Symmetric Deep Auto-Encoder (NDAE) for unsupervised feature learning and the Random Forest (RF) algorithm for classification. The model was implemented in TensorFlow using the benchmark KDD Cup'99 and NSL-KDD datasets. The model achieved a consistent level of classification accuracy with a reduction in training time and a high level of precision and recall.

The problems with building a robust NIDS are the unavailability of real-time patterns of network data consisting of both intrusions and normal use, constantly evolving and changing attack patterns, long training times and insufficient knowledge about the modifications required in datasets. A model may achieve high accuracy against the test datasets, but the accuracy always seems to degrade while analyzing real network traffic. The NIDS implemented using deep learning somewhat broke this trend. Most of the works found in the literature implementing deep learning had a significant detection rate and could, to some extent, detect anomalies not known earlier. Much work remains to be done in the intrusion detection field to build an applicable and efficient NIDS, but this certainly seems to be the way forward.

III. PROPOSED WORK
The proposed work aims at using a deep-learning based approach for network intrusion detection. The system uses a deep network to train itself with the patterns of anomalies and to classify the network traffic into normal connections and intrusions. The approach is also focused on reducing the false alarm rate to a minimum value. The approach has the flexibility to adjust to new patterns of intrusions and to the behavior of the person, which might change during the course.

The proposed system implements a deep network system (sparse auto-encoder with logistic regression), trained on the NSL-KDD dataset. It gives an output value of 0 or 1, where 1 denotes an intruder and 0 corresponds to a normal user.

The system utilizes a total of 115 features as input, some of which are: protocol used, source address, destination address, the time-stamp, services, flag, number of failed logins, number of logins. Each feature is given as an input to the neurons. A sparse auto-encoder with a sparsity constraint is utilized for training and learning new features from the data set. A deep network is created by stacking the auto-encoders, and the classification from the features learned is implemented
using a logistic regression network. Logistic regression is chosen because the output involves the identification of two classes of users.

IV. DESIGN
Pre-processing of the dataset is done before it is applied to the network. The non-numeric parameters are replaced with numeric values and the data set is normalized using a max-min operation. The overall flow of the proposed system is given below in Fig 1. The NSL-KDD dataset, a modification of the KDD-Cup dataset, includes 41 features derived from TCP/IP connections, traffic features accumulated in a window interval and content features extracted from the application layer data of connections. Out of the 41 features, 34 are continuous, 4 are binary and 3 are symbolic (protocol_type, service, flag).

An auto-encoder is an artificial neural network used for unsupervised learning, capable of learning new features from a set of input data. The input layer represents the original set of features; the hidden layer facilitates a better understanding of the new features with reduced dimension. The output layer represents the target features, which are the same as those of the input source. Sparse auto-encoders with a sparsity constraint allow a clear exploration of the effects of sparseness for a given dataset, thus helping in finding new pattern distributions of the input data.

activation function.

While training a sparse auto-encoder, the loss function being optimized is the sum of three terms: the Mean Squared Error term, the L2 Regularization term and the Sparsity Regularization term. The loss function used for training a sparse auto-encoder is given in equation 1.

$$E = \frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{K}\left(x_{kn}-\hat{x}_{kn}\right)^{2} + \lambda \cdot \Omega_{weights} + \beta \cdot \Omega_{sparsity} \qquad (1)$$

Where,
$\lambda$ = coefficient for the L2 regularization term and
$\beta$ = coefficient for the Sparsity regularization term

Different parameters used for training a sparse auto-encoder are discussed below:

Regularization: Protection against over-fitting problems. A parameter value of 0 implies that no protection against over-fitting has been applied.
L2 Weight Regularization: It quantifies the complexity of the model as the sum of squares of all feature weights. It takes a positive scalar value to control the impact of regularization used in the loss function (also termed weight decay).
Sparsity: It reduces the dependency between the feature vectors, thus allowing the users to increase the number of features. It is set close to 0 so that the average activation of the hidden layer stays low.
Sparsity Regularization: It is a positive scalar value that controls the impact of the Sparsity Regularization term in the loss function.
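As an illustration of how the three terms of equation 1 combine, the following is a minimal sketch in plain Python. Several details are assumptions not stated in the text: sigmoid activations in both the encoder and the decoder, the weight term taken as half the sum of squared weights, and the sparsity term taken as a Kullback-Leibler divergence between the target sparsity and the average hidden activation. The function and argument names are hypothetical; the default coefficient values are the level 1 settings reported in Section V.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """Sigmoid layer; weights[j] holds the incoming weights of output unit j."""
    return [sigmoid(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def sparse_autoencoder_loss(batch, w_enc, b_enc, w_dec, b_dec,
                            l2_coeff=0.001, sparsity_coeff=4.0,
                            sparsity_target=0.2):
    """Loss of equation 1: MSE + l2_coeff * weights term + sparsity_coeff * sparsity term."""
    n = len(batch)
    hidden = [layer(x, w_enc, b_enc) for x in batch]
    recon = [layer(h, w_dec, b_dec) for h in hidden]

    # Mean squared error: squared errors summed over features, averaged over samples.
    mse = sum((xk - rk) ** 2
              for x, r in zip(batch, recon)
              for xk, rk in zip(x, r)) / n

    # Weight term (assumed form): half the sum of squared weights, biases excluded.
    omega_weights = 0.5 * sum(w * w for mat in (w_enc, w_dec)
                              for row in mat for w in row)

    # Sparsity term (assumed form): KL divergence between the target sparsity
    # and the average activation of each hidden unit over the batch.
    rho = sparsity_target
    omega_sparsity = 0.0
    for j in range(len(b_enc)):
        rho_hat = sum(h[j] for h in hidden) / n
        rho_hat = min(max(rho_hat, 1e-8), 1.0 - 1e-8)  # keep the logs finite
        omega_sparsity += (rho * math.log(rho / rho_hat)
                           + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))

    return mse + l2_coeff * omega_weights + sparsity_coeff * omega_sparsity

# Toy usage: 4 samples, 6 features, 3 hidden units (sizes are arbitrary).
rng = random.Random(0)
batch = [[rng.random() for _ in range(6)] for _ in range(4)]
w_enc = [[rng.uniform(-0.1, 0.1) for _ in range(6)] for _ in range(3)]
w_dec = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(6)]
loss = sparse_autoencoder_loss(batch, w_enc, [0.0] * 3, w_dec, [0.0] * 6)
print(f"loss = {loss:.4f}")
```

Setting both coefficients to 0 reduces the function to the plain reconstruction error, which is a convenient way to see how much each regularization term contributes.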
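The pre-processing step of Section IV (replacing non-numeric parameters with numeric values, then max-min normalization) might be sketched as follows. This is only an illustration under stated assumptions: the symbolic feature is expanded with 1-of-N encoding, as reported in the experimental section, the three toy records and their column layout are hypothetical, and constant columns are mapped to 0.

```python
def one_hot(value, categories):
    """1-of-N encoding: a single 1.0 at the position of `value`, 0.0 elsewhere."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max_normalize(rows):
    """Scale every column to the range [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]

# Hypothetical toy records: (duration, protocol_type, src_bytes).
PROTOCOLS = ["icmp", "tcp", "udp"]
records = [(0, "tcp", 491), (2, "udp", 146), (5, "icmp", 0)]

# Expand the symbolic column, keep the numeric columns as floats.
encoded = [[float(duration)] + one_hot(protocol, PROTOCOLS) + [float(src_bytes)]
           for duration, protocol, src_bytes in records]
normalized = min_max_normalize(encoded)

for row in normalized:
    print(row)
```

Each symbolic value becomes one column per category, which is how a 41-feature record grows once its symbolic features are expanded.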
learned features to 10 new features which are then given as inputs to logistic regression. In the diagram shown in Fig 3, h1i(1-50) represents the input nodes, h2i(1-10) represents the hidden layer nodes and h1i(1-50) represents the output layer nodes.

Fig.3. Level 2 Auto-encoder (Reduction of 50 to 10 Features)

The new features learned from the level 2 auto-encoder are fed into the logistic classifier, which identifies whether a user is normal (0) or an intruder (1), as given in Fig 4. Logistic regression uses a sigmoid or logistic function as its activation function, giving the probability measure of the output in the range [0,1].

The final stack implements a fully connected network consisting of 1 input layer, 1 output layer, and 2 hidden layers, as shown in Fig 5. The 115 inputs from the original dataset are compressed to 50 nodes in the second layer and to 10 nodes in the third layer. The final output layer classifies whether a user is normal or not.

Fig.5. Fully Connected Layer (Input: 115, Hidden 1: 50, Hidden 2: 10, Output: 2)

V. EXPERIMENTAL RESULTS
A total of 22,545 records with 41 features were taken from the NSL-KDD dataset for training. The 3 symbolic features (protocol, service, flag) were expanded using 1-N encoding. The encoded data contains 115 features, of which 78 come from the symbolic features (3 from protocol, 64 from service and 11 from flag). The service feature has 64 variations, such as FTP, HTTP and login, which indicate the service used. The protocol_type describes the ICMP, TCP and UDP protocols. The flags REJ, SF, S0, S1, etc. denote the status of the connection. The num_access_files feature is ignored as it stays 0 throughout the dataset. The NSL-KDD dataset is normalized with a max-min operation.

The confusion matrix is utilized to measure the performance of the model. It takes the following parameters into consideration. A True Positive (TP) represents the correct classification of an intruder. A False Positive (FP) is the incorrect classification of a normal user as an intruder. A True Negative (TN) represents a normal user classified correctly, whereas a False Negative (FN) is an instance where an intruder is incorrectly classified as a normal user.

The accuracy is the ratio of the correctly predicted values to the total number of test cases.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (2)$$

(3)
(5)

A. Sparse Auto-encoder 1:
The parameters associated with auto-encoder 'msesparse' at level 1, as shown in Fig 6, are:
1. Regularization = 0
2. L2WeightRegularization = 0.001
3. Sparsity Regularization = 4
4. Sparsity = 0.2
1. Regularization = 0
2. L2WeightRegularization = 0.001
3. Sparsity Regularization = 1
4. Sparsity = 0.05
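To make the stacked architecture concrete, the forward pass through the 115-50-10 network followed by a logistic output can be sketched as below. This is a rough illustration, not the trained model: the weights are random stand-ins, sigmoid activations are assumed in every layer, a single output unit with a 0.5 decision threshold is an assumption, and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """Fully connected sigmoid layer; weights[j] feeds output unit j."""
    return [sigmoid(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random stand-in for weights that would normally come from training."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def classify(features, layers, threshold=0.5):
    """Propagate one record through the stack; return (label, probability)."""
    activation = features
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    probability = activation[0]  # single logistic output unit (assumption)
    return (1 if probability >= threshold else 0), probability

rng = random.Random(0)
# 115 inputs -> 50 (encoder 1) -> 10 (encoder 2) -> 1 (logistic classifier)
LAYER_SIZES = [115, 50, 10, 1]
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]

sample = [rng.random() for _ in range(115)]  # one normalized record (toy data)
label, probability = classify(sample, layers)
print(label, round(probability, 3))  # label 1 = intruder, 0 = normal user
```

In a trained system the first two layers would take their weights from the encoders of the two sparse auto-encoders, and only the last layer would come from the logistic classifier.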
Fig.9. Regression plot for Logistic Classifier.
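The confusion-matrix quantities defined earlier in this section translate directly into the three measures used to evaluate the system. The sketch below uses the standard definitions of accuracy, precision and recall; the label vectors are made-up toy data, not results from the experiments, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN), with 1 = intruder and 0 = normal user."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Correct predictions over all test cases."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Share of raised alarms that were real intrusions."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """Share of real intrusions that raised an alarm."""
    return tp / (tp + fn) if tp + fn else 0.0

# Toy labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), precision(tp, fp), recall(tp, fn))
```

For an intrusion detector, a low precision corresponds to a high false alarm rate, while a low recall means intrusions are being missed.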
How to cite this paper: Sandeep Gurung, Mirnal Kanti Ghose, Aroj Subedi, "Deep Learning Approach on Network Intrusion Detection System using NSL-KDD Dataset", International Journal of Computer Network and Information Security (IJCNIS), Vol.11, No.3, pp.8-14, 2019. DOI: 10.5815/ijcnis.2019.03.02