
A Few-Shot Deep Learning Approach For Improved

This paper presents a few-shot deep learning approach to enhance intrusion detection systems (IDS) by utilizing a deep convolutional neural network (CNN) combined with support vector machine (SVM) and 1-nearest neighbor (1-NN) classifiers. The proposed method addresses the challenges of imbalanced datasets, specifically applying it to the KDD 99 and NSL-KDD datasets, achieving better performance than existing state-of-the-art methods. The study highlights the effectiveness of traditional imbalanced learning techniques integrated with few-shot learning for improved detection accuracy in IDS.

A Few-shot Deep Learning Approach for Improved Intrusion Detection

Md Moin Uddin Chowdhury∗, Frederick Hammond†, Glenn Konowicz‡, Chunsheng Xin§, Hongyi Wu¶ and Jiang Li
Department of Electrical and Computer Engineering, Old Dominion University
Email: ∗ [email protected], † [email protected], ‡ [email protected], § [email protected],
[email protected],  [email protected],

Abstract—Our generation has seen the boom and ubiquitous advent of Internet connectivity. Adversaries have been exploiting this omnipresent connectivity as an opportunity to launch cyber attacks. As a consequence, researchers around the globe have devoted considerable attention to data mining and machine learning, with emphasis on improving the accuracy of intrusion detection systems (IDSs). In this paper, we present a few-shot deep learning approach for improved intrusion detection. We first trained a deep convolutional neural network (CNN) for intrusion detection. We then extracted outputs from different layers in the deep CNN and implemented a linear support vector machine (SVM) and a 1-nearest neighbor (1-NN) classifier for few-shot intrusion detection. Few-shot learning is a recently developed strategy to handle situations where training samples for a certain class are limited. We applied our proposed method to two well-known datasets simulating intrusion in a military network: KDD 99 and NSL-KDD. These datasets are imbalanced, and some classes have far fewer training samples than others. Experimental results show that the proposed method achieved better performance than the state of the art on those two datasets.

Index Terms—Intrusion Detection System (IDS); low shot learning; CNN; SVM.

I. INTRODUCTION

It is estimated that roughly 50 billion devices will connect to the Internet by the year 2020. To keep abreast with this exponential pace of Internet growth, cyber attacks by hackers will exploit new flaws in Internet protocols, operating systems and application software. There exist several protective measures, such as firewalls, which are placed at the gateway to check the activities of intruders. To meet the dynamic characteristics of attacks, intrusion detection systems (IDSs) [1] are used as a second line of defense. IDSs dynamically monitor network logs, file systems, and real-time events occurring in a computer system or network and analyze them for signs of adversaries or attacks [2]. IDSs are classified as host-based or network-based. Host-based IDSs operate on information collected from within an individual computer system, while network-based IDSs collect raw network packets as the data source from the network and analyze them for signs of intrusions [2]. There are two different detection techniques, misuse detection and anomaly detection, employed in IDSs to search for attack patterns. Misuse detection systems find known attack signatures in the monitored resources, whereas anomaly detection systems identify attacks by detecting changes in the pattern of utilization or behavior of the system. However, at present, anomaly detection IDSs have not been widely adopted. On the other hand, despite having limited potential against unfamiliar attacks, misuse detection systems find greater usage in the commercial arena [3].

With the increasing processing power of modern CPUs, data mining and machine learning techniques have become an alternative to manual human input. This approach was first introduced in mining audit data for dynamic and automatic models for intrusion detection (MADAMID) using association rules [4]. However, the majority of IDSs currently in use are prone to generating false positive alarms. Moreover, only a few datasets reflecting actual network connections are publicly available for distinguishing normal from abnormal connections. Among them, KDD 99 and NSL-KDD [3] are well-known public datasets used to promote anomaly detection techniques based on machine learning.

The KDD 99 dataset [3] is the pioneer for machine learning based IDS. The KDD 99 dataset was harvested from data gathered during the 1998 DARPA Intrusion Detection Evaluation Program, where a LAN was set up in an effort to simulate an actual military LAN, collecting TCPdump data over a duration of several weeks, with multiple attack data interspersed within normal connection data. The training data consists of five million connection records, and two weeks of testing data yielded around two million records. The training data contains 22 different attacks out of the 39 present in the test data. The known attack types are those present in the training dataset, while the novel attacks are the additional attacks in the test datasets that are not available in the training dataset. The attack types are grouped into four categories [3]:
• DOS (Denial of Service): Under this attack, the attacker prevents users from using resources by pre-occupying those resources so that the service provider can no longer handle new user requests.
• Probing: Under the probing attack, an attacker gathers information to bypass existing security measures by port scanning.
• U2R (User to Root): Under this attack, attackers attempt to gain unauthorized access to local super user (root) privileges.
• R2L (Remote to Local): This attack means unauthorized access from a remote machine outside of the system to access a valid user account.

Despite its potential, the KDD 99 dataset is considered to have several drawbacks, such as duplicate records and

978-1-5386-1104-3/17/$31.00 ©2017 IEEE 456


inherent packet handling problem of TCPdump [3]. To bypass these limitations and create a dataset for better evaluation of machine learning based IDS techniques, an improved dataset known as NSL-KDD was created by removing all duplicate records. The inherent drawbacks in the KDD 99 dataset have affected the detection accuracies of many IDSs. NSL-KDD contains essential records of the complete KDD 99 dataset. It is available as a collection of downloadable files for the convenience of researchers. The three main refinements made to the KDD 99 dataset were [3]:
• All of the redundancy issues were taken care of in order to enable the classifiers to provide unbiased classification results.
• A comparable number of train and test records were provided to conduct rational experiments.
• The number of selected records from each difficulty level group is inversely proportional to the percentage of records in the original KDD 99 dataset [3].

After their introduction, KDD 99 and NSL-KDD attracted many research efforts, and numerous machine learning architectures were developed for these datasets. The analysis of these two datasets revealed that they are highly imbalanced [3], i.e., the U2R attack class has only a few samples whereas the normal connection class has more than 50% presence. As a result, classifiers have difficulty detecting the minor classes. We did a literature survey on these two datasets and found that the testing accuracy on NSL-KDD, considering all the features and all attack types, has stalled at around 85%. So we were looking for the answer to the following question: Is there any way to increase the test accuracy of NSL-KDD, especially for the minor classes?

Traditional imbalanced learning strategies include oversampling minority classes or undersampling majority classes [5]. Recently, the authors of [6] introduced a novel method called 'few-shot learning', where they used a generic dataset to learn a feature representation using deep CNN structures. The learned feature representations were then passed to an SVM classifier or a 1-NN classifier for few-shot learning. The SVM or 1-NN classifier needs only a few samples to predict the class label for new observations. They found that, since the feature extractor is optimized for the generic dataset but not for the few-shot samples, this model can perform well on the generic dataset but not on the minor classes.

In this paper, we combine the traditional imbalanced learning techniques, oversampling and undersampling [5], with the few-shot deep learning strategy [6] for few-shot intrusion detection. We extracted outputs from different layers of a deep CNN classifier and designed an SVM classifier and a 1-NN classifier for few-shot intrusion detection. We also considered oversampling and undersampling methods to make the datasets balanced and to compare their performances. We conducted a literature survey on these two datasets and, to the best of our knowledge, extracting outputs from different layers of a deep CNN classifier for few-shot learning has not yet been implemented for either the KDD or the NSL-KDD dataset.

Our contributions in this paper can be summarized as follows:
• We developed a few-shot deep learning method for intrusion detection, which improved the average class-wise performance where data is highly imbalanced.
• We effectively incorporated the traditional imbalanced learning techniques, oversampling and undersampling, into few-shot deep learning for intrusion detection and achieved state-of-the-art performance on the two benchmark datasets.

The rest of the paper is organized as follows. Section II provides a brief literature review. We discuss the classifiers used in this research in Section III. Pre-processing methods and numerical evaluation are presented in Section IV and Section V, respectively. Finally, Section VI concludes this paper.

II. RELATED WORKS

Intrusion detection using KDD 99 and NSL-KDD is well studied in the literature. Researchers have proposed single, hybrid and ensemble classifiers to increase test accuracy. Niyaz et al. [7] proposed a deep learning approach (self-taught learning, STL) with two-stage classification. The approach consists of learning a good feature representation from unlabeled data and applying it to labeled data for classification. The authors used a sparse auto-encoder for unsupervised feature learning and soft-max regression for classification. Using the NSL-KDD dataset, they obtained 79.1% test accuracy considering 5 classes.

Tang et al. [8] proposed a deep neural network model with 3 hidden layers for NSL-KDD. Using only 6 basic features, their model achieved 75.75% test accuracy. In [9], Zhang and Zulkernine proposed a random forest based anomaly detection model. Their hybrid framework combined misuse and anomaly detection. They evaluated it on the KDD 99 data set, converting the attacks into a 2-class problem, and obtained a detection rate of 94.7%. Naive Bayes classifiers were also used for the intrusion detection problem in [10] and provided competitive results. The experiments were performed on the KDD 99 data set and focused on 4 classes and 2 classes of attacks. The authors implemented a decision tree classifier along with Naive Bayes. For the 4-class case, the models provided 91.28% and 91.47% for decision trees and Naive Bayes, respectively. For the 2-class case, the decision trees yielded a classification accuracy of 93.02% while Naive Bayes yielded 91.45%.

Unsupervised learning techniques were also proposed for this problem. In [11], the authors implemented k-means clustering for the NSL-KDD data set. In that paper, the authors tried to classify data into the major attack categories (4 clusters). The clustering algorithm provided a good distribution of data and showed that it is very useful for unlabeled data.

A number of papers applied feature selection methods to reduce the complexity of the data. Panda et al. [12] proposed a hybrid intelligent approach combining principal component analysis (PCA) and random forest among other techniques. In that paper, the NSL-KDD data set was used with 2 classes. The system gave a high detection rate (almost 100%) and

TABLE I
CNN ARCHITECTURE FOR FEATURE EXTRACTION

Layer Index   Layer                              Output Shape   Padding & Stride
1             Conv2D (64 filters, size: 1x3)     1, 38, 64      0, 1
2             Conv2D (64 filters, size: 1x3)     1, 36, 64      0, 1
3             Maxpooling2D                       1, 18, 64
4             Conv2D (128 filters, size: 1x3)    1, 16, 128     0, 1
5             Conv2D (128 filters, size: 1x3)    1, 16, 128     1, 1
6             Conv2D (128 filters, size: 1x3)    1, 16, 128     1, 1
7             Maxpooling2D                       1, 8, 128
8             Conv2D (256 filters, size: 1x3)    1, 8, 256      1, 1
9             Conv2D (256 filters, size: 1x3)    1, 8, 256      1, 1
10            Conv2D (256 filters, size: 1x3)    1, 8, 256      1, 1
11            Maxpooling2D                       1, 4, 256
12            Flatten                            1024
13            Fully Connected                    100
14            Dropout                            100
15            Fully Connected                    20
16            Softmax                            5
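
As a consistency check on Table I, the output widths can be recomputed from the standard convolution size formula, out = floor((W - K + 2P) / S) + 1. The sketch below is illustrative only: the input width of 40 and the pooling size of 2 are assumptions inferred from the listed output shapes (neither is stated in the table), and the helper names are hypothetical.

```python
def conv_out(width, kernel=3, padding=0, stride=1):
    """Output width of a 1-D convolution: floor((W - K + 2P) / S) + 1."""
    return (width - kernel + 2 * padding) // stride + 1


def pool_out(width, pool=2):
    """Output width of non-overlapping max pooling (ASSUMED pool size 2)."""
    return width // pool


# (kind, filters, padding) for layers 1-11 in the order given in Table I.
LAYERS = [
    ("conv", 64, 0), ("conv", 64, 0), ("pool", None, None),
    ("conv", 128, 0), ("conv", 128, 1), ("conv", 128, 1), ("pool", None, None),
    ("conv", 256, 1), ("conv", 256, 1), ("conv", 256, 1), ("pool", None, None),
]


def trace_shapes(input_width=40):
    """Return the (1, width, channels) shape after each of layers 1-11.

    ASSUMPTION: input_width=40 is inferred from the 38-wide output of
    layer 1 (kernel 3, no padding); it is not given in the table.
    """
    shapes = []
    width, channels = input_width, 1
    for kind, filters, padding in LAYERS:
        if kind == "conv":
            width, channels = conv_out(width, padding=padding), filters
        else:
            width = pool_out(width)
        shapes.append((1, width, channels))
    return shapes


shapes = trace_shapes()
flattened = shapes[-1][1] * shapes[-1][2]  # width * channels fed to Flatten
for index, shape in enumerate(shapes, start=1):
    print(index, shape)
print("flattened:", flattened)
```

Under these assumptions the trace reproduces the widths listed in Table I, and the last pooling output (4 x 256) matches the 1024-element Flatten layer.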

Fig. 1. Flowchart of our considered method.


small false positive rates. Bouzida et al. [13] also used PCA for feature reduction and applied it to the KDD 99 data set. Then, the authors evaluated the proposed approach using the nearest neighbor (NN) algorithm and decision trees with and without PCA. The approach provided high accuracies for each attack class. Mukkamala et al. [14] used SVM and neural networks (NN) (multi-layer, feed-forward networks with 4 and 3 layers) for the intrusion detection problem. The models were evaluated on the KDD 99 data set. Both SVM and NN delivered high accuracies (almost 100%) considering 2 classes. However, the evaluation showed a significant difference in training times between NN and SVM. Chen et al. [15] also used SVM and NN for the problem; their approaches were evaluated on the KDD 99 data set considering 2 classes, and both models provided very high accuracy rates (very close to 100%).

The authors of [17] proposed an optimizer to overcome the weakness of gradient based optimization used in deep learning algorithms. Their proposed optimizer controls both short-term knowledge within a task and long-term knowledge common among all the tasks. Experiments showed that their method performs better than natural baselines and is competitive with the state of the art in metric learning for few-shot learning. In [16], the authors presented prototypical networks for few-shot learning. They trained these networks to specifically perform well in the few-shot setting by using episodic training. Apart from this, they demonstrated how to generalize prototypical networks to the zero-shot setting, and achieved state-of-the-art results on the CUB-200 dataset.

In essence, a wide variety of models has been proposed for these two datasets. Nevertheless, the accuracy on NSL-KDD never went beyond a certain point while considering all 41 features and 5 classes. This motivated us to look for an approach other than traditional classifiers.

III. CLASSIFIER INTRODUCTION

A. Support Vector Machine
SVM [18] is a supervised discriminative classifier which is known for detecting class labels by creating a separating hyperplane with large margins. In other words, given labeled training data, the algorithm provides an optimal hyperplane that maximizes the margins for all categories considered. The SVM training algorithm builds a model that separates two categories with the largest margin so that it can generalize to unseen data. For multi-class problems, it usually utilizes a one-vs-all strategy to establish a multi-class classifier.

B. K-Nearest Neighbor
k-NN [19] is a non-parametric learning algorithm, i.e., it does not make any assumptions about data distributions. For a testing example, it searches for the k nearest neighbors of the testing example in the training dataset and assigns the majority label among those neighbors as the class label for the testing example. We utilized a 1-NN classifier in this paper; it needs only a few training examples to predict the class label for a new data point.

C. Convolutional Neural Network
A CNN [20] comprises one or more convolutional layers (often with a maxpooling step) followed by one or more fully connected layers as in a standard multilayer neural network. CNNs have achieved state-of-the-art performance in image classification and speech recognition in recent years. We utilized the CNN as a general feature extractor in this paper.
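
The 1-NN rule described in Section III-B can be illustrated with a minimal sketch. This is not the MATLAB built-in function used in the experiments; the Euclidean distance, the toy feature vectors, and the function name are assumptions chosen for illustration.

```python
import math


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training vector closest to `query`.

    ASSUMPTION: Euclidean distance is used; the text does not state
    which distance metric the experiments relied on.
    """
    best_label, best_dist = None, math.inf
    for features, label in zip(train_x, train_y):
        dist = math.dist(features, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy 2-D feature vectors (hypothetical): one rare-class example is enough
# for 1-NN to label a nearby query with that class.
train_x = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1]]
train_y = ["normal", "normal", "u2r"]

prediction = nearest_neighbor_predict(train_x, train_y, [4.8, 5.0])
print(prediction)
```

The example shows why 1-NN suits the few-shot setting: a single stored example of a minority class can determine the label of a query that falls close to it in feature space.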

Fig. 2. Test accuracy of the undersampled datasets (x-axis: dataset, KDD and NSL-KDD; y-axis: test accuracy; series: SVM, k-NN).

Fig. 3. Class-wise accuracy on undersampled KDD 99 (x-axis: classifier, SVM and k-NN; y-axis: test accuracy; series: Normal, DoS, Probe, R2L, U2R).

Fig. 4. (a) Test accuracy and (b) mean class-wise test accuracy of features from different layers for the original KDD and NSL-KDD datasets (x-axis: CNN layer index, Layers 10, 11, 13 and 15; series: KDD-kNN, KDD-SVM, NSL-kNN, NSL-SVM).
IV. METHODS

A. Pre-processing
Neural network based classification uses only numerical values for training and testing. Hence, a pre-processing stage is needed to convert the non-numerical values to numerical values. The two main tasks in our pre-processing are:
• Converting the non-numerical features in the dataset to numerical values. Features 2, 3 and 4, namely the protocol type, service and flag, were non-numerical. These features in the train and test data sets were converted to numerical types by assigning specific values to each variable (e.g. TCP = 1, UDP = 2 and ICMP = 3).
• Converting the attack types at the end of the dataset into numeric categories. Category 1 is assigned to normal data, and 2, 3, 4 and 5 are assigned to the DoS, Probe, R2L and U2R attack types, respectively.

B. Normalization
Since the features of both the KDD 99 and NSL-KDD datasets have either discrete or continuous values, the ranges of the values were different, and this made them incomparable. In this study, the features were normalized by subtracting the mean from each feature and dividing by its standard deviation. After that, we normalized the test features using the mean and standard deviation of each feature from the train datasets.

C. CNN Architecture for Feature Extraction
We trained a CNN architecture to extract features for both datasets. Pre-processed data were fed through the input layer. We used various numbers of filters (64, 128 and 256) with a filter size of 1x3. After the convolution layers, there was a fully connected dense neural network with 3 hidden layers with 100, 20 and 5 hidden units, respectively. We trained the model using train data and test data separately and extracted outputs from intermediate CNN layers to create new representations with different numbers of features. We considered mainly four layers in this study. The highest layer was a fully connected layer with 20 outputs, i.e., the output from this layer has 20 features. We also considered a fully connected layer with 100 outputs, a maxpooling layer with 4x256 outputs and the last CNN layer, which had 8x256 outputs. We also tried to extract features from lower level CNN layers, but the testing accuracy was around 40% to 43% for both of the datasets and

hence omitted for comparison. A brief overview of the CNN architecture is shown in Table I.

After obtaining the intermediate features, we used them as input to SVM and k-NN. We considered 1 neighbor for the k-NN classifier. Fig. 1 shows our methodology and workflow. As performance metrics, we considered mean class-wise accuracy along with test accuracy. In other words, we first calculated the accuracy for each class and then took the mean of the test accuracies of all classes as a performance metric.

V. NUMERICAL RESULTS

A. Experiment Setup
For model development and evaluation, we used a workstation with an Intel Core i7-7700 3.60 GHz CPU and 32 GB of RAM. We implemented SVM using Liblinear [21] in MATLAB. For k-NN, we used the MATLAB built-in function. The CNN was implemented using the Python keras package [22]. We considered the following KDD and NSL-KDD datasets for this research:
• KDD Train: 10% KDDtrain
• KDD Test: Corrected labels
• NSL-KDD Train: KDDTrain+
• NSL-KDD Test: KDDTest+

The 10% KDDtrain dataset consisted of 494,021 records, among which 97,277 (19.69%) were normal, 391,458 (79.24%) DOS, 4,107 (0.83%) Probe, 1,126 (0.23%) R2L and 52 (0.01%) U2R connections. In each connection, there are 41 attributes describing different features of the connection and a label assigned to each either as an attack type or as normal. The 41 attributes can be classified into four different categories: Basic, Content, Traffic and Host. The Corrected labels set has 311027 records. The NSL-KDD KDDTrain+ and KDDTest+ sets have 125973 and 22544 records, respectively.

B. Initial Evaluation
We first measured the performance on the original datasets. Table II shows the test accuracies for different classifiers for both datasets. As expected, the test accuracy for KDD 99 is much higher (about 96%) than for NSL-KDD for both of the classifiers due to its redundant records. The test accuracies for NSL-KDD are also close to those in the previous literature. Table III summarizes the class-wise performance of the classifiers for both of the datasets. It turns out that the KDD 99 dataset outperforms NSL-KDD in terms of detecting each class individually, scoring 65.83% and 64.05% for SVM and 1-NN, respectively. For NSL-KDD, the classifiers were not able to detect the minor classes properly, which resulted in class-wise performance degradation (53.84% and 56.609% for 1-NN and SVM, respectively).

TABLE II
ORIGINAL DATASET ACCURACIES

Dataset    Classifier   Test Accuracy
KDD 99     SVM          95.27%
KDD 99     k-NN         96.19%
NSL-KDD    SVM          77.68%
NSL-KDD    k-NN         80.74%

TABLE III
ORIGINAL DATASET CLASS-WISE MEAN ACCURACIES

Dataset    Classifier   Mean of Class-wise Test Accuracy
KDD 99     SVM          65.833%
KDD 99     k-NN         64.048%
NSL-KDD    SVM          56.609%
NSL-KDD    k-NN         53.84%

Fig. 5. (a) Test accuracy and (b) mean class-wise accuracy of features from different layers for the 2- and 9-fold oversampled KDD dataset (x-axis: CNN layer index, Layers 10, 11, 13 and 15; y-axis: test accuracy (%)).

C. Performances from Oversampling and Undersampling
As mentioned before, the U2R attack class accounts for only 0.05% of the records. Deep learning methods are therefore not able to classify this class correctly, as they are biased towards more frequent classes. So we randomly undersampled the other four classes to the same number of records as U2R, i.e., we created new training datasets which are balanced (each class has 20% presence). Then we trained 2 classifiers (1-NN and SVM) using these datasets and compared the test performances using the original test datasets. The test accuracies of undersampled KDD 99 for the two classifiers were comparable to each other

as depicted in Fig. 2. To mitigate the effect of randomness, we conducted the experiment 10 times and calculated the mean results. The accuracies for SVM and 1-NN were 91.66% and 87.3%, respectively, for the KDD 99 dataset. However, this undersampling method performed poorly on the NSL-KDD dataset: the test accuracy reached at most only 13%. We were also curious about the class-wise performance on undersampled KDD 99. Fig. 3 shows the class-wise test performance of the classifiers. In this case, the means of the class-wise test accuracies were better than for the original datasets: SVM and 1-NN scored 71.94% and 75.44%, respectively.

D. Performance of few-shot Deep Learning
We trained a CNN model using the KDD 99 and NSL-KDD datasets and created 4 new datasets by extracting outputs from 4 different layers, as mentioned in the previous section. The results are shown in Fig. 4. From Fig. 4(a), we can see that the SVM test accuracies on KDD 99 for Layer 15 and Layer 13 are 97.26% and 98.71%, respectively, which are higher than for the original SVM considering 41 features as shown in Table II. The 1-NN accuracies are slightly better than the results shown in Table II. For NSL-KDD, the accuracies reached close to 90% for both of the classifiers. For instance, Layer 15 provides 91.82% for SVM and 89.27% for 1-NN. Features extracted from Layer 13 provided 94.62% for the SVM classifier and 88.93% for 1-NN. The accuracies for NSL-KDD drop as we move to lower layers. We can also observe a similar pattern for the mean class-wise test performance of different CNN layers in Fig. 4(b): the lower the layer, the lower the mean test accuracy over all classes. The best performance is provided by Layer 13, where the class-wise test accuracies of all classifiers and datasets were above 70%. The SVM classifier on the KDD 99 dataset provided better results than the other classifiers.

Fig. 6. (a) Test accuracy and (b) mean class-wise accuracy of features from different layers for the 2- and 9-fold oversampled NSL-KDD dataset (x-axis: CNN layer index, Layers 10, 11, 13 and 15; y-axis: test accuracy (%)).

E. Effect of Sampling on Low shot Deep Learning
To increase the class-wise performance on the original datasets, we created 2-fold and 9-fold duplicate samples of the U2R class and studied the performances for both datasets. The results for 2- and 9-fold duplicate oversampling of the U2R class on the KDD dataset are depicted in Fig. 5. From Fig. 5(a), we can see that the testing performances on the 2-fold oversampled KDD 99 dataset for the SVM classifier were 97.29%, 98.19%, 84.96% and 95.51% for Layers 15, 13, 11 and 10, respectively. The 1-NN classifier performances on the same oversampled dataset were 95.84%, 86.62%, 85.8% and 92.62%, respectively. In the case of 9-fold oversampling of class U2R, KDD provided 97.06%, 97.3%, 95.19% and 93.152% testing accuracies for the SVM classifier. On the other hand, 1-NN scored 95.62%, 94.25%, 88.65% and 53.5% on the same oversampled KDD dataset. We then studied the mean class-wise accuracies for 2-fold and 9-fold duplication of the U2R class on KDD, which are depicted in Fig. 5(b). The best results are provided by the features from Layer 13. Overall, the mean class-wise performance is better than on the original dataset. We also observed that the 9-fold oversampled dataset outperformed its 2-fold counterpart for both Layer 13 and Layer 15. The SVM classifier scores 83.152% mean class-wise accuracy on features extracted from Layer 13 for the 9-fold U2R oversampled KDD.

The test performances for 2- and 9-fold duplication of the U2R class on the NSL-KDD dataset are depicted in Fig. 6. The testing accuracies on the 2-fold oversampled NSL-KDD dataset for the SVM classifier were 90.043%, 89.9%, 22.87% and 88.36% for Layers 15, 13, 11 and 10, respectively, as shown in Fig. 6(a). The test accuracies of the 1-NN classifier on the same oversampled dataset were 88.19%, 87.42%, 10.13% and 22.60%, respectively. The performance on the 9-fold oversampled dataset is slightly better than on the 2-fold oversampled NSL-KDD. In the case of 9-fold oversampling of class U2R, NSL-KDD provided 90.4%, 92.92%, 49.67% and 43.79% testing accuracies for the SVM classifier. At the same time, 1-NN scored 88.09%, 88.82%, 50.88% and 23.71% on the same oversampled dataset for features from Layer 15 to Layer 10.

The mean classification accuracies on NSL-KDD are shown in Fig. 6(b). We found a decreasing pattern for mean class-wise test accuracy as we move down the CNN architecture.
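
The two ingredients of this experiment, duplicating the minority class and scoring by mean class-wise accuracy, can be sketched as follows. This is a minimal illustration, not the experimental code: the function names are hypothetical, and reading "k-fold" as "each minority sample appears k times in total" is an assumption, since the text does not define the term precisely.

```python
from collections import Counter, defaultdict


def oversample_by_duplication(samples, labels, minority, folds):
    """Repeat every sample of class `minority` so it appears `folds` times.

    ASSUMPTION: "k-fold" is interpreted as k total copies per sample.
    """
    out_x, out_y = [], []
    for x, y in zip(samples, labels):
        copies = folds if y == minority else 1
        out_x.extend([x] * copies)
        out_y.extend([y] * copies)
    return out_x, out_y


def mean_classwise_accuracy(y_true, y_pred):
    """Compute accuracy per class, then average over the classes."""
    total = Counter(y_true)
    correct = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            correct[truth] += 1
    per_class = {label: correct[label] / count for label, count in total.items()}
    return sum(per_class.values()) / len(per_class), per_class


# Toy labels (hypothetical): 8 majority-class records, 2 minority-class records.
y_true = ["normal"] * 8 + ["u2r"] * 2
y_pred = ["normal"] * 8 + ["normal", "u2r"]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
mean_cw, per_class = mean_classwise_accuracy(y_true, y_pred)
print(overall, mean_cw, per_class)

x_dup, y_dup = oversample_by_duplication([[1], [2], [3]],
                                         ["normal", "u2r", "normal"],
                                         minority="u2r", folds=2)
print(y_dup)
```

In the toy example the overall accuracy is 0.9 while the mean class-wise accuracy is 0.75, because one missed minority record halves that class's accuracy. This is why the two metrics can move in opposite directions when the minority class is oversampled.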

TABLE IV [2] H. G. Kayacik, A. N. Zincir-Heywood, and M. I. Heywood, “Selecting
T EST ACCURACY COMPARISON TO LITERATURE features for intrusion detection: A feature relevance analysis on kdd
99 intrusion detection datasets,” in Proceedings of the third annual
conference on privacy, security and trust, 2005.
Algorithms Test Accuracy [3] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed
Niyaz ei al. [7] 79.10% analysis of the kdd cup 99 data set,” in 2009 IEEE Symposium on
Mahbod et al. [3] 82.02% Computational Intelligence for Security and Defense Applications, July
Tang et al. [8] 75.75% 2009, pp. 1–6.
Our work 94.62% [4] J. P. T. Srilatha Chebrolu, Ajith Abraham, “Feature deduction and
ensemble design of intrusion detection systems, computers security,
volume 24, issue 4, june 2005, pages 295-307, issn 0167-4048.”
The highest mean class-wise accuracy (70.46%) was obtained by the SVM classifier when the U2R attack class was oversampled 9-fold. As we move down through the layers of the architecture, the mean class-wise accuracy decreases. In essence, we observed an increasing trend in mean class-wise accuracy but a slight degradation in test accuracy for both oversampled datasets. This is because the classifiers detect the minority class more accurately at the expense of test accuracy on the majority classes.

F. Comparison to Literature
For ease of comparison with previous literature that considers 4 attack types, we also provide Table IV, which shows that our method outperforms other results in terms of test accuracy. The SVM classifier on features extracted from layer 13 provided the best result on the NSL-KDD dataset. Our methods worked well for both datasets in terms of overall test accuracy and class-wise test accuracy. Layer 13, which is the first fully connected layer with 100 hidden units, provided the best results. Of the two classifiers, SVM outperformed 1-NN in almost all the experiments.

VI. CONCLUSION
In this research, we implemented a few-shot deep learning method for intrusion detection. Some attack types are rare, which makes it difficult for machine learning based detection systems to identify these minority attack types. Inspired by the few-shot image recognition work in [6], we trained a deep CNN and used it as a general feature extractor. We then trained an SVM or a 1-NN classifier for intrusion detection on the new feature representations. In addition, we incorporated a traditional imbalanced learning technique that oversampled minority classes before training. Our method obtained state-of-the-art performance on the KDD and NSL-KDD datasets, achieving over 94% accuracy on both. We were also able to achieve better class-wise accuracy using traditional imbalanced learning techniques. The proposed method is a good candidate for imbalanced learning and intrusion detection. In the future, we plan to apply our method to various imbalanced datasets to enhance the minority class detection rate.

REFERENCES

[1] S. Potluri and C. Diedrich, “Accelerated deep neural networks for enhanced intrusion detection system,” in 2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA), Sept 2016, pp. 1–8.
[5] H. He and E. A. Garcia, “Learning from imbalanced data,” IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263–1284, Sept 2009.
[6] B. Hariharan and R. Girshick, “Low-shot visual recognition by shrinking and hallucinating features,” arXiv preprint arXiv:1606.02819, 2016.
[7] Q. Niyaz, W. Sun, A. Y. Javaid, and M. Alam, “A deep learning approach for network intrusion detection system,” in Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (Formerly BIONETICS), BICT-15, vol. 15, 2015, pp. 21–26.
[8] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho, “Deep learning approach for network intrusion detection in software defined networking,” in 2016 International Conference on Wireless Networks and Mobile Communications (WINCOM), Oct 2016, pp. 258–263.
[9] J. Zhang and M. Zulkernine, “A hybrid network intrusion detection technique using random forests,” in Availability, Reliability and Security, 2006. ARES 2006. The First International Conference on. IEEE, 2006, pp. 8–pp.
[10] N. B. Amor, S. Benferhat, and Z. Elouedi, “Naive bayes vs decision trees in intrusion detection systems,” in Proceedings of the 2004 ACM symposium on Applied computing. ACM, 2004, pp. 420–424.
[11] V. Kumar, H. Chauhan, and D. Panwar, “K-means clustering approach to analyze nsl-kdd intrusion detection dataset,” International Journal of Soft Computing and Engineering (IJSCE), 2013.
[12] M. Panda, A. Abraham, and M. R. Patra, “A hybrid intelligent approach for network intrusion detection,” Procedia Engineering, vol. 30, pp. 1–9, 2012.
[13] Y. Bouzida, F. Cuppens, N. Cuppens-Boulahia, and S. Gombault, “Efficient intrusion detection using principal component analysis,” in 3éme Conférence sur la Sécurité et Architectures Réseaux (SAR), La Londe, France, 2004, pp. 381–395.
[14] S. Mukkamala, G. Janoski, and A. Sung, “Intrusion detection using neural networks and support vector machines,” in Neural Networks, 2002. IJCNN’02. Proceedings of the 2002 International Joint Conference on, vol. 2. IEEE, 2002, pp. 1702–1707.
[15] W.-H. Chen, S.-H. Hsu, and H.-P. Shen, “Application of svm and ann for intrusion detection,” Computers & Operations Research, vol. 32, no. 10, pp. 2617–2634, 2005.
[16] J. Snell, K. Swersky, and R. S. Zemel, “Prototypical networks for few-shot learning,” CoRR, vol. abs/1703.05175, 2017. [Online]. Available: http://arxiv.org/abs/1703.05175
[17] S. Ravi and H. Larochelle, “Optimization as a model for few-shot learning,” 2016.
[18] D. S. Kim, H.-N. Nguyen, and J. S. Park, “Genetic algorithm to improve svm based network intrusion detection system,” in 19th International Conference on Advanced Information Networking and Applications (AINA’05) Volume 1 (AINA papers), vol. 2, March 2005, pp. 155–158 vol.2.
[19] H.-V. Nguyen and Y. Choi, “Proactive detection of ddos attacks utilizing k-nn classifier in an anti-ddos framework,” International Journal of Electrical, Computer, and Systems Engineering, vol. 4, no. 4, pp. 247–252, 2010.
[20] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2012, pp. 1097–1105. [Online]. Available: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[21] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin, “Liblinear: A library for large linear classification,” J. Mach. Learn. Res., vol. 9, pp. 1871–1874, Jun. 2008. [Online]. Available: http://dl.acm.org/citation.cfm?id=1390681.1442794
[22] F. Chollet et al., “Keras,” https://github.com/fchollet/keras, 2015.
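
The trade-off reported in the results, where mean class-wise accuracy rises while overall test accuracy drops slightly, can be illustrated with a minimal sketch. The function names `overall_accuracy` and `mean_classwise_accuracy` are hypothetical helpers, and the label counts below are toy values invented for illustration; they are not results from this paper. Mean class-wise accuracy is assumed here to be the unweighted average of per-class accuracies.

```python
from collections import Counter, defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_classwise_accuracy(y_true, y_pred):
    """Unweighted average of per-class accuracies (assumed definition)."""
    totals = Counter(y_true)          # number of samples per true class
    hits = defaultdict(int)           # correctly classified samples per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    return sum(hits[c] / n for c, n in totals.items()) / len(totals)


# Toy, invented data: 95 majority samples and 5 minority samples.
y_true = ["normal"] * 95 + ["u2r"] * 5

# A classifier that ignores the minority class entirely.
pred_majority_only = ["normal"] * 100

# A classifier that finds 4 of 5 minority samples but
# mislabels 5 majority samples as the minority class.
pred_minority_aware = ["normal"] * 90 + ["u2r"] * 5 + ["u2r"] * 4 + ["normal"]

for name, pred in [("majority-only", pred_majority_only),
                   ("minority-aware", pred_minority_aware)]:
    print(name,
          f"{overall_accuracy(y_true, pred):.3f}",
          f"{mean_classwise_accuracy(y_true, pred):.3f}")
# majority-only 0.950 0.500
# minority-aware 0.940 0.874
```

In this toy case the second classifier loses one point of overall accuracy but gains substantially in mean class-wise accuracy, which is the same direction of change described for the oversampled datasets.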
