A Novel Statistical Analysis and Autoencoder Driven Intelligent Intrusion Detection Approach

Cosimo Ieracitano, Ahsan Adeel, Francesco Carlo Morabito, Amir Hussain
c School of Computing, Edinburgh Napier University, Edinburgh EH10 5DT, UK
Abstract
In the current digital era, one of the most critical and challenging issues is ensuring cybersecurity in information technology (IT) infrastructures. Indeed, with the significant improvement of technology, hackers have been developing ever more complex and dangerous malware attacks that make intrusion recognition a very difficult task. In this context, existing traditional analytic tools face severe challenges in detecting and mitigating these threats. In this work, we introduce a statistical analysis and autoencoder (AE) driven intelligent intrusion detection system (IDS). Specifically, the proposed IDS combines data analytics and statistical techniques with recent advances in machine learning theory to extract optimized and more correlated features. The validity of the proposed IDS is tested using the benchmark NSL-KDD database. Experimental results show that the designed IDS achieves better classification performance compared to deep and conventional shallow machine learning approaches, as well as to recently proposed state-of-the-art techniques.
Keywords: Anomaly Detection, Deep Learning, Autoencoder, Optimized Feature Extraction, NSL-KDD database.
1. Introduction
∗ Corresponding author
Email addresses: [email protected] (Cosimo Ieracitano),
[email protected] (Ahsan Adeel), [email protected] (Francesco Carlo Morabito),
[email protected] (Amir Hussain)
As can be seen in Tables 5 and 8, the proposed AE processor outperformed all other approaches, reporting accuracy values of 84.21% and 87% in binary and multi-class classification, respectively.
The main contributions of this work can be summarized as follows:
The present paper is organized as follows. Section 2 presents the related work, in particular recently proposed DL approaches based on the NSL-KDD dataset. Section 3 introduces the NSL-KDD dataset and explains the proposed method, including data preprocessing, feature extraction and the developed classifiers. Section 4 reports the experimental results. Finally, Sections 5 and 6 discuss and conclude this work.
2. Related work
In the literature, several intrusion detection systems use the KDD99 and NSL-KDD datasets to measure the performance and effectiveness of the proposed models. For example, Ingre et al. [18] developed a three-layer MLP to detect the attack classes of the NSL-KDD dataset, achieving an accuracy of 79.9% for multi-classification and 81.2% for binary classification on the test set. Ibrahim et al. [19] proposed a novel method based on self-organizing maps (SOMs) for binary classification and reported a detection accuracy of up to 75.49% on the NSL-KDD test set. Similarly, Mohamed et al. [20] applied conventional learning techniques such as MLP and achieved an accuracy of up to 95.7% for binary classification; however, the authors partitioned the dataset into k = 10 folds. Gao et al. [21] developed a semi-supervised learning method based on fuzzy and ensemble learning theory; using the NSL-KDD dataset, they reported an accuracy of 84.54% on the test set. Alrawashdeh et al. [22] implemented a deep belief network (DBN) based on a Restricted Boltzmann Machine (RBM) architecture with a softmax output layer for multi-classification purposes. However, the proposed system was tested on only 10% of the KDD99 test samples, achieving an accuracy of up to 98% and a false alarm rate of 2.47%. In [23], the authors considered a Software Defined Networking (SDN) environment and proposed a deep neural network (DNN) for anomaly detection. Specifically, a neural network with three hidden layers was trained on the NSL-KDD dataset. However, only 6 features were used and only two-way discrimination was performed (normal vs. abnormal), with a reported accuracy of 75.75%. In [24], instead, Kim et al. proposed a deep neural network trained on the KDD99 dataset. The DNN consisted of four hidden layers of 100
hidden neurons, trained with the adaptive moment estimation (Adam) method. The authors claimed very good performances, but they used different subsets of the original KDD99 dataset. Yan et al. [25] proposed a stacked sparse autoencoder (SSAE) to detect the categories of the NSL-KDD dataset. The authors claimed an accuracy of up to 98.63%, but they simplified the experimental process by shuffling and reassembling the original data into several independent datasets. Similarly, Xu et al. [26] developed an IDS based on deep neural networks to classify samples of the NSL-KDD dataset, achieving high performances; however, they evaluated the effectiveness of the proposed model by performing 10-fold cross-validation on the original NSL-KDD data. Imamverdiyev et al. [27] developed three DL (Bernoulli-Bernoulli RBM, Gaussian-Bernoulli RBM, DBN) and three standard machine learning (SVM (radial basis), SVM (epsilon-SVR), and decision tree) architectures. Experimental results showed that the Bernoulli-Bernoulli RBM outperformed all other approaches with an accuracy of 73.23%. Javaid et al. [28] used sparse AE architectures and self-taught learning (STL) for detecting anomalies in the NSL-KDD dataset; the accuracy was 79.1% in multi-class classification. Yin et al. [29] developed a Recurrent Neural Network (RNN) based system for intrusion detection. The authors used the NSL-KDD dataset as a benchmark and performed both binary and multi-classification, achieving accuracies of 83.3% and 81.3%, respectively. Recently, Shone et al. [30] implemented a stacked non-symmetric deep autoencoder (SNDAE) architecture for cyber attack detection. In this study, the authors used the NSL-KDD dataset, reporting a multi-classification accuracy of 85.42%.
3. Materials and methods

In this section, the NSL-KDD dataset used in this work is first introduced; then, the proposed methodology (including pre-processing, feature extraction and classification) is described.
3.1. NSL-KDD database description
The NSL-KDD dataset is a subset of the original KDD99 dataset and is widely used as a benchmark in several intrusion detection systems. Indeed, NSL-KDD addresses some criticisms of the earlier KDD99, such as the redundant and duplicate records in the train and test sets that biased classifiers toward the more frequent samples. NSL-KDD is made freely available by the Canadian Institute for Cybersecurity [31]. It provides training and testing sets, here denoted as KDDTrain+ and KDDTest+, which include 125973 and 22544 instances, respectively. Since the KDDTest+ set contains seventeen additional attack types that do not appear in KDDTrain+, the instances corresponding to such categories (3751) were removed to ensure a fair classification. As a result, KDDTest+ was composed of 22544 − 3751 = 18793 examples. Further details on the KDDTrain+ and KDDTest+ sets are reported in Table 1.
The NSL-KDD dataset has 41 features z_f (f = 1, 2, ..., 41): 38 continuous and 3 symbolic,
as shown in Table 2. Furthermore, the attack types of the NSL-KDD dataset are clustered into four different attack classes:

1. DoS (Denial of Service): attacks that slow down or shut down a machine by sending more traffic to the server than the system is able to handle, thereby affecting legitimate network traffic or access to services.

2. R2L (Remote to Local): attacks that gain illegal local access to a machine by sending deceiving packets to the system from a remote host.

3. U2R (User to Root): attacks that gain root access; the hacker starts as a normal user and exploits a system vulnerability to obtain root privileges.

4. Probe (Probing): attacks that gather information about the network in order to evade its security controls.

The attack categories of the NSL-KDD dataset are reported in Table 3.
3.2. Methodology
Figure 1 shows the procedure of the proposed methodology. Firstly, the NSL-KDD dataset is cleaned of outliers, and the min-max normalization technique is used to scale the data within the range [0, 1]. Afterwards, one-hot encoding is applied to convert the symbolic (or categorical) features into numeric values. Then, the 38 numeric attributes are analyzed statistically in order to select the most correlated features. Finally, shallow (MLP, L-SVM, Q-SVM, LDA, QDA) and deep (AE, LSTM) networks are developed to measure the detection performance in both the binary and multi-classification scenarios.
3.3. Data preprocessing
The proposed preprocessing stage prepares the data for the subsequent modules. It includes three units: outlier analysis, data normalization and one-hot encoding.
Table 2: Features of NSL-KDD dataset: 38 numeric (or continuous, cont) and 3 categorical
(or symbolic, symb).
z̃_fj = (z_fj − min(z_f)) / (max(z_f) − min(z_f))    (3)

where max(z_f) and min(z_f) represent the maximum and minimum values of the f-th (numeric) feature z_f, whereas z̃_fj is the normalized feature value, ranging in [0, 1].
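As a minimal illustration of Eq. (3), the following NumPy sketch (ours, not the authors' MATLAB implementation) rescales each numeric feature column to [0, 1]:

```python
import numpy as np

def min_max_normalize(Z):
    """Rescale each feature (column) of Z to [0, 1], as in Eq. (3)."""
    z_min = Z.min(axis=0)
    z_max = Z.max(axis=0)
    span = np.where(z_max > z_min, z_max - z_min, 1.0)  # guard constant columns
    return (Z - z_min) / span

Z = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
print(min_max_normalize(Z))  # each column now spans [0, 1]
```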
Table 3: Attack profiles of DoS, R2L, U2R, Probe classes.
[Figure 1: Flow diagram of the proposed methodology. The NSL-KDD data (KDDTrain+ and KDDTest+) undergo pre-processing (outlier analysis, normalization, one-hot encoding) and feature extraction, producing a 102-dimensional feature vector that feeds the AE, LSTM, MLP, L-SVM, Q-SVM, LDA and QDA classifiers, each discriminating the Normal, DoS, R2L and Probe classes.]
3.3.3. One-hot-encoding
The three categorical features protocol type, service and flag (z_2, z_3, z_4, respectively) were transformed into numerical values using the one-hot encoding technique, which represents each categorical attribute by binary values. For example, the z_2 feature (protocol type) has three attributes: tcp, udp and icmp. Applying one-hot encoding, these were converted into the binary vectors [1,0,0], [0,1,0] and [0,0,1], respectively. Similarly, the z_3 and z_4 features (service and flag) were also converted into one-hot vectors. Overall, the 41-dimensional feature vector was mapped into a 122-dimensional one (38 continuous features and 84 binary values derived from z_2, z_3 and z_4), as shown in the sketch below.
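For illustration, a toy pandas sketch of this encoding (ours; the column names merely mirror z_2, z_3 and z_4):

```python
import pandas as pd

# Toy values for the three symbolic features of NSL-KDD.
df = pd.DataFrame({
    "protocol_type": ["tcp", "udp", "icmp"],
    "service":       ["http", "ftp", "smtp"],
    "flag":          ["SF", "S0", "REJ"],
})
# Each symbolic value becomes one binary indicator column, so the three
# protocol types map to three-dimensional one-hot vectors.
encoded = pd.get_dummies(df, columns=["protocol_type", "service", "flag"])
print(encoded.astype(int))
```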
Table 4: NSL-KDD* dataset after discarding the outliers.
Figure 2: Histogram of null values included in the 38 numeric variables of the KDD*Train+ set. Features with more than 80% zeros are represented in red and are removed from the present study.
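A possible NumPy sketch of this selection step (ours; the paper does not publish code) drops the numeric features whose fraction of zero values exceeds 80%:

```python
import numpy as np

def drop_mostly_zero_features(X, max_zero_fraction=0.80):
    """Drop features (columns) whose fraction of zeros exceeds the
    threshold, mirroring the criterion illustrated in Figure 2."""
    zero_fraction = (X == 0).mean(axis=0)
    keep = zero_fraction <= max_zero_fraction
    return X[:, keep], keep

X = np.array([[0.0, 1.0], [0.0, 2.0], [0.0, 3.0], [0.0, 4.0], [0.0, 5.0]])
X_reduced, mask = drop_mostly_zero_features(X)
print(mask)  # [False  True]: the all-zero first column is removed
```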
3.5. Classification
Two deep architectures, based on an AE and an LSTM, and three shallow architectures, based on a standard MLP, SVM (with linear and quadratic kernels) and DA (with linear and quadratic discriminant functions), were developed to detect the normal and abnormal categories of the NSL-KDD dataset (Normal, DoS, R2L and Probe). Details are presented in the following subsections.
3.5.1. Autoencoder
In this study, a deep classifier based on an AE was developed. An AE consists of an encoder and a decoder: first, it transforms the input data vector into a (typically lower-dimensional) representation (encoder); then, it attempts to reconstruct the original input from the compressed vector (decoder). The AE is trained in an unsupervised fashion and is able to capture significant features from unlabeled data [32]. Figure 3 shows a classic AE model with a single hidden layer. The input data vector z is encoded into a lower-dimensional representation e:

e = ς(zW + b)    (4)

where W represents the weight matrix, b is the bias vector and ς denotes the activation function of the encoder. Afterwards, the decoding operation produces the reconstruction of the input (z) from the encoded representation (e):

z̃ = ζ(eW^T + b)    (5)

where ζ denotes the activation function of the decoder and z̃ is the reconstructed vector.
The saturating linear transfer function was used for both the encoding and reconstruction operations: ς(s) = 0 when s ≤ 0, ς(s) = s when 0 < s < 1, and ς(s) = 1 when s ≥ 1.
The reconstruction error between z and z̃ is quantified using the mean squared error (MSE). It is worth mentioning that the minimum error (0.0083) was obtained with 50 hidden neurons. After training the AE[102:50:102], the 50 latent features are used as the input of a dense fully-connected layer with a softmax activation function (the AE50 classifier, Figure 4 (b)). At this stage, the softmax layer is trained in a supervised fashion for binary or multi-classification purposes. Then, a fine-tuning approach is used: the whole architecture, shown in Figure 4 (b), is re-trained with a supervised learning algorithm in order to improve the classification performance. The AE50 classifier was developed using MATLAB R2018a (The MathWorks, Inc., Natick, MA, USA) and trained until the cross-entropy loss function [33] converged, i.e., for 3 × 10² epochs.
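The following Keras sketch (ours; the authors used MATLAB, and relu/sigmoid stand in here for the saturating linear transfer function) illustrates the AE[102:50:102] pre-training followed by the softmax head and end-to-end fine-tuning:

```python
from tensorflow import keras
from tensorflow.keras import layers

n_in, n_hidden, n_classes = 102, 50, 4  # Normal, DoS, R2L, Probe

inputs = keras.Input(shape=(n_in,))
encoded = layers.Dense(n_hidden, activation="relu")(inputs)   # encoder, Eq. (4)
decoded = layers.Dense(n_in, activation="sigmoid")(encoded)   # decoder, Eq. (5)

autoencoder = keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X_train, X_train, ...)  # unsupervised pre-training on MSE

# Softmax head on the 50 latent features; re-training this whole model
# fine-tunes the encoder weights with supervised backpropagation.
outputs = layers.Dense(n_classes, activation="softmax")(encoded)
classifier = keras.Model(inputs, outputs)
classifier.compile(optimizer="adam", loss="categorical_crossentropy",
                   metrics=["accuracy"])
# classifier.fit(X_train, y_train_onehot, epochs=300)  # supervised fine-tuning
```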
Figure 4: (a) AE based classifier. The AE[102:50:102] reduces the 102-dimensional feature vector (z) into 50 latent features (e) and then reconstructs the original input from the 50 compressed features. (b) Afterwards, the feature vector e is used as the input of a final softmax layer (o) for binary or multi-classification. Finally, the whole structure is fine-tuned using the conventional backpropagation algorithm. The figure refers to the multi-class detection task.
An LSTM layer [34] is able to learn long-term dependencies between time steps in a sequence of data. The LSTM layer has two states: the hidden state (or output state), which contains the output at time step t, and the cell state, which stores the information learned from the previous time steps. At each time step t, the hidden and cell states are updated by means of the following gates:
c_t = f_t ⊙ c_{t−1} + i_t ⊙ g_t    (6)

h_t = o_t ⊙ tanh(c_t)    (7)

where:

i_t = σ_g(W_i z + R_i h_{t−1} + b_i)    (8)

f_t = σ_g(W_f z + R_f h_{t−1} + b_f)    (9)

g_t = tanh(W_g z + R_g h_{t−1} + b_g)    (10)

o_t = σ_g(W_o z + R_o h_{t−1} + b_o)    (11)

and where σ_g represents the sigmoid activation function, W the input weight matrices, R the recurrent weight matrices and b the bias vectors.
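A self-contained NumPy sketch of a single LSTM step implementing Eqs. (6)-(11) (ours, for illustration; the dimensions follow the paper's 50-unit layer):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(z, h_prev, c_prev, W, R, b):
    """One LSTM time step; W, R, b are dicts keyed by gate ('i','f','g','o')."""
    i = sigmoid(W['i'] @ z + R['i'] @ h_prev + b['i'])  # input gate, Eq. (8)
    f = sigmoid(W['f'] @ z + R['f'] @ h_prev + b['f'])  # forget gate, Eq. (9)
    g = np.tanh(W['g'] @ z + R['g'] @ h_prev + b['g'])  # candidate, Eq. (10)
    o = sigmoid(W['o'] @ z + R['o'] @ h_prev + b['o'])  # output gate, Eq. (11)
    c = f * c_prev + i * g                              # cell state, Eq. (6)
    h = o * np.tanh(c)                                  # hidden state, Eq. (7)
    return h, c

rng = np.random.default_rng(0)
n_in, n_h = 102, 50
W = {k: 0.1 * rng.standard_normal((n_h, n_in)) for k in "ifgo"}
R = {k: 0.1 * rng.standard_normal((n_h, n_h)) for k in "ifgo"}
b = {k: np.zeros(n_h) for k in "ifgo"}
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_h), np.zeros(n_h), W, R, b)
```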
Figure 6: LSTM architecture. Similarly to the AE architecture, it consists of one hidden layer of 50 units followed by a softmax layer for binary or multi-classification. The architecture in the figure refers to the multi-classification task.
Figure 7: MLP architecture. Similarly to the AE architecture, it consists of one hidden layer of 50 units followed by a softmax layer for binary or multi-classification. The architecture in the figure refers to the multi-classification task.
4. Experimental results
The classification performance was evaluated in terms of precision, recall, F1 score and accuracy:

Precision = TP / (TP + FP)    (12)

Recall = TP / (TP + FN)    (13)

F1 score = 2 · (Precision · Recall) / (Precision + Recall)    (14)

Accuracy = (TP + TN) / (TP + FP + TN + FN)    (15)
where TP (True Positive) is the number of instances correctly detected as anomalous; TN (True Negative) is the number of instances correctly detected as normal; FP (False Positive) is the number of normal traffic patterns misclassified as anomalous; and FN (False Negative) is the number of anomalous traffic patterns erroneously identified as normal. In order to estimate the ability of the classifiers to correctly detect normal and abnormal traffic, the performances of the proposed architectures (AE, LSTM, MLP, L-SVM, Q-SVM, LDA, QDA) were studied in both the binary (Normal, Abnormal) and multi-classification (Normal, DoS, Probe, R2L) modalities. It is to be noted that, since the F1 score combines precision and recall (Eq. (14)), the following considerations are based mainly on this metric.
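A direct transcription of Eqs. (12)-(15) (a sketch of ours, not the authors' evaluation code):

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, F1 score and accuracy as in Eqs. (12)-(15)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, f1, accuracy

# Toy confusion counts, for illustration only.
print(classification_metrics(tp=80, fp=10, tn=90, fn=20))
```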
4.1. Binary classification
Table 6 reports the outcomes of the binary classification experiments, where the abnormal class includes the DoS, Probe and R2L categories. The MLP classifier achieved F1 scores of 86.65% and 70.69% in detecting the normal and abnormal categories, respectively. As regards the SVM classifiers, Q-SVM outperformed L-SVM in terms of average F1 score (81.39% vs. 77.54%). Indeed, the Q-SVM classifier achieved better performance in detecting both the normal and the abnormal class (F1 scores of 87.87% and 74.90%, respectively). As regards discriminant analysis, QDA similarly outperformed LDA in terms of average F1 score (77.78% vs. 75.16%); however, the LDA classifier achieved a better F-measure in detecting normal samples (85.26%), whereas QDA was better in detecting anomalies (71.66%). As regards the deep classifiers, the LSTM architecture achieved an average F1 score of 79.24%, while the proposed deep AE classifier showed the highest average F1 score (81.98%). Moreover, the AE based classifier also outperformed the aforementioned methods in terms of accuracy (Table 5), with a rate of up to 84.21%, as compared to the LSTM, MLP, L-SVM, Q-SVM, LDA and QDA classifiers, which achieved accuracies of 82.04%, 81.65%, 80.8%, 83.15%, 79.27% and 76.84%, respectively. Similar results were obtained for the area under the curve (AUC) of the Receiver Operating Characteristic (ROC) [39]. Indeed, as can be seen in Figure 8 (a), the AE classifier reported the best AUC score (AUC_AE = 95.55%).
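For completeness, a minimal scikit-learn sketch (ours; the scores below are toy values, not the paper's outputs) of how such ROC/AUC figures can be computed:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# y_true marks anomalous traffic (1) vs. normal (0); y_score is the
# classifier's anomaly probability (e.g., a softmax output).
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.8, 0.7, 0.9, 0.3, 0.6, 0.2])
fpr, tpr, _ = roc_curve(y_true, y_score)
print(f"AUC = {roc_auc_score(y_true, y_score):.3f}")
```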
4.2. Multi-classification
Table 7 reports the outcomes of the multi-classification experiments. As in the binary classification analysis, the shallow (MLP, L-SVM, Q-SVM, LDA, QDA) and deep (LSTM, AE) classifiers were compared. The simulation results showed the following. The MLP classifier reported F1 scores of 87.10%, 97.08% and 77.13% for the Normal, DoS and Probe classes, respectively. The Q-SVM outperformed the L-SVM classifier in terms of average F1 score (75.11% vs. 69.73%); specifically, L-SVM reported F1 scores of 86.55%, 96.69% and 86.22%, whereas Q-SVM reported 88.32%, 97.41% and 82.81% for the Normal, DoS and Probe classes, respectively. The LDA, instead, outperformed the QDA classifier in terms of average F1 score (76.49% vs. 64.36%): the LDA classifier reported F-measures of 90.69%, 91.14% and 70.87%, whereas the QDA classifier achieved 87.98%, 74.64% and 47.86% for the Normal, DoS and Probe classes, respectively. However, it is to be noted that the MLP, L-SVM and Q-SVM based classifiers were not able to discriminate the R2L attack category accurately (F1 scores of 11.74%, 9.45% and 31.90%, respectively). The DA classifiers, instead, achieved better results in detecting the R2L anomaly, with F1 scores of 53.27% (LDA) and 46.96% (QDA). As regards the deep classifiers, the LSTM architecture achieved an average F-measure of 67.17%; it reported very good discriminating performance in detecting the Normal, DoS and Probe attack types (F-measures of 86.12%, 96.90% and 84.05%, respectively) but remained inadequate in detecting the R2L
attack class. As regards the deep AE classifier, similarly to the binary classification experiments, it outperformed all the other machine learning algorithms, reporting F1 scores of up to about 98%. Furthermore, it is worth mentioning that the AE classifier also outperformed the LSTM and the conventional techniques in terms of accuracy, achieving the highest value of 87%, whereas the LSTM, MLP, L-SVM, Q-SVM, LDA and QDA classifiers reported accuracies of 80.67%, 81.43%, 81.4%, 83.65%, 83.17% and 79.47%, respectively. Also in this scenario, the result was confirmed by the AUC: as can be seen in Figure 8 (b), the AE classifier reported the best performance (AUC_AE = 96.1%).
Table 5: Accuracies of the proposed AE, LSTM, MLP, L-SVM, Q-SVM, LDA and QDA classifiers for binary and multi-classification.
Table 6: Binary classification performance (Precision, Recall, F1 score) of AE, LSTM, MLP,
L-SVM, Q-SVM, LDA, QDA classifiers.
Precision
Attack class AE LSTM MLP L-SVM Q-SVM LDA QDA
Normal 81.09% 79.00% 78.49% 77.68% 80.84% 75.81% 81.08%
Abnormal 92.91% 91.25% 91.57% 90.96% 91.34% 92.38% 76.34%
AVG 87% 85.13% 85.03% 84.32% 86.09% 84.09% 78.71%
Recall
Attack class AE LSTM MLP L-SVM Q-SVM LDA QDA
Normal 96.96% 96.47% 96.69% 96.55% 96.24% 97.41% 86.94%
Abnormal 63.79% 58.92% 57.57% 55.56% 63.48% 50.22% 67.52%
AVG 80.37% 77.70% 77.13% 76.06% 79.86% 73.81% 77.23%
F1 score
Attack class AE LSTM MLP L-SVM Q-SVM LDA QDA
Normal 88.32% 86.86% 86.65% 86.09% 87.87% 85.26% 83.91%
Abnormal 75.64% 71.61% 70.69% 68.99% 74.90% 65.07% 71.66%
AVG 81.98% 79.24% 78.67% 77.54% 81.39% 75.16% 77.78%
5. Discussion
Table 7: Multi-classification performance (Precision, Recall, F1 score) of AE, LSTM, MLP,
L-SVM, Q-SVM, LDA, QDA classifiers.
Precision
Attack class AE LSTM MLP L-SVM Q-SVM LDA QDA
Normal 85.03% 77.70% 79.46% 78.37% 81.35% 88.01% 84.28%
DoS 97.05% 96.39% 97.21% 96.52% 97.71% 95.45% 97.79%
Probe 69.82% 75.26% 64.33% 77.03% 71.15% 60.20% 31.62%
R2L 99.49% 80.00% 99.19% 95.15% 98.68% 79.62% 99.34%
AVG 87.85% 82.34% 85.05% 86.77% 87.22% 80.82% 78.26%
Recall
Attack class AE LSTM MLP L-SVM Q-SVM LDA QDA
Normal 96.19% 96.58% 96.35% 96.64% 96.59% 93.53% 92.02%
DoS 98.18% 97.42% 96.96% 96.86% 97.11% 87.19% 60.35%
Probe 94.03% 95.16% 96.29% 97.90% 99.03% 86.13% 98.39%
R2L 39.78% 0.81% 6.24% 4.97% 19.03% 40.03% 30.75%
AVG 82.04% 72.49% 73.96% 74.09% 77.94% 76.72% 70.38%
F1 score
Attack class AE LSTM MLP L-SVM Q-SVM LDA QDA
Normal 90.27% 86.12% 87.10% 86.55% 88.32% 90.69% 87.98%
DoS 97.61% 96.90% 97.08% 96.69% 97.41% 91.14% 74.64%
Probe 80.14% 84.05% 77.13% 86.22% 82.81% 70.87% 47.86%
R2L 56.83% 1.61% 11.74% 9.45% 31.90% 53.27% 46.96%
AVG 81.21% 67.17% 68.26% 69.73% 75.11% 76.49% 64.36%
The strengths and effectiveness of the proposed IDS were evaluated using standard measurements, including precision, recall, F1 score and accuracy. The most correlated features were extracted through statistical analysis and used as input to the deep (AE, LSTM) and shallow ML approaches, including MLP, L-SVM, Q-SVM, LDA and QDA. Moreover, both binary (Normal vs. Abnormal) and multi-classification (Normal vs. DoS vs. R2L vs. Probe) were performed. As regards the shallow ML architectures, experimental results showed that the Q-SVM classifier achieved the best performances for both binary (83.15% accuracy) and multi-class discrimination (83.65% accuracy), as compared with the MLP, L-SVM, LDA and QDA classifiers. As regards the deep ML architectures, the AE classifier achieved the highest performances for both binary (84.21% accuracy) and multi-class discrimination (87% accuracy), as compared with the LSTM classifier. Hence, the comparative results with shallow and deep classifiers showed that the deep autoencoder architecture outperformed the other proposed ML approaches. Furthermore, this result was also confirmed in terms of AUC: 95.55% and 96.1% for binary and multi-classification, respectively. It is to be noted that the optimal AE structure was found by estimating the performance of different numbers of hidden layers (HL) and hidden units. Specifically, Table 9 reports the binary and multi-classification accuracies of the different AE architectures. As can be seen, the minimum classification performance was 79.56%, obtained with the AE40,25,12 (in the binary anomaly detection scenario), and 69.04%, obtained with the AE50,30,12 (in the multi-class anomaly detection scenario). However, the highest performance
was achieved by the proposed AE50: 84.21% for binary classification and 87% for multi-classification. For a fair comparison, the shallow and deep networks were developed with the same structure; indeed, both the MLP and the LSTM classifier consisted of 50 hidden units. Furthermore, the proposed deep AE was also compared with the most recent approaches in the literature that use the NSL-KDD dataset. Since most of the existing works focus on discriminating the NSL-KDD attack types, we compared the performance of the AE classifier in the multi-classification modality.
Recently, the authors in [40] proposed a hardware-software co-design machine learning accelerator based on a sequential learning algorithm, achieving an accuracy of up to 76.04% with a training time of 144.5 s. Similarly, the authors in [28] proposed a sparse AE architecture reporting an accuracy of 79.10%, whereas in [27] the authors designed a Gaussian-Bernoulli RBM consisting of 7 layers of 100 neurons, achieving an accuracy of up to 73.23%. A stacked non-symmetric deep AE was developed in [30]: specifically, the authors proposed a 3-hidden-layer AE combined with a Random Forest classifier, achieving a multi-classification accuracy of up to 85.42% with a minimum training time of 644.84 s. In [29], the authors modelled an RNN-IDS with 80 hidden units, reporting a multi-class accuracy of 81.29% and a training time of 11444 s.
In contrast to these approaches, we proposed a statistical analysis driven intelligent AE classifier that achieved a multi-class accuracy of up to 87%. It is to be noted that, although the proposed IDS outperformed the aforementioned works, the margin over Shone et al. [30] was the smallest (about 1.6% in accuracy). Nevertheless, the AE developed here has a very simple architecture, with only one hidden layer of just 50 units. Consequently, the proposed IDS is optimized in terms of the number of learning parameters and of training time: indeed, the training process, executed on a high-performance GeForce RTX 2080 Ti GPU installed on a machine with an Intel(R) Core(TM) i7-8000K CPU and 64 GB RAM, took only 22.53 s.
Table 8: Performance of the proposed IDS compared with recent state-of-the-art techniques.
Method Accuracy
Proposed AE 87%
Proposed LSTM 80.67%
Imamverdiyev et al. [27] 73.23%
Huang et al. [40] 76.04%
Javaid et al. [28] 79.10%
Yin et al. [29] 81.29%
Shone et al. [30] 85.42%
6. Conclusion
In this paper, a novel statistical analysis and autoencoder driven intelligent intrusion detection approach was presented. The proposed IDS was tested on the benchmark NSL-KDD dataset.
Table 9: Evaluation performance of AE with different hidden layers (HL).
Classifier      HL1   HL2   HL3   Accuracy (Binary)   Accuracy (Multi-class)
AE40            40    -     -     80.87%              78.12%
AE40,20         40    20    -     80.65%              77.17%
AE40,20,12      40    20    12    79.97%              77.83%
AE40,25         40    25    -     80.28%              79.36%
AE40,25,12      40    25    12    79.56%              78.23%
AE40,30         40    30    -     80.48%              79.00%
AE40,30,12      40    30    12    79.84%              76.51%
AE50            50    -     -     84.24%              87%
AE50,20         50    20    -     81.07%              78.62%
AE50,20,12      50    20    12    80.77%              75.70%
AE50,25         50    25    -     82.03%              81.84%
AE50,25,12      50    25    12    81.36%              80.65%
AE50,30         50    30    -     81.42%              81.13%
AE50,30,12      50    30    12    80.84%              69.04%
AE60            60    -     -     80.26%              79.82%
AE60,20         60    20    -     80.49%              79.23%
AE60,20,12      60    20    12    79.94%              77.51%
AE60,25         60    25    -     81.28%              79.26%
AE60,25,12      60    25    12    81.18%              78.68%
AE60,30         60    30    -     80.48%              79.27%
AE60,30,12      60    30    12    80.24%              74.50%
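The architecture search summarized in Table 9 can be reproduced in spirit with a loop such as the following hedged Keras sketch (ours; training and evaluation calls are elided):

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_ae_classifier(hidden_sizes, n_in=102, n_classes=4):
    """Stacked encoder with the given hidden sizes plus a softmax head."""
    inputs = keras.Input(shape=(n_in,))
    x = inputs
    for units in hidden_sizes:
        x = layers.Dense(units, activation="relu")(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Candidate depths/widths mirroring Table 9, e.g. AE50 = [50],
# AE50,25,12 = [50, 25, 12].
for hidden in ([40], [40, 20], [50], [50, 25, 12], [60, 30]):
    model = build_ae_classifier(hidden)
    # model.fit(...) and evaluation on KDDTest+ would populate Table 9.
```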
7. Acknowledgements
This work was funded by the UK EPSRC (Engineering and Physical Sciences
Research Council) grant, code: EP/M026981/1.
Figure 8: ROC curves of the proposed AE, LSTM, MLP, L-SVM, Q-SVM, LDA, QDA clas-
sifiers for binary (a) and multi-classification (b).
References
[13] G. Zhong, S. Yan, K. Huang, Y. Cai, J. Dong, Reducing and stretch-
ing deep convolutional activation features for accurate image classification,
Cognitive Computation 10 (1) (2018) 179–186.
[14] N. Zeng, H. Zhang, B. Song, W. Liu, Y. Li, A. M. Dobaie, Facial expression
recognition via learning deep sparse autoencoders, Neurocomputing 273
(2018) 643–649.
[15] L. Wang, B. Jiang, Z. Tu, A. Hussain, J. Tang, Robust pixelwise saliency
detection via progressive graph rankings, Neurocomputing 329 (2019) 433–
446.
[16] M. M. Najafabadi, F. Villanustre, T. M. Khoshgoftaar, N. Seliya, R. Wald,
E. Muharemagic, Deep learning applications and challenges in big data
analytics, Journal of Big Data 2 (1) (2015) 1.
[17] M. Tavallaee, E. Bagheri, W. Lu, A. A. Ghorbani, A detailed analysis
of the kdd cup 99 data set, in: Computational Intelligence for Security
and Defense Applications, 2009. CISDA 2009. IEEE Symposium on, IEEE,
2009, pp. 1–6.
[18] B. Ingre, A. Yadav, Performance analysis of nsl-kdd dataset using ann, in:
Signal Processing And Communication Engineering Systems (SPACES),
2015 International Conference on, IEEE, 2015, pp. 92–96.
[19] L. M. Ibrahim, D. T. Basheer, M. S. Mahmod, A comparison study for
intrusion database (kdd99, nsl-kdd) based on self organization map (som)
artificial neural network, Journal of Engineering Science and Technology
8 (1) (2013) 107–119.
[20] H. Mohamed, H. Hefny, A. Alsawy, Intrusion detection system using ma-
chine learning approaches, Egyptian Computer Science Journal 42 (3).
[21] Y. Gao, Y. Liu, Y. Jin, J. Chen, H. Wu, A novel semi-supervised learning
approach for network intrusion detection on cloud-based robotic system,
IEEE Access.
[22] K. Alrawashdeh, C. Purdy, Toward an online anomaly intrusion detection
system based on deep learning, in: Machine Learning and Applications
(ICMLA), 2016 15th IEEE International Conference on, IEEE, 2016, pp.
195–200.
[23] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, M. Ghogho, Deep
learning approach for network intrusion detection in software defined net-
working, in: Wireless Networks and Mobile Communications (WINCOM),
2016 International Conference on, IEEE, 2016, pp. 258–263.
[24] J. Kim, N. Shin, S. Y. Jo, S. H. Kim, Method of intrusion detection using
deep neural network, in: Big Data and Smart Computing (BigComp), 2017
IEEE International Conference on, IEEE, 2017, pp. 313–316.
[25] B. Yan, G. Han, Effective feature extraction via stacked sparse autoencoder
to improve intrusion detection system, IEEE Access 6 (2018) 41238–41248.
[26] C. Xu, J. Shen, X. Du, F. Zhang, An intrusion detection system using
a deep neural network with gated recurrent units, IEEE Access 6 (2018)
48697–48707.
[27] Y. Imamverdiyev, F. Abdullayeva, Deep learning method for denial of ser-
vice attack detection based on restricted boltzmann machine, Big Data
6 (2) (2018) 159–169.
[28] A. Javaid, Q. Niyaz, W. Sun, M. Alam, A deep learning approach for
network intrusion detection system, in: Proceedings of the 9th EAI In-
ternational Conference on Bio-inspired Information and Communications
Technologies (formerly BIONETICS), ICST (Institute for Computer Sci-
ences, Social-Informatics and Telecommunications Engineering), 2016, pp.
21–26.
[29] C. Yin, Y. Zhu, J. Fei, X. He, A deep learning approach for intrusion
detection using recurrent neural networks, IEEE Access 5 (2017) 21954–
21961.
[30] N. Shone, T. N. Ngoc, V. D. Phai, Q. Shi, A deep learning approach to
network intrusion detection, IEEE Transactions on Emerging Topics in
Computational Intelligence 2 (1) (2018) 41–50.
[31] M. Tavallaee, E. Bagheri, W. Lu, A. A. Ghorbani, NSL-KDD dataset, available at http://www.unb.ca/research/iscx/dataset/iscx-NSL-KDD-dataset.html [Accessed 28 Feb. 2016].
[32] G. E. Hinton, R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313 (5786) (2006) 504–507.
[33] J. Shore, R. Johnson, Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy, IEEE Transactions on Information Theory 26 (1) (1980) 26–37.
[34] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural computa-
tion 9 (8) (1997) 1735–1780.
[35] B. Yegnanarayana, Artificial neural networks, PHI Learning Pvt. Ltd.,
2009.
[36] V. N. Vapnik, An overview of statistical learning theory, IEEE transactions
on neural networks 10 (5) (1999) 988–999.
[37] I. Steinwart, A. Christmann, Support vector machines, Springer Science &
Business Media, 2008.
[38] A. J. Izenman, Linear discriminant analysis, in: Modern multivariate sta-
tistical techniques, Springer, 2013, pp. 237–280.
[39] A. P. Bradley, The use of the area under the roc curve in the evaluation of
machine learning algorithms, Pattern recognition 30 (7) (1997) 1145–1159.
[40] H. Huang, R. S. Khalid, W. Liu, H. Yu, Work-in-progress: a fast online
sequential learning accelerator for iot network intrusion detection, in: Hard-
ware/Software Codesign and System Synthesis (CODES+ ISSS), 2017 In-
ternational Conference on, IEEE, 2017, pp. 1–2.
[41] A. Adeel, H. Larijani, A. Ahmadinia, Random neural network based novel
decision making framework for optimized and autonomous power control
in lte uplink system, Physical Communication 19 (2016) 106–117.
Cosimo Ieracitano received the M.Eng. (summa cum laude) and Ph.D. de-
grees (with additional label of Doctor Europaeus) from the University Mediter-
ranea of Reggio Calabria (UNIRC), Italy, in 2013 and 2019, respectively. He
is currently a Research Fellow at the Neurolab group of the DICEAM Depart-
ment of the same University (UNIRC). He was a Visiting Master Student at
ETH Zurich and a Visiting PhD Student at the University of Stirling in 2013
and 2018, respectively. He is an author/co-author of publications in peer-reviewed national/international journals and of conference contributions. He is Local Ar-
rangements Chair for IEEE WCCI 2020. His main research interests include:
information theory, machine learning, deep learning techniques and biomedical
signal processing, in particular EEG signals of subjects affected by neuropatholo-
gies.
Ahsan Adeel holds B. Eng. (Hons), MSc (EEE), and PhD (Cognitive Com-
puting) degrees. Following a prestigious EPSRC/MRC fellowship at the Uni-
versity of Stirling (2016-18), he is currently a Lecturer (Assistant Professor) in
Computing Science at the University of Wolverhampton, UK, where he is lead-
ing the Conscious Multisensory Integration (CMI) Lab. He is a Visiting Fellow
at MIT Synthetic Intelligence Lab and Computational Neuroscience Lab (Uni-
versity of Oxford). His ongoing multidisciplinary research aims to explore and
exploit the power of advanced AI to design unorthodox brain-inspired cognitive
computing architectures by integrating suitable deep machine learning, reason-
ing, and optimization algorithms. His focused approaches include biophysical
and hardware-efficient neural models, explainable artificial intelligence, opti-
mized resource management, multimodal fusion, context-aware decision-making,
low power 5G IoT devices, and neuromorphic chips.
Francesco C. Morabito (M'89 - SM'00) was the Dean of the Faculty of Engineering and Deputy Rector of the University Mediterranea of Reggio Calabria, Reggio Calabria, Italy, where he is currently a Full Professor of
Electrical Engineering. He is also serving as the Vice-Rector for International
and Institutional Relations. He has authored or co-authored over 400 papers
in international journals/conference proceedings in various fields of engineering
(radar data processing, nuclear fusion, biomedical signal processing, nondestruc-
tive testing and evaluation, machine learning, and computational intelligence).
He has co-authored 15 books and holds three international patents. Prof. Mora-
bito is a Foreign Member of the Royal Academy of Doctors, Spain, in 2004, and
a member of the Institute of Spain, Barcelona Economic Network, in 2017. He
served as the Governor of the International Neural Network Society for 12 years
and as the President of the Italian Neural Network Society from 2008 to 2014. He is
a member on the editorial boards of various international journals, including
the International Journal of Neural Systems, Neural Networks, International
Journal of Information Acquisition, and Renewable Energy.
Amir Hussain obtained his B.Eng. (with the highest 1st Class Honors)
and Ph.D. (in novel neural network architectures and algorithms) from the Uni-
versity of Strathclyde in Glasgow, Scotland, UK, in 1992 and 1997 respectively.
Following postdoctoral and academic positions at the University of the West of Scotland (1996-98), the University of Dundee (1998-2000), and the University of Stirling
(2000-2018) respectively, he joined Edinburgh Napier University, in Scotland,
UK, in 2018, as Professor of Computing Science, and founding Director of the
Cognitive Big Data and Cybersecurity (CogBiD) Research Laboratory. His re-
search interests are cross-disciplinary and industry focussed, and include secure
and context-aware 5G-IoT driven AI, and multi-modal cognitive and sentic com-
puting techniques and applications. He has published more than 400 papers,
including over a dozen books and around 150 journal papers. He has led ma-
jor national, European and international projects and supervised more than 30 PhD students. He is founding Editor-in-Chief of two leading journals: Cogni-
tive Computation (Springer Nature), and BMC Big Data Analytics (BioMed
Central); and Chief-Editor of the Springer Book Series on Socio-Affective Com-
puting, and Cognitive Computation Trends. He has been appointed invited
Associate Editor of several prestigious journals, including the IEEE Transac-
tions on Neural Networks and Learning Systems, the IEEE Transactions on
Emerging Topics in Computational Intelligence, and (Elsevier) Information Fu-
sion. He is Vice-Chair of the Emergent Technologies Technical Committee of
the IEEE Computational Intelligence Society (CIS), and Chapter Chair of the
IEEE UK and RI Industry Applications Society.
Declaration of interests
The authors declare that they have no known competing financial interests or
personal relationships that could have appeared to influence the work reported
in this paper.