Network Intrusion Detection Using Feature
*Correspondence: [email protected]
1 School of Computing and Digital Technology, Birmingham City University, Birmingham, UK
2 METCLOUD LTD, Birmingham, UK
3 Faculty of Computers and Information, Assiut University, Assiut, Egypt
4 Department of Computer Science, University of Exeter, Exeter, UK
Abstract
Network intrusion detection systems (NIDSs) are one of the main tools used to defend
against cyber-attacks. Deep learning has shown remarkable success in network intrusion
detection. However, the effect of feature fusion has yet to be explored in how to
boost the performance of the deep learning model and improve its generalisation
capability in NIDS. In this paper, we propose novel deep learning architectures with
different feature fusion mechanisms aimed at improving the performance of the
multi-classification components of NIDS. We propose three different deep learning models,
which we call early-fusion, late-fusion, and late-ensemble learning models using
feature fusion with fully connected deep networks. Our feature fusion mechanisms
were designed to encourage deep learning models to learn relationships between
different input features more efficiently and mitigate any potential bias that may occur
with a particular feature type. To assess the efficacy of our deep learning solutions
and make comparisons with state-of-the-art models, we employ the widely accessible
UNSW-NB15 and NSL-KDD datasets specifically designed to enhance the development
and evaluation of improved NIDSs. Through quantitative analysis, we demonstrate
the resilience of our proposed models in effectively addressing the challenges
posed by multi-classification tasks, especially in the presence of class imbalance issues.
Moreover, our late-fusion and late-ensemble models showed the best generalisation
behaviour (against overfitting) with similar performance on the training and validation
sets.
Keywords: Feature fusion, Deep learning, Fully-connected networks, Network
intrusion detection
Introduction
According to the Gartner report, “Market Guide for AIOps Platforms” [1], the rapid
growth in event data cannot wait for humans to derive insights. There is a need for auto-
mation and support from machine learning (ML) in IT security operations. The report
also mentions that rule-based event correlation has given way to AI-based correlation
due to the speed at which the correlation rules must be updated. Cybersecurity vendors
likewise commonly state that traditional signature-based solutions can be
beaten by advanced threats such as polymorphic malware, hence the need for more
adaptive solutions using ML.
© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits
use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original
author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third
party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or
exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/.
Ayantayo et al. Journal of Big Data (2023) 10:167 Page 2 of 24
One of the main tools used to defend against cyber-attacks is the Network Intru-
sion Detection System (NIDS). A NIDS is an appliance that monitors network traffic
from a cybersecurity point of view. It is installed in a strategic location on a network,
often just inside a perimeter firewall. It takes in a stream of packets and sends alerts
to operators, Security Information and Event Management systems (SIEMs) and/or
other devices and applications, whenever it detects events of potential cybersecurity
significance. Devices that can take appropriate action in addition to detecting poten-
tially malicious activity are known as Network Intrusion Prevention or Detection and
Prevention Systems (NIPS or NIDPS) [2].
A NIDS commonly uses three complementary event detection methods, individu-
ally or in combination:
• Signature-based: the NIDS examines individual packets that comply with specified
conditions, looking for textual patterns that are characteristic of known malicious
activity. Such ‘smoking gun’ characteristics are known as Indicators of Compro-
mise (IoC).
• Anomaly-based: the NIDS compares the traffic conditions it observes with known
profiles representing the normal behaviour of entities such as users, hosts, net-
work connections and applications. An alert is issued if the observed behaviour
is significantly different from the profiles. Profiles are generated using machine
learning techniques by observing traffic over time under typical usage conditions
when malicious activity is not believed to be present.
• Stateful protocol analysis-based: The NIDS models the state of dialogues between
hosts based on the packets they exchange. An alert is issued if an exchange devi-
ates from what is generally expected for the protocol in question.
Anomaly detection techniques are not dependent on signatures, but rather on the
(reasonable) assumption that malicious activity will reveal itself in changes to the
behaviour of the system being monitored. They should be able to detect novel attacks,
not be fooled by variations in existing attacks, and not be subject to the time lag
between the first use of an attack and the ability to detect it. They are complementary
to signature-/rule-based techniques, each detecting attacks that the other may miss.
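The anomaly-based idea described above can be illustrated with a minimal sketch. It assumes a single traffic metric, a mean/standard-deviation profile, and a z-score threshold of 3; the metric, the baseline values, and the threshold are hypothetical choices for illustration and are not taken from any particular NIDS.

```python
from statistics import mean, stdev

def build_profile(baseline):
    """Summarise attack-free observations of one traffic metric as (mean, std)."""
    return mean(baseline), stdev(baseline)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        # A constant baseline: any deviation at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical packets-per-second counts observed under normal usage.
baseline_pps = [98, 102, 101, 99, 100, 103, 97, 100]
profile = build_profile(baseline_pps)

print(is_anomalous(101, profile))  # close to the profile
print(is_anomalous(500, profile))  # far from the profile
```

A real profile would cover many metrics per user, host, or connection, but the principle of comparing observed behaviour with a learned baseline is the same.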
1. Early-fusion scenario
• Integration of various feature types into a unified feature vector, serving as input
for a fully-connected network.
2. Late-fusion scenario
Related work
There is considerable interest in applying machine learning techniques to improve threat
detection performance, and many publicly available datasets have been generated to
facilitate this research, including 1998 DARPA, KDDCup99, NSL-KDD, and UNSW-
NB15 [5–8]. In [9], researchers propose a generic layered framework for applying
machine learning to threat detection and discuss the different unsupervised and super-
vised machine learning techniques that can be applied. In [10], three different machine
learning techniques (support vector machine, decision trees, and deep belief network)
were used in several cybersecurity datasets for malware detection, spam detection,
and intrusion detection. It has been pointed out that there is no silver bullet to tackle
cyberthreats: different machine learning techniques need to be used depending on the
type of threat, and there is a need for more open data, since the data currently available
lacks diversity and advanced attacks. Ref. [11] proposes a combination of supervised and unsupervised
machine learning using random forest and k-means on an intrusion detection dataset.
The research suggests that this hybrid approach yields better results than traditional
methods. Ref. [8] used the NSL-KDD dataset to evaluate different supervised machine
learning approaches, including Random Forest Classifier (RFC), comparing performance
metrics such as precision, recall, and F1-Score to determine the most effective classifier
for intrusion detection. According to the results of the experiment for the NSL-KDD
dataset and the set of parameters, the random forest classifier outperformed other sta-
tistical machine learning classifiers. Ref. [12] proposed to use a Random Effects Logistic
Regression (RELR) model to forecast the discovery of anomalies. It employed a ran-
dom-effects model to account for network environment features and unaccounted for
uncertainty, introducing a wrapper feature selection phase based on fixed-effects logistic
regression (FELR). The UNSW-NB15 dataset has been used to conduct a study to deter-
mine the types of cyber attacks that have occurred, using both k-means and correlation
analysis for feature selection, followed by Naive Bayes and decision trees for classification
[13]. As a result of this hybrid feature selection procedure, there was a noticeable
improvement in the Naive Bayes classifier’s accuracy, but the decision trees performed
similarly with or without feature selection. The proposed approach was able to identify
more uncommon threats, including BackDoor, Shellcode, and Worms. Ref. [14] provides
a novel technique based on the k-Nearest Neighbour classifier approach for simulating
program behaviour in intrusion detection systems. Early trials using 1998 DARPA BSM
audit data indicate that this technique is capable of detecting invasive software activity
efficiently. Compared to previous approaches that use short system call sequences, the
kNN classifier does not need distinct profiles of short system call sequences for vari-
ous programs, significantly reducing the work required to identify new program activ-
ity. Furthermore, the findings demonstrate that a low false positive rate is achievable.
While this conclusion may not hold for more complicated datasets, text classification
approaches seem to be well suited for use in the intrusion detection sector.
In [15], a multilevel hybrid intrusion detection model (based on a support vector
machine (SVM) and an extreme learning machine) has been designed to improve the
efficiency of network intrusion detection, where the KDDCup99 dataset was used to
evaluate the performance of the model. An SVM-based intrusion detection system was
previously proposed in [16] that combines hierarchical clustering for feature selection
with an SVM model to reduce the training time on the KDDCup99 dataset. Motivated
by many reported shortcomings in the KDDCup99 datasets, such as the curse of high
dimensionality, in [17], different machine learning models have been used with multi-
ple sets of features to study the importance of features and improve intrusion detection
rates in the UNSW-NB15 and KDDCup99 datasets. In [18], many generative and dis-
criminative approaches (such as XGBoost, Support Vector Machine, k-Nearest-Neigh-
bour, Logistic Regression, Artificial Neural Network, and Decision Tree) have been used
with/without feature selection on the UNSW-NB15 dataset, studying the effect of fea-
ture selection stage. An integrated classification-based IDS was proposed in [19], where
the performance has been evaluated on the UNSW-NB15 dataset showing better accuracy
compared to traditional models such as the decision tree model. With the aim of
improving the efficiency of detecting attacks and reducing the false alarm rate, the
significance and characteristics of the features of the UNSW-NB15 and KDDCup99 datasets were examined
in [20] using an Association Rule data mining approach. Similarly, an Association Rule
approach under rough set theory [21] was proposed to model IDSs. In [22], a fuzzy rule-
based system was proposed, which is designed to find the optimal feature subset using
a genetic feature selection wrapper and providing interpretable fuzzy IF-THEN rules on
the KDDCup99 dataset. In [23], a multilevel semi-supervised ML (MSML) approach was
proposed to cope with the network traffic class imbalance problem in the KDDCup99
dataset. In [24], a dual ensemble model has been proposed that combines bagging and
gradient boosting decision tree (GBDT) techniques, showing its superiority in reducing
false alarms and increasing detection rates for anomaly-based intrusion detection sys-
tems compared to existing approaches. Moreover, [25] provides a comprehensive over-
view of how ensemble learners are utilised in intrusion detection systems (IDSs) through
feature selection method, IGRF-RFE, which combines information gain (IG) and random
forest (RF) filter methods with recursive feature elimination (RFE) wrapper method,
demonstrating its effectiveness in improving anomaly detection accuracy on the UNSW-NB15
dataset. However, the study only considers six classes during training and removes
minority classes in the preprocessing stage. Salim et al. [37] propose a novel deep learning
strategy using bidirectional long short-term memory (LSTM) and a symmetric logarith-
mic loss function to address limitations of current intrusion detection systems (IDS) on
the Internet of Things (IoT), achieving high accuracy rates on benchmark datasets such
as NSL-KDD and UNSW-NB15, but limited to binary classification tasks.
Having reviewed the state-of-the-art related models, it is evident that despite the nota-
ble success of classical machine learning and deep learning in network intrusion detec-
tion, feature selection remains a challenging problem and feature fusion mechanisms
have not been explored explicitly. Due to the presence of different feature types, it is still
not clear how the different features affect the resulting accuracy of deep learning mod-
els. As a consequence, this work focuses on guiding the learning process of deep learn-
ing models using features of the same type with different feature fusion mechanisms to
encourage the deep learning model to learn the relationships between the different types
of features, as detailed in the following section.
Datasets
In this work, we used the UNSW-NB15 and NSL-KDD datasets to validate and evaluate
the performance of our deep learning solutions.
UNSW-NB15 contains real normal network traffic and synthetic attack behaviours.
Packet data representing the mixture of normal and abnormal traffic was passed through
a network audit tool (Argus) to produce flow data (a flow represents a collection of pack-
ets exchanged between two end-points), and a network traffic analyser (Bro-IDS, now
known as Zeek) to produce connection-related data. The connections and flows were
aligned, resulting in a total of 35 packet-based and flow-based features, from which a
further 12 features were extracted using bespoke algorithms. Of the 47 features, 2 are
timestamps, 2 are binary, 5 are strings, 28 are integers, and 10 are floating-point. In addi-
tion, each entry has a binary label according to whether it represents normal or abnor-
mal traffic, and a string label categorising it as Normal or as being an example of one of
9 types of attack (Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance,
Shellcode, Worms). This work uses a version of the dataset in which the source/destination
address and port features and the start/end time features have been removed and a new float
feature (rate) has been introduced to describe the average packet transmission rate.
Consequently, a total of 42 features have been selected to train and validate our models:
11 float features, 28 integer features (including two binary features), and 3 string fea-
tures, see Fig. 1. The target variable has 10 unique classes, see Fig. 2, which highlights
that the dataset has imbalanced classes, with “Normal” being the most represented class
with over 90,000 observations and “Worms” being the least represented with fewer than
50 observations.
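Because the fusion mechanisms operate on features grouped by type, it may help to see how one record can be split into float, integer, and string groups. The following is a minimal sketch under stated assumptions: the function name, the example record, and its values are hypothetical, and binary flags are counted with the integers, as in the feature counts above.

```python
def group_features_by_type(record):
    """Split one flow record into float, integer, and string feature groups."""
    groups = {"float": {}, "integer": {}, "string": {}}
    for name, value in record.items():
        if isinstance(value, int):
            # bool is a subclass of int, so binary flags land here as 0/1.
            groups["integer"][name] = int(value)
        elif isinstance(value, float):
            groups["float"][name] = value
        elif isinstance(value, str):
            groups["string"][name] = value
        else:
            raise TypeError(f"unsupported feature type for {name!r}")
    return groups

# Illustrative record; the column names and values are assumptions.
example_flow = {
    "dur": 0.12, "rate": 74.08,          # float features
    "spkts": 6, "dpkts": 4,              # integer features
    "is_sm_ips_ports": False,            # binary feature
    "proto": "tcp", "service": "http", "state": "FIN",  # string features
}
groups = group_features_by_type(example_flow)
print({kind: sorted(cols) for kind, cols in groups.items()})
```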
NSL-KDD is an edited version of the KDD-Cup99 intrusion detection dataset. NSL-
KDD addresses problems present in KDD-Cup99 such as duplicated records that could
lead to high bias when classifiers are trained on it. The original dataset was created based
on information obtained during the DARPA 1998 IDS assessment program. It includes
41 features, among which 3 are strings, 23 are integers (including 5 binary features), and
15 are floating-point. Each record is labelled as one of 39 distinct attack types. These
attack types fit into one of four categories, which we use as the target variable: Denial
of Service (DoS), Unauthorized Access to Local Superuser Privileges (U2R), Unauthor-
ized Access from a Remote Machine (R2L), and Probing or Surveillance (probe), see
Fig. 3. Normal traffic is also represented in the data, labelled as “normal”. The dataset is
provided already split between training and testing sets with distinct distributions. The
training set contains 125,973 observations and the test set 22,544 observations. An inter-
esting characteristic of the test set is the presence of 17 attack types absent from the
training set, making 16.6% of the test set or 3750 observations. To observe the impact of
these “unknown” attack types on our classifier, we created a distinct test set not includ-
ing these observations.
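Building the distinct test set amounts to dropping every test record whose attack type never appears in the training set. The sketch below uses hypothetical record dictionaries with an `attack_type` key and a toy selection of attack names; only the final arithmetic uses the figures quoted above (3750 of 22,544 test observations).

```python
def split_by_known_attacks(train_attack_types, test_records):
    """Separate test records into those with attack types seen in training and the rest."""
    known = set(train_attack_types)
    seen = [r for r in test_records if r["attack_type"] in known]
    unseen = [r for r in test_records if r["attack_type"] not in known]
    return seen, unseen

# Toy data for illustration only.
train_attack_types = ["normal", "neptune", "smurf", "ipsweep"]
test_records = [
    {"id": 1, "attack_type": "neptune"},
    {"id": 2, "attack_type": "mscan"},
    {"id": 3, "attack_type": "normal"},
    {"id": 4, "attack_type": "snmpguess"},
]
seen, unseen = split_by_known_attacks(train_attack_types, test_records)

# Share of the real test set made up of attack types absent from training.
unknown_share = round(100 * 3750 / 22544, 1)
print(len(seen), len(unseen), unknown_share)
```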
predictions. The early-fusion model can cope with the different raw input features by
fusing the different pre-processed signals into a single network to learn shared local and
hierarchical features for the classification of the different classes. Algorithm 1 represents
the Early Fusion Model, which aims to generate classification predictions for input data
consisting of different signals S1 , S2 , ..., Sn. The algorithm begins by applying various pro-
cessing layers, including the integer layer, normalisation layer, and string lookup layer,
to transform each input signal. These transformed signals are then concatenated into a
single vector. Subsequently, a fully connected network is created with three fully con-
nected layers, each using a ReLU activation function and a dropout layer for regularisa-
tion. Finally, a classification layer with a softmax classifier is added, and the algorithm
returns the classification predictions for the input data.
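The processing and concatenation steps of the early-fusion model can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it uses one column per feature type, z-score normalisation for numeric columns, and a one-hot string lookup; the exact behaviour of the integer layer is not specified here, so scaling integers like floats is an assumption, and the fully connected layers that follow are omitted.

```python
from statistics import mean, pstdev

def normalise(values):
    """Z-score normalisation of one numeric column."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def one_hot_lookup(values):
    """Map each string to a one-hot vector over the sorted vocabulary."""
    vocab = sorted(set(values))
    index = {token: i for i, token in enumerate(vocab)}
    return [[1.0 if index[v] == i else 0.0 for i in range(len(vocab))]
            for v in values]

def early_fusion_inputs(float_col, int_col, str_col):
    """Process each signal separately, then concatenate into one vector per row."""
    floats = normalise(float_col)
    ints = normalise([float(v) for v in int_col])  # assumed integer treatment
    strings = one_hot_lookup(str_col)
    return [[f, i] + s for f, i, s in zip(floats, ints, strings)]

# Two hypothetical rows: a float, an integer, and a string feature each.
fused = early_fusion_inputs([1.0, 3.0], [2, 4], ["tcp", "udp"])
print(fused)
```

Each fused row would then serve as the single input vector of the fully connected network.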
which is then fed into a classification layer with a softmax classifier to generate predic-
tions. The number of output nodes in the classification layer corresponds to the number
of classes in the classification problem. Finally, the algorithm returns the classification
predictions for the input data.
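A minimal sketch of the final stage of the late-fusion model is given below. It assumes, following the description in the caption of Fig. 5, that the outputs of the per-type sub-models are averaged element-wise before a softmax classification layer; the vector length (2 instead of 128), the weights, and the biases are toy values chosen for illustration.

```python
import math

def average_outputs(sub_outputs):
    """Element-wise average of the sub-model output vectors."""
    n = len(sub_outputs)
    return [sum(vals) / n for vals in zip(*sub_outputs)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(fused, weights, biases):
    """Dense classification layer with one output node per class."""
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

# Toy outputs of the float, integer, and string sub-models.
sub_outputs = [[1.0, 0.0], [0.0, 2.0], [2.0, 1.0]]
fused = average_outputs(sub_outputs)

# Hypothetical weights for a 3-class problem (one row per class).
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.0]
probs = classify(fused, weights, biases)
print(fused, probs)
```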
Training settings
In the training stage of all the proposed deep learning models, a stochastic gradient
descent approach called Adam optimisation is used to minimise a categorical cross-
entropy loss function. Adam optimisation is computationally and memory efficient,
invariant to diagonal rescaling of gradients, and well suited for extensive data and
parameter tuning situations.
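To make the rescaling property concrete, the following sketch implements a single Adam update for one scalar parameter. The hyperparameter values are the commonly used defaults and are an assumption here, since the settings used in this work are not stated; the example shows that the first step has nearly the same size whether the gradient is 0.5 or 50.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Same starting point, gradients differing by a factor of 100.
p_small, _, _ = adam_step(1.0, 0.5, m=0.0, v=0.0, t=1)
p_large, _, _ = adam_step(1.0, 50.0, m=0.0, v=0.0, t=1)
print(1.0 - p_small, 1.0 - p_large)  # both steps are close to lr
```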
The categorical cross-entropy loss function that we used for all the proposed models
is defined as
$$E_{CCE}\left(y_i, y'(x_j, W)\right) = -\sum_{i=1}^{c} y_i \ln y'(x_j, W), \tag{1}$$
where $x_j$ is the set of input examples (or rows), $y_i$ is the class labels, $c$ is the number of
classes, while $y'(x_j, W)$ is the predicted output from a Softmax function, where $W$ is the
$$\text{Precision} = \frac{1}{c}\sum_{i=1}^{c}\frac{TP_i}{TP_i + FP_i}, \tag{3}$$
$$\text{Recall} = \frac{1}{c}\sum_{i=1}^{c}\frac{TP_i}{TP_i + FN_i}, \tag{4}$$
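where $c$ is the number of classes in the dataset (e.g., 10 classes for UNSW-NB15), $TP$ and
$TN$ are the numbers of true positives and true negatives, while $FP$ and $FN$ are the numbers of
false positives and false negatives, i.e. incorrect model predictions.
The TP, TN, FP, and FN are defined based on a specific class $i$ as:
$$TP_i = x_{ii}, \tag{5}$$
$$TN_i = \sum_{j=1}^{c}\sum_{k=1}^{c} x_{jk}, \quad j \neq i,\ k \neq i, \tag{6}$$
$$FP_i = \sum_{j=1}^{c} x_{ji}, \quad j \neq i, \tag{7}$$
$$FN_i = \sum_{j=1}^{c} x_{ij}, \quad j \neq i, \tag{8}$$
where $x_{ij}$ is the number of examples of class $i$ predicted as class $j$, so that $x_{ii}$ is an
element in the diagonal of the multi-class confusion matrix (as pointed out in [38]).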
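The per-class counts and the macro-averaged precision and recall of Eqs. (3) and (4) can be computed directly from a confusion matrix. The sketch below assumes rows are actual classes and columns are predicted classes, takes the diagonal entry as the true-positive count of each class, and uses a toy 3-class matrix; returning 0.0 for a class with an empty denominator is a convention chosen here, not something stated in the text.

```python
def per_class_counts(cm, i):
    """Return (TP, TN, FP, FN) for class i; cm[actual][predicted]."""
    c = len(cm)
    tp = cm[i][i]
    fp = sum(cm[j][i] for j in range(c) if j != i)
    fn = sum(cm[i][j] for j in range(c) if j != i)
    tn = sum(cm[j][k] for j in range(c) for k in range(c) if j != i and k != i)
    return tp, tn, fp, fn

def macro_precision_recall(cm):
    """Average the per-class precision and recall over all classes."""
    c = len(cm)
    precisions, recalls = [], []
    for i in range(c):
        tp, _, fp, fn = per_class_counts(cm, i)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / c, sum(recalls) / c

# Toy confusion matrix for illustration.
cm = [[5, 1, 0],
      [2, 6, 0],
      [0, 1, 5]]
counts_class0 = per_class_counts(cm, 0)
precision, recall = macro_precision_recall(cm)
print(counts_class0, precision, recall)
```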
Finally, for a fair comparison, although the deep learning architectures of our models
were different, the number of neurons in each layer was selected by the same hyperparameter
tuning mechanism. The Keras Tuner package was used in this work to search for
the best hyperparameters for the deep learning models. Furthermore, L2 regularisation
(with a factor of 0.01) was used with all models, and each model was trained for 100 epochs
with a batch size of 32.
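As a worked illustration of the training objective, the sketch below evaluates the categorical cross-entropy of Eq. (1) for one example and adds an L2 penalty with a factor of 0.01. The label, the predicted probabilities, and the weights are toy values, and defining the penalty as the factor times the sum of squared weights is an assumption about how the regulariser is applied.

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def l2_penalty(weights, factor=0.01):
    """L2 regularisation term: factor times the sum of squared weights."""
    return factor * sum(w * w for row in weights for w in row)

y_true = [0.0, 1.0, 0.0]           # one-hot label for class 1
y_pred = [0.2, 0.7, 0.1]           # hypothetical softmax output
weights = [[1.0, -2.0], [0.5, 0.0]]  # hypothetical layer weights

cce = categorical_cross_entropy(y_true, y_pred)
penalty = l2_penalty(weights)
total_loss = cce + penalty
print(cce, penalty, total_loss)
```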
Experimental results
This section analyses and discusses the performance of the proposed deep learning
models, demonstrates the effectiveness of our data fusion approaches and the robustness
of combining multiple models over a single deep learning model, and finally compares
the performance of our models to the state-of-the-art models.
Table 1 The performance (%) of our deep learning models on the UNSW-NB15 dataset
DL model               Metric      Train   Validation   Test
Early-fusion model     Accuracy    90.99   84.44        76.47
                       Precision   96.29   90.0         83.53
                       Recall      87.46   80.87        71.59
Late-fusion model      Accuracy    85.92   83.92        77.09
                       Precision   93.54   91.49        86.04
                       Recall      81.45   79.20        69.50
Late-Ensemble model    Accuracy    83.67   83.09        76.84
                       Precision   91.75   90.88        85.92
                       Recall      78.55   77.64        68.18
our late-fusion model than in the early-fusion model. This confirms the capability of the
late-fusion model to learn more generic features. It also shows better performance on
the testing set.
Our ensemble model showed an accuracy of 83.67% on the training set, 83.09% on the
validation set, and 76.84% on the testing set. Although the late-fusion model achieved the
highest accuracy on the test set, the ensemble deep learning model showed more
generalisation capability, with the behaviour of the model almost the same for both training
and validation sets (see Fig. 7).
In terms of precision and recall, the late-fusion model achieved the highest testing
precision (86.04%), while the early-fusion model showed the highest testing recall
(71.59%). Moreover, the late-ensemble model showed the best generalisation behaviour
(against over-fitting), with similar performance (in terms of precision and recall) on the
training and validation sets, while the early-fusion model is the most sensitive to the
over-fitting problem.
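The generalisation behaviour can be quantified as the gap between training and validation accuracy. The short calculation below uses the accuracy values reported in Table 1; the dictionary layout and variable names are illustrative.

```python
# (train, validation, test) accuracy in %, as reported in Table 1.
table1_accuracy = {
    "early-fusion": (90.99, 84.44, 76.47),
    "late-fusion": (85.92, 83.92, 77.09),
    "late-ensemble": (83.67, 83.09, 76.84),
}

# Train-validation gap in percentage points; smaller means less over-fitting.
gaps = {model: round(train - val, 2)
        for model, (train, val, _) in table1_accuracy.items()}
smallest_gap_model = min(gaps, key=gaps.get)
print(gaps, smallest_gap_model)
```

The gap shrinks from 6.55 points for early fusion to 2.00 for late fusion and 0.58 for the late ensemble, which matches the ordering described in the text.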
To demonstrate the sensitivity of our models to the individual classes, Figs. 8, 9, and 10
show the ROC curves of our models for the 10 classes. Both the early- and late-fusion
models show robust behaviour for most classes but are less robust on the Analysis
class, while the late-ensemble model shows low performance on both the Analysis
and Worms classes.
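A per-class ROC curve treats one class as positive and all other classes as negative, then sweeps a threshold over the predicted score for that class. The following is a minimal sketch of that one-vs-rest construction with a trapezoidal area under the curve; the scores and labels are toy values, and the function names are illustrative.

```python
def roc_points(scores, is_positive):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    pos = sum(is_positive)
    neg = len(is_positive) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, is_positive) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, is_positive) if s >= threshold and not y)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy scores for one class and whether each example truly belongs to it.
scores = [0.9, 0.8, 0.4, 0.3]
is_positive = [True, False, True, False]
points = roc_points(scores, is_positive)
area = auc(points)
print(points, area)
```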
Ablation study
To better demonstrate the behaviour of our proposed deep learning models, we applied
the same feature-fusion mechanisms without the proposed processing layer (resulting
in three special versions of the early-fusion, late-fusion, and late-ensemble models). We
also compared the performance of our models with the traditional fully connected
network. We report the results of the four models in Table 2. The ROC curves in Fig. 11
illustrate the behaviour of the four models on the individual classes.
Table 2 The performance of special versions of our deep learning models and other architectures
Model Train Validation Test
Performance (%)
Table 3 The performance (%) of our deep learning models on the NSL-KDD dataset
DL model               Metric      Train   Validation   Test
Early-fusion model     Accuracy    99.74   99.66        84.50
                       Precision   99.76   99.69        84.98
                       Recall      99.72   99.63        84.31
Late-fusion model      Accuracy    99.50   99.51        83.73
                       Precision   99.57   99.57        84.01
                       Recall      99.41   99.42        83.52
Late-Ensemble model    Accuracy    99.55   99.43        86.81
                       Precision   99.56   99.46        86.86
                       Recall      99.54   99.41        86.80
Table 4 The testing performance of the state-of-the-art machine learning models on the multi-
classification task of the UNSW-NB15 and NSL-KDD Dataset
Model Refs. Dataset Accuracy Precision Recall
(%) (%) (%)
without any preprocessing requirements. Regarding the NSL-KDD dataset, all the
compared models exhibit comparable accuracy, with the quadratic discriminant
analysis (QDA) method yielding the minimum accuracy of 64.36%, while the AE
method attains the highest precision of 87.85%. However, our late-ensemble model surpasses all
other methods in terms of both accuracy (86.81%) and recall (86.80%).
Discussion
In this section, we discuss practical considerations in making use of the ML-based
approach described above to improve the effectiveness of a NIDS in detecting and
characterising malicious activity. As mentioned above, current NIDSs combine
detection mechanisms that are signature-based, anomaly-based, and protocol analysis-based.
In contrast, the model we have proposed is a multi-way classifier, in which one
of the classes represents normal behaviour and the others represent different types of malicious
behaviour.
Fig. 4 The overall architecture of the early-fusion model, showing the connection between the different
input signals, processing layer, the densely/fully connected layers (where neurons are actively connected
using black arrows while red arrows are for the dropped out neurons or connections), and the classification
layer
Fig. 5 The overall architecture of the late-fusion model, each data type was passed to a sub-model and each
sub-model output has 128 neurons with ReLU non-linear activation function. Then, the average predictions
of the sub-models define the output layer with the Softmax classifier
Fig. 6 The integration of early-fusion and late-fusion models as an ensemble learning mechanism
Fig. 7 Training and validation accuracy and loss curves of our deep learning models on the
UNSW-NB15 dataset
Fig. 8 The ROC curves of our early-fusion model on the UNSW-NB15 dataset
Fig. 9 The ROC curves of the late-fusion model in the UNSW-NB15 dataset
Fig. 10 The ROC curves of the late-ensemble model on the UNSW-NB15 dataset
Fig. 11 The ROC curves of the four special versions of our deep learning models as a part of our ablation
study
Fig. 12 The ROC curves of the late-ensemble model on the NSL-KDD dataset
environment. Generating or capturing and labelling training data for the attack types
is therefore an issue. Further investigation is required to establish the degree of gen-
eralisation that is occurring, e.g. whether the classifier is merely learning attack sig-
natures, is successfully generalising to discover more abstract characteristics of the
various attack types used to create a training set, or at a higher level is discovering
some inherent difference between normal and malicious traffic. This final eventuality
would greatly reduce the need for context/environment-specific training.
Our view is that a multi-way classifier is complementary to the existing mecha-
nisms, and is best seen as a fourth element in a hybrid system. Hybrid schemes need
to combine the outputs of their constituent mechanisms in some way, taking into
account their strengths and weaknesses, in order to decide whether to issue an alert.
As far as possible, both false positives and false negatives should be minimised, but
a degree of trade-off is inevitable. In a NIDS context, a false positive is an irritation
and waste of time, whereas a false negative is potentially catastrophic. Therefore, one
needs to be very confident before rejecting a positive signal from any mechanism. On
the other hand, a naïve conservative approach whereby an alert is generated if any
single mechanism reports a positive is likely to overwhelm SOC analysts. It is difficult
to say what levels are acceptable, but the vast majority of attacks should be detected
(recall ≈ 1), and sources suggest that a real alarm rate (precision) of between 60% and
90% is achievable.
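The trade-off between alerting on any positive signal and requiring agreement between mechanisms can be illustrated with a small sketch. Everything in it is hypothetical: the four mechanisms' outputs, the ground truth, and the voting rule are toy choices made for illustration and do not describe a combination scheme evaluated in this work.

```python
def combine(votes_per_event, min_votes):
    """Raise an alert for an event when at least `min_votes` mechanisms fire."""
    return [sum(votes) >= min_votes for votes in votes_per_event]

def precision_recall(alerts, truth):
    """Precision and recall of a list of alerts against ground truth."""
    tp = sum(1 for a, t in zip(alerts, truth) if a and t)
    fp = sum(1 for a, t in zip(alerts, truth) if a and not t)
    fn = sum(1 for a, t in zip(alerts, truth) if not a and t)
    return tp / (tp + fp), tp / (tp + fn)

# Six toy events; 1 means the mechanism flagged the event.
truth      = [1, 1, 1, 0, 0, 0]
signature  = [1, 0, 0, 0, 0, 0]
anomaly    = [1, 1, 0, 1, 1, 0]
protocol   = [0, 1, 0, 0, 0, 0]
classifier = [1, 1, 1, 0, 1, 0]
votes = list(zip(signature, anomaly, protocol, classifier))

any_rule = precision_recall(combine(votes, min_votes=1), truth)
two_rule = precision_recall(combine(votes, min_votes=2), truth)
print(any_rule, two_rule)
```

In this toy case the any-mechanism rule detects every attack at 60% precision, while requiring two votes raises precision slightly but misses one attack, which is the kind of trade-off a hybrid scheme has to balance.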
Conclusions
This paper proposes novel deep learning architectures using deep fusion mechanisms
called early-fusion, late-fusion, and late-ensemble deep learning models. Our feature
fusion mechanisms have been designed to encourage the deep learning model to capture
the relationships between the features. The late-fusion and late-ensemble models have
shown better performance than the early-fusion model and other state-of-the-art mod-
els due to their ability to learn the relationships between more specialised features. Our
deep learning architectures based on feature fusion provide a generic solution that can
be extended to other deep learning models such as LSTM or CNN. As a future develop-
ment, one can study the effect of feature fusion with recurrent units to encode the long
dependencies between the features. Another research direction is to employ post-hoc
explainable AI models to better understand the behaviour of the proposed models in
distinguishing between the different classes by highlighting the most significant attrib-
utes that are contributing to the final prediction.
Acknowledgements
At the time the work was performed, both MMA and PK were with Birmingham City University.
Author contributions
Conceptualisation, MMA; investigation, MMA and AA; methodology, MMA; administration, MMA; software, AA, AKo and
XS; supervision, MMA; validation, XS, AK, AKo; visualisation, AA, AKo; analysis: XS and PK; writing—original draft, MMA;
writing—review and editing, XS, PK, IV, FS and MMA. All authors have read and agreed to the published version of the
manuscript.
Funding
This work was supported by funding from Innovate UK under the Knowledge Transfer Partnership 12328.
Declarations
Competing interests
The authors declare that they have no competing interests.
References
1. Prasad P, Rich C. Market guide for AIOps platforms; 2018.
https://tekwurx.com/wp-content/uploads/2019/05/Gartner-Market-Guide-for-AIOps-Platforms-Nov-18.pdf. Retrieved 12 Mar 2020.
2. Latha KM. Learn about intrusion detection and prevention. USA: Juniper Networks; 2016.
3. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.
4. Ramachandram D, Taylor GW. Deep multimodal learning: a survey on recent advances and trends. IEEE Signal
Process Mag. 2017;34(6):96–108.
5. Moustafa N, Slay J. Unsw-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15
network data set). In: Military communications and information systems (MilCIS). Canberra: IEEE; 2015. p. 6.
6. Moustafa N, Slay J. The evaluation of network anomaly detection systems: statistical analysis of the unsw-nb15 data
set and the comparison with the kdd99 data set. Inform Sec J Glob Perspect. 2016;25(1–3):18–31.
7. Tavallaee M, Bagheri E, Lu W, Ghorbani AA. A detailed analysis of the kdd cup 99 data set. In: 2009 IEEE symposium
on computational intelligence for security and defense applications. Ottawa: IEEE; 2009. p. 6.
8. Belavagi MC, Muniyal B. Performance evaluation of supervised machine learning algorithms for intrusion detection.
Proc Comp Sci. 2016;89:117–23.
9. Sarker IH, Kayes A, Badsha S, Alqahtani H, Watters P, Ng A. Cybersecurity data science: an overview from machine
learning perspective. J Big Data. 2020;7(1):1–29.
10. Shaukat K, Luo S, Chen S, Liu D. Cyber threat detection using machine learning techniques: A performance evalua-
tion perspective. In: 2020 International conference on cyber warfare and security (ICCWS). Islamabad: IEEE; 2020. p.
6.
11. Soheily-Khah S, Marteau P-F, Béchet N. Intrusion detection in network systems through hybrid supervised and
unsupervised machine learning process: a case study on the ISCX dataset. In: 2018 1st International conference on
data intelligence and security (ICDIS). South Padre Island: IEEE; 2018. pp. 19–226.
12. Mok MS, Sohn SY, Ju YH. Random effects logistic regression model for anomaly detection. Exp Syst Appl.
2010;37(10):7162–6.
13. Bagui S, Kalaimannan E, Bagui S, Nandi D, Pinto A. Using machine learning techniques to identify rare cyber-attacks
on the UNSW-NB15 dataset. Sec Priv. 2019;2(6):91.
14. Liao Y, Vemuri VR. Use of k-nearest neighbor classifier for intrusion detection. Comp Sec. 2002;21(5):439–48.
15. Al-Yaseen WL, Othman ZA, Nazri MZA. Multi-level hybrid support vector machine and extreme learning machine
based on modified k-means for intrusion detection system. Exp Syst Appl. 2017;67:296–303.
16. Horng S-J, Su M-Y, Chen Y-H, Kao T-W, Chen R-J, Lai J-L, Perkasa CD. A novel intrusion detection system based on
hierarchical clustering and support vector machines. Exp Syst Appl. 2011;38(1):306–13.
17. Janarthanan T, Zargari S. Feature selection in UNSW-NB15 and KDDCUP’99 datasets. In: 2017 IEEE 26th International
symposium on industrial electronics (ISIE). Edinburgh: IEEE; 2017. pp. 1881–1886.
18. Kasongo SM, Sun Y. Performance analysis of intrusion detection systems using a feature selection method on the
UNSW-NB15 dataset. J Big Data. 2020;7(1):1–20.
19. Kumar V, Sinha D, Das AK, Pandey SC, Goswami RT. An integrated rule based intrusion detection system: analysis on
UNSW-NB15 data set and the real time online dataset. Clust Comp. 2020;23(2):1397–418.
20. Moustafa N, Slay J. The significant features of the UNSW-NB15 and the KDD99 data sets for network intrusion detection
systems. In: 2015 4th International workshop on building analysis datasets and gathering experience returns for
security (BADGERS). Kyoto: IEEE; 2015. pp. 25–31.
21. Xuren W, Famei H, Rongsheng X. Modeling intrusion detection system by discovering association rule in rough
set theory framework. In: 2006 International conference on computational inteligence for modelling control and
automation and international conference on intelligent agents web technologies and international commerce
(CIMCA’06). Sydney: IEEE; 2006. pp. 24–24.
22. Tsang C-H, Kwong S, Wang H. Genetic-fuzzy rule mining approach and evaluation of feature selection techniques
for anomaly intrusion detection. Pattern Recognit. 2007;40(9):2373–91.
23. Yao H, Fu D, Zhang P, Li M, Liu Y. MSML: a novel multilevel semi-supervised machine learning framework for intrusion
detection system. IEEE Internet Things J. 2018;6(2):1949–59.
24. Louk MHL, Tama BA. Dual-IDS: a bagging-based gradient boosting decision tree model for network anomaly intrusion
detection system. Exp Syst Appl. 2023;213:119030. https://doi.org/10.1016/j.eswa.2022.119030.
25. Tama BA, Lim S. Ensemble learning for intrusion detection systems: a systematic mapping study and cross-benchmark
evaluation. Comp Sci Rev. 2021;39:100357. https://doi.org/10.1016/j.cosrev.2020.100357.
26. Ieracitano C, Adeel A, Morabito FC, Hussain A. A novel statistical analysis and autoencoder driven intelligent intrusion
detection approach. Neurocomputing. 2020;387:51–62.
27. Vinayakumar R, Soman KP, Poornachandran P, Akarsh S. Application of deep learning architectures for
cyber security. In: Hassanien A, Elhoseny M, editors. Cybersecurity and secure information systems. Advanced sciences
and technologies for security applications. Cham: Springer; 2019. pp. 125–160.
28. Choi Y-H, Liu P, Shang Z, Wang H, Wang Z, Zhang L, Zhou J, Zou Q. Using deep learning to solve computer security
challenges: a survey. Cybersecurity. 2020;3(1):1–32.
29. Javaid A, Niyaz Q, Sun W, Alam M. A deep learning approach for network intrusion detection system. EAI Endorsed
Transact Sec Saf. 2016;3(9):2.
30. Alrawashdeh K, Purdy C. Toward an online anomaly intrusion detection system based on deep learning. In: 2016
15th IEEE International conference on machine learning and applications (ICMLA). Anaheim: IEEE; 2016. pp.
195–200.
31. Potluri S, Ahmed S, Diedrich C. Convolutional neural networks for multi-class intrusion detection system. In:
International conference on mining intelligence and knowledge exploration. Cham: Springer; 2018. pp. 225–238.
32. Shone N, Ngoc TN, Phai VD, Shi Q. A deep learning approach to network intrusion detection. IEEE Transact Emerg
Top Comput Intell. 2018;2(1):41–50.
33. Vinayakumar R, Alazab M, Soman K, Poornachandran P, Al-Nemrat A, Venkatraman S. Deep learning approach for
intelligent intrusion detection system. IEEE Access. 2019;7:41525–50.
34. Altwaijry N, ALQahtani A, AlTuraiki I. A deep learning approach for anomaly-based network intrusion detection. In:
Big data and security: first international conference, ICBDS 2019, Nanjing, China, December 20–22, 2019, revised
selected papers 1. Singapore: Springer; 2020. pp. 603–615.
35. Al-Turaiki I, Altwaijry N. A convolutional neural network for improved anomaly-based network intrusion detection.
Big Data. 2021;9(3):233–52.
36. Yin Y, Jang-Jaccard J, Xu W, Singh A, Zhu J, Sabrina F, Kwak J. IGRF-RFE: a hybrid feature selection method for MLP-
based network intrusion detection on UNSW-NB15 dataset. J Big Data. 2023;10(1):1–26.
37. Salim S, Lahcen O. Accuracy improvement of network intrusion detection system using bidirectional long-short
term memory (Bi-LSTM). In: Digital technologies and applications: proceedings of ICDTA’23, Fez, Morocco. Cham:
Springer; 2023. pp. 143–152.
38. Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inform Proc Manag.
2009;45(4):427–37.
39. Papamartzivanos D, Mármol FG, Kambourakis G. Dendron: genetic trees driven rule induction for network intrusion
detection systems. Future Gener Comp Syst. 2018;79:558–74.