Machine Learning and Deep Learning Approaches for Cybersecurity: A Review
All content following this page was uploaded by Mohamed Hadi Habaebi on 21 March 2022.
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2017.Doi Number
ABSTRACT The rapid evolution and growth of the internet over the last decades have raised concern
about cyber-attacks, which are continuously increasing and changing. As a result, effective intrusion detection
systems are required to protect data, and machine learning and deep learning, two sub-fields of artificial
intelligence, have proved to be among the most successful ways to address this problem. This paper reviews intrusion
detection systems and discusses the types of learning algorithms that machine learning and deep learning
use to protect data from malicious behavior. It discusses recent machine learning and deep learning work
with various network implementations, applications, algorithms, learning approaches, and datasets to develop
an operational intrusion detection system.
INDEX TERMS Cybersecurity, Machine Learning, Deep Learning, Intrusion Detection System.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2022.3151248, IEEE
Access
A. Halbouni et al.: ML and DL approaches for Cybersecurity: A review
indicate the same thing: a machine programmed to learn and find the best solution to a problem. DL is a subfield of machine learning, whereas machine learning is a subfield of AI. As a result, ML and DL are employed to create efficient and effective intrusion detection systems. This paper provides an overview of machine learning and deep learning applications and approaches in intrusion detection systems, concentrating on network security technologies, methodologies, and implementation.
Alan Turing stated that general-use computers could learn and qualify originality, which raised the question of whether computers should look at data to develop rules rather than have humans write them. Machine learning algorithms are algorithms that can learn and adapt based on data. They are designed to generate output based on what is learned from data and examples. For example, such algorithms allow a computer to choose and perform a particular task, such as detecting novel traffic, without explicit instructions [2].
Automatic analyses of attacks and security events, such as spam mail, user identification, social media analytics, and attack detection, may be performed efficiently using machine learning [1]. As indicated in Figure 2, there are four main techniques in machine learning: supervised, unsupervised, semi-supervised, and reinforcement learning. Supervised learning is based on labeled data, unsupervised learning is based on unlabelled data, and semi-supervised learning is based on both.
deep learning models, which feature multiple connected layers, shallow learning models are built up of a few hidden layers. By stacking layers on top of layers, DL can express functions of increasing complexity more effectively. DL is used to learn representations with many abstraction levels [5]. Deep neural networks are capable of finding and learning representations from raw data and performing feature learning and classification [6]. Machine learning methodologies are also utilized in deep learning. However, other approaches, such as Transfer Learning, are also employed in deep learning, as shown in Figure 3.
FIGURE 3. Deep Learning Approaches
The remainder of the paper is organized as follows: Section 2 discusses the intrusion detection system concept. Section 3 summarises the most frequently utilized datasets for intrusion detection systems. Section 4 discusses recent advances in machine learning and deep learning-based intrusion detection systems, while Section 5 concludes this paper.
FIGURE 4. NIDS Versus HIDS
A. INTRUSION DETECTION SYSTEM APPROACH
Intrusion detection techniques are classified into Anomaly Detection Methods and Misuse Detection Methods [8], as shown in Table I.
1) ANOMALY DETECTION
This model assumes that abnormal traffic occurs with low probability and can therefore be distinguished from regular traffic with high probability [9]. Unsupervised learning and statistical learning-based anomaly detection algorithms can detect novel and previously undiscovered attacks.
2) MISUSE DETECTION
This approach is a signature-based technique: while monitoring threats in an IDS, detection is based on known attack signatures [1]. This strategy relies on supervised learning and can detect illegal or suspicious behaviors, which can then be used to defend against similar attack behaviors.
TABLE I
DIFFERENCES BETWEEN INTRUSION DETECTION SYSTEM APPROACHES
C. EVALUATION METRICS
Several indicators are used to assess the performance of an intrusion detection system, whether machine learning or deep learning-based. These indicators are derived from the confusion matrix, which has four components: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). The assessment indicators are as follows [1]:
• Accuracy - The ratio of correct predictions to the total number of records; a higher accuracy indicates a more accurate prediction by the learning model.
• Recall - The model’s capacity to locate all positive records, also called the detection rate; it quantifies the share of actual positive records that are correctly predicted.
• Precision - The capacity to avoid mislabeling negative records as positive; a high precision equates to a low rate of false positives.
• F1-Score (F1) - The harmonic mean of Precision and Recall; a higher F1 indicates a more effective learning model.
• False Positive Rate (FPR) - Also called the False Alarm Rate; the number of normal records identified as attacks divided by the total number of normal records.
TABLE II
CONFUSION MATRIX
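The evaluation indicators listed in this subsection follow directly from the four confusion-matrix counts. The following is a minimal Python sketch; the function name `ids_metrics` and the example counts are illustrative assumptions and are not taken from the reviewed works.

```python
def ids_metrics(tp, tn, fp, fn):
    """Compute the five indicators from confusion-matrix counts.

    tp/tn/fp/fn are the True Positive, True Negative, False Positive,
    and False Negative counts. A ratio with a zero denominator is
    reported as 0.0 (a convention chosen for this sketch).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # detection rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0          # false alarm rate
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "fpr": fpr,
    }


# Hypothetical counts: 100 attack records and 900 normal records.
metrics = ids_metrics(tp=90, tn=880, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With heavily imbalanced traffic, accuracy alone can look high even when many attacks are missed, which is why recall and FPR are usually reported alongside it.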
and testing. Following that, a summary of the most often used datasets in intrusion detection systems will be discussed.
A. KDD Cup 1999
This dataset, which is based on the DARPA dataset, is the most widely used dataset for intrusion detection. It includes basic and high-level TCP connection information, such as the connection window, but no IP addresses. In addition, this dataset contains over 20 different types of attacks and a record for the test subset [10].
B. UNSW-NB15
This dataset was created in 2015 by the Australian Centre for Cyber Security (ACCS). Its samples contain normal and malicious traffic [12]. The data was collected from three real-world websites, BID (Symantec Corporation), CVE (Common Vulnerabilities and Exposures), and MSD (Microsoft Security Bulletin), and then emulated in a laboratory environment to generate the dataset. This dataset has nine attack families, such as worms, DoS, and fuzzers [9].
TABLE III.
ATTACK TYPES IN UNSW-NB15
Attack Class | No. of records | Description
Normal | 93,000 | Natural traffic data
DoS | 16,353 | Attack to make resources inaccessible for legitimate users
Analysis | 2,677 | Port-based intrusion attacks, HTML penetrations, and spam
Fuzzers | 24,246 | Scan-based intrusion attacks. Using software testing to discover flaws in the operating system or network.
Reconnaissance | 13,987 | Attack aims to collect information about flaws in system security
Backdoors | 2,329 | Penetration remote attacks to access the computer by avoiding background security
Generic | 58,871 | Penetration attack for block cipher attacks
C. CIC-IDS2017
The dataset was generated in 2017 by the Canadian Institute for Cybersecurity. This dataset contains normal and attack scenarios and includes the abstracted behavior of 25 users based on SSH, HTTPS, HTTP, FTP, and email protocols [8, 13].
TABLE IV.
ATTACK TYPES IN CIC-IDS2017
Attack Class | No. of records | Description
Benign | 2,358,036 | Natural traffic data
DDoS | 41,835 | Multiple users operate simultaneously to attack one service
DoS Heartbleed | 11 | Unauthorized access gained by inserting malicious data into OpenSSL memory
DoS Hulk | 231,073 | Unique and obfuscated traffic produced by Hulk tool to perform DoS
DoS slowloris | 5,796 | Slowloris tool implemented to perform DoS
PortScan | 158,930 | Collecting data such as services and type of operating system through sending packets with different destination port
Web Attack XSS | 652 | Injects malicious data through web applications into normal websites
Web Attack Brute Force | 1,507 |
Web Attack SQL Injection | 21 | Method to attack application that involves inserting malicious SQL statements into the entry field for execution
Brute-Force FTP Patator | 7,938 | Attacks to guess the password of FTP login
Brute-Force SSH-Patator | 5,897 | Attacks to guess the password of SSH login
Bot | 1,966 | Trojan used to breach the security of many devices to gain control and organize all devices in Bot network so it can be operated remotely by the attacker
Infiltration | 36 | Infiltration techniques and tools used to gain unauthorized access to networked system data
D. NSL-KDD
NSL-KDD is an improved version of the KDD dataset in which a large amount of redundancy has been removed and an advanced sub-dataset has been created [10]. It uses the same attributes as KDD99, and its attacks belong to four categories: DoS, U2R, R2L, and Prob [8].
TABLE V
ATTACK TYPES IN NSL-KDD
Attack Class | No. of records (Training) | No. of records (Testing) | Attack Types
Normal | 67,343 | 9,711 | Natural traffic data
DoS | 5,927 | 7,456 | Worm, Land, Smurf, Udpstorm, Teardrop, Pod, Mailbomb, Neptune, Process table, Apache2, Back
Prob | 11,656 | 2,421 | Ipsweep, Nmap, Satan, Portsweep, Mscan, Saint
R2L | 995 | 2,756 | WarezClient, Worm, SnmpGetAttack, WarezMaster, Imap, SnmpGuess, Named, MultiHop, Phf, Spy, Sendmail, Ftp_Write, Xsnoop, Xlock, Guess_Password
U2R | 2 | 200 | Buffer_Overflow, SQLattack, Rootkit, Perl, Xterm, LoadModule, Ps, Httptunnel
E. PU-IDS
PU-IDS is a derivative of the NSL-KDD dataset. It is generated by extracting statistics from the input data and using them to create new synthetic instances. The traffic generator of this dataset
obtained the same format and attributes of the NSL-KDD dataset [8][11].
Table VI compares several datasets by the year the dataset was created, whether it is publicly available, the number of features used for analysis, and the kind of traffic the data contains.
TABLE VI.
COMPARISON BETWEEN DATASETS
Data Set | Year | Availability | No. of features | Kind of traffic
KDD Cup99 | 1998 | Public | 41 | Emulated
NSL-KDD | 1998 | Public | 41 | Emulated
ISOT | 2010 | Public | 49 | Emulated
ISCX 2012 | 2012 | Public | 8 | Emulated
UNSW-NB15 | 2015 | Public | 42 | Emulated
KYOTO | 2015 | Public | 24 | Real traffic
CIC-IDS2017 | 2017 | Public | 84 | Emulated
IV. INTRUSION DETECTION SYSTEMS IN RECENT WORKS USING MACHINE LEARNING AND DEEP LEARNING
Methodologies and algorithms have undergone significant change and evolution to produce the most suitable intrusion detection systems for the many applications that attempt to identify constantly changing threats and attacks. Initially, classification was based on machine learning, but as performance needed to be further improved, deep learning was utilized to produce higher accuracy and a lower false alarm rate.
FIGURE 5. Machine Learning Vs. Deep Learning
The primary distinction between machine learning and deep learning, illustrated in Figure 5, lies in how the system processes its input. Machine learning depends on how the data is used to train the algorithm, whereas deep learning relies on the connections within artificial neural networks to learn from data without requiring much human interaction. Additional differences between machine learning and deep learning are summarised here and in Table VII.
• Data dependencies – This refers to the volume of data required. Traditional, rule-based machine learning performs better when the dataset is limited. In comparison, deep learning performs better with a vast amount of data, since a significant amount is required for accurate interpretation and understanding.
• Feature processing – This is a method of extracting features to generate patterns that contribute to the implementation of learning algorithms and reduce the complexity of the data. In other words, feature processing performs categorization and feature detection on raw data. In machine learning, an expert must determine the necessary representations, whereas in deep learning, the representations are identified automatically by the deep learning algorithms.
• Interpretability – This is the degree to which a human can understand how a model reaches its output. An interpretable model can be understood without extra tools or procedures. In deep learning, on the other hand, it is difficult to specify how neurons should be modeled and how the layers should interact, making it difficult to explain how the result was obtained.
• Problem-solving – In conventional machine learning, the problem is divided into sub-problems, each of which is solved independently, and the partial results are then combined into the final answer. Deep learning, on the other hand, solves the problem end to end [4].
TABLE VII.
COMPARISON BETWEEN MACHINE LEARNING AND DEEP LEARNING
Aspect | Machine Learning | Deep learning
Input | Thousands of data | Millions of data (Big data)
Output | Numerical values | Numerical values, text, sounds
Hardware requirements | Low-end machines like CPU | Machines with GPU
How it works | Different algorithms are used to learn and predict future data from past data | Neural networks are used to pass the data through processing layers to interpret relations and features
Human intervention | Requires a lot of human intervention | Does not need much human intervention
How it is managed | Data analysts direct the algorithms to examine specific variables in the dataset | Once the process starts, the algorithms will be self-directed to analyze the dataset
Dataset size | Works well with a small-medium dataset | Works well with a big dataset
No. of layers | A shallow network that consists of input, output, and one hidden layer | A deep network that consists of input, output, and at least three hidden layers
Features | Manual identification of the features | Automatic identification of the important features
Processing Time | Few seconds or hours | Few hours or weeks
Training Time | Long time | Short time
Decision | The machine takes a decision based on the past data | With the help of an artificial neural network, machines take the decision
Hyperparameter tuning | The capability of tuning is limited | It can be tuned in many ways
Implementations | Prediction and simple applications | Complex applications
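The "No. of layers" row of Table VII can be made concrete with a small forward-pass sketch. The code below is illustrative only: the layer widths, the sigmoid activation, the random weights, and the helper names `make_network` and `forward` are all assumptions made for this example and do not come from the reviewed works. It builds one network with a single hidden layer and one with three hidden layers and passes the same hypothetical feature vector through both.

```python
import math
import random


def make_network(layer_sizes, seed=0):
    """Return a list of (weights, biases) pairs, one per layer transition.

    layer_sizes lists the input width, each hidden width, and the output
    width, so a network with k hidden layers has k + 1 transitions.
    """
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        network.append((weights, biases))
    return network


def forward(network, features):
    """Pass a feature vector through every layer with a sigmoid activation."""
    activations = features
    for weights, biases in network:
        activations = [
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(row, activations)) + b)))
            for row, b in zip(weights, biases)
        ]
    return activations


# Four input features and one output score; hidden widths are arbitrary.
shallow = make_network([4, 8, 1])        # input, one hidden layer, output
deep = make_network([4, 8, 8, 8, 1])     # input, three hidden layers, output

sample = [0.2, 0.7, 0.1, 0.9]            # hypothetical normalized features
shallow_out = forward(shallow, sample)
deep_out = forward(deep, sample)
print("hidden layers:", len(shallow) - 1, "vs", len(deep) - 1)
print("outputs:", shallow_out, deep_out)
```

Both networks are untrained here; the point is only that the deep variant composes more transformations between input and output, which is what lets it express more complex functions once trained.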
A. Machine Learning IDS Algorithm
This subsection discusses recent research into IDS implementations that utilize a variety of machine learning algorithms. Machine learning algorithms, such as support vector machine (SVM) and random forest (RF), have been used to investigate the binary classification of IDS using a supervised learning approach [14]. SVM outperformed RF throughout the training process, whereas RF outperformed SVM during the test procedure. Additionally, the authors concluded that a classifier’s performance varies with the dataset and attributes.
An IDS model based on a decision tree, naïve Bayes, and random forest was proposed by [15] to classify Prob, R2L, and U2R on the NSL-KDD dataset. The highest accuracy was achieved in detecting DoS attacks using the RF algorithm. Additionally, when the authors compared their hybrid model with its 14 features to other hybrid models with varying features, their model had a greater accuracy for DoS, Probe, and U2R and a nearly identical accuracy for R2L.
In order to increase the performance of the attack detection model, an intrusion detection strategy utilizing SVM ensemble with the feature was presented in [16]. The authors examined validated training data and found that it could improve the detection process, resulting in fast training time, high accuracy, and a low false alarm rate. However, because this strategy trains classifiers independently on separate feature spaces and then combines their judgments via an ensemble, some correlations across feature spaces are missed during classifier learning, lowering the model’s accuracy.
Three datasets comprising high-level network features were explicitly created in [17] for non-payload-based network intrusion detection systems, enabling machine learning classifiers to use Advanced Security Network Metrics (ASNM) features. It was the first dataset to include benign traffic samples and adversarial obfuscation techniques applied to the execution of malicious traffic over TCP network connections. While such classifiers can detect a sizable percentage of unknown threats, some unknown attacks may be undetectable, as illustrated in Figure 6.
The requirement for a horizontal platform for IoT applications/M2M resulted in the creation of the worldwide standard oneM2M [18], which aims to address the requirement for an M2M service layer that enables communication across heterogeneous apps and devices, as seen in Figure 7. Additionally, the authors investigated a second line of defense for oneM2M IoT networks, dubbed Machine Learning-based Intrusion Detection and Prevention System, which can detect and prevent not only known but also unexpected attacks.
FIGURE 7. OneM2M architecture
They developed their dataset from real-world IoT networks and implemented a detection model with three machine learning levels to identify and detect attacks and threats. They obtained 99.93 percent accuracy for the second detection level when using a decision tree-based machine learning algorithm and 99.34 percent accuracy when using an encoder-based machine learning strategy. Overall, this model obtained a high degree of accuracy and can detect and respond to risks associated with the oneM2M service layer.
The use of Artificial Neural Networks (ANNs) was proposed by [18] to detect malicious traffic by training them on a large variety of benign and malicious traffic data. ANNs create weights that are adaptively tuned during the training phase by a learning rule. Their methodology outperformed signature-based detection, with an accuracy of 98 percent. Table VIII analyses the learning method, performance metric, dataset, attack type, strengths, and limits of machine learning techniques based on intrusion detection systems.
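The ANN discussion in this subsection notes that weights are adaptively tuned during training by a learning rule. As an illustration only, the sketch below applies the classic perceptron learning rule to a single neuron. This is a swapped-in textbook rule, not the network or rule used in [18]; the two features (a normalized packet rate and a normalized failed-login rate), their values, and the labels are hypothetical.

```python
def predict(weights, bias, features):
    """Return 1 (malicious) if the weighted sum is positive, else 0 (benign)."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, learning_rate=0.1, epochs=50):
    """Tune weights with the perceptron rule: w += lr * (target - output) * x."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in zip(samples, labels):
            error = target - predict(weights, bias, features)
            # Weights change only when the current prediction is wrong.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical, linearly separable toy data: [packet rate, failed-login rate].
samples = [
    [0.1, 0.0], [0.2, 0.1], [0.3, 0.2],   # benign
    [0.8, 0.9], [0.9, 0.7], [0.7, 0.8],   # malicious
]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, s) for s in samples]
print("weights:", weights, "bias:", bias)
print("predictions:", predictions)
```

A single neuron like this can only separate classes with a straight line; multi-layer ANNs use gradient-based rules to tune many such units at once.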
TABLE VIII.
MACHINE LEARNING ALGORITHMS FOR IDS
Author | Learning algorithm | Performance metric | Dataset | Attack targeted | Strengths | Limitation
Farnaaz & Jabbar, 2016 [19] | RF | Accuracy, detection rate, false alarm rate, and Matthews correlation coefficient | NSL-KDD | DoS, Prob, R2L, and U2R | The model provides a low false alarm rate and high detection rate | Increasing the number of trees will slow the real-time prediction process
Rao & Swathi, 2017 [20] | KNN | Accuracy, detection rate | NSL-KDD | DoS, Prob, R2L, U2R, and normal | The model was able to increase the accuracy and achieve a faster classification time | Authors did not consider the precision and recall rate
Khammassi & Krichen, 2017 [21] | Logistic Regression with Genetic Algorithm | Accuracy, detection rate, and false alarm rate | UNSW-NB15, KDD Cup99 | DoS, U2R, and R2L | The model provides high accuracy with only 20 features of UNSW-NB15 and 18 features of KDDCup99 | Depending on KDDCup99 may lead to a misleading evaluation, as this dataset is outdated and contains redundant data
Verma & Ranga, 2018 [22] | KNN and K-means | Accuracy, detection rate, and false-positive rate | CIDDS-001 | Network traffic attacks | The model provides the best performance of TP rate and low false alarm rate | Authors did not implement cross-validation to measure the robustness of their model
Hamed et al., 2018 [12] | SVM with Recursive Feature Addition (RFA) | Accuracy, detection rate, and false alarm rate | ISCX 2012 | Network traffic attacks | Dealing with a large number of features and a small number of samples to avoid overfitting | The model ignores class distribution as it only works for binary classification
Belouch et al., 2018 [23] | SVM, RF, DT, NB | Accuracy, sensitivity, specificity, and execution time | UNSW-NB15 | Network traffic attacks | DT has the best performance of all other ML algorithms | No feature selection is implemented, which increases detection and training time
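Table VIII lists three indicators that the evaluation-metrics subsection does not define: the Matthews correlation coefficient (MCC), sensitivity, and specificity. The sketch below computes them from confusion-matrix counts using their standard definitions; the function names and the example counts are assumptions made for illustration, not values reported by the cited works.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Ranges from -1 to 1; reported as 0.0 here when the denominator is zero.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


def sensitivity(tp, fn):
    """Share of actual attacks that are detected (same quantity as recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Share of normal records that are correctly left unflagged (1 - FPR)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical counts: 100 attack records and 900 normal records.
mcc = matthews_corrcoef(tp=90, tn=880, fp=20, fn=10)
sens = sensitivity(tp=90, fn=10)
spec = specificity(tn=880, fp=20)
print(f"MCC={mcc:.4f} sensitivity={sens:.4f} specificity={spec:.4f}")
```

Because MCC uses all four counts, it stays informative on imbalanced traffic where accuracy alone can be misleading.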
Their methodology, however, is ineffective in detecting complex attacks due to its high false alarm rate.
Convolutional neural networks with the NSL-KDD dataset were investigated in [28] and are depicted in Figure 10. In addition, the authors investigated a method for detecting threats in a vast real-time network by converting the raw data to an image data format, which aids in resolving the unbalanced dataset issue by computing the cost function for each class from the training sample. As a result, they were able to reduce the number of computing parameters in their model, but their model’s accuracy was low compared to other machine learning and neural network models. Table IX summarizes various deep learning algorithms for IDS.
FIGURE 9. Stacked NDAE Classification Model
threats in the absence of labeled malicious data. Their model was 99.96 percent accurate for UNB-ISCX 2012 and 99.96 percent accurate for CIC-IDS 2017. Additionally, their research established the critical nature of the datasets needed to construct an IDS and the efficacy of the Autoencoder for anomaly detection.
To enhance detection accuracy in IDS, the authors of [28] incorporated big data, deep learning approaches, and natural language processing. They worked with KDD CUP99 and achieved an accuracy of 94.32 percent with their model.
In addition, another deep neural network method was introduced in [29] to detect risks and attacks in the cloud environment. Their approach used Simulated Annealing and Improved Genetic Algorithms to create the hybrid optimization framework IGASAA using the datasets NSL-KDD2015, CIC-IDS2017, and CIDDS-001. Compared to the Simulated Annealing Algorithm (SAA), their model demonstrated a higher detection rate, increased accuracy, and a lower false alarm rate.
Web application security is highly reliant on detecting malicious HTTP traffic, which needs a significant investment in training data gathering and a large dataset. To detect malicious HTTP traffic, the authors in [29] introduced the DeepPTSD method based on a deep transfer semi-supervised learning methodology. The construction of their model is given in Figure 11. They used two raw public datasets from FSecurify and another from their lab, collected via a honeypot server. When only a small training dataset is available, their model outperforms existing baselines, with a precision of 93.33 percent compared to 86.67 percent and 86.61 percent for CNN and RNN, respectively.
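The CNN approach discussed earlier in this subsection addresses unbalanced data by computing a cost for each class from the training sample. The text does not give the formula used in [28], so the following sketch assumes one common choice, inverse-frequency class weighting, where each class weight is n_samples / (n_classes * class_count). The function names, labels, and loss values are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and frequent classes below 1.
    This heuristic is an assumption, not the scheme reported in [28].
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in counts.items()}


def weighted_cost(per_sample_losses, labels, class_weights):
    """Average the per-sample losses, scaling each by its class weight."""
    total_weight = sum(class_weights[label] for label in labels)
    weighted_sum = sum(class_weights[label] * loss
                       for loss, label in zip(per_sample_losses, labels))
    return weighted_sum / total_weight


# Hypothetical unbalanced training sample: 8 normal records, 2 DoS records.
train_labels = ["normal"] * 8 + ["dos"] * 2
class_weights = inverse_frequency_weights(train_labels)

# Hypothetical losses: the model is doing worse on the rare class.
losses = [0.1] * 8 + [1.0] * 2
plain_cost = sum(losses) / len(losses)
balanced_cost = weighted_cost(losses, train_labels, class_weights)
print(class_weights, plain_cost, balanced_cost)
```

With these weights, errors on the rare attack class contribute as much to the cost as errors on the dominant normal class, so the balanced cost is higher than the plain average in this example.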
model's accuracy increased compared to other classifiers, resulting in enhanced detection of unknown threats and a decrease in false alert rates. A feedforward deep neural network was proposed by [1] for an intrusion detection system to perform binary classification on the NSL-KDD dataset. Due to its dense structure, this model beat the usual machine-learning technique in terms of scalability with big datasets and training time. As a result, there was a high proportion of true positives and accurately classified records, with this model achieving an accuracy of 89 percent. In [31], an RNN-based IDS technique for binary and multiclass classification was investigated. This model outperformed conventional machine learning algorithms and demonstrated that it is suited for classification with high accuracy. The authors trained and tested their model on the NSL-KDD dataset. Figure 12 illustrates the RNN structure and the proposed RNN-IDS model.
FIGURE 12. RNN and RNN-IDS architecture
Deep neural networks were used in [32] to investigate the applicability of anomaly-based intrusion detection systems. Based on the NSL-KDD dataset, the authors studied a variety of machine learning and deep learning frameworks. According to the comparison, deep learning outperformed machine learning in the accuracy test. The best performance was achieved by the RNN, followed by the CNN, and finally by the Autoencoder. A comparison of deep learning methods based on intrusion detection systems is presented in Table IX, which compares the learning algorithm, performance metric, dataset, attack targeted, strengths, and limits of the algorithms.
TABLE IX.
DEEP LEARNING ALGORITHMS FOR IDS
Author | Learning algorithm | Performance metrics | Dataset | Attack targeted | Strengths | Limitation
Xiao et al., 2019 [33] | CNN | Accuracy, detection rate, and false alarm rate | KDD CUP99 | DoS, Prob, R2L, U2R, and normal | The model provides a short classification time for real-time traffic and high accuracy | R2L and U2R have a low detection rate compared to other attacks
Papamartzivanos et al., 2019 [34] | Autoencoder | Accuracy, precision, recall, F1-score | KDD CUP99, NSL-KDD | DoS, Prob, R2L, U2R, and normal | The model provides autonomous misuse detection for large scale networks | Low detection accuracy for U2R and R2L attacks
Mayuranathan et al., 2019 [35] | RBM | Accuracy, detection rate, precision, and recall | KDD CUP99 | DoS and DDoS in the cloud environment | By using feature selection, the model improved the performance of detecting attacks | High computational resources for IoT devices
Jiang et al., 2020 [36] | LSTM-RNN | Accuracy, detection rate, and false alarm rate | NSL-KDD | Network traffic attacks | The model outperformed the accuracy of other machine learning algorithms | The model does not detect new types of attacks
Tian et al., 2020 [37] | DBN | Accuracy, F1-score, precision, recall, and false-positive rate | NSL-KDD, UNSW-NB15 | Network traffic attacks | The model is robust and provides a low false alarm rate | The accuracy of the model may be affected due to the uncertainty of selecting parameters
Zhang et al., 2020 [38] | CNN, MLP, C-LSTM | Accuracy, F1-score, precision, and recall | CSE-CIC-IDS2018 | NES, Boundary, HopSkipJu, Pointwise, Opt-Attack | The model provides a high detection rate | The model was vulnerable against adversarial instances
CNN and then temporal characteristics using LSTM. Finally, they evaluated the performance of their proposed model using the ISCX2012 and DARPA datasets. Although the hierarchical CNN-LSTM model beats pure CNN or LSTM models and gives higher accuracy for IDS, it is computationally expensive because of its complicated architecture.
FIGURE 13. Hierarchy of HAST-IDS
To counter security attacks in smart connected vehicles, an intrusion detection system based on a continuous automated secure service availability framework was proposed in [41]. The model classifies attacks and reduces their dimensionality using a decision tree and deep belief machine learning. A model for enhancing IDS performance was provided by [42] by integrating three classifiers with big data. The methods utilized were a combination of machine learning and deep learning techniques, including Random Forest (RF), Deep Neural Network (DNN), and Gradient Boosting Tree (GBT). The authors evaluated their strategy using the CIC-IDS2017 and UNSW-NB15 datasets. DNN had the highest accuracy, at 99.19 percent on UNSW-NB15 and 99.99 percent on CIC-IDS2017. Although all three classifiers achieved good accuracy, training the model was difficult due to the features' wide variety of numerical data.
In wireless sensor networks, IDS was performed using a combination of machine learning and deep learning [43]. The authors proposed the restricted Boltzmann machine-based clustered IDS (RBC-IDS) approach as a deep learning technique. They used the KDD Cup99 dataset and Network Simulator-3 (NS-3) to compare their model against an adaptive machine learning-based IDS. While RBC-IDS has high accuracy, the detection time was comparable to that of the adaptive machine learning model, resulting in overhead expenses. A hybrid network IDS based on the CNN-LSTM algorithm was evaluated in [6] using the UNSW-NB15 dataset. To optimise the IDS model's efficiency when applied to real-world devices, the authors employed a transfer learning approach. Their model was 98.43 percent accurate.
CBR-CNN (Channel Boosted and Residual Learning) was created in [44], employing deep Convolutional Neural Networks for intrusion detection using the NSL-KDD dataset. Training is carried out using an unsupervised learning approach, and normal traffic is modeled using stacked autoencoders (SAE). Their model had an accuracy of 89.41 percent for KDD-Test+ and 80.36 percent for KDD-Test-21. Table X analyses the learning method, performance metric, dataset, attack type, strengths, and limits of hybrid learning algorithms based on intrusion detection systems.
TABLE X. HYBRID LEARNING ALGORITHMS FOR IDS
| Author | Learning algorithm | Performance metrics | Dataset | Attack targeted | Strengths | Limitation |
|---|---|---|---|---|---|---|
| Yang et al., 2017 [48] | SVM and RBM | F1-score and precision | Real online network traffic | Network traffic attacks | The model increased training speed and improved traffic detection | F1-score for some training sizes had a high false-negative rate |
| Yang et al., 2019 [49] | DBN with density peak clustering algorithm | Accuracy, recall, precision and F1-score | UNSW-NB15, NSL-KDD | Network traffic attacks | The model outperformed other algorithms in accuracy and detection rate | The performance may be affected because the model was not able to learn low-level feature representations |
| Zhang et al., 2019 [50] | Genetic algorithm and DBN | Accuracy, detection rate, precision, recall and false alarm rate | NSL-KDD | IoT network layer | The model was able to select the optimal parameters to be trained | The model needs more time for training the dataset |
| Rajagopal et al., 2020 [51] | SVM, RF, LR, and KNN | Accuracy, precision, recall, false alarm rate | UNSW-NB15, UGR'16 | Blacklist, Spam, Scan, SSHscan, UDPscan, DoS, DDoS | The combined algorithms increased the accuracy and detection rate | Evaluate a new dataset that contains recent attacks, and their work only focuses on the classifiers, not metadata |
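The performance metrics listed in Table X (accuracy, precision, recall or detection rate, F1-score, and false alarm rate) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, assuming labels are encoded as 1 for attack and 0 for normal traffic. The names `ConfusionCounts`, `count_outcomes`, and `ids_metrics` are hypothetical helpers introduced here for illustration and do not come from any of the surveyed works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (1 = attack, 0 = normal)."""
    tp: int  # attacks correctly flagged as attacks
    fp: int  # normal traffic wrongly flagged as attacks
    tn: int  # normal traffic correctly passed
    fn: int  # attacks that were missed


def count_outcomes(y_true, y_pred):
    """Tally the confusion-matrix cells from paired true/predicted labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def ids_metrics(c):
    """Compute the usual IDS evaluation metrics from confusion counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)  # also called detection rate
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "false_alarm_rate": _ratio(c.fp, c.fp + c.tn),
    }


if __name__ == "__main__":
    # Toy labels chosen only for illustration; not taken from any dataset.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    for name, value in ids_metrics(count_outcomes(y_true, y_pred)).items():
        print(f"{name}: {value:.3f}")
```

Because attack traffic is usually a small share of the data, accuracy alone can look high even when many attacks are missed, which is one reason the works in Table X report it alongside recall and false alarm rate.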
[13] N. Moustafa, J. Hu, and J. Slay, "A holistic review of network anomaly detection systems: A comprehensive survey," Journal of Network and Computer Applications, vol. 128, pp. 33-55, 2019.
[14] L. Arnroth and J. Fiddler Dennis, "Supervised Learning Techniques: A comparison of the Random Forest and the Support Vector Machine," Uppsala University, 2016.
[15] D. H. Lakshminarayana, "Intrusion detection using machine learning algorithms," Master of Science, East Carolina University, 2019.
[16] J. Gu, L. Wang, H. Wang, and S. Wang, "A novel approach to intrusion detection using SVM ensemble with feature augmentation," Computers & Security, vol. 86, pp. 53-62, 2019.
[17] I. Homoliak, K. Malinka, and P. Hanacek, "ASNM Datasets: A Collection of Network Attacks for Testing of Adversarial Classifiers and Intrusion Detectors," IEEE Access, vol. 8, pp. 112427-112453, 2020, doi: 10.1109/access.2020.3001768.
[18] A. Shenfield, D. Day, and A. Ayesh, "Intelligent intrusion detection systems using artificial neural networks," ICT Express, vol. 4, no. 2, pp. 95-99, 2018.
[19] N. Farnaaz and M. Jabbar, "Random forest modeling for network intrusion detection system," Procedia Computer Science, vol. 89, pp. 213-217, 2016.
[20] B. B. Rao and K. Swathi, "Fast kNN classifiers for network intrusion detection system," Indian Journal of Science and Technology, vol. 10, no. 14, pp. 1-10, 2017.
[21] C. Khammassi and S. Krichen, "A GA-LR wrapper approach for feature selection in network intrusion detection," Computers & Security, vol. 70, pp. 255-277, 2017.
[22] A. Verma and V. Ranga, "Statistical analysis of CIDDS-001 dataset for network intrusion detection systems using distance-based machine learning," Procedia Computer Science, vol. 125, pp. 709-716, 2018.
[23] M. Belouch, S. El Hadaj, and M. Idhammad, "Performance evaluation of intrusion detection based on machine learning using Apache Spark," Procedia Computer Science, vol. 127, pp. 1-6, 2018.
[24] X. Wang, S. Chen, and J. Su, "Real Network Traffic Collection and Deep Learning for Mobile App Identification," Wireless Communications and Mobile Computing, vol. 2020, pp. 1-14, 2020, doi: 10.1155/2020/4707909.
[25] G. Thamilarasu and S. Chawla, "Towards Deep-Learning-Driven Intrusion Detection for the Internet of Things," Sensors (Basel), vol. 19, no. 9, Apr 27 2019, doi: 10.3390/s19091977.
[26] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41-50, 2018.
[27] R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, A. Al-Nemrat, and S. Venkatraman, "Deep Learning Approach for Intelligent Intrusion Detection System," IEEE Access, vol. 7, pp. 41525-41550, 2019, doi: 10.1109/access.2019.2895334.
[28] Y. Dong, R. Wang, and J. He, "Real-time network intrusion detection system based on deep learning," in 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS), 2019: IEEE, pp. 1-4.
[29] T. Chen et al., "A Payload Based Malicious HTTP Traffic Detection Method Using Transfer Semi-Supervised Learning," Applied Sciences, vol. 11, no. 16, 2021, doi: 10.3390/app11167188.
[30] G. Liu and J. Zhang, "CNID: research of network intrusion detection based on convolutional neural network," Discrete Dynamics in Nature and Society, vol. 2020, 2020.
[31] C. Yin, Y. Zhu, J. Fei, and X. He, "A deep learning approach for intrusion detection using recurrent neural networks," IEEE Access, vol. 5, pp. 21954-21961, 2017.
[32] S. Naseer et al., "Enhanced Network Anomaly Detection Based on Deep Neural Networks," IEEE Access, vol. 6, pp. 48231-48246, 2018, doi: 10.1109/access.2018.2863036.
[33] Y. Xiao, C. Xing, T. Zhang, and Z. Zhao, "An intrusion detection model based on feature reduction and convolutional neural networks," IEEE Access, vol. 7, pp. 42210-42219, 2019.
[34] D. Papamartzivanos, F. G. Mármol, and G. Kambourakis, "Introducing deep learning self-adaptive misuse network intrusion detection systems," IEEE Access, vol. 7, pp. 13546-13560, 2019.
[35] M. Mayuranathan, M. Murugan, and V. Dhanakoti, "Best features based intrusion detection system by RBM model for detecting DDoS in cloud environment," Journal of Ambient Intelligence and Humanized Computing, vol. 12, no. 3, pp. 3609-3619, 2021.
[36] F. Jiang et al., "Deep learning based multi-channel intelligent attack detection for data security," IEEE Transactions on Sustainable Computing, vol. 5, no. 2, pp. 204-212, 2018.
[37] Q. Tian, D. Han, K.-C. Li, X. Liu, L. Duan, and A. Castiglione, "An intrusion detection approach based on improved deep belief network," Applied Intelligence, vol. 50, no. 10, pp. 3162-3178, 2020.
[38] C. Zhang, X. Costa-Pérez, and P. Patras, "Tiki-taka: Attacking and defending deep learning-based intrusion detection systems," in Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop, 2020, pp. 27-39.
[39] M. K. Putchala, "Deep learning approach for intrusion detection system (IDS) in the internet of things (IoT) network using gated recurrent neural networks (GRU)," Master of Science, Wright State University, 2017.
[40] W. Wang et al., "HAST-IDS: Learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792-1806, 2017.
[41] M. Aloqaily, S. Otoum, I. A. Ridhawi, and Y. Jararweh, "An intrusion detection system for connected vehicles in smart cities," Ad Hoc Networks, vol. 90, 2019, doi: 10.1016/j.adhoc.2019.02.001.
[42] O. Faker and E. Dogdu, "Intrusion Detection Using Big Data and Deep Learning Techniques," presented at the Proceedings of the 2019 ACM Southeast Conference, 2019.
[43] S. Otoum, B. Kantarci, and H. T. Mouftah, "On the Feasibility of Deep Learning in Sensor Network Intrusion Detection," IEEE Networking Letters, vol. 1, no. 2, pp. 68-71, 2019, doi: 10.1109/lnet.2019.2901792.
[44] N. Chouhan, A. Khan, and H.-u.-R. Khan, "Network anomaly detection using channel boosted and residual learning based deep convolutional neural network," Applied Soft Computing, vol. 83, 2019, doi: 10.1016/j.asoc.2019.105612.
[45] S. Rastegari, "Intelligent network intrusion detection using an evolutionary computation approach," PhD, Edith Cowan University, 2015.
[46] J. Yang, J. Deng, S. Li, and Y. Hao, "Improved traffic detection with support vector machine based on restricted Boltzmann machine," Soft Computing, vol. 21, no. 11, pp. 3101-3112, 2017.
[47] N. Chaabouni, "Intrusion Detection and Prevention for IoT Systems using Machine Learning," PhD, Université de Bordeaux, 2020.