Artificial Intelligence in Cybersecurity: A Comprehensive Review and Future Direction
An International Journal
To cite this article: Lizzy Ofusori, Tebogo Bokaba & Siyabonga Mhlongo (2024) Artificial
Intelligence in Cybersecurity: A Comprehensive Review and Future Direction, Applied Artificial
Intelligence, 38:1, 2439609, DOI: 10.1080/08839514.2024.2439609
Introduction
The rapid advancement in smart technologies, the Internet of Things (IoT)
and other computing devices has generated enormous amounts of data that
require robust security measures (Sarker et al. 2020). While this advancement
has made human life and business practices easier, it has brought about a wave
of security breaches (Künzler 2023). Studies have shown that due to the
increasing dependency on digitalization, many security incidents, such as
CONTACT Lizzy Ofusori [email protected] Centre for Applied Data Science, University of Johannesburg, PO
BOX 524, Auckland Park, 2006, Gauteng, South Africa
© 2024 The Author(s). Published with license by Taylor & Francis Group, LLC.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s)
or with their consent.
e2439609-2 L. OFUSORI ET AL.
RQ1. What are the related scholarly articles about AI and cybersecurity from
2014 to 2024?
RQ2. What are the various AI techniques and datasets used in cybersecurity?
RQ3. What are the AI tools used for data extraction, analysis, and
optimization?
RQ4. What are the future research directions for the application of AI in
cybersecurity?
Research methodology
The study adopted a hybrid review (bibliometric and systematic analysis) of
relevant academic literature related to the application of AI in cybersecurity.
[Figure: flow diagram of the screening stage. Records screened (n = 5103). Records excluded (n = 4164): titles and keywords (n = 1901), abstract (n = 2200), non-English (n = 63). Records included in the final review (n = 939).]
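The screening counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers given in the flow diagram; the variable names are illustrative and are not taken from the study.

```python
# Counts as reported in the screening flow diagram.
records_screened = 5103
excluded_by_reason = {
    "titles_and_keywords": 1901,
    "abstract": 2200,
    "non_english": 63,
}

# Total exclusions and the number of records left for the final review.
records_excluded = sum(excluded_by_reason.values())
records_in_final_review = records_screened - records_excluded

print(records_excluded)         # 4164
print(records_in_final_review)  # 939
```

The three exclusion reasons add up to the reported total of excluded records, and the remainder matches the 939 articles carried into the review.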
Scopus and WoS are major abstract and citation databases, providing
access to thousands of peer-reviewed journals, books, and conference
proceedings across diverse disciplines. These two databases are among
the most credible platforms that offer comprehensive access to a wide
range of research resources. The search strategy employed the following
basic search string applied to all searchable fields of the Scopus and
WoS databases:
The subject area of our search was limited to the disciplines “Computer
Science,” “Information Technology,” “Engineering,” “Decision Science,”
“Mathematics” and “Multidisciplinary.” In addition to using the search
string to identify articles, further criteria were then applied to refine the
search results to include records that met the following conditions: (i)
written exclusively in English; (ii) of any document type except News Item,
Correction, Meeting Abstract, Data Paper, Book Review, and Letter; and
(iii) published by the top 10 publishers ranked by the number of records
returned from the search string. These publishers include IEEE, Springer
Nature, Elsevier, Sage, Taylor & Francis, MDPI, Wiley, and Emerald Group
Publishing. Also, only articles that fell within the period 2014 to 2024 were
included in the study.
The date range 2014 to 2024 was chosen to ensure that the literature review
captures the most relevant and impactful developments over the past decade.
This period aligns with significant advancements in both AI and cybersecurity,
including breakthroughs in ML, DL, and the rise of novel cybersecurity threats
like ransomware and advanced persistent threats (APTs) (Sarker 2023). By
focusing on the last ten years, the review reflects current trends, practices, and
challenges while also ensuring that foundational studies and early implementations
of key technologies are not overlooked. This balance allows the
research to contextualize the latest findings within a decade of evolving
innovations and threats, making the analysis both current and comprehensive.
The 939 articles obtained were subsequently classified into four major article
types (as depicted in Table 1).
Table 1 presents the distribution of the reviewed research articles. It shows
the number of papers on the application of AI in cybersecurity from 2014 to
2024. The majority of articles (71.4%) were published in journals, followed by
conference proceedings (26.4%), symposiums (1.91%), and workshops
(0.31%). It should be noted that 52 of the 670 journal articles were used for
the SLR analysis.
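As a quick consistency check on the reported distribution, the shares can be related back to the total of 939 articles. Only the total (939) and the journal count (670) are stated in the text; the other counts in this sketch are back-calculated from the reported percentages and are therefore an assumption, not values read from Table 1.

```python
total_articles = 939
journal_articles = 670  # stated in the text

# Percentages as reported in the text.
reported_share = {
    "journal": 71.4,
    "conference": 26.4,
    "symposium": 1.91,
    "workshop": 0.31,
}

# Journal share recomputed from the two stated counts.
journal_share = 100 * journal_articles / total_articles
print(round(journal_share, 1))  # 71.4

# Back-calculated counts (assumption: nearest integer to share * total).
estimated_counts = {
    kind: round(share / 100 * total_articles)
    for kind, share in reported_share.items()
}
print(estimated_counts)                 # {'journal': 670, 'conference': 248, 'symposium': 18, 'workshop': 3}
print(sum(estimated_counts.values()))   # 939
```

The back-calculated counts sum to 939, which suggests the reported percentages are mutually consistent.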
Findings
This section presents a hybrid analysis, which includes the bibliometric analysis
and the SLR findings. The bibliometric analysis was carried out using
R software (Biblioshiny) and was used to answer RQ1. The systematic analysis
employed a qualitative research design to explore the various AI techniques
used in cybersecurity. Qualitative research provides an in-depth analysis and
a comprehensive overview by examining patterns and developing insights
from the data content (Huberman 2014). The data were then synthesized
using the core themes identified.
Bibliometric analysis
The following section answers RQ1 regarding the scholarly articles on AI and
cybersecurity.
meets high standards of data integrity and quality. Furthermore, it shows that
the articles from the database can be used for academic research since the
researcher has been provided access to complete and accurate bibliographic
records. However, while the metadata fields C1-Affiliation, DE-Keywords and DI-DOI
have missing counts of 3, 47 and 56 respectively and a status of
“Good,” the ID and RP fields have a status of “Poor,” with 210 and 292 missing counts
respectively. A “Good” status indicates that, despite the few missing entries,
the overall quality and completeness of these metadata fields are acceptable
and generally reliable. Thus, in this study, C1, DE and DI indicate that the
majority of records have author affiliations, keyword descriptors and digital
object identifiers, which is generally considered sufficient and reliable. The
“Poor” status indicates the lack of document identifiers and reprint authors,
which makes it difficult to uniquely identify the records within the database
and to contact the responsible author for additional information or copies of
the work. Only CR-Cited References is completely missing, with 939 missing
counts. This implies that none of the 939 records includes any information
about the references cited within those publications.
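To put the missing counts in proportion, they can be expressed as a percentage of the 939 records. This is a minimal sketch based only on the counts quoted above; the status labels come from Biblioshiny, and the thresholds behind them are not given in the text, so they are not reproduced here.

```python
total_records = 939

# Missing counts per metadata field, as reported in the text.
missing_counts = {"C1": 3, "DE": 47, "DI": 56, "ID": 210, "RP": 292, "CR": 939}

# Share of records lacking each field, in percent.
missing_pct = {
    field: round(100 * missing / total_records, 2)
    for field, missing in missing_counts.items()
}

for field, pct in missing_pct.items():
    print(f"{field}: {missing_counts[field]} missing ({pct:.2f}%)")
```

On these numbers, the fields rated “Good” are each missing in roughly 6% of records or fewer, while the fields rated “Poor” are missing in roughly 22% and 31% of records.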
Country citation
Table 4 is a compilation of the top 10 countries that have been actively engaged
in research on the application of AI in cybersecurity over the past 10 years, ranked
by total citations. China leads with 1379 total citations, followed
by Australia with 915 citations, Algeria with 744 citations, the USA
with 427 citations, and India with 286 citations. With the rapid global technological
advancements, it has become imperative for every country to keep
pace with the technological landscape. China, despite having the highest total
Keyword analysis
Keywords constitute one of the major aspects of search engines. The selection
of correct keywords is extremely important for a higher number of relevant
citations. The keywords that are most searched for when AI and cybersecurity
are the main keywords are given below (Table 7 and Figure 3). Analysis shows
Thematic map
Thematic evolution
This section presents an analysis of the thematic evolution as it relates to AI in
cybersecurity based on author keywords. Exploring thematic evolution provides
an interesting broad picture of the development in the field. Such
longitudinal analyses allow for highlighting how topics merge or split into
several themes (Madsen, Berg, and Nardo 2023). Figure 5 shows the thematic
evolution of AI in cybersecurity research in the period 2014–2024 by dividing
it into two time slices or subperiods. Accordingly, we are looking at the
evolution of keywords during two different periods (2014–2019 and
2020–2024). The reason for choosing these time slices is that there was an
uptick in the volume of research in 2020, coinciding with the COVID-19
pandemic, which some researchers suggested has fueled the shift toward the
application of AI in cybersecurity (Hofstetter et al. 2020). Therefore, exploring
whether these surges were associated with the emergence and development of
new research themes is interesting.
In the first phase, the pre-COVID-19 period (2014–2019), the
dominant themes are “ai,” “classification,” “iot,” “security,” “machine learning,”
“cyber security” and “cyber-attack.” This implies an emphasis
on foundational technologies (AI, ML) and their applications (IoT,
cybersecurity) in an increasingly digital world (Jallouli et al. 2019). In
the second phase, during and after the COVID-19 pandemic (2020–2024),
the dominant themes are “machine learning,” “xai” and “data mining.”
This implies a greater focus on enhancing the interpretability and
trustworthiness of AI (XAI) and leveraging data to address unprecedented
global challenges (data mining) (Oropeza-Valdez et al. 2024). The critical need
for reliable and understandable AI systems during crises highlighted the
importance of XAI (Hofstetter et al. 2020; Oropeza-Valdez et al. 2024).
Moreover, the pandemic highlighted the necessity of data mining in making
informed decisions in real time (Li et al. 2022). The shift toward ML, XAI, and
data mining highlights the need for powerful, transparent, and data-driven
solutions in an increasingly complex world.
Trend topics
Figure 6 highlights the trending topics based on author keywords. In the years
2023–2024, the most trending topics are “natural language processing,” “reinforcement
learning,” “data privacy,” “machine learning,” “cybersecurity” and
“artificial intelligence.” This trend is clearly moving toward the application
of AI (Nievas et al. 2024). This implies that the landscape has shifted dramatically
in recent years, with a more proactive approach to cybersecurity and
a broader application of AI technologies (Sarker, Furhad, and Nowrozy 2021).
By contrast, the years 2020–2022 (Figure 6) reflect a reactive approach to cybersecurity,
focusing on detecting and responding to threats after they occur. This is
based on the trending topics in that period, which are “intrusion detection,”
“information security,” “intrusion detection systems,” “cyber threat intelligence,”
“social media,” “explainability,” “big data,” “data mining” and “malware.”
This implies that the period between 2020 and 2022 was characterized by
a strong emphasis on defensive cybersecurity strategies (Dawson et al. 2021).
Hence, while 2023–2024 focuses on cutting-edge AI and data privacy,
2020–2022 was more concentrated on foundational cybersecurity measures
and the initial phases of AI explainability and big data analytics. The more recent trends
reflect a more sophisticated approach to cybersecurity, leveraging AI to predict
and prevent threats, rather than simply reacting to them. It is important to
note the absence of trending topics for the years 2014–2019. This may
imply that the earlier research focus was different, with different
keywords or terminologies used, making it difficult to identify using current
search parameters. Furthermore, there can be a significant delay between
research completion and publication, which might affect the visibility of older
trends.
Figure 7 presents the number of articles published each year from 2014 to
2024. The number of articles increased gradually, from 2 in 2014 to 6 in 2017,
indicating limited research interest or early exploration in the field during
these years. There is noticeable growth in publications thereafter, with 17 articles in 2018
and 43 in 2019. This suggests increasing awareness or relevance of the topic.
The numbers rise sharply during the pandemic years, from
72 articles in 2020 to 376 in 2023. This suggests a surge in research interest,
possibly due to the growing reliance on digital solutions and the increased
importance of AI and cybersecurity during and after the COVID-19 pandemic.
With 376 articles, 2023 marks the highest number of publications in the
dataset, reflecting the topic’s growing importance, rapid development, or
increased funding and innovation in these areas. However, no publications
were recorded for 2024 at the time of this analysis. Overall, the figure indicates
a sharp upward trend, especially in recent years, showing that AI and cybersecurity
are becoming central topics of academic and practical importance.
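The size of this upward trend can be quantified from the yearly counts quoted above. The sketch below uses only the years for which the text gives a number; the helper names `growth_factor` and `annual_growth_rate` are illustrative.

```python
# Article counts per year as quoted in the text (other years are not stated).
articles_per_year = {2014: 2, 2017: 6, 2018: 17, 2019: 43, 2020: 72, 2023: 376}

def growth_factor(start_year, end_year):
    """Ratio of the article counts between two reported years."""
    return articles_per_year[end_year] / articles_per_year[start_year]

def annual_growth_rate(start_year, end_year):
    """Compound annual growth rate between two reported years."""
    years = end_year - start_year
    return growth_factor(start_year, end_year) ** (1 / years) - 1

print(round(growth_factor(2019, 2020), 2))   # 1.67
print(round(growth_factor(2020, 2023), 2))   # 5.22
print(f"{annual_growth_rate(2020, 2023):.1%} average yearly growth, 2020 to 2023")
```

On these figures, output grew about fivefold between 2020 and 2023, which corresponds to an average growth of a little over 70% per year.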
Systematic analysis
Systematic analysis is a methodical and structured approach to examining
a system, process, or problem (Mallett et al. 2012). It involves breaking
down complex systems or issues into smaller, more manageable components
and then evaluating each component systematically to gain
a comprehensive understanding. In this section, systematic analysis provides
a structured and rigorous approach to understanding the various AI techniques
used in cybersecurity and how they are applied to curb cyberattacks.
Table 8 presents summaries of the top 10 articles identified in the SLR. These
articles were selected based on their direct relevance to the research questions
of the study. Only those articles that provide significant insights related to the
key topics of interest are considered. By presenting the summaries, the table
provides unique perspectives into the application of AI in cybersecurity
research. It also provides insights into the diversity of the studies. This
diversity enriches our knowledge and provides a well-rounded view of the
application of AI in cybersecurity (Sarker et al. 2020). Furthermore, it allows
for the comparison of different authors, AI techniques, aims/objectives, findings,
and recommendations used in AI and cybersecurity research. This
includes comparing the effectiveness of ML algorithms, feature selection
methods, and evaluation metrics across various datasets and scenarios.

Table 8. Top 10 articles on the application of AI in cybersecurity.

Rank 1. Chowdhury et al. (2017)
Aims/objective: To explore how Self-Organizing Maps (SOM) can be used to detect botnet activities within network traffic data, specifically focusing on the CTU-13 dataset. To evaluate the efficiency of SOM in terms of computational requirements and performance when applied to large-scale network traffic datasets. To create a reliable and efficient framework using SOM for identifying botnet traffic within the CTU-13 dataset.
Techniques: SOM
Findings: The study found that SOM was effective in distinguishing between botnet traffic and normal traffic. The implementation of SOM achieved high detection rates for botnet traffic within the CTU-13 dataset. This demonstrated SOM’s capability to identify various botnet activities accurately. The proposed SOM-based framework was validated through extensive testing on the CTU-13 dataset, showcasing its robustness and reliability in botnet detection.
Recommendations: Given its effectiveness and efficiency, the study recommended implementing SOM in real-time bot detection systems to enhance cybersecurity measures. The authors suggested that further research should be conducted to refine SOM algorithms and explore their application to other types of cyber threats.

Rank 2. Kozik et al. (2014)
Aims/objective: To improve the detection of web attacks by employing advanced ML techniques. To compare the performance of DT, SVM, and Naive Bayes algorithms in terms of detection accuracy, false positive rates, and computational efficiency.
Techniques: DT, SVM, Naive Bayes
Findings: The algorithms showed high accuracy in detecting various types of web attacks. DT demonstrated high accuracy in classifying network intrusions, making it a reliable algorithm for intrusion detection systems (IDS). SVM showed strong performance, especially in handling complex and non-linear data patterns. This makes it suitable for detecting sophisticated intrusion attempts. Naive Bayes was found to be fast and simple to implement. It requires fewer computational resources than DT and SVM, making it efficient for real-time intrusion detection.
Recommendations: DT was recommended for scenarios where model interpretability and rule extraction are important. However, regularization techniques such as pruning should be applied to avoid overfitting. SVM was preferred for its robustness and balanced performance, but careful tuning of hyperparameters is essential. It is suitable for applications where computational resources are available for training. Naive Bayes is suitable for real-time detection systems due to its speed but may require additional techniques to handle feature correlations and improve accuracy.

Rank 3. Bhuyan, Bhattacharyya, and Kalita (2015)
Aims/objective: To evaluate the effectiveness of various ML algorithms in detecting different types of botnets.
Techniques: C4.5, SVM, K-Nearest Neighbors (KNN), Bayesian Networks
Findings: The findings highlighted the strengths and weaknesses of each algorithm, with C4.5 and SVM emerging as top performers.
Recommendations: The recommendations focused on algorithm selection, feature engineering, hybrid approaches, regular updates, scalability, and real-time detection, providing a comprehensive framework for improving botnet detection systems in practical network environments.

Rank 4. Bhamare et al. (2016)
Aims/objective: To compare the performance of different ML classifiers, including Neural Network Classifier (NNC), Artificial Neural Network (ANN), SVM, Naive Bayes Classifier (NBC), and Gradient Boosting Classifier (GBC). To assess the effectiveness of these algorithms in accurately detecting cyber threats within the ISOT dataset.
Techniques: NNC, ANN, SVM, NBC, GBC
Findings: Both NNC and ANN demonstrated high accuracy and effectiveness in detecting cyber threats. ANN, in particular, showed superior performance due to its ability to model complex patterns in the data. SVM also performed well, especially in handling high-dimensional data, but its performance was slightly lower compared to ANN. The NBC showed good results in terms of speed and simplicity but was outperformed by more complex algorithms like ANN and SVM in terms of accuracy. The GBC showed robust performance, combining accuracy with efficiency in processing the data.
Recommendations: For applications requiring high accuracy and robustness, GBC and SVM were recommended due to their superior performance in handling complex datasets, ability to manage high-dimensional data, and strong generalization capabilities across a variety of tasks. ANN was recommended for scenarios where complex patterns and relationships need to be learned from the data, provided that sufficient computational resources are available. NBC and NNC were recommended for quick, preliminary analysis due to their simplicity and speed, though they may not provide the highest accuracy.

Rank 5. Kozik et al. (2014)
Aims/objective: To evaluate the effectiveness of the Naive Bayes algorithm in classifying network intrusion data from the KDD 99 dataset. To determine how the application of PCA impacts the performance of the Naive Bayes classifier in terms of accuracy, speed, and resource utilization.
Techniques: Naïve Bayes, PCA algorithm
Findings: The Naive Bayes algorithm demonstrated robust performance in classifying different types of network intrusions within the KDD 99 dataset. The application of PCA significantly reduced the dimensionality of the dataset, which in turn improved the processing speed and efficiency of the Naive Bayes classifier.
Recommendations: It was recommended to apply PCA as a preprocessing step before classification, especially when dealing with high-dimensional datasets like KDD 99. This reduces the computational burden and improves classification accuracy. Naive Bayes was suggested as a suitable algorithm for initial deployment due to its simplicity and accuracy, especially in resource-constrained environments.

Rank 6. Kato and Klyuev (2014)
Aims/objective: To develop a robust and efficient method for detecting DDoS attacks using statistical techniques. To identify statistical patterns associated with DDoS attacks.
Techniques: Statistical methods
Findings: The application of statistical methods such as entropy-based analysis, standard deviation, and correlation analysis proved effective in detecting DDoS attacks. By analyzing network traffic patterns and anomalies, these methods were able to identify abnormal behaviors indicative of DDoS attacks.
Recommendations: It was recommended to implement these statistical methods in real-time detection systems. The ability to process and analyze traffic data in real time is crucial for timely detection and mitigation of DDoS attacks. It was also recommended that future studies should focus on real-time implementation and validation of these methods in operational network environments.

Rank 7. Xie, Hu, and Slay (2014)
Aims/objective: To enhance the detection of anomalies in network traffic using advanced ML techniques. To compare the effectiveness of the short sequence model against other algorithms (Naïve Bayes, Bayes Network, Decision Stump, RBF Network) using the ECML-PKDD 2007 dataset.
Techniques: Naïve Bayes, Bayes Network, Decision Stump, RBF Network
Recommendations: The study recommended exploring hybrid models combining the strengths of the short sequence model with other ML techniques. Further research was suggested to test the model’s scalability and robustness on larger and more diverse datasets.
Findings: The short sequence model outperformed all compared algorithms (Naïve Bayes, Bayes Network, Decision Stump, RBF Network) in terms of accuracy and false positive rates. It showed a balanced performance, combining accuracy with computational efficiency, making it suitable for real-time applications. Traditional methods like Decision Stump and Naïve Bayes were less effective in capturing short-term dependencies critical
Studies such as Chowdhury et al. (2017) and Kozik et al. (2014) have
demonstrated the effectiveness of ML algorithms, including Self-Organizing
Maps (SOM), Decision Trees (DT), Support Vector Machines (SVM), and
Naive Bayes, in detecting botnet and web attack activities. These algorithms
excel at identifying patterns within network traffic, enabling accurate classification
of normal and malicious behavior. However, their performance
can vary depending on the specific dataset and type of attack (refer to
Table 8). To enhance detection capabilities, researchers have also investigated
the use of feature selection techniques, such as Principal Component
Analysis (PCA), to reduce data dimensionality and improve algorithm
efficiency. Studies by Kozik et al. (2014) and Wang and Paschalidis
(2016) highlight the benefits of PCA in improving classification accuracy
and computational speed.
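The dimensionality-reduction step described above can be illustrated with a small PCA sketch. It assumes NumPy is available and uses synthetic data as a stand-in for traffic features (not KDD 99 or any dataset from the reviewed studies); the function name `pca_reduce` is hypothetical, and this is not the implementation used by the cited authors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for traffic features: 200 records with 6 correlated
# columns generated from 2 hidden factors plus a little noise.
hidden = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = hidden @ mixing + 0.05 * rng.normal(size=(200, 6))

def pca_reduce(X, n_components):
    """Project X onto its top principal components using an SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return X_centered @ Vt[:n_components].T, explained[:n_components]

X_reduced, explained_ratio = pca_reduce(X, n_components=2)
print(X_reduced.shape)        # (200, 2)
print(explained_ratio.sum())  # close to 1: two factors drive almost all variance
```

A downstream classifier such as Naive Bayes would then be trained on `X_reduced` instead of `X`, so it processes 2 columns rather than 6 while retaining most of the variance.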
Beyond ML, statistical methods have shown promise in detecting
Distributed Denial of Service (DDoS) attacks. Hoque, Bhattacharyya, and
Kalita (2016) explored the use of information metrics like entropy to identify
anomalies associated with DDoS attacks. These methods offer complementary
approaches to ML, providing additional insights into network traffic patterns.
While individual studies have yielded promising results, the consensus among
researchers is that a combination of techniques is often necessary for robust
and effective intrusion detection. Hybrid approaches, incorporating multiple
algorithms and feature engineering, can enhance overall performance.
Additionally, the importance of real-time detection, continuous model
updates, and scalability cannot be overstated.
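The entropy-based information metric mentioned above can be sketched in a few lines. This is a minimal illustration that computes the Shannon entropy of the source addresses seen in a traffic window; the two windows are hypothetical (they use documentation-only IP ranges), and the sketch does not reproduce the specific metrics of the cited studies.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical windows of source IPs observed at a server.
normal_window = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"] * 2
flood_window = ["198.51.100.7"] * 7 + ["192.0.2.1"]

print(shannon_entropy(normal_window))           # 2.0
print(round(shannon_entropy(flood_window), 3))  # 0.544
```

A detector built on this idea would compare each window's entropy with a baseline and flag large deviations. Whether an attack raises or lowers the entropy depends on the feature being measured and on the attack, so any threshold is a tuning choice rather than a fixed rule.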
make predictions or decisions based on data (Ladosz et al. 2022). There are
three major types of ML, namely, supervised learning, unsupervised learning,
and reinforcement learning (RL). Supervised learning is a type of ML where
the algorithm learns from labeled data, which means it is given inputs along
with corresponding outputs during training (Sarker et al. 2020).
Algorithms such as DT, SVM, and neural networks are trained on labeled
data to detect known threats. Unsupervised learning is a type of ML where the
algorithm learns patterns and structures from unlabeled data without human
intervention (Dixit and Silakari 2021). Techniques such as clustering (e.g.,
k-means) and anomaly detection (e.g., Isolation Forests) are used to identify
unusual patterns that may indicate novel threats. RL is an ML technique that
trains algorithms by trial and error rather than using sample data (Ladosz et al.
2022). RL has been used in several cybersecurity studies to
identify and respond to attacks in real time, and for autonomous intrusion detection,
cyber-physical systems, and DDoS defenses (Moerland et al. 2023).
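To make the unsupervised case concrete, the sketch below runs k-means (Lloyd's algorithm) on unlabeled one-dimensional data using only the standard library. The packet-rate values and the choice of starting centroids are hypothetical and chosen for illustration; real intrusion data would have many features and would normally be handled with a library implementation.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Lloyd's algorithm for 1-D data; returns (centroids, labels)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[k]
            )
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical packets-per-second readings for seven flows (no labels given).
rates = [10, 12, 11, 13, 95, 100, 98]
centroids, labels = kmeans_1d(rates, centroids=[min(rates), max(rates)])
print(labels)        # [0, 0, 0, 0, 1, 1, 1]
print(centroids[0])  # 11.5
```

No labels are supplied, yet the algorithm separates the low-rate flows from the high-rate ones. Deciding whether the high-rate group is malicious still requires an analyst or a further model.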
DL is a subset of ML that utilizes ANNs with multiple layers to learn
complex patterns from data (Sarker et al. 2020). It performs learning on
a multi-layer feed-forward neural network consisting of an input layer, one
or more hidden layers, and an output layer (Sarker et al. 2020).
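The layer structure just described can be shown with a forward pass through a very small feed-forward network. The weights, biases, and input features below are hypothetical fixed values chosen for readability; in practice they would be learned from data, and real models have far more units and layers.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Hypothetical parameters for a network with 3 inputs, 2 hidden units, 1 output.
hidden_weights = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

features = [1.0, 2.0, 3.0]                                       # input layer
hidden = dense(features, hidden_weights, hidden_biases, relu)    # hidden layer
output = dense(hidden, output_weights, output_biases, sigmoid)   # output layer
print(hidden, output)
```

With these values the hidden layer evaluates to approximately 0.4 and 2.6, and the sigmoid output is roughly 0.1, which could be read as a low score for the "attack" class in a binary detector.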
It is essential to note that high-quality datasets are necessary for training AI
models effectively. Some of these datasets include KDD 99, CTU-13 and CIC-IDS-2018.
These datasets and AI techniques collectively advance the field of cybersecurity
by providing robust tools and data for identifying, analyzing, and mitigating
various cyber threats. Hence, this section highlights the various AI techniques and
datasets used in cybersecurity, and these are summarized in Table 9.
(1) KDD 99 Dataset: This is one of the most widely used datasets for
evaluating the performance of anomaly detection methods (Wang and
Paschalidis 2016). It was derived from the DARPA 1998 dataset and
contains a variety of simulated intrusions in a network environment.
The data include four main categories of simulated attacks, namely, DoS,
Remote to Local (R2L), User to Root (U2R), and Probing
(Hoque, Bhattacharyya, and Kalita 2016). Wang and
Paschalidis (2016) conducted a study using the KDD 99 dataset to
explore the effectiveness of Genetic Algorithms (GAs) for feature
selection and intrusion detection. The study findings suggest that
incorporating GAs for dimensionality reduction can lead to more
accurate and efficient IDS. Their recommendations focus on adopting
GA for feature selection, combining it with other algorithms, ensuring
real-time application and scalability, continuously updating models,
and exploring further research opportunities to maintain and improve
intrusion detection. Furthermore, Hoque, Bhattacharyya, and Kalita
(2016) conducted a study utilizing the KDD 99 dataset and introduced
Table 9.

Rank 1. Chowdhury et al. (2017)
Dataset used: CTU-13
Techniques: SOM
Problem domain: DDoS attack detection
Evaluation metrics: Accuracy
Feature selection: Source IP address, destination IP address and protocol-specific features

Rank 2. Bhuyan, Bhattacharyya, and Kalita (2015)
Dataset used: Zeus (Snort), Zeus (NETRESEC), Zeus-2 (NIMS), Conficker (CAIDA) and ISOT-Uvic
Techniques: C4.5, SVM, KNN, Bayesian Networks
Problem domain: Botnet detection
Evaluation metrics: Detection rate (DR) and false positive rate
Feature selection: Source IP, destination IP, and basic features (protocols connection)

Rank 3. Bhamare et al. (2016)
Dataset used: ISOT
Techniques: NNC, ANN, SVM, NBC, GBC
Problem domain: Botnet detection
Evaluation metrics: True detection rate, error rate
Feature selection: Network traffic, statistical protocol and time interval

Rank 4. Kozik et al. (2014)
Dataset used: KDD Cup 1999
Techniques: Naïve Bayes, PCA algorithm
Problem domain: Intrusion detection
Evaluation metrics: False positive rate
Feature selection: (1) Basic features, (2) Traffic features, and (3) Content features. The basic features are extracted from a TCP/IP connection. The traffic features are divided into two groups (i.e., “same host” features, and “same service” features). The content features concern suspicious behavior in the data portion

Rank 5. Kato and Klyuev (2014)
Dataset used: CAIDA DDoS 2007, MIT DARPA
Techniques: Statistical method
Problem domain: DDoS attack detection
Evaluation metrics: Accuracy
Feature selection: IP address, destination IP address, time interval in seconds between packets, and packet size in bytes from the database

Rank 6. Xie, Hu, and Slay (2014)
Dataset used: ECML-PKDD 2007 HTTP, CSIC HTTP 2010
Techniques: Naïve Bayes, Bayes Network, Decision Stump, RBF network
Problem domain: Web applications attack
Evaluation metrics: False positive rate
Feature selection: N/A

Rank 7. Moustafa and Slay (2016)
Dataset used: UNSW-NB15
Techniques: k-means
Problem domain: Network anomaly detection
Evaluation metrics: Accuracy, DR, FPR
Feature selection: Flow features (e.g., client-to-server or server-to-client); Basic features (protocols connections); Content features (attributes of TCP/IP, attributes of http services); and Time features (arrival time between packets, start/end packet time, and round-trip time of TCP protocol)

Rank 8. Hoque, Bhattacharyya, and Kalita (2016)
Dataset used: KDD 99, CAIDA, TUIDS DDoS
Techniques: Information metrics
Problem domain: DDoS attack detection
Evaluation metrics: N/A
Feature selection: N/A

Rank 9. Wang and Paschalidis (2016)
Dataset used: KDD 99
Techniques: Genetic algorithm
Problem domain: Intrusion detection
Evaluation metrics: DR
Feature selection: (1) The cluster label of the source IP address; (2) the cluster label of the destination IP address; (3) the source port number; (4) the destination port number; (5) the flow duration; (6) the data bytes sent from source to destination; and (7) the data bytes sent from destination to source

Rank 10. Ferrag et al. (2020)
Dataset used: CSE-CIC IDS2018
Techniques: Deep discriminative models (DNN, RNN, CNN)
Problem domain: Intrusion detection
Evaluation metrics: Accuracy
Feature selection: Network flow features

Rank 11. Injadat et al. (2018)
Dataset used: ISCX2012
Techniques: SVM, KNN, RF
Problem domain: Anomaly detection
Evaluation metrics: Accuracy, precision, recall, F1-score
Feature selection: Bayesian optimization

Rank 12. Kanimozhi and Jacob (2019b)
Dataset used: CSE-CIC IDS2018
Techniques: ANN, RF, k-NN, SVM, Adaboost, NB
Problem domain: Network anomaly detection
Evaluation metrics: Accuracy, precision, recall, F1-score
Feature selection: General information, quality of data, data volume, recording environment, and evaluation.

Rank 13. Verma and Ranga (2023)
Dataset used: CIDDS-001
Techniques: KNN, SVM, DT, RF, NB, DL, ANN, SOMs, EM, k-means
Problem domain: Network intrusion detection
Evaluation metrics: Accuracy
Feature selection: Binary feature encoding

Rank 14. Kim, Shin, and Choi (2019)
Dataset used: CSE-CIC IDS2018
Techniques: CNN and RNN
Problem domain: Intrusion detection
Evaluation metrics: Accuracy
Feature selection: Traffic features

Rank 15. Moustafa and Slay (2016)
Dataset used: UNSW-NB15, KDD99
Techniques: DT, LR, NB, ANN, EM clustering
Problem domain: Anomaly detection
Evaluation metrics: Accuracy, FAR
Feature selection: Flow features, basic features, content features, time features

Rank 16. Kanimozhi and Jacob (2019a)
Dataset used: CSE-CIC IDS2018
Techniques: RF and ANN
Problem domain: Intrusion detection
Evaluation metrics: Accuracy, precision, recall
Feature selection: Network flow features
DDoS attacks using the CAIDA DDoS 2007 and MIT DARPA datasets.
The study highlighted the robustness and scalability of these
methods and provided valuable insights into feature selection and
anomaly detection. The recommendations focused on real-time implementation,
dynamic feature selection, reducing false positives, optimizing
performance, and continuous monitoring to enhance the
detection and mitigation of DDoS attacks in real-world network
environments.
(11) UNSW-NB15 Dataset: This dataset was created by the Australian Centre
for Cyber Security (ACCS). During its creation, tools such as IXIA PerfectStorm,
Tcpdump, Argus, and Bro-IDS were used (Moustafa and Slay
2016). The IXIA tool, which serves as a generator for both normal and
abnormal traffic, was deployed on three virtual servers. Moustafa and Slay
(2016) examined the complexity of the UNSW-NB15 dataset in their
study. In the first step, a statistical analysis of the attributes
was presented; in the second step, feature correlations were examined;
and, in the last step, the performance of five classifiers on the dataset was
measured and compared with that on the KDD99 dataset. The UNSW-NB15
dataset was observed to be more complex than the KDD99 dataset.
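The evaluation metrics that recur across the studies on these datasets (accuracy, detection rate, false positive rate, precision, recall, and F1-score) all derive from the same four confusion-matrix counts. The sketch below shows the standard definitions; the counts passed in are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def detection_metrics(tp, fp, tn, fn):
    """Common IDS evaluation metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called detection rate (DR)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "false_positive_rate": fp / (fp + tn),
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical result: 100 attacks (80 caught) among 1000 records,
# with 10 benign records wrongly flagged.
metrics = detection_metrics(tp=80, fp=10, tn=890, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example accuracy is 0.97 even though one attack in five is missed (recall 0.80), which illustrates why studies on imbalanced intrusion datasets often report detection rate and false positive rate alongside accuracy.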
The third research question (RQ3) concerned the AI tools used for data
extraction, analysis, and optimization. The application domains in Table 10
rely on various types of data and tools to monitor and protect network and
system security. For example, IDSs analyze network traffic data, log data,
or behavioral data using AI tools such as SQL, Python, or R to identify
malicious activities (Sultana and Jilani 2021). Likewise, imaging and
CAPTCHA systems process visual data (image or text data) using
AI tools such as Python or R to either display images or distinguish human
users from bots (Dinh and Hoang 2023). By leveraging these data types and
tools, organizations can effectively process and analyze images, generate
robust CAPTCHA systems, and ensure accurate differentiation between
human users and automated bots (Challagundla, Reddy Gogireddy, and
Reddy Peddavenkatagari 2024). Table 10 summarizes the major
domains where AI tools and techniques have been utilized for data extraction,
analysis, and optimization in cybersecurity models for various purposes.
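As a hedged illustration of how SQL and Python can be combined to extract candidate events from log data, the sketch below counts failed logins per source address in an in-memory SQLite table. The table name auth_log, its columns, the sample rows, and the threshold are all hypothetical, and the query is a simple rule-based extraction step that might feed a learning model rather than an AI technique in itself.

```python
import sqlite3

# Invented authentication log rows: (source IP, outcome).
rows = [
    ("10.0.0.5", "fail"), ("10.0.0.5", "fail"), ("10.0.0.5", "fail"),
    ("10.0.0.5", "fail"), ("10.0.0.7", "ok"), ("10.0.0.7", "fail"),
    ("10.0.0.9", "ok"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_log (src_ip TEXT, outcome TEXT)")
conn.executemany("INSERT INTO auth_log VALUES (?, ?)", rows)

# Assumed cut-off; a real deployment would tune this per environment.
FAILED_LOGIN_THRESHOLD = 3

# Extract every source address whose failed-login count reaches the threshold.
suspicious = conn.execute(
    """
    SELECT src_ip, COUNT(*) AS failures
    FROM auth_log
    WHERE outcome = 'fail'
    GROUP BY src_ip
    HAVING COUNT(*) >= ?
    ORDER BY failures DESC
    """,
    (FAILED_LOGIN_THRESHOLD,),
).fetchall()
conn.close()

print(suspicious)
```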
Table 10. (Continued).
Application domain | Techniques | AI Tools | Purpose | Authors
 | Adaptive Boosting (AdaBoost) | R Libraries, Python Libraries | To improve the performance of classification algorithms. | Divakar et al. (2021)
 | | R libraries | To extract relevant features and identify complex patterns. | et al. (2022)
Imaging and CAPTCHA | SVM | Python Libraries, R libraries | To classify distorted characters or objects present in CAPTCHA images. | Dinh and Hoang (2023); Kumar, Jindal, and Kumar (2022); Sachdev (2020)
Imaging and CAPTCHA | CNN | Python Libraries, R libraries | Leverages convolutional layers to extract hierarchical features from images. To capture spatial relationships and patterns effectively. | Challagundla, Reddy Gogireddy, and Reddy Peddavenkatagari (2024); Wang, Shi, and Uddin (2021)
Imaging and CAPTCHA | Singular Value Decomposition (SVD) | Python Libraries, R Libraries, MATLAB | To extract important features from images. | Kaur and Jindal (2020); Ranjan, Patidar, and Kushwaha (2020)
Phishing/malware detection | ANN and CNN | Apache Spark MLlib, R Libraries, Python Libraries | To learn patterns and features from large datasets of phishing emails. To capture spatial relationships and patterns in images. | Soon et al. (2020); Verma et al. (2019); Hassan and Fakharudin (2023)
Phishing/malware detection | SVM | R Libraries, Python Libraries | To classify instances into phishing or legitimate categories based on features extracted from URLs, email headers, or network traffic. | Anupam and Kumar Kar (2021)
Phishing/malware detection | Q-learning | R Libraries, Python Libraries | To detect malicious content without hampering the critical attributes. | Gill et al. (2021); Kamal et al. (2024)
Traffic classification | K-means clustering | Python Libraries, SQL, R Libraries | To analyze clusters, identify patterns and distinguish between different classes of traffic. | Liao and Li (2022); Jain, Kaur, and Saxena (2022)
Traffic classification | CNN | Python Libraries, R libraries | To classify traffic based on its content, enabling fine-grained classification of applications or protocols. | Salman et al. (2021)
Anomaly and DoS detection | SVM | Python Libraries, R libraries | To classify network traffic based on features such as packet size, frequency, and protocol type. To find the hyperplane that best separates normal from abnormal instances in a high-dimensional feature space. | Bhati and Shekhar Rai (2020); Abuali, Nissirat, and Al-Samawi (2023)
Anomaly and DoS detection | DT and KNN | R Libraries, Python Libraries, SQL | To classify incoming traffic based on the majority class of its nearest neighbours in the feature space. To detect deviations from expected patterns or rules learned from training data. | Alharbi et al. (2021); Ramadhan, Sukarno, and Nugroho (2020)
Anomaly and DoS detection | PCA | Python Libraries, R Libraries, MATLAB | To identify relevant features and reduce noise, making it easier to detect anomalies in the data. | Divakar et al. (2021)
majority class among its nearest neighbors (Veena et al. 2022). Naive Bayes is
a probabilistic classifier that assumes independence among features (Yilmaz,
Taspinar, and Koklu 2022). In IDS, Naive Bayes is used to classify network
traffic by computing the probability of a packet being normal or malicious
based on its attributes (Rekha et al. 2020). DTs are used in IDS to model the
decision-making process based on features extracted from network traffic
(Melvin et al. 2022). They recursively partition the feature space based on
attribute values, leading to a tree-like structure (Kumari and Mehta 2020).
K-Means is a clustering algorithm used in IDS for unsupervised learning tasks
(Saheed, Arowolo, and Tosho 2022). It groups network traffic data into
clusters based on similarities in feature space. K-Means clustering helps in
identifying anomalies and detecting new types of attacks by finding patterns in
unlabeled data (Bohara et al. 2020). In IDS, ANN is used for both supervised
and unsupervised learning tasks. It learns complex patterns from network
traffic data to detect intrusions by adjusting the weights and biases of
interconnected neurons (Saranya et al. 2020). Likewise, in IDS, AdaBoost is used to
improve the performance of classification algorithms by sequentially training
classifiers on reweighted versions of the training data (Divakar et al. 2021).
Each round gives more weight to previously misclassified instances, thus
enhancing the overall accuracy of intrusion detection.
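The way AdaBoost concentrates on misclassified instances can be sketched as follows. This is a minimal illustration rather than the scheme of any cited study: the weak learners are assumed to be one-feature threshold rules (decision stumps), the single-feature records are invented, and labels are encoded as +1 (attack) and -1 (normal). In the standard formulation, each round increases the weight of records the previous stump got wrong instead of drawing separate subsets.

```python
from math import exp, log


def stump_predict(x, feature, threshold, polarity):
    """Predict `polarity` when the feature exceeds the threshold, else the opposite label."""
    return polarity if x[feature] > threshold else -polarity


def best_stump(X, y, weights):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                error = sum(
                    w for x, label, w in zip(X, y, weights)
                    if stump_predict(x, feature, threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def adaboost_fit(X, y, rounds=3):
    """Train a weighted ensemble of stumps; returns (alpha, feature, threshold, polarity) tuples."""
    weights = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        error, feature, threshold, polarity = best_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * log((1 - error) / error)
        ensemble.append((alpha, feature, threshold, polarity))
        # Misclassified records gain weight, correctly classified ones lose it.
        weights = [
            w * exp(-alpha * label * stump_predict(x, feature, threshold, polarity))
            for x, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


# Invented one-feature records: no single threshold separates the two classes.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]

model = adaboost_fit(X, y, rounds=3)
print([adaboost_predict(model, x) for x in X])
```

On this toy data a single stump misclassifies two records, while the three-round ensemble fits all six, which is the effect the paragraph above describes.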
Phishing/malware detection
ANN and CNN are two of the most crucial neural networks used for detecting
phishing and malware by learning patterns and features from large datasets of
phishing e-mails, URLs, or malware samples (Soon et al. 2020). A study by
Verma et al. (2019) utilized deep belief networks and ANNs. The ANN
achieved 89.95% accuracy with five hidden layers and five nodes per layer,
while the deep belief network achieved 96.32% accuracy using similar settings
(Hassan and Fakharudin 2023). CNNs can capture spatial relationships and
patterns in images, enabling them to detect phishing website layouts or
malware signatures effectively. Likewise, SVMs and CNNs have been used for
phishing and malware detection by classifying instances into phishing or
legitimate categories based on features extracted from URLs, e-mail headers,
or network traffic (Anupam and Kumar Kar 2021). Similarly, Q-Learning has
yielded desirable accuracy in recognizing malicious content without
hampering the critical attributes (Gill et al. 2021). Extensive studies on
different large-scale phishing datasets indicate that the Q-Learning-based
RL technique can outperform standard ML models in detection performance
(Kamal et al. 2024).
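A Q-Learning-based detector can be sketched in tabular form as shown below. This is a simplified illustration under stated assumptions, not the model of the cited studies: each classification is treated as a one-step episode, so the bootstrapped next-state term of the Q-learning update is zero and the target reduces to the reward. The two binary URL features, the reward of +1 or -1, and the sample records are all invented.

```python
from collections import defaultdict

ACTIONS = ("legitimate", "phishing")


def reward(action, true_label):
    """Assumed reward scheme: +1 for a correct label, -1 otherwise."""
    return 1.0 if action == true_label else -1.0


def train_q_table(samples, episodes=20, learning_rate=0.5):
    """Tabular Q-learning over one-step episodes (terminal after every action)."""
    q = defaultdict(float)
    for _ in range(episodes):
        for state, true_label in samples:
            for action in ACTIONS:  # exhaustive exploration keeps the sketch deterministic
                target = reward(action, true_label)  # no next state, so no discounted term
                q[(state, action)] += learning_rate * (target - q[(state, action)])
    return q


def classify(q, state):
    """Greedy policy: pick the action with the highest learned Q-value."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


# Hypothetical state: (URL contains an IP address, URL contains '@').
samples = [
    ((1, 1), "phishing"),
    ((1, 0), "phishing"),
    ((0, 1), "phishing"),
    ((0, 0), "legitimate"),
]

q_table = train_q_table(samples)
print(classify(q_table, (1, 1)), classify(q_table, (0, 0)))
```

A realistic detector would use many more features, a function approximator in place of the table, and an exploration strategy such as epsilon-greedy.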
Traffic classification
K-means clustering has been used in traffic classification for unsupervised
learning tasks (Liao and Li 2022). The K-means clustering process generates
cluster centroids for normal and anomalous traffic, which can then be used to
detect anomalies in new flow records monitored within the same network
(Jain, Kaur, and Saxena 2022). Similarly, a recent study used a CNN to
classify traffic based on its content, enabling fine-grained classification of
applications or protocols (Salman et al. 2021). Compared with traditional
classification methods, CNN-based traffic classification can improve accuracy
and reduce classification time (Salman et al. 2021).
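The centroid-based procedure described for K-means can be sketched as follows: centroids are learned from past flow records, and a new record is assigned to the nearest centroid. The two flow features (packets per second, mean packet size), the sample values, and the starting centroids are invented, and treating the second cluster as the anomalous one is an assumption that an analyst would normally confirm by inspection. Features are left unscaled for brevity, although scaling is usually advisable.

```python
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        new_centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


# Invented flow records: (packets per second, mean packet size in bytes).
flows = [
    (10.0, 500.0), (12.0, 520.0), (11.0, 480.0),   # low-rate traffic
    (200.0, 60.0), (210.0, 64.0), (190.0, 56.0),   # high-rate, small-packet traffic
]

# Deterministic start: first and last record as initial centroids.
centroids = kmeans(flows, [flows[0], flows[-1]])

# Assumed interpretation: cluster 0 = normal, cluster 1 = anomalous.
new_flow = (205.0, 62.0)
print(centroids, nearest(new_flow, centroids))
```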
there are no themes/clusters in the lower left quadrant, which suggests that
there are no currently identifiable new or fading trends in the dataset analyzed.
This could also indicate a period of stability in the research field (Foody 2020).
Moreover, the absence of motor themes in the upper right quadrant suggests
a recent shift in focus, where previously dominant themes have either become
too broad to be categorized as a single theme or the research community is in
transition, looking for new directions (Tennekes 2018). It might also reflect
a limitation of the dataset, because the number-of-words parameter for this
study was reduced from the default setting of 250 to 100 (Choudhri et al.
2015). As noted, the readjustment was necessary because the default setting
produced overlapping clusters that were muddled together, making them difficult
to see and interpret. Future research may readjust the parameters for the
datasets to identify emerging or declining patterns as well as motor themes.
Since the literature on the application of AI in cybersecurity is growing
rapidly, it is likely that some of the current basic and niche themes will
move to the upper right quadrant and become motor themes (Madsen, Berg,
and Nardo 2023).
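The quadrant reading used above can be made concrete with a small sketch. In a thematic map, themes are positioned by centrality (horizontal axis) and density (vertical axis): the upper right quadrant holds motor themes, the upper left niche themes, the lower left emerging or declining themes, and the lower right basic themes. The theme names and values below are invented, and using the median of each axis as the cut-off is an assumption; bibliometric tools may place the axes differently.

```python
from statistics import median


def classify_themes(themes):
    """Map each theme name to a quadrant label from its (centrality, density) pair."""
    c_cut = median(c for c, _ in themes.values())
    d_cut = median(d for _, d in themes.values())
    labels = {}
    for name, (centrality, density) in themes.items():
        if centrality >= c_cut and density >= d_cut:
            labels[name] = "motor"                  # upper right
        elif centrality < c_cut and density >= d_cut:
            labels[name] = "niche"                  # upper left
        elif centrality < c_cut and density < d_cut:
            labels[name] = "emerging or declining"  # lower left
        else:
            labels[name] = "basic"                  # lower right
    return labels


# Invented (centrality, density) values for four hypothetical themes.
themes = {
    "theme_a": (0.9, 0.2),
    "theme_b": (0.2, 0.8),
    "theme_c": (0.8, 0.3),
    "theme_d": (0.1, 0.9),
}

labels = classify_themes(themes)
print(labels)
```

With these invented values every theme falls into the basic or niche quadrant, leaving the motor and emerging-or-declining quadrants empty, which is the kind of configuration discussed above.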
Likewise, the thematic evolution (refer to Figure 5) highlights two phases,
namely, 2014–2019 (pre-COVID-19 pandemic) and 2020–2024 (during and
post-COVID-19 pandemic). In the first phase, the dominant themes were AI,
Classification, IoT, Security, ML, Cybersecurity and Cyberattack. This implies
that there is an emphasis on foundational technologies (AI, ML) and their
applications (IoT, cybersecurity) in an increasingly digital world (Jallouli et al.
2019). In the second phase, which was during and post-COVID-19 pandemic
(2020–2024), the dominant themes are ML, XAI and Data mining. This
implies that there is a greater focus on enhancing the interpretability and
trustworthiness of AI (XAI) and leveraging data to address unprecedented
global challenges (Oropeza-Valdez et al. 2024). This suggests that the future
directions for the application of AI in cybersecurity involve leveraging
advanced AI technologies such as ML to enhance security measures, detect
threats, and respond to incidents more effectively (Holzinger 2018). There are
emerging algorithms and practical applications that are driving the interest in
ML (Sarker, Furhad, and Nowrozy 2021). Likewise, cybersecurity remains
a critical concern, due to the increasing number of cyber threats, data
breaches, and the need for secure systems as more services go digital.
Although there are ongoing efforts to develop better security protocols,
tools, and strategies to protect information and systems from malicious
activities (Oropeza-Valdez et al. 2024), future research should focus on
optimizing AI models for effective deployment across various cybersecurity
environments. Also, the bibliometric analysis (refer to Figure 6) points to
strong demand for professionals with expertise in ML and AI
(Verma et al. 2019). Moreover, AI
continues to be a major area of focus due to its wide range of applications,
Discussion
The findings of this research reveal both promising developments and
challenges in the application of AI within the cybersecurity domain. As AI
technologies continue to evolve, they are playing an increasingly critical role
in addressing sophisticated cyber threats. However, the analysis also highlights
several gaps and underexplored areas in existing literature, which have
implications for both researchers and practitioners.
and Internet of Things (IoT), dominated earlier research, reflecting the
growing dependence on digital technologies and the need for advanced security
solutions (Jallouli et al. 2019). The second phase emphasizes XAI and data
mining, indicating an emerging need for greater interpretability of AI models.
This is aligned with growing concerns about the “black-box” nature of many
AI systems, particularly in critical fields like cybersecurity, where trust,
transparency, and accountability are essential (Oropeza-Valdez et al. 2024). These
shifts in thematic focus highlight the increasing importance of developing AI
models that not only detect threats effectively but also provide understandable
insights that human operators can act on confidently. Future research should
focus on developing XAI models that can clearly explain how decisions are
made without compromising performance. This will help build trust among
stakeholders and ensure that AI systems are used responsibly and ethically in
cybersecurity applications.
The findings indicate that XAI has become a focal point in recent years,
reflecting the need for transparency in AI-based cybersecurity systems. As
AI takes on more responsibilities in threat detection and response, the ability
to explain decisions becomes essential for both regulatory compliance and
end-user trust (Oropeza-Valdez et al. 2024). This aligns with industry-wide
efforts to improve the interpretability of AI models, especially in critical
sectors like finance and healthcare, where AI-driven decisions can have
significant consequences.
Metadata quality
The findings indicate that the high quality of most metadata fields, such as
abstracts, authors, and titles, ensures reliable core research analysis. However,
gaps in document identifiers, reprint authors, and cited references present
challenges that limit the study’s depth, particularly in citation analysis and
network mapping. While incomplete metadata in areas like keywords and
affiliations may slightly affect thematic analyses, the impact on overall trend
detection remains minimal due to the availability of other key metadata fields.
In addition, the thematic evolution and research areas are primarily captured
through the combination of keywords and article abstracts, reducing the
dependency on secondary data fields like document identifiers (Donthu
et al. 2021). Nevertheless, future research should focus on addressing the
gaps identified in the metadata to enhance the depth and scope of analysis.
key works and trends within the field. Future research should also broaden
its coverage to include other databases in order to capture
a more diverse range of relevant literature.
Conclusion
This comprehensive review of AI’s application in cybersecurity highlights its
significant potential to enhance threat detection, response, and overall security
posture. While significant progress has been made in leveraging AI for
cybersecurity (Khan, Malik, and Nazir 2024), future research must focus on
optimizing specific AI techniques, improving algorithm interpretability and
transparency, and addressing the challenges of deployment in diverse
environments. By continuing to advance AI methodologies and their applications,
the cybersecurity field can achieve greater resilience and adaptability against
evolving threats.
Disclosure statement
No potential conflict of interest was reported by the author(s).
ORCID
Lizzy Ofusori http://orcid.org/0000-0002-6036-619X
Tebogo Bokaba http://orcid.org/0000-0003-3710-2513
Siyabonga Mhlongo http://orcid.org/0000-0001-8203-5984
References
Abuali, K., L. Nissirat, and A. Al-Samawi. 2023. Advancing network security with AI:
SVM-Based deep learning for intrusion detection. Sensors (Switzerland) 23 (21):8959. doi:
10.3390/s23218959.
Aflalo, A., S. Bagon, T. Kashti, and Y. Eldar. 2023. Deepcut: Unsupervised segmentation using
graph neural networks clustering. In Proceedings of the IEEE/CVF International Conference
on Computer Vision, Paris, France, 32–41. IEEE. doi: 10.48550/arXiv.2212.05853.
Albhirat, M., A. Rashid, R. Rasheed, S. Rasool, S. Zulkiffli, and H. Muhammad Zia-Ul-Haq.
2024. The PRISMA statement in enviropreneurship study: A systematic literature and
a research agenda. Cleaner Engineering and Technology 18 (February):100721. Elsevier.
doi: 10.1016/j.clet.2024.100721.
Alharbi, Y., A. Alferaidi, K. Yadav, G. Dhiman, S. Kautish, and J. Xia. 2021. Denial-of-service
attack detection over IPv6 network based on KNN algorithm. Wireless Communications and
Mobile Computing 2021 (1):1–6. doi: 10.1155/2021/8000869.
Anupam, S., and A. Kumar Kar. 2021. Phishing website detection using support vector
machines and nature-inspired optimization algorithms. Telecommunication Systems
76 (1):17–32. doi: 10.1007/s11235-020-00739-w.
Bertero, C., M. Roy, C. Sauvanaud, and G. Trédan. 2017. Experience report: Log mining using
natural language processing and application to anomaly detection. In 2017 IEEE 28th
International Symposium on Software Reliability Engineering (ISSRE), Toulouse, France,
351–60. IEEE. doi: 10.1109/ISSRE.2017.43.
Bhamare, D., T. Salman, M. Samaka, A. Erbad, and R. Jain. 2016. Feasibility of supervised
machine learning for cloud security. In 2016 International Conference on Information
Science and Security (ICISS), Pattaya, Thailand, 1–5. IEEE.
doi: 10.1109/ICISSEC.2016.7885853.
Bhati, B., and C. Shekhar Rai. 2020. Analysis of support vector machine-based intrusion
detection techniques. Arabian Journal for Science & Engineering 45 (4):2371–83.
doi: 10.1007/s13369-019-03970-z.
Bhuyan, M., D. Bhattacharyya, and J. Kalita. 2015. An empirical evaluation of information
metrics for low-rate and high-rate DDoS attack detection. Pattern Recognition Letters
51:1–7. doi: 10.1016/j.patrec.2014.07.019.
Bohara, B., J. Bhuyan, F. Wu, and J. Ding. 2020. A survey on the use of data clustering for
intrusion detection system in cybersecurity. International Journal of Network Security & Its
Applications 12 (1):1. doi: 10.5121/ijnsa.2020.12101.
Camacho, N. 2024. The role of AI in cybersecurity: Addressing threats in the digital age.
Journal of Artificial Intelligence General Science (JAIGS) 3 (1):143–54. doi:
10.60087/jaigs.v3i1.75.
Challagundla, B., Y. Reddy Gogireddy, and C. Reddy Peddavenkatagari. 2024. Efficient
CAPTCHA image recognition using convolutional neural networks and long short-term
memory networks. International Journal of Scientific Research in Engineering and
Management (IJSREM) 8 (3):1–5. doi: 10.55041/IJSREM29450.
Charbuty, B., and A. Abdulazeez. 2021. Classification based on decision tree algorithm for
machine learning. Journal of Applied Science and Technology Trends 2 (1):20–28.
doi: 10.38094/jastt20165.
Choudhri, A., A. Siddiqui, N. Khan, and H. Cohen. 2015. Understanding bibliometric
parameters and analysis. Radiographics 35 (3):736–46. doi: 10.1148/rg.2015140036.
Chowdhury, S., M. Khanzadeh, R. Akula, F. Zhang, S. Zhang, H. Medal, M. Marufuzzaman,
and L. Bian. 2017. Botnet detection using graph-based feature clustering. Journal of Big Data
4 (1):1–23. doi: 10.1186/s40537-017-0074-7.
Cisco. 2022. Cybersecurity resilience emerges as top priority as 62 percent of companies say
security incidents impacted business operations. Accessed June 13, 2024.
https://investor.cisco.com/news/news-details/2022/Cybersecurity-resilience-emerges-as-top-priority-as-62-percent-of-companies-say-security-incidents-impacted-business-operations/default.aspx.
Cobo, M., A. Gabriel López-Herrera, E. Herrera-Viedma, and F. Herrera. 2011. An approach
for detecting, quantifying, and visualizing the evolution of a research field: A practical
application to the fuzzy sets theory field. Journal of Informetrics 5 (1):146–66.
doi: 10.1016/j.joi.2010.10.002.
Cremer, F., B. Sheehan, M. Fortmann, A. Kia, M. Mullins, F. Murphy, and S. Materne. 2022.
Cyber risk and cybersecurity: A systematic review of data availability. The Geneva Papers on
Risk and Insurance-Issues and Practice 47 (3):698–736. doi: 10.1057/s41288-022-00266-6.
Dasgupta, D., Z. Akhtar, and S. Sen. 2022. Machine learning in cybersecurity: A comprehensive
survey. The Journal of Defense Modeling and Simulation 19 (1):57–106.
doi: 10.1177/15485129209512.
Dawson, M., R. Bacius, L. B. Gouveia, and A. Vassilakos. 2021. Understanding the challenge of
cybersecurity in critical infrastructure sectors. Land Forces Academy Review 26 (1):69–75.
doi: 10.2478/raft-2021-0011.
Dinh, N. T., and V. T. Hoang. 2023. Recent advances of CAPTCHA security analysis: A short
literature review. Procedia Computer Science 218:2550–62. doi: 10.1016/j.procs.2023.01.229.
Divakar, S., R. Priyadarshini, R. Kumar Barik, and D. Sinha Roy. 2021. An intelligent intrusion
detection scheme powered by boosting algorithm. In 2021 11th International Conference on
Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 205–09.
doi: 10.1109/Confluence51648.2021.9377076.
Dixit, P., and S. Silakari. 2021. Deep learning algorithms for cybersecurity applications:
A technological and status review. Computer Science Review 39:100317.
doi: 10.1016/j.cosrev.2020.100317.
Dong, S., P. Wang, and K. Abbas. 2021. A survey on deep learning and its applications.
Computer Science Review 40:100379. doi: 10.1016/j.cosrev.2021.100379.
Donthu, N., S. Kumar, D. Mukherjee, N. Pandey, and W. Marc Lim. 2021. How to conduct
a bibliometric analysis: An overview and guidelines. Journal of Business Research
133:285–96. doi: 10.1016/j.jbusres.2021.04.070.
Ferdiana, R. 2020. A systematic literature review of intrusion detection system for network
security: Research trends, datasets and methods. In 2020 4th International Conference on
Informatics and Computational Sciences (ICICoS), Semarang, Indonesia, 1–6. IEEE.
doi: 10.1109/ICICoS51170.2020.9299068.
Ferrag, M. A., L. Maglaras, S. Moschoyiannis, and H. Janicke. 2020. Deep learning for cyber
security intrusion detection: Approaches, datasets, and comparative study. Journal of
Information Security & Applications 50:102419. doi: 10.1016/j.jisa.2019.102419.
Foody, G. M. 2020. Explaining the unsuitability of the kappa coefficient in the assessment and
comparison of the accuracy of thematic maps obtained by image classification. Remote
Sensing of Environment 239:111630. doi: 10.1016/j.rse.2019.111630.
Gill, S., S. Gogte, C. Sharma, P. Pathwar, V. Desai, A. Gupta, A. Vyas, and O. P. Vyas. 2021.
Exploring deep reinforcement learning for android malware detection. EasyChair Preprint
6594:1–8.
Guezzaz, A., S. Benkirane, M. Azrour, and S. Khurram. 2021. A reliable network intrusion
detection approach using decision tree with enhanced data quality. Security and
Communication Networks 2021 (8):1230593. doi: 10.1155/2021/1230593.
Hassan, N. H., and A. S. Fakharudin. 2023. Web phishing classification model using artificial
neural network and deep learning neural network. International Journal of Advanced
Computer Science & Applications 14 (7):535–42. doi: 10.14569/ijacsa.2023.0140759.
Hofstetter, M., R. Riedl, T. Gees, A. Koumpis, and T. Schaberreiter. 2020. Applications of AI in
cybersecurity. In 2020 second International Conference on Transdisciplinary AI (TransAI),
Irvine, CA, USA, 138–41. IEEE. doi: 10.1109/TransAI49837.2020.00031.
Holzinger, A. 2018. From machine learning to explainable AI. In 2018 world symposium on
digital intelligence for systems and machines (DISA), Košice, Slovakia, 55–66. IEEE.
doi: 10.1109/DISA.2018.8490530.
Hoque, N., D. K. Bhattacharyya, and J. K. Kalita. 2016. A novel measure for low-rate and
high-rate DDoS attack detection using multivariate data analysis. In 2016 8th International
Conference on Communication Systems and Networks (COMSNETS), Bangalore, India, 1–2.
IEEE. doi: 10.1109/COMSNETS.2016.7439939.
Huberman, A. 2014. Qualitative data analysis a methods sourcebook. Thousand Oaks, CA: Sage
Publications.
Injadat, M., F. Salo, A. B. Nassif, A. Essex, and A. Shami. 2018. Bayesian optimization with
machine learning algorithms towards anomaly detection. In 2018 IEEE global
communications conference (GLOBECOM), Abu Dhabi, United Arab Emirates, 1–6. IEEE.
doi: 10.1109/GLOCOM.2018.8647714.
Jain, M., G. Kaur, and V. Saxena. 2022. A K-Means clustering and SVM based hybrid concept
drift detection technique for network anomaly detection. Expert Systems with Applications
193:116510. doi: 10.1016/j.eswa.2022.116510.
Jallouli, R., M. A. Tobji, D. Bélisle, S. Mellouli, F. Abdallah, and I. Osman. 2019. Digital
economy. In Emerging Technologies and Business Innovation: 4th International Conference,
ICDEc 2019, 166–76, Beirut, Lebanon. Cham: Springer International Publishing.
doi: 10.1007/978-3-030-30874-2.
Janati, F., F. Abdollahi, S. S. Ghidary, M. Jannatifar, J. Baltes, and S. Sadeghnejad. 2017.
Multi-robot task allocation using clustering method. In Robot intelligence technology and
applications, ed. J. H. Kim, F. Karray, J. Jo, P. Sincak, and H. Myung, vol. 233, 247. Cham: Springer.
doi: 10.1007/978-3-319-31293-4_19.
Kamal, H., S. Gautam, D. Mehrotra, and M. S. Sharif. 2024. Reinforcement learning model for
detecting phishing websites. In Cybersecurity and artificial intelligence: Transformational
strategies and disruptive innovation, ed. H. Jahankhani, G. Bowen, M. S. Sharif, and
O. Hussien, 309–26. Cham: Springer. doi: 10.1007/978-3-031-52272-7_13.
Kanimozhi, V., and T. P. Jacob. 2019a. Artificial intelligence based network intrusion detection
with hyper-parameter optimization tuning on the realistic cyber dataset CSE-CIC-IDS2018
using cloud computing. In 2019 international conference on communication and signal
processing (ICCSP), Chennai, India, 0033–0036. IEEE. doi: 10.1109/ICCSP.2019.8698029.
Kanimozhi, V., and T. P. Jacob. 2019b. Calibration of various optimized machine learning
classifiers in network intrusion detection system on the realistic cyber dataset CSE-CIC-
IDS2018 using cloud computing. International Journal of Engineering Applied Sciences &
Technology 4 (6):209–13. doi:10.33564/IJEAST.2019.v04i06.036.
Kato, K., and V. Klyuev. 2014. An intelligent DDoS attack detection system using packet analysis
and support vector machine. International Journal of Intelligent Computing Research
5 (3):464–71. doi: 10.20533/ijicr.2042.4655.2014.0060.
Kaur, S., and A. Jindal. 2020. Singular value decomposition (SVD) based image tamper
detection scheme. In 2020 International Conference on Inventive Computation
Technologies (ICICT), Coimbatore, India, 695–99. IEEE.
doi: 10.1109/ICICT48043.2020.9112432.
Khan, A. I., and S. Al-Habsi. 2020. Machine learning in computer vision. Procedia Computer
Science 167:1444–51. doi: 10.1016/j.procs.2020.03.355.
Khan, H. U., M. Z. Malik, and S. Nazir. 2024. Identifying the AI-based solutions proposed for
restricting money laundering in financial sectors: Systematic mapping. Applied Artificial
Intelligence 38 (1):2344415. doi: 10.1080/08839514.2024.2344415.
Kim, J., J. Kim, H. Kim, M. Shim, and E. Choi. 2020. CNN-based network intrusion detection
against denial-of-service attacks. Electronics 9 (6):916. doi: 10.3390/electronics9060916.
Kim, J., Y. Shin, and E. Choi. 2019. An intrusion detection model based on a convolutional
neural network. Journal of Multimedia Information System 6 (4):165–72.
doi: 10.33851/JMIS.2019.6.4.165.
Kozik, R., M. Choraś, R. Renk, and W. Hołubowicz. 2014. A proposal of algorithm for web
applications cyber attack detection. In Computer information systems and industrial
management, ed. K. Saeed and V. Snášel, 8838. Berlin, Heidelberg: Springer.
doi: 10.1007/978-3-662-45237-0_61.
Kumar, M., M. K. Jindal, and M. Kumar. 2022. A systematic survey on CAPTCHA recognition:
Types, creation and breaking techniques. Archives of Computational Methods in Engineering
29 (2):1107–36. doi: 10.1007/s11831-021-09608-4.
Kumari, A., and A. K. Mehta. 2020. A hybrid intrusion detection system based on decision tree
and support vector machine. In 2020 IEEE 5th International conference on computing
communication and automation (ICCCA), Greater Noida, India, 396–400. IEEE.
doi: 10.1109/ICCCA49541.2020.9250753.
Künzler, F. 2023. Real cyber value at risk: An approach to estimate economic impacts of
cyberattacks on businesses. Master thesis, University of Zurich.
Ladosz, P., L. Weng, M. Kim, and H. Oh. 2022. Exploration in deep reinforcement learning: A
survey. Information Fusion 85:1–22. doi:10.1016/j.inffus.2022.03.003.
Li, T., Z. Zeng, J. Sun, and S. Sun. 2022. Using data mining technology to analyse the
spatiotemporal public opinion of COVID-19 vaccine on social media. Electronic Library
40 (4):435–52. doi: 10.1108/EL-03-2022-0062.
Liao, N., and X. Li. 2022. Traffic anomaly detection model using k-means and active learning
method. International Journal of Fuzzy Systems 24 (5):2264–82.
doi: 10.1007/s40815-022-01269-0.
Liberati, A., D. G. Altman, J. Tetzlaff, C. Mulrow, P. C. Gøtzsche, J. P. A. Ioannidis, M. Clarke,
P. J. Devereaux, J. Kleijnen, and D. Moher. 2009. The PRISMA statement for reporting
systematic reviews and meta-analyses of studies that evaluate health care interventions:
Explanation and elaboration. PLOS Medicine 6 (7):e1000100.
doi: 10.1371/journal.pmed.1000100.
Madsen, D. Ø., T. Berg, and M. D. Nardo. 2023. Bibliometric trends in industry 5.0 research:
An updated overview. Applied System Innovation 6 (4):63. doi: 10.3390/asi6040063.
Mallett, R., J. Hagen-Zanker, R. Slater, and M. Duvendack. 2012. The benefits and challenges of
using systematic reviews in international development research. Journal of Development
Effectiveness 4 (3):445–55. doi: 10.1080/19439342.2012.711342.
Marr, B. 2019. Artificial intelligence in practice: How 50 successful companies used AI and
machine learning to solve problems. New York, USA: John Wiley & Sons.
McIntosh, T., J. Jang-Jaccard, P. Watters, and T. Susnjak. 2019. The inadequacy of
entropy-based ransomware detection. In Neural Information Processing: 26th International
Conference, ICONIP 2019, ed. T. Gedeon, K. Wong, and M. Lee, 181–89, Sydney, Australia:
Springer, Cham. doi: 10.1007/978-3-030-36802-9-20.
Melvin, A. A. R., G. J. W. Kathrine, S. S. Ilango, S. Vimal, S. Rho, N. N. Xiong, and Y. Nam.
2022. Dynamic malware attack dataset leveraging virtual machine monitor audit data for the
detection of intrusions in cloud. Transactions on Emerging Telecommunications Technologies
33 (4):e4287. doi: 10.1002/ett.4287.
Moerland, T. M., J. Broekens, A. Plaat, and C. M. Jonker. 2023. Model-based reinforcement
learning: A survey. Foundations and Trends 16 (1):1–118. doi: 10.1561/2200000086.
Mohammadpour, L., T. C. Ling, C. S. Liew, and A. Aryanfar. 2022. A survey of CNN-based
network intrusion detection. Applied Sciences 12 (16):8162. doi: 10.3390/app12168162.
Moustafa, N., and J. Slay. 2016. The evaluation of network anomaly detection systems:
Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data
set. Information Security Journal: A Global Perspective 25 (1–3):18–31.
doi: 10.1080/19393555.2015.1125974.
Nievas, N., A. Pagès-Bernaus, F. Bonada, L. Echeverria, and X. Domingo. 2024. Reinforcement
learning for autonomous process control in industry 4.0: Advantages and challenges.
Applied Artificial Intelligence 38 (1):2383101. doi: 10.1080/08839514.2024.2383101.
Oropeza-Valdez, J. J., C. Padron-Manrique, A. Vazquez-Jimenez, X. Soberon-Mainero, and
O. Resendis-Antonio. 2024. Exploring metabolic anomalies in COVID-19 and
Sommer, R., and V. Paxson. 2010. Outside the closed world: On using machine learning for
network intrusion detection. In 2010 IEEE Symposium on Security and Privacy, 305–16,
Oakland, CA, USA: IEEE. doi: 10.1109/SP.2010.25.
Soon, G. K., C. K. On, N. M. Rusli, T. S. Fun, R. Alfred, and T. T. Guan. 2020. Comparison of
simple feedforward neural network, recurrent neural network and ensemble neural
networks in phishing detection. Journal of Physics: Conference Series 1502 (1):012033.
doi: 10.1088/1742-6596/1502/1/012033.
Sultana, J., and A. K. Jilani. 2021. Classifying cyberattacks amid COVID-19 using support vector
machine. In Security incidents & response against cyber attacks, ed. A. Bhardwaj and V.
Sapra, 161–75. Cham: Springer. doi: 10.1007/978-3-030-69174-5_8.
Sun, N., J. Zhang, P. Rimba, S. Gao, L. Y. Zhang, and Y. Xiang. 2018. Data-driven cybersecurity
incident prediction: A survey. IEEE Communications Surveys & Tutorials 21 (2):1744–72.
doi: 10.1109/COMST.2018.2885561.
Tareq, W. Z. T., and M. Davud. 2024. Classification and clustering. Decision-Making Models,
351–59. New York: Academic Press. doi: 10.1016/B978-0-443-16147-6.00024-4.
Tennekes, M. 2018. Tmap: Thematic maps in R. Journal of Statistical Software 84 (6):1–39. doi:
10.18637/jss.v084.i06.
Veena, K., K. Meena, Y. Teekaraman, R. Kuppusamy, A. Radhakrishnan, and D. K. Jain. 2022.
C SVM classification and KNN techniques for cyber crime detection. Wireless
Communications and Mobile Computing 2022:1–9. doi: 10.1155/2022/3640017.
Verma, A., and V. Ranga. 2023. On evaluation of network intrusion detection systems:
Statistical analysis of CIDDS-001 dataset using machine learning techniques. Pertanika
Journal of Science & Technology 26 (3):1307–32. doi: 10.36227/techrxiv.11454276.v1.
Verma, M. K., S. Yadav, B. K. Goyal, B. R. Prasad, and S. Agarawal. 2019. Phishing website
detection using neural network and deep belief network. In Recent findings in intelligent
computing techniques, advances in intelligent systems and computing, ed. P. Sa, S. Bakshi,
I. Hatzilygeroudis, and M. Sahoo, vol. 707. Singapore: Springer.
doi: 10.1007/978-981-10-8639-7_30.
Wang, J., and I. C. Paschalidis. 2016. Botnet detection based on anomaly and community
detection. IEEE Transactions on Control of Network Systems 4 (2):392–404.
doi: 10.1109/TCNS.2016.2532804.
Wang, Z., P. Shi, and M. I. Uddin. 2021. CAPTCHA recognition method based on CNN with
focal loss. Complexity 2021 (1):1–10. doi: 10.1155/2021/6641329.
Wei, L., X. Li, T. Cao, Q. Zhang, L. Zhou, and W. Wang. 2019. Research on optimization of
CAPTCHA recognition algorithm based on SVM. In Proceedings of the 2019 11th
International Conference on Machine Learning and Computing, Zhuhai, China, 236–40.
doi: 10.1145/3318299.3318355.
Wirkuttis, N., and H. Klein. 2017. Artificial intelligence in cybersecurity. Cyber, Intelligence and
Security 1 (1):103–19. doi: 10.1006/jesp.1996.0006.
World Bank. 2024. Cybersecurity multi-donor trust fund. Accessed June 13, 2024.
https://www.worldbank.org/en/programs/cybersecurity-trust-fund/overview.
Xie, M., J. Hu, and J. Slay. 2014. Evaluating host-based anomaly detection systems: Application
of the one-class SVM algorithm to ADFA-LD. In 2014 11th International Conference on
Fuzzy Systems and Knowledge Discovery (FSKD), 978–82, Xiamen, China.
doi: 10.1109/FSKD.2014.6980972.
Yilmaz, A. B., Y. S. Taspinar, and M. Koklu. 2022. Classification of malicious Android
applications using naive Bayes and support vector machine algorithms. International Journal of
Intelligent Systems and Applications in Engineering 10 (2):269–74.
You, J., J. Jia, X. Pang, J. Wen, Y. Shi, and J. Zeng. 2023. A novel multi-robot task assignment
scheme based on a multi-angle K-means clustering algorithm and a two-stage
load-balancing strategy. Electronics 12 (18):3842. doi: 10.3390/electronics12183842.
Zhang, Z., H. Al Hamadi, E. Damiani, C. Y. Yeun, and F. Taher. 2022a. Explainable artificial
intelligence applications in cyber security: State-of-the-art in research. IEEE Access
10:93104–39. doi: 10.1109/ACCESS.2022.3204051.
Zhang, Z., H. Ning, F. Shi, F. Farha, Y. Xu, J. Xu, F. Zhang, and K. R. Choo. 2022b. Artificial
intelligence in cyber security: Research advances, challenges, and opportunities. Artificial
Intelligence Review 55 (2):1029–53. doi: 10.1007/s10462-021-09976-0.