
Received 25 August 2024, accepted 19 September 2024, date of publication 25 September 2024, date of current version 11 October 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3467996

Email Spam: A Comprehensive Review of Optimize Detection Methods, Challenges, and Open Research Problems
EKRAMUL HAQUE TUSHER 1 , MOHD ARFIAN ISMAIL 1,2 ,
MD ARAFATUR RAHMAN 3 , (Senior Member, IEEE),
ALI H. ALENEZI 4 , AND MUEEN UDDIN 5 , (Senior Member, IEEE)
1 Faculty of Computing, Universiti Malaysia Pahang Al-Sultan Abdullah, Pekan, Pahang 26600, Malaysia
2 Center of Excellence for Artificial Intelligence and Data Science, Universiti Malaysia Pahang Al-Sultan Abdullah, Gambang 26300, Malaysia
3 School of Mathematics and Computer Science, University of Wolverhampton, WV1 1LY Wolverhampton, U.K.
4 Remote Sensing Unit, Electrical Engineering Department, Northern Border University, Arar 73213, Saudi Arabia
5 College of Computing and Information Technology, University of Doha for Science and Technology, Doha, Qatar

Corresponding authors: Mohd Arfian Ismail ([email protected]) and Mueen Uddin ([email protected])
This work was supported in part by the Fundamental Research Grant (FRGS) with FRGS/1/2022/ICT02/UMP/02/2 from the Ministry of
Higher Education Malaysia under Grant RDU220134; in part by Qatar National Library—QNL (Open Access Research); and in part by the
Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia, under Project NBU-FFR-2024-2159-07.

ABSTRACT Nowadays, emails are used across almost every field, spanning from business to education.
Broadly, emails can be categorized as either ham or spam. Email spam, also known as junk emails or
unwanted emails, can harm users by wasting time and computing resources, along with stealing valuable
information. The volume of spam emails is rising rapidly day by day. Detecting and filtering spam presents
significant and complex challenges for email systems. Traditional identification techniques like blocklists,
real-time blackhole listing, and content-based methods have limitations. These limitations have led to the
advancement of more sophisticated machine learning (ML) and deep learning (DL) methods for enhanced
spam detection accuracy. In recent years, considerable attention has focused on the potential of ML and
DL methods to improve email spam detection. A comprehensive literature review is therefore imperative
for developing an updated, evidence-based understanding of contemporary research on employing these
methods against this persistent problem. The review aims to systematically identify various ML and DL
methods applied for spam detection, evaluate their effectiveness, and highlight promising future research
directions in light of existing gaps. By combining and analyzing findings across studies, it identifies the strengths
and weaknesses of existing methods. This review seeks to advance knowledge on reliable and efficient
integration of state-of-the-art ML and DL into identifying email spam.

INDEX TERMS Email spam, machine learning, deep learning, fuzzy system, feature selection, spam
detection.

The associate editor coordinating the review of this manuscript and approving it for publication was Parul Garg.
2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 12, 2024
E. H. Tusher et al.: Email Spam: Detection Methods, Challenges, and Open Research Problems

I. INTRODUCTION
Emails have become an essential component of the contemporary lifestyle, which is heavily influenced by technology. Since its introduction to the public in the mid-1990s, the use of emails has had a noticeable positive effect on various sectors such as business, healthcare, education, and industry. Emails have facilitated collaboration among individuals by offering a cost-effective and expeditious mode of communication [1]. They have greatly facilitated communication and information exchange on both personal and professional levels. However, the increasing usage and reliance on emails have also exposed users to greater cybersecurity risks in the form of spam attacks, malware infections, and other modes of exploitation [2]. As emails continue to play a pivotal role across domains, it is critical for users as well as organizations to adopt safe email practices and robust
security measures against emerging threats. Cybercriminals utilize email channels as a launchpad for assaults that have the potential to seriously hurt both people and organizations. Indeed, it is claimed that emails are responsible for as much as 90% of cyberattacks [3]. Even though there have been efforts to improve email security, vulnerabilities still exist. To exploit organizations and compromise their systems, attackers utilize a variety of strategies, such as social engineering, hacking email accounts, and the fabrication of bogus emails [4]. Social engineering initiatives are among the most misleading of these tactics since they are designed to trick personnel, gain unauthorized access, disclose sensitive information, disseminate malware, and disrupt essential activities [5]. It is therefore absolutely necessary to take action against these growing email-based dangers and to boost cybersecurity prevention measures [6]. There are vulnerabilities in email networks that are routinely exploited by malicious individuals. The most common methods of attack that these individuals use are spam and phishing emails.
Emails have mostly made conversation and connection easier, but a big problem is that people keep getting spam emails. Segregating legitimate emails from unwanted spam has therefore become a critical task. Studies show that spam accounts for over 50% of global email traffic [7], with healthcare and dating scams being highly common. The volume of spam is rising in line with the overall growth in emails worldwide. By 2025, an estimated 376 billion emails will be sent daily, to over 4.6 billion users [8]. This torrent of spam incurs significant economic and social costs. From consuming network resources to jeopardizing privacy, dealing with spam leads to major technical and infrastructure expenditures [9]. Additionally, research indicates that the frustration from spam can negatively impact mental well-being [10].
Every day, more than 320 billion unsolicited emails are produced, and this channel is utilised to disseminate 94% of malicious software. The projected financial impact was estimated to be $12 billion due to the dissemination of unsolicited commercial emails to corporate email recipients [11]. Figure 1 from Statista's report on January 16, 2023 shows that a large amount of spam emails were sent globally on that particular day. Approximately 8.6 billion emails were received by the United States, with the Czech Republic and the Netherlands following with 7.7 and 7.6 billion emails, respectively [8].

FIGURE 1. Worldwide everyday spam emails [8].

There are five primary categories of spam: mobile spam, messaging spam, e-mail spam, search engine optimization (SEO) spam, and social networking spam. Figure 2 provides an overview of prevalent categories of e-mail spam. Based on a virus analysis, it was determined that 94% of malware was transmitted by email. A majority of spam emails have an attachment, with approximately 45% of these attachments being Office document files. Windows programs ranked second, accounting for 26% of virus transmission through spam e-mail [8].

FIGURE 2. Common types of spam email [8].

According to the secure list report, Figure 3 reveals that Russia is the foremost country in terms of outgoing spam, accounting for 23.5% of the total. Germany follows in second place with 11%, while the United States ranks third with 10.85% [12].
Efficient and secure digital communication relies heavily on the detection of email spam. Efficient spam detection protects users from undesired and potentially dangerous emails that can waste time, consume resources, and jeopardize personal or corporate data. By effectively filtering out spam, email systems boost user experience, increase productivity, and protect against security concerns like phishing and malware. For more than twenty years, researchers have suggested many approaches and techniques, including the utilisation of Real-Time Blackhole List [13], Blocklist [14], and Content-Based Filters [15], to detect spam and separate it from legitimate messages. Ongoing research is currently being performed to design techniques that are more efficient and precise. Specifically, researchers have shown significant interest in artificial intelligence (AI)-based approaches in recent years [16]. Significant attention has been given to ML-based methods [7], [17], [18]. Moreover, spam email detection has lately witnessed successful implementation of DL methods [19], [20], [21]. The results of these studies demonstrate that ML and DL methods offer an efficient framework for effectively addressing the issue of spam identification, but they also suffer difficulties such as managing false positive and false negative results, adjusting
to novel spam strategies, and guaranteeing computational effectiveness. Furthermore, maintaining a proper balance between precision and the utilization of resources continues to be a crucial concern. Researchers and practitioners in the field face an ongoing challenge to keep ahead of spammers as they constantly improve their techniques [22]. However, there is still potential for additional enhancement and advancement.

FIGURE 3. Leading countries in sending spam [12].

The use of ML and DL to find spam is an area that is changing quickly and has gained a lot of attention lately because it has the potential to get around the problems with traditional methods and make detection more accurate. But there needs to be a comprehensive overview of the present research in this field. This can help us figure out the pros and cons of different ML and DL methods and guide the growth of future research. It is possible to get a clear and evidence-based view of how ML and DL can be used to find spam by gathering together the results of all the relevant studies and finding any gaps in the literature. In particular, this kind of study can show how well different ML and DL methods work, as well as their flaws and possible ways to make them better. A thorough study can also find gaps in the research and help with coming up with new research questions and areas to focus on.

A. REVIEW SCOPE
In our comprehensive investigation of more than one hundred research publications sourced from renowned scientific databases such as IEEE Xplore, Web of Science, ScienceDirect, and Scopus, we discovered a noteworthy pattern. There were a number of surveys that addressed the more general topic of email spam; however, there was a significant void in the literature that explicitly focused on detection approaches. Recognising that there has been very little attention paid to the identification of spam in email, our objective was to make a significant contribution by limiting the scope of our investigation and presenting a comprehensive analysis of the most effective ML and DL techniques that are currently being used in this area. Our work places an emphasis on the most recent developments in email spam detection methods, with a particular focus on specialised and optimised approaches. Table 1 offers a comparative analysis between existing survey papers on ML and DL applications in the broader email spam domain and our focused exploration of detection techniques. Through this paper, we sought to illuminate every potential application of ML and DL methods in email spam detection, presenting a comprehensive overview of the field. Furthermore, we have meticulously highlighted the impacts of optimized methods and addressed the challenges of scaling up these innovative solutions within spam detection systems. By doing so, we aim to provide researchers and practitioners with a thorough understanding of the current landscape, potential improvements, and future directions in email spam detection using advanced ML and DL techniques.

B. CONTRIBUTION
There are gaps in understanding the effectiveness, limitations, and potential improvements needed for current spam detection techniques. With the field evolving rapidly and accurate identification of spam being crucial, an updated comprehensive review is needed. This review would synthesize available evidence on existing methods, highlight literature gaps to address through new research, and provide the following key contributions:
• The paper presents a comprehensive review of the crucial characteristics used to identify email spam, as well as significant advancements in this field. The survey identifies significant research gaps and outlines future research goals in the field of email spam detection, based on a comprehensive analysis of existing literature.
• This review paper focuses on the various ML and DL methods utilized for spam email detection and analyzes the effectiveness of existing techniques in accurately identifying spam messages.
• The review presents an elaborate study of several methods applied to email spam detection over the period 2005-2024.
• The review analyses the performance of ML and DL methods by examining the findings reported in recent research, and presents a concise summary of these findings in well-organised tables.
• The review identifies the strengths and limitations of various spam detection methods. Analysing the current
literature highlights the key challenges that need to be solved to improve the accuracy and efficacy of identifying spam emails.
• The review focuses its scope on an understudied niche area of ML and DL methods in email spam detection to fill a literature gap and make novel contributions.

TABLE 1. Summary of previous reviews in email spam detection.

Overall, this review paper offers valuable insights by
presenting a concentrated technical summary, performance
evaluation, and future prospects primarily aimed at email
spam detection. The discoveries are intended to accelerate
progress in this promising domain.
The structure of this paper is as follows: Section II
provides an overview of the existing literature and discusses
the results obtained from the survey. Section III offers
a comprehensive examination of the prominent methods
employed for the identification of email spam. Section IV
provides an overview of the data pre-processing and specific
information regarding the datasets that are accessible to
the public. Section V discusses the implications of the research.
Section VI presents the challenges that were observed during
the research. Section VII presents the research gaps and
open research problems. The conclusion is presented in
Section VIII. Figure 4 below shows the paper’s structure,
which should help readers grasp it better and make the paper
easier to read:

FIGURE 4. A visualization of the organization of this paper's structure.

II. RELATED WORKS

A. EMAIL SPAM DETECTION
Email refers to electronic mail sent from one device to another over the internet. Since innovating online communication in the early days of networking, email continues playing a vital role in both personal and professional realms despite competition from messaging apps and social media [23]. Since its inception, email has evolved to become a versatile and indispensable tool in both personal and professional spheres [24]. Figure 5 presents three major email service providers that most people utilize: Gmail, Yahoo, and Outlook. Each email platform has distinct advantages and is better suited for certain use cases over the others.
Gmail, offered by Google, is one of the most widely adopted email services globally. Benefits of using Gmail include high inbox storage capacity, excellent search functionality, seamless integration with other Google Workspace apps like Drive and Calendar, and robust spam detection powered by artificial intelligence. The ads-supported model enables providing these features free of cost [25]. Gmail is ideal for personal communication and works well with most email clients.
Yahoo Mail is also a popular free email platform. Key advantages include custom domains to maintain a professional brand, disposable email addresses to protect identity, automatic data download in case of account hacking, and seamless communication tools like chat and SMS
built-in with the interface [26]. These features make Yahoo Mail suitable for business use cases like newsletters, mass communication and privacy protection.
Outlook refers to the email client offered by Microsoft, often bundled with Office suite subscriptions. It works great with Microsoft apps like Word, Excel and Teams. Outlook provides top-notch calendar organization features, robust communication and collaboration functionalities for enterprises, and high security including encryption that complies with financial and healthcare regulations [27]. These capabilities make Outlook popular among corporations and businesses.

FIGURE 5. Types of the email.

The capacity to serve edu mail is a connection across all of these email providers. These emails facilitate administrative duties, course management, and group research by ensuring secure communication within academic communities. Edu mail helps users develop a professional identity by providing access to educational materials, software discounts, and increased privacy precautions. In contrast to Yahoo's simplicity and ease of use, Gmail interacts with Google Workspace for Education. Outlook, which is part of Microsoft 365, offers powerful technologies like OneDrive and Teams [28]. Based on the specific requirements of the institution, each platform improves edu mail with features like enhanced collaboration tools, security measures, and intuitive interfaces.
The ability to exchange thoughts and ideas has increased as communication has developed over time. From the time when communication was limited to face-to-face interactions, letter writing, phone conversations, and text messaging to the present era of online presence, communication has changed and become more affordable. Email is a helpful communication tool with a wider audience. There are two categories of emails: ham emails and spam emails [29]. However, email is currently being utilised inappropriately under the guise of spam. Bulk or unsolicited email, generally known as spam, may contain an advertisement, a link to a phishing website, malware or a Trojan horse. Every day, each of us used to get a lot of emails, of which 70-80 percent were spam [30]. Spammers utilise spam emails for a variety of purposes, including hacking, phishing and banking fraud. The ideal platform for spammers to obtain user personal data and send spam emails is social media. Junk emails are another term used to describe spam. Spam emails are used to spread trojans, phishing websites, malware that looks like a virus, offers, and other types of content advertising. The term spam is sometimes expanded as Self-Propelled Advertising Material [31]. Over 280 billion spam emails were sent and received worldwide in 2019. Google reports that 64% of emails sent and received in 2019 are spam emails, up from the prior years' 2%-3% rate [18]. Two different kinds of spam detection methods exist. These include spam detection based on sender and spam detection based on content [32]. Content-Type, MIME-Version, Message-ID, Return-Path, and Authentication-Results were the major elements used to detect spam sent by a specific sender [33]. When doing content-based spam filtering, the email's subject and URL are compared to the email's text to determine its text classification.
Taking advantage of the advancements in technology, a large number of cybercriminals create hazardous scam communications every day and send them to millions of individuals around the world. An easy, free, and potentially anonymous method of spreading the scams online is through email services. Spam is increasingly linked with a problematic and dangerous concern for the security, integrity, and dependability of email users on the internet, even though they typically only perceive it as annoying, uninvited advertising or a waste of time. Furthermore, spam is a significant issue because, according to estimates from Kaspersky Lab and Cisco Talos, 50-85% of the 200 billion emails received daily worldwide are spam [19]. Since spam email has been an issue for the last few decades, businesses and researchers are working to develop filters that are both reliable and effective. To determine if an email is spam or valid (commonly referred to as ham), various methods based on ML techniques in the literature nowadays demonstrate excellent performance with accuracies around 90% [21]. Despite the remarkable speed results and upgrades to the filters, users still report attacks and frauds originating from spam emails. Many different kinds of spam detection algorithms have been effective in eradicating spam emails [34]. Different types of email spam detection techniques are given in Figure 6:

1) CONTENT BASED EMAIL SPAM DETECTION TECHNIQUE
Emails can be automatically filtered and classified based on their contents using a variety of machine learning techniques, including k-Nearest Neighbor, Support Vector Machine, Naive Bayesian classification and Neural Networks [33], [35]. This technique often uses word analysis, occurrence analysis, and distribution analysis to detect incoming email spam by analyzing the content of the emails.
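As a rough illustration of content-based filtering, the sketch below trains a multinomial Naive Bayes classifier on word occurrence counts. It is not taken from any of the reviewed studies: the class name NaiveBayesSpamFilter, the whitespace tokenizer, the Laplace smoothing value, and the four toy training messages are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Hypothetical minimal multinomial Naive Bayes over word counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.vocab = set()

    def fit(self, texts, labels):
        # Occurrence analysis: count how often each word appears per class.
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def _log_score(self, tokens, label):
        # log P(label) + sum of log P(word | label) with Laplace smoothing.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are skipped
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return max(("spam", "ham"), key=lambda lbl: self._log_score(tokens, lbl))


# Toy training set (invented for illustration only).
train_texts = [
    "win a free prize now",
    "cheap loans click now",
    "meeting agenda for monday",
    "please review the attached report",
]
train_labels = ["spam", "spam", "ham", "ham"]

clf = NaiveBayesSpamFilter().fit(train_texts, train_labels)
print(clf.predict("free prize click now"))
print(clf.predict("monday meeting report"))
```

A real filter would be trained on a public corpus and would typically use a library implementation with proper tokenization; the hand-rolled version is used here only so that the word-count and smoothing steps stay visible.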

FIGURE 6. Types of email spam detection technique.

2) CASE BASED EMAIL SPAM DETECTION TECHNIQUE
Case based detection is a popular method for detecting spam emails. The first step is to gather all emails from each user's mailbox, regardless of whether they are spam or not. The next step is to pre-process the email so that it may be converted using a client interface. This involves extracting and selecting features, aggregating email data, and finally, evaluating the results. Next, the data is divided into two sets of vectors [36]. Finally, an ML method is utilised to train and evaluate datasets in order to ascertain whether incoming emails are classified as spam.

3) RULE BASED EMAIL SPAM DETECTION TECHNIQUE
In this method, numerous patterns, typically regular expressions, are evaluated against a selected message using preexisting rules or heuristics. A message's score increases as it matches more patterns, whereas the score is reduced if any of the patterns does not match. If the score of a message is high enough, it is classified as spam; otherwise, it is considered legitimate. Some ranking factors are static, while others need to be updated frequently to keep up with the ever-evolving threat posed by spammers and their more sophisticated and difficult-to-detect messages [37]. SpamAssassin is an excellent example of a rule-based spam detector.

4) PREVIOUS LIKENESS BASED EMAIL SPAM DETECTION TECHNIQUE
In this strategy, incoming emails are sorted according to how closely they resemble instances already stored in memory (i.e., training emails). New instances are represented as points in a multidimensional space, which is generated using the email's properties [38]. Then, each new instance is assigned to the most common class among its k-nearest training instances. For this purpose, the k-NN method is used.

5) ADAPTIVE BASED EMAIL SPAM DETECTION TECHNIQUE
The system detects spam by assigning it to one of several categories [39]. It classifies a collection of emails into categories and assigns a defining text to each category. This technique utilizes adaptive algorithms and machine learning to remain efficient in the presence of continuously evolving spam trends.
ML methods, fuzzy systems, and DL methods are some of the methods that have been used in email spam detection. These methods were selected due to their superior classification performance and high accuracy in detecting email spam. In the following section, ML methods are discussed in more detail. Numerous studies have demonstrated that ML delivers high performance for email spam detection across diverse datasets and evaluation metrics.

B. MACHINE LEARNING
Artificial intelligence encompasses many subfields, one of which is ML. The term ML is used to describe the process of designing, analyzing, and deploying systems that help a machine get better results. ML systems use training data to make predictions about the problem. In particular, training data is utilized to extract information and develop a method that should generalize to all conceivable problem cases throughout the learning phase [40]. The ML method is used to classify new samples after learning. The goal of ML is to develop a method that predicts well on test data with new examples. For automated decision-making, ML methods are commonly used. ML uses training data to construct methods that can effectively predict fresh data outcomes, enabling automated decision-making across many disciplines [41]. Many fields have successfully used applications of ML techniques. But there is a subset that always needs new methods because of an adversarial figure, and that includes things like phishing detection, spam detection, and botnet identification [42]. Nonetheless, institutions and researchers need to address this issue by taking into account the unique characteristics of their respective fields of study. For instance, phishing differs from spam in that it often masquerades behind legitimate-looking brand logos and requests personal information or conveys an urgent message [43]. In another study, ML security research on adversarial techniques typically focuses on spam email detection, whose adversarial figure is commonly referred to as a spammer. By including specific misspelt terms or legitimate words in the email, spammers aim to fool the classifier without negatively impacting the email's readability. As a result, spam emails may contain malicious data that was purposefully injected by spammers to compromise the data used for training the classifiers and, in turn, undermine the filter's regular operation [44]. In addition, a comparative analysis tested many ML methods on the same data set. Accuracy and precision were used to evaluate the various machine learning methods. The accuracy of the support vector machine is 98.09% [45]. Additionally, Cota et al. used two publicly accessible corpora. For the first set of tests, each corpus was divided into 80% training and 20% testing, and for the second set, 70% training and 30% testing. Using Random Forests, the best accuracy for the input corpus was
85.25% and 86.25%, respectively. These findings are consistent with other studies [46]. According to the previous studies on spam detection using ML methods outlined in Table 2, it can be inferred that scholars strongly appreciate ML methods for their significance in detecting spam texts.
Currently, ML methods employed for email spam detection mostly rely on techniques such as SVM, NB, RF, and k-NN. These approaches have been successful in reaching accuracies within the range of 90-99%. Nevertheless, these strategies encounter constraints such as false positives, static feature extraction, and high computational complexity. There is a notable lack of research in creating more flexible and responsive methods that can respond to evolving spam strategies. Additionally, there is a requirement to investigate hybrid or ensemble techniques that integrate various algorithms in order to enhance accuracy and minimize false positives.
ML methods have proven effective for email spam detection across multiple studies. However, ML methods may struggle with vague and ambiguous information. In contrast, fuzzy systems can better handle uncertainty and imprecision in data and logic. This is because fuzzy systems can represent and reason with vague, ambiguous information using fuzzy logic. Furthermore, fuzzy systems can adjust and adapt to changing data and situations by applying fuzzy rules. In the following section, the use of fuzzy systems is discussed in more detail in the context of email spam detection.

C. FUZZY SYSTEM
There has been a proliferation of applications of fuzzy set theory in recent years, including ML, data mining and DL. Researchers in this area recognised the need for measuring the fuzzy membership vector in a fuzzy set or event as a result of the widespread use of the idea of fuzzy set theory [55]. Additionally, Gazal et al. developed a two-level filter-based hybrid spam detection methodology. At Level-1, a high-level filter removes irrelevant and unimportant features and content. Level-2 uses a fuzzy-based composite evaluator for low-level filtration and to find the most effective features. CSDMC2010 SPAM, spambase and the SMS Spam Collection are all used in the method's implementation. The results of the comparison showed that the proposed method beat the current conventional and recent algorithms and methods, with an average accuracy of 98.80% on the CSDMC2010 dataset, 97.79% on the spambase dataset, and 98.84% on the SMS Spam Collection dataset [56]. Moreover, fuzzy inference systems utilising Interval Type-1 and Interval Type-2 were created employing four distinct machine learning algorithms to showcase their efficacy in identifying spam. The methods evaluated were SVM, LR, and average area AUC of 0.971 [57]. In another work, Srinivasarao et al. introduced fuzzy-based Recurrent Neural network-based Harris Hawk optimization (FRNN-HHO) to post-classify spam and ham messages. Three distinct datasets, SMS, Email, and SpamAssassin, are used to assess the efficacy of the proposed architecture. For the SMS dataset, the suggested method achieved an AUC of 0.9699, for the email dataset it achieved 0.958, and for SpamAssassin it achieved 0.95 [58]. In another study, fuzzy C-Means clustering was utilized for spam email segmentation to prevent cybercrime in the Internet era. Previous studies have shown that clustering in data mining for spam filtering has been understudied. This study demonstrated that Fuzzy C-Means clustering showed promising results for spam email categorization on a public spam dataset using different parameters [59]. In addition, email's growing popularity as a secure online communication method has led to the rise of unsolicited bulk emails or spam. A proposed spam filtering strategy handles this issue by employing Relief feature selection and a fuzzy-SVM to deal with uncertain elements. Experiments showed that these algorithms improved spam filtering accuracy and detection speed [60]. In another study, the widespread problem of spam in mailboxes has negative effects on network resources and daily life. To address this issue, a content-based spam filtering algorithm using fuzzy-SVM and k-means was proposed. k-means clustering reduces data while maintaining critical information. Meanwhile, fuzzy-SVM trains a classification method to handle ambiguity. This strategy improves spam filtering speed and accuracy, according to experiments [61]. Table 3 presents prior research on spam detection using fuzzy systems. From this analysis, it can be assumed that researchers highly value the significance of fuzzy system techniques in email spam detection.
The present research examines the application of various fuzzy systems in email spam detection. It focuses on distinct models, datasets, merits, and findings. However, there is a significant lack of research in combining fuzzy logic with sophisticated DL methods. Although Fuzzy-BERT demonstrates potential, there is a lack of investigation into hybrid models that integrate fuzzy logic with other cutting-edge algorithms in order to enhance accuracy and resilience. Moreover, the majority of research primarily concentrates on binary classification, disregarding the potential advantages of employing multi-class classification methods for spam detection.
Fuzzy systems have proven effective for email spam detection across multiple studies. Fuzzy systems provide advantages in dealing with uncertainty but require expertise in design and may struggle with high-dimensional data. In contrast, DL methods can handle high-dimensional data. DL can automatically learn complex patterns from raw text input without extensive feature engineering. This enables DL methods to overcome the curse of dimensionality faced by
perception. The Interval Type-2 Mamdani fuzzy inference fuzzy systems in processing raw email data. DL methods
system (IT2M-FIS) demonstrated superior performance, with can learn directly from raw text while handling high
an accuracy of 0.955, recall of 0.967, F-score of 0.962, and dimensionality [63]. In the following section, the use of DL

VOLUME 12, 2024 143633


E. H. Tusher et al.: Email Spam: Detection Methods, Challenges, and Open Research Problems

methods is explored further for email spam detection, as DL is well-suited to overcome limitations of fuzzy systems.

D. DEEP LEARNING
DL is an up-and-coming field that uses several nonlinear processing layers to learn features directly from the input, leveraging AI and ML. Email spam detection accuracy may be greatly improved with the help of DL methods. Deng and Yu conducted an analysis of different DL methods, categorising them into supervised, unsupervised, and hybrid deep networks based on their network structures. They also explored various applications of these techniques, including computer vision, language modelling, text processing, multimodal learning, and information retrieval [64], [65]. DL relies on representations of data that include several levels of hierarchy, often in the form of a neural network with more than two layers. Data features from a higher level can be spontaneously integrated into those from a lower level using these methods. Each neuron in a neural network (NN) shares several common characteristics. The number of neurons and their interconnections are in turn determined by the nature of the application being used [66]. In another study, Baccouche et al. introduced a multi-label LSTM model to identify spam and fraud in emails and social media posts. The model was developed by merging two datasets. The system was trained on a collective dataset of prevalent bigrams obtained from multiple sources. Their model has an accuracy of 92.7%. A limitation of the study was the absence of a comparative analysis with other sophisticated techniques for identifying harmful information. In the future, they intend to explore alternative NLP methods in order to enhance the accuracy of the model [67]. Alauthman et al. proposed an SVM and GRU-RNN approach to detect botnet spam emails, using a dataset containing spam records. According to their assertion, their method attained a precision of 98.7%. Their research was limited to assessing the efficacy of the proposed model using a single dataset. The proposed method accurately identifies spam emails, but additional investigation is required to enhance the GRU model by integrating supplementary multiclass classifiers [68]. Moreover, AbdulNabi and Yaseen conducted research on word embedding techniques for the purpose of classifying spam emails. They fine-tuned a pre-trained BERT model and compared it with a DNN and traditional classifiers such as naïve Bayes and k-NN. The proposed technique attained a 98.67% accuracy and a 98.66% F1 score when evaluated on two open-source datasets [69]. Furthermore, Eckhardt and Bagui designed a study in which they analysed LSTM and CNN methods for the purpose of classifying textual input. The investigation revealed that the LSTM method achieved the maximum accuracy of 98.32% and a ROC score of 96.57%. The comparison pertains only to the classification of textual material. They asserted that the Adam optimizer outperformed the SGD optimizer in both models. According to the study, ReLU performed better with the CNN, while sigmoid performed better with the LSTM on average [70]. Rafat et al. investigated the impact of text pre-processing on email classification using ML and DL techniques. The ML and DL algorithms were compared using the SpamAssassin corpus, both with and without text pre-processing. The researchers found that DL methods performed better than ML methods. Specifically, the LSTM method achieved a precision of 95.26%, recall of 97.18%, and an F1-score of 96% without any text pre-processing [71]. Additionally, Wen et al. introduced LBPS, a phishing scam detection model for blockchain financial security. The model is built on LSTM-FCN and BP NN. It uses a Backpropagation Neural Network (BP NN) to analyse implicit features and an LSTM-FCN NN to analyse the temporal aspects of transaction data. The experimental findings, using Ethereum data, demonstrated that the chosen characteristics effectively identified fraudulent accounts involved in phishing scams, achieving a 97.86% F1-score and a 97% accuracy rate [72]. Table 4 presents the previous research on spam detection using DL methods. DL methods undoubtedly enhance the effectiveness of the spam detection method, reduce the impact of overfitting, and handle large data.
This analysis covers the many different DL methods that can be used to detect spam in email, including models such as CNN, LSTM, and hybrid combinations of these methods. Although the results have been promising, there is a significant research gap in the development of ensemble learning techniques, which combine the strengths of many DL models to further boost performance. In addition, although many studies use publicly accessible datasets, little research applies these models to large-scale real-world datasets that could more accurately represent a variety of spam characteristics. The interpretability and explainability of DL models, which are essential for deploying spam detection systems in practice, also receive little attention. Moreover, most current research emphasises accuracy measures while ignoring other significant aspects such as processing efficiency and adaptation to increasingly sophisticated spam strategies. By addressing these deficiencies, it may be possible to develop spam detection systems that are more robust, efficient, and adaptable through the application of DL techniques.
The present review diverges from previous reviews by placing greater emphasis on reevaluating the ML, fuzzy system, and DL methods employed for detecting email spam. The review aims to discuss email spam detection methods, the parameters utilized for comparative analysis, simulation tools, and the dataset corpus. The reviewed era encompasses recent research articles that contribute to

TABLE 2. ML methods for email spam.

the progress of email spam detection systems. Different email spam detection methods exhibit varying strengths and weaknesses, influenced by factors such as dataset size and complexity. An analysis of the most effective techniques along with their internal workflows is provided in the following section.
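
The metrics quoted in this section can be cross-checked against one another, since the F1-score is the harmonic mean of precision and recall. The short Python sketch below applies that standard definition to the LSTM precision and recall reported in [71]; the function name `f1_score` and the variable `lstm_f1` are illustrative assumptions and do not come from the reviewed studies.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall reported for the LSTM model in [71], as fractions.
lstm_f1 = f1_score(0.9526, 0.9718)
print(f"{lstm_f1:.4f}")  # about 0.9621
```

The result, roughly 96.2%, agrees with the rounded 96% F1-score quoted for that study.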

III. METHODS
Email spam refers to the sending of fraudulent or undesired mass emails through either an individual's account or an automated mechanism. The prevalence of spam emails has steadily risen over the past decade, posing a widespread issue. ML and DL have significantly contributed to the identification of spam emails. Researchers are utilizing a range of methods and strategies to create innovative spam detection. This section provides an overview of the most widely used ML and DL methods that have been optimized for spam detection.

FIGURE 7. Structure of the SVM.

A. SUPPORT VECTOR MACHINE
The SVM is a supervised learning model with an associated learning method used to categorise input data. Any information fed into a computer that can be represented as a vector is considered input data. SVM's high accuracy and precision in classifying different classes of data have led to its widespread adoption [82]. It specializes in unstructured data, making it suitable for classifying both linear and non-linear datasets. Non-linear SVMs are used to categorise data received by a computing device, while linear SVMs are helpful only for certain types of data. Its benefits include its efficiency in high-dimensional settings and its adaptability [83]. The downside of this approach, however, may be the lack of transparency in the output, which makes it hard to evaluate the results [84]. Figure 7 presents the structure of the SVM.
One study proposed a novel approach to identifying spam in electronic messages using supervised ML based on Naïve Bayes or SVM. The authors tested various algorithms to see which were best at distinguishing spam from regular correspondence. NB accuracy was 95%, whereas the first method based on SVM achieved an impressive 97.5% [85]. Additionally, a comparative analysis tested many ML methods on the same data set, using accuracy and precision to evaluate the models. The accuracy of the support vector machine was 98.09% [45]. Furthermore, a comparative study examined SVM, Random Forest, and Multinomial NB as three methods of content-based e-mail spam detection. The advantages and disadvantages of the three approaches were compared in terms of their usefulness and effectiveness. The results of the experiment showed that SVM performed the best, with the other methods trailing by only a very small margin [33]. Moreover, a new online method used binary representation and linear SVMs without feature selection. Character n-gram models allow the authors to predict all features. The next strategy showed more features and yielded millions of unique 4-gram features from TREC corpora [86]. It is also crucial to recognise that most spam and valid messages use a template. An SVM-based incremental clustering algorithm was used by Haider in 2007 to identify spam and non-spam email messages based on their contents [87]. Discerning the importance of fine-tuning

TABLE 3. Fuzzy system for email spam.

classification algorithms. The optimized SVM algorithm for email spam detection is shown in Algorithm 1.

B. DECISION TREE
The DT is a popular technique for classifying data since the solution it produces is both interpretable and straightforward. Furthermore, it provides a result more quickly than other categorization techniques [88]. It is structured like a tree with a root node, branches, and leaves. The terminal node, or leaf node, represents a class attribute, and the other nodes represent potential solutions. To determine the class properties of the terminal node, the route from the root to the terminal node must be accurately traced [89]. Tracing the tuples is made significantly simpler by translating the tree into categorization rules [90]. It is common practice in the fields of data mining, machine learning, and even statistics to employ the decision tree learning method. DT learning has also been adapted to spam detection. The structure of the DT is presented in Figure 8.
A hybrid approach combining LR and DT is used for email spam identification. LR was employed to reduce the impact of noisy data or instances prior to supplying the data to DT induction. By applying a predetermined false-negative threshold, LR effectively eliminated the noisy data by selecting only the accurate predictions [91]. This study used the Spambase dataset to assess the proposed technique. The 91.67% accuracy is encouraging for the given strategy. LR may increase DT performance by minimising noisy data. GADT is a hybrid spam email detection method. PCA improved GADT's performance. Decision tree

TABLE 4. DL methods for email spam.

classification models with and without Recursive Feature Elimination (RFE) were investigated for spam detection [92]. Furthermore, a novel spam detection system was proposed to reduce false positives, that is, nonspam messages mislabelled as spam. First, wrapper-based feature selection was applied. Second, C4.5 was used to train the decision tree classifier model. Third, the cost matrix weighted false positive and false negative errors differently. The MBPSO-selected decision tree had 91.02% sensitivity, 97.51% specificity, and 94.27% accuracy [93]. Moreover, another suggested method combines particle swarm optimization with unsupervised filtering to enhance accuracy to 98.3%. Comparative analyses indicate better results than current methods [94]. The optimized DT algorithm for email spam detection is shown in Algorithm 2.

FIGURE 8. Structure of the DT.

Algorithm 1 SVM Algorithm for Email Spam Detection
1: Input: Email message x to classify
2: Input: Training set S, kernel function k, regularization parameters C = {c1, ..., cnum}, kernel coefficients γ = {γ1, ..., γnum}
3: Input: Number of folds k for cross-validation
4: for i = 1 to num do
5:   Set C = ci
6:   for j = 1 to num do
7:     Set γ = γj
8:     Train SVM classifier f(x) with parameters (C, γ) on S
9:     if first classifier then
10:      Set f*(x) = f(x) as best classifier
11:    else
12:      Compare f(x) with f*(x) using k-fold cross-validation
13:      Set f*(x) to the more accurate classifier
14:    end if
15:  end for
16: end for
17: Return spam or ham classification of x using final classifier f*(x)

Algorithm 2 DT Algorithm for Email Spam Detection
1: Input: Email message dataset D
2: Calculate entropy H(D) of full dataset
3: while stopping condition not met do
4:   for each attribute A do
5:     Calculate entropy H(D|A) for splits on A
6:     Calculate average entropy over all splits
7:     Calculate information gain Gain(A)
8:   end for
9:   Choose A with highest Gain(A) as split attribute
10: end while
11: Return DT method classifying messages as spam or ham

C. K-NEAREST NEIGHBOR
The k-Nearest Neighbor (k-NN) algorithm is one of the most popular since it is simple to use and understand. This is because its advanced features can be quickly grasped and put to use [95]. k-NN uses the computed distance between a given instance and its k nearest neighbors to determine how to categorize the instance in question. The category to which an instance belongs is decided by how many votes are cast for each class among its nearest neighbors. If k is set to two, for example, the instance is classified based on its two nearest neighbors [96]. The Euclidean distance (ED) between a specified training sample and a test sample is typically used for this purpose [97]. The classification results for k-NN vary greatly depending on the value of k chosen for the number of neighbors. A simple k-NN structure is given in Figure 9.

FIGURE 9. Structure of the k-NN.

Using the k-NN classifier, Sahin et al. developed a filtering approach to pick features for spam detection via email [98]. In another study, the experiments calculated the accuracy and F-measure of e-mail text classification using various feature selection methods, varying numbers of features, and two distance measures to determine how far apart examples


in the dataset were when executing the k-NN classifier. The percentages of success gained were 98.08% and 95.98%. They suggested an approach that combines SVM and k-NN. The determination approach they came up with uses names and proximity to a restriction on choices to determine which instances to pick. The basic idea was to find similar questions and construct a neighboring SVM that jelly the separation process on the set of similar questions [99]. Furthermore, they conducted experiments using the publicly available Dredze dataset, which demonstrated an improvement in accuracy of almost 98%. In order to combat spam, they employed k-NN text classification using Chi-squared feature selection to filter out unwanted messages. The value of k at which the k-NN classifier obtains the highest accuracy was found through experimentation [50]. Hnini et al. proposed using three Nearest Neighbour (NN) methods, namely k-NN, Wk NN, and K-d tree, to detect spam. NLP pre-processes emails and extracts features using Bag-of-words (BoW), N-gram, and TF-IDF. k-NN performed well on four measurement parameters on the Enron and LingSpam datasets [100]. Additionally, a new spam categorization method combines the Harris Hawks optimizer (HHO) and k-NN algorithms. This study found that the proposed spam detection method had the highest classification accuracy, achieving 94.3% accuracy in experiments [101]. The k-NN method for email spam detection is presented in Algorithm 3.

Algorithm 3 k-NN Algorithm for Email Spam Detection
1: Extract class labels (spam or valid) for each email in the training and test datasets
2: Set k = number of nearest neighbors
3: Load test set emails into D
4: Load training set emails into T
5: Initialize empty label set L to store classifications for test emails
6: Load training data
7: Load test data
8: for each test email d and each training email t do
9:   Initialize empty set Neighbors(d) for nearest neighbors
10:  if number of neighbors < k then
11:    Find k closest matches to d from T and add to Neighbors(d)
12:  end if
13:  if number of neighbors ≥ k then
14:    Classify test email d based on labels of k nearest neighbors in T
15:  end if
16: end for
17: Return final classifications of emails in test set D as spam or ham

D. RANDOM FOREST
The RF is a prime example of an ensemble learning strategy and regression method well suited to the solution of issues involving the categorization of data. Tin K. Ho first presented the generic random forest in 1995, and an expansion of this approach followed in 2001. There are a lot of decision trees in this method. Rather than creating each tree using the same set of features, it generates a random forest of trees whose collective prediction is more accurate than that of any one tree [102]. The approach relies on the fact that creating a simple decision tree with a limited number of features requires very little in the way of processing resources [103]. The algorithm's three primary hyperparameters are node size, tree depth, and feature sampling. A simple RF structure is given in Figure 10.

FIGURE 10. Structure of the RF.

The study of many Spam Filtering tactics and the discussion of spam message categorization using various Machine Learning algorithms for the Spambase dataset are brought to a close by the methods proposed for multiple parameters. RF has a higher accuracy (94.87%) than other Machine Learning techniques [104]. Furthermore, Cota et al. used two publicly accessible corpora. In the initial set of tests, the corpus was split into 80% for training and 20% for testing. In the second set, the split was 70% for training and 30% for testing. Using RF, the best accuracy for the input corpus was 85.25% and 86.25%, respectively. These findings are consistent with other studies [46]. On the other hand, Shrivastava et al.'s Weka implementation makes use of cross-validation and a training set. With the training set option, the same data is used for both training and evaluation. The training set is additionally split up into many folds for the purpose of cross-validation. From this implementation and experimentation, it was concluded that using a training set with a Random Tree classifier yields approximately 100% accuracy in just 0.01 seconds [105]. Moreover, Goh et al. improved performance by boosting, bagging, rotation forest, and stacking. SVM's high accuracy would be substantially harmed by tainted datasets, hence the authors suggest MLP. The algorithm with the highest AUC was RF with AdaBoost, at 93.7% [106]. Therefore, Random Tree is the most effective technique for identifying spam e-mail. The RF algorithm for email spam detection is shown in Algorithm 4.
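
The main ingredients of the RF procedure in Algorithm 4 (a random sample of the training messages, a random subset of n ≪ N features, and a majority vote over Y trees) can be illustrated with a minimal Python sketch. This is not the implementation used in any of the reviewed studies: each tree is simplified to a single-split stump chosen by Gini impurity, and the function names, the three toy keyword-count features, and the parameter values are all assumptions made for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_stump(X, y, feature_ids):
    """Return (feature, threshold, left_label, right_label) with the lowest weighted Gini."""
    best = None
    for f in feature_ids:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no split possible: fall back to a constant leaf
        majority = Counter(y).most_common(1)[0][0]
        return (None, None, majority, majority)
    return best[1:]

def train_forest(X, y, n_trees=25, n_features=2, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]          # bootstrap sample of messages
        feats = rng.sample(range(len(X[0])), n_features)  # n << N randomly chosen features
        forest.append(best_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Classify one message by majority vote over all trees."""
    votes = []
    for f, t, left_label, right_label in forest:
        if f is None or row[f] <= t:
            votes.append(left_label)
        else:
            votes.append(right_label)
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: counts of three spam-indicative keywords per message.
X = [[6, 5, 7], [8, 6, 5], [5, 9, 6], [7, 7, 8],
     [0, 1, 0], [1, 0, 1], [0, 0, 1], [1, 1, 0]]
y = ["spam"] * 4 + ["ham"] * 4
forest = train_forest(X, y)
```

With this toy data, `predict(forest, [9, 8, 9])` votes spam and `predict(forest, [0, 0, 0])` votes ham. A practical system would grow deeper trees and resample the feature subset at every node, as Algorithm 4 describes.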

Algorithm 4 RF Algorithm for Email Spam Detection
1: Input: X : number of nodes per decision tree
2: Input: N : number of features per email message
3: Input: Y : number of decision trees to train
4: while termination conditions not met do
5:   Randomly select a sample S of email messages from the training data
6:   Grow decision tree Rt from S, with maximum depth X
7:   Randomly select n features to split on at each node, n ≪ N
8:   Compute optimal split point over the n features
9:   Split node into two child nodes based on optimal split
10: end while
11: Add decision tree Rt to forest
12: repeat
13:   Repeat steps 5-9 until maximum number of nodes X reached
14: until Y times to grow forest of Y trees
15: for new email message do
16:   Pass down each of the Y trees to reach a leaf node
17:   Classify email as spam or not based on leaf node majority vote
18: end for
19: Return final classification spam or ham

E. NAÏVE BAYES
The NB classification is both a supervised learning method and a statistical approach to classification. It serves as an underlying probabilistic model and captures uncertainty about the model in a principled way by determining the probabilities of the outcomes. Diagnostic and predictive problems can be solved with its help. The method is named after Thomas Bayes, who formulated what is now known as Bayesian analysis [107]. The classification provides useful learning algorithms, and both prior knowledge and newly observed data can be combined. NB classification also provides a useful perspective for understanding and evaluating various learning algorithms [108]. The algorithm is robust to noise in the input data and is capable of calculating explicit likelihoods for hypotheses. A simple NB structure is given in Figure 11.

FIGURE 11. Structure of the NB.

Sahami first proposed the NB algorithm for spam detection, and it has since found widespread use in commercial spam filters and open-source spam classifier implementations thanks to its high accuracy in binary classification and its straightforward implementation. The researchers initially applied the NB method to the spam filtration problem in 1998, where the problem of message classification was examined within a statistical decision theory framework [109]. The Bayesian framework is considered superior to other algorithms because it integrates evidence from a variety of sources. By employing information gain to select binary features, the researchers conducted experiments on two private corpora. The results of these experiments showed that the inclusion of non-textual elements enhanced the classifier's capability to categorise the messages. This finding provides support for the NB filter's capacity to maintain low false positive rates. A novel method was also proposed in which incoming emails' URLs (links) were categorised using an NB model [110]. Furthermore, this filter used delayed feedback to be periodically refreshed with all of the messages that had been classified by the system, not just the ones that had been erroneously classified. Both spam and non-spam emails with at least one URL were included in the experiment's private corpus. It was determined that the system provided the same level of performance as other URL or keyword based filters, with the exception that this model did not necessitate maintaining a blocklist or white list corresponding to the URLs, making it fully automated [111]. Moreover, Ciltik et al. designed and evaluated the approaches under two models: a class general model and an email specific model. When the two models are integrated, the latter is used in situations where the former fails. However, the proposed system and the techniques created are universal and can be used with any language. Extensive testing was conducted, and results showed a success rate of roughly 98% for Turkish and 99% for English. Time complexity has been demonstrated to be greatly decreased without impacting performance [112]. Another study considered spam emails' influence on privacy and productivity. It used NB, SVM, and RF classifiers to screen spam emails. NB algorithms reliably recognised and classified spam and unwanted emails, with accuracy rates up to 98.8% [113]. The algorithm for email spam detection using NB is presented in Algorithm 5.
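
The token-level scoring in Algorithm 5 can be sketched in a few lines of Python. This is a simplified illustration, not a full NB posterior computation and not the authors' code: the per-token spam probability follows step 5 of Algorithm 5, while the way token evidence is accumulated (adding S(T) and subtracting 1 − S(T) for every known token), the whitespace tokenizer, the zero threshold, and all function names and training messages are assumptions made for this example.

```python
from collections import Counter

def train_token_probs(messages):
    """Estimate S[W] = C_spam(W) / (C_spam(W) + C_ham(W)) for every token W.

    messages: iterable of (text, label) pairs, where label is 'spam' or 'ham'.
    """
    spam_counts, ham_counts = Counter(), Counter()
    for text, label in messages:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        (spam_counts if label == "spam" else ham_counts).update(tokens)
    return {
        w: spam_counts[w] / (spam_counts[w] + ham_counts[w])
        for w in set(spam_counts) | set(ham_counts)
    }

def classify(text, probs, threshold=0.0):
    """Accumulate a filtering signal over known tokens and compare it to a threshold."""
    score = 0.0
    for tok in text.lower().split():
        if tok in probs:                 # unseen tokens carry no evidence here
            s = probs[tok]
            score += s - (1.0 - s)       # spam evidence minus ham evidence
    return ("spam" if score > threshold else "ham"), score

# Hypothetical training messages, for illustration only.
train = [
    ("win free prize now", "spam"),
    ("free offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch meeting now", "ham"),
]
probs = train_token_probs(train)
print(classify("free prize now", probs)[0])         # spam
print(classify("monday meeting agenda", probs)[0])  # ham
```

Tokens seen only in spam receive probability 1.0 and tokens seen only in ham receive 0.0, so a production filter would normally add smoothing to avoid such extreme estimates.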

Algorithm 5 NB Algorithm for Email Spam Detection
1: Input email message dataset
2: for each email message do
3:   Split message into individual component tokens
4:   for each token do
5:     Calculate spam probability S[W] = Cspam(W) / (Cspam(W) + Cham(W))
6:     Store spam probability values in database
7:   end for
8: end for
9: for each email message M do
10:  Initialize spam score I[M] = 0
11:  while not end of message do
12:    Get next token Ti
13:    Query database for spam probability S(Ti)
14:    Update message's spam probability S[M]
15:    Update message's ham probability H[M]
16:    Compute message filtering signal:
17:    I[M] = I[M] + S[M] − H[M]
18:    if I[M] > threshold then
19:      Label message as spam
20:    else
21:      Label message as ham
22:    end if
23:  end while
24: end for
25: Return final email classification as spam or ham

F. ARTIFICIAL NEURAL NETWORK
ANN is a computational approach that draws inspiration from the structure and functioning of biological neural networks, such as the human brain. An ANN is composed of interconnected artificial neurons organised in layers. The input neurons receive data, the hidden neurons process information, and the output neurons generate results [114]. The power of ANN stems from the connections between these neurons, which have adjustable weights that are tuned during training [115]. By dynamically adapting the weights to match input and output values from the training data, ANN can approximate the mapping function representing relationships in the data. Information flows through the network hierarchy starting from the input layer [116]. Each neuron's activation is determined by the input data, connection weights, and the activation function, which manages how inputs are transformed [117]. A simple architecture of the ANN is presented in Figure 12.

FIGURE 12. Structure of the ANN.

Zhan and his team conducted research on spam classification using the ANN method. Their approach utilises descriptive qualities of the evasive patterns employed by spammers, rather than relying on the context or frequency of terms in the message. Over several months, the researchers compiled a dataset consisting of 2788 legitimate and 1812 spam emails to train and evaluate their model [118]. Additionally, spam email poses challenges because it wastes Internet traffic and enables phishing and malware attacks. To address this, a feature selection-based strategy employing the sine-cosine algorithm (SCA) to optimize ANN for spam detection was proposed. Experiments showed the suggested ANN classifier surpassed other methods, achieving accuracy, precision, and sensitivity of 97.92%, 98.64%, and 98.36%, respectively [119]. In another study, an ANN tuned using the Grasshopper Optimization Algorithm (GOA) is used to create a new method for email spam identification. The suggested GOA-ANN method outperforms traditional methods in experiments, achieving 94.25% accuracy in classifying spam. The research shows how bio-inspired algorithms, like GOA, can be used to improve ANN learning for better spam detection [120]. Furthermore, the challenges of constructing efficient ANN structures and tuning parameters for spam detection are examined. A hybrid model combining a genetic algorithm (GA) with an ANN is proposed to optimize spam detection capabilities. Experiments showed the hybrid ANN-GA model performs better in spam detection than conventional ANN methods [121]. Despite taking longer to train, neural networks can classify new patterns and tolerate noisy data. The algorithm for email spam detection using ANN is presented in Algorithm 6.

G. CONVOLUTIONAL NEURAL NETWORK
As a type of DL method, CNN has recently risen to prominence in the field of computer vision and is gaining attention in other areas, such as defending against email spam. In recent years, CNNs have been a popular topic of study. CNN is useful because it can handle errors well, process information in parallel, and learn on its own. It has been used in the area of email spam filtering with great success. CNNs were described by Albelwi as a type of DL that is based on biology [122]. The network's neurons have sparse local connectivity and shared weights. A CNN is constructed by stacking multiple trainable layers on top of each other. This is then followed by a supervised classifier and a collection of arrays known as feature maps, which represent the input and output of each layer. A typical CNN consists of several layers, such as a


convolutional layer, a pooling layer, and a fully connected layer. The use of multiple layers in a CNN enables the automatic learning of highly discriminative feature representations, eliminating the need for manually engineered features [123]. A conventional backpropagation neural network (BPN) operates on individual, manually crafted features, whereas a CNN is designed to extract valuable and essential characteristics from an email in order to classify it. A simple architecture of the CNN is presented in Figure 13.

FIGURE 13. Structure of the CNN.

Algorithm 6 ANN Algorithm for Email Spam Detection
1: Input sample email message dataset
2: Initialize method parameters w (weight vector) and b (bias term) randomly or to 0
3: repeat
4:   Get a training message sample (x, c) that the current method misclassifies, i.e. sign(w^T x + b) ≠ c
5:   if no such misclassified sample exists then
6:     Training completed, store final w and b and stop
7:   else
8:     Update parameters:
9:       w = w + c · x
10:      b = b + c
11:    Go to step 4
12:  end if
13: until no misclassified sample remains
14: To classify new email message x:
15:   Compute sign(w^T x + b)
16:   Return email message classification (spam or ham)

A comparison of SMS spam detection using DL classifiers, AI, and CNN was performed in [124]. The CNN achieved the best accuracy, 99.10% and 98.25% on SMS Spam Collection v.1 and Spam SMS Dataset 2011-12, respectively. In another study, spam and ham text messages from the SMS Spam Collection dataset were categorized using CNN and Long Short-Term Memory (LSTM) models, which extracted and categorized the vectors. Three CNN layers with dropout yielded 99.44% accuracy [20]. Moreover, Gupta et al. studied the efficacy of eight different classifiers and compared their results. The evaluation shows that the CNN classifier achieves a maximum precision of 99.19% and an average recall of 99.26% and 99.94%, respectively, across the two datasets [125]. A CNN method was also developed for SMS spam detection using the Tiago dataset. After preprocessing the text data, including tokenization and stop-word removal, the CNN achieved 98.40% accuracy in classifying messages as spam or not spam. The work provides a highly accurate CNN architecture and process for SMS spam detection [126]. Another study analyses images using a CNN and compares the findings to other ML methods. The CNN-based methodology detects real-world image spam and challenging image spam-like datasets better than earlier methods by using a new feature set that mixes raw images and Canny edges [127]. The algorithm for email spam detection using CNN is presented in Algorithm 7.

Algorithm 7 CNN Algorithm for Email Spam Detection
1: Input Email Message
2: Input parameters
3: file ← getting()
4: label ← getlabel(file)
5: test ← gettest(file)
6: vec ← getword2vec()
7: random(label)
8: while condition do
9:   Nf_CV(len(Xshuffle), nf)
10:  for trindex, teindex in kf do
11:    Xtotal, vtotal ← xshuffle[trindex], yshuffle[trindex]
12:    Xtrain, Xdev, vtrain, vdev ← split(Xtotal, vtotal)
13:    for j < N do
14:      get conv()
15:      h ← sigmoid(conv)
16:      N ← getk()
17:      tensorr ← gettensor()
18:      for X_v in Xtrain, vtrain do
19:        value, indice ← topk(tensorr)
20:        tensors_get(value, indice)
21:        tensora_append(tensors)
22:      end for
23:    end for
24:    con(tensorp)
25:    con_sigmoid(con)
26:    get softmax(conn)
27:    if getdev() then
28:      tr ← false
29:    end if
30:  end for
31: end while
32: return Final Email Message Detection (Spam or Ham)
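Algorithm 7 is stated at a high level, so the following is only a minimal sketch of its core operations (roughly steps 14 to 19): a one-dimensional convolution over a sequence of word vectors, a sigmoid activation, and top-k pooling. The function names `conv1d_sigmoid` and `top_k_pool`, the single hand-set filter, and the toy two-dimensional word vectors are assumptions made for illustration and are not taken from the reviewed papers; a real model would learn many filters and pass the pooled values to a softmax layer.

```python
import math


def conv1d_sigmoid(word_vectors, kernel, bias=0.0):
    """Slide one filter over a sequence of word vectors and apply a sigmoid.

    word_vectors: list of equal-length float lists, one per token.
    kernel: list of weight vectors, one per position in the sliding window.
    Returns one activation in (0, 1) per window position.
    """
    width = len(kernel)
    activations = []
    for start in range(len(word_vectors) - width + 1):
        total = bias
        for offset in range(width):
            weights = kernel[offset]
            vector = word_vectors[start + offset]
            total += sum(w * x for w, x in zip(weights, vector))
        activations.append(1.0 / (1.0 + math.exp(-total)))
    return activations


def top_k_pool(values, k):
    """Keep the k largest activations, preserving their original order."""
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    keep = sorted(ranked[:k])
    return [values[i] for i in keep]


# Toy input: four tokens embedded in two dimensions (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter spanning two consecutive tokens (hand-set, not learned).
filter_weights = [[1.0, 0.0], [0.0, 1.0]]

feature_map = conv1d_sigmoid(tokens, filter_weights)
pooled = top_k_pool(feature_map, k=2)
```

Top-k pooling keeps a fixed number of activations regardless of message length, which is one way to turn variable-length emails into a fixed-size input for the final classifier.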

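The training loop in Algorithm 6 (the ANN algorithm given earlier) is the classical perceptron update rule. A minimal Python sketch follows, assuming labels c = +1 for spam and c = -1 for ham and a tiny hand-made feature set (counts of the words "free" and "meeting"). The `max_epochs` cap is an addition for safety, since the loop in Algorithm 6 only terminates when the data are linearly separable; all function and variable names are illustrative.

```python
def sign(value):
    """Return +1 for non-negative scores and -1 otherwise."""
    return 1 if value >= 0 else -1


def score(w, b, x):
    """Compute w^T x + b for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_perceptron(samples, max_epochs=100):
    """Train on (x, c) pairs, where c is +1 (spam) or -1 (ham)."""
    w = [0.0] * len(samples[0][0])  # step 2: initialise weights to 0
    b = 0.0                         # step 2: initialise bias to 0
    for _ in range(max_epochs):     # step 3: repeat
        mistakes = 0
        for x, c in samples:
            if sign(score(w, b, x)) != c:                   # step 4
                w = [wi + c * xi for wi, xi in zip(w, x)]   # step 9
                b += c                                      # step 10
                mistakes += 1
        if mistakes == 0:           # steps 5-6: nothing misclassified
            break
    return w, b


def classify(w, b, x):
    """Steps 14-16: map the sign of the score to a label."""
    return "spam" if sign(score(w, b, x)) == 1 else "ham"


# Hypothetical features: [count of "free", count of "meeting"].
training = [
    ([2.0, 0.0], 1),
    ([3.0, 1.0], 1),
    ([0.0, 2.0], -1),
    ([1.0, 3.0], -1),
]
w, b = train_perceptron(training)
```

Calling `classify(w, b, [4.0, 0.0])` on this toy data labels a message dominated by "free" as spam; the same function applied to `[0.0, 5.0]` labels it ham.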

H. LONG SHORT-TERM MEMORY
LSTM is an advanced RNN for sequence modeling. RNNs work by remembering earlier information and using it to process the current input [128]. RNNs with traditional architectures have a recurring problem: because of the phenomenon known as the vanishing gradient, they are incapable of retaining and recalling long-term dependencies. LSTM is specifically designed to mitigate the problems related to long-term dependencies [129]. The default behavior of LSTM is to learn long-term dependencies by memorizing information over lengthy periods of time. LSTM employs gates to regulate information flow in recurrent computations. This type of recurrent neural network was designed in 1997 to deal with temporal data sequences and to solve the problem of exploding and vanishing gradients [130]. The network includes a memory cell that can hold values recorded over time in relation to previous information. The memory cell is controlled by three gates, each of which serves a different function. The forget gate is responsible for determining whether the information from the previous timestamp should be retained or disregarded. The input gate is responsible for acquiring fresh information from the input [131]. The output gate sends the new information from the current timestamp to the next. Gating is accomplished via a sigmoid function, which returns a number between zero (“totally forget”) and one (“completely keep”). Every time an LSTM network is activated, it creates two states: a cell state that is passed to the next time step, and a hidden state, which is the time step’s output vector. A simple architecture of the LSTM is presented in Figure 14.

FIGURE 14. Structure of the LSTM.

Since their introduction, several DL-based spam detection algorithms have been proposed. Yang and his team outlined an email classification system called Multi-Modal Architecture with Model Fusion (MMA-MF). The primary focus of this model is to identify spam by processing the email’s text and images independently using an LSTM model and a CNN model, respectively. An LSTM model is used to determine the likelihood that an email is spam based on its textual content. Meanwhile, a CNN model is used to determine the spam likelihood based on any attached images [77]. In another study, a combined model using an LSTM, LR, NB, RF, k-NN, SVM and DT was tested on the UCI SMS spam collection dataset with various embedding techniques (count vectorizer, TF-IDF vectorizer and hashing vectorizer). The highest accuracy of 98.5% was achieved by the LSTM method in this combined architecture [78]. Moreover, a Semantic LSTM (SLSTM) was proposed for spam SMS detection and classification using the SMS Spam Collection dataset and a Twitter dataset. The SLSTM incorporates a semantic layer into an LSTM network using Word2Vec word embeddings. Experiments showed that the proposed SLSTM technique achieved accuracy of 99.01% on the SMS Spam Collection dataset and 95.09% on the Twitter dataset [132]. Furthermore, a lightweight GRU (LG-GRU) was employed instead of the LSTM layer for spam classification on the SMS Spam Collection dataset. To improve the semantic understanding of the SMS text inputs, external information from WordNet was incorporated. Compared to LSTM models, the proposed LG-GRU model drastically reduced training time and the number of parameters, while maintaining 99.04% accuracy for spam categorization [79]. Additionally, RNNs are one type of NN that can remember past data but suffer from vanishing and exploding gradient issues. To overcome this drawback, one proposed system uses the Spambase and Ling Spam datasets to classify spam and ham emails with an LSTM architecture. The LSTM keeps track of prior email information and learns to select relevant features while ignoring irrelevant ones for identifying spam. Experiments showed that the LSTM method achieves 97.4% accuracy, outperforming other DL methods on these datasets [80]. Moreover, spam emails are used for propaganda, advertising, and phishing, which can financially and morally harm internet users as well as disrupt internet traffic. To address this issue, one study detected spam emails in a Turkish dataset with 100% accuracy using the Keras library and an LSTM method. The results demonstrated that an LSTM-based method was highly effective for spam detection in Turkish emails [133]. Furthermore, spam emails cause issues like network disruption and cybercrime. A sentiment analysis-friendly spam mail detection method was proposed using word embedding techniques including Bag of Words, Hashing, and an LSTM method. Experiments on a dataset of 5,572 messages showed that the proposed technique achieved 93-98% in precision, recall, F1-score, and accuracy [134]. The algorithm for email spam detection using an LSTM is presented in Algorithm 8.

I. GATED RECURRENT UNIT
GRU is an RNN variant that employs gating mechanisms to solve the vanishing gradient problem by controlling information flow between cells in the neural network. Kyunghyun Cho introduced the GRU network in 2014; this RNN is very similar to the LSTM neural network [135]. The structure of the GRU allows it to effectively capture dependencies from large


sequences of data in a flexible manner, while retaining knowledge from prior sections of the sequence. The GRU model consists of two gating mechanisms: the update gate and the reset gate [136]. This neural network uses only one hidden state to retain both long-term and short-term memory. The reset gate is calculated from the hidden state of the previous time step and the input data of the current time step; it controls the integration of new input with existing memory [137]. The update gate determines how much of the previous state is kept. This is extremely useful since the method may choose to carry all previous information forward, which removes the risk of vanishing gradients. The gating is accomplished via a sigmoid function, which returns a number between 0 and 1. Because of this simple architecture, the network is able to train rapidly [138]. A simple architecture of the GRU is presented in Figure 15.

FIGURE 15. Structure of the GRU.

Algorithm 8 LSTM Algorithm for Email Spam Detection
1: Input Email Spam dataset
2: Convert the text data into numerical vectors using word embeddings
3: Split the data into training and testing
4: Define LSTM architecture
5: Set the LSTM units and hidden layers
6: Add an embedding layer to convert numerical vectors into word embedding
7: Add dropout
8: Add dense output layer using sigmoid
9: Compile with binary cross-entropy
10: Train the method with specified epochs
11: Evaluate the method
12: Predict the email message (spam or ham)

Email spam detection poses a sequence modeling problem well suited to GRU. A GRU-based architecture for spam detection would process the email text sequentially, encoding each word into a hidden state vector. The gating units in the GRU regulate the flow of information, learning to identify key words and phrases that serve as indicators of spam or legitimate emails [139]. Additionally, as the GRU progresses through the email text, its hidden state captures relevant context and sequentially indicates whether the message is likely to be spam or not. The ability of GRUs to selectively propagate relevant information while processing variable-length sequences makes them a promising approach for modeling email text for spam detection [70]. Moreover, a new DL approach uses CNN and RNN to analyze email communication by classifying message components into zones. The method uses GRU-CRF to segment emails into zones like header, quotation, greeting, and body. Experiments show that the technique achieves 98% accuracy on zone prediction, outperforming traditional methods, with improved adaptability and resilience [140]. Furthermore, a lightweight GRU (LG-GRU) was employed instead of an LSTM layer for spam classification on the SMS Spam Collection dataset. To improve the semantic understanding of the SMS text inputs, external information from WordNet was incorporated. Compared to LSTM models, the proposed LG-GRU model drastically reduced training time and the number of parameters, while maintaining 99.04% accuracy for spam categorization [79]. The algorithm for email spam detection using GRU is presented in Algorithm 9.

Algorithm 9 GRU Algorithm for Email Spam Detection
1: Input Email Spam dataset
2: Convert the text data into numerical vectors using word embeddings
3: Split the data into training and testing
4: Define GRU architecture
5: Set the GRU units and hidden layers
6: Add an embedding layer to convert numerical vectors into word embedding
7: Add dropout
8: Add dense output layer using sigmoid
9: Compile with binary cross-entropy
10: Train the method with specified epochs
11: Evaluate the method
12: Predict the email message (spam or ham)

J. BIDIRECTIONAL LONG SHORT-TERM MEMORY
Bi-LSTM builds on the standard LSTM architecture to model sequential data more effectively. In contrast to traditional LSTMs that process inputs in only the forward direction, Bi-LSTMs also process the sequence in reverse [141]. This bidirectional approach provides complete past and future context to the method. The Bi-LSTM is composed of two LSTM layers. One layer processes the input sequence in the forward direction, from beginning to end. The other layer processes the input sequence in the reverse direction, from end to beginning [114]. The outputs from both directions are concatenated at each time step to generate the final output. This allows the method to preserve contextual information from the entire sequence


when making predictions [142]. A simple architecture of the Bi-LSTM is presented in Figure 16.

FIGURE 16. Structure of the Bi-LSTM.

The task of email spam detection involves constructing models that capture the contextual information of words inside an email, so as to determine whether the email’s content should be classified as spam or not. The Bi-LSTM model is well suited to this task because of its ability to capture both semantic and syntactic links between words. This is achieved by processing the email content in both the forward and backward directions [143]. Additionally, a new DL model was proposed for email spam detection using sentiment analysis of email text, combining word embeddings, CNN, and Bi-LSTM networks to analyze textual and sequential properties. Evaluated on two spam datasets, the method achieves improved accuracy of 98-99% and outperforms popular classifiers and state-of-the-art methods [144]. Moreover, spam emails are becoming more common and troublesome as email usage grows, so there is a need for effective methods to detect spam. A recent study compared different ML and DL models, such as RF, NB, ANN, SVM, LSTM, and Bi-LSTM, for the task of identifying spam emails. The study found that Bi-LSTM had the best accuracy, 98.57%, for spam prediction [145]. Furthermore, spam text messages steal information from users and harm them, but the methods available for finding them are not good enough. Vectorization-based feature engineering and Bi-LSTM networks can be used together to build an effective predictor of spam SMS. Experiments showed that the method is more accurate than other methods in terms of precision, recall, and F1 measures [146]. The algorithm for email spam detection using Bi-LSTM is presented in Algorithm 10.

Algorithm 10 Bi-LSTM Algorithm for Email Spam Detection
1: Input Email Spam dataset
2: Convert the text data into numerical vectors using word embeddings
3: Split the data into training and testing
4: Define Bi-LSTM architecture
5: Set the Bi-LSTM units and hidden layers
6: Add an embedding layer to convert numerical vectors into word embedding
7: Add dropout
8: Add dense output layer using sigmoid
9: Compile with binary cross-entropy
10: Train the method with specified epochs
11: Evaluate the method
12: Predict the email message (spam or ham)

The LSTM model has proven to be the most effective for email spam detection due to its specialized architecture designed for sequential data. Emails are inherently sequential, consisting of words and sentences in a specific order, which aligns well with LSTM’s strengths. The model’s memory cell excels at capturing long-term dependencies and contextual information, allowing it to learn patterns and relationships between words or tokens in email sequences. This ability to retain and process contextual information over many time steps is crucial for spam detection, as important clues may be spread throughout the email body. LSTM’s adaptability to various writing styles and content types further enhances its effectiveness across different datasets and evolving spam techniques.

To further improve LSTM’s accuracy in email spam detection, several modifications can be considered. Incorporating attention mechanisms could help the model focus on the most relevant parts of an email. Ensemble methods, combining LSTM with other models, could leverage the strengths of different approaches. Transfer learning, by pre-training the LSTM model on a large corpus of email data, could enhance performance, especially when dealing with limited labeled data. Additional strategies such as feature engineering, regularization techniques, hierarchical LSTM structures, and character-level input processing could also contribute to improved accuracy.

Furthermore, numerous evaluation metrics have been employed to measure the effectiveness of these LSTM models. Some frequently used metrics in the papers we have reviewed are:

Accuracy: Accuracy is one measure for rating classification models. It is the proportion of predictions that the method got right. For binary classification, accuracy can also be expressed in terms of positives and negatives, as shown below:

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives.

Precision: Precision can also be used to judge how well a classification system works. It is found by dividing the number of true positives by the sum of true positives and false positives, and shows the share of truly positive cases among all positive predictions.

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{2} \]

Recall: Recall is a quantitative measure that indicates the proportion of instances correctly identified by the method among all actual positive instances. It is the ratio of true positive cases to the sum of true positive and false


negative cases.

\[ \text{Recall} = TPR = \frac{TP}{TP + FN} \tag{3} \]

F1-score: The F1-score is the harmonic mean of precision and recall, combining both into a single measure.

\[ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4} \]

IV. DATASETS COLLECTION AND PRE-PROCESSING
A. DATASETS
The collection of data samples contained within a corpus plays a pivotal role in evaluating the efficacy of any spam detection technique. While there exist several conventional datasets that are commonly used to assess text classification, only recently have researchers publishing new spam detection methodologies made an effort to provide public access to the corpora of emails used to assess the effectiveness of their proposed methods. The publicly released spam email datasets referenced across the studies covered in this paper are summarized in Table 5. Each corpus has unique traits and labeling that ultimately dictate the generalizability and comparability of experimental outcomes for every published approach using that data source. Key dimensions that characterize an evaluation dataset include the number of emails and the class balance between spam and ham samples.

The vast majority of features used to distinguish spam from legitimate emails are found in textual content. Applying appropriate pre-processing to standardize, clean, and filter this text data is a foundational data wrangling step prior to method development. The following sub-section provides the details of pre-processing techniques.

B. PRE-PROCESSING TECHNIQUES
Before data can be analyzed, it must be prepared through a process called preprocessing. Raw datasets often contain inconsistencies like missing values, duplicate entries, and text in incompatible formats that methods cannot interpret. Preprocessing transforms messy raw data into a clean form that analytical methods can work with effectively. This crucial step improves the accuracy of later analysis. Common preprocessing tasks include handling incomplete data, standardizing text into numerical forms, extracting informative features, and removing noise. Careful preprocessing allows methods to discover more robust patterns and make better predictions. The most commonly used preprocessing techniques for email spam detection are given below:

1) HANDLING MISSING VALUES
The management of missing values in datasets is a key component in preventing bias and ensuring that methods continue to produce accurate results. There are a number of approaches that can be used, including the elimination of records with missing values or the substitution of such values with the mean, the median, or a specified value.

C. TEXT PREPROCESSING
Text preprocessing transforms raw text data into a cleaner form before analysis. Removing extraneous elements allows more accurate feature extraction and model development further downstream. Preprocessing is thus an essential first step when working with text data. Common text cleaning tasks include stripping punctuation, deleting HTTP links, eliminating special characters, removing stop words, lowercasing all text, correcting spellings, and more. Numerous text-preprocessing techniques exist for the purpose of eliminating unnecessary information from incoming text input, as shown in Figure 17.

FIGURE 17. Various text preprocessing techniques.

1) STEMMING
Stemming seeks to simplify text analysis by stripping words down to their base form. Tools match terms like “drunk”, “drink”, and “drank” to their core stem, “drink”. This normalization groups together different inflections, allowing more generalized patterns to emerge. Stemmers remove suffixes systematically using rule-based algorithms like the popular Porter stemmer in Python’s NLTK library. However, stemming risks both under-stemming and over-stemming textual data. Under-stemming fails to fully reduce related terms down to one stem, while over-stemming collapses unrelated words onto the same stem.

2) TOKENIZATION
Tokenization splits text into discrete units for analysis. First, extraneous characters like HTML and punctuation are filtered out. Then words and numbers are extracted into individual tokens by splitting on whitespace and symbols. These atomic elements can be manipulated, counted, classified and more. Tokenization forms the basis for quantitative text analysis. This preprocessing step makes linguistic features accessible using Python’s regular expression library and Natural Language Processing toolkits. Proper tokenization increases performance


on tasks ranging from sentiment classification to document summarization. Table 6 shows a sample sentence and its associated tokens.

TABLE 5. Publicly available email spam datasets.

3) STOPWORDS REMOVAL
Stop words are common filler words that carry little meaning, such as “a”, “an”, “so”, “and”, and “the”. Though frequently occurring, these terms contribute more noise than signal during text analysis. Filtering out stop words shrinks datasets down to a more meaningful vocabulary. Most text analysis toolkits, such as Python’s NLTK library, provide standard stop word lists and functions to strip them out effortlessly. Table 7 presents the descriptions and web URLs of several libraries and packages that are available for preprocessing text data.

4) NORMALIZATION
Normalization transforms text into a standard format to enhance analysis. This preprocessing step structures messy linguistic data by correcting variant spellings, coercing case and tense, resolving contractions, converting numbers to numerals, transliterating terms, aligning related words to a root form via stemming and lemmatization, and more.

5) LEMMATIZATION
Lemmatization maps words to their root form using lexical analysis. It relies on dictionaries and knowledge of morphology to connect related terms to the same base lemma. For example, the words “plays”, “playing”, and “played” all share the lemma “play”. Lemmatizers can thus group together different inflections and variants by canonicalizing them to their common origin. Tools like NLTK’s WordNet Lemmatizer use semantic databases to correctly resolve words to their underlying lemma based on context. Properly deploying lemmatization avoids incorrectly collapsing unrelated words while clustering together meaningful word associations, boosting performance on semantics-sensitive tasks.

D. FEATURE EXTRACTION TECHNIQUES
Feature extraction converts unstructured text into quantitative data amenable to modeling, by transforming documents into numerical vectors. Common methods calculate Term Frequency-Inverse Document Frequency (TF-IDF) weights, build Bag of Words (BoW) representations, count N-gram patterns, encode syntactic parse trees, apply topic modeling algorithms like Latent Dirichlet Allocation, or ingest word vectors (Word2Vec). Robust text analytics combines multiple feature extraction methods to fully capture linguistic complexity within interpretable data structures.

Spam is a major issue in current email communication, stemming from motives like advertising and fraud. To effectively detect spam, appropriate preprocessing techniques are needed, such as removing noise, taking out common stop words, stemming, lemmatization, and adjusting term frequencies. Mallampat et al. proposed a multi-modal system (MMA FM) that uses a combined method (IMTF-IDF+Skip-thoughts) and a CNN to extract features. This achieves superior 99.16% accuracy in identifying spam compared


to using Naive Bayes, when tested on the Enron, Dredze, and TREC 2007 datasets [161]. Saini et al. introduced a new method for predicting email spam that uses a random forest for feature extraction. The features extracted by the random forest are then fed into a logistic regression method, which predicts whether an email is spam [162]. Cheng et al. presented a new attack method that strategically modifies text data using insights from adversarial examples. It intentionally alters the features that represent an email. They explored different feature extraction techniques using various NLP methods. Their study designs effective mechanisms to translate adversarial perturbations back into “magic words” in the text. This causes intentional misclassifications across multiple datasets and ML methods under white-box, gray-box and black-box attack scenarios [163]. Hassan et al. tested different feature extraction techniques along with two supervised ML classifiers on two public spam email datasets. They emphasized the importance of finding the optimal pairing of feature extraction and classification method, and highlighted the benefits of testing on different datasets. SVM and NB showed impressive accuracy with TF-IDF, reaching over 99% and around 98%, respectively [164]. Table 8 presents the previous research on spam detection using feature extraction techniques.

TABLE 6. A representation of a sentence and the tokens it automatically generates.

TABLE 7. Pre-processing tools for text.

1) BAG OF WORDS (BOW)
The BoW representation is a simple yet powerful approach for extracting numeric text features. This method counts the occurrences of words within a document while disregarding grammar and word order. Documents become vectors denoting the frequency of terms like “cat”, “tree”, and “slept”. Bag-of-words thus efficiently quantifies unstructured text as matrices tallying vocabulary. Many extensions enrich this basic technique, like n-grams counting multi-word expressions and skip-grams sampling non-contiguous patterns. For instance, Barushka et al. detected deceptive hotel reviews on TripAdvisor by representing documents as n-gram frequencies and skip-gram embeddings to train machine learning classifiers. Bag-of-words style features enable effective text analysis despite ignoring complex linguistic structure [170]. The flexibility of multiple vocabulary quantification strategies enables customized feature engineering for tasks ranging from spam detection to sentiment analysis across domains.

2) ONE HOT ENCODING
One-hot encoding transforms text into numeric features by assigning each unique word or token its own binary vector. Documents are represented as bags of these orthogonal one-hot vectors: sparse yet unambiguous codes with a single “1” marking the presence of each distinctive term. One-hot encoding matrices efficiently quantify textual data, with vector lengths equal to vocabulary size rather than the longer original raw text. By indexing words into binary indicator columns, this method facilitates quantitative analysis while retaining the ability to map patterns back to original tokens. One-hot encoding forms the input for many machine learning algorithms, often outperforming methods lacking explicit word-level encoding. The simplicity of tallying vocabulary into orthogonal dimensions makes one-hot representation a widely useful feature extraction technique for textual data.

3) WORD EMBEDDING
One-hot encoding scales poorly to large vocabularies due to its explosion of sparse binary features. Embedding methods address this weakness through distributed representation. Word embeddings map language into compact dense vectors capturing similarities between related terms. For instance, vectors for “cat” and “kitten” cluster together, unlike their orthogonal one-hot encodings. This efficiency facilitates DL on extensive corpora. Embeddings also encode meaning: algebraic operations reveal relationships like “king is to queen as man is to woman”. Created using neural networks, embeddings represent both syntax and semantics within a low-dimensional subspace. Methods learn contextual associations, quantifying elusive concepts like gender or formality. Versatile representations power cutting-edge applications from chatbots to search. Embeddings distill enormous dictionaries into meaningful, manipulable codes, advancing the frontiers of text mining.

4) WORD2VEC
This method turns words into vectors and works like a two-layer network that processes text made up of words. Every word in the corpus has a corresponding vector in the space. Word2vec uses either a continuous skip-gram or a continuous bag of words (CBOW) design. In the continuous skip-gram, the current word is used to predict the words that surround it. In the CBOW method, on the other hand, the surrounding or neighboring words are used to predict the middle word. With a small amount of training data, the skip-gram method can represent even rare words or phrases well. However, the CBOW method is many times faster to train and is a little more accurate for common words. The word2vec method is attractive because it learns high-quality word embeddings in less time and space. From a much larger body of text, it is possible to learn bigger embeddings (with more dimensions).

5) N-GRAMS
Many Natural Language Processing (NLP) tasks use N-grams, which are contiguous sequences of n words or tokens in a text. Based on the value of n, they are divided into different groups, such as unigrams, bigrams, and trigrams. Kanaris et al. used a set of 2,893 emails to extract n-gram features from text. In their study, they looked at performance measures like spam recall and precision. Combining SVM with n-grams, they were able to build a spam filtering method with an accuracy score of more than 0.90 for finding spam [171]. Table 9 below shows several examples of N-grams.

7) GLOVE WORD EMBEDDING
This is an unsupervised method that generates a vector to represent words or text. It aims to capture the semantic and contextual meaning of words. It is a count-based method that uses co-occurrence statistics of words in a corpus. Specifically, it trains on the non-zero entries in a word-context co-occurrence matrix. The key intuition behind GloVe word embedding is that ratios of word-word co-occurrence probabilities can encode meaning. Equation 7 demonstrates the computation of the co-occurrence probability for the texts in each word embedding.

\[ V(t_x, t_y, t_z) = \frac{F_{xy}}{F_{yz}} \tag{7} \]

where,
• The co-occurrence probability for the texts t_x and t_y is F_{xy}.
• The co-occurrence probability for the texts t_y and t_z is F_{yz}.
• The regular texts or words that appear in a document are t_x and t_y, and the investigated text is t_z.
• When the above-mentioned ratio is 1, the investigated text is related to t_x rather than t_y.

V. IMPLICATIONS
The review covered a comprehensive analysis and integration of the present condition of email spam detection. A broad range of ML and DL approaches for email spam detection is covered, along with an analysis of how these approaches could be improved for greater efficiency. The review explored the intricate difficulties encountered in identifying and screening spam emails while recognizing the constraints of conventional techniques such as blocklists, real-time blackhole listing, and content-based approaches.
6) TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY
The review analyzed and addressed current research defi-
(TF-IDF)
ciencies, shedding light on areas that require additional
exploration. This will emphasize the continuous necessity for
The BoW approach faces challenges with high-frequency
innovation and enhancement in spam detection techniques.
terms dominating the data while lower-scoring domain-
In addition, the study suggested potential areas for future
specific words may be eliminated or ignored. An improve-
research, highlighting possible paths for further advancement
ment on bag-of-words is the TF-IDF technique. TF-IDF
and directing researchers towards addressing the observed
multiplies the number of times a term appears in a document
deficiencies.
TF by the inverse of how often that term shows up
The review emphasized the importance of effective spam
across all documents (inverse document frequency or IDF).
detection in order to safeguard users from the detrimental
These scores highlight unique and information-rich terms
effects of spam, including time wastage, resource depletion,
within a document. As demonstrated by Equations 1 and 2,
and potential data theft, given the widespread use of emails
TF represents the ratio of the term’s count in the document
across many industries. The objective of the study is to offer
to the overall count of all terms. On the other hand, IDF is
a methodical and empirically supported comprehension of
the logarithm of the total number of documents divided by
current research, assessing the efficacy of various ML and
the number of documents that contain the term. The resulting
DL techniques. Through the synthesis and examination of
TF-IDF scores better represent a term’s significance.
data from many studies, it aims to provide an impartial
assessment of the advantages and disadvantages of current
Number of times word w appears in a document methodologies. The thorough assessment of methods for
Tf(w) =
Total number of terms in the document identifying email spam has substantial ramifications for
(5) the domains of digital communication and cybersecurity.
Total number of documents The study examined the application of various ML and
Idf(w) = loge (6) DL techniques, with a focus on shifting from traditional
Number of documents with term w

TABLE 8. Feature extraction techniques for email spam.
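
As a rough illustration of two of the feature extraction techniques discussed above, the sketch below extracts word n-grams and computes TF-IDF scores following Equations 5 and 6. It is a minimal sketch rather than the implementation used in any of the reviewed studies: the function names (word_ngrams, tf, idf, tf_idf), the whitespace tokenization, and the three-message toy corpus are all assumptions introduced here for illustration.

```python
import math
from collections import Counter


def word_ngrams(tokens, n):
    """Return every contiguous run of n tokens (the n-grams) as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf(term, tokens):
    """Eq. (5): occurrences of `term` divided by the number of terms in the document."""
    return Counter(tokens)[term] / len(tokens)


def idf(term, corpus):
    """Eq. (6): natural log of (total documents / documents containing `term`).

    Assumes `term` occurs in at least one document; otherwise the
    denominator would be zero.
    """
    containing = sum(1 for tokens in corpus if term in tokens)
    return math.log(len(corpus) / containing)


def tf_idf(term, tokens, corpus):
    """TF-IDF score: the product of Eq. (5) and Eq. (6)."""
    return tf(term, tokens) * idf(term, corpus)


# Hypothetical toy corpus: three short messages, tokenized on whitespace.
corpus = [
    "win a free prize now".split(),
    "meeting moved to friday".split(),
    "free meeting room on friday".split(),
]

bigrams = word_ngrams(corpus[0], 2)                # n = 2 gives bigrams
score_prize = tf_idf("prize", corpus[0], corpus)   # term found in one message
score_free = tf_idf("free", corpus[0], corpus)     # term found in two messages
```

On this toy corpus, "prize" appears in only one message, so it receives a higher TF-IDF score in the first message than "free", which also appears in the third message. This is the effect the IDF term is meant to produce: terms spread across many documents are down-weighted.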

TABLE 9. Illustration of n-grams.

methodologies to more sophisticated ones. This change has the capacity to improve the precision of detection and the efficiency of computing. This technological advancement may lead to enhanced email systems that offer more robust defenses against harmful material and reduce the wasteful consumption of resources. The review comprised a comprehensive analysis and integration of the present condition of email spam detection.

VI. CHALLENGES OF EMAIL SPAM DETECTION
Spam detection systems struggle to properly evaluate features across textual, temporal, semantic, and statistical dimensions because the volume of diverse and complex data on the Internet keeps growing. Additionally, most methods are trained on balanced datasets, which rarely match real-world conditions. Self-learning methods that can adapt without manual supervision remain an open area. Spam detection methods also face various adversarial attacks: poisoning attacks that pollute training data, evasion attacks that manipulate test samples to bypass filters, and privacy attacks that attempt to steal sensitive training data. Deep fakes that leverage AI generation and modification techniques for images, video, and text to disseminate misinformation further threaten credibility. Imbalanced datasets with far more legitimate emails than spam continue to bias method performance towards the majority class. Research on intelligent oversampling methods aims to improve minority class representation during training. The dynamic evolution of spam tactics also reduces generalization capabilities against new, previously unseen attacks. Ensuring method robustness through adversarial training is an active


research direction. Potential adversarial samples crafted specifically to fool deep nets pose reliability hurdles. Detecting adversarial patterns and training on adversarial datasets helps improve resilience.

The black-box nature of deep networks also hampers method interpretability and user trust. Advancing explainable AI to increase transparency in method behaviors and decisions thus remains important. The computational intensity of large-scale DL limits accessibility for organizations with fewer resources, though optimizations around method efficiency and hardware acceleration are progressively lowering barriers.

Generalizability across different email systems, user groups, and usage patterns is needed for wide real-world deployment. Multi-model learning and personalization are promising techniques under investigation. Adoption is made harder by problems with privacy, usability, and integrating content analysis across a wide range of legacy systems and infrastructure. Limited availability of labeled data for adequately training deep nets continues to be an industry-wide bottleneck, although data augmentation, transfer learning, and semi-supervised techniques help extract more value from limited labels. Finally, meeting real-time latency demands at scale for live traffic with deep methods has throughput and optimization implications. Quantization, network pruning, and efficient method distillation aim to improve inference speed.

VII. RESEARCH GAPS AND OPEN RESEARCH PROBLEMS
This section examines the areas where research is lacking and the problems that remain in the field of spam identification. Current detection approaches rely heavily on manually engineered datasets, which rarely match the nuanced complexities of real-world spam. Future work should prioritize developing robust methods using only authentic spam samples. Though ML, fuzzy logic, and DL methods are individually leveraged today, hybrid systems that synergistically combine multiple techniques could potentially improve accuracy and efficiency further. Enhanced feature engineering that leverages deep neural networks' self-learned representations via representation learning presents promising opportunities to automatically capture differentiating attributes. Clustering algorithms that enable dynamic spam database updates based on continuous user feedback require exploration for tighter spam relevancy. In addition to DL-based methods, blockchain methods and concepts can potentially be employed for email spam detection in the future. Advancing the art of manual spam dataset annotation by collaborating with linguistics and psychology experts can potentially better encapsulate semantic and cognitive subtleties within messages for training more discerning models. Hardware optimizations leveraging graphics cards and field-programmable gate arrays provide additional avenues to improve real-time throughput and latency when classifying high-velocity email streams. Centrally, the availability of multifaceted, standardized labeled corpora spanning diverse, real-world spam types remains lacking, constraining more robust solutions. Furthermore, it is essential for future research to focus on providing researchers with standardized labelled datasets to train classifiers. Additionally, the accuracy and reliability of spam detection methods can be enhanced by incorporating other features into the dataset, such as the spammer's IP address and location. The following are further fields of future study and open problems that need to be solved in the field of spam detection:

Current spam detection approaches rely heavily on limited features from email headers, subject lines, and message bodies. To improve accuracy, more comprehensive and automated feature engineering is needed, moving beyond manual selection. While most evaluations focus on statistical performance metrics like precision and recall, incorporating time complexity analysis would provide crucial insight into real-world viability. Exploring advanced feature extraction methods using DL on various email components, beyond just message bodies, can reveal more nuanced signals for detection. Several system design aspects warrant focus to enhance practical applicability. These include improving fault tolerance for reliability, ensuring quick response times under heavy loads, and implementing self-learning capabilities without manual supervision for robust adaptability to evolving spam tactics. Dynamic updating of feature representations using deep neural networks as new spam data emerges can bolster detection relevance over time. Ensuring strong security mechanisms against exploratory attacks or poisoning of the pipeline data or the model itself is imperative for trustworthy operation. Reducing false positive rates continues to pose challenges to usability. Expanding beyond textual spam to effectively flag image-based messages, and addressing real-time threats rather than relying on batch processing, given the low latency constraints of email systems, will significantly expand practical applicability. Several promising research directions emerge. The lack of labeled multilingual corpora presents an opportunity for developing more globally effective solutions. Semi-supervised learning methods could help leverage vast amounts of unlabeled data. Identifying coordinated spammer networks and behaviors could lead to more proactive defense strategies. Rather than manual labeling or curation that can introduce unconscious bias, discovering ground truth spam characteristics automatically through federated learning over decentralized data holds potential for more robust and unbiased models. Exploring the potential of large language models in transforming spam detection is justified by their ability to catch intricate patterns and contextual nuances that conventional methods may overlook. Studying the potential of fine-tuning pre-trained models such as BERT or GPT for spam classification tasks could lead to the development of more precise and flexible spam detection systems. Moreover, the utilization of these expansive models could potentially tackle existing obstacles in spam detection, including managing evolving spam strategies and minimizing false positives, hence facilitating the development of more resilient and effective spam detection solutions.


VIII. CONCLUSION
The purpose of this literature review is to provide a summary of the most recent research on the application of ML and DL for the detection of spam in email. It sheds light on a number of shortcomings as well as potential improvements that could enhance the efficiency of detection against constantly developing spammer strategies. The implementation of detection systems in close proximity to primary servers, expanding beyond linguistic analyses, and broadening the scope of content evaluation are all examples of prospective advancements. Prioritization is crucial for effectively addressing modern attack types, managing concept drift, enhancing model generalizability, and aligning training with test performance.

The report examines the evolution of machine learning and deep learning applications in distinguishing spam from legitimate communications, particularly in the context of spammers circumventing existing filters. By comparing various methodologies and highlighting unresolved research challenges, the report illuminates persistent difficulties in this field. While current state-of-the-art approaches have limitations, focused efforts on recommended improvements can significantly enhance both accuracy and efficiency. Future research can be directed towards identified shortcomings to develop more robust anti-spam systems. The synthesized insights enable researchers to refine spam protection strategies through meticulous enhancements that proactively address both current and emerging threats. Key areas for improvement include adapting to evolving attack patterns, mitigating concept drift in spam detection models, improving model generalizability across diverse communication contexts, and reducing discrepancies between training and real-world performance. By concentrating on these aspects, researchers can create more effective and adaptable anti-spam solutions that stay ahead of sophisticated spam tactics.

REFERENCES
[1] K. Deshpande, J. Girkar, and R. Mangrulkar, “Security enhancement and analysis of images using a novel sudoku-based encryption algorithm,” J. Inf. Telecommun., vol. 7, no. 3, pp. 270–303, Jul. 2023.
[2] D. Goel and A. K. Jain, “Mobile phishing attacks and defence mechanisms: State of art and open research challenges,” Comput. Secur., vol. 73, pp. 519–544, Mar. 2018.
[3] J. Doshi, K. Parmar, R. Sanghavi, and N. Shekokar, “A comprehensive dual-layer architecture for phishing and spam email detection,” Comput. Secur., vol. 133, Oct. 2023, Art. no. 103378.
[4] F. Salahdine and N. Kaabouch, “Social engineering attacks: A survey,” Future Internet, vol. 11, no. 4, p. 89, Apr. 2019.
[5] M. Alawida, A. E. Omolara, O. I. Abiodun, and M. Al-Rajab, “A deeper look into cybersecurity issues in the wake of COVID-19: A survey,” J. King Saud Univ. Comput. Inf. Sci., vol. 34, no. 10, pp. 8176–8206, Nov. 2022.
[6] B. Parmar, “Protecting against spear-phishing,” Comput. Fraud Secur., vol. 2012, no. 1, pp. 8–11, Jan. 2012.
[7] E. G. Dada, J. S. Bassi, H. Chiroma, S. M. Abdulhamid, A. O. Adetunmbi, and O. E. Ajibuwa, “Machine learning for email spam filtering: Review, approaches and open research problems,” Heliyon, vol. 5, no. 6, Jun. 2019, Art. no. e01802.
[8] (2023). Statista. [Online]. Available: https://www.statista.com/statistics/456500/daily-number-of-e-mails-worldwide/
[9] O. Fonseca, E. Fazzion, I. Cunha, P. H. B. Las-Casas, D. Guedes, W. Meira, C. Hoepers, K. Steding-Jessen, and M. H. P. Chaves, “Measuring, characterizing, and avoiding spam traffic costs,” IEEE Internet Comput., vol. 20, no. 4, pp. 16–24, Jul. 2016.
[10] S. Ogwu, P. Sice, S. Keogh, and C. Goodlet, “An exploratory study of the application of mindsight in email communication,” Heliyon, vol. 6, no. 7, Jul. 2020, Art. no. e04305.
[11] O. A. Okunade. (2017). Manipulating e-mail Server feedback for Spam Prevention. [Online]. Available: https://www.azojete.com.ng
[12] (2023). Firms. Accessed: Dec. 28, 2023. [Online]. Available: https://99firms.com/blog/spam-statistics/
[13] S. Dhanaraj and V. Karthikeyani, “A study on e-mail image spam filtering techniques,” in Proc. Int. Conf. Pattern Recognit., Informat. Mobile Eng., Feb. 2013, pp. 49–55.
[14] A. Bhowmick and S. M. Hazarika, “Machine learning for e-mail spam filtering: Review, techniques and trends,” 2016, arXiv:1606.01042v1.
[15] C. Laorden, X. Ugarte-Pedrero, I. Santos, B. Sanz, J. Nieves, and P. G. Bringas, “Study on the effectiveness of anomaly detection for spam filtering,” Inf. Sci., vol. 277, pp. 421–444, Sep. 2014.
[16] N. Ahmed, R. Amin, H. Aldabbas, D. Koundal, B. Alouffi, and T. Shah, “Machine learning techniques for spam detection in email and IoT platforms: Analysis and research challenges,” Secur. Commun. Netw., vol. 2022, pp. 1–19, Feb. 2022.
[17] S. Gibson, B. Issac, L. Zhang, and S. M. Jacob, “Detecting spam email with machine learning optimized with bio-inspired metaheuristic algorithms,” IEEE Access, vol. 8, pp. 187914–187932, 2020.
[18] T. Gangavarapu, C. D. Jaidhar, and B. Chanduka, “Applicability of machine learning in spam and phishing email filtering: Review and approaches,” Artif. Intell. Rev., vol. 53, no. 7, pp. 5019–5081, Oct. 2020.
[19] S. Zavrak and S. Yilmaz, “Email spam detection using hierarchical attention hybrid deep learning method,” Expert Syst. Appl., vol. 233, Dec. 2023, Art. no. 120977.
[20] P. K. Roy, J. P. Singh, and S. Banerjee, “Deep learning to filter SMS spam,” Future Gener. Comput. Syst., vol. 102, pp. 524–533, Jan. 2020.
[21] S. Magdy, Y. Abouelseoud, and M. Mikhail, “Efficient spam and phishing emails filtering based on deep learning,” Comput. Netw., vol. 206, Apr. 2022, Art. no. 108826.
[22] S. Kaddoura, G. Chandrasekaran, D. Elena Popescu, and J. H. Duraisamy, “A systematic literature review on spam content detection and classification,” PeerJ Comput. Sci., vol. 8, p. e830, Jan. 2022.
[23] T. Lin, D. E. Capecci, D. M. Ellis, H. A. Rocha, S. Dommaraju, D. S. Oliveira, and N. C. Ebner, “Susceptibility to spear-phishing emails,” ACM Trans. Comput.-Hum. Interact., vol. 26, no. 5, pp. 1–28, Oct. 2019.
[24] K. Thakur, M. L. Ali, M. A. Obaidat, and A. Kamruzzaman, “A systematic review on deep-learning-based phishing email detection,” Electronics, vol. 12, no. 21, p. 4545, Nov. 2023.
[25] R. Li, Z. Zhang, J. Shao, R. Lu, X. Jia, and G. Wei, “The potential harm of email delivery: Investigating the HTTPS configurations of webmail services,” IEEE Trans. Dependable Secur. Comput., vol. 21, no. 1, pp. 1–14, Aug. 2023.
[26] A. Abayomi-Alli, O. Abayomi-Alli, S. Misra, and L. Fernandez-Sanz, “Study of the yahoo-yahoo hash-tag tweets using sentiment analysis and opinion mining algorithms,” Information, vol. 13, no. 3, p. 152, Mar. 2022.
[27] S. A. Ebad, “Lessons learned from offline assessment of security-critical systems: The case of microsoft’s active directory,” Int. J. Syst. Assurance Eng. Manage., vol. 13, no. 1, pp. 535–545, Feb. 2022.
[28] A. Kumar, “An empirical examination of the effects of design elements of email newsletters on consumers’ email responses and their purchase,” J. Retailing Consum. Services, vol. 58, Jan. 2021, Art. no. 102349.
[29] V. Y. Oviedo and J. E. Fox Tree, “Meeting by text or video-chat: Effects on confidence and performance,” Comput. Hum. Behav. Rep., vol. 3, Jan. 2021, Art. no. 100054.
[30] M. K. Islam, M. A. Amin, M. R. Islam, M. N. I. Mahbub, M. I. H. Showrov, and C. Kaushal, “Spam-detection with comparative analysis and spamming words extractions,” in Proc. 9th Int. Conf. Rel., INFOCOM Technol. Optim., Sep. 2021, pp. 1–9.
[31] F. Jáñez-Martino, R. Alaiz-Rodríguez, V. González-Castro, E. Fidalgo, and E. Alegre, “A review of spam email detection: Analysis of spammer strategies and the dataset shift problem,” Artif. Intell. Rev., vol. 56, no. 2, pp. 1145–1173, Feb. 2023.


[32] N. Pérez-Díaz, D. Ruano-Ordás, F. Fdez-Riverola, and J. R. Méndez, “SDAI: An integral evaluation methodology for content-based spam filtering models,” Expert Syst. Appl., vol. 39, no. 16, pp. 12487–12500, Nov. 2012.
[33] N. Saidani, K. Adi, and M. S. Allili, “A semantic-based classification approach for an enhanced spam detection,” Comput. Secur., vol. 94, Jul. 2020, Art. no. 101716.
[34] Z. Zhang, E. Damiani, H. A. Hamadi, C. Y. Yeun, and F. Taher, “Explainable artificial intelligence to detect image spam using convolutional neural network,” in Proc. Int. Conf. Cyber Resilience (ICCR), Oct. 2022, pp. 1–5.
[35] A. Hosseinalipour and R. Ghanbarzadeh, “A novel approach for spam detection using horse herd optimization algorithm,” Neural Comput. Appl., vol. 34, no. 15, pp. 13091–13105, Aug. 2022.
[36] M. Novo-Lourés, D. Ruano-Ordás, R. Pavón, R. Laza, S. Gómez-Meire, and J. R. Méndez, “Enhancing representation in the context of multiple-channel spam filtering,” Inf. Process. Manage., vol. 59, no. 2, Mar. 2022, Art. no. 102812.
[37] Z. F. Sokhangoee and A. Rezapour, “A novel approach for spam detection based on association rule mining and genetic algorithm,” Comput. Electr. Eng., vol. 97, Jan. 2022, Art. no. 107655.
[38] A. R. Yeruva, D. Kamboj, P. Shankar, U. S. Aswal, A. K. Rao, and C. S. Somu, “E-mail spam detection using machine learning – KNN,” in Proc. 5th Int. Conf. Contemp. Comput. Informat. (IC3I), Dec. 2022, pp. 1024–1028.
[39] M. A. Shaaban, Y. F. Hassan, and S. K. Guirguis, “Deep convolutional forest: A dynamic deep ensemble approach for spam detection in text,” Complex Intell. Syst., vol. 8, no. 6, pp. 4897–4909, Dec. 2022.
[40] M. F. Faisal, M. N. U. Saqlain, M. A. S. Bhuiyan, M. H. Miraz, and M. J. A. Patwary, “Credit approval system using machine learning: Challenges and future directions,” in Proc. Int. Conf. Comput., Netw., Telecommun. Eng. Sci. Appl. (CoNTESA), 2021, pp. 76–82.
[41] F. Sebastiani, “Machine learning in automated text categorization,” ACM Comput. Surveys, vol. 34, no. 1, pp. 1–47, Mar. 2002.
[42] M. Raza, N. D. Jayasinghe, and M. M. A. Muslam, “A comprehensive review on email spam classification using machine learning algorithms,” in Proc. Int. Conf. Inf. Netw. (ICOIN), Jan. 2021, pp. 327–332.
[43] N. Govil, K. Agarwal, A. Bansal, and A. Varshney, “A machine learning based spam detection mechanism,” in Proc. 4th Int. Conf. Comput. Methodologies Commun. (ICCMC), Mar. 2020, pp. 954–957.
[44] C. Bansal and B. Sidhu, “Machine learning based hybrid approach for email spam detection,” in Proc. 9th Int. Conf. Rel., INFOCOM Technol. Optim., Sep. 2021, pp. 1–4.
[45] P. Thakur, K. Joshi, P. Thakral, and S. Jain, “Detection of email spam using machine learning algorithms: A comparative study,” in Proc. 8th Int. Conf. Signal Process. Commun. (ICSC), Dec. 2022, pp. 349–352.
[46] R. P. Cota and D. Zinca, “Comparative results of spam email detection using machine learning algorithms,” in Proc. 14th Int. Conf. Commun. (COMM), Jun. 2022, pp. 1–5.
[47] B. K. Dedeturk and B. Akay, “Spam filtering using a logistic regression model trained by an artificial bee colony algorithm,” Appl. Soft Comput., vol. 91, Jun. 2020, Art. no. 106229.
[48] Y. Kontsewaya, E. Antonov, and A. Artamonov, “Evaluating the effectiveness of machine learning methods for spam detection,” Proc. Comput. Sci., vol. 190, pp. 479–486, Jun. 2021.
[49] V. Sunjaya, S. Senjaya, J. Utama, H. Lucky, and D. Suhartono, “Content based spam classifying algorithms in email,” 3rd Int. Conf. Artif. Intell. Data Sci., vol. 94, Jul. 2020, Art. no. 101716.
[50] T. Georgieva-Trifonova, “Research on filtering feature selection methods for e-mail spam detection by applying K-NN classifier,” in Proc. Int. Congr. Hum.-Comput. Interact., Optim. Robotic Appl. (HORA), Jun. 2022, pp. 1–4.
[51] L. N. Vejendla, B. Bysani, A. Mundru, M. Setty, and V. J. Kunta, “Score based support vector machine for spam mail detection,” in Proc. 7th Int. Conf. Trends Electron. Informat., 2023, pp. 915–920.
[52] H. Faris, F. A. Alqatawna, M. Al-Zoubi, and I. Aljarah. (2017). Improving Email Spam Detection Using Content Based Feature Engineering Approach. [Online]. Available: http://cran.r-project.org/web/packages/Boruta/index.html
[53] S. O. Olatunji, “Extreme learning machines and support vector machines models for email spam detection,” in Proc. IEEE 30th Can. Conf. Electr. Comput. Eng. (CCECE), Apr. 2017, pp. 1–6.
[54] X. Zheng, X. Zhang, Y. Yu, T. Kechadi, and C. Rong, “ELM-based spammer detection in social networks,” J. Supercomput., vol. 72, no. 8, pp. 2991–3005, Aug. 2016.
[55] S. Rezvani, X. Wang, and F. Pourpanah, “Intuitionistic fuzzy twin support vector machines,” IEEE Trans. Fuzzy Syst., vol. 27, no. 11, pp. 2140–2151, Nov. 2019.
[56] K. Juneja, “Two-phase fuzzy feature-filter based hybrid model for spam classification,” J. King Saud Univ. Comput. Inf. Sci., vol. 34, no. 10, pp. 10339–10355, Nov. 2022.
[57] I. Atacak, O. Çıtlak, and I. A. Dogru, “Application of interval type-2 fuzzy logic and type-1 fuzzy logic-based approaches to social networks for spam detection with combined feature capabilities,” PeerJ Comput. Sci., vol. 9, p. e1316, Apr. 2023.
[58] U. Srinivasarao and A. Sharaff, “SMS sentiment classification using an evolutionary optimization based fuzzy recurrent neural network,” Multimedia Tools Appl., vol. 82, no. 27, pp. 42207–42238, Nov. 2023.
[59] A. W. Wijayanto and Takdir, “Fighting cyber crime in email spamming: An evaluation of fuzzy clustering approach to classify spam messages,” in Proc. Int. Conf. Inf. Technol. Syst. Innov. (ICITSI), Nov. 2014, pp. 19–24.
[60] L. Bansal and N. Tiwari, “Feature selection based classification of spams using fuzzy support vector machine,” in Proc. Int. Conf. Smart Electron. Commun. (ICOSEC), Sep. 2020, pp. 258–263.
[61] S. Wang, X. Zhang, Y. Cheng, F. Jiang, W. Yu, and J. Peng, “A fast content-based spam filtering algorithm with fuzzy-SVM and k-means,” in Proc. IEEE Int. Conf. Big Data Smart Comput. (BigComp), Jul. 2018, pp. 301–307.
[62] S. A. Khan, K. Iqbal, N. Mohammad, R. Akbar, S. S. A. Ali, and A. A. Siddiqui, “A novel fuzzy-logic-based multi-criteria metric for performance evaluation of spam email detection algorithms,” Appl. Sci., vol. 12, no. 14, p. 7043, Jul. 2022.
[63] X. Wang, Y. Zhao, and F. Pourpanah, “Recent advances in deep learning,” Int. J. Mach. Learn. Cybern., vol. 11, pp. 747–750, Jan. 2020.
[64] A. Kamilaris and F. X. Prenafeta-Boldu, “Deep learning in agriculture: A survey,” Comput. Electron. Agricult., vol. 147, pp. 70–90, Apr. 2018.
[65] L. Deng and D. Yu, “Deep learning: Methods and applications,” Found. Trends Signal Process., vol. 7, nos. 3–4, pp. 197–387, 2014.
[66] Y. Guo, Y. Liu, A. Oerlemans, S. Lao, S. Wu, and M. S. Lew, “Deep learning for visual understanding: A review,” Neurocomputing, vol. 187, pp. 27–48, Apr. 2016.
[67] A. Baccouche, S. Ahmed, D. Sierra-Sosa, and A. Elmaghraby, “Malicious text identification: Deep learning from public comments and emails,” Information, vol. 11, no. 6, p. 312, Jun. 2020.
[68] M. Alauthman, “Botnet spam e-Mail detection using deep recurrent neural network,” Int. J. Emerg. Trends Eng. Res., vol. 8, no. 5, pp. 1979–1986, May 2020.
[69] I. AbdulNabi and Q. Yaseen, “Spam email detection using deep learning techniques,” Proc. Comput. Sci., vol. 184, no. 2, pp. 853–858, 2021.
[70] A. A. Abdullahi and M. Kaya, “A deep learning based method to detect email and SMS spams,” in Proc. Int. Conf. Decis. Aid Sci. Appl. (DASA), Dec. 2021, pp. 430–435.
[71] K. F. Rafat, Q. Xin, A. R. Javed, Z. Jalil, and R. Z. Ahmad, “Evading obscure communication from spam emails,” Math. Biosciences Eng., vol. 19, no. 2, pp. 1926–1943, 2021.
[72] T. Wen, Y. Xiao, A. Wang, and H. Wang, “A novel hybrid feature fusion model for detecting phishing scam on Ethereum using deep neural network,” Expert Syst. Appl., vol. 211, Jan. 2023, Art. no. 118463.
[73] Z. Alom, B. Carminati, and E. Ferrari, “A deep learning model for Twitter spam detection,” Online Social Netw. Media, vol. 18, Jul. 2020, Art. no. 100079.
[74] A. Makkar and N. Kumar, “An efficient deep learning-based scheme for Web spam detection in IoT environment,” Future Gener. Comput. Syst., vol. 108, pp. 467–487, Jul. 2020.
[75] S. Smadi, N. Aslam, and L. Zhang, “Detection of online phishing email using dynamic evolving neural network based on reinforcement learning,” Decis. Support Syst., vol. 107, pp. 88–102, Mar. 2018.
[76] S. Isik, Z. Kurt, Y. Anagun, and K. Ozkan, “Spam e-mail classification recurrent neural networks for spam e-mail classification on an agglutinative language,” Int. J. Intell. Syst. Appl. Eng., vol. 8, no. 4, pp. 221–227, Dec. 2020.
[77] H. Yang, Q. Liu, S. Zhou, and Y. Luo, “A spam filtering method based on multi-modal fusion,” Appl. Sci., vol. 9, no. 6, p. 1152, Mar. 2019.
[78] S. Gadde, A. Lakshmanarao, and S. Satyanarayana, “SMS spam detection using machine learning and deep learning techniques,” in Proc. 7th Int. Conf. Adv. Comput. Commun. Syst. (ICACCS), vol. 1, Mar. 2021, pp. 358–362.


[79] F. Wei and T. Nguyen, “A lightweight deep neural model for SMS spam detection,” in Proc. Int. Symp. Netw., Comput. Commun. (ISNCC), Oct. 2020, pp. 1–6.
[80] V. S. Vinitha, D. K. Renuka, and L. A. Kumar, “Long short-term memory networks for email spam classification,” in Proc. Int. Conf. Intell. Syst. Commun., IoT Secur. (ICISCoIS), Feb. 2023, pp. 176–180.
[81] S. Bagui, D. Nandi, S. Bagui, and R. J. White, “Machine learning and deep learning for phishing email classification using one-hot encoding,” J. Comput. Sci., vol. 17, no. 7, pp. 610–623, Jul. 2021.
[82] D. A. Otchere, T. O. A. Ganat, R. Gholami, and S. Ridha, “Application of supervised machine learning paradigms in the prediction of petroleum reservoir properties: Comparative analysis of ANN and SVM models,” J. Petroleum Sci. Eng., vol. 200, May 2021, Art. no. 108182.
[83] M. Mohammadi, T. A. Rashid, S. H. T. Karim, A. H. M. Aldalwie, Q. T. Tho, M. Bidaki, A. M. Rahmani, and M. Hosseinzadeh, “A comprehensive survey and taxonomy of the SVM-based intrusion detection systems,” J. Netw. Comput. Appl., vol. 178, Mar. 2021, Art. no. 102983.
[84] W. Wang, X. Du, D. Shan, and N. Wang, “A hybrid cloud intrusion detection method based on SDAE and SVM,” in Proc. 12th Int. Conf. Intell. Comput. Technol. Autom. (ICICTA), Oct. 2019, pp. 271–274.
[85] P. Navaney, G. Dubey, and A. Rana, “SMS spam filtering using supervised machine learning algorithms,” in Proc. 8th Int. Conf. Cloud Comput., Data Sci. Eng., Jan. 2018, pp. 43–48.
[86] D. Sculley and G. M. Wachman, “Relaxed online SVMs for spam filtering,” in Proc. 30th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retr., Jul. 2007, pp. 415–422.
[87] P. Haider, U. Brefeld, and T. Scheffer, “Supervised clustering of streaming data for email batch detection,” in Proc. 24th Int. Conf. Mach. Learn., Jun. 2007, pp. 345–352.
[88] H. Zhou, J. Zhang, Y. Zhou, X. Guo, and Y. Ma, “A feature selection algorithm of decision tree based on feature weight,” Expert Syst. Appl., vol. 164, Feb. 2021, Art. no. 113842.
[89] M. M. Ghiasi, S. Zendehboudi, and A. A. Mohsenipour, “Decision tree-based diagnosis of coronary artery disease: CART model,” Comput. Methods Programs Biomed., vol. 192, Aug. 2020, Art. no. 105400.
[101] A. S. Mashaleh, N. F. Binti Ibrahim, M. A. Al-Betar, H. M. J. Mustafa, and Q. M. Yaseen, “Detecting spam email with machine learning optimized with Harris hawks optimizer (HHO) algorithm,” Proc. Comput. Sci., vol. 201, pp. 659–664, Aug. 2022.
[102] M. Belgiu and L. Dragus, “Random forest in remote sensing: A review of applications and future directions,” ISPRS J. Photogramm. Remote Sens., vol. 114, pp. 24–31, Apr. 2016.
[103] J. L. Speiser, M. E. Miller, J. Tooze, and E. Ip, “A comparison of random forest variable selection methods for classification prediction modeling,” Expert Syst. Appl., vol. 134, pp. 93–101, Nov. 2019.
[104] N. R. Kothapally and V. Kakulapati, “Classification of spam messages using random forest algorithm,” J. Xidian University, vol. 15, no. 8, pp. 495–505, 2021.
[105] A. Shrivastava and R. Dubey, “Classification of spam mail using different machine learning algorithms,” in Proc. Int. Conf. Adv. Comput. Telecommun. (ICACAT), Dec. 2018, pp. 1–10.
[106] K. L. Goh and A. K. Singh, “Comprehensive literature review on machine learning structures for web spam classification,” Proc. Comput. Sci., vol. 70, pp. 434–441, Jun. 2015.
[107] F. Ye, G. Chen, Q. Liu, L. Zhang, Q. Qi, B. Hu, and X. Fan, “A spam classification method based on naive Bayes,” in Proc. IEEE 6th Inf. Technol. Mechatronics Eng. Conf. (ITOEC), vol. 6, Mar. 2022, pp. 1856–1861.
[108] T. S. Guzella and W. M. Caminhas, “A review of machine learning approaches to spam filtering,” Expert Syst. Appl., vol. 36, no. 7, pp. 10206–10222, Sep. 2009.
[109] M. Sahami, S. Dumais, D. Heckerman, E. Horvitz, and G. Building, “A Bayesian approach to filtering junk e-mail,” in Proc. Learning Text Categorization, Workshop, 1998, pp. 98–105.
[110] J. Kim, K. Chung, and K. Choi, “Spam filtering with dynamically updated URL statistics,” IEEE Secur. Privacy Mag., vol. 5, no. 4, pp. 33–39, Jul. 2007.
[111] X. Deng, Y. Li, J. Weng, and J. Zhang, “Feature selection for text classification: A review,” Multimedia Tools Appl., vol. 78, no. 3, pp. 3797–3816, Feb. 2019.
[90] S. Rizvi, B. Rienties, and S. A. Khoja, ‘‘The role of demographics in [112] A. Çıltık and T. Güngör, ‘‘Time-efficient spam e-mail filtering using
online learning; a decision tree based approach,’’ Comput. Educ., vol. 137, n-gram models,’’ Pattern Recognit. Lett., vol. 29, no. 1, pp. 19–33,
pp. 32–47, Aug. 2019. Jan. 2008.
[91] A. Wijaya and A. Bisri, ‘‘Hybrid decision tree and logistic regression [113] T. Toma, S. Hassan, and M. Arifuzzaman, ‘‘An analysis of supervised
classifier for email spam detection,’’ in Proc. 8th Int. Conf. Inf. Technol. machine learning algorithms for spam email detection,’’ in Proc. Int.
Electr. Eng. (ICITEE), Oct. 2016, pp. 1–4. Conf. Autom., Control Mechatronics, Jul. 2021, pp. 1–5.
[92] A. F. Zulfikar, D. Supriyadi, Y. Heryadi, and Lukas, ‘‘Comparison [114] C. Li, G. Zhan, and Z. Li, ‘‘News text classification based on improved
performance of decision tree classification model for spam filtering with Bi-LSTM-CNN,’’ in Proc. 9th Int. Conf. Inf. Technol. Med. Educ. (ITME),
or without the recursive feature elimination (RFE) approach,’’ in Proc. Oct. 2018, pp. 890–893.
4th Int. Conf. Inf. Technol., Inf. Syst. Electr. Eng. (ICITISEE), Nov. 2019, [115] H. Moayedi, M. Mehrabi, M. Mosallanezhad, A. S. A. Rashid,
pp. 311–316. and B. Pradhan, ‘‘Modification of landslide susceptibility mapping
[93] Y. Zhang, S. Wang, P. Phillips, and G. Ji, ‘‘Binary PSO with mutation using optimized PSO-ANN technique,’’ Eng. Comput., vol. 35, no. 3,
operator for feature selection using decision tree applied to spam pp. 967–984, Jul. 2019.
detection,’’ Knowl.-Based Syst., vol. 64, pp. 22–31, Jul. 2014.
[116] A. Kurani, P. Doshi, A. Vakharia, and M. Shah, ‘‘A comprehensive
[94] H. Kaur and A. Sharma, ‘‘Improved email spam classification method
comparative study of artificial neural network (ANN) and support vector
using integrated particle swarm optimization and decision tree,’’ in
machines (SVM) on stock forecasting,’’ Ann. Data Sci., vol. 10, no. 1,
Proc. 2nd Int. Conf. Next Gener. Comput. Technol. (NGCT), Oct. 2016,
pp. 183–208, Feb. 2023.
pp. 516–521.
[95] S. Uddin, I. Haque, H. Lu, M. A. Moni, and E. Gide, ‘‘Comparative [117] B. Ingre and A. Yadav, ‘‘Performance analysis of NSL-KDD dataset using
performance analysis of K-nearest neighbour (KNN) algorithm and its ANN,’’ in Proc. Int. Conf. Signal Process. Commun. Eng. Syst., Jan. 2015,
different variants for disease prediction,’’ Sci. Rep., vol. 12, no. 1, pp. 92–96.
p. 10358, Apr. 2022. [118] C. Zhan, F. Zhang, and M. Zheng, ‘‘Design and implementation of an
[96] H. Liu, J. An, W. Xu, X. Jia, L. Gan, and C. Yuen, ‘‘K-means optimization system of span filter rule based on neural network,’’ in Proc.
based constellation optimization for index modulated reconfigurable Int. Conf. Commun., Circuits Syst., vol. 3, Jul. 2007, pp. 882–886.
intelligent surfaces,’’ IEEE Commun. Lett., vol. 27, no. 8, pp. 2152–2156, [119] R. Talaei Pashiri, Y. Rostami, and M. Mahrami, ‘‘Spam detection
Jun. 2023. through feature selection using artificial neural network and sine–cosine
[97] S. A. Orazbayev, R. E. Zhumadylov, A. T. Zhunisbekov, T. S. Ramazanov, algorithm,’’ Math. Sci., vol. 14, no. 3, pp. 193–199, Sep. 2020.
and M. T. Gabdullin, ‘‘Obtaining of copper nanoparticles in combined [120] S. A. A. Ghaleb, M. Mohamad, E. F. H. S. Abdullah, and
RF+DC discharge plasma,’’ Mater. Today, Proc., vol. 20, pp. 329–334, W. A. H. M. Ghanem, ‘‘Spam classification based on supervised
Jun. 2020. learning using grasshopper optimization algorithm and artificial neural
[98] D. Ö. Sahin and S. Demirci, ‘‘Spam filtering with KNN: Investigation of network,’’ in Proc. 2nd Int. Conf., 2021, pp. 420–434.
the effect of k value on classification performance,’’ in Proc. 28th Signal [121] A. Arram, H. Mousa, and A. Zainal, ‘‘Spam detection using hybrid
Process. Commun. Appl. Conf. (SIU), Oct. 2020, pp. 1–4. artificial neural network and genetic algorithm,’’ in Proc. 13th Int. Conf.
[99] Y. K. Zamil, S. A. Ali, and M. A. Naser, ‘‘Spam image email filtering Intellient Syst. Design Appl., Dec. 2013, pp. 336–340.
using K-NN and SVM,’’ Int. J. Electr. Comput. Eng. (IJECE), vol. 9, no. 1, [122] J. Gu, ‘‘Recent advances in convolutional neural networks,’’ Pattern
p. 245, Feb. 2019. Recognit., vol. 77, pp. 354–377, May 2018.
[100] G. Hnini, J. Riffi, M. A. Mahraz, A. Yahyaouy, and H. Tairi, ‘‘Spam [123] Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou, ‘‘A survey of convolutional
filtering system based on nearest neighbor algorithms,’’ in Proc. Int. Conf. neural networks: Analysis, applications, and prospects,’’ IEEE Trans.
Artif. Intell. Ind. Appl., 2021, pp. 36–46. Neural Netw. Learn. Syst., vol. 33, no. 12, pp. 6999–7019, Dec. 2022.

[124] V. Gupta, A. Mehta, A. Goel, U. Dixit, and A. C. Pandey, ‘‘Spam detection [146] A. L. Rosewelt, N. D. Raju, and S. Ganapathy, ‘‘An effective spam
using ensemble learning,’’ in Harmony Search and Nature Inspired message detection model using feature engineering and bi-LSTM,’’
Optimization Algorithms: Theory and Applications. Cham, Switzerland: in Proc. Int. Conf. Adv. Comput., Commun. Appl. Informat. (ACCAI),
Springer, 2019, pp. 661–668. Jan. 2022, pp. 1–6.
[125] M. Gupta, A. Bakliwal, S. Agarwal, and P. Mehndiratta, ‘‘A comparative [147] Y. Gao, M. Yang, X. Zhao, B. Pardo, Y. Wu, T. N. Pappas, and
study of spam SMS detection using machine learning classifiers,’’ in Proc. A. Choudhary, ‘‘Image spam hunter,’’ in Proc. IEEE Int. Conf. Acoust.,
11th Int. Conf. Contemp. Comput., Aug. 2018, pp. 1–7. Speech Signal Process., vol. 2, Mar. 2008, pp. 1765–1768.
[126] M. Popovac, M. Karanovic, S. Sladojevic, M. Arsenovic, and A. Anderla, [148] M. Dredze, R. Gevaryahu, and A. Elias-Bachrach. (2007). Learning Fast
‘‘Convolutional neural network based SMS spam detection,’’ in Proc. Classifiers for Image Spam. [Online]. Available: http://fuzzyocr.own-
26th Telecommun. Forum (TELFOR), Nov. 2018, pp. 1–4. hero.net/
[149] Z. Wang, W. Josephson, Q. Lv, M. Charikar, and K. Li, ‘‘Filtering image
[127] T. Sharmin, F. Di Troia, K. Potika, and M. Stamp, ‘‘Convolutional neural
spam with near-duplicate detection,’’ in Proc. CEAS, 2007, pp. 1–10.
networks for image spam detection,’’ Inf. Secur. J. A Global Perspective,
[150] D. Debarr and H. Wechsler. (2007). Spam Detection Using Clus-
vol. 29, no. 3, pp. 103–117, May 2020.
tering, Random Forests, and Active Learning. [Online]. Available:
[128] A. Farzad, H. Mashayekhi, and H. Hassanpour, ‘‘A comparative
http://trec.nist.gov/pubs/trec16/papers/SPAM.OVERVIEW1
performance analysis of different activation functions in LSTM networks [151] I. Androutsopoulos, G. Paliouras, V. Karkaletsis, G. Sakkis,
for classification,’’ Neural Comput. Appl., vol. 31, no. 7, pp. 2507–2521, C. D. Spyropoulos, and P. Stamatopoulos. (2006). Learning to Filter
Jul. 2019. Spam e-mail: A Comparison of a Naive Bayesian and a Memory-based
[129] A. Sherstinsky, ‘‘Fundamentals of recurrent neural network (RNN) Approach 1. [Online]. Available: http://www.cauce.org
and long short-term memory (LSTM) network,’’ Phys. D, Nonlinear [152] I. Koprinska, J. Poon, J. Clark, and J. Chan, ‘‘Learning to classify e-mail,’’
Phenomena, vol. 404, Mar. 2020, Art. no. 132306. Inf. Sci., vol. 177, no. 10, pp. 2167–2187, May 2007.
[130] S. Muzaffar and A. Afshari, ‘‘Short-term load forecasts using LSTM [153] B. Biggio, I. Corona, G. Fumera, G. Giacinto, and F. Roli, ‘‘Bagging
networks,’’ Energy Proc., vol. 158, pp. 2922–2927, Feb. 2019. classifiers for fighting poisoning attacks in adversarial classification
[131] B. Lindemann, B. Maschler, N. Sahlab, and M. Weyrich, ‘‘A survey tasks,’’ in Proc. 10th Int. Workshop, 2011, pp. 350–359.
on anomaly detection for technical systems using LSTM networks,’’ [154] I. Androutsopoulos, J. Koutsias, K. V. Chandrinos, G. Paliouras, and
Comput. Ind., vol. 131, Oct. 2021, Art. no. 103498. C. D. Spyropoulos. (2000). An Evaluation of Naive Bayesian Anti-Spam
[132] G. Jain, M. Sharma, and B. Agarwal, ‘‘Optimizing semantic lstm for spam Filtering. [Online]. Available: http://www.caucee.org
detection,’’ Int. J. Inf. Technol., vol. 11, pp. 239–250, Jun. 2019. [155] G. V. Cormack and T. R. Lynam, ‘‘Online supervised spam filter
[133] E. E. Eryilmaz, D. Ö. Sahin, and E. Kiliç, ‘‘Filtering Turkish spam using evaluation,’’ ACM Trans. Inf. Syst., vol. 25, no. 3, p. 11, Jul. 2007.
LSTM from deep learning techniques,’’ in Proc. 8th Int. Symp. Digit. [156] L. Zhang, J. Zhu, and T. Yao, ‘‘An evaluation of statistical spam filtering
Forensics Secur. (ISDFS), Jun. 2020, pp. 1–6. techniques,’’ ACM Trans. Asian Lang. Inf. Process., vol. 3, no. 4,
[134] S. Thanarattananakin, S. Bulao, B. Visitsilp, and M. Maliyaem, ‘‘Spam pp. 243–269, Dec. 2004.
detection using word embedding-based LSTM,’’ in Proc. Joint Int. Conf. [157] J. R. Mendez, F. Fdez-Riverola, F. Dãsaz, E. L. Iglesias, and J. M. Cor-
Digit. Arts, Media Technol. ECTI Sect. Conf. Electr., Electron., Comput. chado, ‘‘A comparative performance study of feature selection methods
Telecommun. Eng. (ECTI DAMT NCON), Jan. 2022, pp. 227–231. for the anti-spam filtering domain,’’ in Proc. 6th Ind. Conf. Data Mining,
2006, pp. 106–120.
[135] O. Yildirim, U. B. Baloglu, R.-S. Tan, E. J. Ciaccio, and U. R. Acharya,
[158] A. Attar, R. M. Rad, and R. E. Atani, ‘‘A survey of image spamming
‘‘A new approach for arrhythmia classification using deep coded features
and filtering techniques,’’ Artif. Intell. Rev., vol. 40, no. 1, pp. 71–105,
and LSTM networks,’’ Comput. Methods Programs Biomed., vol. 176,
Jun. 2013.
pp. 121–133, Jul. 2019. [159] G. Sakkis, I. Androutsopoulos, G. Paliouras, V. Karkaletsis, C. D. Spy-
[136] K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio, ‘‘On the ropoulos, and P. Stamatopoulos. (2001). Stacking Classifiers for Anti-
properties of neural machine translation: Encoder–decoder approaches,’’ spam Filtering of e-mail. [Online]. Available: www.junkemail.org
2014, arXiv:1409.1259. [160] T. A. Almeida and A. Yamakami, ‘‘Content-based spam filtering,’’ in
[137] P. T. Yamak, L. Yujian, and P. K. Gadosey, ‘‘A comparison between Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2010, pp. 1–7.
ARIMA, LSTM, and GRU for time series forecasting,’’ in Proc. [161] D. Mallampati and N. P. Hegde, ‘‘Feature extraction and classification of
2nd Int. Conf. Algorithms, Comput. Artif. Intell., Dec. 2019, email spam detection using IMTF-IDF+skip-thought vectors,’’ Ingenierie
pp. 49–55. Des. Syst. Inf., vol. 27, no. 6, pp. 941–948, Dec. 2022.
[138] S. Gao, Y. Huang, S. Zhang, J. Han, G. Wang, M. Zhang, and Q. Lin, [162] H. Saini and K. S. Saini, ‘‘Hybrid model for email spam prediction
‘‘Short-term runoff prediction with GRU and LSTM networks without using random forest for feature extraction,’’ in Proc. Int. Conf. Artif.
requiring time step optimization during sample generation,’’ J. Hydrol., Intell. Appl. (ICAIA) Alliance Technol. Conf. (ATCON-1), Apr. 2023,
vol. 589, Oct. 2020, Art. no. 125188. pp. 1–4.
[139] K. A. Al-Thelaya, T. S. Al-Nethary, and E. Y. Ramadan, ‘‘Social networks [163] Q. Cheng, A. Xu, X. Li, and L. Ding, ‘‘Adversarial email gener-
spam detection using graph-based features analysis and sequence of ation against spam detection models through feature perturbation,’’
interactions between users,’’ in Proc. IEEE Int. Conf. Informat., IoT, in Proc. IEEE Int. Conf. Assured Autonomy (ICAA), Mar. 2022,
Enabling Technol. (ICIoT), Feb. 2020, pp. 206–211. pp. 83–92.
[140] T. Repke and R. Krestel. (2018). Bringing Back Structure To Free [164] M. A. Hassan and N. Mtetwa, ‘‘Feature extraction and classification of
Text Email Conversations With Recurrent Neural Networks. [Online]. spam emails,’’ in Proc. 5th Int. Conf. Soft Comput. Mach. Intell. (ISCMI),
Available: http://isc.enron.com/ Nov. 2018, pp. 93–98.
[141] T. Le, M. Vo, B. Vo, E. Hwang, S. Rho, and S. Baik, ‘‘Improving electric [165] I. Inuwa-Dutse, M. Liptrott, and I. Korkontzelos, ‘‘Detection of spam-
energy consumption prediction using CNN and bi-LSTM,’’ Appl. Sci., posting accounts on Twitter,’’ Neurocomputing, vol. 315, pp. 496–511,
vol. 9, no. 20, p. 4237, Oct. 2019. Nov. 2018.
[166] S. Aiyar and N. P. Shetty, ‘‘N-gram assisted YouTube spam comment
[142] F. Shahid, A. Zameer, and M. Muneeb, ‘‘Predictions for COVID-19 with
detection,’’ Proc. Comput. Sci., vol. 132, pp. 174–182, Jul. 2018.
deep learning models of LSTM, GRU and bi-LSTM,’’ Chaos, Solitons
[167] R. Alharthi, A. Alhothali, and K. Moria, ‘‘A real-time deep-learning
Fractals, vol. 140, Nov. 2020, Art. no. 110212.
approach for filtering Arabic low-quality content and accounts on
[143] S. M. Zaman, M. M. Hasan, R. I. Sakline, D. Das, and M. A. Alam, ‘‘A Twitter,’’ Inf. Syst., vol. 99, Jul. 2021, Art. no. 101740.
comparative analysis of optimizers in recurrent neural networks for text [168] Y. Liu, B. Pang, and X. Wang, ‘‘Opinion spam detection by incorporating
classification,’’ in Proc. IEEE Asia–Pacific Conf. Comput. Sci. Data Eng. multimodal embedded representation into a probabilistic review graph,’’
(CSDE), vol. 3, Dec. 2021, pp. 1–6. Neurocomputing, vol. 366, pp. 276–283, Nov. 2019.
[144] S. E. Rahman and S. Ullah, ‘‘Email spam detection using bidirectional [169] T. Wu, S. Liu, J. Zhang, and Y. Xiang, ‘‘Twitter spam detection based on
long short term memory with convolutional neural network,’’ in Proc. deep learning,’’ in ACM Int. Conf. Proc. Ser., Jan. 2017, pp. 1–26.
IEEE Region 10 Symp. (TENSYMP), Jun. 2020, pp. 1307–1311. [170] A. Barushka and P. Hajek, ‘‘Review spam detection using word
[145] C. M. Shaik, N. M. Penumaka, S. K. Abbireddy, V. Kumar, and embeddings and deep neural networks,’’ in Proc. 15th IFIP WG, 2019,
S. S. Aravinth, ‘‘Bi-LSTM and conventional classifiers for email spam pp. 340–350.
filtering,’’ in Proc. 3rd Int. Conf. Artif. Intell. Smart Energy (ICAIS), [171] I. Kanaris, K. Kanaris, and E. Stamatatos, ‘‘Spam detection using
Feb. 2023, pp. 1350–1355. character n-grams,’’ in Proc. 4th Helenic Conf., 2006, pp. 95–104.

EKRAMUL HAQUE TUSHER received the B.Sc. degree in computer science from International Islamic University Chittagong (IIUC). He is currently pursuing the master’s degree in soft computing and intelligent systems with Universiti Malaysia Pahang Al-Sultan Abdullah (UMPSA), Pekan, Pahang, Malaysia. He has been a Research Assistant with the Machine Intelligence Research Group (MIRG), UMPSA, since 2023. His current research interests include machine learning methods, deep learning, fuzzy systems, and explainable AI.

MOHD ARFIAN ISMAIL received the B.Sc., M.Sc., and Ph.D. degrees in computer science from Universiti Teknologi Malaysia (UTM), in 2008, 2011, and 2016, respectively. He is currently an Associate Professor with the Faculty of Computing, University Malaysia Pahang Al-Sultan Abdullah, Malaysia. His current research interests include machine learning methods and fuzzy systems.

MD ARAFATUR RAHMAN (Senior Member, IEEE) received the Ph.D. degree in electronic and telecommunications engineering from the University of Naples Federico II, Naples, Italy, in 2013. He has around 15 years of research and teaching experience in the domain of computer science and communications engineering. He was an Associate Professor with the Faculty of Computing, Universiti Malaysia Pahang. He was a Postdoctoral Research Fellow with the University of Naples Federico II, in 2014, and a Visiting Researcher with the Sapienza University of Rome, in 2016. Currently, he is a Reader of cyber security with the School of Engineering, Computing and Mathematical Sciences, University of Wolverhampton, U.K. He has developed an excellent track record of academic leadership and management and execution of international ICT projects that are supported by agencies in the U.K., Italy, EU, and Malaysia. He has co-authored around 150 prestigious IEEE and Elsevier journals, such as IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, IEEE TRANSACTIONS ON GREEN COMMUNICATIONS AND NETWORKING, IEEE TRANSACTIONS ON SERVICES COMPUTING, IEEE Communications Magazine, JNCA (Elsevier), and FGCS (Elsevier); and conference publications, such as IEEE Globecom and IEEE DASC. His research interests include cyber security, in particular on the Internet of Things (IoT), wireless communication networks, cognitive radio networks, 5G, vehicular communication, cyber-physical systems, big data, cloud-fog-edge computing, and machine learning-dependent applications. He was a fellow of the IBM Center of Excellence and the Earth Resources and Sustainability Center, Malaysia. He was endorsed by the Royal Academy of Engineering, U.K., as a Global Talent under the category of exceptional talent, in 2022. He was awarded the Higher Education Academy (HEA) Fellowship from the U.K. He has received several prestigious international research awards, notably the Best Paper Award at ICNS’15 (Italy); IC0902 Grant (France); Italian Government Ph.D. Research Scholarship; the IIUM Best Masters Student Award; the Best Supervisor Award at UMP; and the Awards in International Exhibitions, including the Euro Business-HALLER Poland Special Award at MTE 2022; the Best Innovation Award at MTE 2020, Malaysia; the Diamond and Gold in BiS’17 U.K.; the Best of the Best Innovation Award and Most Commercial IT Innovation Award, Malaysia; and the Gold and Silver Medals in iENA’17 Germany. He served as the Specialty Chief Editor for IoT Theory and Fundamental Research (specialty section of Frontiers in the Internet of Things); an Advisory Board Member and an Editorial Board Member for Computer Systems Science and Engineering (Tech Science Press) and Computers (MDPI); a Lead Guest Editor for IEEE ACCESS and Computers; an Associate Editor for IEEE ACCESS and Patron; the General Chair; the Organizing Committee Member; the Publicity Chair; the Session Chair; the Programme Committee Member; and a member of the Technical Programme Committee (TPC) in numerous leading conferences worldwide, such as IEEE Globecom, IEEE DASC, IEEE iSCI, and IEEE ETCCE, and journals. His name was enlisted inside the World Top 2% Scientists list released by Stanford University under the category of Citation Impact in Single Calendar Year in 2019, 2020, and 2021.

ALI H. ALENEZI received the B.S. degree in electrical engineering from King Saud University, Saudi Arabia, the M.S. degree in electrical engineering from the KTH Royal Institute of Technology, Sweden, and the Ph.D. degree in electrical engineering from New Jersey Institute of Technology, USA, in 2018. He is currently an Associate Professor with the Electrical Engineering Department, Northern Border University, Saudi Arabia. His research interests include acoustic communication, wireless communications, and 4G and 5G networks using UAVs.

MUEEN UDDIN (Senior Member, IEEE) received the Ph.D. degree from Universiti Teknologi Malaysia (UTM), in 2013. He is currently an Associate Professor of data and cybersecurity with the University of Doha for Science and Technology, Qatar. He has published more than 130 international journals and conference papers in highly reputed journals with a cumulative impact factor of over 300. His research interests include blockchain, cybersecurity, the IoT security, and network and cloud security.
