Email Spam: A Comprehensive Review of Optimize Detection Methods, Challenges, and Open Research Problems
Corresponding authors: Mohd Arfian Ismail and Mueen Uddin
This work was supported in part by the Fundamental Research Grant (FRGS) with FRGS/1/2022/ICT02/UMP/02/2 from the Ministry of
Higher Education Malaysia under Grant RDU220134; in part by Qatar National Library—QNL (Open Access Research); and in part by the
Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia, under Project NBU-FFR-2024-2159-07.
ABSTRACT Nowadays, emails are used across almost every field, spanning from business to education.
Broadly, emails can be categorized as either ham or spam. Email spam, also known as junk emails or
unwanted emails, can harm users by wasting time and computing resources, along with stealing valuable
information. The volume of spam emails is rising rapidly day by day. Detecting and filtering spam presents
significant and complex challenges for email systems. Traditional identification techniques like blocklists,
real-time blackhole listing, and content-based methods have limitations. These limitations have led to the
advancement of more sophisticated machine learning (ML) and deep learning (DL) methods for enhanced
spam detection accuracy. In recent years, considerable attention has focused on the potential of ML and
DL methods to improve email spam detection. A comprehensive literature review is therefore imperative
for developing an updated, evidence-based understanding of contemporary research on employing these
methods against this persistent problem. The review aims to systematically identify various ML and DL
methods applied for spam detection, evaluate their effectiveness, and highlight promising future research
directions in light of existing gaps. By combining and analyzing findings across studies, it identifies the
strengths and weaknesses of existing methods. This review seeks to advance knowledge on reliable and
efficient integration of state-of-the-art ML and DL into identifying email spam.
INDEX TERMS Email spam, machine learning, deep learning, fuzzy system, feature selection, spam
detection.
to novel spam strategies, and guaranteeing computational efficiency. Furthermore, maintaining a proper balance between precision and the utilization of resources continues to be a crucial concern. Researchers and practitioners in the field face an ongoing challenge to keep ahead of spammers as they constantly improve their techniques [22]. However, there is still potential for additional enhancement and advancement.
The use of ML and DL to find spam is an area that is changing quickly and has gained a lot of attention lately because it has the potential to get around the problems with traditional methods and make detection more accurate. However, a comprehensive overview of the present research in this field is still needed. Such an overview can help identify the pros and cons of different ML and DL methods and guide future research. A clear and evidence-based view of how ML and DL can be used to find spam can be obtained by gathering the results of all the relevant studies and finding any gaps in the literature. In particular, this kind of study can show how well different ML and DL methods work, as well as their flaws and possible ways to make them better. A thorough study can also find gaps in the research and help with coming up with new research questions and areas to focus on.
A. REVIEW SCOPE
In our comprehensive investigation of more than one hundred research publications sourced from renowned scientific databases such as IEEE Xplore, Web of Science, ScienceDi-
B. CONTRIBUTION
There are gaps in understanding the effectiveness, limitations, and potential improvements needed for current spam detection techniques. With the field evolving rapidly and accurate identification of spam being crucial, an updated comprehensive review is needed. This review synthesizes available evidence on existing methods, highlights literature gaps to address through new research, and provides the following key contributions:
• The paper presents a comprehensive review of the crucial characteristics used to identify email spam, as well as significant advancements in this field. The survey identifies significant research gaps and outlines future research goals in the field of email spam detection, based on a comprehensive analysis of existing literature.
• This review paper focuses on the various ML and DL methods utilized for spam email detection and analyzes the effectiveness of existing techniques in accurately identifying spam messages.
• The review presents an elaborate study of several methods applied to email spam detection over the period 2005-2024.
• The review analyses the performance of ML and DL methods by examining the findings reported in recent research, and presents a concise summary of these findings in well-organised tables.
• The review identifies the strengths and limitations of various spam detection methods. Analysing the current
2) CASE BASED EMAIL SPAM DETECTION TECHNIQUE
Case based detection is a popular method for detecting spam emails. The first step is to gather all emails from each user's mailbox, regardless of whether they are spam or not. The next step is to pre-process the email so that it may be converted using a client interface. This involves extracting and selecting features, aggregating email data, and finally, evaluating the results. Next, the data is divided into two sets of vectors [36]. Finally, an ML method is used to train and evaluate the datasets in order to ascertain whether incoming emails are classified as spam.
3) RULE BASED EMAIL SPAM DETECTION TECHNIQUE
In this method, numerous patterns, typically regular expressions, are evaluated against a selected message using preexisting rules or heuristics. The score of a message increases as it matches more patterns. However, the score is reduced when a pattern does not match. If the score of a message is high enough, it is classified as spam; otherwise, it is considered legitimate. Some ranking factors are static, while others need to be updated frequently to keep up with the ever-evolving threat posed by spammers and their more sophisticated and difficult-to-detect messages [37]. SpamAssassin is an excellent example of a rule-based spam detector.
4) PREVIOUS LIKENESS BASED EMAIL SPAM DETECTION TECHNIQUE
In this strategy, incoming emails are sorted according to how closely they resemble instances already stored in memory (i.e., training emails). New instances are represented as points in a multidimensional space, which is generated using the email's properties [38]. Each new instance is then assigned to the most common class among its k nearest training instances. For this purpose, the k-NN method is used.
5) ADAPTIVE BASED EMAIL SPAM DETECTION TECHNIQUE
The system detects spam by assigning each message to one of several categories [39]. It classifies a collection of emails into categories and assigns a defining text to each category.
B. MACHINE LEARNING
Artificial intelligence encompasses many subfields, one of which is ML. The term ML is used to describe the process of designing, analyzing, and deploying systems that help a machine get better results. ML systems use training data to make predictions about the problem. In particular, training data is used throughout the learning phase to extract information and develop a method that should generalize to all conceivable problem cases [40]. After learning, the ML method is used to classify new samples. The goal of ML is to develop a method that predicts well on test data with new examples. ML uses training data to construct methods that can effectively predict outcomes for fresh data, enabling automated decision-making across many disciplines [41]. Many fields have successfully used applications of ML techniques. However, a subset of these fields always needs new methods because of an adversarial figure, and that includes phishing detection, spam detection, and botnet identification [42]. Nonetheless, institutions and researchers need to address this issue by taking into account the unique characteristics of their respective fields of study. For instance, phishing differs from spam in that it often masquerades behind legitimate-looking brand logos and requests personal information or conveys an urgent message [43]. ML security research on adversarial techniques typically focuses on spam email detection, whose adversarial figure is commonly referred to as a spammer. By including specific misspelt terms or legitimate words in the email, spammers try to fool the classifier without negatively impacting the email's readability. As a result, spam emails may contain malicious data that was purposefully injected by spammers to compromise the data used for training the classifiers and, in turn, undermine the filter's regular operation [44]. In addition, a comparative analysis tested many ML methods on the same data set. Accuracy and precision were used to evaluate the various machine learning methods. The accuracy of the support vector machine was 98.09% [45]. Additionally, Cota et al. used two publicly accessible corpora. For the first set of tests, each corpus was divided into 80% training and 20% testing, and for the second set, 70% training and 30% testing. Using Random Forests, the best accuracy for the input corpus was
85.25% and 86.25%, respectively. These findings are consistent with other studies [46]. From the previous studies on spam detection using ML methods outlined in Table 2, it can be inferred that scholars strongly appreciate ML methods for their significance in detecting spam texts.
Currently, ML methods employed for email spam detection mostly rely on techniques such as SVM, NB, RF, and k-NN. These approaches have been successful in reaching accuracies within the range of 90-99%. Nevertheless, these strategies encounter constraints such as false positives, static feature extraction, and high computational complexity. There is a notable lack of research in creating more flexible and responsive methods that can respond to evolving spam strategies. Additionally, there is a need to investigate hybrid or ensemble techniques that integrate various algorithms in order to enhance accuracy and minimize false positives.
ML methods have proven effective for email spam detection across multiple studies. However, ML methods may struggle with vague and ambiguous information. In contrast, fuzzy systems can better handle uncertainty and imprecision in data and logic. This is because fuzzy systems can represent and reason with vague, ambiguous information using fuzzy logic. Furthermore, fuzzy systems can adjust and adapt to changing data and situations by applying fuzzy rules. In the following section, the use of fuzzy systems is discussed in more detail in the context of email spam detection.
C. FUZZY SYSTEM
There has been a proliferation of applications of fuzzy set theory in recent years, including ML, data mining and DL. Researchers in this area recognised the need for measuring the fuzzy membership vector in a fuzzy set or event as a result of the widespread use of the idea of fuzzy set theory [55]. Additionally, Gazal et al. developed a two-level filter-based hybrid spam detection methodology. At Level-1, a high-level filter removes irrelevant and unimportant features and content. Level-2 uses a fuzzy-based composite evaluator for low-level filtration and to find the most effective features. CSDMC2010 SPAM, spambase and the SMS Spam Collection are all used in the method's implementation. The results of the comparison showed that the proposed method beat the current conventional and recent algorithms and methods, with an average accuracy of 98.80% on the CSDMC2010 dataset, 97.79% on the spambase dataset, and 98.84% on the SMS Spam Collection dataset [56]. Moreover, fuzzy inference systems utilising Interval Type-1 and Interval Type-2 were created employing four distinct machine learning algorithms to showcase their efficacy in identifying spam. The methods evaluated were SVM, LR, and averaged perceptron. The Interval Type-2 Mamdani fuzzy inference system (IT2M-FIS) demonstrated superior performance, with an accuracy of 0.955, recall of 0.967, F-score of 0.962, and an AUC of 0.971 [57]. In addition, Srinivasarao et al. introduced fuzzy-based Recurrent Neural network-based Harris Hawk optimization (FRNN-HHO) to post-classify spam and ham messages. Three distinct datasets (SMS, Email, and SpamAssassin) are used to assess the efficacy of the proposed architecture. For the SMS dataset, the suggested method achieved an AUC of 0.9699, for the email dataset it achieved 0.958, and for SpamAssassin it achieved 0.95 [58]. In another study, fuzzy C-Means clustering was utilized for spam email segmentation to prevent cybercrime in the Internet era. Previous studies have shown that clustering in data mining for spam filtering has been understudied. This study demonstrated that Fuzzy C-Means clustering showed promising results for spam email categorization on a public spam dataset using different parameters [59]. Moreover, email's growing popularity as a secure online communication method has led to the rise of unsolicited bulk emails or spam. A proposed spam filtering strategy handles this issue by employing Relief feature selection and a fuzzy-SVM to deal with uncertain elements. Experiments showed that these algorithms improved spam filtering accuracy and detection speed [60]. In another study, the widespread problem of spam in mailboxes was noted to have negative effects on network resources and daily life. To address this issue, a content-based spam filtering algorithm using fuzzy-SVM and k-means was proposed. k-means clustering reduces data while maintaining critical information. Meanwhile, fuzzy-SVM trains a classification method to handle ambiguity. This strategy improves spam filtering speed and accuracy, according to experiments [61]. Table 3 presents prior research on spam detection using fuzzy systems. From this analysis, it can be assumed that researchers highly value the significance of fuzzy system techniques in email spam detection.
The present research examines the application of various fuzzy systems in email spam detection. It focuses on distinct models, datasets, merits, and findings. However, there is a significant lack of research in combining fuzzy logic with sophisticated DL methods. Although Fuzzy-BERT demonstrates potential, there is a lack of investigation into hybrid models that integrate fuzzy logic with other cutting-edge algorithms in order to enhance accuracy and resilience. Moreover, the majority of studies concentrate on binary classification, disregarding the potential advantages of employing multi-class classification methods for spam detection.
Fuzzy systems have proven effective for email spam detection across multiple studies. Fuzzy systems provide advantages in dealing with uncertainty but require expertise in design and may struggle with high-dimensional data. In contrast, DL methods can handle high-dimensional data. DL can automatically learn complex patterns from raw text input without extensive feature engineering. This enables DL methods to overcome the curse of dimensionality faced by fuzzy systems in processing raw email data [63]. In the following section, the use of DL
methods is explored further for email spam detection, as DL is well-suited to overcome limitations of fuzzy systems.
D. DEEP LEARNING
DL is an up-and-coming field that uses several nonlinear processing layers to learn features directly from the input, leveraging AI and ML. Email spam detection accuracy may be greatly improved with the help of DL methods. Deng and Yu conducted an analysis of different DL methods, categorising them into supervised, unsupervised, and hybrid deep networks based on their network structures. They also explored various applications of these techniques, including computer vision, language modelling, text processing, multimodal learning, and information retrieval [64], [65]. DL relies on representations of data that include several levels of hierarchy, often in the form of a neural network with more than two layers. With these methods, higher-level features are composed automatically from lower-level ones. Each neuron in a neural network (NN) shares several common characteristics. The number of neurons and their interconnections are in turn determined by the nature of the application being used [66]. Baccouche et al. introduced a multi-label LSTM model to identify spam and fraud in emails and social media posts. The model was developed by merging two datasets. The system was trained by utilising a collective dataset of prevalent bigrams obtained from multiple sources. Their model has an accuracy of 92.7%. A limitation of the study was the absence of a comparative analysis with other sophisticated techniques for identifying harmful information. In the future, they intend to explore alternative NLP methods in order to enhance the accuracy of the model [67]. Alauthman et al. proposed a combined SVM and GRU-RNN approach to detect botnet spam emails, using a dataset containing spam records. According to the authors, their method attained a precision of 98.7%. Their research was limited to assessing the efficacy of the proposed model using a single dataset. The proposed method accurately identifies spam emails, but additional investigation is required to enhance the GRU model by integrating supplementary multiclass classifiers [68]. Moreover, AbdulNabi and Yaseen conducted research on word embedding techniques for the purpose of classifying spam emails. The authors fine-tuned a pre-trained BERT model and conducted a comparison with DNN and traditional classifiers such as naïve Bayes and k-NN. The proposed technique attained a 98.67% accuracy and a 98.66% F1 score when evaluated on two open-source datasets [69]. Furthermore, Eckhardt and Bagui designed a study in which they analysed LSTM and CNN methods for the purpose of classifying textual input. The investigation revealed that the LSTM method achieved the maximum accuracy of 98.32% and a ROC score of 96.57%. The comparison pertains only to the classification of textual material. They asserted that the Adam optimizer outperformed the SGD optimizer in both models. According to the study, ReLU demonstrated superior performance compared to CNN, while sigmoid showed superior performance compared to LSTM on average [70]. In addition, Rafat et al. investigated the impact of text pre-processing on email classification using ML and DL techniques. The ML and DL algorithms were compared using the SpamAssassin corpus, both with and without text pre-processing. The researchers discovered that DL methods performed better than ML methods. Specifically, the LSTM method achieved a precision of 95.26%, recall of 97.18%, and an F1-score of 96% without any text pre-processing [71]. Additionally, Wen et al. introduced LBPS, a phishing scam detection model for blockchain financial security. The model is built on LSTM-FCN and BP NN. The proposed model utilises a Backpropagation Neural Network (BP NN) to analyse implicit features and an LSTM-FCN network to analyse the temporal aspects of transaction data. The experimental findings, using Ethereum data, demonstrated that the chosen characteristics effectively identified fraudulent accounts involved in phishing scams, achieving a 97.86% F1-score and a 97% accuracy rate [72]. Table 4 presents the previous research on spam detection using DL methods. DL methods enhance the effectiveness of the spam detection method, reduce the impact of overfitting, and handle large data.
The studies above cover many different DL methods that can be used to detect spam in email, including models such as CNN, LSTM, and hybrid combinations of these methods. Although the results have been promising, there is a significant research gap in the development of ensemble learning techniques, which combine the strengths of many DL models to further boost performance. In addition, although a great number of studies make use of publicly accessible datasets, there is a dearth of research that investigates the application of these models to large-scale, real-world datasets that have the potential to more accurately represent a variety of spam characteristics. Another gap is the lack of attention paid to the interpretability and explainability of DL models, which are essential for the actual implementation of spam detection systems. In addition, the majority of current research places an emphasis on accuracy measures, while ignoring other significant features such as processing efficiency and adaptation to increasingly sophisticated spam strategies. By addressing these deficiencies, it may be possible to develop spam detection systems that are more robust, efficient, and adaptable through the application of DL techniques.
The present review diverges from the previous reviews by placing greater emphasis on reevaluating ML, fuzzy system, and DL methods employed for the purpose of detecting email spam. The review aims to discuss email spam detection methods, the parameters utilized for comparative analysis, simulation tools, and the dataset corpus. The reviewed era encompasses recent research articles that contribute to
III. METHODS
Email spam refers to the sending of fraudulent or undesired
mass emails through either an individual’s account or an
automated mechanism. The prevalence of spam emails has
steadily risen over the past decade, posing a widespread
issue. ML and DL have significantly contributed to the
identification of spam emails. Researchers are utilizing a
range of methods and strategies to create innovative spam
detection systems. This section provides an overview of the
most widely used ML and DL methods that have been
optimized for spam detection.
FIGURE 7. Structure of the SVM.
classification algorithms. The optimized SVM algorithm for email spam detection is shown in algorithm 1.
B. DECISION TREE
The DT is a popular technique for classifying data since the solution it produces is both interpretable and straightforward. Furthermore, it provides a result more quickly than other categorization techniques [88]. It is structured like a tree with a root node, branches, and leaves. The terminal node, or leaf node, represents a class attribute, and the other nodes represent tests on attributes. To determine the class of a terminal node, the route from the root to the terminal node must be accurately traced [89]. Tracing the tuples is made significantly simpler by translating the tree into categorization rules [90]. It is common practice in the fields of data mining, machine learning, and even statistics to employ the decision tree learning method. DT learning has been adapted to spam detection. The structure of the DT is presented in figure 8.
A hybrid approach combining LR and DT has been used for email spam identification. LR was employed to reduce the impact of noisy data or instances prior to supplying the data to DT induction. By applying a predetermined false-negative threshold, LR effectively eliminated the noisy data by selecting only the accurate predictions [91]. This study used the Spambase dataset to assess the proposed technique. The 91.67% accuracy is encouraging for the given strategy. LR may increase DT performance by minimising noisy data. GADT is a hybrid spam email detection method. PCA improved GADT's performance. Decision tree
Algorithm 1 SVM Algorithm for Email Spam Detection
1: Input: Email message x to classify
2: Input: Training set S, kernel function K, regularization parameters C = {c_1, ..., c_num}, kernel coefficients γ = {γ_1, ..., γ_num}
3: Input: Number of folds k for cross-validation
4: for i = 1 to num do
5:   Set C = c_i
6:   for j = 1 to num do
7:     Set γ = γ_j
8:     Train SVM classifier f(x) with parameters (C, γ) on S
9:     if first classifier then
10:      Set f*(x) = f(x) as best classifier
11:    else
12:      Compare f(x) with f*(x) using k-fold cross-validation
13:      Set f*(x) to the more accurate classifier
14:    end if
15:  end for
16: end for
17: Return spam or ham classification of x using final classifier f*(x)

Algorithm 2 DT Algorithm for Email Spam Detection
1: Input: Email message dataset D
2: Calculate entropy H(D) of full dataset
3: while stopping condition not met do
4:   for each attribute A do
5:     Calculate entropy H(D|A) for splits on A
6:     Calculate weighted average entropy over all splits
7:     Calculate information gain Gain(A) = H(D) - H(D|A)
8:   end for
9:   Choose A with highest Gain(A) as split attribute
10: end while
11: Return DT method classifying messages as spam or ham

C. K-NEAREST NEIGHBOR
The k-Nearest Neighbor (k-NN) algorithm is one of the most popular since it is simple to use and understand. This is because its features can be quickly grasped and put to use [95]. k-NN uses the computed distance between a given instance and its k nearest neighbors to determine how to categorize the instance in question. The category to which an instance belongs is decided by a majority vote among those neighbors. If k is set to two, for example, the instance is classified based on its two nearest neighbors [96]. The Euclidean distance (ED) between each training sample and the test sample is typically used for this purpose [97]. The classification results for k-NN vary greatly depending on the value of k chosen for the number of neighbors. A simple k-NN structure is given in figure 9.
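The k-NN procedure described above (Euclidean distance to every training sample, then a vote among the k closest) can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the implementation used in any of the cited studies: the two-feature vectors (count of suspicious words, count of links) and the names `knn_classify`, `TRAIN_VECTORS` and `TRAIN_LABELS` are assumptions made for the example.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Euclidean distance (ED) between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train_vectors, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Rank every training sample by its distance to the query.
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: euclidean_distance(pair[0], query),
    )
    # Count the labels of the k closest samples and return the most common one.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy features: [count of suspicious words, count of links].
TRAIN_VECTORS = [[8, 5], [7, 6], [9, 4], [0, 1], [1, 0], [0, 0]]
TRAIN_LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]

if __name__ == "__main__":
    print(knn_classify(TRAIN_VECTORS, TRAIN_LABELS, [6, 5], k=3))
    print(knn_classify(TRAIN_VECTORS, TRAIN_LABELS, [1, 1], k=3))
```

With two classes, an odd k avoids tied votes, which is one reason the choice of k matters for the classification result.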
in the dataset were when executing the k-NN classifier. The accuracies obtained were 98.08% and 95.98%. They suggested an approach that combines SVM and k-NN. The selection approach they came up with uses labels and proximity to a decision boundary to determine which instances to pick. The basic idea was to find the instances most similar to a query and construct a local SVM that performs the separation on that set of similar instances [99]. Furthermore, they conducted experiments using the publicly available Dredze dataset, which demonstrated an improvement in accuracy of almost 98%. In order to combat spam, they employed k-NN text classification using Chi squared feature selection to filter out unwanted messages. The value of k at which the k-NN classifier obtains the highest accuracy was found through experimentation [50]. Hnini et al. proposed using three Nearest Neighbour (NN) methods (k-NN, Wk-NN, and K-d tree) to detect spam. NLP is used to pre-process the emails and extract features using Bag-of-words (BoW), N-gram, and TF-IDF. k-NN performed well on four measurement parameters on the Enron and LingSpam datasets [100]. Additionally, a new spam categorization method that combines the Harris Hawks optimizer (HHO) and k-NN algorithms was proposed. This study found that the proposed spam detection method had the highest classification accuracy. The proposed approach achieved 94.3% accuracy in experiments [101]. The k-NN method for email spam detection is presented in algorithm 3.
involving the categorization of data. Tin K. Ho first presented the general random forest method in 1995, and an extension of this approach followed in 2001. This method uses a large number of decision trees. Rather than creating each tree using the same set of features, it generates a random forest of trees whose collective prediction is more accurate than that of any one tree [102]. The approach relies on the fact that creating a simple decision tree with a limited number of features requires very few processing resources [103]. The algorithm's three primary hyperparameters are node size, tree depth, and feature sampling. A simple RF structure is given in figure 10.
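The random forest description in this section (many small trees, each split drawn from a random subset of features, combined by a collective vote, and controlled by node size, tree depth, and feature sampling) can be illustrated with a minimal pure-Python sketch. Several details are assumptions made for the example rather than taken from the cited works: the Gini impurity split criterion, bootstrap sampling of the training rows, the toy two-feature data, and all function and parameter names (`train_forest`, `max_depth`, `min_node_size`, `n_features`).

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def majority(labels):
    """Most frequent label in the list."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, depth, max_depth, min_node_size, n_features, rng):
    """Grow one tree: internal nodes are tuples, leaves are class labels."""
    if depth >= max_depth or len(rows) < min_node_size or len(set(labels)) == 1:
        return majority(labels)
    best = None
    # Feature sampling: each split only looks at a random subset of features.
    for feature in rng.sample(range(len(rows[0])), n_features):
        for threshold in sorted({row[feature] for row in rows}):
            left = [i for i, row in enumerate(rows) if row[feature] <= threshold]
            right = [i for i, row in enumerate(rows) if row[feature] > threshold]
            if not left or not right:
                continue
            impurity = (
                len(left) * gini([labels[i] for i in left])
                + len(right) * gini([labels[i] for i in right])
            ) / len(rows)
            if best is None or impurity < best[0]:
                best = (impurity, feature, threshold, left, right)
    if best is None:  # the sampled features offer no usable split
        return majority(labels)
    _, feature, threshold, left, right = best

    def grow(indices):
        return build_tree(
            [rows[i] for i in indices], [labels[i] for i in indices],
            depth + 1, max_depth, min_node_size, n_features, rng,
        )

    return (feature, threshold, grow(left), grow(right))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if row[feature] <= threshold else right
    return node


def train_forest(rows, labels, n_trees=25, max_depth=3, min_node_size=2,
                 n_features=1, seed=0):
    """Train each tree on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree(
            [rows[i] for i in picks], [labels[i] for i in picks],
            0, max_depth, min_node_size, n_features, rng,
        ))
    return forest


def predict_forest(forest, row):
    """Collective prediction: majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])


# Hypothetical toy features: [count of suspicious words, count of links].
ROWS = [[8, 5], [7, 6], [9, 4], [6, 7], [0, 1], [1, 0], [0, 0], [2, 1]]
LABELS = ["spam"] * 4 + ["ham"] * 4

if __name__ == "__main__":
    forest = train_forest(ROWS, LABELS)
    print(predict_forest(forest, [10, 9]), predict_forest(forest, [0, 0]))
```

Each individual tree here is shallow and sees only one feature per split, so it is cheap to build; the accuracy comes from the vote over the whole forest.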
99.44% accuracy [20]. Moreover, Gupta et al. studied the efficacy of eight different classifiers and compared their results. The results of the classifier evaluation show that the CNN classifier achieves a maximum precision of 99.19% and an average recall of 99.26% and 99.94%, respectively, across the two datasets [125]. In addition, a CNN method was developed for SMS spam detection using the Tiago dataset. After preprocessing the text data, including tokenization and stopword removal, the CNN achieved 98.40% accuracy in classifying messages as spam or not spam. The work provides a highly accurate CNN architecture and process for SMS spam detection [126]. Another study analyses images using a CNN and compares the findings to other ML methods. The CNN-based methodology detects real-world image spam and challenging image spam-like datasets better than earlier methods by using a new feature set mixing raw photos and Canny edges [127]. The algorithm for email spam detection using CNN is presented in algorithm 7.

Algorithm 6 ANN Algorithm for Email Spam Detection
1: Input sample email message dataset, with class labels c ∈ {-1, +1}
2: Initialize method parameters w (weight vector) and b (bias term) randomly or to 0
3: repeat
4:   Get a training message sample (x, c) that our current method misclassifies, i.e. sign(w^T x + b) ≠ c
5:   if no such misclassified sample exists then
6:     Training completed, store final w and b and stop
7:   else
8:     Update parameters:
9:       w = w + c · x
10:      b = b + c
11:    Go to step 4
12:  end if
13: until no training sample is misclassified
14: To classify new email message x:
15: Compute sign(w^T x + b)
16: Return email message classification (spam or ham)
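Algorithm 6 is the classic perceptron update rule, and a minimal sketch of it in plain Python is given below. The sketch makes several assumptions that are not in the original: labels are encoded as +1 for spam and -1 for ham, a score of exactly zero is treated as +1, the misclassified sample is found by sweeping the training set in order, and a `max_epochs` cap is added so the loop also terminates on data that is not linearly separable. The toy features and all names are hypothetical.

```python
def sign(value):
    """Return +1 for non-negative scores and -1 otherwise."""
    return 1 if value >= 0 else -1


def score(weights, bias, x):
    """Linear score w^T x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_perceptron(samples, labels, max_epochs=100):
    """Repeat the update w = w + c*x, b = b + c until nothing is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, c in zip(samples, labels):
            if sign(score(weights, bias, x)) != c:
                # Misclassified sample: move the boundary towards it.
                weights = [w + c * xi for w, xi in zip(weights, x)]
                bias += c
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return weights, bias


def classify(weights, bias, x):
    """Map the sign of the score back to a spam/ham label."""
    return "spam" if sign(score(weights, bias, x)) == 1 else "ham"


# Hypothetical toy features: [count of suspicious words, count of links].
SAMPLES = [[3, 2], [4, 3], [5, 1], [0, 0], [1, 0], [0, 1]]
LABELS = [1, 1, 1, -1, -1, -1]  # assumed encoding: +1 = spam, -1 = ham

if __name__ == "__main__":
    w, b = train_perceptron(SAMPLES, LABELS)
    print(classify(w, b, [6, 4]), classify(w, b, [0, 0]))
```

On linearly separable data such as this toy set the loop stops on its own once a full sweep produces no update; the epoch cap only matters when no separating boundary exists.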
H. LONG SHORT-TERM MEMORY
LSTM is an advanced RNN for sequence modeling. RNNs work in a similar way: the network remembers earlier information and uses it to process the current input [128]. RNNs with traditional architectures have a recurring problem. Because of the phenomenon known as the vanishing gradient, RNNs are incapable of retaining and recalling long-term dependencies. LSTM is specifically designed to mitigate the problems related to long-term dependencies [129]. The default behavior of LSTM is to learn long-term dependencies by retaining information over long periods of time. LSTM employs gates to regulate information flow in recurrent computations. LSTM, a type of recurrent neural network, was designed in 1997 to deal with temporal data sequences and to solve the problems of exploding and vanishing gradients [130]. This neural network includes a memory cell that can hold values recorded over time in relation to previous information. The memory cell is controlled by three gates, each of which serves a different function. The forget gate determines whether the information from the previous timestamp should be retained or disregarded. The input gate is responsible for acquiring fresh information from the input [131]. The output gate sends the new information from the current timestamp to the next. Each gate is implemented with a sigmoid function, which returns a number between zero ("totally forget") and one ("completely keep"). Every time an LSTM network is activated, it creates two states: a cell state that is passed to the next time-step, and a hidden state, which is the time-step's output vector. A simple architecture of the LSTM is presented in figure 14.
model is used to determine the spam likelihood based on any attached images [77]. In another study, a combined model using an LSTM, LR, NB, RF, k-NN, SVM and DT was tested on the UCI SMS spam collection dataset with various embedding techniques (count vectorizer, TF-IDF vectorizer and hashing vectorizer). The highest accuracy of 98.5% was achieved by the LSTM method in this combined architecture [78]. Moreover, a Semantic LSTM (SLSTM) was proposed for spam SMS detection and classification using the SMS Spam Collection dataset and a Twitter dataset. The SLSTM incorporates a semantic layer into an LSTM network using Word2Vec word embeddings. Experiments showed the proposed SLSTM technique achieved accuracy results of 99.01% on the SMS Spam Collection dataset and 95.09% on the Twitter dataset [132]. Furthermore, a lightweight GRU (LG-GRU) was employed instead of the LSTM layer for spam classification on the SMS Spam Collection dataset. To improve the semantic understanding of the SMS text inputs, external information from WordNet was incorporated. Compared to LSTM models, the proposed LG-GRU model drastically reduced training time and the number of parameters, while maintaining 99.04% accuracy for spam categorization [79]. Additionally, RNNs are one type of NN that can remember past data but suffer from vanishing and exploding gradient issues. To overcome this drawback, one proposed system leverages the Spambase and Ling Spam datasets to classify spam and ham emails using an LSTM architecture. LSTM keeps track of prior email information and learns to select relevant features while ignoring irrelevant ones for identifying spam. Experiments showed the LSTM method achieves 97.4% accuracy, outperforming other DL methods on these datasets [80]. Moreover, spam emails are used for propaganda, advertising, and phishing, which can financially and morally harm internet users as well as disrupt internet traffic. To address this issue, one study detected spam emails in a Turkish dataset with 100% accuracy using the Keras library and an LSTM method. The results demonstrated that an LSTM-based method was highly effective for spam detection in Turkish emails [133]. Furthermore, spam emails cause
issues like network disruption and cybercrime. A sentiment
analysis-friendly spam mail detection method was proposed
using Word Embedding techniques including Bag of Words,
Hashing, and an LSTM method. Experiments on a dataset
of 5,572 messages showed the proposed technique achieved
93-98% in precision, recall, F1-score, and accuracy [134].
FIGURE 14. Structure of the LSTM. The algorithm for email spam detection using an LSTM is
presented in algorithm 8.
Since their introduction, several DL based spam detec-
tion algorithms have been proposed. Yang and his team
outlined an email classification system called Multi-Modal I. GATED RECURRENT UNIT
Architecture with Model Fusion (MMA-MF). The primary GRU is an RNN version that employs gating methods to solve
focus of this model is to identify spam by processing the vanishing gradient problem through controlling information
email’s text and images independently using an LSTM flow between cells in the neural network. Kyunghyun Cho
model and a CNN model, respectively. An LSTM model introduced the GRU network in 2014, This RNN is almost
is utilized to determine the likelihood that an email is like LSTM neural network [135]. The structure of the GRU
spam based on its textual content. Meanwhile, a CNN allows it to effectively capture dependencies from large
Algorithm 8 LSTM Algorithm for Email Spam Detection relevant context and sequentially whether the message is
1: Input Email Spam dataset likely to be spam or not. The ability of GRUs to selectively
2: Convert the text data into numerical vectors using word propagate relevant information while processing variable
embeddings length sequences makes them a promising approach for
3: Split the data into training and testing modeling email text for spam detection [70]. Moreover,
4: Define LSTM architecture a new DL approach uses CNN and RNN to analyze
5: Set the LSTM units and hidden layers email communication by classifying message components
6: Add an embedding layer to convert numerical vectors into zones. The method leverages GRU-CRF to segment
into word embedding emails into zones like header, quotation, greeting, and body.
7: Add dropout Experiments show the technique achieves 98 accuracy on
8: Add dense output layer using sigmoid zone prediction, outperforming traditional methods, with
9: Compile with binary cross-entropy improved adaptability and resilience [140]. Furthermore,
10: Train the method with specified epochs a lightweight GRU (LG-GRU) was employed instead of
11: Evaluate the method an LSTM layer for spam classification on the SMS Spam
12: Predict the email message (spam or ham) Collection dataset. To improve the semantic understanding
of the SMS text inputs, external information from WordNet
was incorporated. Compared to LSTM models, the proposed
sequences of data in a flexible manner, while retaining LG-GRU model drastically reduced training time and the
knowledge from prior sections of the sequence. The GRU number of parameters, while maintaining 99.04% accuracy
model consists of two gating mechanisms: the update gate for spam categorization [79]. The algorithm for email spam
and the reset gate [136]. This neural network utilises only one detection using GRU is presented in algorithm 9.
hidden state to concurrently retain both long-term and short-
term memory. The reset gate is formulated and calculated by Algorithm 9 GRU Algorithm for Email Spam Detection
incorporating the hidden state from the previous time step and 1: Input Email Spam dataset
the input data from the current time step. The gate controls 2: Convert the text data into numerical vectors using word
the integration of new input with existing memory [137]. The embeddings
update gate is used for how much of the previous state is 3: Split the data into training and testing
kept. This is extremely useful since the method may choose 4: Define GRU architecture
to duplicate all previous data and remove the possibility 5: Set the GRU units and hidden layers
of vanishing gradients. This is accomplished via a sigmoid 6: Add an embedding layer to convert numerical vectors
function, which returns a number between 0 and 1. For this into word embedding
simple architecture, the network is able to train rapidly [138]. 7: Add dropout
A simple architecture of the GRU is presented in figure 15. 8: Add dense output layer using sigmoid
9: Compile with binary cross-entropy
10: Train the method with specified epochs
11: Evaluate the method
12: Predict the email message (spam or ham)
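The gating behavior described for the GRU in this section can be illustrated with a minimal sketch of a single GRU step on scalar values. This is not the implementation used in any of the reviewed studies: the function names `gru_step` and `run_sequence`, the parameter names (`wz`, `uz`, `bz`, and so on) and all weight values are hypothetical, chosen only for illustration. The sketch assumes the convention in which the update gate z weights the previous hidden state, in line with the statement that the update gate decides how much of the previous state is kept.

```python
import math


def sigmoid(x):
    """Logistic function; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One GRU time step for a scalar input x and scalar hidden state h_prev.

    p is a dict of hypothetical scalar weights and biases (illustration only).
    """
    # Update gate: how much of the previous state is kept (0 = none, 1 = all).
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])
    # Reset gate: how much of the previous state feeds the candidate state.
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])
    # Candidate state built from the new input and the reset-scaled memory.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # Blend the previous state and the candidate according to the update gate.
    return z * h_prev + (1.0 - z) * h_cand


def run_sequence(xs, p, h0=0.0):
    """Feed a sequence through the cell and return the final hidden state."""
    h = h0
    for x in xs:
        h = gru_step(x, h, p)
    return h


# Hypothetical parameters: all weights zero, so both gates output 0.5.
ZERO = {k: 0.0 for k in ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}
# A large update-gate bias pushes z close to 1, so the old state is copied.
KEEP = dict(ZERO, bz=20.0)

if __name__ == "__main__":
    print(gru_step(1.0, 0.8, ZERO))
    print(gru_step(1.0, 0.8, KEEP))
```

With the all-zero parameters both gates equal 0.5 and the candidate state is zero, so each step halves the hidden state. With the large update-gate bias the previous state passes through almost unchanged, which is the carry-forward behavior that helps information and gradients survive long sequences.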
when making predictions [142]. A simple architecture of the Bi-LSTM is presented in figure 16.
FIGURE 16. Structure of the Bi-LSTM.
The task of email spam detection involves the construction of models that capture the contextual information of words inside an email, enabling the determination of whether the email's content may be classified as spam or not. The Bi-LSTM model is well suited to this task because of its ability to capture both semantic and syntactic links between words. This is achieved by processing the email content in both forward and backward directions [143]. Additionally, a new DL model for email spam detection was proposed that uses sentiment analysis of email text, combining word embeddings, CNN, and Bi-LSTM networks to analyze textual and sequential properties. Evaluated on two spam datasets, the method achieves improved accuracy of 98-99% and outperforms popular classifiers and state-of-the-art methods [144]. Moreover, spam emails are becoming more common and troublesome as email usage grows, so there is a need for effective methods to detect spam. A recent study compared different ML and DL models, such as RF, NB, ANN, SVM, LSTM, and Bi-LSTM, for the task of identifying spam emails. The study found that Bi-LSTM had the best accuracy of 98.57% for spam prediction [145]. Furthermore, spam text messages steal information from users and harm them, but the available detection methods are not sufficiently effective. Vectorization-based feature engineering and Bi-LSTM networks can be used together to build an effective predictor for spam SMS. Experiments showed that the method outperforms other methods in terms of precision, recall, and F1 measures [146]. The algorithm for email spam detection using Bi-LSTM is presented in algorithm 10.
Algorithm 10 Bi-LSTM Algorithm for Email Spam Detection
1: Input Email Spam dataset
2: Convert the text data into numerical vectors using word embeddings
3: Split the data into training and testing
4: Define Bi-LSTM architecture
5: Set the Bi-LSTM units and hidden layers
6: Add an embedding layer to convert numerical vectors into word embedding
7: Add dropout
8: Add dense output layer using sigmoid
9: Compile with binary cross-entropy
10: Train the method with specified epochs
11: Evaluate the method
12: Predict the email message (spam or ham)
The LSTM model has proven to be the most effective for email spam detection due to its specialized architecture designed for sequential data. Emails are inherently sequential, consisting of words and sentences in a specific order, which aligns well with LSTM's strengths. The model's memory cell excels at capturing long-term dependencies and contextual information, allowing it to effectively learn patterns and relationships between words or tokens in email sequences. This ability to retain and process contextual information over many timesteps is crucial for spam detection, as important clues may be spread throughout the email body. The LSTM's adaptability to various writing styles and content types further enhances its effectiveness across different datasets and evolving spam techniques.
To further improve LSTM's accuracy in email spam detection, several modifications can be considered. Incorporating attention mechanisms could help the model focus on the most relevant parts of an email. Ensemble methods, combining LSTM with other models, could leverage the strengths of different approaches. Transfer learning, by pre-training the LSTM model on a large corpus of email data, could enhance performance, especially when dealing with limited labeled data. Additional strategies such as feature engineering, regularization techniques, hierarchical LSTM structures, and character-level input processing could also contribute to improved accuracy.
Furthermore, numerous evaluation metrics have been employed to measure the effectiveness of these LSTM models. The metrics most frequently used in the papers we have reviewed are:
Accuracy: Accuracy is one factor to consider when rating classification models. It is the proportion of predictions that the method got right. For binary classification, accuracy can also be expressed in terms of positives and negatives, as shown below:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
Precision: Precision can also be used to judge how well a classification system works. It is found by dividing the number of true positives by the sum of true positives and false positives for each class. It gives the share of truly positive cases among all positive predictions.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
Recall: Recall is a quantitative measure that indicates the proportion of actual positive instances that the method correctly identifies. It is the ratio of true positive cases to the sum of true positive and false
negative cases.
$$\text{Recall} = \text{TPR} = \frac{TP}{TP + FN} \tag{3}$$
F1-score: The F1-score is the harmonic mean of precision and recall, combining both into a single value.
$$\text{F1-score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
IV. DATASETS COLLECTION AND PRE-PROCESSING
A. DATASETS
The collection of data samples contained within a corpus plays a pivotal role in evaluating the efficacy of any spam detection technique. While several conventional datasets are commonly used to assess text classification, only recently have researchers publishing new spam detection methodologies made an effort to provide public access to the email corpora used to assess the effectiveness of their proposed methods. The publicly released spam email datasets referenced across the studies covered in this paper are summarized in Table 5. Each corpus has its own traits and labeling, which ultimately dictate the generalizability and comparability of experimental outcomes for every published approach using that data source. Key dimensions that characterize an evaluation dataset include the size of the email collection and the class balance between spam and ham samples.
The vast majority of features leveraged to distinguish spam from legitimate emails manifest in textual content. Applying appropriate pre-processing to standardize, clean, and filter this text data represents a foundational data wrangling step prior to method development. The following sub-section provides the details of pre-processing techniques.
B. PRE-PROCESSING TECHNIQUES
Before data can be analyzed, it must be prepared through a process called preprocessing. Raw datasets often contain inconsistencies like missing values, duplicate entries, and text in incompatible formats that methods cannot interpret. Preprocessing transforms messy raw data into a clean form that analytical methods can work with effectively. This crucial step improves the accuracy of later analysis. Common preprocessing tasks include handling incomplete data, standardizing text into numerical forms, extracting informative features, and removing noise. Careful preprocessing allows methods to discover more robust patterns and make better predictions. The preprocessing techniques most commonly used for email spam detection are given below:
1) HANDLING MISSING VALUES
The management of missing values in datasets is a key component in preventing bias and ensuring that methods continue to produce accurate results. A number of approaches can be used, including the removal of records with missing values or the substitution of such values with the mean, the median, or specified values.
C. TEXT PREPROCESSING
Text preprocessing transforms raw text data into a cleaner form before analysis. Removing extraneous elements allows more accurate feature extraction and model development further downstream. Preprocessing is thus an essential first step when working with text data. Common text cleaning tasks include stripping punctuation, deleting HTTP links, eliminating special characters, removing stop words, lowercasing all text, correcting spellings, and more. Numerous text-preprocessing techniques exist for the purpose of eliminating unnecessary information from incoming text input, as shown in Figure 17.
FIGURE 17. Various text preprocessing techniques.
1) STEMMING
Stemming seeks to simplify text analysis by stripping words down to their base form. Tools match terms like "drinks", "drinking", and "drink" to their core stem, "drink". This normalization groups together different inflections, allowing more generalized patterns to emerge. Stemmers remove suffixes systematically using rule-based algorithms like the popular Porter stemmer in Python's NLTK library. However, stemming can err in two directions: over-stemming collapses unrelated words into the same stem, while under-stemming fails to fully reduce related terms down to one stem.
2) TOKENIZATION
Tokenization splits text into discrete units for analysis. First, extraneous characters like HTML and punctuation are filtered out. Then words and numbers are extracted into individual tokens by splitting on whitespace and symbols. These atomic elements can be manipulated, counted, classified and more. Tokenization forms the basis for quantitative text analysis. This preprocessing step makes linguistic features accessible using Python's re (regular expression) module and Natural Language Processing toolkits. Proper tokenization increases performance
on tasks ranging from sentiment classification to document summarization. Table 6 shows a sample sentence and its associated tokens.
3) STOPWORDS REMOVAL
Stop words are common filler words that carry little meaning, such as "a", "an", "so", "and", and "the". Though frequently occurring, these terms contribute more noise than signal during text analysis. Filtering out stop words shrinks datasets down to more meaningful vocabulary. Most text analysis toolkits, such as Python's NLTK library, provide standard stop word lists and functions to strip them out. Table 7 presents the descriptions and web URLs of several libraries and packages that are accessible for the purpose of preprocessing text data.
4) NORMALIZATION
Normalization transforms text into a standard format to enhance analysis. This preprocessing step structures messy linguistic data by correcting variant spellings, coercing case and tense, resolving contractions, converting numbers to numerals, transliterating terms, aligning related words to a root form via stemming and lemmatization, and more.
5) LEMMATIZATION
Lemmatization maps words to their root form using lexical analysis. It relies on dictionaries and knowledge of morphology to connect related terms to the same base lemma. For example, the words "plays", "playing", and "played" all share the lemma "play". Lemmatizers can thus group together different inflections and variants by canonicalizing them to their common origin. Tools like NLTK's WordNet Lemmatizer leverage semantic databases to correctly resolve words to their underlying lemma based on context. Properly deploying lemmatization avoids incorrectly collapsing unrelated words while clustering together meaningful word associations, boosting performance on semantics-sensitive tasks.
D. FEATURE EXTRACTION TECHNIQUES
Feature extraction converts unstructured text into quantitative data amenable to modeling, by transforming documents into numerical vectors. Common methods calculate Term Frequency-Inverse Document Frequency (TF-IDF) weights, build Bag of Words (BoW) representations, count N-gram patterns, encode syntactic parse trees, apply topic modeling algorithms like Latent Dirichlet Allocation, or ingest word vectors (Word2Vec). Robust text analytics combines multiple feature extraction methods to fully capture linguistic complexity within interpretable data structures.
Spam is a major issue in current email communication, stemming from motives like advertising and fraud. To effectively detect spam, appropriate preprocessing techniques are needed, such as removing noise, taking out common stop words, stemming, lemmatization, and adjusting term frequencies. Mallampat et al. proposed a multi-modal system (MMA FM) that uses a combined method (IMTF-IDF+Skip-thoughts) and a CNN to extract features. This achieves superior 99.16% accuracy in identifying spam compared
to using Naive Bayes, when tested on the Enron, Dredze, and TREC 2007 datasets [161]. Saini et al. introduced a new method for predicting email spam that uses random forest for feature extraction. The features extracted by the random forest are then fed into a logistic regression method which predicts whether an email is spam [162]. Cheng et al. presented a new attack method that strategically modifies text data using insights from adversarial examples. It intentionally alters features that represent an email. They explored different feature extraction techniques using various NLP methods. Their study designs effective mechanisms to translate adversarial perturbations back into "magic words" in the text. This causes intentional misclassifications across multiple datasets and ML methods under white-box, gray-box and black-box attack scenarios [163]. Hassan et al. tested different feature extraction techniques along with two supervised ML classifiers on two public spam email datasets. They emphasized the importance of finding the optimal pairing of feature extraction and classification method. They also highlighted the benefits of testing on different datasets. SVM and NB showed impressive accuracy with TF-IDF, reaching over 99% and around 98% respectively [164]. Table 8 presents the previous research on spam detection using feature extraction techniques.
1) BAG OF WORDS (BOW)
BoW representation is a simple yet powerful approach for extracting numeric text features. This method counts the occurrences of words within a document while disregarding grammar and word order. Documents become vectors denoting the frequency of terms like "cat", "tree", and "slept". Bag-of-words thus efficiently quantifies unstructured text as matrices tallying vocabulary. Many extensions enrich this basic technique, like n-grams counting multi-word expressions and skip-grams sampling non-contiguous patterns. For instance, Barushka et al. detected deceptive hotel reviews on TripAdvisor by representing documents as n-gram frequencies and skip-gram embeddings to train machine learning classifiers. Bag-of-words style features unlock effective text analysis despite ignoring complex linguistic structure [170]. The flexibility of multiple vocabulary quantification strategies enables customized feature engineering for tasks ranging from spam detection to sentiment analysis across domains.
2) ONE HOT ENCODING
One-hot encoding transforms text into numeric features by assigning each unique word or token its own binary vector. Documents are represented as collections of these orthogonal one-hot vectors: sparse yet unambiguous codes with a single "1" marking the presence of each distinctive term. One-hot encoding matrices quantify textual data with vector lengths equal to the vocabulary size rather than the length of the original raw text. By indexing words into binary indicator columns, this method facilitates quantitative analysis while retaining the ability to map patterns back to original tokens. One-hot encoding forms the input for many machine learning algorithms, often outperforming methods lacking explicit word-level encoding. The simplicity of tallying vocabulary into orthogonal dimensions makes one-hot representation a widely useful feature extraction technique for textual data.
3) WORD EMBEDDING
One-hot encoding scales poorly to large vocabularies due to its explosion of sparse binary features. Embedding methods address this weakness through distributed representation. Word embeddings map language into compact dense vectors capturing similarities between related terms. For instance, vectors for "cat" and "kitten" cluster together, unlike the orthogonal one-hot encoding. This efficiency facilitates DL on extensive corpora. Embeddings also encode meaning: algebraic operations reveal relationships like "king is to queen as man is to woman". Created using neural networks, embeddings represent both syntax and semantics within a low-dimensional subspace. Methods learn contextual associations, quantifying elusive concepts like gender or formality. Versatile representations power cutting-edge applications from chatbots to search. Embeddings distill enormous dictionaries into meaningful, manipulable codes advancing the frontiers of text mining.
4) WORD2VEC
This method turns words into vectors and works like a two-layer network to handle text that is made up of words. Every word in the corpus has a corresponding vector in the embedding space. Word2vec uses either a continuous skip-gram or a continuous bag of words (CBOW) design. In the continuous skip-gram, the current word is used to predict the words surrounding it. In the CBOW method, on the other hand, the surrounding or neighboring words are used to predict the middle word. With a small amount of training data, the skip-gram method can represent even rare words or phrases well. However, the CBOW method is many times faster to train and is slightly more accurate for frequent words. Word2vec is attractive because it learns high-quality word embeddings with comparatively little time and memory. From a much larger body of writing, it is possible to learn bigger embeddings (with more dimensions).
5) N-GRAMS
Many Natural Language Processing (NLP) tasks use N-grams, which are contiguous sequences of n words or tokens in a text. Based on the value of "n", they are divided into different groups, such as Unigram, Bigram, and Trigram. Kanaris et al. used a set of 2,893 emails to extract n-gram features from text. In their study, they looked at performance measures like spam recall and precision. Combining SVM with n-grams, they were able to build a spam filtering method that had an accuracy score of more than 0.90 for finding spam [171]. Table 9 below shows several examples of N-grams.
6) TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
The BoW approach faces challenges with high-frequency terms dominating the data while lower-scoring domain-specific words may be eliminated or ignored. An improvement on bag-of-words is the TF-IDF technique. TF-IDF multiplies the number of times a term appears in a document (TF) by the inverse of how often that term shows up across all documents (inverse document frequency or IDF). These scores highlight unique and information-rich terms within a document. As demonstrated by Equations 5 and 6, TF represents the ratio of the term's count in the document to the overall count of all terms. On the other hand, IDF is the logarithm of the total number of documents divided by the number of documents that contain the term. The resulting TF-IDF scores better represent a term's significance.
$$\mathrm{Tf}(w) = \frac{\text{Number of times word } w \text{ appears in a document}}{\text{Total number of terms in the document}} \tag{5}$$
$$\mathrm{Idf}(w) = \log_e \frac{\text{Total number of documents}}{\text{Number of documents with term } w} \tag{6}$$
7) GLOVE WORD EMBEDDING
This is an unsupervised method that generates a vector to represent words or text. It aims to capture the semantic and contextual meaning of words. It is a count-based method that utilizes co-occurrence statistics of words in a corpus. Specifically, it trains on the non-zero entries in a word-context co-occurrence matrix. The key intuition behind GloVe word embedding is that ratios of word-word co-occurrence probabilities can encode meaning. Equation 7 demonstrates the computation of the co-occurrence probability for the texts in each word embedding.
$$V(t_x, t_y, t_z) = \frac{F_{xy}}{F_{yz}} \tag{7}$$
where,
• The co-occurrence probability for the texts tx and ty is Fxy.
• The co-occurrence probability for the texts ty and tz is Fyz.
• The regular texts or words that appear in a document are tx and ty and the investigated text is tz.
• When the above-mentioned ratio is 1, the investigated text is related to tx rather than ty.
V. IMPLICATIONS
The review covered a comprehensive analysis and integration of the present condition of email spam detection. A broad range of ML and DL approaches for email spam detection is covered, along with an analysis of how these approaches could be improved for greater efficiency. The review explored the intricate difficulties encountered in identifying and screening spam emails while recognizing the constraints of conventional techniques such as blocklists, real-time blackhole listing, and content-based approaches. The review analyzed and addressed current research deficiencies, shedding light on areas that require additional exploration. This emphasizes the continuous necessity for innovation and enhancement in spam detection techniques. In addition, the study suggested potential areas for future research, highlighting possible paths for further advancement and directing researchers towards addressing the observed deficiencies.
The review emphasized the importance of effective spam detection in order to safeguard users from the detrimental effects of spam, including time wastage, resource depletion, and potential data theft, given the widespread use of emails across many industries. The objective of the study is to offer a methodical and empirically supported comprehension of current research, assessing the efficacy of various ML and DL techniques. Through the synthesis and examination of data from many studies, it aims to provide an impartial assessment of the advantages and disadvantages of current methodologies. The thorough assessment of methods for identifying email spam has substantial ramifications for the domains of digital communication and cybersecurity. The study examined the application of various ML and DL techniques, with a focus on shifting from traditional
143650 VOLUME 12, 2024
E. H. Tusher et al.: Email Spam: Detection Methods, Challenges, and Open Research Problems
methodologies to more sophisticated ones. This change has the capacity to improve the precision of detection and the efficiency of computing. This technological advancement may lead to enhanced email systems that offer more robust defenses against harmful material and reduce the wasteful consumption of resources. The review comprised a comprehensive analysis and integration of the present condition of email spam detection.
TABLE 9. Illustration of n-grams.
VI. CHALLENGES OF EMAIL SPAM DETECTION
Spam detection systems struggle to properly evaluate features across textual, temporal, semantic, and statistical dimensions because the volume of diverse and complex data on the Internet keeps growing. Additionally, most methods are trained on balanced datasets which rarely match real-world conditions. Self-learning methods that can adapt without manual supervision remain an open area. Spam detection methods also face various adversarial attacks: poisoning attacks that pollute training data, evasion attacks that manipulate test samples to bypass filters, and privacy attacks attempting to steal sensitive training data. Deep fakes leveraging AI generation and modification techniques around images, video and text for disseminating misinformation further threaten credibility. Imbalanced datasets with far more legitimate emails than spam continue to bias method performance towards the majority (legitimate) class. Research on intelligent oversampling methods aims to improve minority class representation during training. The dynamic evolution of spam tactics also reduces generalization capabilities against new, previously unseen attacks. Ensuring method robustness through adversarial training is an active
research direction. Potential adversarial samples crafted specifically to fool deep nets pose reliability hurdles. Detecting adversarial patterns and training on adversarial datasets helps improve resilience.
The black-box nature of deep networks also hampers method interpretability and user trust. Advancing explainable AI to increase transparency in method behaviors and decisions thus remains important. The computational intensity of large-scale DL limits accessibility for organizations with fewer resources, though optimizations around method efficiency and hardware acceleration are progressively lowering barriers.
Generalizability across different email systems, user groups, and usage patterns is needed for wide real-world deployment. Multi-model learning and personalization are promising techniques under investigation. Adoption is made harder by problems with privacy, usability, and integrating content analysis across a wide range of old systems and infrastructure. Limited availability of labeled data for adequately training deep nets continues to be an industry-wide bottleneck, although data augmentation, transfer learning and semi-supervised techniques help extract more value from limited labels. Finally, meeting real-time latency demands at scale for live traffic with deep methods has throughput and optimization implications. Quantization, network pruning and efficient method distillation aim to improve inference speed.
VII. RESEARCH GAPS AND OPEN RESEARCH PROBLEMS
This section examines the areas where research is lacking and the problems that remain in the field of spam identification. Current detection approaches rely heavily on manually engineered datasets which rarely match the nuanced complexities of real-world spam. Future work should prioritize developing robust solutions. Furthermore, it is essential for future research to focus on providing researchers with standardized labelled datasets to train classifiers. Additionally, the accuracy and reliability of spam detection methods can be enhanced by incorporating other features into the dataset, such as the spammer's IP address and location. The following are further fields of future study and open problems that need to be solved in the field of spam detection:
Current spam detection approaches rely heavily on limited features from email headers, subject lines, and message bodies. To improve accuracy, more comprehensive and automated feature engineering is needed, moving beyond manual selection. While most evaluations focus on statistical performance metrics like precision and recall, incorporating time complexity analysis would provide crucial insight into real-world viability. Exploring advanced feature extraction methods using DL on various email components, beyond just message bodies, can reveal more nuanced signals for detection. Several system design aspects warrant focus to enhance practical applicability. These include improving fault tolerance for reliability, ensuring quick response times under heavy loads, and implementing self-learning capabilities without manual supervision for robust adaptability to evolving spam tactics. Dynamic updating of feature representations using deep neural networks as new spam data emerges can bolster detection relevance over time. Ensuring strong security mechanisms against exploratory attacks or poisoning of the pipeline data or model itself is imperative for trustworthy operation. Reducing false positive rates continues to pose challenges to usability. Expanding beyond textual spam to effectively flag image-based messages and addressing real-time threats rather than relying on batch processing, given the low latency constraints of email systems, will significantly expand practical applicability. Several promising
robust methods using authentic spam samples only. Though research directions emerge. The lack of labeled multilingual
ML, fuzzy logic and DL methods are individually leveraged corpora presents an opportunity for developing more globally
today, hybrid systems that synergistically combine multiple effective solutions. Semi-supervised learning methods could
techniques could potentially improve accuracy and efficiency help leverage vast amounts of unlabeled data. Identifying
further. Enhanced feature engineering leveraging deep neural coordinated spammer networks and behaviors could lead
networks’ self-learned representations via representation to more proactive defense strategies. Rather than manual
learning presents promising opportunities to automatically labeling or curation that can introduce unconscious bias,
capture differentiating attributes. Clustering algorithms that discovering ground truth spam characteristics automatically
enable dynamic spam database updates based on continuous through federated learning over decentralized data holds
user feedback requires exploration for tighter spam relevancy. potential for more robust and unbiased models. Exploring
In addition to DL based blockchain methods and concepts the potential of large language models in transforming
can potentially be employed for email spam detection in the spam detection is justified due to their ability to catch
future. Advancing the art of manual spam dataset annotation intricate patterns and contextual nuances that conventional
by collaborating with linguistics and psychology experts methods may overlook. Studying the potential of fine-
can potentially better encapsulate semantic and cognitive tuning pre-trained models such as BERT or GPT for spam
subtleties within messages for training more discerning mod- classification tasks could lead to the development of more
els. Hardware optimizations leveraging graphics cards and precise and flexible spam detection systems. Moreover, the
field-programmable gate arrays provide additional vectors to utilization of these expansive models could potentially tackle
improve real-time throughput and latency when classifying existing obstacles in spam detection, including managing
high-velocity email streams. Centrally, the availability of evolving spam strategies and minimizing false positives,
multifaceted, standardized labeled corpora spanning diverse, hence facilitating the development of more resilient and
real-world spam types remains lacking, constraining more effective spam detection solutions.
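
One of the open problems noted in this section is that evaluations tend to report precision and recall without any measure of time cost, and that false positive rates remain a usability concern. The following is a minimal sketch, not taken from any of the reviewed studies, of how these quantities might be reported together. The keyword list, the `toy_classifier` function, and the six sample messages are hypothetical and serve only as placeholders for a real filter and a real corpus.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword list for a toy rule-based filter (illustration only).
SPAM_KEYWORDS = {"winner", "free", "prize", "urgent"}


def toy_classifier(message: str) -> int:
    """Return 1 (spam) if any keyword appears in the message, else 0 (ham)."""
    words = set(message.lower().split())
    return int(bool(words & SPAM_KEYWORDS))


def evaluate(classify: Callable[[str], int],
             samples: List[Tuple[str, int]]) -> Dict[str, float]:
    """Compute precision, recall, false positive rate and mean latency."""
    tp = fp = tn = fn = 0
    start = time.perf_counter()
    for text, label in samples:
        pred = classify(text)
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    elapsed = time.perf_counter() - start
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "mean_latency_s": elapsed / len(samples) if samples else 0.0,
    }


# Tiny hand-made sample set (label 1 = spam, 0 = ham); not real data.
SAMPLES = [
    ("You are a winner claim your prize now", 1),
    ("Urgent account verification required", 1),
    ("Limited offer on cheap watches", 1),
    ("Meeting moved to 3 pm tomorrow", 0),
    ("Feel free to call me after lunch", 0),
    ("Quarterly report attached for review", 0),
]

if __name__ == "__main__":
    for name, value in evaluate(toy_classifier, SAMPLES).items():
        print(f"{name}: {value:.6f}")
```

The latency figure from a single pass over six messages is only indicative; a meaningful comparison would presumably repeat the measurement over realistic message volumes and include the feature extraction step in the timed region.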
[32] N. Pérez-Díaz, D. Ruano-Ordás, F. Fdez-Riverola, and J. R. Méndez, [54] X. Zheng, X. Zhang, Y. Yu, T. Kechadi, and C. Rong, ‘‘ELM-based
‘‘SDAI: An integral evaluation methodology for content-based spam spammer detection in social networks,’’ J. Supercomput., vol. 72, no. 8,
filtering models,’’ Expert Syst. Appl., vol. 39, no. 16, pp. 12487–12500, pp. 2991–3005, Aug. 2016.
Nov. 2012. [55] S. Rezvani, X. Wang, and F. Pourpanah, ‘‘Intuitionistic fuzzy twin
[33] N. Saidani, K. Adi, and M. S. Allili, ‘‘A semantic-based classification support vector machines,’’ IEEE Trans. Fuzzy Syst., vol. 27, no. 11,
approach for an enhanced spam detection,’’ Comput. Secur., vol. 94, pp. 2140–2151, Nov. 2019.
Jul. 2020, Art. no. 101716. [56] K. Juneja, ‘‘Two-phase fuzzy feature-filter based hybrid model for spam
[34] Z. Zhang, E. Damiani, H. A. Hamadi, C. Y. Yeun, and F. Taher, ‘‘Explain- classification,’’ J. King Saud Univ. Comput. Inf. Sci., vol. 34, no. 10,
able artificial intelligence to detect image spam using convolutional pp. 10339–10355, Nov. 2022.
neural network,’’ in Proc. Int. Conf. Cyber Resilience (ICCR), Oct. 2022, [57] I. Atacak, O. Çıtlak, and I. A. Dogru, ‘‘Application of interval type-2
pp. 1–5. fuzzy logic and type-1 fuzzy logic-based approaches to social networks
[35] A. Hosseinalipour and R. Ghanbarzadeh, ‘‘A novel approach for spam for spam detection with combined feature capabilities,’’ PeerJ Comput.
detection using horse herd optimization algorithm,’’ Neural Comput. Sci., vol. 9, p. e1316, Apr. 2023.
[58] U. Srinivasarao and A. Sharaff, ‘‘SMS sentiment classification using
Appl., vol. 34, no. 15, pp. 13091–13105, Aug. 2022.
an evolutionary optimization based fuzzy recurrent neural network,’’
[36] M. Novo-Lourés, D. Ruano-Ordás, R. Pavón, R. Laza, S. Gómez-Meire,
Multimedia Tools Appl., vol. 82, no. 27, pp. 42207–42238, Nov. 2023.
and J. R. Méndez, ‘‘Enhancing representation in the context of multiple- [59] A. W. Wijayanto and Takdir, ‘‘Fighting cyber crime in email spamming:
channel spam filtering,’’ Inf. Process. Manage., vol. 59, no. 2, Mar. 2022, An evaluation of fuzzy clustering approach to classify spam messages,’’ in
Art. no. 102812. Proc. Int. Conf. Inf. Technol. Syst. Innov. (ICITSI), Nov. 2014, pp. 19–24.
[37] Z. F. Sokhangoee and A. Rezapour, ‘‘A novel approach for spam detection [60] L. Bansal and N. Tiwari, ‘‘Feature selection based classification of spams
based on association rule mining and genetic algorithm,’’ Comput. Electr. using fuzzy support vector machine,’’ in Proc. Int. Conf. Smart Electron.
Eng., vol. 97, Jan. 2022, Art. no. 107655. Commun. (ICOSEC), Sep. 2020, pp. 258–263.
[38] A. R. Yeruva, D. Kamboj, P. Shankar, U. S. Aswal, A. K. Rao, and [61] S. Wang, X. Zhang, Y. Cheng, F. Jiang, W. Yu, and J. Peng, ‘‘A fast
C. S. Somu, ‘‘E-mail spam detection using machine learning—KNN,’’ content-based spam filtering algorithm with fuzzy-SVM and k-means,’’
in Proc. 5th Int. Conf. Contemp. Comput. Informat. (IC3I), Dec. 2022, in Proc. IEEE Int. Conf. Big Data Smart Comput. (BigComp), Jul. 2018,
pp. 1024–1028. pp. 301–307.
[39] M. A. Shaaban, Y. F. Hassan, and S. K. Guirguis, ‘‘Deep convolutional [62] S. A. Khan, K. Iqbal, N. Mohammad, R. Akbar, S. S. A. Ali, and
forest: A dynamic deep ensemble approach for spam detection in text,’’ A. A. Siddiqui, ‘‘A novel fuzzy-logic-based multi-criteria metric for
Complex Intell. Syst., vol. 8, no. 6, pp. 4897–4909, Dec. 2022. performance evaluation of spam email detection algorithms,’’ Appl. Sci.,
[40] M. F. Faisal, M. N. U. Saqlain, M. A. S. Bhuiyan, M. H. Miraz, and vol. 12, no. 14, p. 7043, Jul. 2022.
M. J. A. Patwary, ‘‘Credit approval system using machine learning: [63] X. Wang, Y. Zhao, and F. Pourpanah, ‘‘Recent advances in deep learning,’’
Challenges and future directions,’’ in Proc. Int. Conf. Comput., Netw., Int. J. Mach. Learn. Cybern., vol. 11, pp. 747–750, Jan. 2020.
Telecommun. Eng. Sci. Appl. (CoNTESA), 2021, pp. 76–82. [64] A. Kamilaris and F. X. Prenafeta-Boldu, ‘‘Deep learning in agriculture:
[41] F. Sebastiani, ‘‘Machine learning in automated text categorization,’’ ACM A survey,’’ Comput. Electron. Agricult., vol. 147, pp. 70–90, Apr. 2018.
Comput. Surveys, vol. 34, no. 1, pp. 1–47, Mar. 2002. [65] L. Deng and D. Yu, ‘‘Deep learning: Methods and applications,’’ Found.
Trends Signal Process., vol. 7, nos. 3–4, pp. 197–387, 2014.
[42] M. RAZA, N. D. Jayasinghe, and M. M. A. Muslam, ‘‘A comprehensive
[66] Y. Guo, Y. Liu, A. Oerlemans, S. Lao, S. Wu, and M. S. Lew, ‘‘Deep
review on email spam classification using machine learning algorithms,’’
learning for visual understanding: A review,’’ Neurocomputing, vol. 187,
in Proc. Int. Conf. Inf. Netw. (ICOIN), Jan. 2021, pp. 327–332.
pp. 27–48, Apr. 2016.
[43] N. Govil, K. Agarwal, A. Bansal, and A. Varshney, ‘‘A machine learning [67] A. Baccouche, S. Ahmed, D. Sierra-Sosa, and A. Elmaghraby, ‘‘Mali-
based spam detection mechanism,’’ in Proc. 4th Int. Conf. Comput. cious text identification: Deep learning from public comments and
Methodologies Commun. (ICCMC), Mar. 2020, pp. 954–957. emails,’’ Information, vol. 11, no. 6, p. 312, Jun. 2020.
[44] C. Bansal and B. Sidhu, ‘‘Machine learning based hybrid approach for [68] M. Alauthman, ‘‘Botnet spam e-Mail detection using deep recurrent
email spam detection,’’ in Proc. 9th Int. Conf. Rel., INFOCOM Technol. neural network,’’ Int. J. Emerg. Trends Eng. Res., vol. 8, no. 5,
Optim., Sep. 2021, pp. 1–4. pp. 1979–1986, May 2020.
[45] P. Thakur, K. Joshi, P. Thakral, and S. Jain, ‘‘Detection of email spam [69] I. AbdulNabi and Q. Yaseen, ‘‘Spam email detection using deep learning
using machine learning algorithms: A comparative study,’’ in Proc. 8th techniques,’’ Proc. Comput. Sci., vol. 184, no. 2, pp. 853–858, 2021.
Int. Conf. Signal Process. Commun. (ICSC), Dec. 2022, pp. 349–352. [70] A. A. Abdullahi and M. Kaya, ‘‘A deep learning based method to detect
[46] R. P. Cota and D. Zinca, ‘‘Comparative results of spam email detection email and SMS spams,’’ in Proc. Int. Conf. Decis. Aid Sci. Appl. (DASA),
using machine learning algorithms,’’ in Proc. 14th Int. Conf. Commun. Dec. 2021, pp. 430–435.
(COMM), Jun. 2022, pp. 1–5. [71] K. F. Rafat, Q. Xin, A. R. Javed, Z. Jalil, and R. Z. Ahmad, ‘‘Evading
[47] B. K. Dedeturk and B. Akay, ‘‘Spam filtering using a logistic regression obscure communication from spam emails,’’ Math. Biosciences Eng.,
model trained by an artificial bee colony algorithm,’’ Appl. Soft Comput., vol. 19, no. 2, pp. 1926–1943, 2021.
vol. 91, Jun. 2020, Art. no. 106229. [72] T. Wen, Y. Xiao, A. Wang, and H. Wang, ‘‘A novel hybrid feature
fusion model for detecting phishing scam on Ethereum using deep neural
[48] Y. Kontsewaya, E. Antonov, and A. Artamonov, ‘‘Evaluating the
network,’’ Expert Syst. Appl., vol. 211, Jan. 2023, Art. no. 118463.
effectiveness of machine learning methods for spam detection,’’ Proc.
[73] Z. Alom, B. Carminati, and E. Ferrari, ‘‘A deep learning model for
Comput. Sci., vol. 190, pp. 479–486, Jun. 2021.
Twitter spam detection,’’ Online Social Netw. Media, vol. 18, Jul. 2020,
[49] V. Sunjaya, S. Senjaya, J. Utama, H. Lucky, and D. Suhartono, ‘‘Content Art. no. 100079.
based spam classifying algorithms in email,’’ 3rd Int. Conf. Artif. Intell. [74] A. Makkar and N. Kumar, ‘‘An efficient deep learning-based scheme for
Data Sci., vol. 94, Jul. 2020, Art. no. 101716. Web spam detection in IoT environment,’’ Future Gener. Comput. Syst.,
[50] T. Georgieva-Trifonova, ‘‘Research on filtering feature selection methods vol. 108, pp. 467–487, Jul. 2020.
for e-mail spam detection by applying K-NN classifier,’’ in Proc. [75] S. Smadi, N. Aslam, and L. Zhang, ‘‘Detection of online phishing
Int. Congr. Hum.-Comput. Interact., Optim. Robotic Appl. (HORA), email using dynamic evolving neural network based on reinforcement
Jun. 2022, pp. 1–4. learning,’’ Decis. Support Syst., vol. 107, pp. 88–102, Mar. 2018.
[51] L. N. Vejendla, B. Bysani, A. Mundru, M. Setty, and V. J. Kunta, ‘‘Score [76] S. Isik, Z. Kurt, Y. Anagun, and K. Ozkan, ‘‘Spam e-mail classification
based support vector machine for spam mail detection,’’ in Proc. 7th Int. recurrent neural networks for spam e-mail classification on an agglutina-
Conf. Trends Electron. Informat., 2023, pp. 915–920. tive language,’’ Int. J. Intell. Syst. Appl. Eng., vol. 8, no. 4, pp. 221–227,
[52] H. Faris, F. A. Alqatawna, M. Al-Zoubi, and I. Aljarah. Dec. 2020.
(2017). Improving Email Spam Detection Using Content Based [77] H. Yang, Q. Liu, S. Zhou, and Y. Luo, ‘‘A spam filtering method based
Feature Engineering Approach. [Online]. Available: http://cran.r- on multi-modal fusion,’’ Appl. Sci., vol. 9, no. 6, p. 1152, Mar. 2019.
project.org/web/packages/Boruta/index.html [78] S. Gadde, A. Lakshmanarao, and S. Satyanarayana, ‘‘SMS spam detection
[53] S. O. Olatunji, ‘‘Extreme learning machines and support vector machines using machine learning and deep learning techniques,’’ in Proc. 7th
models for email spam detection,’’ in Proc. IEEE 30th Can. Conf. Electr. Int. Conf. Adv. Comput. Commun. Syst. (ICACCS), vol. 1, Mar. 2021,
Comput. Eng. (CCECE), Apr. 2017, pp. 1–6. pp. 358–362.
[79] F. Wei and T. Nguyen, ‘‘A lightweight deep neural model for SMS [101] A. S. Mashaleh, N. F. Binti Ibrahim, M. A. Al-Betar, H. M. J. Mustafa, and
spam detection,’’ in Proc. Int. Symp. Netw., Comput. Commun. (ISNCC), Q. M. Yaseen, ‘‘Detecting spam email with machine learning optimized
Oct. 2020, pp. 1–6. with Harris hawks optimizer (HHO) algorithm,’’ Proc. Comput. Sci.,
[80] V. S. Vinitha, D. K. Renuka, and L. A. Kumar, ‘‘Long short-term memory vol. 201, pp. 659–664, Aug. 2022.
networks for email spam classification,’’ in Proc. Int. Conf. Intell. Syst. [102] M. Belgiu and L. Dragus, ‘‘Random forest in remote sensing: A review of
Commun., IoT Secur. (ICISCoIS), Feb. 2023, pp. 176–180. applications and future directions,’’ ISPRS J. Photogramm. Remote Sens.,
[81] S. Bagui, D. Nandi, S. Bagui, and R. J. White, ‘‘Machine learning and vol. 114, pp. 24–31, Apr. 2016.
deep learning for phishing email classification using one-hot encoding,’’ [103] J. L. Speiser, M. E. Miller, J. Tooze, and E. Ip, ‘‘A comparison of random
J. Comput. Sci., vol. 17, no. 7, pp. 610–623, Jul. 2021. forest variable selection methods for classification prediction modeling,’’
[82] D. A. Otchere, T. O. A. Ganat, R. Gholami, and S. Ridha, ‘‘Application Expert Syst. Appl., vol. 134, pp. 93–101, Nov. 2019.
of supervised machine learning paradigms in the prediction of petroleum [104] N. R. Kothapally and V. Kakulapati, ‘‘Classification of spam messages
reservoir properties: Comparative analysis of ANN and SVM models,’’ using random forest algorithm,’’ J. Xidian University, vol. 15, no. 8,
J. Petroleum Sci. Eng., vol. 200, May 2021, Art. no. 108182. pp. 495–505, 2021.
[83] M. Mohammadi, T. A. Rashid, S. H. T. Karim, A. H. M. Aldalwie, [105] A. Shrivastava and R. Dubey, ‘‘Classification of spam mail using
Q. T. Tho, M. Bidaki, A. M. Rahmani, and M. Hosseinzadeh, ‘‘A different machine learning algorithms,’’ in Proc. Int. Conf. Adv. Comput.
comprehensive survey and taxonomy of the SVM-based intrusion Telecommun. (ICACAT), Dec. 2018, pp. 1–10.
detection systems,’’ J. Netw. Comput. Appl., vol. 178, Mar. 2021,
[106] K. L. Goh and A. K. Singh, ‘‘Comprehensive literature review on machine
Art. no. 102983.
learning structures for web spam classification,’’ Proc. Comput. Sci.,
[84] W. Wang, X. Du, D. Shan, and N. Wang, ‘‘A hybrid cloud intrusion
vol. 70, pp. 434–441, Jun. 2015.
detection method based on SDAE and SVM,’’ in Proc. 12th Int. Conf.
Intell. Comput. Technol. Autom. (ICICTA), Oct. 2019, pp. 271–274. [107] F. Ye, G. Chen, Q. Liu, L. Zhang, Q. Qi, B. Hu, and X. Fan, ‘‘A
[85] P. Navaney, G. Dubey, and A. Rana, ‘‘SMS spam filtering using spam classification method based on naive Bayes,’’ in Proc. IEEE 6th
supervised machine learning algorithms,’’ in Proc. 8th Int. Conf. Cloud Inf. Technol. Mechatronics Eng. Conf. (ITOEC), vol. 6, Mar. 2022,
Comput., Data Sci. Eng., Jan. 2018, pp. 43–48. pp. 1856–1861.
[86] D. Sculley and G. M. Wachman, ‘‘Relaxed online SVMs for spam [108] T. S. Guzella and W. M. Caminhas, ‘‘A review of machine learning
filtering,’’ in Proc. 30th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. approaches to spam filtering,’’ Expert Syst. Appl., vol. 36, no. 7,
Retr., Jul. 2007, pp. 415–422. pp. 10206–10222, Sep. 2009.
[87] P. Haider, U. Brefeld, and T. Scheffer, ‘‘Supervised clustering of [109] M. Sahami, S. Dumais, D. Heckerman, E. Horvitz, and G. Building,
streaming data for email batch detection,’’ in Proc. 24th Int. Conf. Mach. ‘‘A Bayesian approach to filtering junk e-mail,’’ in Proc. Learning Text
Learn., Jun. 2007, pp. 345–352. Categorization, Workshop, 1998, pp. 98–105.
[88] H. Zhou, J. Zhang, Y. Zhou, X. Guo, and Y. Ma, ‘‘A feature selection [110] J. Kim, K. Chung, and K. Choi, ‘‘Spam filtering with dynamically updated
algorithm of decision tree based on feature weight,’’ Expert Syst. Appl., URL statistics,’’ IEEE Secur. Privacy Mag., vol. 5, no. 4, pp. 33–39,
vol. 164, Feb. 2021, Art. no. 113842. Jul. 2007.
[89] M. M. Ghiasi, S. Zendehboudi, and A. A. Mohsenipour, ‘‘Decision tree- [111] X. Deng, Y. Li, J. Weng, and J. Zhang, ‘‘Feature selection for text
based diagnosis of coronary artery disease: CART model,’’ Comput. classification: A review,’’ Multimedia Tools Appl., vol. 78, no. 3,
Methods Programs Biomed., vol. 192, Aug. 2020, Art. no. 105400. pp. 3797–3816, Feb. 2019.
[90] S. Rizvi, B. Rienties, and S. A. Khoja, ‘‘The role of demographics in [112] A. Çıltık and T. Güngör, ‘‘Time-efficient spam e-mail filtering using
online learning; a decision tree based approach,’’ Comput. Educ., vol. 137, n-gram models,’’ Pattern Recognit. Lett., vol. 29, no. 1, pp. 19–33,
pp. 32–47, Aug. 2019. Jan. 2008.
[91] A. Wijaya and A. Bisri, ‘‘Hybrid decision tree and logistic regression [113] T. Toma, S. Hassan, and M. Arifuzzaman, ‘‘An analysis of supervised
classifier for email spam detection,’’ in Proc. 8th Int. Conf. Inf. Technol. machine learning algorithms for spam email detection,’’ in Proc. Int.
Electr. Eng. (ICITEE), Oct. 2016, pp. 1–4. Conf. Autom., Control Mechatronics, Jul. 2021, pp. 1–5.
[92] A. F. Zulfikar, D. Supriyadi, Y. Heryadi, and Lukas, ‘‘Comparison [114] C. Li, G. Zhan, and Z. Li, ‘‘News text classification based on improved
performance of decision tree classification model for spam filtering with Bi-LSTM-CNN,’’ in Proc. 9th Int. Conf. Inf. Technol. Med. Educ. (ITME),
or without the recursive feature elimination (RFE) approach,’’ in Proc. Oct. 2018, pp. 890–893.
4th Int. Conf. Inf. Technol., Inf. Syst. Electr. Eng. (ICITISEE), Nov. 2019, [115] H. Moayedi, M. Mehrabi, M. Mosallanezhad, A. S. A. Rashid,
pp. 311–316. and B. Pradhan, ‘‘Modification of landslide susceptibility mapping
[93] Y. Zhang, S. Wang, P. Phillips, and G. Ji, ‘‘Binary PSO with mutation using optimized PSO-ANN technique,’’ Eng. Comput., vol. 35, no. 3,
operator for feature selection using decision tree applied to spam pp. 967–984, Jul. 2019.
detection,’’ Knowl.-Based Syst., vol. 64, pp. 22–31, Jul. 2014.
[116] A. Kurani, P. Doshi, A. Vakharia, and M. Shah, ‘‘A comprehensive
[94] H. Kaur and A. Sharma, ‘‘Improved email spam classification method
comparative study of artificial neural network (ANN) and support vector
using integrated particle swarm optimization and decision tree,’’ in
machines (SVM) on stock forecasting,’’ Ann. Data Sci., vol. 10, no. 1,
Proc. 2nd Int. Conf. Next Gener. Comput. Technol. (NGCT), Oct. 2016,
pp. 183–208, Feb. 2023.
pp. 516–521.
[95] S. Uddin, I. Haque, H. Lu, M. A. Moni, and E. Gide, ‘‘Comparative [117] B. Ingre and A. Yadav, ‘‘Performance analysis of NSL-KDD dataset using
performance analysis of K-nearest neighbour (KNN) algorithm and its ANN,’’ in Proc. Int. Conf. Signal Process. Commun. Eng. Syst., Jan. 2015,
different variants for disease prediction,’’ Sci. Rep., vol. 12, no. 1, pp. 92–96.
p. 10358, Apr. 2022. [118] C. Zhan, F. Zhang, and M. Zheng, ‘‘Design and implementation of an
[96] H. Liu, J. An, W. Xu, X. Jia, L. Gan, and C. Yuen, ‘‘K-means optimization system of span filter rule based on neural network,’’ in Proc.
based constellation optimization for index modulated reconfigurable Int. Conf. Commun., Circuits Syst., vol. 3, Jul. 2007, pp. 882–886.
intelligent surfaces,’’ IEEE Commun. Lett., vol. 27, no. 8, pp. 2152–2156, [119] R. Talaei Pashiri, Y. Rostami, and M. Mahrami, ‘‘Spam detection
Jun. 2023. through feature selection using artificial neural network and sine–cosine
[97] S. A. Orazbayev, R. E. Zhumadylov, A. T. Zhunisbekov, T. S. Ramazanov, algorithm,’’ Math. Sci., vol. 14, no. 3, pp. 193–199, Sep. 2020.
and M. T. Gabdullin, ‘‘Obtaining of copper nanoparticles in combined [120] S. A. A. Ghaleb, M. Mohamad, E. F. H. S. Abdullah, and
RF+DC discharge plasma,’’ Mater. Today, Proc., vol. 20, pp. 329–334, W. A. H. M. Ghanem, ‘‘Spam classification based on supervised
Jun. 2020. learning using grasshopper optimization algorithm and artificial neural
[98] D. Ö. Sahin and S. Demirci, ‘‘Spam filtering with KNN: Investigation of network,’’ in Proc. 2nd Int. Conf., 2021, pp. 420–434.
the effect of k value on classification performance,’’ in Proc. 28th Signal [121] A. Arram, H. Mousa, and A. Zainal, ‘‘Spam detection using hybrid
Process. Commun. Appl. Conf. (SIU), Oct. 2020, pp. 1–4. artificial neural network and genetic algorithm,’’ in Proc. 13th Int. Conf.
[99] Y. K. Zamil, S. A. Ali, and M. A. Naser, ‘‘Spam image email filtering Intellient Syst. Design Appl., Dec. 2013, pp. 336–340.
using K-NN and SVM,’’ Int. J. Electr. Comput. Eng. (IJECE), vol. 9, no. 1, [122] J. Gu, ‘‘Recent advances in convolutional neural networks,’’ Pattern
p. 245, Feb. 2019. Recognit., vol. 77, pp. 354–377, May 2018.
[100] G. Hnini, J. Riffi, M. A. Mahraz, A. Yahyaouy, and H. Tairi, ‘‘Spam [123] Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou, ‘‘A survey of convolutional
filtering system based on nearest neighbor algorithms,’’ in Proc. Int. Conf. neural networks: Analysis, applications, and prospects,’’ IEEE Trans.
Artif. Intell. Ind. Appl., 2021, pp. 36–46. Neural Netw. Learn. Syst., vol. 33, no. 12, pp. 6999–7019, Dec. 2022.
[124] V. Gupta, A. Mehta, A. Goel, U. Dixit, and A. C. Pandey, ‘‘Spam detection [146] A. L. Rosewelt, N. D. Raju, and S. Ganapathy, ‘‘An effective spam
using ensemble learning,’’ in Harmony Search and Nature Inspired message detection model using feature engineering and bi-LSTM,’’
Optimization Algorithms: Theory and Applications. Cham, Switzerland: in Proc. Int. Conf. Adv. Comput., Commun. Appl. Informat. (ACCAI),
Springer, 2019, pp. 661–668. Jan. 2022, pp. 1–6.
[125] M. Gupta, A. Bakliwal, S. Agarwal, and P. Mehndiratta, ‘‘A comparative [147] Y. Gao, M. Yang, X. Zhao, B. Pardo, Y. Wu, T. N. Pappas, and
study of spam SMS detection using machine learning classifiers,’’ in Proc. A. Choudhary, ‘‘Image spam hunter,’’ in Proc. IEEE Int. Conf. Acoust.,
11th Int. Conf. Contemp. Comput., Aug. 2018, pp. 1–7. Speech Signal Process., vol. 2, Mar. 2008, pp. 1765–1768.
[126] M. Popovac, M. Karanovic, S. Sladojevic, M. Arsenovic, and A. Anderla, [148] M. Dredze, R. Gevaryahu, and A. Elias-Bachrach. (2007). Learning Fast
‘‘Convolutional neural network based SMS spam detection,’’ in Proc. Classifiers for Image Spam. [Online]. Available: http://fuzzyocr.own-
26th Telecommun. Forum (TELFOR), Nov. 2018, pp. 1–4. hero.net/
[149] Z. Wang, W. Josephson, Q. Lv, M. Charikar, and K. Li, ‘‘Filtering image
[127] T. Sharmin, F. Di Troia, K. Potika, and M. Stamp, ‘‘Convolutional neural
spam with near-duplicate detection,’’ in Proc. CEAS, 2007, pp. 1–10.
networks for image spam detection,’’ Inf. Secur. J. A Global Perspective,
[150] D. Debarr and H. Wechsler. (2007). Spam Detection Using Clus-
vol. 29, no. 3, pp. 103–117, May 2020.
tering, Random Forests, and Active Learning. [Online]. Available:
[128] A. Farzad, H. Mashayekhi, and H. Hassanpour, ‘‘A comparative
http://trec.nist.gov/pubs/trec16/papers/SPAM.OVERVIEW1
performance analysis of different activation functions in LSTM networks [151] I. Androutsopoulos, G. Paliouras, V. Karkaletsis, G. Sakkis,
for classification,’’ Neural Comput. Appl., vol. 31, no. 7, pp. 2507–2521, C. D. Spyropoulos, and P. Stamatopoulos. (2006). Learning to Filter
Jul. 2019. Spam e-mail: A Comparison of a Naive Bayesian and a Memory-based
[129] A. Sherstinsky, ‘‘Fundamentals of recurrent neural network (RNN) Approach 1. [Online]. Available: http://www.cauce.org
and long short-term memory (LSTM) network,’’ Phys. D, Nonlinear [152] I. Koprinska, J. Poon, J. Clark, and J. Chan, ‘‘Learning to classify e-mail,’’
Phenomena, vol. 404, Mar. 2020, Art. no. 132306. Inf. Sci., vol. 177, no. 10, pp. 2167–2187, May 2007.
[130] S. Muzaffar and A. Afshari, ‘‘Short-term load forecasts using LSTM [153] B. Biggio, I. Corona, G. Fumera, G. Giacinto, and F. Roli, ‘‘Bagging
networks,’’ Energy Proc., vol. 158, pp. 2922–2927, Feb. 2019. classifiers for fighting poisoning attacks in adversarial classification
[131] B. Lindemann, B. Maschler, N. Sahlab, and M. Weyrich, ‘‘A survey tasks,’’ in Proc. 10th Int. Workshop, 2011, pp. 350–359.
on anomaly detection for technical systems using LSTM networks,’’ [154] I. Androutsopoulos, J. Koutsias, K. V. Chandrinos, G. Paliouras, and
Comput. Ind., vol. 131, Oct. 2021, Art. no. 103498. C. D. Spyropoulos. (2000). An Evaluation of Naive Bayesian Anti-Spam
[132] G. Jain, M. Sharma, and B. Agarwal, ‘‘Optimizing semantic lstm for spam Filtering. [Online]. Available: http://www.caucee.org
detection,’’ Int. J. Inf. Technol., vol. 11, pp. 239–250, Jun. 2019. [155] G. V. Cormack and T. R. Lynam, ‘‘Online supervised spam filter
[133] E. E. Eryilmaz, D. Ö. Sahin, and E. Kiliç, ‘‘Filtering Turkish spam using evaluation,’’ ACM Trans. Inf. Syst., vol. 25, no. 3, p. 11, Jul. 2007.
LSTM from deep learning techniques,’’ in Proc. 8th Int. Symp. Digit. [156] L. Zhang, J. Zhu, and T. Yao, ‘‘An evaluation of statistical spam filtering
Forensics Secur. (ISDFS), Jun. 2020, pp. 1–6. techniques,’’ ACM Trans. Asian Lang. Inf. Process., vol. 3, no. 4,
[134] S. Thanarattananakin, S. Bulao, B. Visitsilp, and M. Maliyaem, ‘‘Spam pp. 243–269, Dec. 2004.
detection using word embedding-based LSTM,’’ in Proc. Joint Int. Conf. [157] J. R. Mendez, F. Fdez-Riverola, F. Dãsaz, E. L. Iglesias, and J. M. Cor-
Digit. Arts, Media Technol. ECTI Sect. Conf. Electr., Electron., Comput. chado, ‘‘A comparative performance study of feature selection methods
Telecommun. Eng. (ECTI DAMT NCON), Jan. 2022, pp. 227–231. for the anti-spam filtering domain,’’ in Proc. 6th Ind. Conf. Data Mining,
2006, pp. 106–120.
[135] O. Yildirim, U. B. Baloglu, R.-S. Tan, E. J. Ciaccio, and U. R. Acharya,
[158] A. Attar, R. M. Rad, and R. E. Atani, ‘‘A survey of image spamming
‘‘A new approach for arrhythmia classification using deep coded features
and filtering techniques,’’ Artif. Intell. Rev., vol. 40, no. 1, pp. 71–105,
and LSTM networks,’’ Comput. Methods Programs Biomed., vol. 176,
Jun. 2013.
pp. 121–133, Jul. 2019. [159] G. Sakkis, I. Androutsopoulos, G. Paliouras, V. Karkaletsis, C. D. Spy-
[136] K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio, ‘‘On the ropoulos, and P. Stamatopoulos. (2001). Stacking Classifiers for Anti-
properties of neural machine translation: Encoder–decoder approaches,’’ spam Filtering of e-mail. [Online]. Available: www.junkemail.org
2014, arXiv:1409.1259. [160] T. A. Almeida and A. Yamakami, ‘‘Content-based spam filtering,’’ in
[137] P. T. Yamak, L. Yujian, and P. K. Gadosey, ‘‘A comparison between Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2010, pp. 1–7.
ARIMA, LSTM, and GRU for time series forecasting,’’ in Proc. [161] D. Mallampati and N. P. Hegde, ‘‘Feature extraction and classification of
2nd Int. Conf. Algorithms, Comput. Artif. Intell., Dec. 2019, email spam detection using IMTF-IDF+skip-thought vectors,’’ Ingenierie
pp. 49–55. Des. Syst. Inf., vol. 27, no. 6, pp. 941–948, Dec. 2022.
[138] S. Gao, Y. Huang, S. Zhang, J. Han, G. Wang, M. Zhang, and Q. Lin, [162] H. Saini and K. S. Saini, ‘‘Hybrid model for email spam prediction
‘‘Short-term runoff prediction with GRU and LSTM networks without using random forest for feature extraction,’’ in Proc. Int. Conf. Artif.
requiring time step optimization during sample generation,’’ J. Hydrol., Intell. Appl. (ICAIA) Alliance Technol. Conf. (ATCON-1), Apr. 2023,
vol. 589, Oct. 2020, Art. no. 125188. pp. 1–4.
[139] K. A. Al-Thelaya, T. S. Al-Nethary, and E. Y. Ramadan, ‘‘Social networks [163] Q. Cheng, A. Xu, X. Li, and L. Ding, ‘‘Adversarial email gener-
spam detection using graph-based features analysis and sequence of ation against spam detection models through feature perturbation,’’
interactions between users,’’ in Proc. IEEE Int. Conf. Informat., IoT, in Proc. IEEE Int. Conf. Assured Autonomy (ICAA), Mar. 2022,
Enabling Technol. (ICIoT), Feb. 2020, pp. 206–211. pp. 83–92.
[140] T. Repke and R. Krestel. (2018). Bringing Back Structure To Free [164] M. A. Hassan and N. Mtetwa, ‘‘Feature extraction and classification of
Text Email Conversations With Recurrent Neural Networks. [Online]. spam emails,’’ in Proc. 5th Int. Conf. Soft Comput. Mach. Intell. (ISCMI),
Available: http://isc.enron.com/ Nov. 2018, pp. 93–98.
[141] T. Le, M. Vo, B. Vo, E. Hwang, S. Rho, and S. Baik, ‘‘Improving electric [165] I. Inuwa-Dutse, M. Liptrott, and I. Korkontzelos, ‘‘Detection of spam-
energy consumption prediction using CNN and bi-LSTM,’’ Appl. Sci., posting accounts on Twitter,’’ Neurocomputing, vol. 315, pp. 496–511,
vol. 9, no. 20, p. 4237, Oct. 2019. Nov. 2018.
[166] S. Aiyar and N. P. Shetty, ‘‘N-gram assisted YouTube spam comment
[142] F. Shahid, A. Zameer, and M. Muneeb, ‘‘Predictions for COVID-19 with
detection,’’ Proc. Comput. Sci., vol. 132, pp. 174–182, Jul. 2018.
deep learning models of LSTM, GRU and bi-LSTM,’’ Chaos, Solitons
[167] R. Alharthi, A. Alhothali, and K. Moria, ‘‘A real-time deep-learning
Fractals, vol. 140, Nov. 2020, Art. no. 110212.
approach for filtering Arabic low-quality content and accounts on
[143] S. M. Zaman, M. M. Hasan, R. I. Sakline, D. Das, and M. A. Alam, ‘‘A Twitter,’’ Inf. Syst., vol. 99, Jul. 2021, Art. no. 101740.
comparative analysis of optimizers in recurrent neural networks for text [168] Y. Liu, B. Pang, and X. Wang, ‘‘Opinion spam detection by incorporating
classification,’’ in Proc. IEEE Asia–Pacific Conf. Comput. Sci. Data Eng. multimodal embedded representation into a probabilistic review graph,’’
(CSDE), vol. 3, Dec. 2021, pp. 1–6. Neurocomputing, vol. 366, pp. 276–283, Nov. 2019.
[144] S. E. Rahman and S. Ullah, ‘‘Email spam detection using bidirectional [169] T. Wu, S. Liu, J. Zhang, and Y. Xiang, ‘‘Twitter spam detection based on
long short term memory with convolutional neural network,’’ in Proc. deep learning,’’ in ACM Int. Conf. Proc. Ser., Jan. 2017, pp. 1–26.
IEEE Region 10 Symp. (TENSYMP), Jun. 2020, pp. 1307–1311. [170] A. Barushka and P. Hajek, ‘‘Review spam detection using word
[145] C. M. Shaik, N. M. Penumaka, S. K. Abbireddy, V. Kumar, and embeddings and deep neural networks,’’ in Proc. 15th IFIP WG, 2019,
S. S. Aravinth, ‘‘Bi-LSTM and conventional classifiers for email spam pp. 340–350.
filtering,’’ in Proc. 3rd Int. Conf. Artif. Intell. Smart Energy (ICAIS), [171] I. Kanaris, K. Kanaris, and E. Stamatatos, ‘‘Spam detection using
Feb. 2023, pp. 1350–1355. character n-grams,’’ in Proc. 4th Helenic Conf., 2006, pp. 95–104.

EKRAMUL HAQUE TUSHER received the B.Sc. degree in computer science from International Islamic University Chittagong (IIUC). He is currently pursuing the master’s degree in soft computing and intelligent systems with Universiti Malaysia Pahang Al-Sultan Abdullah (UMPSA), Pekan, Pahang, Malaysia. He has been a Research Assistant with the Machine Intelligence Research Group (MIRG), UMPSA, since 2023. His current research interests include machine learning methods, deep learning, fuzzy systems, and explainable AI.

MOHD ARFIAN ISMAIL received the B.Sc., M.Sc., and Ph.D. degrees in computer science from Universiti Teknologi Malaysia (UTM), in 2008, 2011, and 2016, respectively. He is currently an Associate Professor with the Faculty of Computing, University Malaysia Pahang Al-Sultan Abdullah, Malaysia. His current research interests include machine learning methods and fuzzy systems.

talent, in 2022. He was awarded the Higher Education Academy (HEA) Fellowship from the U.K. He has received several prestigious international research awards, notably the Best Paper Award at ICNS’15 (Italy); IC0902 Grant (France); Italian Government Ph.D. Research Scholarship; the IIUM Best Masters Student Award; the Best Supervisor Award at UMP; and the Awards in International Exhibitions, including the Euro Business-HALLER Poland Special Award at MTE 2022; the Best Innovation Award at MTE 2020, Malaysia; the Diamond and Gold in BiS’17 U.K.; the Best of the Best Innovation Award and Most Commercial IT Innovation Award, Malaysia; and the Gold and Silver Medals in iENA’17 Germany. He served as the Specialty Chief Editor for IoT Theory and Fundamental Research (specialty section of Frontiers in the Internet of Things); an Advisory Board Member and an Editorial Board Member for Computer Systems Science and Engineering (Tech Science Press) and Computers (MDPI); a Lead Guest Editor for IEEE ACCESS and Computers; an Associate Editor for IEEE ACCESS and Patron; the General Chair; the Organizing Committee Member; the Publicity Chair; the Session Chair; the Programme Committee Member; and a member of the Technical Programme Committee (TPC) in numerous leading conferences worldwide, such as IEEE Globecom, IEEE DASC, IEEE iSCI, and IEEE ETCCE, and journals. His name was enlisted inside the World Top 2% Scientists list released by Stanford University under the category of Citation Impact in Single Calendar Year in 2019, 2020, and 2021.