
UNSOLICITED E-MAIL DETECTION

Ramadevi M S*1, Saru Nivedha K2

*1 Department of Computer Science and Engineering, Mount Zion College of Engineering and Technology, Tamilnadu, India
[email protected]
2 Department of Computer Science and Engineering, Mount Zion College of Engineering and Technology, Tamilnadu, India
[email protected]

ABSTRACT

The frequency of cyber security incidents has increased in recent years. Attackers use spam emails as a gateway to infiltrate government systems, renowned companies, and the websites of politicians and social organizations across multiple nations. Identifying spam emails within large email datasets has therefore drawn significant public attention. Existing detection methods find it increasingly difficult to cope with spammers' growing array of deceptive tactics and the surge in email volume. The objective of this study is to develop a novel and efficient method for classifying large email datasets into four separate categories: Normal, Fraudulent, Harassment, and Suspicious. To achieve this classification, Long Short-Term Memory (LSTM) based Gated Recurrent Units (GRUs) are used. The proposed LSTM-based GRU proves adept at capturing meaningful information from emails, which is valuable for forensic analysis and evidentiary purposes. The technique involves two crucial stages: sample expansion and testing. By combining LSTMs with gated recurrent units, Spam Spoiler outperforms existing machine learning algorithms with an accuracy of 98%. Spam Spoiler excels at analyzing e-mail content across diverse topics, maintaining a robust and reliable classification system.

Keywords: Email Classification, Spam, Phishing, Machine Learning, Random Forest, Cyber Security, Text Classification, Naïve Bayes.
I. INTRODUCTION

In examining electronic mail (e-mail) related crimes, a comprehensive analysis of both the email header and body becomes imperative, as the semantics of communication play a crucial role in identifying potential evidence sources. The objective is to choose the optimal model for e-mail forensic tools. This project introduces a novel and efficient approach called the E-Mail Sink API, utilizing a Long Short-Term Memory (LSTM)-based Gated Recurrent Unit (GRU) for multiclass email classification. The primary focus is on identifying harmful or unfavorable e-mails received at the e-mail server end through a deep learning-based architecture. The proposed approach concurrently models emails at various levels, including the email header, email body, character level, and word level, with the goal of distinguishing whether an email exhibits characteristics indicative of cybercrime.

Email messages traverse email servers using multiple protocols within the TCP/IP suite [1],[10],[14]. For instance, SMTP (Simple Mail Transfer Protocol) is used to send messages, while other protocols like IMAP or POP are employed to retrieve messages from a mail server. Accessing a mail account typically involves entering a valid email address, password, and mail server details for sending and receiving messages. While webmail servers often auto-configure accounts, manual configuration may be necessary when using email clients like Microsoft Outlook or Apple Mail; additionally, entering incoming and outgoing mail servers along with correct port numbers may be required.

Despite the widespread use of the Internet for professional, social, and personal activities, there exists a subset of individuals attempting to compromise Internet-connected devices, violate privacy, and disrupt online services. Email, as a universal service utilized by over a billion people worldwide, has become a significant vulnerability. Startling statistics reveal that email remains the primary threat vector for data breaches, serving as the entry point for ninety-four percent of breaches, with an attack occurring every 39 seconds. Over 30% of phishing messages are opened, and 12% of users click on malicious links. In response to the escalating sophistication of cybercrime and its ability to bypass legacy controls, security measures must evolve accordingly.

II. RELATED WORK

In this section, we delve into previous work related to the identification of compromised machines. Our primary focus is on studies utilizing spamming activities for bot detection, followed by a brief overview of various efforts in detecting general botnets. Two recent studies [19], [20], based on email messages received by a large email service provider, investigated the global characteristics of spamming botnets, including botnet size and spamming patterns. These studies employed clustering techniques on spam messages to reveal insights into the aggregate global characteristics of spamming botnets. However, their applicability is more suited to large email service providers for understanding global botnet characteristics rather than being deployed by individual networks to identify internal compromised machines. Additionally, their approaches lack support for the online detection requirement in the network environment considered in this paper, where we aim to develop a tool for system administrators to automatically detect compromised machines.

Xie et al. developed DBSpam, an effective tool for detecting proxy-based spamming activities in a network, relying on the packet symmetry property of such activities [13]. While DBSpam identifies spam proxies translating and forwarding non-SMTP packets upstream, our goal is to identify all types of compromised machines involved in spamming. Moving on to general botnet detection schemes, BotHunter [8], developed by Gu et al., correlates the Intrusion Detection System (IDS) dialog trace in a network to detect compromised machines. It is designed based on the observation that a complete malware infection process has well-defined stages, and by correlating inbound intrusion alarms with outbound communication patterns, BotHunter identifies potentially infected machines.

In contrast to BotHunter, SPOT focuses on the economic incentives behind compromised machines and their involvement in spamming. BotSniffer [9], an anomaly-based detection system, identifies botnets by exploring spatial-temporal behavioral similarities commonly observed in IRC-based and HTTP-based botnets. On the other hand, BotMiner [7], one of the first protocol- and structure-independent botnet detection systems, classifies flows into groups based on communication and malicious activity patterns; the intersection of these groups identifies compromised machines.

Compared to existing general botnet detection systems like BotHunter, BotSniffer, and BotMiner, SPOT distinguishes itself as a lightweight compromised machine detection scheme, focusing on the economic incentives driving attackers to recruit a large number of compromised machines. Leveraging the Sequential Probability Ratio Test (SPRT) as a simple yet powerful statistical method, SPOT has found successful application in various areas of networking security, including portscan activity detection, proxy-based spamming activity detection, anomaly-based botnet detection, and MAC protocol misbehavior in wireless networks.

III. RESULTS AND DISCUSSION

The proposed approach involves data collection, preprocessing, feature extraction, parameter tuning, and classification through the LSTM-GRU model. E-mail datasets in the project are categorized into normal, harassing, suspicious, and fraudulent classes. The E-mail body is segmented into word levels, and the embedding layer is utilized for training to generate the sequence of vectors.

A. Long Short-Term Memory (LSTM)

LSTMs, a specialized type of Recurrent Neural Network (RNN), excel in learning long-term dependencies, overcoming the challenge of retaining information over extended periods. Comprising units known as LSTM units or blocks, these form the building components for layers in an RNN, collectively referred to as an LSTM network [8]. A standard LSTM unit consists of a cell, an input gate, an output gate, and a forget gate. The cell's function involves retaining values across arbitrary time intervals, addressing the long-term memory aspect of LSTMs. The gates, resembling conventional neurons, regulate the flow of values through connections in the LSTM.
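For reference, the gating just described can be written out explicitly; the following standard formulation comes from the general LSTM literature rather than the source, with x_t the current input, h_{t-1} the previous hidden state, and \odot denoting element-wise multiplication:

f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)          (forget gate)
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)          (input gate)
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)          (output gate)
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)   (candidate cell state)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t    (cell state update)
h_t = o_t \odot \tanh(c_t)                         (hidden state / output)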
The term "long short-term" signifies that LSTMs model effectiveness in overcoming the vanishing gradient
short-term memory capable of lasting for an extended problem is attributed to the utilization of an update gate
duration [18]. LSTMs are adept at classifying, and a reset gate. The update gate manages the
information flowing into memory, while the reset gate
processing, and predicting time series data, particularly
governs the information flowing out of memory. Both
in scenarios with unknown time lags and durations gates, represented as vectors, determine the information
between crucial events. Developed to address the transmitted to the output. Their training can prioritize
challenges of exploding and vanishing gradient problems retaining relevant past information or discarding
in training traditional RNNs, LSTMs have proven irrelevant details, contributing to GRU's ability to
effective in handling complex temporal relationships. mitigate the vanishing gradient problem in recurrent
neural networks. GRU proves to be a valuable tool for
addressing the vanishing gradient problem, a challenge
that arises when the gradient diminishes significantly,
impeding weight adjustments. Notably, GRU
demonstrates superior performance compared to LSTM,
especially when handling smaller datasets.

Figure 1: Long Short-Term Memory (LSTM)

The appeal of LSTM stems from its inclusion of the


Gating mechanism within each LSTM cell. Unlike a
typical RNN cell, which processes input from the current
timestamp and the hidden state from the previous step
through an activation layer to produce a new state,
LSTM introduces a more intricate process. In the
depicted architecture, at each time step, it takes input
from three distinct states: the current input state, the Figure 2: Gated Recurrent Unit (GRU))
short-term memory from the previous cell, and the long-
term memory. C. LSTM BASED GRU CLASSIFICATION
These cells leverage gates to control the retention or MODEL
discarding of information during loop operations before LSTM comprises the classification of E-mails as
passing on both the long-term and short-term information Normal, Harassing, Suspicious, and Fraudulent.
to the subsequent cell. Conceptually, these gates function The Stand GRU is both based on the gated
as filters, eliminating undesired, selected, and irrelevant network architecture, due to which we combined
information. LSTM employs three gates in total: the
the GRU and LSTM to utilize the gated
Input Gate, Forget Gate, and Output Gate.
architecture of both of them. The DL models'
B. Gated Recurrent Unit (GRU)) layered structure helps in learning without
Introduced by Cho et al. in 2014, the Gated Recurrent intervention in ML model implementation.
Unit (GRU) addresses the vanishing gradient problem Several libraries provide an in-depth learning
encountered by conventional Recurrent Neural implementation structure. We split the data into
Networks (RNN). Similar to the Long Short-Term three training, validation, and testing sets with a
Memory (LSTM), GRU incorporates a gating
65 V 10 V 25 ratio. We extracted the features
mechanism to regulate the memorization process [16].
Operating as a gating mechanism in RNNs, GRU shares from textual data of E-mail using the word
similarities with LSTM but lacks an output gate. It Embedding technique. We encode the target
endeavors to resolve the vanishing gradient problem values using the one-hot encoding technique into
often associated with standard RNNs. Considered a 4-distinct classes. We pass all pre-processed data
variation of the LSTM unit, GRU boasts a similar design to the novel architecture of LSTM layers variants
and produces comparable results in certain scenarios. Its for the perfect classification of E-mails. We use
the LSTM layers with different GRU and An email message comprises two primary components:
Convo1D layer variants to transform the input the header and the body. The header provides broader
textual data into an efficient E-mail classification content-related information, including the subject,
system. sender, recipient email addresses, and timestamp. The
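A minimal sketch of such an Embedding + Conv1D + LSTM + GRU stack, assuming Keras; the layer counts, filter sizes, and unit counts here are illustrative assumptions, not the authors' reported configuration:

# Illustrative LSTM-GRU classifier over embedded e-mail text (Keras sketch).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, MaxPooling1D, LSTM, GRU, Dense

model = Sequential([
    Embedding(input_dim=70000, output_dim=800, input_length=600),
    Conv1D(filters=64, kernel_size=5, activation="relu"),  # Conv1D variant over word vectors
    MaxPooling1D(pool_size=2),
    LSTM(128, return_sequences=True),  # LSTM layer passes its full sequence onward
    GRU(64),                           # GRU layer consumes the LSTM output sequence
    Dense(4, activation="softmax"),    # Normal, Harassing, Suspicious, Fraudulent
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])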
An email message comprises two primary components: the header and the body. The header provides broader content-related information, including the subject, the sender and recipient email addresses, and the timestamp. The email's core lies in its body, containing variable content such as web pages, files, analog data, images, audio, video, and HTML markup, among other elements [17]. The "From" field typically initiates the header lines, and as the email traverses servers, its content may undergo modifications, allowing users to inspect the routing path in the headers. Before serving as input to the classifier, the contents may undergo pre-processing. In this stage, we extract email messages for classification, removing Uniform Resource Locators (URLs), Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), JavaScript code, and special symbols. This process results in unformatted text, ensuring a streamlined input for subsequent classification.

Figure 3: LSTM and GRU based solution.

Textual data needs special attention when it comes to feature extraction in the proposed methodology. Different feature extraction methods must be implemented when solving natural language processing problems with DL. The main point is to convert the textual data into real-valued vectors, which in natural language processing carry the unique name "embedding vector". There are multiple ways to generate the embedding vector from textual data; the best-known methods are the GloVe and Word2Vec techniques. The embedding vector dimensions are essential to get all the features extracted from the data. Suppose we have 8 samples of textual data belonging to two distinct classes, and each sample has a maximum of five tokens. The vocabulary will be the set of unique words in the 8 samples, and the vocabulary size needs to be larger than the number of unique tokens available in the dataset to avoid collisions with the hash function. In this case, the dimensions of the embedding matrix will be 4 x 8. For an NLP classification problem, we also need to encode the target values using the one-hot encoding method. After obtaining the vectors for the words, the similarity between words is measured by applying a similarity measure to the corresponding vectors.

Figure 4: Overall Process

We first proceed with the standard ML algorithms. For text-based processing, the NLTK library puts the text of the information in order, and the TF-IDF method is used to weight each word. Within the proposed framework, the model extracts the features using the TF-IDF vectorizer, followed by Gaussian Naïve Bayes (provided with the sklearn Python library) for prediction. The deep learning case is a special one: a tokenization layer maintains a dictionary that maps words to indexes, and the embedding layer internally maintains a lookup table that maps each index/token to a vector, so that words are represented by vectors in high-dimensional spaces. The process appears as: email → token → lookup table → vector.
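A minimal sketch of this ML baseline, assuming scikit-learn and illustrative data; note that GaussianNB requires a dense matrix, hence the toarray() calls:

# Baseline ML pipeline: TF-IDF features followed by Gaussian Naive Bayes (scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import GaussianNB

emails = ["win a free prize now", "meeting agenda attached"]  # illustrative samples
labels = [1, 0]                                               # 1 = suspicious, 0 = normal

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(emails).toarray()  # GaussianNB needs dense input
clf = GaussianNB().fit(X, labels)
print(clf.predict(vectorizer.transform(["free prize meeting"]).toarray()))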
D. IMPLEMENTATION

We use components such as Embedding, LSTM, and GRU layers in building our network. The purpose of using a bidirectional arrangement is that situational information often comes at the end of a sentence; without this information, uncertainty might arise. The first LSTM network feeds in the input sequence as usual, while the second LSTM network processes the input sequence in reverse. The merged output of these two networks then serves as input to the next layer. Comparing 1) training the Embedding Layer from scratch and 2) employing open-source pre-trained embeddings, we chose the GloVe word embedding method from the Stanford NLP Group. To apply the GloVe embedding, we first converted the email message text to sequences and defined a vocabulary in which each word has a unique index. We padded shorter sentences to the maximum length (the longest length remaining after pre-processing). After that, the LSTM served to obtain the context representations. We further extended the testing phase by processing the emails both including and excluding digits/figures from their main body.
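One way to realize the bidirectional arrangement described above is Keras's Bidirectional wrapper, which runs a forward and a reversed LSTM and merges their outputs; the sketch below assumes that wrapper, with illustrative shapes and a zero-filled placeholder where the GloVe vectors would be loaded:

# Bidirectional LSTM over frozen pre-trained embeddings (Keras sketch).
import numpy as np
from tensorflow.keras.initializers import Constant
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

vocab_size, embed_dim, max_len = 70000, 100, 600
embedding_matrix = np.zeros((vocab_size, embed_dim))  # fill rows with GloVe vectors here

model = Sequential([
    Embedding(vocab_size, embed_dim,
              embeddings_initializer=Constant(embedding_matrix),
              trainable=False),        # frozen pre-trained embeddings
    Bidirectional(LSTM(128)),          # forward + reversed LSTM, outputs merged
    Dense(4, activation="softmax"),
])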
1. Dataset:
The dataset used in this project is an amalgamation of four different datasets. It contains normal e-mails from the Enron corpora; fraudulent e-mails provided by a phished e-mail corpus containing misleading information; and harassment messages selected from the Hate Speech and Offensive Language dataset. We enhance the dataset by adding suspicious email data from email and Twitter sources; the suspicious subset includes some Twitter messages related to terrorism. In order to classify E-mails into more than one category, these different datasets are combined into a single structured file.

Dataset Annotation:
In this project, E-mails are divided into normal, harassing, suspicious, and fraudulent classes. The E-mail body is divided into word levels, and the embedding layer is applied to train and obtain the sequence of vectors.

Data preprocessing involves standardizing text and preparing it for analysis based on natural language.

Tokenization
Breaking up the original text into component pieces is the tokenization step in natural language processing. There are predefined rules for tokenizing documents into words. The tokenization step is performed in Python using the SpaCy library.

Figure 5: Data Cleaning and Tokenizing phases of text processing

Normalization
These are the steps needed for translating text from human language (English) to a machine-readable format for further processing. The process starts with:

o changing all alphabets to lower or upper case
o expanding abbreviations
o excluding numbers or changing them to words
o removing white spaces
o removing punctuation, inflection marks, and other diacritics
o removing stop words, sparse terms, and particular words
o text canonicalization

Stop Words Removal
Words like "a" and "the" appear so frequently that they are not relevant to the context of the E-mail and create noise in the text data. These words are called stop words, and they can be filtered from the text to be processed. We utilized the "NLTK" Python library to remove stop words from the text.

Punctuation Removal
Punctuation (e.g., the full stop (.), comma (,), and brackets) is used to separate sentences and clarify meaning. For punctuation removal, we utilize the "NLTK" library.
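A compact sketch of these cleaning steps, assuming spaCy's blank English tokenizer and the NLTK stop-word list (which requires a one-time nltk.download); punctuation is stripped here with Python's string.punctuation for brevity:

# Tokenization (spaCy), stop-word removal (NLTK), and punctuation removal.
import string
import spacy
from nltk.corpus import stopwords  # requires: nltk.download("stopwords")

nlp = spacy.blank("en")                    # lightweight tokenizer-only pipeline
stop_words = set(stopwords.words("english"))

def preprocess(text: str) -> list[str]:
    text = text.lower()                                         # normalization
    tokens = [t.text for t in nlp(text)]                        # tokenization
    tokens = [t for t in tokens if t not in stop_words]         # stop-word removal
    return [t for t in tokens if t not in string.punctuation]   # punctuation removal

print(preprocess("The prize is waiting for you, claim it now!"))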
2. Feature Extraction

After eliminating irrelevant information, the elaborated list of words is converted into numbers. The TF-IDF method is applied to accomplish this task: Term Frequency is the number of occurrences of a word in a document, and IDF is the ratio of the total number of documents to the number of documents containing the term. A popular and straightforward method of feature extraction from text data is the bag-of-words model, or BoW for short, a way of extracting features from text for use in modelling, such as in machine learning algorithms. A bag-of-words is a representation of text that describes the occurrence of words within a document. It involves two things: (1) a vocabulary of known words, and (2) a measure of the presence of known words. We extract features on the basis of Equation 1, where tf represents term frequency, df represents document frequency, and N is the total number of documents:

tf\text{-}idf(t, d) = tf(t, d) \times \log(N / df(t))    (1)

Feature extraction in DL with the context of words is also essential. The technique used for this purpose is the word2vec neural-network-based algorithm. Equation 5 shows how word2vec manages the word-context with the help of probability measures, where D represents the pair-wise illustration of a set of words and (w, c) is a word-context pair drawn from the large set D:

\arg\max_{\theta} \prod_{(w,c) \in D} p(c \mid w; \theta)    (5)

The multi-word context is also a variant of word2vec, as shown in Equation 6; the variable-length context c = c_1, ..., c_k is likewise controlled by the mathematics given below:

\arg\max_{\theta} \prod_{(w,c) \in D} \prod_{i=1}^{k} p(c_i \mid w; \theta)    (6)
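As an illustration (not the authors' code), skip-gram embeddings of this kind can be trained with the gensim library; the corpus and parameters below are placeholders:

# Training word2vec embeddings on tokenized email bodies (gensim sketch).
from gensim.models import Word2Vec

sentences = [["claim", "your", "prize", "now"],
             ["project", "meeting", "rescheduled"]]  # tokenized emails (illustrative)
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)  # sg=1: skip-gram
vector = model.wv["prize"]                    # embedding vector for one word
print(model.wv.most_similar("prize", topn=3)) # nearby words in the embedding space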
3. Embedding Layer

Embedding is the representation of words as real numbers. Many machine learning and DL algorithms cannot process data in raw (text) form and can only accept numerical values as input for learning. Word embedding organizes texts by converting them into numbers: it extracts the relevant features from the textual data and structures them in the form of real values, using a word-mapping dictionary to convert the terms (words) into a real-valued vector. There are two main problems with machine-learning feature engineering techniques: the first is the sparse vectors used for data representation, and the second is that, to some extent, it does not take the meaning of words into account. In embedding vectors, similar words are represented by nearly equal real values; for example, the terms "love" and "affection" will be near each other in the embedding space.

The embedding vector is used as a data structure in the DL algorithm to accomplish the learning. In the experimental setup, the word embedding layer contains information about the sequence length of the E-mails: we consider a sequence length of 600 characters for each E-mail. The embedding dimensions used in SeFACED are 800. The vocabulary size is set to 70,000 at the start, because we set this value after generating the unique tokens of our dataset. The embedding layer takes three arguments: input dimensions, output dimensions, and input length. In our proposed study, the input dimensions are 800, the vocabulary size is 70,000, and the input length is 600. We need to be careful when setting the embedding layer dimensions, because with large textual input we can otherwise skip essential features. The embedding layer output is used by the GRU layers in the adjacent layers.
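In Keras terms, the three arguments described above map onto the Embedding layer as follows (a sketch; by Keras convention, input_dim receives the vocabulary size and output_dim the embedding dimensions):

# Embedding layer with the configuration described above (Keras sketch).
from tensorflow.keras.layers import Embedding

embedding = Embedding(
    input_dim=70000,   # vocabulary size (unique tokens, with headroom)
    output_dim=800,    # embedding dimensions
    input_length=600,  # sequence length of each e-mail
)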
Figure 6: All tokenized emails are converted into vectors in the embedding phase

IV. ALGORITHM

Algorithm 1: Extract subject and body from email
Input: Email e in text format with metadata
Output: Email subject and body
Procedure:
1. convert e to lower case;
2. subject ← first line of e;

Algorithm 2: Pre-processing
Input: uncleaned email
Output: processed/cleaned email
Procedure:
1. For each document ∈ uncleaned documents do
2. Delete all the special characters
3. Delete all single characters
4. Delete single characters from the beginning
5. Substitute multiple spaces with a single space
6. Delete the prefixed `b'
7. Transform to lowercase
8. Lemmatization
9. End for of uncleaned document

Algorithm 3: TF-IDF
The TF-IDF algorithm splits into two terms. TF measures how often a word appears in the current document:
TF(word) = (number of times the word appears in the document) / (total number of words in the document)
IDF measures how important a term is across all documents, giving a score to each word:
IDF(word) = log((total number of documents) / (number of documents in which the word appears))

Algorithm 4: TF
Input:
1. D: emails
2. T: the unique terms in all emails
Output: weight matrix
Procedure:
1. For each term ti ∈ T do
2. For each email di ∈ D do
3. Wij = number of appearances of term ti in email di
4. End for of email
5. End for of term

Algorithm 5: IDF
Input:
1. T: the unique terms in all emails
2. D: emails
3. weight matrix from the TF step
Output: TF-IDF weight for each term
Procedure:
1. For each term ti ∈ T do
2. For each email di ∈ E do
3. If TFij is not equal to zero, then EFi++
4. End for of document
5. IDFi ← log(E / EFi)
6. End for of term
7. For each term ti do
8. For each document dj ∈ E do
9. TF-IDFij = TFij × IDFi
10. End for of document
11. End for of term

Algorithm 6: LSTM-GRU
Input: An email dataset
Output: The converted synthetic sentence format F(D)
M ← Construct feature-feature correlation matrix;
L ← Construct feature-label correlation vector;
Feature reordering set S = [w1, w2, ..., wm] construction:
1. S = ∅;
2. h ← Obtain the index of the feature that has the strongest correlation with the class label according to the feature-label correlation vector L;
3. V ← Sort features in descending order according to their correlation to the h-th feature by utilizing the feature-feature correlation matrix M;
4. w1 = V: Determine the elements in the feature reordering vector w1;
5. for each i ∈ [2, ..., m] do
6. for each j ∈ [1, 2, ..., m] do
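A direct Python rendering of Algorithms 4 and 5 (a sketch; names follow the pseudocode, and TF is kept as a raw count exactly as Algorithm 4 specifies):

# TF, IDF, and the TF-IDF weight matrix, following Algorithms 4 and 5.
import math

def tf_idf_matrix(emails: list[list[str]], terms: list[str]) -> list[list[float]]:
    E = len(emails)
    # Algorithm 4: TF weight matrix W[i][j] = count of term i in email j.
    W = [[doc.count(t) for doc in emails] for t in terms]
    # Algorithm 5: document frequency EF[i], then IDF and the final weights.
    EF = [sum(1 for j in range(E) if W[i][j] != 0) for i in range(len(terms))]
    idf = [math.log(E / EF[i]) if EF[i] else 0.0 for i in range(len(terms))]
    return [[W[i][j] * idf[i] for j in range(E)] for i in range(len(terms))]

emails = [["free", "prize"], ["meeting", "agenda", "free"]]
print(tf_idf_matrix(emails, ["free", "prize", "meeting"]))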

V. RESULT AND ANALYSIS

The results and comparisons of the different classifiers after data training and testing are presented in this section. We gathered 5,000 emails from the online resource Kaggle and translated them into Urdu using the Python library googletrans, which uses the Google Translate Ajax API. Four thousand emails were used to train the various ML and DL models, and one thousand emails were used for testing in order to quantify accuracy and the assessment metrics. We evaluated accuracy, precision, recall, and F-measure for SVM and Naive Bayes, while LSTM and GRU were additionally used to measure ROC-AUC and model-loss values. Finally, a comparison of the models is presented below using various graphs. The findings in Table 1 show that the deep learning algorithm (LSTM) is the stronger method for detecting Urdu spam emails, with a high accuracy of 98.4%.

Models          Accuracy (%)
LSTM            98.4
CNN             96.2
Naïve Bayes     98.0
SVM             97.5

Table 1: Accuracy of different Models

In Table 1, we compare the accuracy of four different ML and DL models. The DL model (LSTM) is the most accurate among all the models, but it takes a long time to train. ML models like SVM and Naive Bayes reach roughly the same accuracy, slightly lower than LSTM, while CNN, although also a DL model, has the lowest accuracy percentage. Figure 7 shows the accuracy comparison of the ML and DL models.

Figure 7: Accuracy comparison of ML and DL Models
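The reported measures can be computed with scikit-learn along these lines (a sketch with placeholder label arrays, not the authors' actual predictions):

# Computing accuracy, precision, recall, and F-measure (scikit-learn sketch).
from sklearn.metrics import accuracy_score, classification_report

y_true = [0, 1, 2, 3, 1, 0]   # placeholder gold labels for the four classes
y_pred = [0, 1, 2, 2, 1, 0]   # placeholder model predictions
print("accuracy:", accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred,
      target_names=["normal", "fraudulent", "harassment", "suspicious"]))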
VI. CONCLUSION

With the growing trend of cybercrime and incidents resulting from vulnerabilities, proactive monitoring and post-incident analysis of email data are crucial for organizations. Cybercrimes like hacking, spoofing, phishing, e-mail bombing, whaling, and spamming are performed through e-mails. The existing email classification approaches lead to irrelevant e-mails and/or loss of valuable information. Keeping these limitations in sight, we designed a novel, efficient approach named E-Mail Sink AI for E-mail classification into four different classes (Normal, Fraudulent, Threatening, and Suspicious) by using an LSTM-based GRU that deals not only with short sequences but also with long dependencies of 1,000+ characters. We evaluated the proposed E-Mail Sink AI model using evaluation metrics such as precision, recall, accuracy, and F-score. Experimental results revealed that E-Mail Sink AI performed better than the existing ML algorithms and achieved a classification accuracy of 95% using the novel technique of LSTM with gated recurrent units.

REFERENCES

[1]. S. Sinha, I. Ghosh, and S. C. Satapathy, "A study for ANN model for spam classification," in Intelligent Data Engineering and Analytics. Singapore: Springer, 2021, pp. 331-343.
[2]. Q. Li, M. Cheng, J. Wang, and B. Sun, "LSTM based phishing detection for big email data," IEEE Trans. Big Data, early access, Mar. 12, 2020, doi: 10.1109/TBDATA.2020.2978915.
[3]. T. Gangavarapu, C. D. Jaidhar, and B. Chanduka, "Applicability of machine learning in spam and phishing email filtering: Review and approaches," Artif. Intell. Rev., vol. 53, no. 7, pp. 5019-5081, Oct. 2020, doi: 10.1007/s10462-020-09814-9.
[4]. E. Bauer, "15 outrageous email spam statistics that still ring true in 2018," Propeller CRM Blog. Accessed: Oct. 10, 2020. [Online]. Available: https://www.propellercrm.com/blog/email-spam-statistics
[5]. A. Karim, S. Azam, B. Shanmugam, K. Kannoorpatti, and M. Alazab, "A comprehensive survey for intelligent spam email detection," IEEE Access, vol. 7, pp. 168261-168295, 2019.
[6]. K. Singh, S. Bhushan, and S. Vij, "Filtering spam messages and mails using fuzzy C means algorithm," in Proc. 4th Int. Conf. Internet of Things: Smart Innovation and Usages (IoT-SIU), Apr. 2019, pp. 1-5.
[7]. R. S. H. Ali and N. El Gayar, "Sentiment analysis using unlabeled email data," in Proc. Int. Conf. Computational Intelligence and Knowledge Economy (ICCIKE), Dec. 2019, pp. 328-333.
[8]. K. Agarwal and T. Kumar, "Email spam detection using integrated approach of naïve Bayes and particle swarm optimization," in Proc. 2nd Int. Conf. Intelligent Computing and Control Systems (ICICCS), Jun. 2018, pp. 685-690.
[9]. M. Shuaib, O. Osho, I. Ismaila, and J. K. Alhassan, "Comparative analysis of classification algorithms for email spam detection," Int. J. Comput. Netw. Inf. Secur., vol. 10, no. 1, pp. 60-67, Aug. 2018.
[10]. G. Mujtaba, L. Shuib, R. G. Raj, N. Majeed, and M. A. Al-Garadi, "Email classification research trends: Review and open issues," IEEE Access, vol. 5, pp. 9044-9064, 2017.
[11]. Z. Chen, Y. Yang, L. Chen, L. Wen, J. Wang, G. Yang, and M. Guo, "Email visualization correlation analysis forensics research," in Proc. IEEE 4th Int. Conf. Cyber Security and Cloud Computing (CSCloud), Jun. 2017, pp. 339-343.
[12]. N. Moradpoor, B. Clavie, and B. Buchanan, "Employing machine learning techniques for detection and classification of phishing emails," in Proc. Computing Conf., Jul. 2017, pp. 149-156.
[13]. A. S. Aski and N. K. Sourati, "Proposed efficient algorithm to filter spam using machine learning techniques," Pacific Sci. Rev. A, Natural Sci. Eng., vol. 18, no. 2, pp. 145-149, Jul. 2016.
[14]. Y. Kaya and O. F. Ertugrul, "A novel approach for spam email detection based on shifted binary patterns," Secur. Commun. Netw., vol. 9, no. 10, pp. 1216-1225, Jul. 2016.
[15]. I. Idris, A. Selamat, N. T. Nguyen, S. Omatu, O. Krejcar, K. Kuca, and M. Penhaker, "A combined negative selection algorithm-particle swarm optimization for an email spam detection system," Eng. Appl. Artif. Intell., vol. 39, pp. 33-44, Mar. 2015.
[16]. M. Shuaib, O. Osho, I. Ismaila, and J. K. Alhassan, "Comparative analysis of classification algorithms for email spam detection," Int. J. Comput. Netw. Inf. Secur., vol. 10, no. 1, pp. 60-67, Aug. 2018.
[17]. A. Basit, M. Zafar, X. Liu, A. R. Javed, Z. Jalil, and K. Kifayat, "A comprehensive survey of AI-enabled phishing attacks detection techniques," Telecommun. Syst., pp. 1-16, Oct. 2020.
[18]. C. Iwendi, Z. Jalil, A. R. Javed, T. Reddy, R. Kaluri, G. Srivastava, and O. Jo, "KeySplitWatermark: Zero watermarking algorithm for software protection against cyber-attacks," IEEE Access, vol. 8, pp. 72650-72660, 2020.
[19]. S. U. Rehman, M. Khaliq, S. I. Imtiaz, A. Rasool, M. Shafiq, A. R. Javed, Z. Jalil, and A. K. Bashir, "DIDDOS: An approach for detection and identification of distributed denial of service (DDoS) cyberattacks using gated recurrent units (GRU)," Future Gener. Comput. Syst., vol. 118, pp. 453-466, May 2021.
[20]. S. I. Imtiaz, S. U. Rehman, A. R. Javed, Z. Jalil, X. Liu, and W. S. Alnumay, "DeepAMD: Detection and identification of Android malware using high-efficient deep artificial neural network," Future Gener. Comput. Syst., vol. 115, pp. 844-856, Feb. 2021.
