Decision Tree Model For Email Classification: Ivana Čavor
A. Email dataset
The dataset used for the classification consists of 4000 entries [10]. It contains 3465 ham and 535 spam messages. The dataset is divided into two subsets: a training set and a testing set. The size of the dataset assigned for training can affect the system's performance, as will be shown further on.
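As a concrete illustration, the loading and splitting step might look like the sketch below; the SMSSpamCollection file name (the distribution format of the UCI corpus in [10]) and the 80/20 split ratio are assumptions rather than details taken from the paper.

```python
import random

def load_dataset(path="SMSSpamCollection"):
    """Read (label, text) pairs from a tab-separated file: 'ham\t<text>' or 'spam\t<text>'."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            label, _, text = line.rstrip("\n").partition("\t")
            if text:
                examples.append((label, text))
    return examples

def train_test_split(examples, train_fraction=0.8, seed=42):
    """Shuffle and split the labeled messages; the 80/20 ratio is an assumption."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train_set, test_set = train_test_split(load_dataset())
```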
B. Preprocessing of dataset
The email dataset needs to be preprocessed before feature selection can be performed. It is well known that spam mails usually contain phone numbers, email addresses, website URLs, money amounts, and a lot of whitespace and punctuation. Instead of removing these terms, for each training example they are replaced with specific placeholder strings, such as the 'numbr', 'httpaddr' and 'moneysymb' tokens that appear as feature names in Table I.

The task is to understand whether there are any specific words or sequences of words that determine whether an email is spam or not. For this purpose, the Term Frequency (TF) method is used. TF can be defined as a numerical statistic intended to reflect how important a word is to a document in a corpus. The TF value is directly proportional to the number of times a word appears in a document. Fig. 2 illustrates a word cloud of common words in spam email; the size of a word in Fig. 2 is proportional to its occurrence in spam emails. Words like 'free', 'txt' and 'call' have large TF weights, which makes them good indicators of spam.
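The replacement step described in Section B could be implemented roughly as in the following sketch. The placeholder tokens 'numbr', 'httpaddr' and 'moneysymb' mirror the feature names in Table I; the 'emailaddr' token and the exact regular expressions are assumptions, since the paper's replacement list is not reproduced in this excerpt.

```python
import re

def preprocess(text):
    """Replace spam-typical entities with placeholder tokens instead of removing them."""
    text = text.lower()
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b", "emailaddr", text)  # email addresses (assumed token)
    text = re.sub(r"(https?://\S+|www\.\S+)", "httpaddr", text)        # website URLs
    text = re.sub(r"[£$€]\s*\d+(?:[.,]\d+)?", "moneysymb", text)       # money amounts
    text = re.sub(r"\b\d{5,}\b", "numbr", text)                        # phone numbers / long digit runs
    text = re.sub(r"[^\w\s]", " ", text)                               # strip punctuation
    text = re.sub(r"\s+", " ", text).strip()                           # collapse whitespace
    return text
```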
TABLE I
FEATURE MATRIX: EACH ROW REPRESENTS AN EMAIL WITH THE FEATURES PRESENTED IN COLUMNS

EMAIL     Numbr   Call   Txt   Free   Claim   Httpaddr   Moneysymb   Total_spam_words   DECISION/CLASS
Email_1   0       1      0     0      0       0          0           1                  Ham
Email_2   2       0      0     1      1       1          0           4                  Spam
Email_3   1       0      0     3      0       0          0           2                  Spam
Email_4   1       0      0     0      0       0          0           0                  Ham
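A sketch of how a feature row in the format of Table I could be assembled from a preprocessed message. The per-token counts are plain term frequencies; the spam-word list used for total_spam_words is illustrative, since the exact list behind that column is not given here.

```python
FEATURE_TOKENS = ["numbr", "call", "txt", "free", "claim", "httpaddr", "moneysymb"]
# Illustrative spam-word list for total_spam_words (assumed, not taken from the paper).
SPAM_WORDS = {"free", "call", "txt", "claim", "win", "prize", "urgent", "cash"}

def feature_vector(preprocessed_text):
    """Return term-frequency counts for each Table I column plus total_spam_words."""
    words = preprocessed_text.split()
    counts = [words.count(token) for token in FEATURE_TOKENS]
    total_spam_words = sum(1 for w in words if w in SPAM_WORDS)
    return counts + [total_spam_words]

# Example: produces a feature row in the same format as Table I.
row = feature_vector("free entry claim your prize now call numbr httpaddr")
```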
D. Decision tree
A decision tree uses a tree-like model to represent a number of possible decision paths as well as their potential outcomes [13]. Each decision tree node represents a feature, each branch represents a decision, and each leaf represents an outcome (class or decision). Decision trees can be used to predict the class of an unknown query instance by building a model trained on a set of labeled data. Each training example should be characterized by a number of descriptive features or attributes. The features can have either nominal or continuous values.
A decision tree consists of a root node, internal nodes and leaf nodes. Internal nodes represent the conditions based on which the tree splits into branches, and the leaf nodes represent the possible outcomes for each path. Each node typically has two or more nodes extending from it. "When classifying an unknown instance, the unknown instance is routed down the tree according to the values of the attributes in the successive nodes and when a leaf is reached the instance is classified according to class assigned to the leaf" [14]. The main advantage of using a decision tree is that it is easy to follow and understand. Fig. 4 presents an example of a typical decision tree. The words "free" and "money" are typical spam words and they are used as features. If the word "free" appears more than two times in an email, then the email is classified as spam. Otherwise, we ask whether the email contains the word "money". If the word "money" appears more than three times, then the email is classified as spam; otherwise it is ham.
Information gain is calculated to split the attributes further in the tree. The attribute with the highest information gain is always preferred first. Entropy and information gain are related by (2):

gain(S, A_i) = Entropy(S) - Entropy_{A_i}(S)    (2)

where Entropy_{A_i}(S) is the expected entropy if attribute A_i is used to partition the data.
The algorithm was implemented according to the following steps:
1. Create a root node.
2. Calculate the entropy of the whole (sub)dataset.
3. Calculate the information gain for each feature and select the feature with the largest information gain.
4. Assign the (root) node the label of the feature with the maximum information gain. Grow an outgoing branch for each feature value and add unlabeled nodes at the end.
5. Split the dataset along the values of the maximum-information-gain feature and remove this feature from the dataset.
6. For each sub-dataset, repeat steps 3 to 5 until a stopping criterion is satisfied.
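A minimal sketch of the entropy and information-gain computations used in steps 2 and 3 above, written for a nominal (already discretized) feature; the data layout, a list of (feature_dict, label) pairs, is an assumption for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels ('spam' / 'ham')."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(examples, feature):
    """gain(S, A) = Entropy(S) - Entropy_A(S), where Entropy_A(S) is the
    expected entropy after partitioning S on feature A."""
    labels = [label for _, label in examples]
    partitions = {}
    for features, label in examples:
        partitions.setdefault(features[feature], []).append(label)
    expected = sum(len(part) / len(examples) * entropy(part) for part in partitions.values())
    return entropy(labels) - expected
```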
Since the chosen features have continuous values, a binary split requires converting continuous values to nominal ones. This is done using a threshold value: the threshold is the value that offers the maximum information gain for that attribute. For example, the information gain is maximized when the threshold is equal to two for the total_spam_words feature from Table I.
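The threshold search can be sketched as follows: every distinct value of a continuous feature is tried as a candidate split point and the one with the highest information gain is kept. Applied to the total_spam_words column of the four Table I rows, the sketch already settles on a threshold of two (spam for values of two or more), consistent with the example in the text.

```python
import math
from collections import Counter

def entropy(labels):
    # Same helper as in the previous sketch.
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Find the binary-split threshold of a continuous feature that maximizes
    information gain; instances with value >= threshold go to one branch."""
    base = entropy(labels)
    best = (None, 0.0)
    for t in sorted(set(values)):
        below = [lab for v, lab in zip(values, labels) if v < t]
        at_or_above = [lab for v, lab in zip(values, labels) if v >= t]
        if not below or not at_or_above:
            continue
        expected = (len(below) * entropy(below) + len(at_or_above) * entropy(at_or_above)) / len(labels)
        gain = base - expected
        if gain > best[1]:
            best = (t, gain)
    return best

# total_spam_words and classes from Table I (Email_1 .. Email_4):
values = [1, 4, 2, 0]
labels = ["ham", "spam", "spam", "ham"]
print(best_threshold(values, labels))  # -> (2, 1.0)
```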
accuracy = (TP + TN) / (TP + TN + FP + FN)    (3)

precision = TP / (TP + FP)    (4)

recall = TP / (TP + FN)    (5)
For a classifier, accuracy is the proportion of the total testing examples that the classifier predicted correctly, precision is the ratio of the number of correctly classified spam emails to the total number of emails predicted as spam, and recall is the proportion of emails correctly classified as spam among all spam emails. The performance of the proposed SD system is measured against the dataset size and the feature size. The results are presented in Table III.
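For completeness, (3)-(5) translate directly into code; the sketch below counts TP, TN, FP and FN with 'spam' treated as the positive class (function and variable names are illustrative).

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Count true/false positives and negatives, with 'spam' as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn

def evaluate(y_true, y_pred):
    """Accuracy, precision and recall as defined in (3), (4) and (5)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall
```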
TABLE III
CLASSIFICATION RESULTS BASED ON DATASET SIZE AND FEATURE SIZE

Dataset size   Feature size   Accuracy [%]   Precision [%]   Recall [%]
1000           7              97.4           92.01           87.21
1000           3              96.63          85.61           88.51
1500           7              97.32          92.28           86.21
1500           3              96.56          85.62           87.77
3000           7              97.2           91.52           85.71
3000           3              96.3           83.96           87.30
Datasets of different sizes are used for measuring the performance. For example, when 1000 emails and 7 features are used for the training process, the decision tree classifier achieves an accuracy of 97.4%. The precision and recall values are 92.01% and 87.21%, respectively. Reducing the number of features to 3 decreases the accuracy to 96.63%, with precision and recall values of 85.61% and 88.51%, respectively. The dataset size only slightly affects accuracy: the accuracy for 1500 training examples and 3000 training examples was 97.32% and 97.2%, respectively.
IV. CONCLUSION
In this paper, decision tree-based classification is employed for spam email detection. A novel approach for feature selection and reduction is also presented. It is shown that the system achieves high accuracy with a few features and with a relatively small training dataset. In the near future, it is planned to incorporate other classifiers and to compare their performance with the proposed approach.

REFERENCES
[1] P. Sharma and U. Bhardwaj, "Machine Learning based Spam E-Mail Detection," International Journal of Intelligent Engineering & Systems, vol. 11, no. 3, 2017.
[2] A. S. Rajput, J. S. Sohal and V. Athavale, "Email Header Feature Extraction using Adaptive and Collaborative approach for Email Classification," International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075, vol. 8, no. 7S, May 2019.
[3] P. Kulkarni, J. R. Saini and H. Acharya, "Effect of Header-based Features on Accuracy of Classifiers for Spam Email Classification," International Journal of Advanced Computer Science and Applications (IJACSA), vol. 11, no. 3, 2020.
[4] E. G. Dada, S. B. Joseph, H. Chiroma, S. Abdulhamid, A. Adetunmbi and O. E. Ajibuwa, "Machine learning for email spam filtering: review, approaches and open research problems," Heliyon, June 2019.
[5] E. M. Bahgat, S. Rady, W. Gad and I. F. Moawad, "Efficient email classification approach based on semantic methods," Ain Shams Engineering Journal, vol. 9, no. 4, pp. 3259-3269, December 2018.
[6] F. Ruskanda, "Study on the Effect of Preprocessing Methods for Spam Email Detection," Indonesian Journal on Computing (Indo-JC), vol. 4, p. 109, March 2019.
[7] A. Sharma, Manisha, D. Manisha and D. R. Jain, "Data Pre-Processing in Spam Detection," International Journal of Science Technology & Engineering (IJSTE), vol. 1, no. 11, May 2015.
[8] L. Shi, Q. Wang, X. Ma, M. Weng and H. Qiao, "Spam Email Classification Using Decision Tree Ensemble," Journal of Computational Information Systems, vol. 8, March 2012.
[9] S. Balamurugan and R. Rajaram, "Suspicious E-mail Detection via Decision Tree: A Data Mining Approach," January 2007.
[10] T. A. Almeida and J. M. Gómez Hidalgo, SMS Spam Collection, UCI Machine Learning Repository, viewed 12 September 2020, https://archive.ics.uci.edu/ml/datasets/sms+spam+collection
[11] C. D. Manning, P. Raghavan and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
[12] A. Bhowmick and S. M. Hazarika, "Machine Learning for E-mail Spam Filtering: Review, Techniques and Trends," 2016.
[13] J. Grus, Data Science from Scratch: First Principles with Python, O'Reilly Media, Inc., April 2015.
[14] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, San Francisco, 2000.
[15] T. Kristensen and G. Kumar, "Entropy based disease classification of proteomic mass spectrometry data of the human serum by a support vector machine," Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, 2005.