Spam Email Detection Classifier
Outline
1. Problem Statement
2. Preprocessing
3. Models
4. Performance
5. Conclusion
1. Problem Statement
The dataset used for this project comes from Kaggle and contains
emails labeled as spam or ham. It includes features such as the
email text, subject lines, and other metadata, and it serves as
the foundation for training and testing our model.
Example
Ham:
“Did you catch the bus ? Are you frying an egg ? Did you make a tea? Are
you eating your mom's left over dinner ? Do you feel my Love ?”
Spam:
“Thanks for your subscription to Ringtone UK your mobile will be charged
£5/month Please confirm by replying YES or NO. If you reply NO you will
not be charged.”
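As a rough illustration of how such labeled examples could be represented for training, the sketch below parses a small CSV snippet into (text, label) pairs. The column names `label` and `text`, the 0/1 encoding, and the helper `load_labeled_emails` are assumptions made for this example; the actual Kaggle file may be laid out differently.

```python
import csv
import io

# Hypothetical two-column layout (label, text); the real Kaggle file may differ.
RAW_CSV = '''label,text
ham,"Did you catch the bus ? Are you frying an egg ?"
spam,"Thanks for your subscription to Ringtone UK"
'''

# Assumed encoding: ham -> 0, spam -> 1.
LABEL_TO_INT = {"ham": 0, "spam": 1}


def load_labeled_emails(csv_text):
    """Parse CSV text into (text, label) pairs with labels encoded as 0/1."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["text"], LABEL_TO_INT[row["label"]]) for row in reader]


emails = load_labeled_emails(RAW_CSV)
for text, label in emails:
    print(label, text)
```

Encoding the labels as integers up front keeps the later steps independent of the label spelling used in the file.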
2. Data preprocessing
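The slides do not list the exact preprocessing steps, so the following is only a minimal sketch of a common text-cleaning pipeline: lowercasing, removing punctuation, tokenizing on whitespace, and dropping stop words. The function name `preprocess` and the tiny stop-word list are assumptions for illustration, not the steps actually used in the project.

```python
import re

# Tiny illustrative stop-word list (an assumption); a real pipeline would
# normally use a fuller list from an NLP library.
STOP_WORDS = {
    "a", "an", "the", "you", "your", "to", "for", "by",
    "or", "if", "will", "be", "are", "did", "do",
}


def preprocess(text):
    """Lowercase, strip non-alphanumeric characters, tokenize, drop stop words."""
    text = text.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess("Did you catch the bus ? Are you frying an egg ?")
print(tokens)  # ['catch', 'bus', 'frying', 'egg']
```

The resulting token lists would then typically be turned into numeric features (for example word counts) before being passed to the models.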
3. Models
1. SVM
2. XGBoost
3. Random Forest
4. Logistic Regression
3.1. SVM
PROS CONS
3.2. XGBoost
PROS CONS
3.3. Random Forest
PROS CONS
3.4. Logistic Regression
Sigmoid function
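Logistic regression passes a linear score z through the sigmoid function, sigma(z) = 1 / (1 + e^(-z)), to obtain a probability between 0 and 1. The sketch below shows this with the standard library only; the names `sigmoid` and `predict_spam_probability`, and the example weights, are hypothetical and not taken from the project.

```python
import math


def sigmoid(z):
    """Map a real-valued score z to a probability in [0, 1]."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_spam_probability(features, weights, bias):
    """Logistic regression prediction: sigmoid of the weighted feature sum."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


# Hypothetical weights and features, for illustration only.
p = predict_spam_probability([1.0, 2.0], [0.5, -0.25], 0.0)
print(p)  # 0.5, because the weighted sum is exactly 0
```

An email would be classified as spam when this probability exceeds a chosen threshold, commonly 0.5.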
PROS CONS
Thank you for listening