E-Mail Spam Classification Via Machine Learning and Natural Language Processing
ABSTRACT:
Today, the majority of correspondence and exchange in all business sectors takes place
through email. As the rate of information exchange via email increases exponentially, so
does the amount of unsolicited bulk mail, or spam. These emails are sent for a number of
reasons: extracting confidential information from individuals, promoting adult content, and
marketing or advertising products and services. With this in mind, it is of paramount
importance to build a comprehensive system for spam classification that combines
semantics-based text classification using NLP with URL-based filtering. Various Machine
Learning algorithms have been surveyed, and the objective is to create a model with high
performance and efficiency.
EXISTING SYSTEM:
In the existing system, an integrated approach that combines all three processes
(URL Analysis, NLP, ML) is more accurate than any one of them used as a standalone
approach. URL classification is performed with the Decision Tree algorithm, and the model
is trained using a dataset from PhishTank.
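
As a minimal sketch of the URL analysis step, the Python snippet below turns a URL into a few lexical features that a Decision Tree classifier could consume. The source does not state which URL features the existing system uses, so the function name `url_features` and every feature in it are assumptions chosen for illustration only.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract simple lexical features from a URL.

    The feature set is hypothetical; it is not taken from the source.
    """
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "length": len(url),                      # very long URLs are often suspicious
        "num_dots": host.count("."),             # many subdomain levels
        "num_hyphens": host.count("-"),          # hyphenated look-alike domains
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "has_at_symbol": "@" in url,             # '@' can hide the real host
        "uses_https": parsed.scheme == "https",
    }
```

Each dictionary can be flattened into a numeric row and passed, together with phishing/legitimate labels, to any decision tree implementation.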
DISADVANTAGES:
Accuracy is low.
Models based on term frequency have a notable drawback: the feature vector space is
very large, which leads to a heavy computational load and slow training.
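
The size problem of term-frequency models can be illustrated with a small Python sketch. Each distinct token in the corpus becomes one dimension of the feature vector, so the vector length grows with the vocabulary. The helper names (`tokenize`, `build_vocabulary`, `term_frequency_vector`) and the sample messages are assumptions made for this illustration, not part of the source.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(docs):
    """Map every distinct token in the corpus to a vector index."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def term_frequency_vector(doc, vocab):
    """Return one count per vocabulary entry, in index order."""
    counts = Counter(tokenize(doc))
    # The dict preserves insertion order, which matches the index order.
    return [counts.get(tok, 0) for tok in vocab]


# Hypothetical sample messages.
docs = ["Win a free prize now", "Meeting at noon", "Free free offer"]
vocab = build_vocabulary(docs)
vector = term_frequency_vector(docs[2], vocab)
```

Every vector has `len(vocab)` entries no matter how short the message is, which is why a realistic corpus with tens of thousands of distinct tokens produces large, mostly zero vectors.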
PROPOSED SYSTEM:
The project presents the design and implementation of back-propagation neural
networks for spam classification using behaviour-based features. The spam classifier is
based on an ML algorithm and applies parameter optimization and feature selection: two
parameters of the ML algorithm are optimized to maximize the spam detection rate, and the
importance of individual feature selection is also assessed. The system can detect spam
with high accuracy while using few processing resources.
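
To make the back-propagation idea concrete, here is a minimal Python sketch of a network with one hidden layer trained by gradient descent. It is not the project's implementation: the class name `TinyMLP`, the three behaviour-style features (assumed here to be a share of unknown recipients, a normalised link count, and a sent-at-night flag), the toy data, the learning rate, and the hidden layer size are all assumptions for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One hidden layer, sigmoid activations, single sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Cross-entropy loss with a sigmoid output gives dL/dz = y - target.
        delta_out = y - target
        # Propagate the output error back through the hidden layer.
        delta_hidden = [delta_out * self.w2[j] * h[j] * (1.0 - h[j])
                        for j in range(len(h))]
        for j in range(len(h)):
            self.w2[j] -= lr * delta_out * h[j]
            self.b1[j] -= lr * delta_hidden[j]
            for i in range(len(x)):
                self.w1[j][i] -= lr * delta_hidden[j] * x[i]
        self.b2 -= lr * delta_out


def mean_loss(model, data):
    """Average cross-entropy loss over the data set."""
    eps = 1e-12
    total = 0.0
    for x, t in data:
        _, y = model.forward(x)
        total -= t * math.log(y + eps) + (1 - t) * math.log(1.0 - y + eps)
    return total / len(data)


def train(model, data, epochs=500, lr=0.5):
    for _ in range(epochs):
        for x, t in data:
            model.train_step(x, t, lr)


# Hypothetical toy data: (features, label), where label 1 means spam.
DATA = [
    ([0.9, 0.8, 1.0], 1),
    ([0.8, 0.9, 0.0], 1),
    ([0.1, 0.0, 0.0], 0),
    ([0.2, 0.1, 1.0], 0),
]
```

Creating `TinyMLP(n_in=3, n_hidden=3)` and calling `train(model, DATA)` fits the toy data. The learning rate and the hidden layer size are the kind of parameters that the optimization step described above would tune.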
ADVANTAGES:
Accuracy is very high.
It improves classification predictability and gives better feedback.
SYSTEM SPECIFICATION:
SOFTWARE REQUIREMENTS:
• Operating System: Windows
• Coding Language: Python
HARDWARE REQUIREMENTS:
• Processor – i3
• Speed – 2.4 GHz
• RAM – 4 GB
• Hard Disk – 500 GB