Spam Detection 6

Uploaded by

mkesav3070
EMAIL SPAM DETECTION

PROBLEM STATEMENT:

Email spam detection is the process of identifying and filtering unwanted or
unsolicited emails from a user's inbox. With the increasing volume of emails
exchanged every day, the problem of email spam has become a significant
concern for email service providers and users alike. Spam emails not only waste
the recipient's time and resources but also pose security threats by spreading
malware, phishing attacks, and other malicious content.

DESCRIPTION:
Email spam detection involves identifying and filtering unwanted or unsolicited
emails from a user's inbox. The goal is to distinguish between legitimate and
unsolicited emails, often through the use of machine learning or deep learning
techniques. These techniques typically involve training a model on a labeled
dataset of spam and non-spam emails, and then using the model to classify new
incoming emails as spam or non-spam based on features extracted from the
email, such as the email's content, header information, metadata, and sender
information.
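
As an illustration of extracting features from email content, the following is a minimal sketch of a bag-of-words representation, in which each message becomes a vector of word counts. It is a simplified stand-in for what a vectorizer such as CountVectorizer does, not its exact tokenization rules. The function names (tokenize, build_vocabulary, vectorize) and the two sample messages are hypothetical and chosen only for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(messages):
    """Map every distinct token to a column index (sorted for a stable order)."""
    tokens = sorted({tok for msg in messages for tok in tokenize(msg)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(message, vocabulary):
    """Return one count per vocabulary word, in vocabulary order."""
    counts = Counter(tokenize(message))
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical two-message corpus, for illustration only.
messages = ["Free prize, claim your free prize now", "Are you home now"]
vocabulary = build_vocabulary(messages)
vectors = [vectorize(msg, vocabulary) for msg in messages]

print(vocabulary)
print(vectors)
```

Each row of vectors has one entry per vocabulary word, so repeated words such as "free" and "prize" show up as counts greater than one. A classifier is then trained on these numeric rows rather than on the raw text.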

The model's performance can be evaluated using metrics such as accuracy,
precision, recall, and F1 score. Because spam techniques change over time, the
model can be periodically retrained on new data so that it remains effective
against the latest spam. This adaptability keeps detection robust in the face
of ever-changing spam methodologies. The model can also be designed to handle
imbalanced datasets, where non-spam emails far outnumber spam emails. Overall,
the goal of email spam detection is to identify and filter spam accurately and
efficiently while minimizing both false positives and false negatives.
Customizable filters give users the flexibility to tailor spam detection
settings to their specific needs, striking a balance between stringent security
measures and operational convenience. As cyber threats continue to evolve,
robust email spam detection remains important, and continued innovation and
refinement are essential in the battle against spam.
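
To make the evaluation metrics concrete, the sketch below computes accuracy, precision, recall, and F1 score directly from true and predicted labels, treating spam (label 1) as the positive class. The label lists are made up for illustration and the helper name classification_metrics is hypothetical. The example also shows why accuracy alone can mislead on an imbalanced dataset.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 with label 1 (spam) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-spam wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-spam passed through

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 18 non-spam (0) followed by 2 spam (1).
y_true = [0] * 18 + [1] * 2

# A "classifier" that never predicts spam.
always_non_spam = [0] * 20
baseline = classification_metrics(y_true, always_non_spam)

# A classifier that catches one spam, misses one, and flags one non-spam.
some_detection = [1] + [0] * 17 + [1, 0]
detector = classification_metrics(y_true, some_detection)

print(baseline)
print(detector)
```

Both predictors reach the same accuracy on this made-up data, yet the first one never detects any spam, which only precision, recall, and F1 reveal. This is the reason the additional metrics matter when non-spam emails dominate the dataset.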
SOFTWARE REQUIREMENT:

1. Programming Language
2. Development Environment
3. Libraries and Frameworks
4. Machine Learning Model
5. Text Vectorization
6. Data Processing and Analysis
7. Data Storage
8. Version Control
9. Web Framework (optional)
10. Model Deployment (optional)
11. Testing and Validation
12. Logging and Monitoring (optional)

HARDWARE REQUIREMENT:

1. Processor (CPU)
2. Memory (RAM)
3. Storage
4. Graphics Processing Unit (GPU)
5. Network Bandwidth
6. Monitoring and Management Tools
7. Scalability
FLOW CHART:
CODING:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Load the labeled dataset (uses the 'Category' and 'Message' columns).
data = pd.read_csv('/content/spam.csv')

# Count missing values in each column.
print(data.isna().sum())

# Encode the label as 1 for spam and 0 for non-spam.
data['Spam'] = data['Category'].apply(lambda x: 1 if x == 'spam' else 0)

# Hold out 25% of the messages for testing.
X_train, X_test, y_train, y_test = train_test_split(
    data.Message, data.Spam, test_size=0.25)

# Pipeline: convert each message to word counts, then fit Multinomial Naive Bayes.
clf = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('nb', MultinomialNB())
])

clf.fit(X_train, y_train)

# Classify two sample messages.
emails = [
    'Sounds great! Are you home now?',
    'Will u meet ur dream partner soon? Is ur career off 2 a flyng start? '
    '2 find out free, txt HORO followed by ur star sign, e. g. HORO ARIES'
]
prediction = clf.predict(emails)
print(prediction)

# Accuracy on the held-out test set.
print(clf.score(X_test, y_test))
OUTPUT:
Message 0
Category 0
Spam 0
dtype: int64
[0 1]
# Output for print(clf.score(X_test, y_test)); the exact value varies between
# runs because the train/test split is random
0.9765
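
To show what a Multinomial Naive Bayes classifier does with word counts, here is a small from-scratch sketch using add-one (Laplace) smoothing. It illustrates the idea only and is not scikit-learn's implementation. The class name TinyMultinomialNB, the tokenize helper, and the four training messages are hypothetical and invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyMultinomialNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, messages, labels):
        self.classes = sorted(set(labels))
        # Per-class word frequencies gathered from the training messages.
        self.word_counts = {c: Counter() for c in self.classes}
        for msg, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(msg))
        self.vocabulary = {w for c in self.classes for w in self.word_counts[c]}
        # Log prior: fraction of training messages belonging to each class.
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(class_counts[c] / len(labels))
                          for c in self.classes}
        self.total_words = {c: sum(self.word_counts[c].values())
                            for c in self.classes}
        return self

    def _log_score(self, message, c):
        """Log prior plus the smoothed log likelihood of each known word."""
        score = self.log_prior[c]
        denominator = self.total_words[c] + len(self.vocabulary)
        for tok in tokenize(message):
            if tok in self.vocabulary:  # words unseen in training are ignored
                score += math.log((self.word_counts[c][tok] + 1) / denominator)
        return score

    def predict(self, messages):
        """Return the class with the highest log score for each message."""
        return [max(self.classes, key=lambda c: self._log_score(m, c))
                for m in messages]


# Hypothetical training data: 1 = spam, 0 = non-spam.
train_messages = [
    "Win a free prize now",
    "Free entry text WIN to claim",
    "Are you home now",
    "Sounds great see you at lunch",
]
train_labels = [1, 1, 0, 0]

model = TinyMultinomialNB().fit(train_messages, train_labels)
predictions = model.predict(["claim your free prize", "see you at home"])
print(predictions)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the add-one smoothing keeps a word that appears in only one class from driving the other class's probability to zero.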

RESULT:

The above email spam detection program was executed successfully.
