Spam Detection 6
PROBLEM STATEMENT:
DESCRIPTION:
Email spam detection involves identifying and filtering unwanted or unsolicited
emails from a user's inbox. The goal is to distinguish between legitimate and
unsolicited emails, often through the use of machine learning or deep learning
techniques. These techniques typically involve training a model on a labeled
dataset of spam and non-spam emails, and then using the model to classify new
incoming emails as spam or non-spam based on features extracted from the
email, such as the email's content, header information, metadata, and sender
information.
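The train-then-classify idea described above can be illustrated with a minimal sketch that uses only the Python standard library. It is a simplified word-count (bag-of-words) Naive Bayes classifier, not the scikit-learn implementation used in the CODING section; the function names (tokenize, train, predict) and the four toy messages are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def train(messages, labels):
    """Count word occurrences per class and the number of messages per class."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocabulary

def predict(text, word_counts, class_counts, vocabulary):
    """Return the class (0 or 1) with the higher log-posterior score."""
    total_messages = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        # Log prior: fraction of training messages that belong to this class.
        score = math.log(class_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids log(0) for words unseen in this class.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy training set: 1 = spam, 0 = non-spam.
train_messages = [
    "Win a free prize now",
    "Free entry txt WIN to claim",
    "Are you home now",
    "See you at lunch tomorrow",
]
train_labels = [1, 1, 0, 0]

model = train(train_messages, train_labels)
print(predict("Claim your free prize", *model))   # spam-like wording
print(predict("Are you free for lunch", *model))  # ordinary wording
```

With this toy data the first message is scored as spam and the second as non-spam, because words such as "claim" and "prize" appear only in the spam examples while "are", "you" and "lunch" appear only in the non-spam ones. A real filter would be trained on thousands of labeled messages and could also use header and sender features, which this sketch ignores.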
SOFTWARE REQUIREMENT:
1. Programming Language
2. Development Environment
5. Text Vectorization
6. Data Storage
7. Version Control
HARDWARE REQUIREMENT:
1. Processor (CPU)
2. Memory (RAM)
3. Storage
4. Graphics Processing Unit (GPU)
5. Network Bandwidth
6. Monitoring and Management Tools
7. Scalability
FLOW CHART:
CODING:
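import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
# Load the labeled dataset (column names 'Category' and 'Message' are taken from the OUTPUT section)
data = pd.read_csv('/content/spam.csv')
# Encode the label as 1 (spam) or 0 (non-spam); assumes Category holds the string 'spam' for spam rows
data['Spam'] = (data['Category'] == 'spam').astype(int)
# Count the missing values in each column
print(data.isna().sum())
# Hold out part of the data for testing (default split: 75% train, 25% test)
X_train, X_test, y_train, y_test = train_test_split(data['Message'], data['Spam'])
# Chain bag-of-words vectorization and a multinomial Naive Bayes classifier
clf = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('nb', MultinomialNB())
])
clf.fit(X_train, y_train)
# Classify two new messages: the first is ordinary, the second is spam-like
emails = [
    'Sounds great! Are you home now?',
    'Will u meet ur dream partner soon? Is ur career off 2 a flyng start? 2 find out free, txt HORO followed by ur star sign, e. g. HORO ARIES'
]
prediction = clf.predict(emails)
print(prediction)
# Accuracy on the held-out test set
print(clf.score(X_test, y_test))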
OUTPUT:
Message 0
Category 0
Spam 0
dtype: int64
[0 1]
# Output for print(clf.score(X_test, y_test))
0.9765
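The score printed above is accuracy, that is, the fraction of test messages classified correctly. As a hedged illustration of how the same predictions could also be summarised per class, the sketch below computes precision and recall for the spam class in plain Python. The helper name spam_metrics and the ten labels are hypothetical; they are not taken from the spam.csv run.

```python
def spam_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) with `positive` as the spam label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # spam correctly flagged
        elif pred == positive:
            fp += 1  # legitimate message wrongly flagged as spam
        elif truth == positive:
            fn += 1  # spam that slipped through
        else:
            tn += 1  # legitimate message correctly passed
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when no spam is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical labels for ten test messages: 1 = spam, 0 = non-spam.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

accuracy, precision, recall = spam_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

Precision answers how many flagged messages really were spam, and recall answers how much of the spam was caught; both can differ noticeably from accuracy when one class is much larger than the other.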
RESULT: