Lab2.ipynb - Colaboratory

This document contains code that analyzes an email dataset with a machine learning classifier. It loads the dataset, inspects its shape, checks for missing values and duplicates, splits the data into training and test sets (80/20), trains a support vector machine classifier on the training set, predicts labels on the test set, and reports a test accuracy of about 81%.

Uploaded by

Ajinkya Somawanshi


10/29/23, 10:57 PM Lab2.ipynb - Colaboratory

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split

df = pd.read_csv('emails.csv')

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5172 entries, 0 to 5171
Columns: 3002 entries, Email No. to Prediction
dtypes: int64(3001), object(1)
memory usage: 118.5+ MB

df.shape

(5172, 3002)

df.head()

   Email No.  the  to  ect  and  for  of    a  you  hou  ...  connevey  jay  valued  lay
0    Email 1    0   0    1    0    0   0    2    0    0  ...         0    0       0    0
1    Email 2    8  13   24    6    6   2  102    1   27  ...         0    0       0    0
2    Email 3    0   0    1    0    0   0    8    0    0  ...         0    0       0    0
...

df.isnull()

      Email No.    the     to    ect    and    for     of      a    you    hou  ...  connevey    jay  valued    lay  infrastructure  military
0         False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
1         False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
2         False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
3         False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
4         False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
...         ...    ...    ...    ...    ...    ...    ...    ...    ...    ...  ...       ...    ...     ...    ...             ...       ...
5167      False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
5168      False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
5169      False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
5170      False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False
5171      False  False  False  False  False  False  False  False  False  False  ...     False  False   False  False           False     False

5172 rows × 3002 columns

df.isnull().sum()

Email No. 0
the 0
to 0
ect 0
and 0
..
military 0
allowing 0
ff 0
dry 0
Prediction 0
Length: 3002, dtype: int64
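
The per-column listing above shows only ten of its 3002 entries, so a single total can be easier to read. The sketch below shows one way to get it on a small stand-in frame. The frame `demo` and its values are made up for illustration; in the notebook the same calls would presumably be applied to `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the notebook's df (which has 5172 rows and 3002 columns).
demo = pd.DataFrame({
    "the": [0, 8, 0],
    "to": [0, 13, np.nan],   # one deliberately missing cell
    "Prediction": [0, 1, 0],
})

# Total number of missing cells in the whole frame.
total_missing = int(demo.isnull().sum().sum())

# Names of the columns that contain at least one missing value.
cols_with_missing = demo.columns[demo.isnull().any()].tolist()

print(total_missing)
print(cols_with_missing)
```

A total of zero on the real data would confirm what the truncated listing suggests without scanning every column.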

df.duplicated().sum()
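
The output of this cell is not shown in the source. One point worth checking, sketched below on a made-up frame, is that `duplicated()` compares entire rows: while an identifier column such as `Email No.` is still present (and assuming its values are unique), two emails with identical word counts are not flagged. The frame `demo` is hypothetical.

```python
import pandas as pd

# Hypothetical frame: rows 0 and 2 have identical features but different identifiers.
demo = pd.DataFrame({
    "Email No.": ["Email 1", "Email 2", "Email 3"],
    "the": [0, 8, 0],
    "to": [0, 13, 0],
    "Prediction": [0, 1, 0],
})

# With the identifier column, every row is distinct.
dups_with_id = int(demo.duplicated().sum())

# Without it, the repeated feature row is detected.
dups_without_id = int(demo.drop(columns=["Email No."]).duplicated().sum())

print(dups_with_id, dups_without_id)
```

If duplicates matter for the analysis, running the check after the identifier column is dropped may give a more informative count.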

https://colab.research.google.com/drive/1KCznbsGxVrKTR0dg9xipgLeY4kz4iWr1#scrollTo=9e7040e0&printMode=true 1/2
df.drop(columns=['Email No.'], inplace=True)

df['Prediction'].unique()

array([0, 1])
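
`unique()` confirms the label is binary but not how the two classes are balanced, which affects how an accuracy figure should be read. A minimal sketch of that check follows, using a made-up label series; `labels` is hypothetical and the real class counts are not given in the source.

```python
import pandas as pd

# Hypothetical labels standing in for df['Prediction'].
labels = pd.Series([0, 0, 0, 1, 0, 1, 0, 0], name="Prediction")

counts = labels.value_counts()                 # rows per class
shares = labels.value_counts(normalize=True)   # fraction of rows per class

# Accuracy obtained by always predicting the most frequent class.
majority_baseline = float(shares.max())

print(counts)
print(majority_baseline)
```

Comparing a model's accuracy with this majority-class baseline shows how much the classifier actually adds.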

y = df['Prediction']

X = df.drop(columns=['Prediction'])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=101)
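
The split above is purely random. If the classes are imbalanced, one option (not used in the notebook) is to pass `stratify=y` so that both subsets keep the same class proportions. The sketch below illustrates this on synthetic data; `X_demo` and `y_demo` are made up.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 samples, 70 of class 0 and 30 of class 1.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 5, size=(100, 4))
y_demo = np.array([0] * 70 + [1] * 30)

# Same test_size and random_state as the notebook, plus stratification on the labels.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=101, stratify=y_demo
)

# Both subsets keep the 30% share of class 1.
print(y_tr.mean(), y_te.mean())
```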

from sklearn.svm import SVC  # svm is the module; SVC is the classifier class

classifier = SVC()  # classifier is an instance (object) of the SVC class

classifier.fit(X_train, y_train)

SVC()
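
`SVC()` with no arguments uses scikit-learn's default RBF kernel, which is sensitive to feature scale, and the raw word counts here vary widely (in `df.head()` the column `a` is 2 in one row and 102 in the next). A common variation, sketched below on synthetic data, is to standardize the features inside a pipeline before fitting. This is an assumption about what might help, not something the notebook does, and its effect on the reported accuracy is untested. `X_demo` and `y_demo` are generated for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data standing in for the email features.
X_demo, y_demo = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=101
)

# The scaler is fitted on the training data only, then applied before the SVC.
scaled_svc = make_pipeline(StandardScaler(), SVC())
scaled_svc.fit(X_tr, y_tr)

# Mean accuracy on the held-out data.
score = scaled_svc.score(X_te, y_te)
print(score)
```

Keeping the scaler inside the pipeline avoids leaking test-set statistics into training.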

y_pred = classifier.predict(X_test)

from sklearn.metrics import accuracy_score

accuracy_score(y_test, y_pred)  # fraction of test emails classified correctly

0.8106280193236715
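
Accuracy alone does not show which kind of error the classifier makes. A confusion matrix with precision and recall gives that breakdown. The sketch below uses small made-up label arrays (`y_true_demo`, `y_pred_demo`), not the notebook's actual predictions.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Hypothetical true and predicted labels (1 = positive class).
y_true_demo = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred_demo = [0, 0, 1, 0, 1, 0, 1, 0]

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true_demo, y_pred_demo)

acc = accuracy_score(y_true_demo, y_pred_demo)
precision = precision_score(y_true_demo, y_pred_demo)  # TP / (TP + FP)
recall = recall_score(y_true_demo, y_pred_demo)        # TP / (TP + FN)

print(cm)
print(acc, precision, recall)
```

In the notebook, the same functions could be called with `y_test` and `y_pred` in place of the demo arrays.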


