Lab2.ipynb - Colaboratory
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
df = pd.read_csv('emails.csv')
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5172 entries, 0 to 5171
Columns: 3002 entries, Email No. to Prediction
dtypes: int64(3001), object(1)
memory usage: 118.5+ MB
df.shape
(5172, 3002)
df.head()
  Email No.  the  to  ect  and  for  of    a  you  hou  ...  connevey  jay  valued  lay
0   Email 1    0   0    1    0    0   0    2    0    0  ...         0    0       0    0
1   Email 2    8  13   24    6    6   2  102    1   27  ...         0    0       0    0
2   Email 3    0   0    1    0    0   0    8    0    0  ...         0    0       0    0
...
df.isnull()
Email No. the to ect and for of a you hou ... connevey jay valued lay infrastructure military
0 False False False False False False False False False False ... False False False False False False
1 False False False False False False False False False False ... False False False False False False
2 False False False False False False False False False False ... False False False False False False
3 False False False False False False False False False False ... False False False False False False
4 False False False False False False False False False False ... False False False False False False
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... .
5167 False False False False False False False False False False ... False False False False False False
5168 False False False False False False False False False False ... False False False False False False
5169 False False False False False False False False False False ... False False False False False False
5170 False False False False False False False False False False ... False False False False False False
5171 False False False False False False False False False False ... False False False False False False
df.isnull().sum()
Email No. 0
the 0
to 0
ect 0
and 0
..
military 0
allowing 0
ff 0
dry 0
Prediction 0
Length: 3002, dtype: int64
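With 3002 columns, the per-column summary above is truncated, so a zero could be hiding in the elided rows. A minimal sketch of a single-number check follows; the `toy` frame and the `missing_summary` helper are hypothetical stand-ins for illustration, not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in for the emails DataFrame, with one missing cell.
toy = pd.DataFrame({"the": [0, 8, 0], "to": [0, 13, None], "Prediction": [0, 0, 1]})


def missing_summary(frame):
    """Return (total missing cells, names of columns with any missing value)."""
    per_column = frame.isnull().sum()
    return int(per_column.sum()), per_column[per_column > 0].index.tolist()


total_missing, columns_with_missing = missing_summary(toy)
print(total_missing, columns_with_missing)
```

Applied to the real data this would be `missing_summary(df)`; a total of 0 confirms that no column needs imputation.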
df.duplicated().sum()
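The duplicate check above runs while the `Email No.` identifier is still in the frame. If that column is unique per row (an assumption, not verified here), no full row can repeat, so the check cannot flag emails with identical word counts. A small sketch on hypothetical toy data shows the difference between checking full rows and checking the feature columns only.

```python
import pandas as pd

# Hypothetical toy frame: rows 0 and 2 have identical features but different IDs.
toy = pd.DataFrame({
    "Email No.": ["Email 1", "Email 2", "Email 3"],
    "the": [0, 8, 0],
    "to": [0, 13, 0],
    "Prediction": [0, 0, 0],
})

# Full-row check: the unique identifier makes every row distinct.
full_row_dupes = int(toy.duplicated().sum())

# Feature-only check: drop the identifier first, then look for repeats.
feature_dupes = int(toy.drop(columns=["Email No."]).duplicated().sum())

print(full_row_dupes, feature_dupes)
```

On the real data, the equivalent feature-only check would be `df.drop(columns=['Email No.']).duplicated().sum()`.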
https://colab.research.google.com/drive/1KCznbsGxVrKTR0dg9xipgLeY4kz4iWr1#scrollTo=9e7040e0&printMode=true 1/2
10/29/23, 10:57 PM Lab2.ipynb - Colaboratory
df.drop(columns=['Email No.'], inplace=True)
df['Prediction'].unique()
array([0, 1])
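`unique()` confirms the target is binary but says nothing about how the two classes are distributed. A short sketch of a class-balance check is given below on hypothetical labels; the share of the larger class is the accuracy a classifier would reach by always predicting that class, which is a useful reference point for any later score.

```python
import pandas as pd

# Hypothetical labels standing in for df['Prediction'] (six 0s, two 1s).
labels = pd.Series([0, 0, 0, 1, 0, 1, 0, 0], name="Prediction")

# Fraction of rows in each class.
class_share = labels.value_counts(normalize=True)

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = float(class_share.max())

print(class_share)
print(majority_baseline)
```

On the real data this would be `df['Prediction'].value_counts(normalize=True)`.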
y = df['Prediction']
X = df.drop(columns=['Prediction'])
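# The cells that split the data and create the classifier are missing from this export.
# Assumed reconstruction: SVC() with defaults matches the repr printed below; test_size=0.2
# is a guess, consistent with the reported score (839/1035, and 1035 is 20% of 5172 rows).
from sklearn.svm import SVC
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
classifier = SVC()
classifier.fit(X_train, y_train)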
SVC()
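y_pred = classifier.predict(X_test)
classifier.score(X_test, y_test)  # assumed: the cell that produced this score is missing from the export
0.8106280193236715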
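A single accuracy figure does not show which class the errors fall on. The sketch below runs the same fit-and-predict steps on synthetic data and adds a confusion matrix. The data from `make_classification`, the `stratify` argument, and the fixed `random_state` are all assumptions made for a reproducible illustration, not settings taken from the original notebook.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary data standing in for the email word counts (assumption).
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=0)

# Stratified split keeps the class ratio the same in train and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0
)

model = SVC().fit(X_tr, y_tr)
pred = model.predict(X_te)

acc = accuracy_score(y_te, pred)
# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_te, pred)

print(acc)
print(cm)
```

The diagonal of the matrix counts correct predictions, so accuracy equals its trace divided by the total; the off-diagonal cells separate false positives from false negatives.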