Email Spam Detection System using Logistic Regression
In [2]: import pandas as pd  # needed for pd.read_csv; the import cell is not shown in this export
df = pd.read_csv('mail_data.csv')
In [3]: df.head(10)
Out[3]:
Category Message
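The cells from here on use a variable named `data`, but the cell that creates it from `df` is not shown in this export. A minimal sketch of one plausible step follows, assuming missing values were replaced with empty strings. The rows below are made-up stand-ins for `mail_data.csv`, and `fillna` is an assumption, not necessarily the call the notebook used.

```python
import pandas as pd

# Hypothetical stand-in for mail_data.csv; these rows are invented for illustration.
df = pd.DataFrame({
    "Category": ["ham", "spam", "ham"],
    "Message": ["See you at lunch", None, "Call me later"],
})

# Assumed cleaning step: replace missing values with empty strings
# so that later text processing never receives NaN.
data = df.fillna("")

# Count of remaining missing values across both columns.
print(data.isnull().sum().sum())
```

Any cleaning step that leaves both columns fully non-null would be consistent with the `data.info()` output shown next.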
In [5]: data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5572 entries, 0 to 5571
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Category 5572 non-null object
1 Message 5572 non-null object
dtypes: object(2)
memory usage: 87.2+ KB
6/2/24, 12:17 PM Email_Spam_Detection_System_using_Logistic_Regression - Jupyter Notebook
In [6]: data.shape
Out[6]: (5572, 2)
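The labels printed later are 0 and 1 rather than text, so an encoding step must have run in a cell that this export omits. The sketch below shows one way to do it. The mapping direction (spam to 0, ham to 1) is an assumption: it is consistent with the mostly-1 labels printed below if ham is the majority class, but the export does not show the mapping itself. The sample rows are invented.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real data set.
data = pd.DataFrame({
    "Category": ["ham", "spam", "ham"],
    "Message": ["See you at lunch", "WIN a free prize now", "Call me later"],
})

# Assumed encoding: spam -> 0, ham -> 1.
data["Category"] = data["Category"].map({"spam": 0, "ham": 1})

print(data["Category"].tolist())
```

Note that `map` produces an integer column, whereas the output below reports `dtype: object`, which suggests the notebook assigned the labels in place (for example with `.loc`) instead.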
In [8]: data.head()
Out[8]:
Category Message
In [9]: X = data['Message']
Y = data['Category']
In [10]: print(X)
In [11]: print(Y)
0 1
1 1
2 0
3 1
4 1
..
5567 0
5568 1
5569 1
5570 1
5571 1
Name: Category, Length: 5572, dtype: object
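The shapes printed in the next cell refer to `X_train`, `X_test`, `Y_train` and `Y_test`, but the cell that creates them is missing from this export. The sizes (4457 and 1115 out of 5572) match an 80/20 split, so the sketch below assumes scikit-learn's `train_test_split` with `test_size=0.2`. The `random_state` value and the placeholder data are assumptions made only so the example runs on its own.

```python
from sklearn.model_selection import train_test_split

# Placeholder data with the same number of rows as the real data set.
X = [f"message {i}" for i in range(5572)]
Y = [i % 2 for i in range(5572)]

# Assumed split: 20% of the rows are held out for testing.
# random_state is an arbitrary choice here; the notebook's value is not shown.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=3
)

print(len(X_train), len(X_test))
```

With 5572 rows, the test portion is rounded up from 1114.4 to 1115, which leaves 4457 rows for training.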
In [13]: print(X.shape)
print(X_train.shape)
print(X_test.shape)
(5572,)
(4457,)
(1115,)
In [14]: print(Y.shape)
print(Y_train.shape)
print(Y_test.shape)
(5572,)
(4457,)
(1115,)
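The next cells print `X_train_features`, which implies the raw messages were converted to numeric vectors in a cell that is not shown. A common choice for this kind of project is TF-IDF, so the sketch below assumes scikit-learn's `TfidfVectorizer`. The vectorizer settings and the sample messages are assumptions, not values taken from the notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical messages standing in for the real training and test splits.
X_train = [
    "Free prize waiting, call now to claim",
    "Are we still meeting for lunch today",
    "Win cash today, reply to claim your reward",
    "Can you send me the project notes",
]
X_test = ["Claim your free cash prize", "Lunch tomorrow works for me"]

# Assumed settings: lowercase the text and drop common English stop words.
vectorizer = TfidfVectorizer(min_df=1, stop_words="english", lowercase=True)

# Learn the vocabulary from the training messages only,
# then reuse that same vocabulary for the test messages.
X_train_features = vectorizer.fit_transform(X_train)
X_test_features = vectorizer.transform(X_test)

print(X_train_features.shape, X_test_features.shape)
```

Fitting on the training split alone keeps information from the test messages out of the vocabulary and the IDF weights.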
In [16]: print(X_train)
In [17]: print(X_train_features)
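The `LogisticRegression()` output that follows is what a notebook displays after a call to `fit`, but the cells that create and train the model are not in this export. A minimal sketch of that step is shown here on invented data. The variable names mirror the ones used above, and the label convention (0 for spam, 1 for ham) is an assumption.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training messages and labels (assumed: 0 = spam, 1 = ham).
X_train = [
    "Free prize waiting, call now to claim",
    "Are we still meeting for lunch today",
    "Win cash today, reply to claim your reward",
    "Can you send me the project notes",
]
Y_train = [0, 1, 0, 1]

# Turn the text into TF-IDF features (assumed vectorizer, default settings).
vectorizer = TfidfVectorizer()
X_train_features = vectorizer.fit_transform(X_train)

# Train a logistic regression classifier with default hyperparameters.
model = LogisticRegression()
model.fit(X_train_features, Y_train)

print(model.classes_)
```

Since the labels above were printed with `dtype: object`, the real notebook may also need to cast them to integers (for example with `astype('int')`) before calling `fit`.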
Out[19]: LogisticRegression()
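This export shows no evaluation cell, so no accuracy figures can be quoted here. As a sketch of how the trained model could be checked, the example below computes accuracy on the training and test splits with `accuracy_score`. All data and names are illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hypothetical data (assumed labels: 0 = spam, 1 = ham).
X_train = [
    "Free prize waiting, call now to claim",
    "Are we still meeting for lunch today",
    "Win cash today, reply to claim your reward",
    "Can you send me the project notes",
]
Y_train = [0, 1, 0, 1]
X_test = ["Claim your free cash prize", "Send me the notes after lunch"]
Y_test = [0, 1]

vectorizer = TfidfVectorizer()
X_train_features = vectorizer.fit_transform(X_train)
X_test_features = vectorizer.transform(X_test)

model = LogisticRegression()
model.fit(X_train_features, Y_train)

# Accuracy is the fraction of messages whose predicted label matches the true label.
train_accuracy = accuracy_score(Y_train, model.predict(X_train_features))
test_accuracy = accuracy_score(Y_test, model.predict(X_test_features))

print("Training accuracy:", train_accuracy)
print("Test accuracy:", test_accuracy)
```

Comparing the two numbers gives a quick check for overfitting: a training score far above the test score suggests the model has memorized the training messages.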
Enter your mail: GENT! We are trying to contact you. Last weekends draw shows that you won a £1000 prize GUARANTEED. Call 09064012160. Claim Code K52. Valid 12hrs only. 150ppm
This is a Spam mail
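The prompt and result above come from a prediction cell that is not included in this export. The sketch below shows how such a step could work: the new message is transformed with the already fitted vectorizer and passed to the model. The function name `classify_mail`, the sample data, the label convention (0 = spam, 1 = ham) and the wording of the ham message are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training data (assumed labels: 0 = spam, 1 = ham).
messages = [
    "Free prize waiting, call now to claim",
    "Are we still meeting for lunch today",
    "Win cash today, reply to claim your reward",
    "Can you send me the project notes",
]
labels = [0, 1, 0, 1]

vectorizer = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(messages), labels)


def classify_mail(text, model, vectorizer):
    """Return a readable verdict for one message (hypothetical helper)."""
    # Use transform, not fit_transform, so the training vocabulary is reused.
    features = vectorizer.transform([text])
    prediction = model.predict(features)[0]
    return "This is a Ham mail" if prediction == 1 else "This is a Spam mail"


result = classify_mail("Call now to claim your free prize", model, vectorizer)
print(result)
```

In the notebook, the text would come from `input('Enter your mail: ')` instead of a hard-coded string.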