Lecture 10-Logistic Regression - Part - 2 - Jupyter Notebook

The document outlines a Jupyter Notebook lecture on Logistic Regression, detailing the process of importing a dataset, cleaning the data, and preparing it for analysis. It includes steps for handling missing data, converting categorical features, and splitting the dataset into training and testing sets. Finally, it demonstrates training a logistic regression model and evaluating its performance using precision, recall, and F1-score metrics.

Uploaded by

pateljil0247


3/29/23, 11:20 AM Lecture 10-Logistic Regression_Part_2 - Jupyter Notebook

Lecture 10-Part2

Logistic Regression
In [14]: import pandas as pd
         import numpy as np
         import matplotlib.pyplot as plt
         import seaborn as sns
         %matplotlib inline

The Data
Import the Dataset.

In [17]: data = pd.read_csv('Downloads/Facebook.csv')
         data.head(5)

Out[17]:
   Names            emails                                              Country       Time Spent on Site  Salary
0  Martina Avila    [email protected]                                   Bulgaria      25.649648           55330.06006
1  Harlan Barnes    [email protected]                                   Belize        32.456107           79049.07674
2  Naomi Rodriquez  vulputate.mauris.sagittis@ametconsectetueradip...  Algeria       20.945978           41098.60826
3  Jade Cunningham  [email protected]                                   Cook Islands  54.039325           37143.35536
4  Cedric Leach     [email protected]                                   Brazil        34.249729           37355.11276

localhost:8888/notebooks/Lecture 10-Logistic Regression_Part_2.ipynb# 1/7


In [18]: data.head()

Out[18]:
   Names            emails                                              Country       Time Spent on Site  Salary
0  Martina Avila    [email protected]                                   Bulgaria      25.649648           55330.06006
1  Harlan Barnes    [email protected]                                   Belize        32.456107           79049.07674
2  Naomi Rodriquez  vulputate.mauris.sagittis@ametconsectetueradip...  Algeria       20.945978           41098.60826
3  Jade Cunningham  [email protected]                                   Cook Islands  54.039325           37143.35536
4  Cedric Leach     [email protected]                                   Brazil        34.249729           37355.11276

Missing Data
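data.isnull() returns a Boolean DataFrame that is True wherever a value is missing. Passing that
mask to seaborn's heatmap gives a quick visual check of where we are missing data.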

In [19]: data.isnull()

Out[19]:
     Names  emails  Country  Time Spent on Site  Salary  Clicked
0    False  False   False    False               False   False
1    False  False   False    False               False   False
2    False  False   False    False               False   False
3    False  False   False    False               False   False
4    False  False   False    False               False   False
...  ...    ...     ...      ...                 ...     ...
494  False  False   False    False               False   False
495  False  False   False    False               False   False
496  False  False   False    False               False   False
497  False  False   False    False               False   False
498  False  False   False    False               False   False

499 rows × 6 columns
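
The heatmap check mentioned in this section could look roughly like the sketch below. The small `demo` frame and its values are hypothetical stand-ins for the lecture's data, with one value removed on purpose so that something shows up; the `Agg` backend line is only there so the sketch also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the lecture's DataFrame, with one gap added on purpose
demo = pd.DataFrame({
    "Time Spent on Site": [25.6, 32.5, np.nan, 54.0],
    "Salary": [55330.1, 79049.1, 41098.6, 37143.4],
    "Clicked": [0, 1, 0, 1],
})

# Numeric summary: number of missing values in each column
missing_per_column = demo.isnull().sum()
print(missing_per_column)

# Visual summary: each cell of the Boolean mask becomes one tile,
# so missing entries stand out in a different colour
ax = sns.heatmap(demo.isnull(), cbar=False, yticklabels=False)
```

On the lecture's data every entry shown in the mask is False, so the heatmap would be a single uniform colour.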


Explore the dataset

In [38]: click = data[data['Clicked']==1]
         no_click = data[data['Clicked']==0]

In [39]: print("Total num of data =", len(data))

         print("Number of customers who clicked on Ad =", len(click))
         print("Percentage Clicked =", 1.*len(click)/len(data)*100.0, "%")

         print("Did not Click =", len(no_click))
         print("Percentage who did not Click =", 1.*len(no_click)/len(data)*100.0, "%")

Total num of data = 499
Number of customers who clicked on Ad = 250
Percentage Clicked = 50.1002004008016 %
Did not Click = 249
Percentage who did not Click = 49.899799599198396 %
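
The same class-balance figures can also be obtained with `value_counts`. This is a sketch on a hypothetical Series built to have the 250 / 249 split reported above, not the lecture's actual column.

```python
import pandas as pd

# Hypothetical labels with the same 250 / 249 split as the output above
clicked = pd.Series([1] * 250 + [0] * 249, name="Clicked")

counts = clicked.value_counts()                        # rows per class
percent = clicked.value_counts(normalize=True) * 100   # share of each class in percent

print(counts)
print(percent.round(2))
```

A near 50/50 split like this one means accuracy is a reasonable headline metric; with a strongly imbalanced target, precision and recall per class matter more.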

Data Cleaning
When a column has missing values, one option is to fill them in (imputation) instead of just
dropping the affected rows, for example with the column mean or, to be smarter about it, with
the mean of the group each row belongs to. The isnull() output above shows no missing values in
the rows displayed, so no imputation is applied here.

Let's go ahead and drop the Names, emails and Country columns, which are not used as features
in the model.
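
The imputation idea described in this section, filling a missing value with a mean taken either over the whole column or within a group, can be sketched as follows. The frame, the `Group` and `Value` column names and the numbers are all hypothetical; the lecture's notebook does not run this step.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column names and numbers are made up for illustration
df = pd.DataFrame({
    "Group": ["a", "a", "b", "b", "b"],
    "Value": [10.0, np.nan, 30.0, 50.0, np.nan],
})

# Option 1: fill every gap with the overall column mean (NaN is ignored by mean())
overall = df["Value"].fillna(df["Value"].mean())

# Option 2: fill each gap with the mean of the row's own group
group_mean = df.groupby("Group")["Value"].transform("mean")
by_group = df["Value"].fillna(group_mean)

print(overall.tolist())
print(by_group.tolist())
```

The group-wise version keeps imputed values closer to comparable rows, at the cost of needing a meaningful grouping column.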

In [20]: data.drop(['Names', 'emails', 'Country'], axis=1, inplace=True)

In [21]: data.head()

Out[21]:
   Time Spent on Site  Salary       Clicked
0  25.649648           55330.06006  0
1  32.456107           79049.07674  1
2  20.945978           41098.60826  0
3  54.039325           37143.35536  1
4  34.249729           37355.11276  0


In [22]: data.dropna(inplace=True)

Converting Categorical Features
With the text columns dropped, data.info() confirms that the remaining columns are all numeric,
so there are no categorical features left to convert.

In [40]: data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 499 entries, 0 to 498
Data columns (total 3 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   Time Spent on Site  499 non-null    float64
 1   Salary              499 non-null    float64
 2   Clicked             499 non-null    int64
dtypes: float64(2), int64(1)
memory usage: 15.6 KB
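
The columns kept here are already numeric. If a text column such as Country were kept as a feature instead, it would first need to be encoded. The sketch below shows one common way to do that with `pd.get_dummies`; it is an illustration only, not a step from the notebook, and the fourth row is invented.

```python
import pandas as pd

# Hypothetical rows; the last row is invented for illustration
demo = pd.DataFrame({
    "Country": ["Bulgaria", "Belize", "Algeria", "Belize"],
    "Salary": [55330.06, 79049.08, 41098.61, 60000.00],
})

# One indicator column per country; drop_first=True omits the first category
# (alphabetically) so the indicators are not perfectly collinear
encoded = pd.get_dummies(demo, columns=["Country"], drop_first=True)

print(encoded.columns.tolist())
```

With many distinct countries this produces many sparse columns, which is one practical reason to drop such a column in a small introductory model.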

Logistic Regression model

Train Test Split
In [41]: from sklearn.model_selection import train_test_split

In [42]: X_train, X_test, y_train, y_test = train_test_split(data.drop('Clicked',ax
                                                             data['Clicked'], test_
                                                             random_state=101)
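
The split cell is cut off at the page margin in this printout. A complete call might look like the sketch below, which assumes `axis=1` and `test_size=0.2`; neither value is visible in the source, though a 20% test share is consistent with the 399 training rows and 100 test rows shown in the outputs that follow. The `demo` frame is synthetic and only mirrors the shape of the lecture's data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as the lecture's data (499 rows, 3 columns)
rng = np.random.default_rng(0)
n = 499
demo = pd.DataFrame({
    "Time Spent on Site": rng.uniform(5, 60, n),
    "Salary": rng.uniform(20000, 100000, n),
    "Clicked": rng.integers(0, 2, n),
})

X = demo.drop("Clicked", axis=1)   # features: every column except the target
y = demo["Clicked"]                # target: 1 = clicked, 0 = did not click

# test_size=0.2 is an assumption; random_state=101 is taken from the notebook
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=101
)

print(X_train.shape, X_test.shape)
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in the training and test sets on every run.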


In [43]: X_train

Out[43]:
     Time Spent on Site  Salary
187  46.995205           89227.57988
55   27.432028           40814.47633
457  25.366808           37192.01715
57   47.070590           80709.83902
308  43.880448           77371.64859
...  ...                 ...
63   31.518373           35277.25683
326  42.903343           78401.67203
337  37.278453           50158.74558
11   34.530898           30221.93714
351  30.391102           59519.43092

399 rows × 2 columns

In [44]: y_train

Out[44]: 187    1
         55     0
         457    0
         57     1
         308    1
               ..
         63     0
         326    1
         337    0
         11     0
         351    1
         Name: Clicked, Length: 399, dtype: int64


In [45]: X_test

Out[45]:
     Time Spent on Site  Salary
246  19.919153           30201.25465
491  37.173216           63750.41558
330  43.750975           50777.99687
453  29.156654           39394.28363
155  30.730586           47012.72759
...  ...                 ...
98   12.866031           27148.27919
183  23.653926           29808.11365
72   26.410241           55388.71453
367  44.661437           75426.28108
405  30.916826           19123.46645

100 rows × 2 columns

In [46]: y_test

Out[46]: 246    0
         491    1
         330    1
         453    0
         155    0
               ..
         98     0
         183    0
         72     0
         367    1
         405    0
         Name: Clicked, Length: 100, dtype: int64

Training and Predicting

In [47]: from sklearn.linear_model import LogisticRegression

In [48]: logmodel = LogisticRegression()
         logmodel.fit(X_train,y_train)

Out[48]: LogisticRegression()

In [49]: predictions = logmodel.predict(X_test)
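
`predict` returns hard class labels; the probabilities behind them are available from `predict_proba`. The sketch below shows both on synthetic data generated by a made-up rule. The `StandardScaler` step is an addition that is not in the notebook: it is included on the assumption that features on very different scales (minutes on site versus salary) can slow the solver's convergence.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: the labelling rule below is invented for illustration
rng = np.random.default_rng(0)
n = 200
time_on_site = rng.uniform(5, 60, n)
salary = rng.uniform(20000, 100000, n)
X = np.column_stack([time_on_site, salary])
y = (time_on_site / 60 + salary / 100000 > 1.0).astype(int)

# Scale both features to zero mean and unit variance, then fit the classifier
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

labels = model.predict(X[:5])        # hard 0/1 predictions
proba = model.predict_proba(X[:5])   # columns: P(class 0), P(class 1)

print(labels)
print(proba.round(3))
```

Each predicted label is the class with the larger probability, which corresponds to a 0.5 threshold on P(class 1).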

Let's move on to evaluate our model!


Evaluation

We can check precision, recall and F1-score using classification_report.

In [36]: from sklearn.metrics import classification_report

In [37]: print(classification_report(y_test,predictions))

              precision    recall  f1-score   support

           0       0.94      0.89      0.92        57
           1       0.87      0.93      0.90        43

    accuracy                           0.91       100
   macro avg       0.91      0.91      0.91       100
weighted avg       0.91      0.91      0.91       100
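
To see where the numbers in the report come from, the sketch below recomputes them from confusion-matrix counts. The counts are not printed in the notebook; they are inferred here as the integer values consistent with the supports (57 and 43) and the rounded recalls in the report, so treat them as a reconstruction.

```python
import numpy as np

# Rows = true class, columns = predicted class.
# Counts inferred from the rounded report, not printed in the notebook.
cm = np.array([[51, 6],
               [3, 40]])

true_pos = np.diag(cm)                  # correctly classified rows per class
precision = true_pos / cm.sum(axis=0)   # correct / everything predicted as that class
recall = true_pos / cm.sum(axis=1)      # correct / everything truly in that class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_pos.sum() / cm.sum()

for label in (0, 1):
    print(label, round(precision[label], 2), round(recall[label], 2), round(f1[label], 2))
print("accuracy", round(accuracy, 2))
```

Precision is computed down a column and recall along a row, which is why the two can differ for the same class.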

In [51]: from sklearn.metrics import classification_report, confusion_matrix

         cm = confusion_matrix(y_test, predictions)
         sns.heatmap(cm, annot=True, fmt="d")

Out[51]: <AxesSubplot:>

