
PS Project - Jupyter Notebook

This Jupyter notebook walks through building a logistic regression model that predicts whether it will rain tomorrow (the RainTomorrow column) from a weather dataset. It loads and explores the data, preprocesses it, splits it into training and test sets, trains a logistic regression model on the training set, evaluates the model on the test set with several metrics, and plots the model coefficients. Code is given for each of these steps; fine-tuning, deployment, and monitoring appear only as comment placeholders.

Uploaded by

M. Mobeen Khattak


11/12/2023, 20:22 PS Project - Jupyter Notebook

In [3]: import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

In [4]: # Step 1: Load your dataset

df = pd.read_csv('weather.csv') # Replace 'your_dataset.csv' with you

In [5]: # Step 2: Explore and clean the dataset

# (Assuming your target variable is 'RainTomorrow')
df.dropna(inplace=True) # Handle missing values, you might want a mor
df['RainTomorrow'].value_counts() # Check the balance of classes

Out[5]: No     1274
        Yes     416
        Name: RainTomorrow, dtype: int64
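
The counts in Out[5] show an imbalanced target. A minimal sketch, using only those two counts, of the class shares and of the accuracy a trivial "always No" classifier would reach on the full dataset. The variable names are illustrative and not part of the notebook.

```python
# Class counts copied from Out[5] above
counts = {"No": 1274, "Yes": 416}

total = sum(counts.values())  # rows left after dropna()
shares = {label: n / total for label, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the majority class
majority_label = max(counts, key=counts.get)
baseline_accuracy = counts[majority_label] / total

print(f"total rows: {total}")
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"majority-class baseline accuracy: {baseline_accuracy:.3f}")
```

Any accuracy figure for this model is easier to judge against that baseline of roughly 0.75 than against 0.5.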

In [6]: # Step 3: Feature Engineering (if needed)

# No specific feature engineering is done in this example

In [7]:
# Step 4: Data Preprocessing
X = pd.get_dummies(df.drop('RainTomorrow', axis=1)) # One-hot encodin
y = df['RainTomorrow'].map({'Yes': 1, 'No': 0}) # Convert target vari
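
To make the two preprocessing calls concrete, here is a small sketch on a toy frame. The feature columns Humidity and WindDir and their values are hypothetical; only RainTomorrow and its Yes/No labels come from the notebook.

```python
import pandas as pd

# Toy frame; the feature columns 'Humidity' and 'WindDir' are hypothetical
toy = pd.DataFrame({
    "Humidity": [80, 35, 50, 90],
    "WindDir": ["N", "S", "N", "S"],
    "RainTomorrow": ["Yes", "No", "No", "Yes"],
})

# Numeric columns pass through; each category becomes its own indicator column
X_toy = pd.get_dummies(toy.drop("RainTomorrow", axis=1))

# Map the text labels to 1/0; any label missing from the dict would become NaN
y_toy = toy["RainTomorrow"].map({"Yes": 1, "No": 0})

print(list(X_toy.columns))
print(y_toy.tolist())
```

Because get_dummies creates one column per category, a text column with many distinct values (dates or station names, for instance) can widen the feature matrix considerably.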

In [8]:
# Step 5: Data Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.
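
The test_size argument is cut off in this export, but the row counts printed elsewhere in the notebook allow a rough check of the split. The sketch below only does arithmetic on those printed numbers (the class counts in Out[5] and the per-class support in the classification report further down); it does not recover the original argument.

```python
# Row counts taken from the outputs in this notebook
n_total = 1274 + 416      # class counts in Out[5]
n_test = 243 + 95         # per-class support in the classification report

test_fraction = n_test / n_total
print(f"test rows: {n_test} of {n_total} ({test_fraction:.2f})")

# Share of positives ("Yes" = 1) in the full data versus the test split
print(f"positive share overall: {416 / n_total:.3f}")
print(f"positive share in test: {95 / n_test:.3f}")
```

The test split holds about a fifth of the rows. Its positive share (about 0.281) differs from the overall share (about 0.246); passing stratify=y to train_test_split is one way to keep the two aligned.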

In [14]: from sklearn.preprocessing import StandardScaler

# Step 4 (continued): Feature scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

Out[14]: LogisticRegression(max_iter=1000)
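
The scaler in In [14] is fitted on the training split and only applied to the test split. A minimal pure-Python sketch of that idea for a single feature, with made-up values, shows why: the mean and standard deviation come from the training data alone, so nothing about the test data leaks into the transformation.

```python
from statistics import mean, pstdev

# Hypothetical values of a single feature
train_values = [10.0, 12.0, 14.0, 16.0]
test_values = [11.0, 20.0]

# "fit": learn the mean and standard deviation from the training data only
mu = mean(train_values)
sigma = pstdev(train_values)  # population std (ddof=0), as StandardScaler uses

# "transform": apply the same training statistics to both splits
train_scaled = [(v - mu) / sigma for v in train_values]
test_scaled = [(v - mu) / sigma for v in test_values]

print(train_scaled)
print(test_scaled)
```

The scaled training values have mean 0 and standard deviation 1 by construction; the scaled test values generally do not, and that is expected.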

In [15]:
# Step 6: Choose a Classification Model
model = LogisticRegression()

localhost:8888/notebooks/Data_Science_Course_Rune/PS Project.ipynb 1/6


In [16]:
# Step 7: Model Training
model.fit(X_train, y_train)

/opt/anaconda3/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:458: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

Out[16]: LogisticRegression()
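
The warning recommends either raising max_iter or scaling the data. In [16] fits on X_train, not on the X_train_scaled array prepared in In [14], so the scaling is never used. One way to apply both suggestions is to bundle the scaler and the classifier in a pipeline. The sketch below runs on synthetic stand-in data, not the notebook's dataset; max_iter=1000 is borrowed from Out[14], and the other settings are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the weather features (not the notebook's data)
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=0)
X_demo[:, 0] *= 1000.0  # put one feature on a much larger scale

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Scaling and the classifier are fitted together on the training split only
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_tr, y_tr)

# The pipeline applies the training-set scaling before predicting
print("test accuracy:", clf.score(X_te, y_te))
```

With a pipeline, predict and predict_proba receive the raw test features, so the scaled and unscaled arrays cannot be mixed up between training and evaluation.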

In [17]: from sklearn.metrics import confusion_matrix

import seaborn as sns
import matplotlib.pyplot as plt

# Step 8: Model Evaluation
y_pred = model.predict(X_test)

# Create a confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Visualize the confusion matrix using seaborn
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', annot_kws={"size":
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.show()
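
For reading the heatmap, it helps to know the layout that confusion_matrix returns: rows are true classes and columns are predicted classes, so the binary case is [[TN, FP], [FN, TP]]. A small pure-Python sketch of the same counting, on hypothetical labels:

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary 0/1 labels (rows: true, columns: predicted)."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

# Hypothetical labels for illustration
y_true_demo = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred_demo = [0, 0, 1, 1, 1, 0, 1, 0]

cm_demo = confusion_counts(y_true_demo, y_pred_demo)
print(cm_demo)
```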


In [18]: from sklearn.metrics import roc_curve, auc

# Get predicted probabilities
y_pred_proba = model.predict_proba(X_test)[:, 1]

# Calculate ROC curve and AUC
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
roc_auc = auc(fpr, tpr)

# Plot ROC curve
plt.figure(figsize=(8, 8))
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area =
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc='lower right')
plt.show()
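
The area under the ROC curve has a direct interpretation: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes AUC that way, by counting correctly ordered pairs, on hypothetical labels and scores. It is an illustration of the quantity, not the routine scikit-learn uses internally.

```python
def auc_from_scores(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities
labels = [0, 0, 1, 1, 0, 1]
probs = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

print(auc_from_scores(labels, probs))
```

Here 8 of the 9 positive/negative pairs are ordered correctly, so the value is 8/9. The dashed diagonal in the plot corresponds to 0.5, the value for scores that carry no information.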


In [26]: from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

# Example metrics calculation
y_pred = model.predict(X_test)
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))
print("F1 Score:", f1_score(y_test, y_pred))
print("AUC-ROC Score:", roc_auc_score(y_test, y_pred_proba))

Precision: 1.0
Recall: 0.9473684210526315
F1 Score: 0.972972972972973
AUC-ROC Score: 0.9998267273121074

In [19]: print("Accuracy:", accuracy_score(y_test, y_pred))

print("Classification Report:\n", classification_report(y_test, y_pred

Accuracy: 0.985207100591716
Classification Report:
               precision    recall  f1-score   support

           0       0.98      1.00      0.99       243
           1       1.00      0.95      0.97        95

    accuracy                           0.99       338
   macro avg       0.99      0.97      0.98       338
weighted avg       0.99      0.99      0.99       338
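
The confusion-matrix plot is not reproduced in this export, but the printed metrics are enough to infer its four cells, assuming both evaluation cells used the same predictions. With support 243 and 95, precision 1.0 for class 1 implies no false positives, and recall 0.947... equals 90/95. The sketch below checks that these inferred counts reproduce every printed score.

```python
# Counts inferred from the report above: support 243 (class 0) and 95 (class 1),
# precision 1.0 for class 1 (no false positives), recall 0.947... = 90/95
tn, fp = 243, 0
fn, tp = 5, 90

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)

print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)
print("Accuracy:", accuracy)
```

All 5 errors on the 338 test rows are missed rain days (false negatives), which accuracy alone does not reveal.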


In [20]: # Step 9: Model Interpretation

# Coefficients for logistic regression model
coefficients = pd.DataFrame({'Feature': X.columns, 'Coefficient': mode

# Sort coefficients by absolute value
coefficients = coefficients.reindex(coefficients['Coefficient'].abs().

# Plot coefficients
plt.figure(figsize=(12, 6))
sns.barplot(x='Coefficient', y='Feature', data=coefficients, palette='
plt.xlabel('Coefficient Value')
plt.ylabel('Feature')
plt.title('Logistic Regression Coefficients')
plt.show()
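
A logistic regression coefficient is a change in log-odds, so exponentiating it gives an odds ratio that is often easier to read than the raw bar length. The sketch below uses hypothetical feature names and coefficient values, not the notebook's fitted ones, and sorts in descending order of absolute value as an assumption about the truncated sort above.

```python
import math

# Hypothetical (feature, coefficient) pairs; not the notebook's fitted values
coefs = {"Humidity3pm": 0.8, "Pressure3pm": -1.2, "WindGustSpeed": 0.3}

# Sort by absolute value, largest effect first (descending order is a choice made here)
ranked = sorted(coefs.items(), key=lambda item: abs(item[1]), reverse=True)

for feature, beta in ranked:
    # exp(beta) is the multiplicative change in the odds of class 1
    # for a one-unit increase in the feature, other features held fixed
    print(f"{feature:>14}: coef={beta:+.2f}  odds ratio={math.exp(beta):.3f}")
```

Coefficient magnitudes depend on each feature's units. Since the model in this notebook was fitted on unscaled features, ranking by absolute value is only a fair comparison after the features have been standardized.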

In [ ]:
# Step 10: Fine-Tuning and Optimization
# For logistic regression, fine-tuning may involve adjusting regulariz

# Step 11: Deployment (Not shown in code, as it depends on your deploy

# Step 12: Continuous Monitoring and Updating
# Monitor model performance over time and update as needed
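
Step 10 is left as a comment in the notebook. One common way to tune the regularization of a logistic regression is a cross-validated grid search over C. The sketch below is an assumption about how that could look, run on synthetic stand-in data; the grid values, the 5 folds, and the roc_auc scoring are illustrative choices rather than the author's.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (not the notebook's weather dataset)
X_demo, y_demo = make_classification(n_samples=400, n_features=8, random_state=0)

pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# C is the inverse regularization strength: smaller C means stronger regularization
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="roc_auc")
search.fit(X_demo, y_demo)

print("best C:", search.best_params_["logisticregression__C"])
print("best cross-validated AUC:", round(search.best_score_, 3))
```

Keeping the scaler inside the pipeline means it is refitted within each fold, so the cross-validated score is not inflated by statistics from the held-out fold.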

