Logistic Regression

This document covers logistic regression, a statistical model used when the dependent variable is binary (a two-category outcome). It explains how logistic regression predicts the probability of an outcome from a set of predictor variables, then outlines the process: data preparation, identifying variables, model creation and validation, and presenting results.

Uploaded by

Shaafici

MD Arshad Ahmad

15 Years+ Experience in Data Science

Mentored 100+ people
Logistic Regression – Introduction
In linear regression, the outcome variable is continuous and the predictor variables can be a mix of numeric and categorical. Often, however, we wish to evaluate the effects of multiple explanatory variables on a binary outcome variable.

Examples include the effect of several factors on whether or not a disease develops, whether a patient is cured or not, whether a prospect responds or not, and whether or not a loan should be granted to a particular person.

When the outcome (dependent) variable is binary and we wish to measure the effects of several independent variables on it, we use logistic regression.

▪ The binary outcome variable can be coded as 0 or 1.

▪ The logistic curve is shown in the figure below:

[Figure: logistic curve]

We estimate the probability of success by the equation:
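The equation itself did not survive in this copy of the slides. As a hedged reconstruction, the standard logistic model writes the probability of success p (outcome coded 1) in terms of the predictors x_1, ..., x_k and coefficients β_0, ..., β_k. The symbol names here are assumptions, not taken from the original slide:

```latex
p = P(y = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\qquad\Longleftrightarrow\qquad
\ln\frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
```

The right-hand form shows that the log-odds of success are linear in the predictors, which is why the fitted curve has the S shape of the logistic function.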
Process Flow

1. Data in Python
▪ Data is obtained in a pandas DataFrame

2. Data Preparation/Cleaning
▪ Missing value imputation
▪ Trash values
▪ Outlier treatment

3. Identification of Variables
▪ Independent and dependent variables should be identified

4. Factor Analysis or Correlation
▪ Factor analysis (FA) is done in order to get the variables into groups
▪ It is good to choose a factor solution near the eigenvalue of 1
▪ As a further check, correlation analysis is done

5. Creation of Modeling Dataset
▪ Divide the data into Development and Validation samples in the ratio 70:30 or 80:20

6. Logistic Regression in Python
▪ Assume all assumptions hold
▪ Check for the significance of the variables
▪ Run the regression on the Development sample

7. Validate KS Statistic Output
▪ Validate the output on the Validation sample by running the same model on it

8. Output
▪ Results will be presented in PowerPoint
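The process flow names the KS statistic as the validation measure but does not show how it is obtained. The following is a minimal sketch, assuming the usual definition: the maximum gap between the cumulative share of events and the cumulative share of non-events when observations are ranked by predicted probability. The function name `ks_statistic` and the toy data are hypothetical, and the sketch assumes there are no tied scores.

```python
import numpy as np

def ks_statistic(y_true, y_score):
    """Maximum gap between the cumulative score distributions of the two classes."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    # Rank observations from highest to lowest predicted probability
    order = np.argsort(-y_score, kind="stable")
    y_sorted = y_true[order]
    # Cumulative share of events (1) and non-events (0) captured so far
    cum_events = np.cumsum(y_sorted == 1) / np.sum(y_true == 1)
    cum_non_events = np.cumsum(y_sorted == 0) / np.sum(y_true == 0)
    return float(np.max(np.abs(cum_events - cum_non_events)))

# Toy data (made up for illustration): higher scores mostly belong to class 1
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
ks = ks_statistic(y_true, y_score)
print(f"KS = {ks:.2f}")  # prints "KS = 0.50"
```

Computing the same statistic on the Development and Validation samples and comparing the two values is one way to check that the model holds up on unseen data.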
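Python code
Step 1: Importing the libraries and the dataset
import numpy as np
import pandas as pd

dataset = pd.read_csv('car_purchase_Ads.csv')
X = dataset.iloc[:, :-1].values  # all columns except the last are predictors
y = dataset.iloc[:, -1].values   # the last column is the binary outcome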

Step 2: Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
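The process flow suggests a 70:30 or 80:20 Development/Validation split, while the step above uses 75:25. As a sketch of the 70:30 variant, the example below also passes `stratify`, which keeps the share of each class the same in both samples. The data and the `_demo`, `_dev`, and `_val` names are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 100 rows, 2 features, 30% positive class
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([1] * 30 + [0] * 70)

# 70:30 split; stratify keeps the class ratio equal in both samples
X_dev, X_val, y_dev, y_val = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=0, stratify=y_demo
)

print(X_dev.shape, X_val.shape)    # sizes of the Development and Validation samples
print(y_dev.mean(), y_val.mean())  # share of the positive class in each sample
```

Stratifying is a design choice rather than a requirement. It matters most when one outcome is rare, because a purely random split could leave the Validation sample with very few positive cases.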

Step 3: Feature Scaling

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
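To make the scaling step concrete, here is a small sketch of what `StandardScaler` does: it learns the mean and standard deviation of each column from the training data only, then applies those same values to the test data. The numbers are made up, and reading the two columns as age and salary is an assumption.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up training and test rows; the two columns are assumed to be [age, salary]
X_train_demo = np.array([[25.0, 40000.0], [35.0, 60000.0], [45.0, 80000.0]])
X_test_demo = np.array([[30.0, 87000.0]])

sc_demo = StandardScaler()
X_train_scaled = sc_demo.fit_transform(X_train_demo)  # learns mean and std from training data
X_test_scaled = sc_demo.transform(X_test_demo)        # reuses the training mean and std

# Manual equivalent: subtract the training mean, divide by the training std
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population std (ddof=0), which StandardScaler uses
manual_test_scaled = (X_test_demo - mean) / std

print(X_train_scaled.mean(axis=0))        # approximately 0 for each column
print(X_test_scaled, manual_test_scaled)  # the two results agree
```

Calling `fit_transform` on the training set and only `transform` on the test set, as the step above does, keeps information from the test data out of the model.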
Step 4: Training the Logistic Regression model on the Training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train)
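After fitting, the model holds an intercept and one coefficient per predictor. The sketch below, on synthetic data made up for illustration, checks that the probabilities returned by `predict_proba` for a binary model match the logistic function applied to the linear score. All `_demo` names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic, already-scaled data (made up for illustration)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
noise = rng.normal(scale=0.5, size=200)
y_demo = (X_demo[:, 0] + 0.5 * X_demo[:, 1] + noise > 0).astype(int)

clf_demo = LogisticRegression(random_state=0)
clf_demo.fit(X_demo, y_demo)

# Linear score: intercept plus the dot product of coefficients and features
z = clf_demo.intercept_[0] + X_demo @ clf_demo.coef_[0]
# The logistic (sigmoid) function maps the score to a probability of class 1
p_manual = 1.0 / (1.0 + np.exp(-z))
p_sklearn = clf_demo.predict_proba(X_demo)[:, 1]

print(clf_demo.intercept_, clf_demo.coef_)
print(np.allclose(p_manual, p_sklearn))
```

Note that `LogisticRegression` applies L2 regularization by default, so the coefficients can differ slightly from an unpenalized fit in a statistics package.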
Step 5: Predicting a new result
print(classifier.predict(sc.transform([[30,87000]])))

Step 6: Predicting the Test set results

y_pred = classifier.predict(X_test)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

Step 7: Making the Confusion Matrix

from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(accuracy_score(y_test, y_pred))
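To show how the confusion matrix relates to the accuracy score, the sketch below unpacks the four cells of a binary confusion matrix and derives accuracy, precision, and recall from them. The labels are made up for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Made-up labels for illustration
y_true_demo = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_pred_demo = np.array([0, 0, 1, 0, 1, 0, 1, 0, 1, 0])

cm_demo = confusion_matrix(y_true_demo, y_pred_demo)
# For binary labels 0/1, rows are actual classes and columns are predicted classes
tn, fp, fn, tp = cm_demo.ravel()

accuracy = (tp + tn) / cm_demo.sum()  # share of all predictions that are correct
precision = tp / (tp + fp)            # share of predicted positives that are correct
recall = tp / (tp + fn)               # share of actual positives that are found

print(cm_demo)
print(accuracy, precision, recall)  # prints 0.8 0.75 0.75
```

Accuracy alone can be misleading when one class is much more common than the other, which is why precision and recall are often reported alongside it.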
Practice

The code and dataset are located at:

https://drive.google.com/drive/folders/1CMYQTNd02MraMAQ1V-T2eNvicedvLlAu
Thank You!