Logistic Regression

The document outlines a training session on Logistic Regression, covering topics such as classification problems, fitting lines, and interpreting coefficients. It includes examples of predicting machine failures, employee resignations, and patient re-admissions, along with exercises for understanding odds ratios and model coefficients. The session also emphasizes the logistic function and its application in predicting probabilities.

Logistic Regression - Day 1

Agenda
In this session, you will learn:

01 Classification problems
02 Solving classification problems using linear regression
03 Interpreting the coefficients of a logistic model
04 Finding out the optimal coefficients
05 Python demo
Classification Problems

Predict if a machine will fail in the next 14 days.

VibrationX_14day VibrationY_14day VibrationZ_14day Failed


… … … Yes

… … … No
Classification Problems

The HR department of a company wants to understand which employees are at risk of resigning.

# promotions Current salary Market Salary Resigned


... ... ... Yes

... ... ... No


Classification Problems

Can we predict which patients are at risk of re-admission?

Patient ID  Age  Gender  …
001         23   M       …
Classification Problems: Class Exercise

Take five minutes and discuss two scenarios in which the prediction problem is a classification problem. Also discuss what kind of data you would need to collect.

• Customer churn prediction: demographics, past association with the product, and number of complaints registered
• Predicting fraud: demographics, financial history, and circumstances of the transaction
Fitting Lines
Supervised Learning: Fitting lines

Feature 1 (Age_Normalized)  Feature 2 (Income_Normalized)  Response (Good/Bad)
…                           …                              Good = 1
…                           …                              Bad = 0

Fit a model of the form: Response = b0 + b1·Feature1


Supervised Learning: Fitting lines

We fit a straight line to the data, where the response is binary in nature: y ∈ {0, 1}.

Notice the predictions (shown in red on the slide):

1. Some predictions are < 0 or > 1
2. Others satisfy 0 < prediction < 1

We are trying to fit:

ŷ = β0 + β1X1 + β2X2 + ε = f(X1, X2)

The problem with fitting a linear model is:
f(X1, X2) ∈ (−∞, +∞), but y ∈ {0, 1}
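The mismatch can be seen with a tiny sketch. This fits an ordinary least-squares line to hypothetical binary data (plain Python, no libraries) and shows the fitted line predicting values outside [0, 1]:

```python
# Least-squares line fit on a binary response (hypothetical toy data).
# Demonstrates that a straight line can predict values outside [0, 1].
xs = [1, 2, 3, 4, 5, 6]   # feature
ys = [0, 0, 0, 1, 1, 1]   # binary response

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares: slope = cov(x, y) / var(x)
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
     sum((x - mean_x) ** 2 for x in xs)
b0 = mean_y - b1 * mean_x

preds = [b0 + b1 * x for x in xs]
print(preds[0])    # negative: below the valid range for a probability
print(preds[-1])   # greater than 1: above the valid range
```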
Supervised Learning: Fitting lines

Age  Target          Age  Proportion(1)
20   1               20   0.50
20   0               21   0.50
21   1
21   0
Supervised Learning: Fitting lines

Instead of estimating 𝑦∈{0,1} we can try to estimate


the Prob(y=1) = 𝑝 ̂
𝑝 ̂∈(0,1)

These estimates make sense now.

We are trying to fit:


𝑝 ̂ = β0+β1𝑋1+β2𝑋2+ε = 𝑓(𝑋1,𝑋2)

The problem still is:


𝑓(𝑋1,𝑋2) ∈ (−∞,+∞)
𝑝 ̂ ∈ (0,1)
Supervised Learning: Fitting lines

Age  Target          Age  Pr(1)  Pr/(1−Pr)
20   1               20   0.75   3
20   1               21   0.50   1
20   1
20   0
21   1
21   0
Supervised Learning: Fitting lines

Age  Target          Age  Pr(1)  Pr/(1−Pr)
20   1               20   0.75   3
20   1               21   0.50   1
20   1
20   0
21   1
21   0

What are the minimum and maximum values of Pr/(1−Pr)?

Minimum value of Pr(1)? Pr(1) = 0
Maximum value of Pr(1)? Pr(1) = 1

At Pr(1) = 0, Pr/(1−Pr) = 0
As Pr(1) → 1, Pr/(1−Pr) → Infinity

0 <= Pr/(1−Pr) < Infinity
Supervised Learning: Fitting lines

Instead of estimating y ∈ {0, 1}, we can try to estimate the odds p̂/(1−p̂):

p̂/(1−p̂) ∈ (0, +∞)

These estimates make sense now.

We are trying to fit:

p̂/(1−p̂) = β0 + β1X1 + β2X2 + ε = f(X1, X2)

The problem still is:
f(X1, X2) ∈ (−∞, +∞), but p̂/(1−p̂) ∈ (0, +∞)
Supervised Learning: Fitting lines

Age  Target          Age  Pr(1)  Pr/(1−Pr)  log(Pr/(1−Pr))
20   1               20   0.75   3          1.10
20   1               21   0.50   1          0.00
20   1
20   0
21   1
21   0

0 <= Pr/(1−Pr) < Infinity
−Infinity < log(Pr/(1−Pr)) < Infinity
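The aggregation can be reproduced in a few lines of Python, treating the slide's (Age, Target) rows as toy data:

```python
import math
from collections import defaultdict

# Toy data: (Age, Target) pairs from the slide.
rows = [(20, 1), (20, 1), (20, 1), (20, 0), (21, 1), (21, 0)]

counts = defaultdict(lambda: [0, 0])   # age -> [n_ones, n_total]
for age, target in rows:
    counts[age][0] += target
    counts[age][1] += 1

for age in sorted(counts):
    ones, total = counts[age]
    p = ones / total            # Pr(1)
    odds = p / (1 - p)          # Pr/(1-Pr)
    print(age, p, odds, math.log(odds))
```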


Supervised Learning: Fitting lines

Instead of estimating y ∈ {0, 1}, we can try to estimate log(p̂/(1−p̂)):

log(p̂/(1−p̂)) ∈ (−∞, +∞)

All these estimates make sense now.

We are trying to fit:

log(p̂/(1−p̂)) = β0 + β1X1 + β2X2 + ε = f(X1, X2)

f(X1, X2) ∈ (−∞, +∞) and log(p̂/(1−p̂)) ∈ (−∞, +∞)
Supervised Learning: Fitting lines

log(p̂/(1−p̂)) = β0 + β1X1 + β2X2 + ε

p̂ = e^(β0 + β1X1 + β2X2 + ε) / (1 + e^(β0 + β1X1 + β2X2 + ε))

This is the logistic function.
Classification Problems: Class Exercise

Imagine that there are 125 customers of age 25 years. Of them, 25 have subscribed to a premium subscription of an OTT platform. Find out the odds ratio.

p = 25/125, 1 − p = 100/125, so odds ratio = p/(1 − p) = (25/125) × (125/100) = 1/4
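The arithmetic checks out directly:

```python
# Odds for the OTT exercise: 25 subscribers out of 125 customers.
p = 25 / 125          # probability of a premium subscription
odds = p / (1 - p)    # p / (1 - p)
print(odds)           # 0.25, i.e. 1/4
```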
Interpreting Coefficients of a Logistic Model
Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

log(p/(1−p)) = 2.1 + 0.08·Age

Age Churned

28 Yes

32 Yes

40 No
Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

log(p/(1−p)) = 2.1 + 0.08·Age

Changing Age by 1 unit


will change the log odds
of someone churning by
0.08 units.
Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

For Age = 20: log(p/(1−p)) = 2.1 + 0.08 × 20
Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

log(p/(1−p)) = 2.1 + 1.6


Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

log(p/(1−p)) = 3.7
Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

Exponentiating both sides: p/(1−p) = e^3.7
Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

p/(1−p) = e^3.7 ≈ 2.71^3.7
Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

p/(1−p) ≈ 39.99
Interpreting Logistic Regression Coefficients

Predicting churn based on a person’s age

log(p/(1−p)) = β0 + β1X1

p/(1−p) ≈ 39.99, so p = 39.99/(1 + 39.99) ≈ 0.98
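The whole chain above (log-odds → odds → probability) for a 20-year-old, using the slide's coefficients 2.1 and 0.08:

```python
import math

b0, b1 = 2.1, 0.08          # coefficients from the slide
age = 20

log_odds = b0 + b1 * age    # 2.1 + 1.6 = 3.7
odds = math.exp(log_odds)   # ~40.4 (the slide's 39.99 comes from using e ~ 2.71)
p = odds / (1 + odds)       # convert odds to a probability
print(round(p, 2))          # 0.98
```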
Class Exercise

• Imagine you were trying to model the propensity of customers to churn, and you were able to build the following logistic regression model:

  log(p/(1−p)) = 1.95 − 2.5·Age + 1.68·Income

• Can we conclude that as age increases, the propensity for a customer to churn decreases, keeping all else constant?
• What would be the churn probability for a person aged 25 with an income of 15000?
Class Exercise

• Imagine you were trying to model the propensity of customers to churn, and you were able to build the following logistic regression model:

  log(p/(1−p)) = 0.0001 − 0.005·Age + 0.008·Income

• Can we conclude that as age increases, the propensity for a customer to churn decreases, keeping all else constant? (Yes: the coefficient of Age is negative.)
• What would be the probability of churn for a person with age 25 and income 150? (p = E/(1 + E), where E = e^(0.0001 − 0.005×25 + 0.008×150), so p ≈ 0.75.)
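The answer to the second question can be verified directly:

```python
import math

# Model from the exercise: log(p/(1-p)) = 0.0001 - 0.005*Age + 0.008*Income
z = 0.0001 - 0.005 * 25 + 0.008 * 150   # log-odds for age 25, income 150
E = math.exp(z)
p = E / (1 + E)
print(round(p, 2))                       # 0.75
```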
Finding Out the Optimal Coefficients
Estimating Coefficients of Logistic Model

log(p̂/(1−p̂)) = β0 + β1·Age

p̂ = e^(β0 + β1·Age) / (1 + e^(β0 + β1·Age))

Age  Good_Bad (Good = 1)  Prediction
20   1
21   1
24   0
25   0
29   0
30   1
38   1

Candidate coefficients: β0 = 0.7, β1 = 1.7, giving
p̂ = e^(0.7 + 1.7·Age) / (1 + e^(0.7 + 1.7·Age))
Estimating Coefficients of Logistic Model

log(p̂/(1−p̂)) = β0 + β1·Age

p̂ = e^(β0 + β1·Age) / (1 + e^(β0 + β1·Age))

Age  Good_Bad (Good = 1)  Prediction
20   1
21   1
24   0
25   0
29   0
30   1
38   1

Two candidate coefficient sets: β0 = 0.7, β1 = 1.7 and β0 = 0.3, β1 = 2.2
Estimating Coefficients of Logistic Model

log(p̂/(1−p̂)) = β0 + β1·Age

p̂ = e^(β0 + β1·Age) / (1 + e^(β0 + β1·Age))

Predictions for the candidate β0 = 0.3, β1 = 2.2:

Age  Good_Bad (Good = 1)  Prediction
20   1                    0.70
21   1                    0.60
24   0                    0.50
25   0                    0.45
29   0                    0.70
30   1                    0.62
38   1                    0.40
Estimating Coefficients of Logistic Model

log(p̂/(1−p̂)) = β0 + β1·Age

p̂ = e^(β0 + β1·Age) / (1 + e^(β0 + β1·Age))

Predictions for the candidate β0 = 0.3, β1 = 2.2:

Age  Good_Bad (Good = 1)  Prediction
20   1                    0.70
21   1                    0.60
24   0                    0.50
25   0                    0.45
29   0                    0.70
30   1                    0.62
38   1                    0.40

Clearly β0 = 0.7, β1 = 1.7 is a better choice than β0 = 0.3, β1 = 2.2.
Estimating Coefficients of Logistic Model

• One would like to choose the model coefficients so that the model gives a high score to events and a low score to non-events.
• But how will we measure a model's ability to assign a high score to events and a low score to non-events? With a cost function.

See the Excel sheet logistic_cost.xlsx.
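One standard choice of cost function is the log loss (negative log-likelihood), which is what logistic regression minimizes. A sketch using the slide's labels and predicted probabilities for one candidate model, plus a second, hypothetical, better-aligned prediction set for comparison:

```python
import math

def log_loss(y_true, p_pred):
    """Mean negative log-likelihood: small when events get high scores
    and non-events get low scores."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, p_pred)) / len(y_true)

y = [1, 1, 0, 0, 0, 1, 1]                           # Good_Bad labels from the slide
p_a = [0.70, 0.60, 0.50, 0.45, 0.70, 0.62, 0.40]    # predictions from the slide
p_b = [0.90, 0.80, 0.20, 0.20, 0.30, 0.80, 0.70]    # hypothetical better-aligned set

print(round(log_loss(y, p_a), 2))   # 0.68
print(round(log_loss(y, p_b), 2))   # lower: these scores track the labels better
```

Lower log loss means the predicted probabilities agree better with the observed labels, which is exactly the criterion used to pick the optimal coefficients.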
Thank You!

Copyright © HeroX Private Limited, 2023. All rights reserved.
