Logistic Regression
Agenda
Logistic Regression - Day 1
Classification Problems
The HR department of a company wants to understand which employees are at risk of resigning.

Age   Target
20    1
20    0
21    1
21    0

Age   Proportion (Target = 1)
20    0.50
21    0.50
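The proportion column can be reproduced by grouping the records by age. The following is a minimal sketch using the four rows from the slide; the variable names are illustrative, and reading Target = 1 as "resigned" is an assumption based on the HR scenario.

```python
from collections import defaultdict

# Toy data from the slide: (age, target).
# Assumption: target = 1 marks an employee who resigned.
rows = [(20, 1), (20, 0), (21, 1), (21, 0)]

totals = defaultdict(int)     # number of records per age
positives = defaultdict(int)  # number of records with target = 1 per age
for age, target in rows:
    totals[age] += 1
    positives[age] += target

# Proportion of target = 1 at each age.
proportion = {age: positives[age] / totals[age] for age in sorted(totals)}
print(proportion)  # {20: 0.5, 21: 0.5}
```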
Supervised Learning: Fitting lines
Using the same data, Pr(1) denotes the proportion of records with Target = 1 at each age (0.50 at age 21).
Minimum value of Pr(1)? Pr(1) = 0
Maximum and minimum value of Pr/(1-Pr)?
Pr(1) is bounded between 0 and 1, whereas a fitted line is unbounded. The odds Pr/(1-Pr) range from 0 (at Pr = 0) towards infinity (as Pr approaches 1):
0 <= Pr/(1-Pr) <= Infinity
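To see the range of the odds numerically, here is a small sketch. The helper name `odds` is illustrative and not from the slides; returning infinity at Pr = 1 is a convention chosen here so the upper bound is visible.

```python
import math

def odds(p: float) -> float:
    """Return p / (1 - p); treated as infinite when p == 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.inf if p == 1.0 else p / (1.0 - p)

# The odds start at 0 and grow without bound as p approaches 1.
for p in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(p, odds(p))
```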
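Since 0 <= Pr/(1-Pr) <= Infinity, the log of the odds can take any real value, so it can be modeled as a linear function of the predictors:

$$\log\frac{\hat{p}}{1-\hat{p}} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$

Solving for $\hat{p}$ gives the logistic function:

$$\hat{p} = \frac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon}}$$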
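The log-odds and the logistic function are inverses of each other. The sketch below illustrates this; the function names `logit` and `logistic` are chosen for this example, and the two-branch form of `logistic` is only a numerical safeguard against overflow, algebraically equal to e^z / (1 + e^z).

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def logistic(z: float) -> float:
    """Map any real number z to a probability in [0, 1]."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))  # same value, avoids overflow for large z
    ez = math.exp(z)
    return ez / (1.0 + ez)

# A log-odds of 0 corresponds to a probability of 0.5.
print(logistic(0.0))  # 0.5
# Applying logistic to logit(p) recovers p (up to floating-point rounding).
print(logistic(logit(0.2)))
```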
Classification Problems: Class Exercise
$$p = \frac{25}{125}, \quad 1 - p = \frac{100}{125}, \quad \text{so odds} = \frac{p}{1-p} = \frac{25}{125} \times \frac{125}{100} = \frac{1}{4}$$
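The arithmetic in this exercise can be checked with exact fractions. This is a minimal sketch; it assumes the figures mean 25 positive cases out of 125 in total, which the extracted slide does not state explicitly.

```python
from fractions import Fraction

# Assumption: 25 positive cases out of 125 observations.
p = Fraction(25, 125)
odds = p / (1 - p)

print(p, 1 - p, odds)  # 1/5 4/5 1/4
```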
Interpreting Coefficients of a Logistic Model
Interpreting Logistic Regression Coefficients
$$\log\frac{p}{1-p} = \beta_0 + \beta_1 X_1$$

$$\log\frac{p}{1-p} = 2.1 + 0.08\,\text{Age}$$

Age   Churned
28    Yes
32    Yes
40    No
For Age = 20:

$$\log\frac{p}{1-p} = 2.1 + 0.08 \times 20$$
$$\log\frac{p}{1-p} = 3.7$$
Exponentiating both sides gives the odds:

$$\frac{p}{1-p} = e^{3.7} \approx 2.718^{3.7} \approx 40.45$$
Solving for $p$:

$$p = \frac{e^{3.7}}{1 + e^{3.7}} \approx 0.976$$
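The worked example can be reproduced step by step. This is a sketch using the coefficients from the slide (2.1 and 0.08) and Age = 20; the variable names are illustrative. The last line adds a standard reading of the slope that the slides do not spell out: each extra year of age multiplies the odds by e^0.08.

```python
import math

beta0, beta1 = 2.1, 0.08  # coefficients from the slide
age = 20

log_odds = beta0 + beta1 * age  # 2.1 + 0.08 * 20 = 3.7
odds = math.exp(log_odds)       # e^3.7, about 40.45
p = odds / (1.0 + odds)         # about 0.976

print(round(log_odds, 2), round(odds, 2), round(p, 3))

# Multiplicative change in the odds per additional year of age (about 1.083).
odds_multiplier_per_year = math.exp(beta1)
print(round(odds_multiplier_per_year, 3))
```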
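Class Exercise

$$\log\frac{\hat{p}}{1-\hat{p}} = \beta_0 + \beta_1\,\text{Age}$$

$$\hat{p} = \frac{e^{\beta_0 + \beta_1\,\text{Age}}}{1 + e^{\beta_0 + \beta_1\,\text{Age}}}$$

Age   Good_Bad (Good = 1)   Prediction
20    1
21    1
24    0
25    0
29    0
30    1
38    1

With $\beta_0 = 0.7$ and $\beta_1 = 1.7$:

$$\hat{p} = \frac{e^{0.7 + 1.7\,\text{Age}}}{1 + e^{0.7 + 1.7\,\text{Age}}}$$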
Estimating Coefficients of Logistic Model
Two candidate sets of coefficients for the model and data above:
$\beta_0 = 0.7$, $\beta_1 = 1.7$
$\beta_0 = 0.3$, $\beta_1 = 2.2$
Age   Good_Bad (Good = 1)   Prediction
20    1                     0.70
21    1                     0.60
24    0                     0.50
25    0                     0.45
29    0                     0.70
30    1                     0.62
38    1                     0.40
Clearly, $\beta_0 = 0.7$ and $\beta_1 = 1.7$ are the better choice.
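The slides do not say how "better" is judged. One common criterion, assumed here rather than taken from the slides, is the log-likelihood of the observed labels under the predicted probabilities, where a higher value means a closer fit. The sketch below scores the Good_Bad and Prediction columns listed above; the constant 0.5 baseline is a hypothetical comparison point and not part of the slides.

```python
import math

def log_likelihood(labels, predictions):
    """Sum of log-probabilities assigned to the observed 0/1 labels."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, predictions)
    )

# Good_Bad and Prediction columns as listed on the slide.
labels = [1, 1, 0, 0, 0, 1, 1]
slide_predictions = [0.70, 0.60, 0.50, 0.45, 0.70, 0.62, 0.40]

# Hypothetical baseline (not from the slides): predict 0.5 for everyone.
baseline_predictions = [0.5] * len(labels)

ll_slide = log_likelihood(labels, slide_predictions)
ll_baseline = log_likelihood(labels, baseline_predictions)

# The set of predictions with the higher log-likelihood fits the labels better.
print(round(ll_slide, 3), round(ll_baseline, 3))
```

Under this criterion, two candidate coefficient sets would be compared by computing their predictions with the logistic function and keeping the set with the higher log-likelihood.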
Estimating Coefficients of Logistic Model