Logistic Regression Notes
By:
Dr. Elya Nabila Abdul Bahri
Objectives
Overview
Response       Analysis
Continuous     Linear Regression Analysis
Categorical    Logistic Regression Analysis
Types of Logistic Regression
Response Variable                     Type of Logistic Regression
Two categories (Yes / No)             Binary
Three or more categories, nominal     Nominal
Three or more categories, ordinal     Ordinal
What Does Logistic Regression Do?
The logistic regression model uses the predictor variables, which
can be categorical or continuous, to predict the probability of
specific outcomes.
In other words, logistic regression is designed to describe
probabilities associated with the values of the response variable.
Logistic Regression Curve
[Figure: logistic regression curve. Vertical axis: Probability, from 0.0 to 1.0. Horizontal axis: x, from 1 to 21.]
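The S-shaped curve can be reproduced numerically. The sketch below is only an illustration: the coefficients BETA0 and BETA1 are hypothetical values chosen so that the curve bends inside the plotted range of x, and they do not come from the slides.

```python
import math

def logistic(eta):
    """Map a linear predictor eta to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients (an assumption, not taken from the slides).
BETA0 = -5.5
BETA1 = 0.5

# Same x range as the horizontal axis of the figure.
xs = list(range(1, 22))
probs = [logistic(BETA0 + BETA1 * x) for x in xs]

for x, p in zip(xs, probs):
    print(f"x = {x:2d}  probability = {p:.3f}")
```

With these made-up values the probability rises slowly at first, fastest near the middle of the range, and then levels off toward 1.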
Logit Transformation
Logistic regression models transform probabilities into values called logits:
$$\operatorname{logit}(p_i) = \log\left(\frac{p_i}{1 - p_i}\right)$$
where
$i$ indexes all cases (observations),
$p_i$ is the probability that the event (a sale, for example) occurs in the $i$th case,
$\log$ is the natural log (to the base $e$).
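As a small illustration of the transformation, the sketch below computes the logit of a few probabilities and maps the result back. The function names logit and inverse_logit are chosen here for illustration only.

```python
import math

def logit(p):
    """Natural log of the odds p / (1 - p), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Map a logit back to the probability it came from."""
    return 1.0 / (1.0 + math.exp(-z))

for p in (0.1, 0.5, 0.9):
    z = logit(p)
    print(f"p = {p:.1f}  logit = {z:+.4f}  back to p = {inverse_logit(z):.1f}")
```

A probability of 0.5 has a logit of 0, and probabilities that mirror each other around 0.5 (such as 0.1 and 0.9) have logits of equal size and opposite sign.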
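Assumption
[Figure: two panels linked by the logit transform. Left panel: p_i plotted against the predictor. Right panel: logit(p_i) plotted against the predictor.]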
Logistic Regression Model
Open file logit_multi
Reference Cell Coding: Two Levels
                      Design Variables
Class       Value     1
purchase    Yes       1
            No        0
Reference Cell Coding: Three Levels
Design Variables
Quick Quiz: Reference Cell Coding
Reference Cell Coding: An Example
$$\operatorname{logit}(p) = \beta_0 + \beta_1 D_{\text{High income}} + \beta_2 D_{\text{Medium income}}$$
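Reference cell coding can be sketched in a few lines. The helper reference_cell_code below is a hypothetical name, and treating "Low" as the third income level and as the reference level is an assumption, since the slides name only the High and Medium design variables.

```python
def reference_cell_code(value, levels, reference):
    """Return 0/1 design variables for `value`, one per non-reference level.

    The reference level is coded 0 on every design variable.
    """
    if value not in levels or reference not in levels:
        raise ValueError("value and reference must both be listed in levels")
    return {f"D_{level}": int(value == level)
            for level in levels if level != reference}

# Assumed income levels; "Low" as the reference level is an assumption.
INCOME_LEVELS = ["High", "Medium", "Low"]

for income in INCOME_LEVELS:
    print(income, reference_cell_code(income, INCOME_LEVELS, reference="Low"))
```

Under this coding the intercept is the logit for the reference level, and each remaining coefficient is the difference in logits between its level and the reference level.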
Effect Coding: Two Levels
                    Design Variables
Class     Value     1
gender    Female    1
          Male      -1
Effect Coding: Three Levels
Design Variables
Effect Coding: An Example
$$\operatorname{logit}(p) = \beta_0 + \beta_1 D_{\text{High income}} + \beta_2 D_{\text{Medium income}}$$
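Effect coding differs from reference cell coding only in how the last level is coded: it receives -1 on every design variable instead of 0. The sketch below uses a hypothetical helper effect_code, and again assumes "Low" is the third income level.

```python
def effect_code(value, levels, reference):
    """Return effect-coded design variables for `value`.

    Non-reference levels get a 0/1 indicator; the reference level
    gets -1 on every design variable.
    """
    if value not in levels or reference not in levels:
        raise ValueError("value and reference must both be listed in levels")
    return {f"D_{level}": (-1 if value == reference else int(value == level))
            for level in levels if level != reference}

# Two levels, matching the gender table: Female = 1, Male = -1.
print(effect_code("Female", ["Female", "Male"], reference="Male"))
print(effect_code("Male", ["Female", "Male"], reference="Male"))

# Three levels; "Low" as the third level is an assumption.
INCOME_LEVELS = ["High", "Medium", "Low"]
for income in INCOME_LEVELS:
    print(income, effect_code(income, INCOME_LEVELS, reference="Low"))
```

With effect coding each coefficient compares its level with the average logit across all levels, which is why the same model equation is read differently under the two coding schemes.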
Binary Logistic Regression
Task: Analyze > Regression > Binary Logistic
What Is an Odds Ratio?
Frequency
Probability of Outcome
             Outcome
             Yes     No      Total
Group A      101     139     240
Group B      61      130     191
Total        162     269     431
Probability of a Yes outcome in Group A = 101 / 240 ≈ 0.42
Probability of a No outcome in Group A = 139 / 240 ≈ 0.58
Odds
Odds of Outcome in Group B
$$\text{Odds}_B = \frac{P(\text{Yes outcome in Group B})}{P(\text{No outcome in Group B})} = \frac{61/191}{130/191} = \frac{61}{130} \approx 0.47$$
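Odds Ratio
Odds Ratio of Group A to Group B
$$\text{Odds ratio}_{A:B} = \frac{\text{Odds of outcome in Group A}}{\text{Odds of outcome in Group B}} = \frac{101/139}{61/130} \approx \frac{0.73}{0.47} \approx 1.55$$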
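The probability, odds, and odds ratio calculations can be checked directly from the frequency table. The sketch below uses the counts from the Probability of Outcome table; the names COUNTS, probability, and odds are chosen here for illustration.

```python
# Counts copied from the frequency table.
COUNTS = {
    "Group A": {"Yes": 101, "No": 139},
    "Group B": {"Yes": 61, "No": 130},
}

def probability(group, outcome):
    """Share of the group with the given outcome."""
    row = COUNTS[group]
    return row[outcome] / (row["Yes"] + row["No"])

def odds(group):
    """Probability of Yes divided by probability of No within the group."""
    return probability(group, "Yes") / probability(group, "No")

odds_ratio = odds("Group A") / odds("Group B")

for group in COUNTS:
    print(f"{group}: P(Yes) = {probability(group, 'Yes'):.3f}, odds = {odds(group):.3f}")
print(f"Odds ratio of Group A to Group B = {odds_ratio:.3f}")
```

Because the group totals cancel, the odds reduce to the Yes count divided by the No count in each group.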
Properties of the Odds Ratio
Odds ratio between 0 and 1: Group B more likely
Odds ratio equal to 1: no association
Odds ratio greater than 1: Group A more likely
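Odds Ratio Calculation from the Current Logistic Regression Model
Logistic regression model:
$$\text{odds}_{\text{females}} = e^{\beta_0 + \beta_1}$$
$$\text{odds}_{\text{males}} = e^{\beta_0}$$
$$\text{odds ratio} = \frac{e^{\beta_0 + \beta_1}}{e^{\beta_0}} = e^{\beta_1}$$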
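The cancellation of the intercept can be verified numerically. In the sketch below the coefficient values beta0 and beta1 are hypothetical and chosen only for illustration; they are not estimates from the slides.

```python
import math

# Hypothetical coefficients (assumptions, not fitted values).
beta0 = -0.76   # intercept: log odds for males
beta1 = 0.44    # gender coefficient: added for females

odds_females = math.exp(beta0 + beta1)
odds_males = math.exp(beta0)

# The intercept cancels, so the ratio depends on beta1 alone.
odds_ratio = odds_females / odds_males

print(f"odds (females) = {odds_females:.4f}")
print(f"odds (males)   = {odds_males:.4f}")
print(f"odds ratio     = {odds_ratio:.4f}")
print(f"exp(beta1)     = {math.exp(beta1):.4f}")
```

Changing beta0 rescales both odds by the same factor and leaves the odds ratio unchanged.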
1 Independent Variable: Gender
Goodness-of-fit statistics for Logistic Regression Model
Compare Means: Independent-Samples T Test
Compare Means of Gender
Comparing Pairs
To find concordant, discordant, and tied pairs, compare
everyone who had the outcome of interest against
everyone who did not.
Concordant Pair
Compare a woman who bought more than $100 worth
of goods from the catalog and a man who did not.
Discordant Pair
Compare a man who bought more than $100 worth
of goods from the catalog and a woman who did not.
Tied Pair
Compare two women. One bought more than $100 worth
of goods from the catalog, but the other did not.
Concordant versus Discordant
Predicted outcome probability: Females (0.42), Males (0.32)

                               Customer Purchasing Over $100
Customer Purchasing            Females (0.42)      Males (0.32)
Less Than $100
  Females (0.42)               Tie                 Discordant Pair
  Males (0.32)                 Concordant Pair     Tie

A second layer of the same slide labels the same cells Females (0.72) and Males (0.47).
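The pair comparison described above can be sketched as a simple count. The function classify_pairs and the two short lists of customers are hypothetical; only the predicted probabilities 0.42 (women) and 0.32 (men) come from the slide.

```python
from itertools import product

def classify_pairs(event_probs, nonevent_probs):
    """Compare every case with the outcome against every case without it.

    A pair is concordant when the case with the outcome has the higher
    predicted probability, discordant when it has the lower one, and
    tied when the two probabilities are equal.
    """
    concordant = discordant = tied = 0
    for p_event, p_nonevent in product(event_probs, nonevent_probs):
        if p_event > p_nonevent:
            concordant += 1
        elif p_event < p_nonevent:
            discordant += 1
        else:
            tied += 1
    return concordant, discordant, tied

# Made-up mini sample: predicted probabilities by gender (0.42 women, 0.32 men).
bought_100_plus = [0.42, 0.42, 0.32]    # two women and one man who spent $100 or more
bought_under_100 = [0.42, 0.32, 0.32]   # one woman and two men who spent less

concordant, discordant, tied = classify_pairs(bought_100_plus, bought_under_100)
print(concordant, discordant, tied)
```

The three counts always add up to the number of buyers times the number of non-buyers, which is the total number of pairs compared.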
Model: Concordant, Discordant, and Tied Pairs
Multiple Logistic Regression
Objectives
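Backward Elimination Method
[Figure: backward elimination. Full model: Purchase predicted from Gender, Income, and Age. Reduced model: Purchase predicted from the predictors that remain, shown as "? ?".]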
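The backward elimination loop can be sketched without committing to a particular statistics package. Everything in the sketch is an assumption made for illustration: the removal rule (drop the predictor with the largest p-value while it exceeds a significance level of 0.05), the callable fit_pvalues that stands in for refitting the model, and the made-up p-values.

```python
def backward_eliminate(predictors, fit_pvalues, alpha=0.05):
    """Start from the full model and drop the weakest predictor each round.

    fit_pvalues is a hypothetical callable: given a list of predictor
    names it returns a dict mapping each name to its p-value in the
    model refitted with exactly those predictors.
    """
    current = list(predictors)
    while current:
        pvalues = fit_pvalues(current)
        worst = max(current, key=lambda name: pvalues[name])
        if pvalues[worst] <= alpha:
            break  # every remaining predictor meets the threshold
        current.remove(worst)
    return current

def fake_fit_pvalues(names):
    """Stand-in for real model fitting; the p-values are made up."""
    made_up = {"Gender": 0.01, "Income": 0.03, "Age": 0.40}
    return {name: made_up[name] for name in names}

reduced = backward_eliminate(["Gender", "Income", "Age"], fake_fit_pvalues)
print(reduced)
```

In real use the p-values change each time the model is refitted, which is why the callable is invoked inside the loop and not once at the start.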
Adjusted Odds Ratio
[Figure: adjusted odds ratio. Predictor: Gender. Outcome: Purchase. Controlling for: Income and Age.]
Multiple Logistic Regression with Interactions
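Backward Elimination Method
[Figure: backward elimination. The full model predicts Purchase from Gender, Income, Age, and further terms shown only as dots. The reduced model keeps the terms that remain, shown as "? ?".]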
Comparing Models
Statistic                          Gender + Income     Gender + Income + (Gender * Income)
AIC                                44.257              39.260
SIC                                60.251              63.657
-2 Log likelihood                  36.257              27.260
Concordant                         54.0%               54.8%
Discordant                         29.4%               28.6%
Ties                               16.6%               16.6%
Somers' D                          0.246               0.261
Goodman and Kruskal's Gamma        0.295               0.314
Kendall's Tau-a                    0.116               0.123
Concordance Index c                0.623               0.631
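Several rows of the comparison are linked by simple formulas, which the sketch below applies to the reported values. Somers' D, Gamma, and c are computed from the concordant, discordant, and tied percentages. AIC is computed as -2 Log likelihood plus twice the number of parameters, where the parameter counts 4 and 6 are assumptions (an intercept, one gender term, two income terms, and two interaction terms for a three-level income variable). Small differences from the table can arise because the percentages are rounded to one decimal place.

```python
def rank_statistics(pct_concordant, pct_discordant, pct_tied):
    """Somers' D, Goodman and Kruskal's Gamma, and c from pair percentages."""
    conc = pct_concordant / 100.0
    disc = pct_discordant / 100.0
    tied = pct_tied / 100.0
    somers_d = conc - disc
    gamma = (conc - disc) / (conc + disc)
    c_index = conc + 0.5 * tied
    return somers_d, gamma, c_index

def aic(neg2_log_likelihood, n_parameters):
    """AIC = -2 Log likelihood + 2 * (number of estimated parameters)."""
    return neg2_log_likelihood + 2 * n_parameters

# Values copied from the table; parameter counts are assumed.
main_effects = rank_statistics(54.0, 29.4, 16.6)
with_interaction = rank_statistics(54.8, 28.6, 16.6)

print("Gender + Income:", [round(v, 3) for v in main_effects], round(aic(36.257, 4), 3))
print("With interaction:", [round(v, 3) for v in with_interaction], round(aic(27.260, 6), 3))
```

Lower AIC favors the model with the interaction term, while SIC, which penalizes extra parameters more heavily, favors the simpler model in this table.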
Graph Plot