Logistic Regression
[Figure: model specification, model-data fit, model evaluation, model-research fit]
Any statistical technique is as
strong as the individual using it,
the research method used, and the
theory and research questions on
which it is based (Leclair, 1981)
Getting to know LogReg
• Modeling approach to predict a binary (dichotomous/dichotomized) outcome (DV) based on continuous or categorical IV(s)
• Not the same as Logit Regression
An illustration
Support system: 1 = Adequate, 0 = Inadequate
LogReg v LinReg
• Except for binary DV, other “things” are the same
– Could be simple or multiple
– Presence of constant, coefficients, R-square
• Because of binary DV, other “things” are different
– Assumptions
– Meaning of constant, coefficients, R-square, etc.
Why Not LinReg
• Non-normal distribution
– Hard to approximate normality with only 2 possible values of DV (1 or 0)
• “Hetero”scedasticity
– Variance of DV is not constant across IVs
– Prediction errors are likely non-normal; usual significance tests yield biased results
• DV side is nominal (0 or 1) while the IV side is unconstrained (any real value)
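Two of the problems listed above can be seen numerically. The following is a minimal Python sketch, assuming a hypothetical straight-line model whose intercept and slope are made-up values (they do not come from these slides). It shows that the variance of a 0/1 outcome, p(1 - p), changes with p, and that a straight line can return "probabilities" outside the 0-1 range.

```python
# Illustrative only: the intercept and slope below are made-up values.

def bernoulli_variance(p):
    """Variance of a 0/1 outcome whose probability of 1 is p."""
    return p * (1 - p)

def linear_prediction(x, intercept=-0.2, slope=0.15):
    """A straight-line (LinReg-style) prediction for a 0/1 DV."""
    return intercept + slope * x

# The variance depends on p, so it cannot be constant across IV values.
for p in (0.1, 0.5, 0.9):
    print(f"p = {p:.1f}  variance = {bernoulli_variance(p):.2f}")

# A straight line is not confined to the 0-1 range.
for x in (0, 4, 10):
    print(f"x = {x:2d}  linear prediction = {linear_prediction(x):.2f}")
```

The variance is largest at p = 0.5 and shrinks toward 0 or 1, which is the non-constant variance described in the second bullet.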
A Little Bit of Math
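Probabilities:
$$p(1) = \hat{Y}, \qquad p(0) = 1 - \hat{Y}$$
Odds (ratio of p(1) to p(0)):
$$\frac{p(1)}{p(0)} = \frac{\hat{Y}}{1 - \hat{Y}}$$
Logit:
$$\ln\frac{\hat{Y}}{1 - \hat{Y}}$$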
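The three quantities on this slide (probability, odds, logit) are simple transformations of one another. A minimal Python sketch, with function names chosen here for illustration:

```python
import math

def odds(p):
    """Odds of the outcome: p(1) / p(0) = p / (1 - p)."""
    return p / (1 - p)

def logit(p):
    """Logit: natural log of the odds."""
    return math.log(odds(p))

# Probabilities below 0.5 give negative logits; above 0.5, positive logits.
for p in (0.2, 0.5, 0.8):
    print(f"p = {p:.1f}  odds = {odds(p):.2f}  logit = {logit(p):+.3f}")
```

Probabilities are confined to 0-1 and odds to positive values, but the logit can take any real value, which is why it can be modeled as a linear function of the IVs.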
A Little More: Simple LogReg
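$$\ln(\text{odds}) = \ln\frac{\hat{Y}}{1 - \hat{Y}} = \beta_0 + \beta_1 X_1 \qquad \text{(Eqn 1)}$$
$$\frac{\hat{Y}}{1 - \hat{Y}} = e^{\beta_0 + \beta_1 X_1} \qquad \text{(Eqn 2)}$$
$$\hat{Y} = \frac{e^{\beta_0 + \beta_1 X_1}}{1 + e^{\beta_0 + \beta_1 X_1}} \qquad \text{(Eqn 3)}$$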
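Eqn 1 and Eqn 3 describe the same model on two scales: the log-odds and the probability. A minimal Python sketch of the round trip, assuming hypothetical coefficient values (b0 = -1.0, b1 = 0.5) chosen only for illustration:

```python
import math

def predicted_probability(x1, b0, b1):
    """Eqn 3: Y-hat = e^(b0 + b1*x1) / (1 + e^(b0 + b1*x1))."""
    z = b0 + b1 * x1
    return math.exp(z) / (1 + math.exp(z))

def log_odds(p):
    """Left-hand side of Eqn 1: ln(p / (1 - p))."""
    return math.log(p / (1 - p))

b0, b1 = -1.0, 0.5  # hypothetical coefficients, not estimates from any data

for x1 in (0, 2, 4):
    p = predicted_probability(x1, b0, b1)
    # Taking the log-odds of Y-hat recovers the linear predictor b0 + b1*x1.
    print(f"x1 = {x1}  Y-hat = {p:.3f}  ln(odds) = {log_odds(p):+.3f}")
```

The log-odds column rises by b1 for every one-unit increase in X1, while the probability column rises by unequal amounts.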
A Little Bit More: Multiple LogReg
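$$\ln(\text{odds}) = \ln\frac{\hat{Y}}{1 - \hat{Y}} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n \qquad \text{(Eqn 4)}$$
$$\frac{\hat{Y}}{1 - \hat{Y}} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n} \qquad \text{(Eqn 5)}$$
$$\hat{Y} = \frac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n}} \qquad \text{(Eqn 6)}$$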
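With several IVs the only change is that the linear predictor becomes a sum over all predictors. A minimal Python sketch of Eqn 6, assuming hypothetical coefficients and IV values; it uses the algebraically equivalent form 1 / (1 + e^(-z)):

```python
import math

def predicted_probability_multiple(xs, b0, bs):
    """Eqn 6 for n predictors, written as 1 / (1 + e^(-z)).

    xs: IV values X1..Xn; b0: constant; bs: coefficients for X1..Xn.
    """
    if len(xs) != len(bs):
        raise ValueError("need exactly one coefficient per predictor")
    z = b0 + sum(b * x for b, x in zip(bs, xs))
    return 1 / (1 + math.exp(-z))

# Hypothetical values, for illustration only.
b0, bs = 0.0, [0.5, -0.25]
print(round(predicted_probability_multiple([1.0, 2.0], b0, bs), 3))
print(round(predicted_probability_multiple([3.0, 2.0], b0, bs), 3))
```

Dividing the numerator and denominator of Eqn 6 by the exponential term gives the form used in the function, so both return the same value.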
Things to Bear in Mind
• Common assumptions
– Adequate N (otherwise the solution may fail to converge)
– No multicollinearity
• Specific assumptions
– Categorical DV
– Linear logit
Hypothesis in LogReg
Eqn. 1 or 4
Eqn. 2 or 5
Eqn. 3 or 6
Getting Real with
Coaching: 1 = With, 0 = Without
Questions to be addressed
1. Do the self-efficacy score, GPA, and coaching
among examinees significantly predict the
likelihood of passing the licensure exam?
$$\ln\frac{p(\mathrm{LEO}=1)}{1 - p(\mathrm{LEO}=1)} = \beta_0 + \beta_1\,\mathrm{SE} + \beta_2\,\mathrm{GPA} + \beta_3\,\mathrm{COA}$$
• Chi-square value to be reported (df=3)
• 54-72% of the variability in the likelihood of passing the LEO is explained by the model
• Data-model fit is something to think about (?)
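The model chi-square and the "variability explained" figures are typically derived from the log-likelihoods of the constant-only and fitted models. The sketch below assumes that the 54-72% range refers to two pseudo-R-square statistics such as Cox & Snell and Nagelkerke (the slides do not name them). The sample size and log-likelihood values are hypothetical, chosen only to make the code run.

```python
import math

def model_chi_square(ll_null, ll_model):
    """Likelihood-ratio chi-square: 2 * (LL_model - LL_null).

    Its df equals the number of predictors added (3 for SE, GPA, COA).
    """
    return 2 * (ll_model - ll_null)

def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell pseudo-R-square: 1 - exp(-chi_square / n)."""
    return 1 - math.exp(-model_chi_square(ll_null, ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Cox & Snell rescaled by its maximum attainable value."""
    max_r2 = 1 - math.exp(2 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2

# Hypothetical inputs, NOT taken from the output shown on the slide.
n, ll_null, ll_model = 100, -69.3, -40.0
print(f"chi-square  = {model_chi_square(ll_null, ll_model):.1f}")
print(f"Cox & Snell = {cox_snell_r2(ll_null, ll_model, n):.3f}")
print(f"Nagelkerke  = {nagelkerke_r2(ll_null, ll_model, n):.3f}")
```

Because Cox & Snell cannot reach 1, Nagelkerke is always the larger of the two, which is one reason output often reports a range rather than a single value.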
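$$\ln\frac{p(\mathrm{LEO}=1)}{1 - p(\mathrm{LEO}=1)} = (\pm)\,4.82 \;(\pm)\; 0.96\,\mathrm{SE} \;(\pm)\; 1.72\,\mathrm{GPA} \;(\pm)\; 0.413\,\mathrm{COA}$$
(the signs of the terms are not legible in the source)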
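A fitted equation of this form can be turned into a predicted probability of passing for a given examinee, and each coefficient can be read as an odds ratio by exponentiating it. The sketch below is a minimal illustration: the coefficient values passed in are hypothetical placeholders rather than the estimates from the slide, and the function names are chosen here for illustration.

```python
import math

def pass_probability(se, gpa, coa, b0, b_se, b_gpa, b_coa):
    """Predicted p(LEO = 1) from the logit b0 + b_se*SE + b_gpa*GPA + b_coa*COA."""
    z = b0 + b_se * se + b_gpa * gpa + b_coa * coa
    return 1 / (1 + math.exp(-z))

def odds_ratio(b):
    """Multiplicative change in the odds for a one-unit increase in the IV."""
    return math.exp(b)

# Hypothetical coefficients and scores, for illustration only.
coefs = dict(b0=-5.0, b_se=1.0, b_gpa=1.5, b_coa=0.4)

without_coaching = pass_probability(se=2.0, gpa=2.5, coa=0, **coefs)
with_coaching = pass_probability(se=2.0, gpa=2.5, coa=1, **coefs)

print(f"p(pass) without coaching = {without_coaching:.3f}")
print(f"p(pass) with coaching    = {with_coaching:.3f}")
print(f"odds ratio for coaching  = {odds_ratio(coefs['b_coa']):.3f}")
```

An odds ratio above 1 means the IV raises the odds of passing, below 1 means it lowers them, and exactly 1 means no effect.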
So what now?