Logistic Regression

Uploaded by

Chenelly Alcasid
Copyright
© © All Rights Reserved
LOGISTIC REGRESSION

Predicting Binary Outcomes


JOHN STEPHEN CURADA, PhD
ADDU_SBG
Scope
• Getting to know LogReg
• Some fundamentals:
Probabilities, odds & logits
• Form & substance of
LogReg models
• Getting real with SPSS
• SPSS experience
A Guide to Modeling

[Diagram: model-research fit; model evaluation; model-data fit; model specification]
Any statistical technique is as
strong as the individual using it,
the research method used, and the
theory and research questions
on which it is based (Leclair, 1981)
Getting to know LogReg
• Modeling approach to
predict a binary
(dichotomous/dichotomized)
outcome (DV) based on
continuous or
categorical IV(s)
• Not the same as Logit
Regression
An illustration

Gender (1=Male; 0=Female) → Stand on abortion (1=Yes; 0=No)
Another illustration
Prod. know. (7-point Likert scale)
System know. (7-point Likert scale)
Support system (1=Adequate; 0=Inadequate)
→ Franchise Outlet Perf. (1=Succeeded; 0=Failed)
LogReg v LinReg
• Except for binary DV, other “things” are the
same
– Could be simple or multiple
– Presence of constant, coefficients, R-square
• Because of binary DV, other “things” are
different
– Assumptions
– Meaning of constant, coefficients, R-square, etc
Why Not LinReg
• Non-normal distribution
– Hard to approximate normality with only 2
possible values of DV (1 or 0)
• “Hetero”scedasticity
– Variance of DV is not constant across IVs
– Prediction errors are likely non-normal; usual
significance tests yield biased results
• DV side is nominal (1 or 0) while the IV side is unconstrained,
so LinReg predictions can fall below 0 or above 1
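The last point can be illustrated with a small sketch. The data below are hypothetical (not from the slides), and the fit is a plain one-predictor least-squares line applied to a 1/0 outcome, only to show that its fitted values are not constrained to the 0 to 1 range.

```python
# Hypothetical toy data: x is a predictor, y is a binary DV (1 or 0).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares for a single predictor (closed form).
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
intercept = mean_y - slope * mean_x

# Fitted values from the straight line.
fitted = [intercept + slope * xi for xi in x]
print([round(f, 3) for f in fitted])

# The smallest fitted value is below 0 and the largest is above 1,
# so the fitted values cannot be read as probabilities.
print(min(fitted) < 0, max(fitted) > 1)
```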
A Little Bit of Math

Probabilities:
$$p(1) = \hat{Y}, \qquad p(0) = 1 - \hat{Y}$$

Odds:
$$\frac{\hat{Y}}{1 - \hat{Y}}$$

Logit:
$$\ln\left(\frac{\hat{Y}}{1 - \hat{Y}}\right)$$
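A minimal sketch of how the three quantities relate, assuming a probability strictly between 0 and 1. The function names and the value 0.8 are illustrative choices, not part of the slides.

```python
import math

def odds(p):
    """Odds of the event: p / (1 - p), for 0 < p < 1."""
    return p / (1 - p)

def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))

def inverse_logit(z):
    """Map a logit back to a probability."""
    return 1 / (1 + math.exp(-z))

p = 0.8  # hypothetical probability that Y = 1
print(odds(p), logit(p), inverse_logit(logit(p)))
```

A probability of 0.5 gives odds of 1 and a logit of 0, which is why a positive logit corresponds to an outcome that is more likely than not.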
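A Little More: Simple LogReg

$$\ln(\text{odds}) = \ln\left(\frac{\hat{Y}}{1 - \hat{Y}}\right) = \beta_0 + \beta_1 X_1 \qquad \text{(Eqn 1)}$$

$$\frac{\hat{Y}}{1 - \hat{Y}} = e^{\beta_0 + \beta_1 X_1} \qquad \text{(Eqn 2)}$$

$$\hat{Y} = \frac{e^{\beta_0 + \beta_1 X_1}}{1 + e^{\beta_0 + \beta_1 X_1}} \qquad \text{(Eqn 3)}$$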
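The algebra connecting the three equations of the simple model can be written out step by step. This is a standard derivation, shown here only to make the intermediate steps explicit.

```latex
\begin{align*}
\ln\!\left(\frac{\hat{Y}}{1-\hat{Y}}\right) &= \beta_0 + \beta_1 X_1
  && \text{(Eqn 1)} \\
\frac{\hat{Y}}{1-\hat{Y}} &= e^{\beta_0 + \beta_1 X_1}
  && \text{(Eqn 2: exponentiate both sides)} \\
\hat{Y} &= (1-\hat{Y})\, e^{\beta_0 + \beta_1 X_1}
  && \text{(multiply by } 1-\hat{Y}\text{)} \\
\hat{Y}\left(1 + e^{\beta_0 + \beta_1 X_1}\right) &= e^{\beta_0 + \beta_1 X_1}
  && \text{(collect the } \hat{Y} \text{ terms)} \\
\hat{Y} &= \frac{e^{\beta_0 + \beta_1 X_1}}{1 + e^{\beta_0 + \beta_1 X_1}}
  && \text{(Eqn 3)}
\end{align*}
```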
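A Little Bit More: Multiple LogReg

$$\ln(\text{odds}) = \ln\left(\frac{\hat{Y}}{1 - \hat{Y}}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n \qquad \text{(Eqn 4)}$$

$$\frac{\hat{Y}}{1 - \hat{Y}} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n} \qquad \text{(Eqn 5)}$$

$$\hat{Y} = \frac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n}} \qquad \text{(Eqn 6)}$$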
Things to Bear in Mind

• Common assumptions
– Adequate N (lest non-
convergence of solution)
– No multicollinearity
• Specific assumptions
– Categorical DV
– Linear logit
Hypothesis in LogReg

H0: All βs in the model are equal to 0

H1: At least one β in the model ≠ 0;
that is, the FULL MODEL is better
than the NULL MODEL in
predicting that DV=1
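One common way to compare the full model with the null model is a likelihood-ratio chi-square: the difference between the two models' -2 log-likelihood values. The sketch below assumes that approach. The outcomes and the full-model probabilities are hypothetical, and the critical value corresponds to df = 3 (three predictors) at alpha = .05.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y given predicted probabilities p."""
    return sum(
        yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, p)
    )

# Hypothetical outcomes and hypothetical full-model predicted probabilities.
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p_full = [0.9, 0.9, 0.8, 0.8, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1]

# Null model: intercept only, so every case gets the observed proportion of 1s.
p_null = [sum(y) / len(y)] * len(y)

# Model chi-square = (-2LL of the null model) - (-2LL of the full model).
chi_square = 2 * (log_likelihood(y, p_full) - log_likelihood(y, p_null))

CRITICAL_DF3 = 7.815  # chi-square critical value for df = 3 at alpha = .05
print(round(chi_square, 3), chi_square > CRITICAL_DF3)
```

If the statistic exceeds the critical value, H0 (all βs equal to 0) is rejected in favor of the full model.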
A Process in Performing LogReg

Eqn. 1 or 4

Eqn. 2 or 5

Eqn. 3 or 6
Getting Real with

Self-efficacy (7-point Likert scale)
GPA (1=85% & up; 0=84% & below)
Coaching (1=With; 0=Without)
→ LEO (1=Pass; 0=Fail)
Questions to be addressed
1. Do the self-efficacy score, GPA, and coaching
among examinees significantly predict the
likelihood of passing the licensure exam?

2. Which among self-efficacy score, GPA, and
coaching significantly predict the likelihood that
they will pass the licensure exam?
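The conceptual model

$$\ln\left(\frac{p(\mathrm{LEO}=1)}{1 - p(\mathrm{LEO}=1)}\right) = \beta_0 + \beta_1\,\mathrm{SE} + \beta_2\,\mathrm{GPA} + \beta_3\,\mathrm{COA}$$

• Left-hand side: natural log of the odds
• β0: Constant
• β1: B for Self-efficacy (SE)
• β2: B for GPA
• β3: B for Coaching (COA)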
• % of N that passed as predicted; we hope to improve later
• Full model is sig. better than the Null model (Block 0)
• Chi-square value to be reported (df=3)
• 54-72% of the variability in the likelihood of passing the LEO is explained by the model
• Data-model fit is something to think about (?)
• Specificity (true negatives)
• Sensitivity (true positives)
• An improvement from the 51% prediction by the Null model (PAC)
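The classification measures named above can be computed from the four cells of a classification table. The counts below are hypothetical, and PAC is taken here to mean the overall percentage of cases classified correctly, which is an assumption about the slide's abbreviation.

```python
# Hypothetical classification table counts (observed vs. predicted).
true_pos = 40   # observed 1, predicted 1
false_neg = 10  # observed 1, predicted 0
true_neg = 35   # observed 0, predicted 0
false_pos = 15  # observed 0, predicted 1

# Sensitivity: share of observed 1s that the model predicts as 1.
sensitivity = true_pos / (true_pos + false_neg)

# Specificity: share of observed 0s that the model predicts as 0.
specificity = true_neg / (true_neg + false_pos)

# Overall accuracy (assumed meaning of PAC): share of all cases predicted correctly.
pac = (true_pos + true_neg) / (true_pos + false_neg + true_neg + false_pos)

print(sensitivity, specificity, pac)
```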
• Is the relationship + or -?
• What’s the likelihood of those coded 1/high?
• Are the IVs sig. predictors of p(LEO=1)?
• Not sig. if the interval includes 1
• GPA: Grade pt. ave.
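The "interval includes 1" check can be sketched as follows, assuming the interval in question is an approximate 95% Wald interval for the exponentiated coefficient. The coefficient and standard error are hypothetical and are not taken from the SPSS output.

```python
import math

def odds_ratio_with_ci(b, se, z=1.96):
    """Return exp(b) and an approximate 95% Wald interval, exp(b -/+ z * se)."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Hypothetical coefficient (B) and standard error for one predictor.
exp_b, lower, upper = odds_ratio_with_ci(b=0.50, se=0.40)

# If the interval contains 1, the predictor is not significant at that level.
includes_one = lower <= 1 <= upper
print(round(exp_b, 3), round(lower, 3), round(upper, 3), includes_one)
```

An exponentiated coefficient above 1 means higher odds of DV=1 for those coded 1/high, and a value below 1 means lower odds.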
The empirical model

ln[ p(LEO=1) / (1 - p(LEO=1)) ] = 4.82 [+/-] 0.96*SE [+/-] 1.72*GPA [+/-] 0.413*COA
So what now?

SE=4.57; GPA=82.60; COA=1


p(LEO=1)=?
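Answering this question means plugging the predictor values into the fitted logit and converting it to a probability with Eqn 6. In the sketch below the predictor values come from the slide, but the constant and coefficients are HYPOTHETICAL placeholders chosen only for illustration. The function name is also an assumption. Substitute the constant and B values from the actual SPSS output.

```python
import math

def predicted_probability(intercept, coefficients, values):
    """Eqn 6: p = e^z / (1 + e^z), where z is the linear predictor (the logit)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return z, math.exp(z) / (1 + math.exp(z))

# Predictor values from the slide, in the order SE, GPA, COA.
values = [4.57, 82.60, 1]

# HYPOTHETICAL constant and coefficients, for illustration only.
intercept = -10.0
coefficients = [0.5, 0.1, 0.4]

z, p = predicted_probability(intercept, coefficients, values)
print(round(z, 3), round(p, 3))
```

A predicted probability above the chosen cut-off (commonly 0.5) would classify the examinee as a predicted pass.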
Evaluating Model “Soundness”

• Overall model evaluation


• Goodness-of-fit statistics
• Statistical tests of
individual predictors
• Validations of predicted
probabilities
Fit vs. Parsimony
• Well-fitting models are preferable to poorly fitting ones

• When it comes to parameters, all other things being equal,
LESS IS MORE

• “...Impossible to define one best way to combine measures
of complexity and measures of badness-of-fit in a single
numerical index, because the precise nature of the best
numerical trade-off between complexity and fit is, to some
extent, a matter of personal taste” (Steiger, 1990, as cited in
Arbuckle, 2006)
All models are wrong... some
are useful...(George Box)
