Logistic Regression
Multivariate Data
Analysis in Research
using SPSS
Logistic Regression Analysis
O This is used to predict a categorical dependent
variable from a set of continuous and/or
dichotomous predictors.
O The objective is to predict and explain the bases
for each object’s group membership through a
set of independent variables selected by the
researcher.
Some Variations
O Binary Logistic Regression – used when the
dependent variable is binary (dichotomous).
O Multinomial Logistic Regression – used when
the dependent variable has more than two
unordered categories.
O Ordinal Logistic Regression – used when the
categories of the dependent variable are ranked
in increasing or decreasing order.
Some Applications of Logistic
Regression Analysis
O In marketing and finance, this can be used to
predict the success or failure of a new product
or determine the category of credit risk for a
person. This may also be used to predict if a
firm will be successful.
O In education, it can help administrators
decide whether a student should be admitted to
graduate school, classify students by
vocational interest, etc.
General Goals
O Determine the effects of the independent
variables on the probability of group membership.
O Attain the highest predictive accuracy possible
with a given set of predictor variables.
Some Important Terms
O Odds (success) – the ratio of the probability of
success, P, to the probability of failure, 1 – P.
Suppose that 80 students (50 female and 30 male)
took an achievement test, and the results show that
40 females and 25 males passed. The pass rates are
40/50 = 0.80 for females and 25/30 ≈ 0.833 for males,
so the odds are:
\[ \text{Odds}(\text{success}, \text{female}) = \frac{0.80}{1 - 0.80} = 4 \]
\[ \text{Odds}(\text{success}, \text{male}) = \frac{0.833}{1 - 0.833} \approx 5 \]
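The odds calculation can be checked with a few lines of Python. This is only a sketch: the function name `odds` is hypothetical, and the counts are the ones from the achievement-test example.

```python
def odds(successes: int, total: int) -> float:
    """Return the odds of success, P / (1 - P), where P = successes / total."""
    p = successes / total
    if p == 1:
        raise ValueError("odds are undefined when every case is a success")
    return p / (1 - p)


female_odds = odds(40, 50)  # 40 of the 50 female students passed
male_odds = odds(25, 30)    # 25 of the 30 male students passed

# Rounded for display; floating-point arithmetic gives values very close to 4 and 5.
print(round(female_odds, 2), round(male_odds, 2))
```

Odds of 4 mean that, among the female students, four passed for every one who failed.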
Some Important Terms
O Odds Ratio (OR). Given the odds of success for males
and females, the odds ratio for success is:
\[ OR = \frac{\text{Odds}(\text{success}, \text{male})}{\text{Odds}(\text{success}, \text{female})} = \frac{5}{4} = 1.25 \]
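The same odds ratio can be computed directly from the pass/fail counts. The sketch below assumes the counts from the achievement-test example (25 of 30 males and 40 of 50 females passed); the function name is hypothetical.

```python
def odds_ratio_from_counts(pass_a: int, fail_a: int, pass_b: int, fail_b: int) -> float:
    """Odds ratio of group A relative to group B, from pass/fail counts."""
    odds_a = pass_a / fail_a  # odds of success in group A
    odds_b = pass_b / fail_b  # odds of success in group B (the reference group)
    return odds_a / odds_b


# Males: 25 passed, 5 failed. Females: 40 passed, 10 failed.
or_male_vs_female = odds_ratio_from_counts(25, 5, 40, 10)
print(or_male_vs_female)  # 1.25
```

An odds ratio of 1.25 says the odds of passing are 1.25 times as high for males as for females in this example. A value of 1 would mean equal odds.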
\[ \ln(\text{odds}) = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki} \]
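Because the model is linear in the log odds, a predicted log odds has to be converted back to a probability using P = 1 / (1 + e^(-ln(odds))). The sketch below illustrates that conversion; the function names are hypothetical and not part of SPSS.

```python
import math


def log_odds(intercept: float, coefficients: list, values: list) -> float:
    """Linear predictor: b0 + b1*X1 + ... + bk*Xk."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))


def probability_from_log_odds(z: float) -> float:
    """Invert the logit: P = 1 / (1 + exp(-z))."""
    return 1 / (1 + math.exp(-z))


# Odds of 4 (the female students in the earlier example) correspond to P = 0.80.
p_female = probability_from_log_odds(math.log(4))
print(round(p_female, 2))
```

A log odds of 0 corresponds to odds of 1 and therefore to a probability of 0.5.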
Binary Logistic Regression
Analysis provides…
O Predicted category membership of each case
O Probability of membership
O Classification table
O Ordering of the relative importance or impact of
the predictor variables
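As a rough illustration of how predicted membership and a classification table follow from the predicted probabilities, consider the sketch below. The observed outcomes and probabilities are made up for illustration, and the 0.5 cutoff is an assumption (a commonly used default), not a requirement.

```python
def classify(probabilities, cutoff=0.5):
    """Assign category 1 when the predicted probability reaches the cutoff."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def classification_table(observed, predicted):
    """Count cases for each (observed, predicted) pair of categories."""
    table = {(o, p): 0 for o in (0, 1) for p in (0, 1)}
    for o, p in zip(observed, predicted):
        table[(o, p)] += 1
    return table


def accuracy(table):
    """Share of cases on the diagonal (correctly classified)."""
    correct = table[(0, 0)] + table[(1, 1)]
    return correct / sum(table.values())


# Hypothetical data: six cases with observed groups and model probabilities.
observed = [0, 0, 1, 1, 0, 1]
probabilities = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]

predicted = classify(probabilities)
table = classification_table(observed, predicted)
print(predicted)
print(table)
print(round(accuracy(table), 3))
```

The diagonal cells of the table, (0, 0) and (1, 1), hold the correctly classified cases, so the hit rate is their sum divided by the number of cases.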
Requirements for Binary Logistic
Regression Analysis
O The predictors must be interval, ratio, or
dichotomous categorical variables.
O The relationship between the predictors and the
log odds must be linear, and the model must
include only relevant predictors.
O The expected value of the error term is zero.
O There is no correlation between the error and
the predictors.
O There is an absence of perfect multi-collinearity
between the predictors.
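One simple screen for the multicollinearity requirement is to inspect pairwise correlations between predictors, since a correlation of +1 or -1 means one predictor is an exact linear function of the other. The sketch below uses made-up predictor values. A pairwise check is only a partial test, because perfect collinearity can also involve three or more predictors at once.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical predictors: x2 is exactly 2 * x1, so the pair is perfectly collinear.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 6, 8, 10]
x3 = [5, 3, 4, 1, 2]

print(round(pearson(x1, x2), 3))  # perfectly collinear pair
print(round(pearson(x1, x3), 3))  # correlated, but not perfectly
```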
Sample Size Consideration
O The recommended sample size for each group
in the dependent variable is at least ten (10)
observations per estimated parameter (Hair,
2010).
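The guideline amounts to simple arithmetic, sketched below. The group sizes are hypothetical, and counting the intercept as an estimated parameter is an assumption made here for illustration.

```python
def min_group_size(n_parameters: int, per_parameter: int = 10) -> int:
    """Smallest recommended group size: ten observations per estimated parameter."""
    return per_parameter * n_parameters


def meets_guideline(group_sizes: dict, n_parameters: int) -> bool:
    """True when every group of the dependent variable is large enough."""
    needed = min_group_size(n_parameters)
    return all(n >= needed for n in group_sizes.values())


# One predictor plus an intercept gives two estimated parameters (assumption).
print(min_group_size(2))                                      # 20
print(meets_guideline({"group_0": 45, "group_1": 18}, 2))     # False: 18 < 20
```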
Binary Logistic Regression
Analysis with one predictor
O Using birthweight.sav data
O Is age significantly related to birthweight of babies?
O Dependent variable: birthweight of baby (0 –
normal, 1 – low birthweight)
Model:
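With age as the only predictor, the model presumably takes the form ln(odds of low birthweight) = b0 + b1(age). SPSS estimates b0 and b1 by maximum likelihood. The sketch below shows the idea with a plain Newton-Raphson loop in Python. The ages and outcomes are invented for illustration and are not taken from birthweight.sav, and the function name is hypothetical.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit ln(odds) = b0 + b1*x by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [1 / (1 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to b0 and b1.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), with weights p * (1 - p).
        w = [pi * (1 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system for the update to (b0, b1).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: age and birthweight status (1 = low, 0 = normal).
ages = [18, 20, 22, 24, 26, 28, 30, 32]
low_bw = [1, 1, 0, 1, 0, 0, 1, 0]

b0, b1 = fit_logistic(ages, low_bw)
print(b0, b1, math.exp(b1))  # exp(b1) is the odds ratio per one-year increase in age
```

In these invented data the low-birthweight cases are younger on average, so the fitted b1 is negative and exp(b1) is below 1. Whether age is significantly related to birthweight in the real data has to be judged from the SPSS output.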
       Q1      Q2      Q3      Q4      Q5      Q6      Q7      Q8      Q9      Q10
Q1     1
Q2     .387**  1
Q3     .340**  .500**  1
Q4     .341**  .301**  .397**  1
Q5     .273**  .319**  .223**  .193**  1
Q6     .274**  .203**  .199**  .188**  .456**  1
Q7     .318**  .341**  .339**  .260**  .537**  .514**  1
Q8     .186**  .375**  .203**  .180*   .305**  .259**  .402**  1
Q9     .284**  .320**  .224**  .222**  .296**  .322**  .365**  .380**  1
Q10    .282**  .322**  .271**  .189**  .348**  .427**  .461**  .371**  .479**  1
Meritorious!
Significant since p < 0.05
Communalities
What are communalities?
O A variable’s communality is the estimate of its
shared or common variance among the
variables as represented by the derived factors.
Common variance is accounted for based on a
variable’s correlation with all other variables in
the analysis.
O Variables with communalities less than 0.5 are
considered not to have acceptable levels of
explanation.
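For orthogonal factors, a variable's communality is computed as the sum of its squared loadings on the derived factors. The sketch below applies the 0.5 rule of thumb. The loadings are made up for illustration and do not come from any output in this deck.

```python
def communality(loadings):
    """Sum of squared factor loadings for one variable."""
    return sum(value * value for value in loadings)


# Hypothetical loadings of two variables on two derived factors.
factor_loadings = {
    "Q1": [0.70, 0.20],
    "Q2": [0.30, 0.40],
}

communalities = {name: communality(l) for name, l in factor_loadings.items()}
low = [name for name, h2 in communalities.items() if h2 < 0.5]

print({name: round(h2, 2) for name, h2 in communalities.items()})
print(low)  # variables falling below the 0.5 threshold
```

Here Q1 has a communality of about 0.53 and passes, while Q2 has about 0.25 and would be flagged.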
Eigenvalues and Total Variance