
Multinomial Logistic Regression

Getu Degu (PhD)

March 2013
1
Multinomial Logistic Regression
• Multinomial logistic regression is used to analyze relationships between a
non-metric dependent variable and metric or dichotomous independent
variables.

• It is usually used to determine factors that affect the presence or absence
of a characteristic when the dependent variable has three or more levels.

• Multinomial logistic regression compares multiple groups through a
combination of binary logistic regressions.

• The group comparisons are equivalent to the comparisons for a dummy-
coded dependent variable, with the group with the highest numeric
score used as the default reference group.

• For example, if we wanted to study differences among BSc, MSc, and PhD
students using multinomial logistic regression, the analysis would
compare BSc students to PhD students and MSc students to PhD students.
For each independent variable, there would be two comparisons.

2
What multinomial logistic regression predicts

• Multinomial logistic regression provides a set of coefficients for
each of the two comparisons. The coefficients for the reference
group are all zeros, similar to the coefficients for the reference
group for a dummy-coded variable.

• Predicted group membership can be compared to actual group
membership to obtain a measure of classification accuracy.

3
Level of measurement requirements
• Multinomial logistic regression analysis requires that the
dependent variable be non-metric (nominal).

• Multinomial logistic regression analysis requires that the
independent variables be metric or dichotomous. Since SPSS
automatically dummy-codes nominal-level variables, they can
also be included; they will be dichotomized in the analysis.

• In SPSS, non-metric independent variables are included as
"factors." SPSS will dummy-code non-metric IVs.

• In SPSS, metric independent variables are included as
"covariates."

4
Procedure
• In the dialog box, you select one dependent variable and your
independent variables, which may be factors or covariates. Some of
the submenus are given below:

• Model: By default, a main-effects model is fitted. In this submenu, you can
specify a custom model or a variable selection method.

• Statistics: In this submenu, you can request many statistics, including the
classification table for the model.

• Criteria: This allows you to specify the criteria for the iterations during
model estimation.

• Categorical: Here is where you identify categorical variables and specify
how you want these data compared.

• Save: This allows you to save some variables to the working data file or to
an external data file.

5
Assumptions and outliers
• Multinomial logistic regression does not make any
assumptions of normality, linearity, or
homogeneity of variance for the independent
variables.

• SPSS does not compute any diagnostic statistics for
outliers. To evaluate outliers, the advice is to run
multiple binary logistic regressions and use those
results to test the exclusion of outliers or influential
cases.

6
Sample size requirements
• The minimum number of cases per independent variable is
10, using a guideline provided by Hosmer and Lemeshow,
authors of Applied Logistic Regression, one of the main
resources for logistic regression.

• For the preferred case-to-variable ratio, we will use 20 to 1.

7
Methods for including variables
• The only method for selecting independent
variables in SPSS is simultaneous or direct
entry.

8
Overall test of relationship
• The overall test of the relationship between the independent
variables and the groups defined by the dependent variable is
based on the reduction in the likelihood value between a model
which does not contain any independent variables and the
model that contains the independent variables.

• This difference in -2 log-likelihood follows a chi-square
distribution and is referred to as the model chi-square.

• The significance test for the final model chi-square (after
the independent variables have been added) is our
statistical evidence of the presence of a relationship
between the dependent variable and the combination
of the independent variables.
9
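As an illustrative sketch (Python rather than SPSS), the model chi-square is simply the difference between the two -2 log-likelihood values. The intercept-only figure of 362.033 below is inferred from values reported later in these slides (model chi-square 185.85, final-model -2LL 176.183), so treat it as a worked example rather than actual output:

```python
from scipy.stats import chi2

# -2 log-likelihoods (inferred from this lecture's output; see slides 14 and 23).
neg2ll_null = 362.033   # model with no independent variables (intercept only)
neg2ll_full = 176.183   # model containing the independent variables
model_chi_square = neg2ll_null - neg2ll_full   # 185.85
df = 4                  # 2 predictors x 2 non-reference outcome categories
print(model_chi_square, chi2.sf(model_chi_square, df))   # p < 0.001
```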
Example:
• The data set (file name = brand_new.sav)
contains information on 735 subjects who
were asked their preference on three
brands of some product (e.g., car or TV).

• Included in the data set is
information on the subjects' sex and age.

10
• The outcome variable is brand (coded as
A, B and C).

• The variable Sex is coded as 1 for female
and 2 for male.

• Let's start with some descriptive statistics
of the variables of our interest.

11
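As a minimal sketch of the same analysis in Python, assuming the file's variable names are brand, sex, and age (pandas.read_spss requires the pyreadstat package; this is an illustration, not the lecture's procedure, which uses the SPSS menus):

```python
import pandas as pd
import statsmodels.api as sm

# Load the SPSS data file.
df = pd.read_spss("brand_new.sav")

# Descriptive statistics comparable to the SPSS tables on the next slide.
print(df["brand"].value_counts())
print(df["age"].describe())
print(df["sex"].value_counts())

# Sex is coded 1 = female, 2 = male on the slides; dummy-code it so female = 1.
# (If read_spss returns value labels rather than codes, adjust the comparison.)
female = (df["sex"] == 1).astype(int)

# Multinomial logit; statsmodels treats the first category (here brand A) as the
# reference, matching the reference category used in the slides' output.
y = pd.Categorical(df["brand"], categories=["A", "B", "C"]).codes
X = sm.add_constant(pd.DataFrame({"female": female, "age": df["age"]}))
res = sm.MNLogit(y, X).fit()
print(res.summary())
```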
Brand

                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   A          207        28.2         28.2              28.2
        B          307        41.8         41.8              69.9
        C          221        30.1         30.1             100.0
        Total      735       100.0        100.0

Descriptive Statistics

                       N   Minimum   Maximum    Mean   Std. Deviation
Age                  735        24        38   32.90            2.333
Valid N (listwise)   735

Sex of participants

                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Female     466        63.4         63.4              63.4
        Male       269        36.6         36.6             100.0
        Total      735       100.0        100.0

12
Model Fitting Information

[SPSS "Model Fitting Information" table; the final model chi-square (185.85) is discussed on the next slide.]
13
Model Fitting Information
• The presence of a relationship between the dependent
variable and the combination of independent variables is
based on the statistical significance of the final model chi-
square in the SPSS table titled "Model Fitting Information".

• In this analysis, the probability of the model chi-square
(185.85) was <0.001, much less than the level of
significance of 0.05.

• The null hypothesis that there is no difference between
the model without independent variables and the model
with independent variables is rejected.

• Therefore, the existence of a relationship between the
independent variables and the dependent variable is
supported.
14
Strength of multinomial logistic regression relationship
• While multinomial logistic regression does
compute correlation measures to estimate the
strength of the relationship (pseudo R-square
measures, such as Nagelkerke's R²), these
correlation measures do not really tell us much
about the accuracy or errors associated with the
model.

• A more useful measure for assessing the utility of a
multinomial logistic regression model is
classification accuracy, which compares
predicted group membership based on the
logistic model to the actual, known group
membership, which is the value of the
dependent variable.
15
Evaluating usefulness for logistic models
• The benchmark that we will use to characterize a
multinomial logistic regression model as useful is a 25%
improvement over the rate of accuracy achievable by
chance alone.

• The estimate of by-chance accuracy that we will use is the
proportional by-chance accuracy rate, computed by
summing the squared proportion of cases in each group.
The only difference between by-chance accuracy for
binary logistic models and by-chance accuracy for
multinomial logistic models is the number of groups
defined by the dependent variable.

16
Computing by chance accuracy
• The percentage of cases in each group defined by the dependent
variable is found in the 'Case Processing Summary' table.

Case Processing Summary

                                   N   Marginal Percentage
Brand                 A          207        28.2%
                      B          307        41.8%
                      C          221        30.1%
Sex of participants   Female     466        63.4%
                      Male       269        36.6%
Valid                            735       100.0%
Missing                            0
Total                            735
17
Chance accuracy rate (CAR)
• The proportional by-chance accuracy rate is
computed by calculating the proportion of cases
in each group based on the number of cases in
each group in the 'Case Processing Summary',
and then squaring and summing the proportions.

• That is, CAR = 0.282² + 0.418² + 0.301² = 0.345.

• The proportional by-chance accuracy criterion is
43.1% (that is, 1.25 × 34.5% = 43.1%).

18
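A minimal sketch of the same computation, using the group counts from the 'Case Processing Summary' (NumPy only):

```python
import numpy as np

# Group proportions from the 'Case Processing Summary' table.
p = np.array([207, 307, 221]) / 735            # brands A, B, C

car = np.sum(p ** 2)                           # proportional by-chance accuracy rate
criterion = 1.25 * car                         # 25%-improvement benchmark
print(car, criterion)                          # ~0.344 and ~0.430; the slide's 0.345/0.431
                                               # come from proportions rounded to 3 decimals
```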
Comparing accuracy rates
• To characterize our model as useful, we compare the
overall percentage accuracy rate produced by SPSS at the
last step in which variables are entered to 25% more than
the proportional by-chance accuracy rate.

• The classification accuracy rate is 55.2%,
which is greater than or equal to the
proportional by-chance accuracy criterion of
43.1% (1.25 × 34.5% = 43.1%).

• The criterion for classification accuracy is
satisfied in this example, as shown on page 20.

19
Comparing accuracy rates
While we will accept most of the SPSS defaults for
the analysis, we need to specifically request the
classification table.

Classification

                              Predicted
Observed              A        B        C     Percent Correct
A                    58      136       13          28.0%
B                    18      238       51          77.5%
C                    10      101      110          49.8%
Overall Percentage 11.7%    64.6%    23.7%         55.2%
20
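A small sketch verifying the accuracy figures from the classification table above (counts taken from the slide):

```python
import numpy as np

# Classification table from the slide (rows = observed, columns = predicted).
cm = np.array([[58, 136, 13],
               [18, 238, 51],
               [10, 101, 110]])

percent_correct = np.diag(cm) / cm.sum(axis=1)   # 28.0%, 77.5%, 49.8%
overall = np.trace(cm) / cm.sum()                # (58 + 238 + 110) / 735 = 0.552
print(percent_correct, overall)
```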
Numerical problems
• The maximum likelihood method used to estimate multinomial
logistic regression is an iterative fitting process that
cycles through repetitions to find an answer.

• Sometimes the method will break down and not be able to
converge on an answer.

• Sometimes the method will produce wildly improbable results,
reporting that a one-unit change in an independent variable
increases the odds of the modeled event by hundreds of
thousands or millions. These implausible results can be
produced by multicollinearity or by categories of predictors
having no cases (zero cells).

• The clue that we have numerical problems and should not
interpret the results is standard errors for some independent
variables that are larger than 2.0.

21
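A brief sketch of this screening step, continuing the statsmodels fit res from the earlier example (the 2.0 cutoff is the slides' rule of thumb):

```python
# res is the fitted MNLogit model from the earlier sketch; res.bse holds the
# standard errors (rows = predictors, columns = non-reference outcome categories).
print(res.bse)
if (res.bse > 2.0).any().any():
    print("Warning: SE > 2.0 -- possible multicollinearity or zero cells; do not interpret.")
```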
Relationship of individual independent
variables and the dependent variable
• There are two types of tests for individual independent
variables:

– The likelihood ratio test evaluates the overall
relationship between an independent variable and the
dependent variable.

– The Wald test evaluates whether or not the
independent variable is statistically significant in
differentiating between the two groups in each of the
embedded binary logistic comparisons.

• If an independent variable has an overall relationship to
the dependent variable, it might or might not be
statistically significant in differentiating between pairs of
groups defined by the dependent variable.
22
Likelihood ratio test
Likelihood Ratio Tests

             Model Fitting Criteria
             -2 Log Likelihood of      Likelihood Ratio Tests
Effect       Reduced Model             Chi-Square    df    Sig.
Intercept         176.183                   .000      0       .
Age               353.964                177.781      2    .000
Sex               183.834                  7.651      2    .022

The chi-square statistic is the difference in -2 log-likelihoods between the final
model and a reduced model. The reduced model is formed by omitting an effect
from the final model. The null hypothesis is that all parameters of that effect are 0.
e.g., 353.964 - 176.183 = 177.781
      183.834 - 176.183 = 7.651

23
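A hedged sketch of the same effect-level likelihood ratio test, continuing the earlier statsmodels example (llf is the fitted log-likelihood, so the table's values equal -2*llf):

```python
from scipy.stats import chi2
import statsmodels.api as sm

# Reduced model omitting age; X, y, and res come from the earlier sketch.
res_reduced = sm.MNLogit(y, X.drop(columns=["age"])).fit()

# Effect chi-square = 2 * (LL_final - LL_reduced); df = 2 (one age coefficient per equation).
lr = 2 * (res.llf - res_reduced.llf)
print(f"chi-square = {lr:.3f}, p = {chi2.sf(lr, 2):.4g}")   # ~177.781 for age
```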
Relationship of individual independent
variables and the dependent variable
• The interpretation for an independent variable focuses on
its ability to distinguish between pairs of groups and the
contribution which it makes to changing the odds of being
in one dependent variable group rather than the other.

• We should not interpret the significance of an
independent variable’s role in distinguishing between
pairs of groups unless the independent variable also has
an overall relationship to the dependent variable in the
likelihood ratio test.

• The interpretation of an independent variable’s role in
differentiating dependent variable groups is the same as
we used in binary logistic regression. The difference in
multinomial logistic regression is that we can have
multiple interpretations for an independent variable in
relation to different pairs of groups.

24
Relationship of individual independent
variables and the dependent variable
Parameter Estimates

                                                                          95% Confidence
                                                                          Interval for Exp(B)
Brand(a)           B       Std. Error     Wald     df   Sig.    Exp(B)    Lower      Upper
B   Intercept   -11.775       1.775      44.024     1   .000
    Age            .368        .055      44.813     1   .000     1.445    1.297      1.610
    [Sex=1.00]     .524        .194       7.272     1   .007     1.688    1.154      2.471
    [Sex=2.00]    0(b)           .            .     0      .         .        .          .
C   Intercept   -22.721       2.058     121.890     1   .000
    Age            .686        .063     119.954     1   .000     1.986    1.756      2.245
    [Sex=1.00]     .466        .226       4.247     1   .039     1.594    1.023      2.482
    [Sex=2.00]    0(b)           .            .     0      .         .        .          .

a. The reference category is: A.
b. This parameter is set to zero because it is redundant.
25
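For comparison, a hedged sketch of how the analogous B coefficients and Exp(B) values can be pulled from the earlier statsmodels fit (statsmodels likewise reports one equation per non-reference category):

```python
import numpy as np

# B coefficients for the two equations (brand B vs. A, brand C vs. A).
print(res.params)
# Exp(B), as in the rightmost columns of the SPSS Parameter Estimates table.
print(np.exp(res.params))
```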
Relationship of individual independent
variables and the dependent variable
• SPSS identifies the comparisons it makes for groups defined
by the dependent variable in the table of ‘Parameter
Estimates,’ using either the value codes or the value labels.

• The reference category is Brand A (see footnote).

In this analysis, two comparisons will be made:

• Brand B will be compared with brand A.

• Brand C will also be compared with brand A.

26
Relationship of individual independent
variables and the dependent variable

• The reference category plays the
same role in multinomial logistic
regression that it plays in the dummy-
coding of a nominal variable: it is the
category that would be coded with
zeros for all of the dummy-coded
variables and against which all other
categories are interpreted.

27
• In this example, there is a statistically
significant relationship between the
independent variables (sex and age) and
the dependent variable (brand type)
(LRT, page 23):

• Sex vs. brand type (P = 0.022)

• Age vs. brand type (P < 0.001)
28
• The table on slide 25, titled Parameter Estimates,
has two parts, labeled with the categories of the outcome
variable brand. They correspond to two equations:

log(P(brand=B)/P(brand=A)) = b_10 + b_11*sex + b_12*age
log(P(brand=C)/P(brand=A)) = b_20 + b_21*sex + b_22*age

• For example, we can say that for a one-unit change in the
variable age, the log of the ratio of the two probabilities,
P(brand=B)/P(brand=A), will be increased by 0.368.

• Also, the log of the ratio of the two probabilities
P(brand=C)/P(brand=A) will be increased by 0.686.
29
• We can say that for a one-unit change in the variable age, we
expect the relative risk (in this example, preference for brand B)
of choosing brand B over A to increase by exp(0.368) = OR =
1.45. So we can say that the relative risk (in this example,
preference) is higher for older people.

• For a dichotomous predictor variable such as sex, we can say
that the ratio of the relative risks (in this example, preference) of
choosing brand B over A for females versus males is exp(0.524).
We can see the results displayed as odds ratios in the column
labeled Exp(B) in the table above. That is, females are about
1.7 times more likely to choose brand B than males when the
reference category is brand A.

• In general, the older a person is, the more he/she will prefer
brand B or C.

29
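A quick arithmetic check of these odds ratios, using the coefficients from the Parameter Estimates table:

```python
import math

print(math.exp(0.368))   # ~1.445: odds multiplier per one-year increase in age (B vs. A)
print(math.exp(0.524))   # ~1.689: odds for females relative to males (B vs. A)
```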
• Both sex and age are statistically significant
across the two models. Females are more likely to
prefer brands B or C compared to brand A.

• Also, the older a person is, the more likely he/she is
to prefer brands B or C to brand A.

• Both of these findings are statistically significant.

31
Cautions
• Pseudo-R-squared: These statistics do not convey the same
information as the R-square for linear regression, even though
it is still "the higher, the better".

• Sample size: Multinomial regression uses a maximum
likelihood estimation method. Therefore, it requires a large
sample size. It also estimates multiple equations. Therefore, it
requires an even larger sample size than ordinal or binary
logistic regression.

• Empty cells or small cells: You should check for empty or
small cells by doing a crosstab between categorical predictors
and the outcome variable. If a cell has very few cases (a small
cell), the model may become unstable or it might not run at all.
32
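A minimal sketch of this check with pandas, assuming the variable names from the earlier example:

```python
import pandas as pd

# Crosstab of each categorical predictor against the outcome;
# cells with zero or very few cases signal potential instability.
print(pd.crosstab(df["sex"], df["brand"]))
```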
Cautions

• In order for the conclusions of a multinomial
logistic regression analysis to be valid:
• the overall relationship must be
statistically significant,
• there must be no evidence of numerical
problems,
• the classification accuracy rate must be
substantially better than could be
obtained by chance alone,
• and the stated individual relationships must
be statistically significant and interpreted
correctly.

33
Cautions
• Multicollinearity in multinomial logistic
regression is detected by examining the standard
errors for the b coefficients.

• A standard error larger than 2.0 indicates numerical
problems, such as multicollinearity among the
independent variables, zero cells for a dummy-coded
independent variable, etc.

• Analyses that indicate numerical problems should
not be interpreted.

• None of the independent variables in this analysis
had a standard error larger than 2.0. (We are not
interested in the standard errors associated with the
intercept.)

34
