
The Multinomial Logit Model for Nominal Response Data

James J. Dignam
Department of Public Health Sciences
University of Chicago

J. Dignam (UChicago) Lecture 12 Feb. 18, 2020 1/1


Nominal Categorical Data

A nominal response variable takes on discrete categories that have no
intrinsic ordering.
Examples:
1 Gender: male, female
2 Hair color: blonde, brown, black, red, etc.
3 Blood type: A, B, AB, or O.
4 Tumor cell pathologic feature categories



Nominal Response: notation

Let $C_1, C_2, \ldots, C_k$, $k \ge 2$, denote the $k$ categories for the response (no intrinsic order).
Let $Y_i$ be the response variable for the $i$th individual, with $Y_i$ taking the value $j$ if the response is in category $C_j$, $j = 1, 2, \ldots, k$.
$p_{ij} = P(Y_i = j) = P[\text{individual } i \text{ responds in category } C_j]$
We can still define a cumulative probability for $Y_i$:
$\gamma_{ij} = p_{i1} + p_{i2} + \cdots + p_{ij}$
Does it make sense to model $\gamma_{ij}$ as in the ordinal response case?
Not really: $\gamma_{ij}$ defined this way depends on the order of the categories, and any order would be arbitrary.
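A small sketch of why the cumulative probabilities are order-dependent. The category labels and probabilities below are hypothetical values chosen only for illustration; they do not come from the lecture data.

```python
from itertools import accumulate

# Hypothetical category probabilities for one individual (assumed values).
probs = {"A": 0.2, "B": 0.5, "C": 0.3}

def cumulative(order):
    """Return gamma_j = p_1 + ... + p_j for the given category order."""
    return [round(g, 10) for g in accumulate(probs[c] for c in order)]

# Two equally arbitrary orderings of the same nominal categories.
gamma_abc = cumulative(["A", "B", "C"])
gamma_cab = cumulative(["C", "A", "B"])

print("order A, B, C:", gamma_abc)
print("order C, A, B:", gamma_cab)
```

The category probabilities are the same in both cases, yet the cumulative probabilities differ, so a model for the cumulative probabilities would change with the labeling.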



Standard Logit Model → Multinomial Logit Model

Standard logit model: $\log\left(\frac{p}{1-p}\right) = x^T\beta$, where
$\log\left(\frac{p}{1-p}\right) = \log\left(\frac{p_{\text{response}}}{p_{\text{non-response}}}\right)$

Multinomial logit model:
Pick a baseline/reference category (which plays the same role as
category "non-response" in the standard logit model), let's say $C_1$.
Model can be written as:

$$\log\left(\frac{p_j}{p_1}\right) = x^T\beta_j, \quad j = 2, 3, \ldots, k \qquad (1)$$

Under model (1) and given the fact that $\sum_{j=1}^{k} p_j = 1$,

$$p_1 = \frac{1}{1 + \exp(x^T\beta_2) + \exp(x^T\beta_3) + \cdots + \exp(x^T\beta_k)} \qquad (2)$$

$$p_j = \frac{\exp(x^T\beta_j)}{1 + \exp(x^T\beta_2) + \exp(x^T\beta_3) + \cdots + \exp(x^T\beta_k)}, \quad j = 2, 3, \ldots, k \qquad (3)$$
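The step from (1) to (2) and (3) is short. A sketch of it, using only the symbols already defined on this slide:

```latex
\begin{align*}
\log\left(\frac{p_j}{p_1}\right) = x^T\beta_j
  &\;\Longrightarrow\; p_j = p_1 \exp(x^T\beta_j), \quad j = 2, \ldots, k \\
1 = \sum_{j=1}^{k} p_j
  &= p_1\left[1 + \sum_{j=2}^{k}\exp(x^T\beta_j)\right] \\
\Longrightarrow\; p_1 &= \frac{1}{1 + \sum_{j=2}^{k}\exp(x^T\beta_j)},
\qquad
p_j = \frac{\exp(x^T\beta_j)}{1 + \sum_{l=2}^{k}\exp(x^T\beta_l)}
\end{align*}
```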
Proportional Odds Model vs. Multinomial Logit Model

As in other logit models, $x$ is a covariate vector without intercept and is
of dimension $p$. Contrasting the two model extensions of the standard
logit model:
Proportional odds model: $\log\left(\frac{\gamma_j}{1-\gamma_j}\right) = d_j - x^T\beta$, $j = 1, 2, \ldots, k-1$
Multinomial logit model: $\log\left(\frac{p_j}{p_1}\right) = \alpha_j + x^T\beta_j$, $j = 2, 3, \ldots, k$
1 The proportional odds model predicts cumulative probabilities (except
for the last category), whereas the multinomial logit model predicts the
probability of each category (except the baseline category) directly.
2 The proportional odds model has a constant slope $\beta$: the effect of $x$ is
the same for all $k-1$ ways to collapse the response into a binary
outcome. The multinomial logit model has a different slope $\beta_j$ for
each response category.
3 The proportional odds model has $(k-1)$ intercepts plus $p$ slopes, for a
total of $(k-1) + p$ parameters to be estimated. The multinomial logit
model has $(k-1)$ intercepts plus $(k-1)p$ slopes, for a total of
$(k-1) + (k-1)p = (k-1)(p+1)$ parameters to be estimated.
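To make the parameter counts in point 3 concrete, here is a minimal sketch. The function names are illustrative, and the choice k = 3, p = 3 is an assumption meant to mirror the high school example later in the lecture (three programs; two SES indicators plus writing score).

```python
def n_params_proportional_odds(k: int, p: int) -> int:
    """(k - 1) intercepts plus p shared slopes."""
    return (k - 1) + p

def n_params_multinomial_logit(k: int, p: int) -> int:
    """(k - 1) intercepts plus (k - 1) * p category-specific slopes."""
    return (k - 1) + (k - 1) * p

# Assumed setting: 3 response categories, 3 covariates.
k, p = 3, 3
po = n_params_proportional_odds(k, p)
mnl = n_params_multinomial_logit(k, p)
print(f"proportional odds: {po} parameters")
print(f"multinomial logit: {mnl} parameters")
```

Under these assumptions the multinomial logit model has 8 parameters, 6 of them slopes, against 5 for the proportional odds model.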



Example: High School Program Choice
Students entering high school make a program choice among general,
vocational, and academic program studies. Their choice might be
related to their writing score and their socio-economic status. These
data describe 200 high school students.

. list prog ses write in 1/15, clean

           prog      ses   write
  1.   vocation      low      44
  2.   vocation   middle      41
  3.   academic      low      65
  4.   academic      low      50
  5.   academic      low      40
  6.   academic      low      41
  7.   academic   middle      54
  8.   academic      low      44
  9.   vocation   middle      49
 10.    general   middle      54
 11.   academic   middle      46
 12.   vocation   middle      44
 13.   vocation   middle      46
 14.   academic     high      41
 15.   vocation     high      39
  ...



High School Program Choice - looking at the data
Outcome is numeric, with labels attached

. codebook prog
------------------------------------------------------------------------
prog                                                     type of program
------------------------------------------------------------------------
                  type:  numeric (float)
                 label:  sel

                 range:  [1,3]                        units:  1
         unique values:  3                        missing .:  0/200

            tabulation:  Freq.   Numeric  Label
                            45         1  general
                           105         2  academic
                            50         3  vocation

. codebook ses
------------------------------------------------------------------------
ses                                                          (unlabeled)
------------------------------------------------------------------------
                  type:  numeric (float)
                 label:  sl

                 range:  [1,3]                        units:  1
         unique values:  3                        missing .:  0/200

            tabulation:  Freq.   Numeric  Label
                            47         1  low
                            95         2  middle
                            58         3  high



HS Program Choice: exploratory analyses
. tab prog ses, row col chi2

+-------------------+
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

   type of |               ses
   program |       low     middle       high |     Total
-----------+---------------------------------+----------
   general |        16         20          9 |        45
           |     35.56      44.44      20.00 |    100.00
           |     34.04      21.05      15.52 |     22.50
-----------+---------------------------------+----------
  academic |        19         44         42 |       105
           |     18.10      41.90      40.00 |    100.00
           |     40.43      46.32      72.41 |     52.50
-----------+---------------------------------+----------
  vocation |        12         31          7 |        50
           |     24.00      62.00      14.00 |    100.00
           |     25.53      32.63      12.07 |     25.00
-----------+---------------------------------+----------
     Total |        47         95         58 |       200
           |     23.50      47.50      29.00 |    100.00
           |    100.00     100.00     100.00 |    100.00

          Pearson chi2(4) =  16.6044   Pr = 0.002

The test of association gives strong evidence of a relationship between
SES and program choice (p = 0.002).
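As a check on the Pearson statistic reported above, the following sketch recomputes it by hand from the 3x3 table of counts. The closed-form tail probability used for the p-value is specific to a chi-square distribution with 4 degrees of freedom; it is a convenience assumed here to avoid external libraries, not how Stata computes it.

```python
import math

# Observed counts: rows = program, columns = SES (low, middle, high).
observed = [
    [16, 20, 9],    # general
    [19, 44, 42],   # academic
    [12, 31, 7],    # vocation
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (O - E)^2 / E, with E = row * col / n.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; this closed form holds only for df = 4.
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"chi2({df}) = {chi2:.4f}, p = {p_value:.4f}")
```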
HS Program Choice: some exploratory analyses

. oneway write prog, means

            |  Summary of
            |   writing
    type of |    score
    program |        Mean
------------+------------
    general |   51.333333
   academic |   56.257143
   vocation |       46.76
------------+------------
      Total |      52.775

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      3175.69786      2   1587.84893     21.27     0.0000
 Within groups      14703.1771    197   74.635417
------------------------------------------------------------------------
    Total            17878.875    199   89.843593

Bartlett's test for equal variances:  chi2(2) =   2.6184  Prob>chi2 = 0.270

Mean writing score differs by program choice (F = 21.27, p < 0.0001)
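For readers who want to see where the F statistic comes from, this sketch rebuilds it from the sums of squares and degrees of freedom printed in the ANOVA table. The variable names are illustrative.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_between, df_between = 3175.69786, 2
ss_within, df_within = 14703.1771, 197

# Mean squares are sums of squares divided by their degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of between-group to within-group mean squares.
f_stat = ms_between / ms_within

print(f"MS between = {ms_between:.5f}")
print(f"MS within  = {ms_within:.6f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```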



HS Program Choice: some exploratory analyses

. oneway write ses, means

            |  Summary of
            |   writing
            |    score
        ses |        Mean
------------+------------
        low |   50.617021
     middle |   51.926316
       high |   55.913793
------------+------------
      Total |      52.775

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      858.715441      2   429.35772       4.97     0.0078
 Within groups      17020.1596    197   86.396749
------------------------------------------------------------------------
    Total            17878.875    199   89.843593

Bartlett's test for equal variances:  chi2(2) =   0.1462  Prob>chi2 = 0.930

SES is also associated with writing score (F = 4.97, p = 0.0078)



HS Program Choice: the multinomial logit model
Model evaluating SES, with low SES as the reference level and academic as
the base outcome. The coefficients are log odds ratios for the general or
vocational program vs. academic:

. mlogit prog i.ses, base(2) nolog

Multinomial logistic regression                   Number of obs   =        200
                                                  LR chi2(4)      =      16.78
                                                  Prob > chi2     =     0.0021
Log likelihood = -195.70519                       Pseudo R2       =     0.0411

------------------------------------------------------------------------------
        prog |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
general      |
         ses |
     middle  |  -.6166071   .4334269    -1.42   0.155    -1.466108     .232894
       high  |  -1.368595   .5000522    -2.74   0.006    -2.348679   -.3885105
             |
       _cons |  -.1718503   .3393104    -0.51   0.613    -.8368865     .493186
-------------+----------------------------------------------------------------
academic     |  (base outcome)
-------------+----------------------------------------------------------------
vocation     |
         ses |
     middle  |   .1093299   .4369785     0.25   0.802    -.7471323    .9657921
       high  |  -1.332227   .5501196    -2.42   0.015    -2.410442   -.2540125
             |
       _cons |  -.4595323   .3687342    -1.25   0.213    -1.182238    .2631734
------------------------------------------------------------------------------

Exponentiating the SES coefficients gives the odds ratios from 2x2 sub-tables of the 3x3 table of program choice by SES
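The output above is on the log odds scale. A minimal sketch of converting the SES coefficients and their confidence limits to odds ratios by exponentiation; the numbers are copied from the output and the dictionary layout is just an illustrative choice.

```python
import math

# (coefficient, 95% CI lower, 95% CI upper) from the SES-only model,
# each comparing the named program to academic and the named SES level to low.
estimates = {
    ("general", "middle"): (-0.6166071, -1.466108, 0.232894),
    ("general", "high"): (-1.368595, -2.348679, -0.3885105),
    ("vocation", "middle"): (0.1093299, -0.7471323, 0.9657921),
    ("vocation", "high"): (-1.332227, -2.410442, -0.2540125),
}

# exp() maps a log odds ratio (and its interval endpoints) to the OR scale.
odds_ratios = {
    key: tuple(math.exp(value) for value in triple)
    for key, triple in estimates.items()
}

for (program, ses), (or_, lower, upper) in odds_ratios.items():
    print(f"{program} vs. academic, {ses} vs. low SES: "
          f"OR = {or_:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

An interval that contains 1 on the odds ratio scale corresponds to an interval that contains 0 on the coefficient scale.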



HS Program Choice: the multinomial logit model

What are these parameters? Create a 2x2 table with program choice
(general vs. academic) as the outcome and SES as the 'exposure'
variable, with middle SES exposed and low SES unexposed:

                  SES
Program      middle    low    Total
general          20     16       36
academic         44     19       63

Note that the OR from this table is (20 × 19)/(44 × 16) = 0.5398. Taking the
log yields the parameter from the above model (−0.6166). The point
estimate indicates that individuals of middle SES have lower odds than
low SES individuals of choosing the general program over the academic
program, although the difference is not statistically significant
(p = 0.155).



HS Program Choice: the multinomial logit model

Another subtable. Create a 2x2 table with program choice (vocational
vs. academic) as the outcome and SES as the 'exposure' variable, with
middle SES exposed and low SES unexposed:

                  SES
Program      middle    low    Total
vocational       31     12       43
academic         44     19       63

The OR from this table is (31 × 19)/(44 × 12) = 1.1155. Taking the log
yields the parameter from the above model (0.1093). There is no evidence
that individuals of middle SES are any more or less likely than low SES
individuals to choose the vocational program over the academic program
(p = 0.802).
There are 4 such 2x2 subtables in the 3x3 table, one for each SES
coefficient in the model.
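The correspondence between subtables and coefficients can be verified for all four cases at once. This sketch computes each log odds ratio from the cell counts of the 3x3 table, with academic and low SES as the reference row and column; the helper name `log_odds_ratio` is illustrative.

```python
import math

# Cell counts from the 3x3 table of program choice by SES.
counts = {
    "general":  {"low": 16, "middle": 20, "high": 9},
    "academic": {"low": 19, "middle": 44, "high": 42},
    "vocation": {"low": 12, "middle": 31, "high": 7},
}

def log_odds_ratio(program, ses, base_program="academic", base_ses="low"):
    """Log OR from the 2x2 subtable (program vs. base) by (ses vs. base)."""
    a = counts[program][ses]
    b = counts[program][base_ses]
    c = counts[base_program][ses]
    d = counts[base_program][base_ses]
    return math.log((a * d) / (b * c))

log_ors = {
    (program, ses): log_odds_ratio(program, ses)
    for program in ("general", "vocation")
    for ses in ("middle", "high")
}

for key, value in log_ors.items():
    print(key, round(value, 4))
```

Because the SES-only model has one parameter per cell contrast, these values agree with the four SES coefficients in the mlogit output.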



HS Program Choice: multiple predictors model
Model with writing score added:
. mlogit prog i.ses write, base(2) nolog

Multinomial logistic regression                   Number of obs   =        200
                                                  LR chi2(6)      =      48.23
                                                  Prob > chi2     =     0.0000
Log likelihood = -179.98173                       Pseudo R2       =     0.1182

------------------------------------------------------------------------------
        prog |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
general      |
         ses |
     middle  |   -.533291   .4437321    -1.20   0.229     -1.40299     .336408
       high  |  -1.162832   .5142195    -2.26   0.024    -2.170684   -.1549804
             |
       write |  -.0579284   .0214109    -2.71   0.007    -.0998931   -.0159637
       _cons |   2.852186   1.166439     2.45   0.014     .5660075    5.138365
-------------+----------------------------------------------------------------
academic     |  (base outcome)
-------------+----------------------------------------------------------------
vocation     |
         ses |
     middle  |   .2913931   .4763737     0.61   0.541    -.6422822    1.225068
       high  |  -.9826703   .5955669    -1.65   0.099     -2.14996    .1846195
             |
       write |  -.1136026   .0222199    -5.11   0.000    -.1571528   -.0700524
       _cons |     5.2182   1.163549     4.48   0.000     2.937686    7.498714
------------------------------------------------------------------------------



HS Program Choice: fitted multinomial logit model

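The model produces the following fitted equations:

$$\log\left(\frac{\hat{p}_{\text{general}}}{\hat{p}_{\text{academic}}}\right) = 2.852 - 0.533\,\text{middle} - 1.163\,\text{high} - 0.0579\,\text{write}$$

$$\log\left(\frac{\hat{p}_{\text{vocational}}}{\hat{p}_{\text{academic}}}\right) = 5.218 + 0.291\,\text{middle} - 0.983\,\text{high} - 0.114\,\text{write}$$

(note: middle and high are indicator variables, so at most one of them
takes the value 1 in any given prediction; both are 0 for low SES)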
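The fitted equations can be turned into predicted probabilities by applying the inverse of the baseline-category logit, with academic as the baseline. The sketch below uses the coefficients from the output; the function name `predict_probs` and the example student (middle SES, writing score 50) are hypothetical choices for illustration.

```python
import math

# Coefficients copied from the fitted model (base outcome: academic).
coefs = {
    "general": {"_cons": 2.852186, "middle": -0.533291,
                "high": -1.162832, "write": -0.0579284},
    "vocation": {"_cons": 5.2182, "middle": 0.2913931,
                 "high": -0.9826703, "write": -0.1136026},
}

def predict_probs(ses: str, write: float) -> dict:
    """Predicted program probabilities for ses in {'low', 'middle', 'high'}."""
    # Indicator coding: low SES is the reference level (both indicators 0).
    x = {"_cons": 1.0, "middle": float(ses == "middle"),
         "high": float(ses == "high"), "write": float(write)}
    # Linear predictor = log odds of each program vs. academic.
    eta = {prog: sum(b[name] * x[name] for name in b)
           for prog, b in coefs.items()}
    denom = 1.0 + sum(math.exp(v) for v in eta.values())
    probs = {prog: math.exp(v) / denom for prog, v in eta.items()}
    probs["academic"] = 1.0 / denom  # baseline category
    return probs

# Hypothetical student: middle SES, writing score of 50.
probs = predict_probs("middle", 50)
for program, prob in probs.items():
    print(f"{program}: {prob:.3f}")
```

The three probabilities sum to 1 by construction, which is a quick sanity check on any hand calculation.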



HS Program Choice: the multinomial logit model
Note that a different set of ORs can be produced by changing the
reference category (here to vocational)
. mlogit prog i.ses write, base(3) nolog

Multinomial logistic regression                   Number of obs   =        200
                                                  LR chi2(6)      =      48.23
                                                  Prob > chi2     =     0.0000
Log likelihood = -179.98173                       Pseudo R2       =     0.1182

------------------------------------------------------------------------------
        prog |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
general      |
         ses |
     middle  |  -.8246841   .4901229    -1.68   0.092    -1.785307    .1359392
       high  |  -.1801617    .648455    -0.28   0.781     -1.45111    1.090787
             |
       write |   .0556742   .0233313     2.39   0.017     .0099456    .1014028
       _cons |  -2.366014   1.174248    -2.01   0.044    -4.667498   -.0645293
-------------+----------------------------------------------------------------
academic     |
         ses |
     middle  |  -.2913931   .4763737    -0.61   0.541    -1.225068    .6422822
       high  |   .9826703   .5955669     1.65   0.099    -.1846195     2.14996
             |
       write |   .1136026   .0222199     5.11   0.000     .0700524    .1571528
       _cons |    -5.2182   1.163549    -4.48   0.000    -7.498714   -2.937686
-------------+----------------------------------------------------------------
vocation     |  (base outcome)
------------------------------------------------------------------------------



HS Program Choice: changing the reference category in the model

Note that in the above model the first set of estimates changes: this is
a different outcome, the logit for general vs. vocational program
choice, i.e., a different 2x2 subtable from earlier
Note that the second set of estimates is identical to the earlier one
except for sign. Why? This is the academic/vocational logit; before we
estimated the vocational/academic logit, so the odds are the reciprocal
and the log odds change sign
Note that the overall model fit summaries (log likelihood, etc.) are
identical to those of the earlier model
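Both observations follow from the identity log(p_gen/p_voc) = log(p_gen/p_acad) − log(p_voc/p_acad). A sketch that derives the vocational-baseline coefficients from the academic-baseline ones; the numbers are copied from the two outputs and the variable names are illustrative.

```python
# Coefficients with academic as the base outcome (from the base(2) output).
general_vs_academic = {"middle": -0.533291, "high": -1.162832,
                       "write": -0.0579284, "_cons": 2.852186}
vocation_vs_academic = {"middle": 0.2913931, "high": -0.9826703,
                        "write": -0.1136026, "_cons": 5.2182}

# general vs. vocation: difference of the two academic-baseline logits.
general_vs_vocation = {
    name: general_vs_academic[name] - vocation_vs_academic[name]
    for name in general_vs_academic
}

# academic vs. vocation: the vocation vs. academic logit with the sign flipped.
academic_vs_vocation = {name: -b for name, b in vocation_vs_academic.items()}

for name, b in general_vs_vocation.items():
    print(f"general vs. vocation, {name}: {b:.7f}")
```

The derived values match the base(3) output, so changing the reference category reparameterizes the same fitted model rather than fitting a new one.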



Summary

Multinomial Logit Model

A straightforward extension of the binary logit model (i.e., logit model
or logistic regression)
Must choose a reference category for the ORs; it can be changed to obtain
ORs from other 'subtables' of an R×C table
These models have a history in social science and in the analysis of
multi-way contingency tables (log-linear models for table frequencies)
