The Multinomial Logit Model For Nominal Response Data: James J. Dignam
$$p_1 = \frac{1}{1 + \exp(x^T\beta_2) + \exp(x^T\beta_3) + \cdots + \exp(x^T\beta_k)} \tag{2}$$

$$p_j = \frac{\exp(x^T\beta_j)}{1 + \exp(x^T\beta_2) + \exp(x^T\beta_3) + \cdots + \exp(x^T\beta_k)}, \quad j = 2, 3, \ldots, k \tag{3}$$
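Equations (2) and (3) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function name `baseline_category_probs` and the argument `etas` (the linear predictors x^T β_j for j = 2, ..., k) are assumptions introduced here.

```python
import math

def baseline_category_probs(etas):
    """Return [p_1, p_2, ..., p_k] given etas = [x'b_2, ..., x'b_k].

    Category 1 is the baseline, so its linear predictor is fixed at 0
    and contributes the leading 1 in the denominator.
    """
    denom = 1.0 + sum(math.exp(e) for e in etas)
    p1 = 1.0 / denom                              # equation (2)
    rest = [math.exp(e) / denom for e in etas]    # equation (3)
    return [p1] + rest

# Hypothetical linear predictors for a k = 3 category response.
probs = baseline_category_probs([0.5, -1.0])
print(probs, sum(probs))
```

Because every category shares the same denominator, the probabilities sum to 1 by construction, and the ratio p_j / p_1 equals exp(x^T β_j).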
J. Dignam (UChicago) Lecture 12 Feb. 18, 2020 4/1
Proportional Odds Model vs. Multinomial Logit Model
Proportional odds model: $\log\left(\frac{\gamma_j}{1-\gamma_j}\right) = d_j - x^T\beta, \quad j = 1, 2, \ldots, k-1$
Multinomial logit model: $\log\left(\frac{p_j}{p_1}\right) = \alpha_j + x^T\beta_j, \quad j = 2, 3, \ldots, k$
1. The proportional odds model predicts cumulative probabilities (for every category except the last), whereas the multinomial logit model predicts the probability of each category (except the baseline category) directly.
2. The proportional odds model has a constant slope β: the effect of x is the same for all k − 1 ways of collapsing the response into a binary outcome. The multinomial logit model has a different slope β_j for each response category.
3. The proportional odds model has (k − 1) intercepts plus p slopes, for a total of k − 1 + p parameters to be estimated. The multinomial logit model has (k − 1) intercepts plus (k − 1) ∗ p slopes, for a total of (k − 1) + p ∗ (k − 1) parameters to be estimated.
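The parameter counts in the comparison above are simple to tabulate. The following Python sketch is only an illustration; the function names are hypothetical, with k the number of response categories and p the number of covariates.

```python
def n_params_proportional_odds(k, p):
    """(k - 1) intercepts plus p slopes shared across all cut points."""
    return (k - 1) + p

def n_params_multinomial_logit(k, p):
    """(k - 1) intercepts plus a separate set of p slopes per non-baseline category."""
    return (k - 1) + p * (k - 1)

# Example: a 3-category response with 3 covariates.
k, p = 3, 3
po = n_params_proportional_odds(k, p)
mnl = n_params_multinomial_logit(k, p)
print(po, mnl)
```

With k = 3 and p = 3, the proportional odds model estimates 5 parameters and the multinomial logit model estimates 8, so the extra flexibility of category-specific slopes costs (k − 2) ∗ p additional parameters.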
. list prog ses write in 1/15, clean
. codebook ses
-------------------------------------------------------------------------------
ses                                                                 (unlabeled)
-------------------------------------------------------------------------------
                  type:  numeric (float)
                 label:  sl
            |  Summary of
            |    writing
    type of |      score
    program |        Mean
------------+------------
    general |   51.333333
   academic |   56.257143
   vocation |       46.76
------------+------------
      Total |      52.775

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      3175.69786      2   1587.84893     21.27     0.0000
 Within groups      14703.1771    197   74.635417
------------------------------------------------------------------------
    Total           17878.875     199   89.843593
            |  Summary of
            |    writing
            |      score
        ses |        Mean
------------+------------
        low |   50.617021
     middle |   51.926316
       high |   55.913793
------------+------------
      Total |      52.775

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      858.715441      2   429.35772       4.97     0.0078
 Within groups      17020.1596    197   86.396749
------------------------------------------------------------------------
    Total           17878.875     199   89.843593
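The arithmetic behind the two ANOVA tables can be checked directly: each mean square is SS/df, and F is the ratio of the between-groups to the within-groups mean square. The Python sketch below is an illustration only (the helper name `anova_f` is hypothetical); the sums of squares and degrees of freedom are copied from the Stata output above.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Writing score by program type.
prog_ms_b, prog_ms_w, prog_f = anova_f(3175.69786, 2, 14703.1771, 197)

# Writing score by SES.
ses_ms_b, ses_ms_w, ses_f = anova_f(858.715441, 2, 17020.1596, 197)

print(round(prog_f, 2), round(ses_f, 2))
```

Both F statistics agree with the printed tables (21.27 and 4.97) to two decimals; the p-values are not recomputed here because that would require an F-distribution routine outside the standard library.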
This model produces odds ratios from sub-tables of the 3x3 table of program type choice by SES.
What are these parameters? Create a 2x2 table with program choice
(general vs. academic) as the outcome and SES as the 'exposure'
variable, with middle SES as exposed and low SES as unexposed:
                   SES
Program      middle    low    Total
general          20     16       36
academic         44     19       63
Note that the OR from this table is (20 × 19)/(44 × 16) = 0.5398. Taking the
log yields the parameter from the above model (−0.6166). Individuals of
middle SES have lower odds than low SES individuals of choosing the
general program over the academic program.
Another subtable: create a 2x2 table with program choice
(vocational vs. academic) as the outcome and SES as the 'exposure'
variable, with middle SES as exposed and low SES as unexposed:
                   SES
Program      middle    low    Total
vocational       31     12       43
academic         44     19       63
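The OR from this table is (31 × 19)/(44 × 12) = 1.1155. Taking the log
yields the parameter from the above model (0.1093). Because the OR is
close to 1, individuals of middle SES have nearly the same odds as low
SES individuals of choosing the vocational program over the academic
program.
There are 4 such 2x2 subtables in the 3x3 table: 2 non-baseline programs
(general, vocational) crossed with 2 non-reference SES levels (middle, high).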
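The odds ratios and log odds ratios for the two subtables above can be reproduced with a short Python sketch. This is an illustration under stated assumptions: the helper `odds_ratio` is hypothetical, and its arguments follow the table layout used here (rows are the program of interest and academic, columns are middle and low SES).

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table laid out as

                 exposed  unexposed
        outcome     a         b
        reference   c         d
    """
    return (a * d) / (b * c)

# General vs. academic, middle vs. low SES.
or_general = odds_ratio(20, 16, 44, 19)
# Vocational vs. academic, middle vs. low SES.
or_vocational = odds_ratio(31, 12, 44, 19)

log_or_general = math.log(or_general)
log_or_vocational = math.log(or_vocational)

print(round(or_general, 4), round(log_or_general, 4))
print(round(or_vocational, 4), round(log_or_vocational, 4))
```

The logs of these cell-count odds ratios match the SES coefficients quoted in the text because a multinomial logit model with a single categorical covariate is saturated for the two-way table.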
$$\log\left(\frac{\hat{p}_{\text{vocational}}}{\hat{p}_{\text{academic}}}\right) = 5.218 + 0.291 \cdot \text{middle} - 0.983 \cdot \text{high} - 0.114 \cdot \text{write}$$
(Note: at most one of the indicator covariates middle and high takes
the value 1 in any given prediction; both are 0 for low SES.)
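To make the fitted equation concrete, the Python sketch below evaluates the vocational vs. academic log odds for one covariate pattern. The function name and the example student (middle SES, writing score 50) are assumptions chosen for illustration; the coefficients are the rounded values from the equation above.

```python
import math

def log_odds_vocational_vs_academic(middle, high, write):
    """Fitted log(p_vocational / p_academic) using the rounded coefficients."""
    if middle and high:
        # The SES indicators are mutually exclusive.
        raise ValueError("middle and high cannot both equal 1")
    return 5.218 + 0.291 * middle - 0.983 * high - 0.114 * write

# Hypothetical student: middle SES, writing score of 50.
log_odds = log_odds_vocational_vs_academic(middle=1, high=0, write=50)
odds = math.exp(log_odds)
print(round(log_odds, 3), round(odds, 3))
```

For this student the log odds are 5.218 + 0.291 − 5.7 = −0.191, so the estimated odds of vocational versus academic are below 1. Each additional writing point lowers the log odds by 0.114.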
------------------------------------------------------------------------------
        prog |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
general      |
         ses |
     middle  |  -.8246841   .4901229    -1.68   0.092    -1.785307    .1359392
       high  |  -.1801617    .648455    -0.28   0.781     -1.45111    1.090787
             |
       write |   .0556742   .0233313     2.39   0.017     .0099456    .1014028
       _cons |  -2.366014   1.174248    -2.01   0.044    -4.667498   -.0645293
-------------+----------------------------------------------------------------
academic     |
         ses |
     middle  |  -.2913931   .4763737    -0.61   0.541    -1.225068    .6422822
       high  |   .9826703   .5955669     1.65   0.099    -.1846195     2.14996
             |
       write |   .1136026   .0222199     5.11   0.000     .0700524    .1571528
       _cons |    -5.2182   1.163549    -4.48   0.000    -7.498714   -2.937686
-------------+----------------------------------------------------------------
vocation     |  (base outcome)
------------------------------------------------------------------------------
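The coefficients in this output can be turned into predicted probabilities for all three programs with the baseline-category formulas, treating vocation as the base outcome (linear predictor 0). The Python sketch below is an illustration, not the Stata computation itself: `COEFS`, `predict_probs`, and the example covariate values are assumptions, while the numbers are copied from the table.

```python
import math

# Coefficients relative to the base outcome "vocation", from the output above.
COEFS = {
    "general":  {"middle": -0.8246841, "high": -0.1801617,
                 "write": 0.0556742, "_cons": -2.366014},
    "academic": {"middle": -0.2913931, "high": 0.9826703,
                 "write": 0.1136026, "_cons": -5.2182},
}

def predict_probs(ses, write):
    """Predicted probability of each program for a given SES level and writing score."""
    if ses not in ("low", "middle", "high"):
        raise ValueError("ses must be 'low', 'middle', or 'high'")
    x = {"middle": 1.0 if ses == "middle" else 0.0,
         "high": 1.0 if ses == "high" else 0.0,
         "write": float(write),
         "_cons": 1.0}
    eta = {"vocation": 0.0}  # base outcome
    for outcome, beta in COEFS.items():
        eta[outcome] = sum(beta[name] * x[name] for name in beta)
    denom = sum(math.exp(v) for v in eta.values())
    return {outcome: math.exp(v) / denom for outcome, v in eta.items()}

# Hypothetical example: low SES students with writing scores of 40 and 65.
p40 = predict_probs("low", 40)
p65 = predict_probs("low", 65)
print(p40)
print(p65)
```

Since the write coefficient is largest in the academic equation, the predicted probability of the academic program rises with the writing score while that of the vocational program falls.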
Note that in the above model, the first set of estimates changes: this
is a different outcome, the logit for general vs. vocational program
choice, i.e., a different 2x2 subtable from earlier.
Note that the second set of estimates is identical to the earlier model
except for sign. Why? This is the academic/vocational logit; before, we
estimated the vocational/academic logit, i.e., the reciprocal.
Note that the overall model fit summaries (log likelihood, etc.) are
identical to those of the earlier model.
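The sign flip and the unchanged fit follow from the fact that changing the base outcome only re-expresses the same model: each new coefficient is the old coefficient minus the old coefficient of the new base category. A minimal Python sketch, assuming the hypothetical helper `rebase` and the coefficients from the vocation-base output:

```python
# Coefficients with "vocation" as the base outcome, from the output above.
COEFS_VOC_BASE = {
    "general":  {"middle": -0.8246841, "high": -0.1801617,
                 "write": 0.0556742, "_cons": -2.366014},
    "academic": {"middle": -0.2913931, "high": 0.9826703,
                 "write": 0.1136026, "_cons": -5.2182},
}

def rebase(coefs, old_base, new_base):
    """Re-express multinomial logit coefficients relative to new_base."""
    names = list(next(iter(coefs.values())))
    # The old base outcome implicitly has all-zero coefficients.
    full = {old_base: {n: 0.0 for n in names}}
    full.update(coefs)
    ref = full[new_base]
    return {outcome: {n: beta[n] - ref[n] for n in names}
            for outcome, beta in full.items() if outcome != new_base}

acad_base = rebase(COEFS_VOC_BASE, old_base="vocation", new_base="academic")
print(acad_base["vocation"])  # vocational vs. academic: signs reversed
print(acad_base["general"])   # general vs. academic: difference of the two sets
```

The vocation row reproduces the vocational/academic equation (5.218, 0.291, −0.983, −0.114 after rounding). Since the predicted probabilities are the same under either base, the log likelihood does not change.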