Multinomial Ordinal Models

The document discusses models for multi-category outcomes, focusing on multinomial logit and ordered logit models for nominal and ordinal data, respectively. It provides examples using Stata, highlighting practical issues such as the proportionality assumption in ordered logit models and methods for testing this assumption. Additionally, it outlines options for analysis when the proportionality assumption is violated.


Models for Ordered and Unordered Categorical Variables

DUSTIN C. BROWN
POPULATION RESEARCH CENTER
Objectives

 Introduce models for multi-category outcomes

 Briefly discuss multinomial logit (probit) models

 Briefly discuss ordinal logit (probit) models

 Show examples in Stata

 Discuss practical issues, extensions, etc.


Models for Multi-Category Outcomes

 These models can be viewed as extensions of binary logit and binary probit regression.

 The dependent variable has three or more categories and is nominal or ordinal.

 Multinomial logit and ordered logit models are two of the most common models.
Multinomial Logit (Probit)

 Multinomial logit (probit) models


 Nominal outcomes – no intrinsic order (qualitative)
 Three or more unordered categories

 Examples:
 Smoking status – never, current, former smoker
 Marital status – married, divorced, widowed, never
married
Multinomial Logit (Probit) Model

 Estimates a series of binary logit (probit) models

 One group is chosen to be the base (reference) category for the other groups (estimates equations for k – 1 groups)
 Example: If never smokers are the base category, then two
models are estimated:
 Current smokers vs. Never smokers
 Former smokers vs. Never smokers
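
The base-category setup above can be sketched numerically. This is a minimal illustration, assuming the standard multinomial logit form in which the base category's linear predictor is fixed at zero. The function name `mlogit_probs` and the linear-predictor values are hypothetical and are not estimates from the NHIS data.

```python
import math


def mlogit_probs(linear_predictors):
    """Return category probabilities for a multinomial logit.

    linear_predictors maps each non-base category to its linear
    predictor (x'b for that category). The base category has no
    equation of its own: its linear predictor is fixed at 0.
    """
    exp_terms = {k: math.exp(v) for k, v in linear_predictors.items()}
    denom = 1.0 + sum(exp_terms.values())  # 1.0 is exp(0) for the base
    probs = {"base": 1.0 / denom}
    probs.update({k: e / denom for k, e in exp_terms.items()})
    return probs


# Hypothetical linear predictors for one respondent, with never
# smokers as the base: k = 3 categories, so k - 1 = 2 equations.
probs = mlogit_probs({"current": -1.2, "former": -0.4})
for category, p in probs.items():
    print(f"{category}: {p:.3f}")
```

The ratio of any non-base probability to the base probability equals the exponentiated linear predictor for that category, which is why each equation reads as a comparison with the base group.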
Stata Example: Multinomial Logit

 The data are from the NHIS Adult Sample Files (2009)

 Outcome: Smoking Status – Never Smoked (Base Category), Current Smoker, Former Smoker

 Predictors:
 Education: <High School, High School, Some College, College (Ref.)
 Race-Ethnicity: NH White (Ref.), NH Black, Hispanic
 Age in years

 Stata Code: “mlogit smk3 lths hs scol nhb hispanic age, base(0) rrr”

 “base(0)” tells Stata that the comparison group is never smokers


 “rrr” tells Stata to display relative risk ratios
Stata Example: Multinomial Logit Output
Stata Example: Multinomial Logit Interpretation

 The risk of being a current vs. never smoker is 4.86 times as high for persons without a high school diploma as for college graduates, net of race-ethnicity and age.

 The risk of being a former vs. never smoker is about 33% [(0.666 – 1) * 100] lower for blacks relative to whites when education and age are held constant.

 The risk of being a former vs. never smoker increases by about 3% (RRR = 1.03) with each additional year of age, controlling for education and race-ethnicity.
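
The percentages in these interpretations all come from the same conversion, (ratio – 1) * 100. A minimal sketch of that arithmetic follows, using the three relative risk ratios quoted above. The function name `percent_change` is a label chosen here for illustration.

```python
def percent_change(ratio):
    """Convert a relative risk ratio (or odds ratio) to a percent change.

    A ratio above 1 gives a positive value (higher risk); a ratio
    below 1 gives a negative value (lower risk).
    """
    return (ratio - 1.0) * 100.0


# Relative risk ratios quoted in the interpretation bullets.
examples = [
    ("<High School, current vs. never", 4.86),
    ("NH Black, former vs. never", 0.666),
    ("Age (per year), former vs. never", 1.03),
]
for label, rrr in examples:
    print(f"{label}: RRR = {rrr}, change = {percent_change(rrr):+.1f}%")
```

Note that a ratio of 4.86 means the risk is 4.86 times as high, which is the same as a 386% increase.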
Ordered Logit (Probit) Models

 Ordered logit (probit) models


 Ordinal outcomes – inherently ordered categories
 Problem: Distance between adjacent categories is
unknown
 Solution: Treat the ordinal scale as though it represents a
latent interval/ratio scale

 Examples:
 Self-Rated Health – poor, fair, good, very good, excellent
Ordered Logit (Probit) Models

 Estimates the cumulative probability of being at or below a given category versus being in any higher category
 Proportionality Assumption – each predictor has the same effect at every cumulative split of the outcome (a.k.a. the proportional odds or parallel lines assumption)
 This assumption often is violated in practice
 Need to test if this assumption holds (can use a “Brant test”)
 Violating this assumption may or may not really “matter”
 Refer to Long & Freese (2006) for more information
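
The ordered logit model can be written compactly. The following is a sketch in the parameterization that Stata's ologit command uses. The symbols are notation introduced here, not taken from the slides: J is the number of outcome categories, the κ_j are the cut-points, and β is the coefficient vector.

```latex
\[
\Pr(Y \le j \mid \mathbf{x})
  = \frac{\exp(\kappa_j - \mathbf{x}^\top \boldsymbol{\beta})}
         {1 + \exp(\kappa_j - \mathbf{x}^\top \boldsymbol{\beta})},
  \qquad j = 1, \dots, J - 1
\]
\[
\log \frac{\Pr(Y > j \mid \mathbf{x})}{\Pr(Y \le j \mid \mathbf{x})}
  = \mathbf{x}^\top \boldsymbol{\beta} - \kappa_j
\]
```

Because β carries no j subscript, each predictor shifts the log odds of being in a higher category by the same amount at every split. That restriction is the proportionality assumption.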
Stata Example: Ordered Logit Model

 The data are from the NHIS Adult Sample Files (2009)

 Outcome: Self-Rated Health, where 1 = Excellent, 2 = Very Good, 3 = Good, 4 = Fair, 5 = Poor

 Predictors:
 Education: <High School, High School, Some College, College (Ref.)
 Race-Ethnicity: NH White (Ref.), NH Black, Hispanic
 Age in years

 Stata Code: “ologit health lths hs scol nhb hispanic age, or”

 The model is predicting the log odds of reporting worse health


 “or” tells Stata to display proportional odds ratios
Stata Example: Ordered Logit Output
Stata Example: Ordered Logit Interpretation

 The odds of reporting poor vs. fair, good, very good, or excellent health are 3.97 times as high for persons who did not graduate from high school as for persons with a college degree, net of race-ethnicity and age.

 Each additional year of age is associated with a 3.6% (OR = 1.036) increase in the odds of reporting poor vs. fair, good, very good, or excellent health when education and race-ethnicity are held constant.

 The cut-points (or thresholds) Stata used to differentiate between the adjacent levels of self-rated health are at the bottom of the output (cut1, cut2, etc.)
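
To show how the cut-points turn a linear predictor into category probabilities, here is a minimal sketch. It assumes the parameterization Pr(Y <= j) = logistic(cut_j - xb), which is the one Stata's ologit uses. The cut-point values and function names are hypothetical and are not taken from the NHIS output.

```python
import math


def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic."""
    return 1.0 / (1.0 + math.exp(-z))


def ologit_probs(xb, cutpoints):
    """Category probabilities implied by Pr(Y <= j) = logistic(cut_j - xb).

    cutpoints must be in increasing order; the result has one more
    entry than there are cut-points.
    """
    cumulative = [logistic_cdf(c - xb) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)  # Pr(Y = j) = Pr(Y <= j) - Pr(Y <= j-1)
        previous = cum
    return probs


def odds_above(xb, cut):
    """Odds of falling above a cut-point: Pr(Y > j) / Pr(Y <= j)."""
    below = logistic_cdf(cut - xb)
    return (1.0 - below) / below


# Hypothetical cut-points (cut1..cut4) for a five-category outcome.
cuts = [-1.0, 0.5, 2.0, 3.5]
print(ologit_probs(0.0, cuts))  # respondent with xb = 0
print(ologit_probs(1.0, cuts))  # respondent with xb = 1: mass moves upward

# The odds ratio between the two respondents is identical at every split.
print([odds_above(1.0, c) / odds_above(0.0, c) for c in cuts])
```

With self-rated health coded from 1 = Excellent to 5 = Poor, a larger linear predictor moves probability toward the worse-health categories. The constant odds ratio in the last line is the proportional odds property.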
Testing for Proportionality

 Once again, the ordered logit (probit) model assumes that the effect of each predictor is the same (proportional) at every cumulative split of the outcome.

 In practice, violating this assumption may or may not alter your substantive conclusions. You need to test whether this is the case.

 A Brant test can be used to test whether the proportional odds (i.e.,
parallel lines) assumption holds.
 This is available as a user-added post-estimation command in Stata.

 To download this command type “findit brant” in Stata.

 Once downloaded, you can type “brant” immediately after you estimate an ordered logit model (“ologit”) to perform the test.
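
The idea behind the test can be sketched as follows. The Brant test compares coefficients from the separate binary logit models implied by each cumulative split of the outcome. The sketch below only builds those binary outcomes; it does not fit the models or compute the test statistic. The function name `cumulative_splits` and the response values are hypothetical.

```python
def cumulative_splits(outcome, levels):
    """Build one 0/1 indicator per cumulative split of an ordinal outcome.

    For each level except the highest, the indicator is 1 when the
    observed value is above that level and 0 otherwise.
    """
    return {
        level: [1 if y > level else 0 for y in outcome]
        for level in levels[:-1]
    }


# Hypothetical self-rated health responses (1 = Excellent ... 5 = Poor).
health = [1, 3, 5, 2, 4, 3]
splits = cumulative_splits(health, [1, 2, 3, 4, 5])
for level, indicator in splits.items():
    print(f"health > {level}: {indicator}")
```

A five-category outcome yields four binary outcomes. If proportionality holds, a predictor's coefficient should be roughly the same in the binary logit fitted to each of them.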
Stata Example: Testing for Proportionality

 The Brant test indicates that the influence of education and race-ethnicity is not proportional across the categories of self-rated health. Note, though, that the association between age and self-rated health is proportional.
When the Proportionality Assumption is Violated…

 Option 1: Do nothing. Use ordered logistic regression because the practical implications of violating this assumption are minimal.

 Option 2: Use a multinomial logit model. This frees you of the proportionality
assumption, but it is less parsimonious and often dubious on substantive grounds.

 Option 3: Dichotomize the outcome and use binary logistic regression. This is
common, but you lose information and it could alter your substantive conclusions.

 Option 4: Use a model that does not assume proportionality. Increasingly, this is
common. Two user-submitted Stata commands fit these kinds of models:
 “gologit2” – generalized ordered logit models (see Williams 2006, Stata Journal)
 “oglm” – heterogeneous choice models (see Williams 2010, Stata Journal)

 Recommendation: Try all the above and decide what to do based on your results.
