
Slides 8 Iu

The document discusses the Multinomial Logit Model (MNL), which is used to analyze relationships between categorical dependent variables and explanatory variables. It highlights the need for MNL when dealing with more than two categories, providing examples such as health care provider choices and car preferences. The document also outlines the estimation process through maximum likelihood and includes data examples and analysis methods.

Uploaded by

Ngô Trâm
Copyright
© All Rights Reserved

MULTINOMIAL LOGIT MODEL

Trương Đăng Thụy


MULTINOMIAL RESPONSE
MULTINOMIAL RESPONSES
▪ logit/probit model: dependent variable = 0/1
▪ What if more than 2 categories?
▪ Example: the long-term effect of exposure to radiation may be
▪ 1 – died of cancer
▪ 2 – died of a cause other than cancer
▪ 3 – alive
MULTINOMIAL RESPONSES
▪ Example: choice of health care providers
▪ 1 – public hospital
▪ 2 – private hospital/clinic
▪ 3 – local traditional physician
▪ 4 – self-treatment

▪ Other examples:
▪ choice of car (Y = Toyota, Honda, Suzuki, Mazda, KIA…)
▪ choice of specialization at university
▪ choice of occupation

We can’t use OLS or Logit/Probit. Why?


MULTINOMIAL LOGIT MODEL
MULTINOMIAL LOGISTIC REGRESSION MODEL
▪ The multinomial logit model (MNL) is used to analyze the relationship between a categorical dependent variable with more than two categories and a set of explanatory variables
▪ Note the difference:
▪ nominal response (MNL)
▪ ordinal response (ordered probit model – not covered in this lecture)
THE DEPENDENT VARIABLE
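▪ The occurrence of alternative $j$ for individual $i$

$$Y_i = 1, 2, 3, \ldots, J$$

▪ Probability of occurrence of each alternative

$$
Y_i =
\begin{cases}
1 & \text{with probability } p_{i1} \\
2 & \text{with probability } p_{i2} \\
3 & \text{with probability } p_{i3} \\
\vdots & \quad\vdots \\
J & \text{with probability } p_{iJ}
\end{cases}
$$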
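THE LOGIT (LOG-ODDS RATIO)

$$
\begin{aligned}
Y_i = 1: &\quad h_{i1} = \log\frac{p_{i1}}{p_{iJ}} \\
Y_i = 2: &\quad h_{i2} = \log\frac{p_{i2}}{p_{iJ}} \\
Y_i = 3: &\quad h_{i3} = \log\frac{p_{i3}}{p_{iJ}} \\
&\quad\vdots \\
Y_i = J: &\quad h_{iJ} = 0
\end{aligned}
$$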
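The logits above can be checked numerically. The following is a minimal Python sketch, not part of the original slides (which work in R); the function name `logits_vs_base` and the three probabilities are hypothetical and chosen only for illustration.

```python
import math

def logits_vs_base(probs):
    """Return h_j = log(p_j / p_J) for every outcome, taking the last outcome J as base."""
    base = probs[-1]
    return [math.log(p / base) for p in probs]

# Hypothetical probabilities for J = 3 outcomes (illustrative, not from the slides).
p = [0.2, 0.3, 0.5]
h = logits_vs_base(p)

for j, h_j in enumerate(h, start=1):
    print(f"h_{j} = {h_j:.4f}")
```

The logit of the base outcome is always zero because it compares $p_{iJ}$ with itself.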
MODELLING THE LOGITS
▪ Given the choice variable with 𝐽 outcomes: 𝑌 = 1,2, … , 𝐽
▪ The corresponding probabilities of choosing these outcomes are 𝑝1 ,..., 𝑝𝐽
▪ Let’s choose 𝐽 as the base outcome (category)
▪ The logits (log-odds)

$$
\log\frac{p_1}{p_J} = \beta_1 X, \qquad
\log\frac{p_2}{p_J} = \beta_2 X, \qquad
\ldots, \qquad
\log\frac{p_J}{p_J} = 0
$$

▪ For $J$ alternatives, there will be $(J-1)$ sets of estimated coefficients $\beta$
THE PROBABILITIES
▪ In general

$$\log\frac{p_j}{p_J} = \beta_j X$$

▪ Exponentiate both sides

$$p_j = p_J \cdot e^{\beta_j X}$$

▪ Because the sum of all probabilities must be 1

$$p_J = 1 - \sum_{j=1}^{J-1} p_j$$

▪ Substitute $p_j = p_J \cdot e^{\beta_j X}$ into the above to yield

$$p_J = 1 - \sum_{j=1}^{J-1} p_J \cdot e^{\beta_j X}$$
THE PROBABILITIES
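▪ Collecting the $p_J$ terms on one side gives $p_J \left(1 + \sum_{j=1}^{J-1} e^{\beta_j X}\right) = 1$, so

$$p_J = \frac{1}{1 + \sum_{j=1}^{J-1} e^{\beta_j X}}$$

▪ Now use $p_j = p_J \cdot e^{\beta_j X}$ again to recover all the probabilities

$$p_j = \frac{e^{\beta_j X}}{1 + \sum_{k=1}^{J-1} e^{\beta_k X}}$$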
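To make the derivation concrete, here is a minimal Python sketch (an illustration, not the slides' own R code) that computes all $J$ probabilities from $J-1$ coefficient vectors with outcome $J$ as the base. The function name `mnl_probs_base`, the regressor values and the coefficients are all hypothetical.

```python
import math

def mnl_probs_base(x, betas):
    """Return [p_1, ..., p_J] with outcome J as the base category.

    x     : list of regressors (here with a leading 1.0 for the intercept)
    betas : list of J-1 coefficient vectors, one per non-base outcome
    """
    exp_terms = [math.exp(sum(b * v for b, v in zip(beta, x))) for beta in betas]
    denom = 1.0 + sum(exp_terms)           # 1 + sum over j of exp(beta_j X)
    probs = [e / denom for e in exp_terms]  # p_j for j = 1, ..., J-1
    probs.append(1.0 / denom)               # p_J for the base outcome
    return probs

# Hypothetical values: intercept plus one regressor, J = 3 outcomes.
x = [1.0, 2.0]
betas = [[0.5, -0.2], [-1.0, 0.3]]
p = mnl_probs_base(x, betas)

print(p, sum(p))
```

The probabilities sum to one by construction, and `math.log(p[0] / p[-1])` recovers $\beta_1 X$, the logit the model started from.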
THE PROBABILITIES
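$$
Y_i =
\begin{cases}
1 & \text{with probability } p_{i1} = \dfrac{e^{X_i \beta_1}}{\sum_{k=1}^{J} e^{X_i \beta_k}} \\[2ex]
2 & \text{with probability } p_{i2} = \dfrac{e^{X_i \beta_2}}{\sum_{k=1}^{J} e^{X_i \beta_k}} \\[2ex]
3 & \text{with probability } p_{i3} = \dfrac{e^{X_i \beta_3}}{\sum_{k=1}^{J} e^{X_i \beta_k}} \\[1ex]
\vdots & \quad\vdots \\[1ex]
J & \text{with probability } p_{iJ} = \dfrac{e^{X_i \beta_J}}{\sum_{k=1}^{J} e^{X_i \beta_k}}
\end{cases}
$$

▪ With $\beta_J = 0$ for the base outcome, $e^{X_i \beta_J} = 1$, so the denominator equals $1 + \sum_{k=1}^{J-1} e^{X_i \beta_k}$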
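The form with a sum over all $J$ outcomes agrees with the base-category form once the base coefficients are set to zero. A minimal Python sketch of that check follows; it is not taken from the slides, and the function name `softmax_probs` and all numeric values are hypothetical.

```python
import math

def softmax_probs(x, betas_all):
    """p_j = exp(X beta_j) / sum_k exp(X beta_k), summing over all J outcomes."""
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in betas_all]
    m = max(scores)  # subtracting the maximum leaves the ratios unchanged
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values: J = 3, base outcome J gets a zero coefficient vector.
x = [1.0, 2.0]
betas_free = [[0.5, -0.2], [-1.0, 0.3]]
betas_all = betas_free + [[0.0, 0.0]]
p_softmax = softmax_probs(x, betas_all)

# Base-category form: p_J = 1 / (1 + sum over j < J of exp(beta_j X)).
denom = 1.0 + sum(
    math.exp(sum(b * v for b, v in zip(beta, x))) for beta in betas_free
)
p_base_J = 1.0 / denom

print(p_softmax[-1], p_base_J)
```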
MAXIMUM LIKELIHOOD ESTIMATION
▪ MNL model is estimated by maximizing the log-likelihood function

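$$\log L = \sum_{i=1}^{N} \sum_{j=1}^{J} y_{ij} \ln p_{ij}$$

▪ where

$$
y_{ij} =
\begin{cases}
0 & \text{if } j \text{ is NOT chosen} \\
1 & \text{if } j \text{ is chosen}
\end{cases}
$$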
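The log-likelihood can be evaluated directly for any candidate coefficients. Below is a minimal Python sketch under stated assumptions: the function name `mnl_log_likelihood`, the three observations and the coefficient values are hypothetical, and the code only evaluates $\log L$ (it does not maximize it, which the slides do in R).

```python
import math

def mnl_log_likelihood(X, y, betas):
    """Evaluate log L = sum_i sum_j y_ij * ln p_ij.

    X     : list of regressor rows (leading 1.0 for the intercept)
    y     : chosen outcome per row, coded 1..J
    betas : J-1 coefficient vectors; outcome J is the base (beta_J = 0)
    """
    ll = 0.0
    for x, chosen in zip(X, y):
        scores = [sum(b * v for b, v in zip(beta, x)) for beta in betas] + [0.0]
        log_denom = math.log(sum(math.exp(s) for s in scores))
        # y_ij is 1 only for the chosen alternative, so only its ln p_ij is added.
        ll += scores[chosen - 1] - log_denom
    return ll

# Hypothetical data: 3 individuals, J = 3 outcomes, intercept plus one regressor.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, -0.3]]
y = [1, 3, 2]

ll_zero = mnl_log_likelihood(X, y, [[0.0, 0.0], [0.0, 0.0]])
ll_other = mnl_log_likelihood(X, y, [[0.5, -0.2], [-1.0, 0.3]])
print(ll_zero, ll_other)
```

With all coefficients equal to zero every outcome has probability $1/J$, so `ll_zero` equals $N \ln(1/J)$, a convenient sanity check.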
EXAMPLE DATA
VHLSS 2012 (Vietnam Household Living Standards Survey)
▪ Dependent variable: choice of health care provider
▪ 1 = Commune health center
▪ 2 = Public hospital
▪ 3 = Private hospital
▪ 4 = Lang y (traditional healer)
▪ 5 = Individual health care provider

▪ insurance: 1 = having insurance, 0 otherwise
▪ income: household income (mil. VND/year)
▪ female: 1 = female, 0 otherwise
▪ age (years)
▪ edu: 0 = under primary, 1 = primary, 2 = secondary, 3 = high school, 4 = college or higher
▪ urban: 1 = living in urban area, 0 otherwise
▪ Data file: mnl.xlsx
Data
id    case    choice    income    female    …
1     1       2         12        1
1     2       4         12        1
2     1       3         21        1
3     1                 17        0
3     2                 17        0
3     3                 17        0

choice: 1 = Commune health center; 2 = Public hospital; 3 = Private hospital; 4 = Lang y (traditional healer); 5 = Individual health care provider
income: mil. VND/year; female: 1 = female, 0 = male.
DESCRIPTIVE STATISTICS
THE DEPENDENT VARIABLE
COMMAND FOR FREQUENCY TABLE
CATEGORICAL VARIABLES IN THE DATASET
R FUNCTION FOR TWO-WAY TABLE
BIVARIATE ANALYSIS
BIVARIATE ANALYSIS
MNL MODEL IN RSTUDIO
SET THE BASE OUTCOME
THE NULL MNL MODEL
(INTERCEPT ONLY)
THE MNL MODEL (COEF)
THE MNL MODEL (SE)
THE MNL MODEL (Z)
THE MNL MODEL (P-VALUE)
RE-ARRANGE THE REGRESSION RESULTS
EQUATION FOR COMMUNE HEALTH CENTER
EQUATION FOR PRIVATE HOSPITAL
LR TEST FOR OVERALL SIGNIFICANCE
THE FITTED VALUES
THE PREDICTION
MCFADDEN R-SQUARE
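McFadden's R-square compares the log-likelihood of the fitted model with that of the null (intercept-only) model: $R^2_{McF} = 1 - \ln L_{model} / \ln L_{null}$. The following minimal Python sketch shows the arithmetic only; the function name `mcfadden_r2` and both log-likelihood values are hypothetical and are not results from the VHLSS example.

```python
def mcfadden_r2(ll_model, ll_null):
    """McFadden pseudo R-square: 1 - lnL(model) / lnL(null)."""
    return 1.0 - ll_model / ll_null

# Hypothetical log-likelihoods (NOT estimates from the slides' data).
ll_null = -1200.0   # intercept-only model
ll_model = -1080.0  # model with explanatory variables

r2 = mcfadden_r2(ll_model, ll_null)
print(round(r2, 4))
```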
HYPOTHESIS TESTING AFTER MNL
HYPOTHESIS TESTING AFTER MNL
MARGINAL EFFECTS AFTER MNL
MARGINAL EFFECTS AFTER MNL
MARGINAL EFFECTS AFTER MNL
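For a continuous regressor $x_k$, the marginal effect on the probability of outcome $j$ in the MNL model is $\partial p_j / \partial x_k = p_j \left(\beta_{jk} - \sum_{m=1}^{J} p_m \beta_{mk}\right)$, with $\beta_J = 0$ for the base outcome. The Python sketch below evaluates this at a single point; it is an illustration under stated assumptions, not the procedure used in the slides (which may, for example, average the effects over the sample). The function names and all numeric values are hypothetical.

```python
import math

def mnl_probs(x, betas):
    """Probabilities [p_1, ..., p_J]; betas holds J-1 vectors, base outcome J is last."""
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in betas] + [0.0]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mnl_marginal_effects(x, betas, k):
    """dp_j/dx_k = p_j * (beta_jk - sum_m p_m * beta_mk) for every outcome j."""
    betas_all = betas + [[0.0] * len(x)]  # append beta_J = 0 for the base outcome
    probs = mnl_probs(x, betas)
    avg_beta = sum(p * b[k] for p, b in zip(probs, betas_all))
    return [p * (b[k] - avg_beta) for p, b in zip(probs, betas_all)]

# Hypothetical values: intercept plus one continuous regressor, J = 3 outcomes.
x = [1.0, 2.0]
betas = [[0.5, -0.2], [-1.0, 0.3]]
me = mnl_marginal_effects(x, betas, k=1)

print(me, sum(me))
```

The effects sum to zero across outcomes because the probabilities must always add up to one. A marginal effect can also have the opposite sign from its coefficient, which is why coefficients alone are hard to interpret.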
