
Statistical and Econometric Methods for Transportation Engineering (CE687A)
Count models: application and model selection
Aditya Medury
Lecture 23

2022-23, Semester I
IIT Kanpur
1
Disclaimer

This course material is being distributed as part of CE687A, titled "Statistical and Econometric Methods for Transportation Engineering", at IIT Kanpur during Semester I of the academic year 2022-23. Its contents are being shared in confidence, for the sole purpose of instruction, and are only meant for the students registered in this course. Any form of distribution, reproduction, or uploading of these materials anywhere, or with anyone, outside this course is strictly prohibited.

2
Poisson-Gamma mixing

• Let the $\mu$'s in the population follow a Gamma distribution
• Let the crash counts $Z \mid \mu$ follow a Poisson distribution

$$\Pr(Z = k \mid \mu) = \frac{\mu^k}{k!} e^{-\mu}, \qquad E[Z \mid \mu] = \mu, \quad \mathrm{Var}[Z \mid \mu] = \mu$$

$$f(\mu) = \frac{\beta^\alpha}{\Gamma(\alpha)} e^{-\beta\mu}\, \mu^{\alpha-1}, \qquad E[\mu] = \frac{\alpha}{\beta}, \quad \mathrm{Var}[\mu] = \frac{\alpha}{\beta^2}$$
3
Poisson-Gamma mixing


$$\Pr(Z = k) = \int_0^{\infty} \Pr(Z = k \mid \mu)\, f(\mu)\, d\mu$$

$$\Pr(Z = k) = \int_0^{\infty} \frac{\mu^k}{k!} e^{-\mu} \cdot \frac{\beta^\alpha}{\Gamma(\alpha)} e^{-\beta\mu}\, \mu^{\alpha-1}\, d\mu$$

$$= \frac{\beta^\alpha}{\Gamma(\alpha)\, k!} \int_0^{\infty} e^{-\mu(1+\beta)}\, \mu^{k+\alpha-1}\, d\mu$$

The integrand resembles (the kernel of) a gamma distribution.

4
Negative binomial distribution

$$\Pr(Z = k) = \frac{\Gamma(k+\alpha)}{\Gamma(\alpha)\, k!} \cdot \frac{\beta^\alpha}{(\beta+1)^{k+\alpha}}$$

$$E[Z] = \frac{\alpha}{\beta}, \qquad \mathrm{Var}[Z] = \frac{\alpha}{\beta} + \frac{\alpha}{\beta^2} = E[\mu] + \mathrm{Var}[\mu]$$

5
Re-parameterized NB distribution
(as used in R)

$$\Pr(Z = k) = \frac{\Gamma(k+\theta)}{\Gamma(\theta)\, k!} \left(\frac{\theta}{\theta+\mu}\right)^{\theta} \left(\frac{\mu}{\theta+\mu}\right)^{k}$$

where, with $\theta = \alpha$,

$$E[Z] = \mu = \frac{\alpha}{\beta}, \qquad \mathrm{Var}[Z] = \mu + \frac{1}{\theta}\mu^2 = \frac{\alpha}{\beta} + \frac{\alpha}{\beta^2}$$

6
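The mixture result above can be checked numerically. The following is a minimal R sketch (not from the original slides; the parameter values are illustrative) that simulates the Poisson-Gamma mixture and compares the resulting counts against dnbinom() in R's size/mu parameterization.

```r
# Simulate the Poisson-Gamma mixture and compare with the negative binomial
# (size = theta = alpha, mu = alpha / beta). Parameter values are illustrative.
set.seed(1)
alpha <- 2; beta <- 0.5
mu    <- rgamma(1e5, shape = alpha, rate = beta)   # heterogeneity in means
z     <- rpois(1e5, lambda = mu)                   # counts conditional on mu

k <- 0:5
cbind(empirical = sapply(k, function(i) mean(z == i)),
      negbin    = dnbinom(k, size = alpha, mu = alpha / beta))

c(mean(z), var(z))   # close to alpha/beta and alpha/beta + alpha/beta^2
c(var(z), var(mu))   # Var[Z] > Var[mu], as noted on the next slide
```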
Var[Z] > Var[μ]

7
Image source: Hauer, E. (2015). The art of regression modeling in road safety (Vol. 38). New York: Springer.
Recap: Law of iterated expectations

$$E[Y] = E_X\big[E[Y \mid X]\big]$$

• $E_X[\cdot]$ is the expectation over the values of $X$.

8
Law of total variance

$$\mathrm{Var}[Y] = E\big[\mathrm{Var}[Y \mid X]\big] + \mathrm{Var}\big[E[Y \mid X]\big]$$

9
Deriving the law of total variance (I)

$$\mathrm{Var}[Y \mid X] = E[Y^2 \mid X] - \big(E[Y \mid X]\big)^2 \quad (\text{conditional variance})$$

$$\Rightarrow E\big[\mathrm{Var}[Y \mid X]\big] = E\big[E[Y^2 \mid X]\big] - E\big[(E[Y \mid X])^2\big]$$

$$= E[Y^2] - E\big[(E[Y \mid X])^2\big] \quad (\text{iterated expectations})$$

$$= \big(E[Y^2] - (E[Y])^2\big) - \Big(E\big[(E[Y \mid X])^2\big] - (E[Y])^2\Big)$$

10
Deriving the law of total variance (II)

$$E\big[\mathrm{Var}[Y \mid X]\big] = \big(E[Y^2] - (E[Y])^2\big) - \Big(E\big[(E[Y \mid X])^2\big] - (E[Y])^2\Big)$$

$$= \big(E[Y^2] - (E[Y])^2\big) - \Big(E\big[(E[Y \mid X])^2\big] - \big(E[E[Y \mid X]]\big)^2\Big) \qquad (\text{since } E[E[Y \mid X]] = E[Y])$$

$$= \mathrm{Var}[Y] - \mathrm{Var}\big[E[Y \mid X]\big]$$

$$\Rightarrow \mathrm{Var}[Y] = E\big[\mathrm{Var}[Y \mid X]\big] + \mathrm{Var}\big[E[Y \mid X]\big]$$

11
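As a sanity check on the derivation, here is a small Monte Carlo verification in R (illustrative, not part of the slides), using the Poisson-Gamma structure introduced earlier, with X playing the role of μ.

```r
# Monte Carlo check of Var[Y] = E[Var[Y|X]] + Var[E[Y|X]] for Y | X ~ Poisson(X),
# X ~ Gamma(alpha, beta); here Var[Y|X] = E[Y|X] = X, so the right-hand side is E[X] + Var[X].
set.seed(2)
alpha <- 2; beta <- 0.5
x <- rgamma(1e6, shape = alpha, rate = beta)
y <- rpois(1e6, lambda = x)

var(y)            # total variance Var[Y]
mean(x) + var(x)  # E[Var[Y|X]] + Var[E[Y|X]]; the two numbers should be close
```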
What is the distribution of the mean conditional on observing crashes?

• $k \mid \mu \sim \mathrm{Poisson}(\mu)$
• $\mu \sim \mathrm{Gamma}(\alpha, \beta)$
• $\mu \mid k \sim\ ?$

$$f(\mu \mid k) = \frac{P(k \mid \mu)\, f(\mu)}{P(k)} \propto P(k \mid \mu)\, f(\mu)$$

$$\propto \frac{\mu^k}{k!} e^{-\mu} \cdot \frac{\beta^\alpha}{\Gamma(\alpha)} e^{-\beta\mu}\, \mu^{\alpha-1}$$

$$\propto e^{-(1+\beta)\mu}\, \mu^{(\alpha+k)-1}$$

$$\sim \Gamma(\alpha + k,\ \beta + 1)$$
12
Combining information from 𝝁 and 𝒌

$$f(\mu \mid k) \sim \Gamma(\alpha + k,\ \beta + 1)$$

$$E[\mu \mid k] = \frac{\alpha + k}{\beta + 1}, \qquad \mathrm{Var}[\mu \mid k] = \frac{\alpha + k}{(\beta + 1)^2}$$

13
E[μ|k] is a weighted average of E[μ] and k

$$E[\mu \mid k] = \frac{\alpha}{\beta + 1} + \frac{k}{\beta + 1}$$

Since $E[\mu] = \dfrac{\alpha}{\beta}$, substituting $\beta = \dfrac{\alpha}{E[\mu]}$,

$$E[\mu \mid k] = \frac{\alpha}{\frac{\alpha}{E[\mu]} + 1} + \frac{k}{\frac{\alpha}{E[\mu]} + 1} = \frac{\alpha\, E[\mu]}{\alpha + E[\mu]} + \frac{k\, E[\mu]}{\alpha + E[\mu]}$$

$$w = \frac{\alpha}{\alpha + E[\mu]} = \frac{1}{1 + E[\mu]/\alpha} = \frac{1}{1 + E[\mu]/\theta}$$

14
E[μ|k] is a weighted average of E[μ] and k

$$E[\mu \mid k] = w\, E[\mu] + (1 - w)\, k$$

where,

$$w = \frac{\alpha}{\alpha + E[\mu]} = \frac{1}{1 + E[\mu]/\alpha} = \frac{1}{1 + E[\mu]/\theta}$$

• As $E[\mu] \uparrow$, $w \downarrow$: $E[\mu \mid k] \rightarrow \alpha + k$
• As $\theta\ (=\alpha) \downarrow$, i.e., greater overdispersion, $w \downarrow$: $E[\mu \mid k] \rightarrow k$

15
E[μ|k] is an empirical Bayes estimate of μ

• 𝐸 𝜇 : prior, obtained from empirical data/modeling


• 𝐸 𝜇|𝑘 : posterior estimate

16
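To make the update concrete, here is a minimal R sketch of the empirical Bayes estimate implied by the formulas above (the function name and numerical values are illustrative; in practice the prior mean E[μ] and θ would come from a fitted NB regression).

```r
# Empirical Bayes update: E[mu | k] = w * E[mu] + (1 - w) * k, with w = 1 / (1 + E[mu]/theta).
eb_update <- function(k, mu_prior, theta) {
  w <- 1 / (1 + mu_prior / theta)   # weight on the prior (model-based) mean
  w * mu_prior + (1 - w) * k        # posterior estimate of mu given the observed count k
}

eb_update(k = 8, mu_prior = 3, theta = 2)   # the observed count pulls the estimate above the prior
```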
Let us revisit the AADT model

Let's compare three models (see the R sketch after this slide):

• Model 1: CNTYPOP, NUMLANES2, NUMCLASS3, FUNCLASS3
• Model 2: CNTYPOP, CNTYPOP_Urb, Urban, NUMLANES2, NUMCLASS3, FUNCLASS3
• Model 3: log(CNTYPOP), NUMLANES2, NUMCLASS3, FUNCLASS3

17
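A sketch of how these three linear specifications might be set up in R is shown below. The data frame name aadt_data and the response name AADT are assumptions; the explanatory variable names are taken from the slide, with CNTYPOP_Urb presumed to be the CNTYPOP × Urban interaction.

```r
# Linear specifications for the AADT models (data frame and response names assumed).
m1 <- lm(AADT ~ CNTYPOP + NUMLANES2 + NUMCLASS3 + FUNCLASS3, data = aadt_data)
m2 <- lm(AADT ~ CNTYPOP + CNTYPOP_Urb + Urban + NUMLANES2 + NUMCLASS3 + FUNCLASS3,
         data = aadt_data)              # CNTYPOP_Urb: presumed CNTYPOP x Urban interaction column
m3 <- lm(AADT ~ log(CNTYPOP) + NUMLANES2 + NUMCLASS3 + FUNCLASS3, data = aadt_data)

summary(m1)   # coefficient signs and significance feed into the comparison that follows
```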
What is the variation across urban/rural segments?

• Other variables are also likely to be correlated with Urban (e.g., FUNCLASS3)

18
Comparison of linear models

• Which model would you prefer? Why?

• Are the relevant variables significant, with appropriate signs?

19
Checking for heteroskedasticity

The null hypothesis assumes homoskedasticity, which can be rejected for all three models at low significance levels.

20
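The slides do not name the specific test used, but a common choice is the Breusch-Pagan test; a minimal sketch in R (assuming the linear model m1 from the earlier sketch) follows.

```r
# Breusch-Pagan test for heteroskedasticity on the linear AADT model m1.
library(lmtest)
bptest(m1)   # a small p-value rejects the null of homoskedasticity
```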
Recap: Heteroskedasticity Consistent Estimator

$$\widehat{\mathrm{Asy.\,Var}}[\boldsymbol{\beta} \mid \mathbf{X}] = \frac{1}{n}\left[\frac{1}{n}\mathbf{X}'\mathbf{X}\right]^{-1}\left[\frac{1}{n}\sum_{i=1}^{n} e_i^2\, \mathbf{x}_i \mathbf{x}_i'\right]\left[\frac{1}{n}\mathbf{X}'\mathbf{X}\right]^{-1}$$

• The above estimator is also known as the Eicker-Huber-White heteroskedasticity-consistent estimator.
• This estimator is robust to heteroskedasticity of unknown form, and provides "robust" standard errors for confidence intervals.
• Other techniques include weighted least squares (WLS), if the nature of the heteroskedasticity is known (see Section 4.4.2 of Washington et al.).

Additional reference for more mathematical background: http://people.stern.nyu.edu/wgreene/MathStat/GreeneChapter9.pdf 21
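In R, such robust standard errors can be obtained as sketched below, using the sandwich and lmtest packages on the model m1 from the earlier sketch; "HC0" corresponds to the estimator shown above.

```r
# Eicker-Huber-White robust standard errors for the linear model m1.
library(sandwich)
library(lmtest)
coeftest(m1, vcov = vcovHC(m1, type = "HC0"))   # coefficient table with robust standard errors
```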


How different are robust standard errors?

Before After

22
Considering Poisson and NB alternatives

Let's compare the following models (see the R sketch after this slide):
• Model 1 (linear): CNTYPOP, NUMLANES2, NUMCLASS3, FUNCLASS3

• Model 4 (Poisson): CNTYPOP, NUMLANES2, NUMCLASS3, FUNCLASS3

• Model 5 (Poisson): log(CNTYPOP), NUMLANES2, NUMCLASS3, FUNCLASS3

• Model 6 (NB): log(CNTYPOP), NUMLANES2, NUMCLASS3, FUNCLASS3

23
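A sketch of how the Poisson and NB alternatives (Models 4-6) might be fit in R follows, again assuming the aadt_data frame and AADT response from the earlier sketch; glm.nb comes from the MASS package.

```r
# Poisson and negative binomial specifications for AADT (treated as a count, as in the lecture).
library(MASS)
m4 <- glm(AADT ~ CNTYPOP + NUMLANES2 + NUMCLASS3 + FUNCLASS3,
          family = poisson(link = "log"), data = aadt_data)
m5 <- glm(AADT ~ log(CNTYPOP) + NUMLANES2 + NUMCLASS3 + FUNCLASS3,
          family = poisson(link = "log"), data = aadt_data)
m6 <- glm.nb(AADT ~ log(CNTYPOP) + NUMLANES2 + NUMCLASS3 + FUNCLASS3, data = aadt_data)

AIC(m4, m5, m6)   # one way to compare the count-model specifications
```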
Comparison of generalized linear models

• The linear model is a special case of generalized linear models.

• Would you prefer log(CNTYPOP) over CNTYPOP as an explanatory variable for the Poisson and NB models?
❑ $E[\mu_i] = e^{\sum_{j=0}^{K-1} \beta_j x_{ij}}$

• Would you prefer NB over Poisson?

24
So which model would you prefer for modelling AADT?
(Model selection)

25
Training vs test data

• Out-of-sample predictions are expected to be worse than in-sample predictions, due to the possibility of overfitting when seeking models with favourable goodness-of-fit criteria.
❑ Penalizing for an increase in the number of variables helps mitigate this issue.

Image source: https://medium.com/greyatom/what-is-underfitting-and-overfitting-in-machine-learning-and-how-to-deal-with-it-6803a989c76 26


Estimating training vs test data differences for AADT models

• Given the small sample size, we can split the data as 90% training and 10% test data.
• We run 100 iterations of train-test splits, and estimate the training and test RMSE and MAD (see the R sketch after this slide) for:
• Model 1 (the preferred linear model out of the three)
• Model 2 (the preferred Poisson model out of the two)
• Model 3 (the preferred NB model)

27
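The repeated train-test evaluation might be coded roughly as below. This is a sketch under the same assumed data frame; only the linear Model 1 formula is shown, and the loop would be repeated with the Poisson and NB fits.

```r
# Repeated 90/10 train-test splits, recording training and test RMSE and MAD for one model.
set.seed(3)
n_iter <- 100
rmse <- mad_err <- matrix(NA, n_iter, 2, dimnames = list(NULL, c("train", "test")))

for (i in seq_len(n_iter)) {
  idx   <- sample(nrow(aadt_data), size = floor(0.9 * nrow(aadt_data)))
  train <- aadt_data[idx, ]
  test  <- aadt_data[-idx, ]
  fit   <- lm(AADT ~ CNTYPOP + NUMLANES2 + NUMCLASS3 + FUNCLASS3, data = train)

  pred_tr <- predict(fit, newdata = train)
  pred_te <- predict(fit, newdata = test)
  rmse[i, ]    <- c(sqrt(mean((train$AADT - pred_tr)^2)),
                    sqrt(mean((test$AADT - pred_te)^2)))
  mad_err[i, ] <- c(mean(abs(train$AADT - pred_tr)), mean(abs(test$AADT - pred_te)))
}

colMeans(rmse); colMeans(mad_err)   # average in-sample vs out-of-sample error
```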
Mean Absolute Deviation

Train Test

28
Root Mean Squared Error

Train Test

29
Overdispersion tests

• When data are overdispersed, the observed variance is larger than that implied by a true Poisson process → a Poisson model then understates the standard errors (equivalently, they need to be inflated to be valid).

Given a Poisson regression,

$$H_0: \mathrm{var}(y_i) = \mu_i$$
$$H_A: \mathrm{var}(y_i) = \mu_i + \alpha \mu_i^2$$

Regression-based tests can be undertaken to test the $\alpha = 0$ assumption (see Section 11.5 of Washington et al. or Cameron and Trivedi (1990)).
• The AER package in R has a function dispersiontest (see the sketch after this slide).

30
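A minimal sketch of this test in R, applied to the Poisson model m5 from the earlier sketch; trafo = 2 selects the quadratic alternative var(y) = μ + αμ², matching H_A above.

```r
# Overdispersion test for the Poisson model m5 (Cameron-Trivedi regression-based test).
library(AER)
dispersiontest(m5, trafo = 2)   # a small p-value indicates overdispersion, favouring the NB model
```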
A new goodness-of-fit statistic: deviance

$$D(\hat{\boldsymbol{\mu}}) = 2\big[LL_{max}(\mathbf{y}) - LL_{fitted}(\hat{\boldsymbol{\mu}})\big]$$

$LL_{max}$: maximum possible value of the log-likelihood (attained at $\hat{\mu}_i = y_i$)
• For the Poisson distribution: $LL_{max} = \sum_{i=1}^{N} \big[y_i \log y_i - y_i - \log y_i!\big]$

• For the normal distribution: $D = \sum_{i=1}^{N} (y_i - \hat{\mu}_i)^2$

• For the Poisson distribution: $D = 2\sum_{i=1}^{N} \big[y_i \log(y_i/\hat{\mu}_i) - (y_i - \hat{\mu}_i)\big]$

• Null Deviance: $\hat{\mu}_i = \bar{y}$ (also the outcome of an intercept-only model)

31
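These quantities can be read off (or recomputed) from a fitted glm in R; below is a short check against the Poisson model m5 from the earlier sketch.

```r
# Compare the manual Poisson deviance with the deviance stored in the glm object m5.
y  <- m5$y             # observed response
mu <- fitted(m5)       # fitted means

manual_dev <- 2 * sum(ifelse(y == 0, 0, y * log(y / mu)) - (y - mu))   # formula above
c(manual = manual_dev, glm = deviance(m5))                             # should agree

m5$null.deviance       # null deviance: intercept-only model, mu_hat_i = mean(y)
```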
Comments
Discussion
Questions

E-mail: [email protected]
32
