CE687A Lecture23
2022-23, Semester I
IIT Kanpur
Disclaimer
This course material is being distributed as part of CE687A, titled “Statistical and
Econometric Methods for Transportation Engineering”, at IIT Kanpur during
semester I of the academic year 2022-23. Its contents are being shared in
confidence, for the sole purpose of instruction, and are only meant for the
students registered in this course. Any form of distribution, reproduction or
uploading of these materials anywhere, or with anyone, outside this course is
strictly prohibited.
Poisson-Gamma mixing
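$$\mathrm{Prob}[Z = k \mid \mu] = \frac{\mu^{k} e^{-\mu}}{k!}$$
$$f(\mu) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, e^{-\beta\mu}\, \mu^{\alpha-1}$$
$$E[\mu] = \frac{\alpha}{\beta}, \qquad \mathrm{Var}[\mu] = \frac{\alpha}{\beta^{2}}$$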
Poisson-Gamma mixing
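$$\mathrm{Prob}[Z = k] = \int_{0}^{\infty} \mathrm{Prob}[Z = k \mid \mu]\, f(\mu)\, d\mu$$
$$\mathrm{Prob}[Z = k] = \int_{0}^{\infty} \frac{\mu^{k} e^{-\mu}}{k!}\, \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, e^{-\beta\mu}\, \mu^{\alpha-1}\, d\mu$$
$$= \frac{\beta^{\alpha}}{\Gamma(\alpha)\, k!} \int_{0}^{\infty} e^{-\mu(1+\beta)}\, \mu^{k+\alpha-1}\, d\mu$$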
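The remaining integral can be evaluated with the standard gamma integral $\int_0^\infty e^{-c\mu}\mu^{s-1}\,d\mu = \Gamma(s)/c^{s}$. The shorthand $c = 1+\beta$ and $s = k+\alpha$ is introduced only for this note; the step is a sketch of how the mixture collapses to a closed form.

```latex
\int_{0}^{\infty} e^{-\mu(1+\beta)}\, \mu^{k+\alpha-1}\, d\mu
  = \frac{\Gamma(k+\alpha)}{(1+\beta)^{k+\alpha}}
\quad\Longrightarrow\quad
\mathrm{Prob}[Z = k]
  = \frac{\beta^{\alpha}}{\Gamma(\alpha)\, k!}\cdot
    \frac{\Gamma(k+\alpha)}{(1+\beta)^{k+\alpha}}
```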
Negative binomial distribution
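$$\mathrm{Prob}[Z = k] = \frac{\Gamma(k+\alpha)}{\Gamma(\alpha)\, k!}\, \frac{\beta^{\alpha}}{(\beta+1)^{k+\alpha}}$$
$$E[Z] = \frac{\alpha}{\beta}, \qquad \mathrm{Var}[Z] = \frac{\alpha}{\beta} + \frac{\alpha}{\beta^{2}} = E[\mu] + \mathrm{Var}[\mu]$$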
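A quick simulation can make the mixing result concrete. The sketch below draws a gamma-distributed mean for each observation, then a Poisson count given that mean, and compares the sample mean and variance of the counts with $\alpha/\beta$ and $\alpha/\beta + \alpha/\beta^2$. The function name, parameter values and sample size are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def simulate_poisson_gamma(alpha, beta, size, seed=0):
    """Draw counts Z where mu ~ Gamma(alpha, rate=beta) and Z | mu ~ Poisson(mu)."""
    rng = np.random.default_rng(seed)
    # NumPy parameterizes the gamma by shape and scale, so scale = 1 / rate.
    mu = rng.gamma(shape=alpha, scale=1.0 / beta, size=size)
    return rng.poisson(mu)

alpha, beta = 2.0, 0.5          # illustrative values only
z = simulate_poisson_gamma(alpha, beta, size=200_000)

theoretical_mean = alpha / beta                    # E[Z]
theoretical_var = alpha / beta + alpha / beta**2   # Var[Z] = E[mu] + Var[mu]

print("mean:", z.mean(), "vs", theoretical_mean)
print("variance:", z.var(), "vs", theoretical_var)
```

The simulated variance exceeds the simulated mean, which is the overdispersion that a plain Poisson model cannot represent.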
Re-parameterized NB distribution
(as used in R)
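$$\mathrm{Prob}[Z = k] = \frac{\Gamma(k+\theta)}{\Gamma(\theta)\, k!} \left(\frac{\theta}{\theta+\mu}\right)^{\theta} \left(\frac{\mu}{\theta+\mu}\right)^{k}$$
where
$$E[Z] = \mu = \frac{\alpha}{\beta}, \qquad \mathrm{Var}[Z] = \mu + \frac{1}{\theta}\mu^{2} = \frac{\alpha}{\beta} + \frac{\alpha}{\beta^{2}}$$
Matching the two variance expressions gives $\theta = \alpha$. Note that $\mu$ here denotes the NB mean $E[Z]$, not the random Poisson mean of the earlier slides.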
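The two parameterizations can be checked against each other numerically. The sketch below codes both probability mass functions on the log scale and maps $(\alpha, \beta)$ to $(\mu, \theta)$ via $\mu = \alpha/\beta$, $\theta = \alpha$. The function names and parameter values are hypothetical and chosen only for illustration.

```python
from math import exp, lgamma, log

def nb_pmf_alpha_beta(k, alpha, beta):
    """NB pmf written in terms of the gamma mixing parameters (alpha, beta)."""
    log_p = (lgamma(k + alpha) - lgamma(alpha) - lgamma(k + 1)
             + alpha * log(beta) - (k + alpha) * log(beta + 1.0))
    return exp(log_p)

def nb_pmf_mu_theta(k, mu, theta):
    """NB pmf written in terms of the mean mu and dispersion parameter theta."""
    log_p = (lgamma(k + theta) - lgamma(theta) - lgamma(k + 1)
             + theta * log(theta / (theta + mu))
             + k * log(mu / (theta + mu)))
    return exp(log_p)

alpha, beta = 2.0, 0.5            # illustrative values only
mu, theta = alpha / beta, alpha   # mapping between the two forms

for k in range(5):
    print(k, nb_pmf_alpha_beta(k, alpha, beta), nb_pmf_mu_theta(k, mu, theta))
```

Both columns agree for every count, since $\theta/(\theta+\mu) = \beta/(\beta+1)$ and $\mu/(\theta+\mu) = 1/(\beta+1)$ under this mapping.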
$$\mathrm{Var}[Z] > \mathrm{Var}[\mu]$$
Image source: Hauer, E. (2015). The art of regression modeling in road safety (Vol. 38). New York: Springer.
Recap: Law of iterated expectations
$$E[Y] = E_{X}\big[E[Y \mid X]\big]$$
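Law of total variance
$$\mathrm{Var}[Y] = E_{X}\big[\mathrm{Var}[Y \mid X]\big] + \mathrm{Var}_{X}\big[E[Y \mid X]\big]$$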
Deriving the law of total variance (I)
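$$\begin{aligned}
E\big[\mathrm{Var}[Y \mid X]\big] &= E\big[E[Y^{2} \mid X] - (E[Y \mid X])^{2}\big] = E[Y^{2}] - E\big[(E[Y \mid X])^{2}\big] \\
&= \big(E[Y^{2}] - (E[Y])^{2}\big) - \big(E\big[(E[Y \mid X])^{2}\big] - (E[Y])^{2}\big) \\
&= \big(E[Y^{2}] - (E[Y])^{2}\big) - \big(E\big[(E[Y \mid X])^{2}\big] - \big(E\big[E[Y \mid X]\big]\big)^{2}\big) \\
&= \mathrm{Var}[Y] - \mathrm{Var}\big[E[Y \mid X]\big]
\end{aligned}$$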
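The identity can be verified exactly on a small discrete example. The sketch below uses a made-up joint distribution of $(X, Y)$ and computes each term of $\mathrm{Var}[Y] = E[\mathrm{Var}[Y|X]] + \mathrm{Var}[E[Y|X]]$ directly; the probabilities are hypothetical and serve only as a check.

```python
import numpy as np

# Hypothetical joint pmf: rows index X in {0, 1}, columns index Y in {0, 1, 2}.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.05, 0.15, 0.40]])
y_vals = np.array([0.0, 1.0, 2.0])

p_x = joint.sum(axis=1)            # marginal distribution of X
p_y = joint.sum(axis=0)            # marginal distribution of Y
cond = joint / p_x[:, None]        # conditional pmf of Y given each X

e_y_given_x = cond @ y_vals                          # E[Y | X = x]
var_y_given_x = cond @ y_vals**2 - e_y_given_x**2    # Var[Y | X = x]

e_y = p_y @ y_vals
var_y = p_y @ y_vals**2 - e_y**2

expected_cond_var = p_x @ var_y_given_x                              # E[Var[Y | X]]
var_cond_mean = p_x @ e_y_given_x**2 - (p_x @ e_y_given_x) ** 2      # Var[E[Y | X]]

print(var_y, expected_cond_var + var_cond_mean)
```

The same decomposition gives the NB variance: with $Z|\mu$ Poisson, $E[\mathrm{Var}[Z|\mu]] = E[\mu]$ and $\mathrm{Var}[E[Z|\mu]] = \mathrm{Var}[\mu]$.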
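What is the distribution of the mean conditional on observing
crashes?
• $k \mid \mu \sim \mathrm{Poisson}[\mu]$
• $\mu \sim \mathrm{Gamma}[\alpha, \beta]$
• $\mu \mid k \sim\ ?$
$$f(\mu \mid k) = \frac{P(k \mid \mu)\, f(\mu)}{P(k)} \propto P(k \mid \mu)\, f(\mu) \propto e^{-(1+\beta)\mu}\, \mu^{(\alpha+k)-1}$$
$$\Rightarrow\quad \mu \mid k \sim \mathrm{Gamma}[\alpha + k,\ \beta + 1]$$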
Combining information from 𝝁 and 𝒌
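$$\mu \mid k \sim \mathrm{Gamma}[\alpha + k,\ \beta + 1]$$
$$E[\mu \mid k] = \frac{\alpha + k}{\beta + 1}, \qquad \mathrm{Var}[\mu \mid k] = \frac{\alpha + k}{(\beta + 1)^{2}}$$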
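The conjugate update can be sanity-checked by brute force. The sketch below returns the posterior shape and rate from the closed form, then recomputes the posterior mean by summing the unnormalized product of the Poisson likelihood and gamma prior over a fine grid. Function names, the grid settings and the example values are assumptions made for illustration.

```python
import numpy as np

def gamma_posterior(alpha, beta, k):
    """Shape and rate of mu | k for a Gamma(alpha, beta) prior and one Poisson count k."""
    return alpha + k, beta + 1.0

def posterior_mean_on_grid(alpha, beta, k, upper=100.0, n=200_001):
    """Posterior mean from a simple Riemann sum over an evenly spaced grid."""
    mu = np.linspace(1e-9, upper, n)
    # Unnormalized posterior: Poisson kernel times gamma prior kernel.
    kernel = mu**k * np.exp(-mu) * mu**(alpha - 1.0) * np.exp(-beta * mu)
    return float(np.sum(mu * kernel) / np.sum(kernel))

shape, rate = gamma_posterior(alpha=2.0, beta=0.5, k=7)   # illustrative values
closed_form_mean = shape / rate
grid_mean = posterior_mean_on_grid(alpha=2.0, beta=0.5, k=7)

print(closed_form_mean, grid_mean)
```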
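𝑬[𝝁|𝒌] is a weighted average of 𝑬[𝝁] and 𝒌
$$E[\mu \mid k] = \frac{\alpha}{\beta + 1} + \frac{k}{\beta + 1}$$
Since $E[\mu] = \alpha/\beta$, substituting $\beta = \alpha / E[\mu]$:
$$E[\mu \mid k] = \frac{\alpha}{\frac{\alpha}{E[\mu]} + 1} + \frac{k}{\frac{\alpha}{E[\mu]} + 1} = \frac{\alpha\, E[\mu]}{\alpha + E[\mu]} + \frac{k\, E[\mu]}{\alpha + E[\mu]}$$
$$w = \frac{\alpha}{\alpha + E[\mu]} = \frac{1}{1 + E[\mu]/\alpha} = \frac{1}{1 + E[\mu]\,\theta}$$
(In this expression $\theta = 1/\alpha$, the reciprocal of the $\theta$ used in the R parameterization of the NB distribution.)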
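𝑬[𝝁|𝒌] is a weighted average of 𝑬[𝝁] and 𝒌
$$E[\mu \mid k] = w\, E[\mu] + (1 - w)\, k$$
where,
$$w = \frac{\alpha}{\alpha + E[\mu]} = \frac{1}{1 + E[\mu]/\alpha} = \frac{1}{1 + E[\mu]\,\theta}$$
(In this expression $\theta = 1/\alpha$, the reciprocal of the $\theta$ used in the R parameterization of the NB distribution.)
• $E[\mu] \uparrow$ with $\alpha$ fixed (so $\beta = \alpha/E[\mu] \to 0$) $\Rightarrow w \downarrow$: $E[\mu \mid k] \to \alpha + k$
• $\theta \uparrow$ with $E[\mu]$ fixed (so $\alpha \to 0$) $\Rightarrow w \downarrow$: $E[\mu \mid k] \to k$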
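A small numerical example shows the weighting at work. The sketch below computes $w$ and the weighted average for one hypothetical site and confirms that it matches the posterior mean $(\alpha + k)/(\beta + 1)$; the function name and input values are illustrative assumptions.

```python
def eb_estimate(alpha, beta, k):
    """Return the weight w and the weighted-average estimate of mu given count k."""
    prior_mean = alpha / beta                 # E[mu]
    w = alpha / (alpha + prior_mean)          # weight placed on the prior mean
    estimate = w * prior_mean + (1.0 - w) * k
    return w, estimate

alpha, beta, k = 2.0, 0.5, 7                  # illustrative values only
w, estimate = eb_estimate(alpha, beta, k)
posterior_mean = (alpha + k) / (beta + 1.0)   # closed-form E[mu | k]

print(w, estimate, posterior_mean)
```

With these values the prior mean is 4 and the observed count is 7, so the estimate lies between the two and is pulled toward the count because $w < 0.5$.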
𝑬[𝝁|𝒌] is an empirical Bayes estimate of 𝝁
Let us revisit the AADT model
What is the variation across urban/rural segments?
• Other variables are also likely to be correlated with Urban (e.g., FUNCLASS3)
Comparison of linear models
Checking for heteroskedasticity
Recap: Heteroskedasticity Consistent Estimator
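$$\text{Est. Asy. Var}[\boldsymbol{\beta} \mid \mathbf{X}] = \frac{1}{n} \left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1} \left(\frac{1}{n}\sum_{i=1}^{n} e_{i}^{2}\, \mathbf{x}_{i}\mathbf{x}_{i}'\right) \left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}$$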
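The estimator translates almost term by term into matrix code. The sketch below fits OLS with NumPy on synthetic heteroskedastic data and assembles the three bracketed factors of the formula; the function name and the simulated data are assumptions for illustration and are unrelated to the AADT data set.

```python
import numpy as np

def ols_with_hc_cov(X, y):
    """OLS coefficients and the heteroskedasticity-consistent covariance estimate."""
    n = X.shape[0]
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta_hat                          # residuals e_i
    bread = np.linalg.inv(X.T @ X / n)            # [(1/n) X'X]^{-1}
    meat = (X * (e**2)[:, None]).T @ X / n        # (1/n) sum e_i^2 x_i x_i'
    cov = bread @ meat @ bread / n
    return beta_hat, cov

# Synthetic example: error spread grows with the regressor.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=200)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=x)

beta_hat, cov = ols_with_hc_cov(X, y)
robust_se = np.sqrt(np.diag(cov))
print(beta_hat, robust_se)
```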
[Figure panels: Before | After]
Considering Poisson and NB alternatives
Comparison of generalized linear models
So which model would you prefer for modelling AADT?
(Model selection)
Training vs test data
• Given the small sample size, we can split the data as 90% training and 10% test data
• We run 100 iterations of train-test splits, and estimate the training and test RMSE and MAD for:
  • Model 1 (preferred linear model out of three)
  • Model 2 (preferred Poisson model out of two)
  • Model 3 (preferred NB model)
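The repeated-split procedure can be sketched as a short loop. The example below uses 100 random 90/10 splits and records training and test RMSE and MAD for each split. It uses OLS on synthetic data purely as a stand-in: the helper names, the `fit`/`predict` interface and the data are assumptions, and the lecture's linear, Poisson and NB models would be plugged in where `fit_ols` and `predict_ols` appear.

```python
import numpy as np

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mad(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

def repeated_split_errors(X, y, fit, predict, n_iter=100, test_frac=0.10, seed=0):
    """Columns of the result: train RMSE, test RMSE, train MAD, test MAD."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_frac * n)))
    rows = []
    for _ in range(n_iter):
        perm = rng.permutation(n)                  # fresh random split each iteration
        test, train = perm[:n_test], perm[n_test:]
        model = fit(X[train], y[train])
        pred_train = predict(model, X[train])
        pred_test = predict(model, X[test])
        rows.append((rmse(y[train], pred_train), rmse(y[test], pred_test),
                     mad(y[train], pred_train), mad(y[test], pred_test)))
    return np.array(rows)

# Stand-in model (assumption): ordinary least squares.
def fit_ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def predict_ols(coef, X):
    return X @ coef

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(60), rng.normal(size=60)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=60)

errors = repeated_split_errors(X, y, fit_ols, predict_ols)
print(errors.mean(axis=0))   # average train/test RMSE and MAD over the splits
```

Comparing the distribution of test errors across models, rather than a single split, reduces the dependence on which few observations happen to land in a small test set.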
Mean Absolute Deviation
[Figure panels: Train | Test]
Root Mean Squared Error
[Figure panels: Train | Test]
Overdispersion tests
• When data are overdispersed, the observed variance is larger than expected from a true
Poisson process → the standard errors reported by a Poisson model are too small (the correct standard errors are larger).
A new goodness-of-fit statistic: deviance
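$$D(\hat{\boldsymbol{\mu}}) = 2\left[LL_{max}(\mathbf{y}) - LL_{fitted}(\hat{\boldsymbol{\mu}})\right]$$
$LL_{max}$: maximum possible value of the log-likelihood (attained at $\hat{\mu}_{i} = y_{i}$)
• For Poisson distribution: $LL_{max} = \sum_{i=1}^{N} \left[y_{i} \log y_{i} - y_{i} - \log y_{i}!\right]$
• For Poisson distribution: $D = 2\sum_{i=1}^{N} \left[y_{i} \log\frac{y_{i}}{\hat{\mu}_{i}} - (y_{i} - \hat{\mu}_{i})\right]$, with $y_{i}\log y_{i}$ taken as 0 when $y_{i} = 0$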
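The Poisson deviance can be computed either from the definition or from the closed form, and the two should agree. The sketch below does both for a handful of made-up counts and fitted means; the helper names and numbers are hypothetical, and zero counts use the convention $0 \cdot \log 0 = 0$.

```python
import numpy as np
from math import lgamma

def _y_log(y, m):
    """y * log(m), with the convention that the term is 0 when y = 0."""
    safe_m = np.where(y > 0, m, 1.0)
    return np.where(y > 0, y * np.log(safe_m), 0.0)

def poisson_loglik(y, mu):
    log_factorial = np.array([lgamma(v + 1.0) for v in y])
    return float(np.sum(_y_log(y, mu) - mu - log_factorial))

def poisson_deviance(y, mu):
    """Closed-form Poisson deviance: 2 * sum[y log(y / mu) - (y - mu)]."""
    return float(2.0 * np.sum(_y_log(y, y / mu) - (y - mu)))

y = np.array([0.0, 1.0, 3.0, 2.0, 5.0])        # illustrative counts
mu_hat = np.array([0.5, 1.2, 2.5, 2.0, 4.0])   # illustrative fitted means

# Definition: twice the gap between the saturated and fitted log-likelihoods.
d_from_definition = 2.0 * (poisson_loglik(y, y) - poisson_loglik(y, mu_hat))
d_closed_form = poisson_deviance(y, mu_hat)

print(d_from_definition, d_closed_form)
```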
Comments
Discussion
Questions
E-mail: [email protected]