Lecture 13-GLMM2
2023
Clusters: Kinds of groups in the data
Features: Aspects of the model (parameters) that vary by cluster
[Figure: response by story]
Cluster        Features
tanks          survival
stories        treatment effect
individuals    average response
departments    admission rate, bias
[Figures: response by participant; response by story]
Add clusters: More index variables, more population priors
Add features: More parameters, more dimensions in each population prior
[Figure: response by participant]
Varying effects as confounds
Varying effect strategy: Unmeasured
features of clusters leave an imprint on the
data that can be measured by (1) repeat
observations of each cluster and (2) partial
pooling among clusters
[Figure: DAG with nodes X (Treatment), R, S, E, Y, G, P (Participation), U (Individual)]
[Figure: DAG with nodes X1, X2, G1, G2 (Government), W (War), N1, N2 (Nation)]
Varying effects as confounds
Causal perspective: Competing causes or actual confounds
Advantage over “fixed effect” approach: Can include other cluster-level (time invariant) causes
Fixed effects: Varying effects with variance fixed at infinity, no pooling
Don’t panic: Make a generative model and draw the owl
[Figure: DAG with nodes X1, X2, G1, G2, W, N1, N2]
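The claim that fixed effects are varying effects with infinite variance can be made concrete in a simplified setting. As a sketch only, assume a Gaussian outcome (not the Bernoulli models used later in this lecture) with known within-cluster standard deviation $s$, a cluster mean $\bar{y}_j$ computed from $n_j$ observations, and $\alpha_j \sim \mathrm{Normal}(\bar{\alpha}, \sigma)$. The symbols $s$, $\bar{y}_j$ and $n_j$ are introduced here for illustration. Conditional on $\bar{\alpha}$, $\sigma$ and $s$, the posterior mean of $\alpha_j$ is a precision-weighted average:

$$
\hat{\alpha}_j = \frac{\dfrac{n_j}{s^2}\,\bar{y}_j + \dfrac{1}{\sigma^2}\,\bar{\alpha}}{\dfrac{n_j}{s^2} + \dfrac{1}{\sigma^2}}
$$

As $\sigma \to \infty$ the weight on $\bar{\alpha}$ vanishes and $\hat{\alpha}_j \to \bar{y}_j$ (no pooling, the fixed-effect estimate). As $\sigma \to 0$, $\hat{\alpha}_j \to \bar{\alpha}$ (complete pooling). Finite $\sigma$ gives partial pooling, with more shrinkage for clusters with small $n_j$.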
Practical Difficulties
Varying effects are a good default, but…
(1) How to use more than one cluster type at the same time?
(2) How to calculate predictions
(3) How to sample chains efficiently
(4) Group-level confounding
[Figures: response by story; response by participant]
Fertility & behavior
1989 Bangladesh Fertility Survey
data(bangladesh)
[Figure: DAG built up in five steps, with nodes C (contraceptive use), A (age), K (kids), U (urbanity), D (district) and, in the last step, F (family)]
1. Causes of interest
2. Competing causes
3. Relationships among causes
4. Unfortunate relationships among causes
5. A series of unfortunate relationships among causes
Varying districts
[Figures: probability of contraceptive use by district; panels: raw proportions, posterior means, sample sizes, no data]
Varying districts
Partial pooling shrinks districts
[Figure: probability of contraceptive use by district, annotated with sample sizes]
Varying districts + urban
What is the effect of urban living?
District features are potential group-level confounds
Total effect of U passes through K
Do not stratify by K!
[Figure: DAG with nodes C (contraceptive use), A (age), K (kids), U (urbanity), D (district)]
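Why stratifying by K hides part of the effect of U can be seen in a small simulation. This is a minimal sketch in base R, not the lecture's model: the outcome is a linear, Gaussian stand-in for contraceptive use, and every effect size and sample size below is a made-up assumption chosen so the arithmetic is easy to follow.

```r
# Sketch: total effect of U versus the effect after stratifying by the mediator K.
# All numbers are hypothetical; C here is a continuous stand-in, not the binary outcome.
set.seed(13)
n <- 20000
U <- rbinom(n, 1, 0.3)                  # urban indicator
K <- rnorm(n, 3 - 1.0 * U, 1)           # kids: urban living lowers K by 1 (assumed)
C <- rnorm(n, 0.5 * U - 0.5 * K, 1)     # U acts directly (0.5) and through K

# Total effect of U: direct path plus the path through K, 0.5 + (-1)(-0.5) = 1
total <- unname(coef(lm(C ~ U))["U"])

# Stratifying by K blocks the path through K and leaves only the direct effect
direct <- unname(coef(lm(C ~ U + K))["U"])

print(c(total = total, direct = direct))
```

The estimate without K recovers roughly the total effect, while adding K returns only the direct part, which is why the total effect of U should be estimated without conditioning on K.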
$$
\begin{aligned}
C_i &\sim \mathrm{Bernoulli}(p_i) \\
\mathrm{logit}(p_i) &= \alpha_{D[i]} + \beta_{D[i]} U_i \\
\alpha_j &\sim \mathrm{Normal}(\bar{\alpha}, \sigma) \\
\beta_j &\sim \mathrm{Normal}(\bar{\beta}, \tau) \\
\bar{\alpha}, \bar{\beta} &\sim \mathrm{Normal}(0, 1) \\
\sigma, \tau &\sim \mathrm{Exponential}(1)
\end{aligned}
$$
[Figure: samples of tau; tau n_eff = 45]
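In the spirit of "make a generative model", the varying intercepts and slopes model can be simulated forward from its priors. The following is a minimal base R sketch, assuming 61 districts as in the lecture's data; the number of simulated women (`n_obs`) and the urban share (0.3) are hypothetical values chosen only for illustration.

```r
# Sketch: simulate synthetic data from the varying intercepts + slopes model.
# n_obs and the urban share are assumptions, not values from the lecture.
set.seed(2023)
n_districts <- 61
n_obs <- 2000

# Hyper-priors
abar  <- rnorm(1, 0, 1)
bbar  <- rnorm(1, 0, 1)
sigma <- rexp(1, 1)
tau   <- rexp(1, 1)

# District-level intercepts (rural) and urban offsets
a <- rnorm(n_districts, abar, sigma)
b <- rnorm(n_districts, bbar, tau)

# Individual-level data
D <- sample(1:n_districts, n_obs, replace = TRUE)  # district index
U <- rbinom(n_obs, 1, 0.3)                         # urban indicator
p <- plogis(a[D] + b[D] * U)                       # inverse logit
C <- rbinom(n_obs, 1, p)                           # contraceptive use

sim <- data.frame(C = C, D = D, U = U)
head(sim)
```

Because the true `a` and `b` are known in a simulation like this, fitting the model to `sim` gives a way to check whether the estimates recover them before turning to the real data.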
PAUSE
More priors, more problems
Priors inside priors: “centered”
$$
\begin{aligned}
C_i &\sim \mathrm{Bernoulli}(p_i) \\
\mathrm{logit}(p_i) &= \alpha_{D[i]} + \beta_{D[i]} U_i \\
\alpha_j &\sim \mathrm{Normal}(\bar{\alpha}, \sigma) \\
\beta_j &\sim \mathrm{Normal}(\bar{\beta}, \tau) \\
\bar{\alpha}, \bar{\beta} &\sim \mathrm{Normal}(0, 1) \\
\sigma, \tau &\sim \mathrm{Exponential}(1)
\end{aligned}
$$
Standardize each intercept, then express it through its z-score:
$$
\begin{aligned}
z_{\alpha,j} &= (\alpha_j - \bar{\alpha}) / \sigma \\
\alpha_j &= \bar{\alpha} + z_{\alpha,j} \times \sigma \\
z_{\alpha,j} &\sim \mathrm{Normal}(0, 1)
\end{aligned}
$$
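Centered varying intercepts
$$
\begin{aligned}
C_i &\sim \mathrm{Bernoulli}(p_i) \\
\mathrm{logit}(p_i) &= \alpha_{D[i]} + \beta_{D[i]} U_i \\
\alpha_j &\sim \mathrm{Normal}(\bar{\alpha}, \sigma) \\
\beta_j &\sim \mathrm{Normal}(\bar{\beta}, \tau) \\
\bar{\alpha}, \bar{\beta} &\sim \mathrm{Normal}(0, 1) \\
\sigma, \tau &\sim \mathrm{Exponential}(1)
\end{aligned}
$$
Non-centered varying intercepts
$$
\begin{aligned}
C_i &\sim \mathrm{Bernoulli}(p_i) \\
\mathrm{logit}(p_i) &= \alpha_{D[i]} + \beta_{D[i]} U_i \\
\alpha_j &= \bar{\alpha} + z_{\alpha,j} \times \sigma \\
\beta_j &= \bar{\beta} + z_{\beta,j} \times \tau \\
z_{\alpha,j} &\sim \mathrm{Normal}(0, 1) \\
z_{\beta,j} &\sim \mathrm{Normal}(0, 1) \\
\bar{\alpha}, \bar{\beta} &\sim \mathrm{Normal}(0, 1) \\
\sigma, \tau &\sim \mathrm{Exponential}(1)
\end{aligned}
$$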
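The two forms define the same distribution for the intercepts; only the parameters the sampler works with differ. A quick base R check of that equivalence, as a sketch with hypothetical values for `abar` and `sigma` held fixed:

```r
# Sketch: centered and non-centered draws of alpha_j have the same distribution.
# abar and sigma are fixed at made-up values for illustration.
set.seed(1)
abar  <- -0.7
sigma <- 0.5
n     <- 1e5

# Centered: draw alpha directly from Normal(abar, sigma)
a_centered <- rnorm(n, abar, sigma)

# Non-centered: draw a z-score, then shift and scale it
z <- rnorm(n, 0, 1)
a_noncentered <- abar + z * sigma

# Means and standard deviations agree up to simulation error
print(c(mean(a_centered), mean(a_noncentered)))
print(c(sd(a_centered), sd(a_noncentered)))
```

In the non-centered form the sampled quantities `z` have a fixed Normal(0,1) prior that does not depend on `sigma`, which is the property the reparameterization relies on for more efficient sampling.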
library(rethinking)
# dat is assumed to be a list holding C, D and U built from data(bangladesh)
mCDUnc <- ulam(
    alist(
        C ~ bernoulli(p),
        logit(p) <- a[D] + b[D]*U,
        # define effects using other parameters
        save> vector[61]:a <<- abar + za*sigma,
        save> vector[61]:b <<- bbar + zb*tau,
        # z-scored effects
        vector[61]:za ~ normal(0,1),
        vector[61]:zb ~ normal(0,1),
        # ye olde hyper-priors
        c(abar,bbar) ~ normal(0,1),
        c(sigma,tau) ~ exponential(1)
    ) , data=dat , chains=4 , cores=4 )
       mean    sd   5.5%  94.5%  n_eff  Rhat4
bbar   0.62  0.16   0.37   0.86   1513   1.00
abar  -0.70  0.09  -0.84  -0.56   1457   1.00
tau    0.55  0.23   0.17   0.92    368   1.01
sigma  0.49  0.09   0.36   0.64    753   1.00
[Figure: samples of tau; tau n_eff = 368]
[Figures: probability of contraceptive use by district, rural and urban panels, shown with and without sample-size annotations]
[Figure: density plot with curves labeled rural, urban, and prior]
Features: Aspects of the model (parameters) that vary by cluster (rural, urban)
There is useful information to transfer across features
[Figure: prob C (urban) against prob C (rural); points are data (districts)]
Course Schedule
Week 1    Bayesian inference                        Chapters 1, 2, 3
Week 2    Linear models & Causal Inference          Chapter 4
Week 3    Causes, Confounds & Colliders             Chapters 5 & 6
Week 4    Overfitting / MCMC                        Chapters 7, 8, 9
Week 5    Generalized Linear Models                 Chapters 10, 11
Week 6    Ordered categories & Multilevel models    Chapters 12 & 13
Week 7    More Multilevel models                    Chapters 13 & 14
Week 8    Multilevel models & Gaussian processes    Chapter 14
Week 9    Measurement & Missingness                 Chapter 15
Week 10   Generalized Linear Madness                Chapter 16
https://github.com/rmcelreath/stat_rethinking_2023