MULTILEVEL ANALYSIS - Snijders 2012
Tom A. B. Snijders
http://www.stats.ox.ac.uk/~snijders/mlbook.htm
Department of Statistics
University of Oxford
2012
Foreword
Literature:
Tom Snijders & Roel Bosker,
Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling,
2nd edition. Sage, 2012.
Chapters 1-2, 4-6, 8, 10, 13, 14, 17.
There is an associated website
http://www.stats.ox.ac.uk/~snijders/mlbook.htm
containing data sets and scripts for various software packages.
These slides are not self-contained; to understand them it is necessary
also to study the corresponding parts of the book!
2
2. Multilevel data and multilevel analysis 7
macro-level micro-level
schools teachers
classes pupils
neighborhoods families
districts voters
firms departments
departments employees
families children
litters animals
doctors patients
interviewers respondents
judges suspects
subjects measurements
respondents = egos alters
4
2. Multilevel data and multilevel analysis 11–12
Multilevel analysis is a suitable approach to take into account the social contexts
as well as the individual respondents or subjects.
The hierarchical linear model is a type of regression analysis for multilevel data
where the dependent variable is at the lowest level.
Explanatory variables can be defined at any level
(including aggregates of level-one variables).
(Diagrams: three macro–micro propositions — a macro-level variable Z affecting a micro-level outcome y; Z and a micro-level variable x both affecting y; and a cross-level interaction in which Z moderates the effect of x on y.)
5
2. Multilevel data and multilevel analysis 7–8
1. Dependence as a nuisance
Standard errors and tests based on OLS regression are suspect
because the assumption of independent residuals is invalid.
2. Dependence as an interesting phenomenon
It is interesting in itself to disentangle variability at the various levels;
moreover, this can give insight into the directions
in which further explanation may fruitfully be sought.
6
4. The random intercept model 42
In the random intercept model, the intercepts β0j are random variables
representing random differences between groups:
Yij = β0j + β1 xij + Rij .
where β0j = average intercept γ00 plus group-dependent deviation U0j :
β0j = γ00 + U0j .
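The two equations above can be made concrete with a small simulation; a minimal sketch, in which the parameter values (γ00 = 41, τ0 = 2, σ = 3) and group sizes are arbitrary assumptions of this illustration, not values from the book:

```python
import random
import statistics

# Illustrative sketch: simulate Y_ij = gamma00 + U_0j + R_ij
# and recover the two variance components from a moment-based decomposition.
# All parameter values here are assumed for the illustration.
random.seed(1)
gamma00, tau0, sigma = 41.0, 2.0, 3.0
N, n = 500, 20                            # N groups of n level-1 units each
groups = []
for j in range(N):
    u0j = random.gauss(0, tau0)           # group-dependent deviation U_0j
    groups.append([gamma00 + u0j + random.gauss(0, sigma) for _ in range(n)])

# sigma^2 is estimated by the average within-group variance;
# tau0^2 by the variance of the group means minus sigma^2 / n.
within = statistics.mean(statistics.variance(g) for g in groups)
var_of_means = statistics.variance(statistics.mean(g) for g in groups)
between = var_of_means - within / n
```

With the values above, `within` should land near σ² = 9 and `between` near τ0² = 4.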
8
4. The random intercept model 45
9
4. The random intercept model 45
(Figure: parallel within-group regression lines for groups 1, 2, and 3 with random intercepts β01, β02, β03; an observation y12 and its residual R12 are indicated.)
10
4. The random intercept model 46–47
Arguments for choosing between fixed (F ) and random (R) coefficient models for
the group dummies:
1. If groups are unique entities and inference should focus on these groups: F .
This often is the case with a small number of groups.
2. If groups are regarded as sample from a (perhaps hypothetical) population and
inference should focus on this population, then R .
This often is the case with a large number of groups.
3. If level-two effects are to be tested, then R .
4. If group sizes are small and there are many groups, and it is reasonable to
assume exchangeability of group-level residuals, then R makes better use of the
data.
5. If the researcher is interested only in within-group effects, and is suspicious
about the model for between-group differences, then F is more robust.
6. If group effects U0j (etc.) are not nearly normally distributed, R is risky
(or use more complicated multilevel models).
11
4. The random intercept model 49; also see 17–18
12
4. The random intercept model 50
Deviance 26595.3
13
4. The random intercept model 50–51
Intraclass correlation

ρI = 18.12 / (18.12 + 62.85) = 0.22

Total population of individual values Yij has estimated mean 41.00 and standard deviation √(18.12 + 62.85) = 9.00.

Population of class means β0j has estimated mean 41.00 and standard deviation √18.12 = 4.3.
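These numbers can be checked directly; a trivial sketch, where the estimates 18.12 and 62.85 are the ones reported on this slide:

```python
import math

# tau0^2 and sigma^2 estimates as reported on the slide
tau2, sigma2 = 18.12, 62.85
rho_I = tau2 / (tau2 + sigma2)       # intraclass correlation
sd_total = math.sqrt(tau2 + sigma2)  # sd of individual values Y_ij
sd_means = math.sqrt(tau2)           # sd of the population of class means
```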
The model becomes more interesting,
when also fixed effects of explanatory variables are included:
Yij = γ00 + γ10 xij + U0j + Rij .
14
4. The random intercept model 52–53
Table 4.2 Estimates for random intercept model with effect for IQ
Deviance 24912.2
Deviance 25351.0
16
4. The random intercept model 54–55
Figure 4.2 Fifteen randomly chosen regression lines according to the random intercept model of
Table 4.2.
17
4. The random intercept model 54–59
Yij = γ00 + γ10 x1ij + . . . + γp0 xpij + γ01 z1j + . . . + γ0q zqj
+ U0j + Rij .
Especially important:
difference between within-group and between-group regressions.
The within-group regression coefficient is the regression coefficient within each
group, assumed to be the same across the groups.
The between-group regression coefficient is defined as the regression coefficient for
the regression of the group means of Y on the group means of X.
This distinction is essential to avoid ecological fallacies (p. 15–17 in the book).
18
4. The random intercept model 54–59
(Figure: within-group regression lines for groups 1, 2, and 3, together with the steeper between-group regression line through the group means.)
This is obtained by having separate fixed effects for the level-1 variable X
and its group mean X̄.
(Alternative:
use the within-group deviation variable X̃ij = Xij − X̄.j instead of Xij.)
19
4. The random intercept model 54–59
Deviance 24888.0
20
4. The random intercept model 53–54
In the model with separate effects for the original variable xij and the group mean x̄.j,
Yij = γ00 + γ10 xij + γ01 x̄.j + U0j + Rij ,
the within-group regression coefficient is γ10,
the between-group regression coefficient is γ10 + γ01.
This is convenient because the difference between within-group and between-group
coefficients can be tested by considering γ01.
In the model with separate effects for the group-centered variable x̃ij
and the group mean,
Yij = γ̃00 + γ̃10 x̃ij + γ̃01 x̄.j + U0j + Rij ,
the within-group regression coefficient is γ̃10,
the between-group regression coefficient is γ̃01.
This is convenient because these coefficients are given immediately in the results,
with their standard errors.
Both models are equivalent, and have the same fit: γ̃10 = γ10, γ̃01 = γ10 + γ01.
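The equivalence can be checked numerically. The sketch below uses illustrative simulated data and plain OLS (not a true multilevel fit) to verify γ̃10 = γ10 and γ̃01 = γ10 + γ01; the identity holds because the two design matrices span the same column space.

```python
import random

def ols(X, y):
    """Tiny OLS via normal equations and Gaussian elimination (illustration only)."""
    k, m = len(X[0]), len(X)
    A = [[sum(X[i][p] * X[i][q] for i in range(m)) for q in range(k)] for p in range(k)]
    b = [sum(X[i][p] * y[i] for i in range(m)) for p in range(k)]
    for c in range(k):                          # forward elimination with pivoting
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for q in range(c, k):
                A[r][q] -= f * A[c][q]
            b[r] -= f * b[c]
    beta = [0.0] * k
    for c in reversed(range(k)):                # back substitution
        beta[c] = (b[c] - sum(A[c][q] * beta[q] for q in range(c + 1, k))) / A[c][c]
    return beta

random.seed(2)
rows, y = [], []
for _ in range(60):                             # 60 groups of 10 (arbitrary choices)
    xbar = random.gauss(0, 1)                   # group-level component of X
    u0 = random.gauss(0, 1)                     # group effect
    for _ in range(10):
        x = xbar + random.gauss(0, 1)
        rows.append((x, xbar))
        y.append(2.0 * x + 1.0 * xbar + u0 + random.gauss(0, 2))

X_raw = [[1.0, x, xb] for x, xb in rows]        # intercept, x_ij, group mean
X_cen = [[1.0, x - xb, xb] for x, xb in rows]   # intercept, deviation, group mean
g = ols(X_raw, y)    # (g00, g10, g01)
gt = ols(X_cen, y)   # (g00~, g10~, g01~)
```

Whatever the data, the fitted coefficients satisfy g̃10 = g10 and g̃01 = g10 + g01 up to floating-point rounding.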
21
4. The random intercept model 62–63
22
4. The random intercept model 62–63
23
4. The random intercept model 62–63
These ‘estimates’ are not unbiased for each specific group, but they are more
precise when the mean squared errors are averaged over all groups.
For models with explanatory variables, the same principle can be applied:
the values that would be obtained as OLS estimates per group are
“shrunk towards the mean”.
The empirical Bayes estimates, also called posterior means,
are also called shrinkage estimators.
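For the empty model the shrinkage has a closed form: the posterior mean pulls the group's deviation toward 0 by the factor τ0² / (τ0² + σ²/nj). A minimal sketch, reusing the variance estimates reported earlier (18.12 and 62.85) purely as an illustration:

```python
def eb_residual(ybar_j, n_j, gamma00=41.00, tau2=18.12, sigma2=62.85):
    """Empirical Bayes (posterior mean) estimate of U_0j in the empty model.

    Default values reuse the estimates reported on the earlier slides;
    they are illustrative, not a general recommendation.
    """
    lam = tau2 / (tau2 + sigma2 / n_j)   # shrinkage factor, between 0 and 1
    return lam * (ybar_j - gamma00)

# A small group is shrunk more strongly toward the overall mean than a large one:
small = eb_residual(ybar_j=45.0, n_j=5)
large = eb_residual(ybar_j=45.0, n_j=30)
```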
24
4. The random intercept model 64–66
There are two kinds of standard errors for empirical Bayes estimates:
comparative standard errors
S.E.comp(Û_hj^EB) = S.E.(Û_hj^EB − Uhj)
for comparing the random effects of different level-2 units
(use with caution – E.B. estimates are not unbiased!);
and diagnostic standard errors
S.E.diag(Û_hj^EB) = S.E.(Û_hj^EB)
used for model checking (e.g., checking normality of the level-two residuals).
25
4. The random intercept model 67
The ordered added value scores for 211 schools with comparative posterior confidence intervals.
In this figure, the error bars extend 1.39 times the comparative standard errors
to either side, so that schools may be deemed to be significantly different
if the intervals do not overlap (no correction for multiple testing!).
26
5. The hierarchical linear model 74–75
Substitution leads to
Yij = γ00 + γ10 xij + U0j + U1j xij + Rij .
Variable X now has a random slope.
27
5. The hierarchical linear model 74–75
Thus we have a linear model for the mean structure, and a parametrized
covariance matrix within groups with independence between groups.
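The implied within-group covariance matrix has simple closed-form entries; the helpers below write them out (an illustrative sketch, with made-up parameter values in the usage check):

```python
def var_y(x, tau00, tau01, tau11, sigma2):
    """Variance of Y_ij under the random slope model:
    var(U_0j + U_1j x + R_ij) = tau00 + 2*tau01*x + tau11*x^2 + sigma2."""
    return tau00 + 2 * tau01 * x + tau11 * x * x + sigma2

def cov_y(x1, x2, tau00, tau01, tau11):
    """Covariance of two different observations i, i' in the same group:
    tau00 + tau01*(x1 + x2) + tau11*x1*x2 (level-1 residuals independent)."""
    return tau00 + tau01 * (x1 + x2) + tau11 * x1 * x2
```

The variance is a parabola in x with its minimum at x = −τ01/τ11, which is exactly the fanning-in of the lines visible in Figure 5.2.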
28
5. The hierarchical linear model 78
29
5. The hierarchical linear model 78
Figure 5.2 Fifteen random regression lines according to the model of Table 5.1.
Note the heteroscedasticity: variance is larger for low X than for high X.
The lines fan in towards the right.
Intercept variance and intercept-slope covariance depend on the position of the
X = 0 value, because the intercept is defined by the X = 0 axis.
30
5. The hierarchical linear model 80
31
5. The hierarchical linear model 82
Deviance 24856.8
32
5. The hierarchical linear model 83–84
For two variables (IQ and SES) and two levels (student and school),
the main effects and interactions give rise to many possible combinations:
Table 5.3 Estimates for model with random slopes and many effects
Deviance 24624.0
34
5. The hierarchical linear model 85–86
Table 5.4 Estimates for a more parsimonious model with a random slope and many effects
Deviance 24626.8
35
Estimation for the hierarchical linear model
36
Estimation for the hierarchical linear model
37
6. Testing 94–98
6. Testing
To test fixed effects, use the t-test with test statistic
T(γh) = γ̂h / S.E.(γ̂h).
(Or the Wald test for testing several parameters simultaneously.)
The standard error should be based on REML estimation.
Degrees of freedom for the t-test, or the denominator of the F -test:
For a level-1 variable: M − r − 1,
where M = total number of level-1 units, r = number of tested level-1 variables.
For a level-2 variable: N − q − 1,
where N = number of level-2 units, q = number of tested level-2 variables.
For a cross-level interaction: again N − q − 1,
where now q = number of other level-2 variables interacting with this level-1
variable.
If d.f. ≥ 40, the t-distribution can be replaced by a standard normal.
38
6. Testing 94–98
39
6. Testing 94–98
                                      Model 1            Model 2
Fixed Effects                         Coefficient  S.E.  Coefficient  S.E.
γ00 = Intercept                       41.15        0.23  41.15        0.23
γ10 = Coeff. of IQ                    2.265        0.065
γ20 = Coeff. of group-centered IQ                        2.265        0.065
γ30 = Coeff. of SES                   0.161        0.011 0.161        0.011
γ01 = Coeff. of mean IQ               0.647        0.264 2.912        0.262

Test for equality of within- and between-group regressions
is the t-test for mean IQ in Model 1:
t = 0.647/0.264 = 2.45, p < 0.02.
40
6. Testing 98–99
41
6. Testing 98–99
Critical values of the mixture distribution ½χ²p + ½χ²p+1:

p \ α  0.10  0.05  0.01  0.001
1 3.81 5.14 8.27 12.81
2 5.53 7.05 10.50 15.36
3 7.09 8.76 12.48 17.61
42
6. Testing 98–99
For example: testing for a random slope in a model that further contains the
random intercept but no other random slopes: p = 1;
testing the second random slope: p = 2;
testing the third random slope: p = 3 – etc.
To test the random slope in the model of Table 5.1,
compare with Table 4.4 which is the same model but without the random slope;
deviance difference 15,227.5 – 15,213.5 = 14.0.
In the table with p = 1 this yields p < 0.001.
Further see p. 99.
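For p = 1 the reference distribution is the equal mixture of χ²₁ and χ²₂, and both survival functions have elementary closed forms, so the table and the example can be reproduced (a sketch, p = 1 case only):

```python
import math

def chibar_p_value(dev_diff):
    """P-value of the deviance difference when testing the first random
    slope: equal mixture of chi-square with 1 and 2 df (p = 1 only)."""
    sf1 = math.erfc(math.sqrt(dev_diff / 2.0))   # survival function of chi^2_1
    sf2 = math.exp(-dev_diff / 2.0)              # survival function of chi^2_2
    return 0.5 * (sf1 + sf2)
```

`chibar_p_value(5.14)` ≈ 0.05, matching the p = 1 row of the table, and the deviance difference 14.0 indeed gives p < 0.001.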
43
7. Explained variance 109–110
7. Explained variance
The individual variance parameters may go up when effects are added to the model.
                                                  σ̂²      τ̂0²
I. BALANCED DESIGN
A. Yij = β0 + U0j + Eij                          8.694    2.271
B. Yij = β0 + β1 X̄.j + U0j + Eij                 8.694    0.819
C. Yij = β0 + β2 (Xij − X̄.j) + U0j + Eij         6.973    2.443
44
7. Explained variance 112–113
The best way to define R², the proportion of variance explained, is the
proportional reduction in total variance;
for the random intercept model the total variance is (σ² + τ0²).

                                                  σ̂²      τ̂0²
A. Yij = β0 + U0j + Eij                          8.694    2.271
D. Yij = β0 + β1 (Xij − X̄.j) + β2 X̄.j + U0j + Eij   6.973    0.991
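With these estimates, R² is a one-liner (values taken from models A and D on this slide):

```python
# proportional reduction in the total variance sigma^2 + tau0^2
total_A = 8.694 + 2.271   # empty model A
total_D = 6.973 + 0.991   # model D with within- and between-group effects of X
R2 = 1 - total_D / total_A
```

R² ≈ 0.27: about 27% of the total variance is explained by the two effects of X.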
45
8. Heteroscedasticity 119-120
8. Heteroscedasticity
The multilevel model makes it possible to formulate heteroscedastic models
where the residual variance depends on observed variables.
E.g., random part at level one = R0ij + R1ij x1ij.
Then the level-1 variance is a quadratic function of X1:
var(R0ij + R1ij x1ij) = σ0² + 2 σ01 x1ij + σ1² x1ij².
46
8. Heteroscedasticity 121
Model 1 Model 2
Fixed Effect Coefficient S.E. Coefficient S.E.
Intercept 40.426 0.265 40.435 0.266
IQ 2.249 0.062 2.245 0.062
SES 0.171 0.011 0.171 0.011
IQ × SES –0.020 0.005 –0.019 0.005
Gender 2.407 0.201 2.404 0.201
IQ 0.769 0.293 0.749 0.292
SES –0.093 0.042 –0.091 0.042
IQ × SES –0.105 0.033 –0.107 0.033
47
8. Heteroscedasticity 121
48
8. Heteroscedasticity 122
49
8. Heteroscedasticity 122–123
(Figure: curve plotted as a function of IQ, for IQ between −4 and 4.)
10. Assumptions of the hierarchical linear model 152–153
Yij = γ0 + Σ_{h=1..r} γh xhij + U0j + Σ_{h=1..p} Uhj xhij + Rij.
Questions:
52
10. Assumptions of the hierarchical linear model 154–156; also 56–59
53
Within- and between-group regressions 154–156; also 56–59
What kind of bias can occur if this assumption is made but does not hold?
For a misspecified model,
suppose that we are considering a random intercept model:
Zj = 1j,
where the expected value of Uj is not 0 but
E(Uj) = z2j γ∗
for 1 × r vectors z2j and an unknown regression coefficient γ∗. Then
Uj = z2j γ∗ + Ũj
with
E(Ũj) = 0.
54
Within- and between-group regressions 154–156; also 56–59
Write Xj = X̄j + X̃j, where X̄j = 1j (1j′ 1j)⁻¹ 1j′ Xj are the group means.
Then the data generating mechanism is
Yj = X̄j γ + X̃j γ + 1j z2j γ∗ + 1j Ũj + Rj,
where E(Ũj) = 0.
There will be a bias in the estimation of γ
if the matrices Xj = X̄j + X̃j and 1j Ũj are not orthogonal.
By construction, X̃j and 1j Ũj are orthogonal, so the difficulty is with X̄j .
55
Within- and between-group regressions 155-161
4. Check heteroscedasticity.
See Chapter 8.
56
Residuals 161–165
Level-one residuals
OLS within-group residuals can be written as
R̂j = (Inj − Pj) Yj
57
Residuals 161–165
58
Residuals 164
(Two figures: mean level-one OLS residuals as a function of IQ and as a function of SES; bars ∼ twice the standard error of the mean.)
59
Residuals 164
(Two figures: mean level-one OLS residuals as a function of IQ and as a function of SES.)
60
Residuals 165
(Figure: normal probability plot of level-one residuals, observed versus expected values.)
Level-two residuals
Empirical Bayes (EB) level-two residuals are defined as conditional means
Ûj = E{Uj | Y1, . . . , YN}   (using parameter estimates γ̂, θ̂, ξ̂)
   = Ω̂ Zj′ V̂j⁻¹ (Yj − Xj γ̂) = Ω̂ Zj′ V̂j⁻¹ (Zj Uj + Rj − Xj (γ̂ − γ)),
where
Vj = Cov(Yj) = Zj Ω Zj′ + Σj,   V̂j = Zj Ω̂ Zj′ + Σ̂j,
with Ω̂ = Ω(ξ̂) and Σ̂j = Σj(θ̂).
You don’t need to worry about the formulae.
62
Residuals 165–167 and 62–67
Note that
Cov (Uj ) = Cov (Uj − Ûj ) + Cov (Ûj ) .
63
Residuals 165–167 and 62–67
However,
Ûj′ {Ĉov(Ûj)}⁻¹ Ûj ≈ Ûj(OLS)′ { σ̂² (Zj′ Zj)⁻¹ + Ω̂ }⁻¹ Ûj(OLS),
where
Ûj(OLS) = (Zj′ Zj)⁻¹ Zj′ (Yj − Xj γ̂)
is the OLS estimate of Uj, estimated from the level-1 residuals Yj − Xj γ̂.
This shows that standardization by diagnostic variances
takes away the difference between OLS and EB residuals.
Therefore, in checking standardized level-two residuals,
the distinction between OLS and EB residuals loses its meaning.
Test the fixed part of the level-2 model using non-standardized EB residuals.
64
Residuals 166
(Two figures: level-two intercept residuals U0j plotted against mean IQ and against mean SES.)
65
Residuals 166
(Two figures: level-two slope residuals U1j plotted against mean IQ and against mean SES.)
Residuals 169–170
Multivariate residuals
The multivariate residual is defined, for level-two unit j, as
Yj − Xj γ̂.
If all variables with fixed effects also have random effects, then
Mj² = (nj − tj) sj² + Ûj′ {Ĉov(Ûj)}⁻¹ Ûj,
where
sj² = R̂j′ R̂j / (nj − tj),   tj = rank(Xj).
This indicates how well the model fits to group j.
Note the confounding with level-1 residuals.
If an ill-fitting group does not have a strong effect on the parameter estimates,
then it is not so serious.
67
Residuals 169–170
Deletion residuals
The deletion standardized multivariate residual can be used to assess the fit of
group j, but takes out the effect of this group on the parameter estimates:
M(-j)² = (Yj − Xj γ̂(-j))′ V̂(-j)⁻¹ (Yj − Xj γ̂(-j)),
where
V̂(-j) = Zj Ω̂(-j) Zj′ + Σ̂(-j),
(-j) meaning that group j is deleted from the data for estimating this parameter.
69
Residuals 169–170
School nj Cj pj School nj Cj pj
182 9 0.053 0.293 117 27 0.014 0.987
107 17 0.032 0.014 153 22 0.013 0.845
229 9 0.028 0.115 187 26 0.013 0.022
14 21 0.027 0.272 230 21 0.012 0.363
218 24 0.026 0.774 15 8 0.012 0.00018
52 21 0.025 0.024 256 10 0.012 0.299
213 19 0.025 0.194 122 23 0.012 0.005
170 27 0.021 0.194 50 24 0.011 0.313
67 26 0.017 0.139 101 23 0.011 0.082
18 24 0.016 0.003 214 21 0.011 0.546
70
Residuals 169–170
School nj Cj pj School nj Cj pj
213 19 0.094 0.010 18 24 0.015 0.003
182 9 0.049 0.352 230 21 0.015 0.391
107 17 0.041 0.006 169 30 0.014 0.390
187 26 0.035 0.009 170 27 0.013 0.289
52 21 0.028 0.028 144 16 0.013 0.046
218 24 0.025 0.523 117 27 0.013 0.988
14 21 0.024 0.147 40 25 0.012 0.040
229 9 0.016 0.175 153 22 0.012 0.788
67 26 0.016 0.141 15 8 0.011 0.00049
122 23 0.016 0.004 202 14 0.010 0.511
71
Residuals 169–170
School 15 now does survive the Bonferroni correction: 211 × 0.00049 = 0.103.
Therefore now add the heteroscedasticity of Model 4 in Chapter 8.
Another school (108) does have poor fit p = 0.00008, but small influence
(Cj = 0.008).
Leaving out ill-fitting schools does not lead to appreciable differences in results.
The book gives further details.
72
11. Designing Multilevel Studies 176–179
73
11. Designing Multilevel Studies 176–179
74
11. Designing Multilevel Studies 179–180
var(μ̂) = τ²/N + σ²/(Nn).

The sample mean of a simple random sample of Nn elements from this
population has variance
(τ² + σ²)/(Nn).
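Comparing the two formulas gives the familiar design effect 1 + (n − 1)ρI; a small sketch, with arbitrary illustrative parameter values:

```python
def var_twostage_mean(tau2, sigma2, N, n):
    """Variance of the mean of a two-stage sample: tau^2/N + sigma^2/(N n)."""
    return tau2 / N + sigma2 / (N * n)

def var_srs_mean(tau2, sigma2, N, n):
    """Variance of the mean of a simple random sample of N n elements."""
    return (tau2 + sigma2) / (N * n)

# The ratio of the two variances is the design effect 1 + (n - 1) * rho_I.
tau2, sigma2, N, n = 1.0, 4.0, 10, 5          # illustrative values, rho_I = 0.2
deff = var_twostage_mean(tau2, sigma2, N, n) / var_srs_mean(tau2, sigma2, N, n)
```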
75
11. Designing Multilevel Studies 179–180
76
11. Designing Multilevel Studies —
77
11. Designing Multilevel Studies –
78
11. Designing Multilevel Studies –
Now
β̂1 = (1/(Nn s²X)) Σj Σi xij Yij    (2)
   = β1 + (1/(Nn s²X)) Σj Σi xij Eij
with variance
var(β̂1) = σ²/(Nn s²X).

For the disaggregated analysis,
var(β̂1, disaggregated) = (σ² + τ²)/(Nn s²X).
79
11. Designing Multilevel Studies –
80
11. Designing Multilevel Studies –
81
11. Designing Multilevel Studies –
82
11. Designing Multilevel Studies –
83
11. Designing Multilevel Studies –
84
11. Designing Multilevel Studies –
86
11. Designing Multilevel Studies –
σ̃ 2 = (1 − ρ2W ) σ 2 ,
τ̃ 2 = (1 − ρ2B ) τ 2 .
88
11. Designing Multilevel Studies 180–186
Example:
sample sizes for therapy effectiveness study.
Outcome variable Y , unit variance, depends on
X1 (0–1) course for therapists: study variable,
X2 therapists’ professional training,
X3 pretest.
Suppose pre-thinking leads to the following guesstimates:
Means: µ1 = 0.4, µ2 = µ3 = 0.
Variances between groups:
var(X1) = 0.24 (because µ1 = 0.4)
var(X2) = var(X3) = 1 (standardization)
ρI (X3) = 0.19 (prior knowledge)
ρ(X1, X2) = −0.4 (conditional randomization)
ρ(X1, X̄3 | X2) = 0 (randomization) ⇒ ρ(X1, X̄3) = 0.2
ρ(X2, X̄3) = 0.5 (prior knowledge).
89
11. Designing Multilevel Studies 180–186
This yields σ²X3(W) = 1 − 0.19 = 0.81 and

          ⎛  0.24  −0.20  0.04 ⎞
ΣX(B) =   ⎜ −0.20   1.0   0.22 ⎟ .
          ⎝  0.04   0.22  0.19 ⎠
90
11. Designing Multilevel Studies 180–186
var(Yij) = β1² σ²X(W) + β′ ΣX(B) β + τ0² + σ²
91
11. Designing Multilevel Studies 180–186
Figure 1 Standard errors for estimating β1 ,
for 20N + N n ≤ 1,000;
∗ for σ 2 = 0.6, τ02 = 0.1;
◦ for σ 2 = 0.5, τ02 = 0.2.
92
11. Designing Multilevel Studies 180–186
93
11. Designing Multilevel Studies 188–190
S.E.(ρ̂I) = (1 − ρI) (1 + (n − 1) ρI) √( 2 / (n (n − 1) (N − 1)) ).
Budget constraint:
substitute N = k/(c1 + c2n) and plot as a function of n for various ρI .
E.g., suppose 20 N + N n ≤1000
and 0.1 ≤ ρI ≤ 0.2.
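A sketch of that exercise: the formula is the approximation above, and the budget constants c1 = 20, c2 = 1, k = 1000 match the example constraint.

```python
import math

def se_rho(rho, N, n):
    """Approximate standard error of the estimated intraclass correlation."""
    return (1 - rho) * (1 + (n - 1) * rho) * math.sqrt(2.0 / (n * (n - 1) * (N - 1)))

def se_rho_budget(rho, n, k=1000.0, c1=20.0, c2=1.0):
    """Substitute the budget constraint N = k / (c1 + c2 n), then evaluate."""
    N = k / (c1 + c2 * n)
    return se_rho(rho, N, n)

# Plotting se_rho_budget over n reproduces the shape of Figure 2:
# the standard error first falls steeply as the group size n grows.
```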
94
11. Designing Multilevel Studies 188–190
Figure 2 Standard error for estimating
intraclass correlation coefficient for
budget constraint 20N + N n ≤ 1000
with ρI = 0.1 and 0.2.
Variance parameters
Approximate formulae for random intercept model
(Longford, 1993)
S.E.(σ̂²) ≈ σ² √( 2 / (N (n − 1)) )
and
S.E.(τ̂0²) ≈ σ² √( (2/(Nn)) ( 1/(n − 1) + 2 τ0²/σ² + n τ0⁴/σ⁴ ) ).

Standard errors for estimated standard deviations:
S.E.(σ̂) ≈ S.E.(σ̂²) / (2 σ),
and similarly for S.E.(τ̂0).
Same procedure:
substitute N = k/(c1 + c2n) and plot as function of n.
Example in Cohen (J. Off. Stat., 1998).
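The approximations translate directly into code; a sketch (the function names and the example parameter values are mine):

```python
import math

def se_sigma2(sigma2, N, n):
    # S.E.(sigma^2-hat) ~= sigma^2 sqrt(2 / (N (n - 1)))
    return sigma2 * math.sqrt(2 / (N * (n - 1)))

def se_tau02(sigma2, tau02, N, n):
    # S.E.(tau0^2-hat) ~= sigma^2 sqrt((2/(Nn)) (1/(n-1) + 2 tau0^2/sigma^2 + n tau0^4/sigma^4))
    r = tau02 / sigma2
    return sigma2 * math.sqrt((2 / (N * n)) * (1 / (n - 1) + 2 * r + n * r * r))

def se_sd(se_variance, sd):
    # Delta method: S.E.(sigma-hat) ~= S.E.(sigma^2-hat) / (2 sigma)
    return se_variance / (2 * sd)

# Example: sigma^2 = 0.5, tau0^2 = 0.2, with N = 40 groups of size n = 10.
print(round(se_sigma2(0.5, 40, 10), 4), round(se_tau02(0.5, 0.2, 40, 10), 4))
```

Substituting N = k/(c1 + c2 n) into these functions and looping over n gives the design plots, exactly as for S.E.(ρ̂I) above.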
96
12.2 Sandwich Estimator for Standard Errors 197–201
1. Can the research question be answered with the incompletely specified model?
(Cf. the nuisance-interest contrast.)
Are only the fixed parameters of interest?
2. Is the model specification of the random part adequate?
3. Are sample sizes etc. sufficient to make the sandwich estimator a reliable
estimator of variance?
4. Is the loss of efficiency acceptable?
For the latter two questions, the degree of imbalance between the clusters will
be an issue.
98
13. Imperfect Hierarchies 205–206
1. Cross-classified
2. Multiple membership
3. Multiple membership & multiple classification
99
13.1 Cross-classified models 206
Cross-classification
Individuals are simultaneously members of several social units
(such as neighborhood and schools)
This leads to crossed factors (random effects)
Individuals uniquely belong to a combination of both factors
100
13.1 Cross-classified models 207
Adapted notation
Level-one units (say, pupils), indicated by i, are simultaneously nested in two crossed level-two classifications, indicated by j and k.
101
13.1 Cross-classified models 207, 209
Intra-class correlations:
1. Units i within the same primary school k, but different secondary schools j:

   τ_W² / (τ_W² + τ0² + σ²)
Models:
103
13.1 Cross-classified models 208
Model 1 Model 2
104
13.1 Cross-classified models 209
2. Correlation between grades of pupils who attended the same secondary school
but came from different primary schools:

   τ0² / (τ_W² + τ0² + σ²) = 0.066 / 0.467 = 0.141;

3. Correlation between grades of pupils who attended both the same primary and
the same secondary school:

   (τ_W² + τ0²) / (τ_W² + τ0² + σ²) = 0.072 / 0.467 = 0.154.
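The correlations above follow directly from the Model 2 variance components; a quick numerical check:

```python
# Variance components from Model 2 of the cross-classified example.
tau_W2 = 0.006   # primary school, var(W_0k)
tau_02 = 0.066   # secondary school, var(U_0j)
sigma2 = 0.395   # residual, var(R_ij)

total = tau_W2 + tau_02 + sigma2                 # total variance, 0.467
rho_same_secondary = tau_02 / total              # same secondary, different primary
rho_same_both = (tau_W2 + tau_02) / total        # same primary and secondary
print(round(rho_same_secondary, 3), round(rho_same_both, 3))
```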
105
13.1 Cross-classified models 209–210
                                     Model 2        Model 3
                                   Par.   S.E.   Par.   S.E.
τ_W² = var(W0k) primary school     0.006  0.005  0.003  0.003
τ0²  = var(U0j) secondary school   0.066  0.014  0.034  0.008
σ²   = var(Rij)                    0.395  0.010  0.330  0.008
106
13.3 Multiple membership models 210–211
Multiple memberships
Individuals are/have been members of several social units
(such as several different secondary schools)
Only one random factor at level two
(as opposed to two or more factors in cross-classified models)
Individuals belong to multiple levels of that factor
107
13.3 Multiple membership models 211–212
Membership weights
We weight membership by importance
(such as duration of stay in particular school)
Weights w_ih for each pupil i in school h, with Σ_{h=1}^{N} w_ih = 1.
Example:
1. Pupil 1: only in school 1
2. Pupil 2: equal time in schools 1 and 3
3. Pupil 3: equal time in schools 1, 2, 3
School
Pupil 1 2 3
1 1 0 0
2 0.5 0 0.5
3 0.33 0.33 0.33
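The weight table can be represented as a membership matrix whose rows sum to 1. A small sketch; the school effect values u_h are invented, just to show how a pupil's level-two term is formed:

```python
import numpy as np

# Rows = pupils, columns = schools; each row sums to 1.
W = np.array([
    [1.0, 0.0, 0.0],     # pupil 1: only school 1
    [0.5, 0.0, 0.5],     # pupil 2: equal time in schools 1 and 3
    [1/3, 1/3, 1/3],     # pupil 3: equal time in schools 1, 2, 3
])
assert np.allclose(W.sum(axis=1), 1.0)

# Hypothetical school effects u_h: a pupil's level-two term in the
# multiple membership model is the weighted sum over schools, sum_h w_ih u_h.
u = np.array([0.2, -0.1, 0.4])
contrib = W @ u
print(contrib)
```

Pupil 2, who split time between schools 1 and 3, gets the average of those two schools' effects.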
108
13.3 Multiple membership models 211-212
Remaining 6%
• 215 attend two schools (wi vary between 0.2 and 0.8)
• 5 attend three (wi vary between 0.2 and 0.25)
Models:
110
13.3 Multiple membership models 213
111
13.3 Multiple membership models 213
112
13.4 Multiple membership multiple classification models 213-214
Figure 13.4 Example of pupils nested within multiple schools crossed with
neighborhoods
113
13.4 Multiple membership multiple classification models 214
Model 7 Model 8
114
14. Survey weights 216
115
14. Survey weights 217
1. Descriptive
e.g., estimate mean or proportion in a specific population.
2. Analytic
how do variables of interest depend on other variables?
Here often there is a wish to generalize to a larger population.
116
14. Survey weights 217
Connected to this, but a distinct issue is the fact that the use of a probability
model can be founded on different bases:
1. Design-based
This uses the probability distribution that is implied by the sampling design.
This usually is for a sample from a finite population.
The probabilities are under control of the researcher,
except for non-response and other sampling errors.
2. Model-based
The researcher assumes that data can be regarded
as outcomes of a probability model with some unknown parameters.
This usually assumes a hypothetical infinite population,
e.g., with normal distributions.
Assumptions are made about independence etc.
117
14. Survey weights 218–219
118
14. Survey weights 219–222
1. Sampling weights, used in complex surveys, which are the inverses of the
probabilities of including an element in the sample;
i.e., population elements that are undersampled
get a higher weight to correct for the undersampling.
Purpose: unbiased estimation, correct for non-uniform sampling.
2. Precision weights, expressing that some data points are associated with more
precision than others, and therefore get a stronger effect on the results.
Example: multiplicity weights (representing number of cases).
Purpose: lower estimation variance, i.e., increased precision.
119
14. Survey weights 219–222
These kinds of weight differ in their implications for the standard errors.
A survey sample with a few cases with very high weights can lead to high standard
errors, because the results are dominated by these high-weight cases.
The effective sample size, for a sample with weights wi, is

n_eff = ( Σi wi )² / Σi wi²

(applicable for estimating a mean, not for regression).
This is equal to the number of sample elements if wi is constant,
and else it is smaller: loss of efficiency.
The design effect is the ratio of effective sample size
to number of sample elements.
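A small sketch of how a few high-weight cases shrink the effective sample size (the weights here are made up):

```python
def n_eff(weights):
    # Effective sample size: (sum w_i)^2 / sum w_i^2.
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 * s1 / s2

w_const = [2.0] * 100                 # constant weights: no loss
w_skew = [1.0] * 99 + [50.0]          # one case with a very high weight
print(n_eff(w_const))                 # 100.0
print(round(n_eff(w_skew), 1))        # far below 100
```

A single dominating weight reduces n_eff from 100 to under 10: the results are effectively determined by that one case.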
120
14. Survey weights 222
121
14. Survey weights 222
Suppose that the inclusion probability of schools is associated with the ethnic
composition, but no data about this are available.
Then the analysis effectively uses a SES variable that also represents the
unobserved ethnic composition;
if the analysis is unweighted and sampling is not uniform,
the regression coefficient of SES will average over ethnic composition
in proportions according to the sample, differing from those in the population.
122
14. Survey weights 223
Thus, the design is used to try and make the model more interesting and
appropriate; a design-based analysis is followed only if the model-based analysis
seems unreliable.
123
14. Survey weights 223–230
3. Apply the hierarchical linear model separately to different parts of the survey.
5. Carry out model-based and design-based elements within each level-2 unit,
and compare the results.
For a two-level design, this leads to single-level within-group analyses.
The differences between the two kinds of analyses can be inspected and tested.
Again, this is done with the hope of being able to choose for a model-based
approach; if this is not justified, then inspection of the differences can give insight
into how the design is associated with the variables of interest.
125
14. Survey weights 223–230
Equation (4.17) gives Asparouhov’s (2006) measure for the informativeness of the
sample design, based on such a comparison:
I2 = ( γ̂h^HLM − γ̂h^W ) / S.E.( γ̂h^HLM ),
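As a sketch (the coefficient values below are invented), the measure compares the model-based (HLM) estimate with the design-based, weighted one, scaled by the model-based standard error:

```python
def informativeness_I2(gamma_hlm, gamma_w, se_hlm):
    # I2 = (gamma_h^HLM - gamma_h^W) / S.E.(gamma_h^HLM)
    return (gamma_hlm - gamma_w) / se_hlm

# Hypothetical estimates for two coefficients: a large |I2| flags a
# coefficient for which the sampling design is informative.
for name, est_hlm, est_w, se in [("SES", 0.10, 0.14, 0.04),
                                 ("Male", -0.33, -0.34, 0.06)]:
    print(name, round(informativeness_I2(est_hlm, est_w, se), 2))
```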
126
14. Survey weights 231–236
Numbers of sampled schools (left) and sampled pupils (right) per stratum,
for PISA data, USA, 2009.
                  Midwest        Northeast      South          West
                 sch. pupils    sch. pupils    sch. pupils    sch. pupils
Public schools    38  1,264      26    732      55  1,776      35  1,116
Private schools    2     71       2     46       4    108       3    120
127
14. Survey weights 231–236
There are too few private schools to say much specifically about them.
Level-one weights all are between .95 and 1.00: can be disregarded.
128
14. Survey weights 231–236
The unweighted results are seriously biased for villages and cities,
and unweighted standard errors are unreliable.
129
14. Survey weights 231–236
130
14. Survey weights 231–236
131
14. Survey weights 231–236
Estimates for model for metacognitive competence for five parts of the data set:
fixed effects.
Fixed effects Par. S.E. Par. S.E. Par. S.E. Par. S.E. Par. S.E.
Intercept –0.72 0.48 –0.12 0.10 –0.15 0.10 –0.17 0.06 –0.18 0.07
Male –0.27 0.12 –0.34 0.06 –0.33 0.06 –0.28 0.06 –0.35 0.06
Age –0.05 0.23 –0.05 0.12 –0.18 0.12 –0.26 0.11 –0.01 0.12
Grade 0.25 0.12 0.20 0.06 0.24 0.07 0.28 0.06 0.18 0.07
Immigrant –0.13 0.09 0.01 0.04 0.05 0.05 –0.03 0.06 0.06 0.09
ESCS 0.04 0.08 0.10 0.04 0.14 0.04 0.08 0.04 0.11 0.04
Sch-imm 0.51 0.37 0.04 0.14 0.14 0.16 0.10 0.16 –0.29 0.36
Sch-ESCS 0.74 0.40 0.16 0.12 0.11 0.12 0.25 0.13 0.10 0.13
132
14. Survey weights 231–236
Estimates for model for metacognitive competence for five parts of the data set:
variance parameters.
N (schools) 11 38 39 38 39
133
14. Survey weights 231–236
Conclusions:
Next page:
Estimates for model for metacognitive competence, including urbanization, for five
parts of the data set: fixed effects.
Reference category for urbanization is ‘city’.
134
14. Survey weights 231–236
N schools 11 38 39 38 39
Fixed effects Par. S.E. Par. S.E. Par. S.E. Par. S.E. Par. S.E.
Intercept –0.90 0.40 –0.16 0.12 0.03 0.13 0.01 0.13 –0.01 0.23
Male –0.27 0.11 –0.33 0.06 –0.33 0.06 –0.28 0.06 –0.35 0.06
Age –0.07 0.23 –0.04 0.12 –0.19 0.12 –0.27 0.11 –0.01 0.12
Grade 0.26 0.12 0.20 0.06 0.25 0.07 0.28 0.06 0.18 0.07
Immigrant –0.13 0.09 0.01 0.04 0.05 0.05 –0.03 0.06 0.06 0.09
ESCS 0.03 0.08 0.10 0.04 0.14 0.04 0.08 0.04 0.11 0.04
Sch-imm 0.85 0.33 0.11 0.15 –0.00 0.18 0.11 0.17 0.12 0.43
Sch-ESCS 1.06 0.33 0.19 0.13 0.14 0.13 0.21 0.14 0.15 0.14
Large city –0.80 0.26 –0.10 0.12 0.01 0.17 –0.32 0.18 –0.51 0.34
Town –0.36 0.24 0.12 0.12 –0.17 0.13 –0.20 0.13 –0.38 0.26
Small town — 0.02 0.21 –0.42 0.17 –0.22 0.13 –0.24 0.25
Village — — — –0.23 0.17 –0.12 0.24
135
14. Survey weights 231–236
N schools 11 38 39 38 39
136
14. Survey weights 231–236
Conclusion
Next page:
Design-based and model-based estimates for model for metacognitive competence,
entire data set: fixed effects.
I2 is Asparouhov’s (2006) informativeness measure.
137
14. Survey weights 240–243
Design-based Model-based
Fixed effects Par. S.E. Par. S.E. I2
138
14. Survey weights 240–243
Design-based Model-based
139
14. Survey weights 240–243
Conclusion:
Differences remain, mainly with respect to school-level variables.
A residual analysis showed that there is one outlier:
a school with 6,694 pupils enrolled,
while other schools range between 100 and 3,592.
This school and all private schools were excluded,
and square root of school size included as control variable.
Next page:
Design-based and model-based estimates for model for metacognitive competence,
public schools without outlier: fixed effects.
I2 is Asparouhov’s (2006) informativeness measure.
140
14. Survey weights 240–243
Design-based Model-based
Design-based Model-based
Conclusion:
Only the coefficient of age still differs between the two types of analysis.
This suggests that age effects depend on the design variables.
This could be, e.g., school enrollment or urbanization.
142
14. Survey weights 240–243
Next page:
Design-based and model-based estimates for model for metacognitive competence,
public schools without outlier, with more extensive controls: fixed effects.
I2 is Asparouhov’s (2006) informativeness measure.
143
14. Survey weights 240–243
Design-based Model-based
Design-based Model-based
Conclusion:
It seems that we have now arrived at a good model.
145
14. Survey weights 240–243
1. Fixed occasions:
All subjects measured at the same time points.
2. Variable occasions:
Time points can be different; also the number of measurements may differ.
(Then there are few alternative approaches.)
149
15. Longitudinal data 250-251
150
15. Longitudinal data 250-251
151
15. Longitudinal data 253-254
Here again the assumptions of the random intercept model are quite restrictive.
It is likely that individual respondents differ not only in their mean value over time,
but also in rate of change and other aspects of time dependence.
This is modeled by including random slopes of time,
and of non-linear transformations of time.
Random slope of time:
Yti = µt + U0i + U1i (t − t0) + Rti ;
with a covariate, e.g.,
Yti = µt + αzi + γzi(t − t0) + U0i + U1i (t − t0) + Rti .
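A simulation sketch of the random-slope-of-time model (all parameter values are invented) shows the typical fan-shaped increase of variance over time:

```python
import numpy as np

rng = np.random.default_rng(0)

# Y_ti = mu_t + U_0i + U_1i (t - t0) + R_ti, with t0 = 0 and six occasions.
n_subj, times = 500, np.arange(6.0)
mu = 10 + 0.5 * times                           # fixed occasion means mu_t (assumed)
U0 = rng.normal(0.0, 1.0, n_subj)               # random intercepts, tau0^2 = 1
U1 = rng.normal(0.0, 0.3, n_subj)               # random time slopes, tau1^2 = 0.09
R = rng.normal(0.0, 0.8, (n_subj, len(times)))  # level-one residuals

Y = mu + U0[:, None] + U1[:, None] * times + R

# With independent U0 and U1, var(Y_t) = tau0^2 + t^2 tau1^2 + sigma^2,
# so the variance grows with t.
print(Y.shape, round(Y[:, 0].var(), 2), round(Y[:, 5].var(), 2))
```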
152
15. Longitudinal data 254
153
15. Longitudinal data 254
Model 3 Model 4
Fixed effect Coefficient S.E. Coefficient S.E.
154
15. Longitudinal data 254-255
Interpretation:

Σ̂(Y) =
  [ 3.67  2.16  2.06  1.96  1.87  1.77 ]
  [ 2.16  3.50  2.01  1.94  1.87  1.80 ]
  [ 2.06  2.01  3.38  1.92  1.87  1.82 ]
  [ 1.96  1.94  1.92  3.31  1.87  1.85 ]
  [ 1.87  1.87  1.87  1.87  3.29  1.88 ]
  [ 1.77  1.80  1.82  1.85  1.88  3.33 ]

R̂(Y) =
  [ 1.00  0.60  0.58  0.56  0.54  0.51 ]
  [ 0.60  1.00  0.58  0.57  0.55  0.53 ]
  [ 0.58  0.58  1.00  0.57  0.56  0.54 ]
  [ 0.56  0.57  0.57  1.00  0.57  0.56 ]
  [ 0.54  0.55  0.56  0.57  1.00  0.57 ]
  [ 0.51  0.53  0.54  0.56  0.57  1.00 ]
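The correlation matrix is obtained from Σ̂(Y) by scaling each covariance with the two occasion standard deviations; a quick check reproduces R̂(Y):

```python
import numpy as np

# Fitted covariance matrix of the six measurement occasions.
Sigma = np.array([
    [3.67, 2.16, 2.06, 1.96, 1.87, 1.77],
    [2.16, 3.50, 2.01, 1.94, 1.87, 1.80],
    [2.06, 2.01, 3.38, 1.92, 1.87, 1.82],
    [1.96, 1.94, 1.92, 3.31, 1.87, 1.85],
    [1.87, 1.87, 1.87, 1.87, 3.29, 1.88],
    [1.77, 1.80, 1.82, 1.85, 1.88, 3.33],
])

sd = np.sqrt(np.diag(Sigma))          # occasion standard deviations
R = Sigma / np.outer(sd, sd)          # r_ts = sigma_ts / (sigma_t sigma_s)
print(np.round(R, 2))
```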
156
15. Longitudinal data 254-255
157
15. Longitudinal data 256-258
158
15. Longitudinal data 256-258
159
15. Longitudinal data 256-258
160
15. Longitudinal data 256-258
         [ 4.039  1.062 ]
Σ̂(Y) =  [ 1.062  3.210 ] .
161
15. Longitudinal data (not in the book)
Model 4 Model 5
Fixed effect Coefficient S.E. Coefficient S.E.
162
15. Longitudinal data (not in the book)
  [ 3.67 2.16 2.06 1.96 1.87 1.77 ]       [ 3.61 2.11 1.92 1.92 1.84 1.60 ]
  [ 2.16 3.50 2.01 1.94 1.87 1.80 ]       [ 2.11 3.60 2.15 1.95 1.90 1.80 ]
  [ 2.06 2.01 3.38 1.92 1.87 1.82 ]  and  [ 1.92 2.15 3.26 1.96 1.97 1.73 ]
  [ 1.96 1.94 1.92 3.31 1.87 1.85 ]       [ 1.92 1.95 1.96 3.49 2.07 1.85 ]
  [ 1.87 1.87 1.87 1.87 3.29 1.88 ]       [ 1.84 1.90 1.97 2.07 3.29 1.87 ]
  [ 1.77 1.80 1.82 1.85 1.88 3.33 ]       [ 1.60 1.80 1.73 1.85 1.87 2.88 ]
163
15. Longitudinal data 263-266
164
15. Longitudinal data 263-266
Table 15.7 Linear growth model for 5–10-year-old children with retarded growth.
165
15. Longitudinal data 263-266
166
15. Longitudinal data 263-266
Next to linear models, polynomial models and spline models can be considered.
Here a simple approach to splines: fixed nodes determined by trial and error.
167
15. Longitudinal data 265-267
Table 15.8 Cubic growth model for 5–10-year-old children with retarded growth
Deviance 6603.75
168
15. Longitudinal data 265-267
Estimated correlation matrix for the level-two random effects (U0i, U1i, U2i, U3i):

R̂U =
  [  1.0    0.17  −0.27   0.04 ]
  [  0.17   1.0    0.11  −0.84 ]
  [ −0.27   0.11   1.0   −0.38 ]
  [  0.04  −0.84  −0.38   1.0  ]
Again ‘predicted’ curves from empirical Bayes estimates – looks almost linear:
169
15. Longitudinal data 268-270
170
15. Longitudinal data 268-270
Table 15.9 Piecewise linear growth model for 5–10-year-old children with retarded growth
Deviance 6481.87
171
15. Longitudinal data 268-270
Estimated correlation matrix of the level-two random effects (U0i, ..., U5i)
and ‘predicted’ curves:

R̂U =
  [  1.0    0.22   0.31   0.14  −0.05   0.09 ]
  [  0.22   1.0    0.23   0.01   0.18   0.33 ]
  [  0.31   0.23   1.0    0.12  −0.16   0.48 ]
  [  0.14   0.01   0.12   1.0    0.47  −0.23 ]
  [ −0.05   0.18  −0.16   0.47   1.0    0.03 ]
  [  0.09   0.33   0.48  −0.23   0.03   1.0  ]
172
15. Longitudinal data 270-276
173
15. Longitudinal data 270-276
174
15. Longitudinal data 270-276
Table 15.10 Cubic spline growth model for 12–17-year-old children with retarded growth
175
15. Longitudinal data 270-276
Estimated correlation matrix of the level-two random effects (U0i, ..., U4i):

R̂U =
  [  1.0    0.26  −0.31   0.32   0.01 ]
  [  0.26   1.0    0.45  −0.08  −0.82 ]
  [ −0.31   0.45   1.0   −0.89  −0.71 ]
  [  0.32  −0.08  −0.89   1.0    0.40 ]
  [  0.01  −0.82  −0.71   0.40   1.0  ]
176
15. Longitudinal data 274
[Figure: individual growth curves, length (120–170 cm) against t (age in years, 12–17); the average growth curve is marked with ∗]
Figure 15.1 Average growth curve (∗) and 15 random growth curves for
12–17-year-olds for cubic spline model.
Next: explain growth variability by gender and parents’ height (Table 15.11).
177
15. Longitudinal data 275-276
178
15. Longitudinal data 275-276
Estimated correlation matrix of the level-two random effects (U0i, ..., U4i):

R̂U =
  [  1.0    0.22  −0.38   0.35   0.07 ]
  [  0.22   1.0    0.38  −0.05  −0.81 ]
  [ −0.38   0.38   1.0   −0.91  −0.75 ]
  [  0.35  −0.05  −0.91   1.0    0.48 ]
  [  0.07  −0.81  −0.75   0.48   1.0  ]
179
17. Discrete dependent variables 292–293
180
17. Discrete dependent variables 294-295
logit(p) = ln( p / (1 − p) ),

where ln(x) denotes the natural logarithm.

[Figure: graph of the logit function]
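A direct transcription, with the inverse (logistic) function added for completeness:

```python
import math

def logit(p):
    # logit(p) = ln(p / (1 - p)), defined for 0 < p < 1
    return math.log(p / (1 - p))

def inv_logit(x):
    # Inverse: the logistic function p = 1 / (1 + exp(-x))
    return 1 / (1 + math.exp(-x))

print(round(logit(0.5), 2), round(logit(0.88), 2), round(inv_logit(2.0), 2))
```

logit(0.5) = 0 and logit(0.88) ≈ 2, matching the tick pairs in the figure on the next slide.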
181
17. Discrete dependent variables 294-295
[Figure: the logistic curve relating logit(p) (from −4 to 4, horizontal axis) to p (from 0.05 to 0.95, vertical axis)]
182
17. Discrete dependent variables 295–297
Figure 17.5: Observed log-odds and estimated normal distribution
of population log-odds of religious attendance.
183
17. Discrete dependent variables 297–299
184
17. Discrete dependent variables 297–299
Explanatory variables:
• Educational level, measured as the age at which people left school (14–21 years), minus 14.
This variable was centered within countries. The within-country deviation variable has mean 0,
standard deviation 2.48.
• Income, standardized within country; mean −0.03, standard deviation 0.99.
• Employment status, 1 for unemployed, 0 for employed; mean 0.19, standard deviation 0.39.
• Sex, 1 for female, 0 for male; mean 0.52, standard deviation 0.50.
• Marital status, 1 for single/divorced/widowed, 0 for married/cohabiting; mean 0.23, standard
deviation 0.42.
• Divorce status, 1 for divorced, 0 for other; mean 0.06, standard deviation 0.25.
• Widowed, 1 for widowed, 0 for other; mean 0.08, standard deviation 0.27.
• Urbanization, the logarithm of the number of inhabitants in the community or town of
residence, truncated between 1,000 and 1,000,000, minus 10; mean 0.09, standard deviation
2.18.
185
17. Discrete dependent variables 297–299
Next page:
Table 17.2.
Logistic random intercept model for religious attendance in 59 countries.
186
17. Discrete dependent variables 297–299
Random intercept:
τ0 = S.D.(U0j ) intercept standard deviation 1.08
Deviance 115,969.9
187
17. Discrete dependent variables 302–303
188
17. Discrete dependent variables 302–303
Table 17.3.
Logistic random slope model for religious attendance in 59 countries.
(continued...)
189
17. Discrete dependent variables 302–303
(... continuation)
For a data set with large groups like these country data,
a two-step approach (Section 3.7) might be preferable.
190
17. Discrete dependent variables 307–309
Estimated level-two intercept variance may go up when level-one variables are added,
and always does when these have no between-group variance.
This can be understood by the threshold model,
which is equivalent to logistic regression:

    Y = 0 if Y̆ ≤ 0,
    Y = 1 if Y̆ > 0,
191
17. Discrete dependent variables 307–309
A measure of explained variance (‘R²’) for multilevel logistic regression can be based
on this threshold representation, as the
proportion of explained variance in the latent variable.
Because of the arbitrary fixing of σR² at π²/3,
these calculations must be based on one single model fit.
192
17. Discrete dependent variables 307–309
Let

Ŷij = γ0 + Σ_{h=1}^{r} γh x_hij

be the latent linear predictor; then

Y̆ij = Ŷij + U0j + Rij .

Calculate Ŷij (using estimated coefficients) and then

σF² = var( Ŷij )

in the standard way from the data; then

var( Y̆ij ) = σF² + τ0² + σR²

where σR² = π²/3 = 3.29.
193
17. Discrete dependent variables 305–307
is at level one.
194
17. Discrete dependent variables 305–307
Model 1
Fixed Effect Coefficient S.E.
γ0 Intercept 2.487 0.110
γ1 Gender –1.515 0.102
γ2 Minority status –0.727 0.195
Linear predictor

Ŷij = 2.487 − 1.515 gender_ij − 0.727 minority_ij

has variance σ̂F² = 0.582. Therefore

R²_dicho = 0.582 / (0.582 + 0.481 + 3.29) = 0.13 .
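The computation generalizes to any fitted model; a sketch reproducing the slide's number (the function name is mine):

```python
import math

SIGMA_R2 = math.pi ** 2 / 3        # fixed level-one variance, pi^2/3 = 3.29

def r2_dicho(var_fitted, tau0_sq):
    # Explained proportion of latent-variable variance:
    # sigma_F^2 / (sigma_F^2 + tau0^2 + pi^2/3)
    return var_fitted / (var_fitted + tau0_sq + SIGMA_R2)

print(round(r2_dicho(0.582, 0.481), 2))   # the slide's example
```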
195
17. Discrete dependent variables 310–313
‘Structural model’:

Y̆ij = γ0 + Σ_{h=1}^{r} γh x_hij + U0j + Rij .
196
17. Discrete dependent variables 313
Table 17.6 Multilevel 4-category logistic regression model number of science subjects
Model 1 Model 2
Threshold parameters Threshold S.E. Threshold S.E.
θ1 Threshold 1 - 2 1.541 0.041 1.763 0.045
θ2 Threshold 2 - 3 2.784 0.046 3.211 0.054
197
17. Discrete dependent variables 314–319
ln E(Lij) = γ0 + Σ_{h=1}^{r} γh x_hij + U0j .
Model 1 Model 2
199
17. Discrete dependent variables 314–319
Model 3 Model 4
200