Week14 Longitudinal Data
Department of Statistics
Graduate Course
NCKU
id   age   Y1    Y2
1    14    28    22
2    12    34    16
3    ···   ···   ···
In the long format there will be multiple records for each unit.
id   age   Y
1    14    28
1    14    22
2    12    34
2    12    16
3    ···   ···
The wide and long formats each have advantages and
disadvantages, depending on the situation.
The long format has an explicit time variable that can
be used for analysis.
The wide and long formats can be converted into each
other.
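The conversion between the two layouts can be sketched with pandas, using the two fully shown rows of the example tables. The column name `occasion` and the variable names are illustrative assumptions, not part of the slides.

```python
import pandas as pd

# Wide format: one row per unit, one column per measurement occasion.
wide = pd.DataFrame(
    {"id": [1, 2], "age": [14, 12], "Y1": [28, 34], "Y2": [22, 16]}
)

# Wide -> long: stack Y1 and Y2 into a single column Y.
# "occasion" (an assumed name) records which column each value came from.
long = wide.melt(
    id_vars=["id", "age"],
    value_vars=["Y1", "Y2"],
    var_name="occasion",
    value_name="Y",
)
long = long.sort_values(["id", "occasion"]).reset_index(drop=True)

# Long -> wide: spread the occasions back into separate columns.
back = long.pivot(index=["id", "age"], columns="occasion", values="Y")
back = back.reset_index()
back.columns.name = None  # drop the leftover "occasion" label on the columns

print(long)
print(back)
```

The long table has two records per unit, and the round trip recovers the original wide table.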
Understand Longitudinal/Clustered Data
Apart from the fact that the columns are ordered in time, there
is no major difference from cross-sectional imputation methods.
Basic Setup for Longitudinal Data
Yij : j = 1, · · · , ni ; i = 1, · · · , N.
In a longitudinal study,
i indicates the subject and j the measurement occasion.
\[
R_{ij} =
\begin{cases}
1, & \text{if } Y_{ij} \text{ is observed} \\
0, & \text{otherwise.}
\end{cases}
\]
(1, 1, · · · , 1, 0, 0, · · · , 0)
Ti = Di − 1
                                Week
Patient  Treatment  Baseline    1    2    3    4    5    6
1        1          22         20   18   16   14   12   10
2        1          22         21   18   15   12    9    6
3        1          22         22   21   20   19  -99  -99
4        2          20         20   20   20   21   21   22
5        2          21         22   22   23   24   25   26
6        2          18         19   20  -99  -99  -99  -99
CC and LOCF Methods
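As a rough illustration, complete-case (CC) analysis and last observation carried forward (LOCF) can be applied to the six-patient table above. The sketch assumes that -99 is a missing-value code and compares week 6 means by treatment group; both choices are assumptions made for illustration.

```python
import numpy as np

# Columns: baseline, week 1, ..., week 6 (values copied from the patient table).
treatment = np.array([1, 1, 1, 2, 2, 2])
y = np.array(
    [
        [22, 20, 18, 16, 14, 12, 10],
        [22, 21, 18, 15, 12, 9, 6],
        [22, 22, 21, 20, 19, -99, -99],
        [20, 20, 20, 20, 21, 21, 22],
        [21, 22, 22, 23, 24, 25, 26],
        [18, 19, 20, -99, -99, -99, -99],
    ],
    dtype=float,
)
y[y == -99] = np.nan  # assumption: -99 codes a missing measurement

# Missingness indicators: R_ij = 1 if Y_ij is observed, 0 otherwise.
R = (~np.isnan(y)).astype(int)

# CC: keep only patients with every measurement observed.
complete = R.all(axis=1)
cc_week6 = {g: y[complete & (treatment == g), -1].mean() for g in (1, 2)}

# LOCF: replace each missing value by the previous value in the same row.
locf = y.copy()
for j in range(1, locf.shape[1]):
    miss = np.isnan(locf[:, j])
    locf[miss, j] = locf[miss, j - 1]
locf_week6 = {g: locf[treatment == g, -1].mean() for g in (1, 2)}

print(R)
print("CC week 6 means:  ", cc_week6)
print("LOCF week 6 means:", locf_week6)
```

On these six patients the CC week 6 means are 8 and 24, while LOCF gives about 11.7 and 22.7, so the two methods already disagree on the size of the treatment difference.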
β0 + β1 Ti + β2 ti + β3 Ti ti , (1)
γ0 + γ1 Ti + γ2 ti + γ3 Ti ti . (2)
∆CC = γ1 + γ3 . (5)
• We will now consider the special but important cases where the
true missing data mechanisms are MCAR and MAR, respectively.
E (Yi ) = β0 + β1 xi , i = 1, 2, · · · , n.
and
Yi − E(Yi) ∼ N(0, σ²), independent.
The ordinary least squares (or maximum likelihood) regression line
is obtained by solving the normal equations for β:
\[
\sum_{i} X_i^T \left( y_i - X_i^T \hat\beta \right) = 0.
\]
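A small numerical check of the normal equations, on simulated data, might look as follows. The true coefficients (1, 2), the error scale, and the sample size are arbitrary assumptions chosen for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)  # assumed true line: 1 + 2x

# Design matrix: row i holds (1, x_i), the covariate vector of unit i.
X = np.column_stack([np.ones(n), x])

# In matrix form the normal equations read (X^T X) beta = X^T y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# The estimating function evaluated at beta_hat should be numerically zero.
score = X.T @ (y - X @ beta_hat)

print("beta_hat:", beta_hat)
print("score at beta_hat:", score)
```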
Generalized estimating equations
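\[
\sum_{\text{observed}} X_i^T \left( y_i - X_i^T \hat\beta \right)
+ \sum_{\text{missing}} X_i^T \left( y_i^{*} - X_i^T \hat\beta \right) = 0,
\tag{9}
\]
where
\[
y_i^{*} = E_{Y_i \mid r_i}(Y_i).
\]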
πi = P(Ri = 1|y, Xi ).
Then it is easily seen that the IPW normal equations are unbiased
(for known πi),
\[
\sum_{\text{observed}} \frac{X_i^T \left( y_i - X_i^T \hat\beta \right)}{\pi_i}
\tag{10}
\]
and hence β̂ is consistent for β.
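The effect of the inverse probability weights in (10) can be sketched by simulation. The data-generating model, the form of πi (bounded between 0.2 and 0.8 and depending on the outcome itself), and the helper name `wls` are assumptions made for illustration; πi is treated as known, as in the statement above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)  # assumed true beta = (1, 2)
X = np.column_stack([np.ones(n), x])

# Assumed known observation probabilities pi_i, depending on y_i.
pi = 0.2 + 0.6 / (1.0 + np.exp(-(y - 1.0)))
r = rng.random(n) < pi  # r_i = True means unit i is observed


def wls(X, y, w):
    """Solve sum_i w_i X_i (y_i - X_i^T beta) = 0 for beta."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)


# Complete-case fit: observed units only, unit weights.
beta_cc = wls(X[r], y[r], np.ones(r.sum()))

# IPW fit: observed units only, each weighted by 1 / pi_i.
beta_ipw = wls(X[r], y[r], 1.0 / pi[r])

print("complete case:", beta_cc)
print("IPW:          ", beta_ipw)
```

In this setup the unweighted complete-case fit is pulled away from the assumed true coefficients because units with larger outcomes are more likely to be observed, while the weighted fit stays close to them.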
\[
S(\beta) = \sum_{i=1}^{n} S_i(z_i; \hat\beta) = 0.
\tag{11}
\]
\[
\sum_{\text{observed}} \frac{S_i(z_i, \hat\beta)}{P(R_i = 1)}.
\tag{12}
\]
π̂ → π0 , as n → ∞,
υi = Var (Yi ).
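\[
S(\beta)
= \sum_{i} \sum_{j} \frac{\partial \mu_{ij}}{\partial \beta}\,
  \upsilon_{ij}^{-1} \left( y_{ij} - \mu_{ij} \right)
= \sum_{i=1}^{N} D_i' \left[ V_i(\alpha) \right]^{-1} \left( y_i - \mu_i \right)
\tag{14}
\]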
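For a linear mean µi = Xi β the derivative matrix Di equals Xi, and setting (14) to zero can be solved in closed form once a working covariance is fixed. The sketch below assumes a common design for all subjects, unit variances, and an exchangeable working covariance with a fixed α; none of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n_i, alpha = 200, 4, 0.5  # subjects, occasions per subject, assumed correlation

t = np.arange(n_i, dtype=float)
X_i = np.column_stack([np.ones(n_i), t])  # same design for every subject, D_i = X_i

# Exchangeable working covariance with unit variances.
V = (1 - alpha) * np.eye(n_i) + alpha * np.ones((n_i, n_i))
V_inv = np.linalg.inv(V)

# Simulate correlated responses; row i of Y is the vector y_i.
beta_true = np.array([1.0, 0.5])
L = np.linalg.cholesky(V)
Y = X_i @ beta_true + rng.normal(size=(N, n_i)) @ L.T


def score(beta):
    """S(beta) = sum_i D_i' V_i^{-1} (y_i - mu_i) with mu_i = X_i beta."""
    resid = Y - X_i @ beta  # shape (N, n_i)
    return (X_i.T @ V_inv @ resid.T).sum(axis=1)


# Solve S(beta) = 0: (sum_i X_i' V^{-1} X_i) beta = sum_i X_i' V^{-1} y_i.
A = N * (X_i.T @ V_inv @ X_i)
b = X_i.T @ V_inv @ Y.sum(axis=0)
beta_hat = np.linalg.solve(A, b)

print("beta_hat:", beta_hat)
print("score at beta_hat:", score(beta_hat))
```

With a nonlinear link the same equation would need an iterative solver, and α would be estimated rather than fixed.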
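Note that
• D_i is an n_i × p matrix with (j,k)th element ∂µ_ij/∂β_k
• V_i is an n_i × n_i matrix, diagonal or of a more complex form:
\[
V_i(\beta, \alpha) = \phi\, A_i^{1/2}(\beta)\, R_i(\alpha)\, A_i^{1/2}(\beta)
\]
in which
\[
A_i^{1/2}(\beta) =
\begin{pmatrix}
\sqrt{\upsilon_{i,1}(\mu_{i1}(\beta))} & 0 & \cdots & 0 \\
0 & \sqrt{\upsilon_{i,2}(\mu_{i2}(\beta))} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sqrt{\upsilon_{i,n_i}(\mu_{in_i}(\beta))}
\end{pmatrix}
\]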
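The decomposition of V_i can be checked numerically. The sketch below assumes a Poisson-type variance function υ(µ) = µ, φ = 1, and an exchangeable R_i with α = 0.3; the function name `working_covariance` is hypothetical.

```python
import numpy as np


def working_covariance(mu, R, phi=1.0, var_fun=lambda m: m):
    """Return V = phi * A^{1/2} R A^{1/2}, where A = diag(var_fun(mu)).

    The default var_fun(mu) = mu is an assumed Poisson-type variance function.
    """
    a_half = np.diag(np.sqrt(var_fun(mu)))
    return phi * a_half @ R @ a_half


mu = np.array([1.0, 4.0, 9.0])  # assumed means for one subject with n_i = 3
alpha = 0.3

# Exchangeable working correlation: 1 on the diagonal, alpha elsewhere.
R = np.full((3, 3), alpha)
np.fill_diagonal(R, 1.0)

V = working_covariance(mu, R)
print(V)
```

The diagonal of V reproduces the variances υ(µ_ij), and each off-diagonal entry is α times the product of the two standard deviations.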
1. Independence
2. Exchangeable
3. AR(1)
4. Unstructured
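Working correlation matrix
1. Independence
\[
R_i =
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix}
\]
2. Exchangeable
\[
R_i =
\begin{pmatrix}
1 & \alpha & \alpha & \cdots & \alpha \\
\alpha & 1 & \alpha & \cdots & \alpha \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\alpha & \alpha & \alpha & \cdots & 1
\end{pmatrix}
\]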
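4. Unstructured (Symmetric)
\[
R_i =
\begin{pmatrix}
1 & \alpha_{12} & \alpha_{13} & \cdots & \alpha_{1m} \\
\alpha_{21} & 1 & \alpha_{23} & \cdots & \alpha_{2m} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\alpha_{m1} & \alpha_{m2} & \alpha_{m3} & \cdots & 1
\end{pmatrix}
\]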
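The four working correlation structures can be generated by one small helper. The function name `working_correlation` and its argument conventions are hypothetical. The AR(1) case uses the standard form with (j,k)th entry α^|j−k|; its matrix is not displayed in the text above.

```python
import numpy as np


def working_correlation(structure, m, alpha=0.0):
    """Build an m x m working correlation matrix.

    alpha is a scalar for "exchangeable" and "ar1"; for "unstructured" it is
    a sequence of the m(m-1)/2 upper-triangle correlations in row-major order.
    """
    if structure == "independence":
        return np.eye(m)
    if structure == "exchangeable":
        R = np.full((m, m), float(alpha))
        np.fill_diagonal(R, 1.0)
        return R
    if structure == "ar1":
        idx = np.arange(m)
        # Entry (j, k) is alpha ** |j - k|.
        return float(alpha) ** np.abs(idx[:, None] - idx[None, :])
    if structure == "unstructured":
        R = np.eye(m)
        R[np.triu_indices(m, k=1)] = np.asarray(alpha, dtype=float)
        # Mirror the upper triangle so that alpha_jk = alpha_kj.
        return R + R.T - np.eye(m)
    raise ValueError(f"unknown structure: {structure}")


print(working_correlation("exchangeable", 3, 0.3))
print(working_correlation("ar1", 3, 0.5))
print(working_correlation("unstructured", 3, [0.1, 0.2, 0.3]))
```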
• Augmented IPW
• Multiple imputation