Analysis of Clustered Binary Data
\[
X^2 = \sum_{i=1}^{2} \frac{n_i (\hat{p}_i - \hat{p})^2}{C_i \, \hat{p}(1 - \hat{p})}
\]
where $\hat{p}_i = \sum_{k=1}^{K} p_{ik}$ and $\hat{p}$ is the overall event rate.
In addition, the test can be extended to the comparison
of more than two treatment groups.[14]
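As a rough illustration of the adjusted statistic above, the following Python sketch computes $X^2$ from per-group totals. The function name `adjusted_chi_square` and the counts are hypothetical, each group's event rate is taken to be its positive responses divided by its units, and the inflation factors $C_i$ are supplied as inputs rather than estimated from an ICC.

```python
def adjusted_chi_square(successes, totals, inflation):
    """Chi-square for homogeneity with each group's term divided by C_i.

    successes[i], totals[i]: positive responses and units in group i.
    inflation[i]: inflation factor C_i (1.0 means no clustering adjustment).
    """
    p_overall = sum(successes) / sum(totals)  # overall event rate
    stat = 0.0
    for y_i, n_i, c_i in zip(successes, totals, inflation):
        p_i = y_i / n_i  # event rate in group i (assumed definition)
        stat += n_i * (p_i - p_overall) ** 2 / (c_i * p_overall * (1 - p_overall))
    return stat


# Hypothetical data: 30/100 versus 50/100 positive responses.
unadjusted = adjusted_chi_square([30, 50], [100, 100], [1.0, 1.0])
adjusted = adjusted_chi_square([30, 50], [100, 100], [2.0, 2.0])
print(round(unadjusted, 3), round(adjusted, 3))
```

With every $C_i$ set to 1 the function reduces to the ordinary Pearson statistic for two groups, so the second call shows how an inflation factor of 2 halves the test statistic.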
Although the ICC appears to be quite adaptable to adjusting a
variety of chi-square tests, this method has limitations
regarding the required size of the study population and
the test assumptions. Donald and Donner[13] suggest that the
number of units per cluster should be greater than 10 and the
ratio of the number of units to the number of observations per
unit should be greater than two when applying the adjusted
chi-square to Pearson's chi-square test for homogeneity
of proportions. They also note that the method's
assumption, that the correlation between responses
within a cluster is equal, may not be appropriate when
the probability of a correct response varies between
units within a cluster or when the correlation is dependent
upon the size of the cluster. However, Jung et al.[17]
illustrate through simulation studies that the
adjustment method performs well even if the assumption
of a common intracluster correlation is not met,
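Table 1 Contingency table for clustered binary data

           Response
           Success                                                 Failure                                             Total
Group 1    $\sum_{k=1}^{K}\sum_{j=1}^{n_k} y_{1jk}$                $\sum_{k=1}^{K}\sum_{j=1}^{n_k} (n_1 - y_{1jk})$    $\sum_{k=1}^{K}\sum_{j=1}^{n_k} n_{1jk}$
Group 2    $\sum_{k=1}^{K}\sum_{j=1}^{n_k} y_{2jk}$                $\sum_{k=1}^{K}\sum_{j=1}^{n_k} (n_2 - y_{2jk})$    $\sum_{k=1}^{K}\sum_{j=1}^{n_k} n_{2jk}$
Total      $\sum_{k=1}^{K}\sum_{j=1}^{n_k} (y_{1jk} + y_{2jk})$                                                        $\sum_{k=1}^{K}\sum_{j=1}^{n_k} (n_{1jk} + n_{2jk})$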
\[
\frac{\sum_{k=1}^{K} (y_{ik} - n_{ik}\hat{p}_i)^2}{\left(\sum_{k=1}^{K} n_{ik}\right)^2}
\]
divided by the estimated binomial variance of the
proportion of positive responses, $n_i^{-1}\hat{p}_i(1 - \hat{p}_i)$. Rao
and Scott refer to the inflation factor as the design
effect and apply it to the standard chi-square test
with $I - 1$ degrees of freedom,
\[
\tilde{\chi}^2 = \sum_{i=1}^{I} \frac{(\tilde{y}_i - \tilde{n}_i \tilde{p})^2}{\tilde{n}_i \tilde{p}(1 - \tilde{p})}
\]
The number of positive responses ($y_i$), number of units
($n_i$), and event rate ($p$) in the chi-square equation are
each adjusted by the design effect and applied to the
one degree of freedom chi-square statistic. The design
effect approach is best utilized when the mean cluster
size differs between comparison groups. The limiting
factor of this sampling approach is that a relatively
large number of clusters is suggested in order to obtain
a consistent ratio estimator.
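The design effect adjustment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variance of the ratio estimator is taken as the sum of squared cluster residuals $(y_{ik} - n_{ik}\hat{p}_i)^2$ divided by the squared group size, with no finite-sample multiplier, and the function names and the small data set are hypothetical.

```python
def design_effect(cluster_successes, cluster_sizes):
    """Ratio-estimator variance divided by the binomial variance for one group.

    Assumes the group event rate lies strictly between 0 and 1 and that the
    cluster-level event rates are not all identical.
    """
    n_i = sum(cluster_sizes)
    p_i = sum(cluster_successes) / n_i
    # Cluster-level residuals y_ik - n_ik * p_i drive the ratio-estimator variance.
    resid_ss = sum((y - n * p_i) ** 2
                   for y, n in zip(cluster_successes, cluster_sizes))
    ratio_var = resid_ss / n_i ** 2
    binomial_var = p_i * (1 - p_i) / n_i
    return ratio_var / binomial_var


def design_effect_chi_square(groups):
    """groups: list of (cluster_successes, cluster_sizes) pairs, one per group."""
    y_adj, n_adj = [], []
    for ys, ns in groups:
        d_i = design_effect(ys, ns)
        y_adj.append(sum(ys) / d_i)  # adjusted number of positive responses
        n_adj.append(sum(ns) / d_i)  # adjusted number of units
    p_adj = sum(y_adj) / sum(n_adj)  # adjusted pooled event rate
    return sum((y - n * p_adj) ** 2 / (n * p_adj * (1 - p_adj))
               for y, n in zip(y_adj, n_adj))


# Hypothetical data: two groups, each with two clusters of two units.
group_1 = ([0, 2], [2, 2])
group_2 = ([1, 2], [2, 2])
d_1 = design_effect(*group_1)
d_2 = design_effect(*group_2)
stat = design_effect_chi_square([group_1, group_2])
print(d_1, d_2, stat)
```

A design effect above 1 (as in the first group, where responses within a cluster agree perfectly) shrinks the effective counts, while a value below 1 enlarges them.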
Following a similar approach, Obuchowski[8] has
developed a test for clustered matched-pair data that
pools the probability of a success across all clusters
and all tests, where $p$ in the above equation is replaced
with $\bar{p}$, which is the average of the proportions of
success for each procedure: $\bar{p} = (\hat{p}_i + \hat{p}_{i'})/2$. The
homogeneity hypothesis and the numerator of the test
statistic remain the same as in the McNemar test,
except that the overall sample proportion of positive
units detected by procedure $i$ ($\hat{p}_i$) is estimated by
counting the number of positive responses by procedure
$i$ summed over all $K$ clusters, e.g.,
a
k
b
k
a
k
c
k
for Procedure 2.
The test is asymptotically distributed as a chi-square
with one degree of freedom under the null hypothesis,
\[
\chi^2_0 = \frac{(\hat{p}_i - \hat{p}_{i'})^2}{\mathrm{var}(\hat{p}_i - \hat{p}_{i'})_{\bar{p}}}
\]
where $\mathrm{var}(\hat{p}_i - \hat{p}_{i'})_{\bar{p}} = \mathrm{var}(\hat{p}_i)_{\bar{p}} + \mathrm{var}(\hat{p}_{i'})_{\bar{p}} - 2\,\mathrm{cov}(\hat{p}_i, \hat{p}_{i'})_{\bar{p}}$.
Obuchowski compares this method to
the ICC adjustment. Under specific conditions (i.e., a
cluster size 5, three different correlation structures,
and various sample sizes), the proposed method performs
similarly to the ICC in terms of maintaining a
Type I error rate below 0.05. The main difference
between the two procedures is that the sampling
approach does not take into account within-cluster
correlation.
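The pooled test can be sketched in Python using the per-cluster counts listed in Table 3. The variance and covariance terms below use cluster-level residuals around the pooled proportion with a $K/(K-1)$ multiplier; that multiplier and the function name `obuchowski_chi_square` are assumptions made for this sketch and are not spelled out in this entry.

```python
def obuchowski_chi_square(x_i, x_ip, n):
    """Pooled chi-square for clustered matched pairs.

    x_i[k], x_ip[k]: positive responses in cluster k for procedures i and i'.
    n[k]: number of units in cluster k.
    """
    K, N = len(n), sum(n)
    p_i, p_ip = sum(x_i) / N, sum(x_ip) / N
    p_bar = (p_i + p_ip) / 2  # pooled proportion used under the null hypothesis
    # Cluster-level residuals around the pooled proportion.
    r_i = [x - m * p_bar for x, m in zip(x_i, n)]
    r_ip = [x - m * p_bar for x, m in zip(x_ip, n)]
    scale = K / ((K - 1) * N ** 2)  # the K/(K-1) multiplier is an assumption
    var_i = scale * sum(r * r for r in r_i)
    var_ip = scale * sum(r * r for r in r_ip)
    cov = scale * sum(a * b for a, b in zip(r_i, r_ip))
    return (p_i - p_ip) ** 2 / (var_i + var_ip - 2 * cov)


# Per-cluster columns n_k, y_ik and y_i'k from Table 3.
n_k = [3, 3, 3, 1, 3, 4, 3, 2, 2, 1, 3, 2, 3, 2, 2, 3, 3, 3, 2, 1, 2]
y_ik = [0, 2, 3, 1, 2, 4, 3, 2, 2, 1, 2, 2, 3, 2, 0, 2, 2, 2, 2, 1, 2]
y_ipk = [2, 3, 3, 1, 3, 4, 3, 2, 1, 1, 2, 2, 3, 2, 2, 2, 2, 3, 2, 1, 2]
stat = obuchowski_chi_square(y_ik, y_ipk, n_k)
print(round(stat, 2))
```

Under these assumptions the statistic rounds to 2.86, which agrees with the value quoted for Obuchowski's test in the example section of this entry.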
An alternative approach proposed by Durkalski
and colleagues utilizes a method of moments approach
for the analysis of clustered matched-pair data.[9] The
proposed test statistic is
\[
\chi^2_V = \frac{\left[\sum_{k=1}^{K} \frac{1}{n_k}(b_k - c_k)\right]^2}{\sum_{k=1}^{K} \left[\frac{b_k - c_k}{n_k}\right]^2}
\]
because under the null hypothesis, the estimated
variance of the difference in success rates is
\[
\mathrm{var}(\hat{p}_{10} - \hat{p}_{01}) = \frac{1}{K^2} \sum_{k=1}^{K} \left(\frac{b_k - c_k}{n_k}\right)^2
\]
This estimate of the variance is consistent according to
large sample theory. Based on simulations, the method
of moments approach performs as well in terms of size
and power as the McNemar test adjusted by the ICC
and Obuchowski's approach for clustered matched-pair
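Table 2 Contingency table for clustered matched-pair binary data

                         Procedure 1
                         Success                                                 Failure                                     Total
Procedure 2   Success    $\sum_{k=1}^{K}\sum_{j=1}^{n_k} a_{jk}$                 $\sum_{k=1}^{K}\sum_{j=1}^{n_k} b_{jk}$     $\sum_{k=1}^{K}\sum_{j=1}^{n_k} (a_{jk} + b_{jk})$
              Failure    $\sum_{k=1}^{K}\sum_{j=1}^{n_k} c_{jk}$                 $\sum_{k=1}^{K}\sum_{j=1}^{n_k} d_{jk}$
              Total      $\sum_{k=1}^{K}\sum_{j=1}^{n_k} (a_{jk} + c_{jk})$                                                  $\sum_{k=1}^{K} n_k = N$

The joint probabilities of success and failure between the two procedures are
$p_{11} = N^{-1}\sum_{k=1}^{K} E(a_k)$, $p_{10} = N^{-1}\sum_{k=1}^{K} E(b_k)$,
$p_{01} = N^{-1}\sum_{k=1}^{K} E(c_k)$, and $p_{00} = N^{-1}\sum_{k=1}^{K} E(d_k)$.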
Downloaded By: [University of Alberta] At: 06:31 7 January 2009
data. Moreover, this method has been extended to
account for non-inferiority study designs.[18]
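Because the method of moments statistic depends only on the per-cluster discordant counts and cluster sizes, it is straightforward to compute. The Python sketch below applies it to the $b_k$, $c_k$, and $n_k$ columns of Table 3; the function name `moments_chi_square` is hypothetical.

```python
def moments_chi_square(b, c, n):
    """Method of moments statistic from discordant counts b_k, c_k and sizes n_k.

    Assumes at least one cluster has b_k != c_k, otherwise the denominator is zero.
    """
    weighted = [(b_k - c_k) / n_k for b_k, c_k, n_k in zip(b, c, n)]
    return sum(weighted) ** 2 / sum(w * w for w in weighted)


# Per-cluster columns from Table 3.
n_k = [3, 3, 3, 1, 3, 4, 3, 2, 2, 1, 3, 2, 3, 2, 2, 3, 3, 3, 2, 1, 2]
b_k = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
c_k = [2, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0]
stat = moments_chi_square(b_k, c_k, n_k)
print(round(stat, 2))
```

The result rounds to 2.32, matching the value reported for this test in the example section. Clusters with no discordant pairs contribute nothing to either the numerator or the denominator.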
Generalized Estimating Equations
Although adjusting the chi-square test for the equality
of probabilities is popular, more complex analyses that
incorporate covariate effects may also be of interest.
These analyses require sophisticated models that
account for the correlation within clusters. A popular
method due to the availability of statistical software
is the application of generalized linear models.[19]
Based on this model, marginal regression modeling
using GEEs was developed as a semiparametric
approach to fitting logistic regression models for
binary clustered data.[11] The logistic regression model
using the GEE estimating function yields results
similar to those of the adjusted chi-square tests previously
discussed in this entry when covariates are not present.
This model is defined as
\[
\mathrm{logit}\,\Pr(Y_{ijk} = 1) = \beta_0 + \beta_1 x_{ijk}
\]
where $x_{ijk}$ identifies the $i$th treatment variable from the
$j$th unit of the $k$th cluster (in the case of two treatments)
and can be run using PROC GENMOD in SAS.[20]
The p-value is generated from a comparison
of the log odds ratio to the estimated variance. Because
clustering affects the standard errors of the parameter
estimates rather than the parameter estimates themselves,
the parameter estimates can be obtained by
running an ordinary regression analysis that treats each
response as independent. The
logit link expresses the linear relationship between a
cluster's responses and the corresponding covariates.
The correlation within a cluster is accounted for
in the variance–covariance matrix. To estimate the
variance of the response, a specified working
correlation matrix that defines the association among
units within a cluster substitutes for the true, often
unknown, correlation matrix. Although the working
correlation matrix may be mis-specified, which could
possibly result in a decrease of efficiency, the GEE
method can produce consistent estimates.[11] Furthermore,
the decreased efficiency may be minimal if the
number of clusters is large.[21] Prentice[22] extends
Zeger and Liang's model to incorporate the modeling
of correlations within a cluster in order to improve
upon the efficiency of the GEE estimates of the regression
parameters. Although the GEE method is more
complex than the straightforward comparison of event
rates, the attractiveness of this approach is that it easily
incorporates adjustments for covariates when needed.
Methods for binary regression using clustered data
deserve an entry of their own and are not discussed
further.
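To make the point about standard errors concrete, the Python sketch below handles only the simplest special case: one binary treatment covariate and an independence working correlation. In that case the point estimates have a closed form, and clustering enters only through a cluster-level "sandwich" variance. This is not PROC GENMOD and not a general GEE solver; the function name `gee_independence` and the data are hypothetical.

```python
import math


def gee_independence(y, x, cluster):
    """Logistic GEE for one binary covariate, independence working correlation.

    y, x: 0/1 responses and 0/1 treatment indicators, one entry per unit.
    cluster: cluster label for each unit.
    Returns (b0, b1, naive_se_b1, robust_se_b1).
    Assumes both event rates lie strictly between 0 and 1.
    """
    n1 = sum(x)
    n0 = len(x) - n1
    p1 = sum(yi for yi, xi in zip(y, x) if xi == 1) / n1
    p0 = sum(yi for yi, xi in zip(y, x) if xi == 0) / n0
    # Point estimates coincide with ordinary logistic regression on pooled units.
    b0 = math.log(p0 / (1 - p0))
    b1 = math.log(p1 / (1 - p1)) - b0

    # "Bread": A = sum over units of mu(1 - mu) * (1, x)(1, x)^T.
    w0, w1 = n0 * p0 * (1 - p0), n1 * p1 * (1 - p1)
    a00, a01, a11 = w0 + w1, w1, w1
    det = a00 * a11 - a01 * a01
    inv01, inv11 = -a01 / det, a00 / det  # second row of the inverse of A

    # "Meat": B = sum over clusters of u_k u_k^T, where u_k is the cluster score.
    scores = {}
    for yi, xi, ci in zip(y, x, cluster):
        resid = yi - (p1 if xi == 1 else p0)
        u = scores.setdefault(ci, [0.0, 0.0])
        u[0] += resid
        u[1] += xi * resid
    b00 = sum(u[0] * u[0] for u in scores.values())
    b01 = sum(u[0] * u[1] for u in scores.values())
    b11 = sum(u[1] * u[1] for u in scores.values())

    naive_var = inv11  # model-based variance of b1, ignoring clustering
    robust_var = inv01 ** 2 * b00 + 2 * inv01 * inv11 * b01 + inv11 ** 2 * b11
    return b0, b1, math.sqrt(naive_var), math.sqrt(robust_var)


# Hypothetical data: five clusters per arm, two units per cluster,
# with both units in a cluster always giving the same response.
arms_and_outcomes = [(0, 1), (0, 1), (0, 0), (0, 0), (0, 0),
                     (1, 1), (1, 1), (1, 1), (1, 0), (1, 0)]
y, x, cluster = [], [], []
for k, (arm, outcome) in enumerate(arms_and_outcomes):
    for _ in range(2):
        y.append(outcome)
        x.append(arm)
        cluster.append(k)

b0, b1, naive_se, robust_se = gee_independence(y, x, cluster)
z = b1 / robust_se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided Wald test
print(b1, naive_se, robust_se, p_value)
```

In this constructed data set the log odds ratio is the same whether or not clustering is considered, but the robust variance is exactly twice the naive one, reflecting clusters of size two with perfectly correlated responses.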
EXAMPLE
To illustrate the clinical application of commonly used
methods for analyzing clustered binary data and the
importance of accounting for the clustered nature of
the data, Donner and Klar[23] consider data collected
from schools randomly allocated to one of two interventions.
The outcome of interest is the proportion
of children who use smokeless tobacco after two years
of follow-up. The performance of various methods
for testing the equality of clustered binary data is
observed, including the test statistics and p-values of
the unadjusted Pearson chi-square, the ICC inflation
factor, the design effect inflation factor, and the GEE
approach. The test statistics and p-values illustrate that
statistical conclusions can be false when clustering is
ignored (using the standard chi-square test), while all
methods that adjust for clustering give broadly the
same conclusions with some variability in the actual
test values and significance levels.
For matched-pair data, a data set containing
clustered matched-pair data that appeared in
Obuchowski's paper[8] is assessed with the unadjusted
McNemar test, the ICC adjustment, pooled estimators,
and the method of moments approach. A trial in diagnostic
methods for hyperparathyroidism is designed to
compare the sensitivity and specificity of two techniques,
positron emission tomography (PET) and a
single photon emission CT (SPECT) scan. The data
consist of 21 patients whose glands were examined
for the presence of hyperparathyroidism (Table 3).
Of the 21 patients, a total of 72 glands were evaluated
by both diagnostic tools. The specificity of the two
scanners is considered to be of interest and, therefore,
only the glands that are confirmed negative by surgery
(considered the gold standard) are evaluated. Of the 72
glands among the 21 patients, a total of 51 glands were
confirmed negative. The estimated intracluster correlation
is equal to 0.46. Obuchowski's, Donner's, and
Durkalski's test results are chi-square values of 2.86
(p = 0.091), 3.66 (p = 0.056), and 2.32 (p = 0.128),
respectively. The unadjusted McNemar test statistic is
4.5 (p = 0.034).
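The reported p-values can be checked with a short Python sketch. It recomputes the unadjusted McNemar statistic from the discordant totals in Table 3 (b = 1, c = 7) and converts each chi-square value to an upper-tail probability with one degree of freedom; the other three statistics are simply the values quoted above, and the helper name `chi_square_1df_p_value` is hypothetical.

```python
import math


def chi_square_1df_p_value(stat):
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


b, c = 1, 7  # discordant totals from Table 3
mcnemar = (b - c) ** 2 / (b + c)  # unadjusted McNemar statistic

# The remaining statistics are the values reported in the text.
statistics = {"McNemar": mcnemar, "Obuchowski": 2.86,
              "Donner": 3.66, "Durkalski": 2.32}
p_values = {name: chi_square_1df_p_value(s) for name, s in statistics.items()}
for name, s in statistics.items():
    print(f"{name}: chi-square = {s:.2f}, p = {p_values[name]:.3f}")
```

The identity used here is that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its tail probability can be written with the complementary error function.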
All the tests that account for clustering fail to reject
the null hypothesis of no difference between the two
procedures (at the 0.05 level), whereas the McNemar
test (which does not account for the clustering) concludes
that a statistically significant difference does
exist (at the 0.05 level) in the specificity of PET and
SPECT. This is not surprising, because as seen in
Donner and Klar's comparisons, accounting for clustering
means assigning less weight to multiple observations
from the same cluster. Without doing this, these
multiple observations would have more influence than
they perhaps should, and this could easily lead to
pseudo-power, or a false rejection of a true null hypothesis.
CONCLUSIONS
Clustered binary data are prevalent in medical
research. Ophthalmology studies, dental research,
oncology studies, and community research programs
often involve multiple layers, which violate the basic
assumption of independence for common statistical
methods. Methods for analyzing clustered binary data
are available and continue to be developed and
enhanced. All methods account for the within-cluster
correlation; yet, it is the approach to doing so that
makes them different. The appropriate strategy is
dependent on the research question being explored
and the data structure in terms of cluster size and
cluster covariates.
Due to the availability of current statistical software,
the majority of methods discussed in this entry
are simple to implement in practice. Therefore, the
collapsing of the data to a single-level model should
be avoided. Ignoring the cluster effect or collapsing
of data has been shown throughout the literature
to create biased estimates, which can lead to false
statistical conclusions (as was the case in our
examples). If a simple test of the equality of event rates
is being performed and a relatively small difference
in mean cluster size between comparison groups is
present, then the ICC approach is convenient to
implement and can be extended to a number of data
scenarios, including stratified or matched-pair data.
For more complex study designs, such as those that
incorporate covariates or multiple cluster levels (perhaps
classrooms within school districts or designs that
have a combination of clustered and longitudinal
data), hierarchical modeling approaches are available
and continue to be explored.[1,3] It is worthwhile to
mention that during the planning stages of a study that
involves clustered binary data, the statistician needs to
consider inflating the sample size to offset the loss of
information that occurs due to the clustering. Lee
and Dubin offer considerations for computing these
inflation factors in practice.[24]
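As a rough planning illustration, the Python sketch below inflates a sample size computed under independence by the familiar equal-cluster-size design effect, 1 + (m - 1)ρ, where m is the mean cluster size and ρ the anticipated ICC. This formula is an assumption adopted for the sketch and is not taken from Lee and Dubin; the function name and planning numbers are hypothetical.

```python
import math


def inflated_sample_size(n_independent, mean_cluster_size, icc):
    """Scale a sample size computed under independence by 1 + (m - 1) * rho."""
    design_effect = 1 + (mean_cluster_size - 1) * icc
    # Round before taking the ceiling to guard against floating-point fuzz.
    return math.ceil(round(n_independent * design_effect, 9))


# Hypothetical planning numbers: 200 units needed under independence,
# five units per cluster on average, anticipated ICC of 0.25.
print(inflated_sample_size(200, 5, 0.25))
```

With clusters of size one the design effect is 1 and no inflation is applied, which is a quick sanity check on the formula.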
ACKNOWLEDGMENT
The author would like to thank Dr. Vance Berger
for his constructive comments, which have added to
this entry.
REFERENCES
1. Neuhaus, J. Assessing change with longitudinal
and clustered binary data. Annu. Rev. Public
Health 2001, 22, 115–128.
2. Ashby, M.; Neuhaus, J.M.; Hauck, W.W.;
Bacchetti, P.; Heilbron, D.C.; Jewell, N.P.; Segal,
M.R.; Fusaro, R.E. An annotated bibliography of
methods for analyzing correlated categorical data.
Stat. Med. 1992, 11, 67–99.
3. Pendergast, J.F.; Gange, S.J.; Newton, M.A.;
Lindstrom, M.J.; Palta, M.; Fisher, M.R. A
survey of methods for analyzing clustered binary
response data. Int. Stat. Rev. 1996, 64 (1), 89–118.
4. Hujoel, P.P.; Moulton, L.H.; Loesche, W.J.
Estimation of sensitivity and specificity of
site-specific diagnostic tests. J. Periodont. Res. 1990,
25, 193–196.
5. Donner, A. Statistical methodology for paired
cluster designs. Am. J. Epidemiol. 1987, 126 (5),
972–979.
6. Rao, J.N.K.; Scott, A.J. A simple method for the
analysis of clustered binary data. Biometrics 1992,
48, 577–585.
7. Lee, E.W.; Dubin, N. Estimation and sample size
considerations for clustered binary responses.
Stat. Med. 1994, 13, 1241–1252.
Table 3 Hyperparathyroidism data

 $k$   $n_k$   $y_{ik}$   $y_{i'k}$   $a_k$   $b_k$   $c_k$   $d_k$
  1      3        0          2          0       0       2       1
  2      3        2          3          2       0       1       0
  3      3        3          3          3       0       0       0
  4      1        1          1          1       0       0       0
  5      3        2          3          2       0       1       0
  6      4        4          4          4       0       0       0
  7      3        3          3          3       0       0       0
  8      2        2          2          2       0       0       0
  9      2        2          1          1       1       0       0
 10      1        1          1          1       0       0       0
 11      3        2          2          2       0       0       1
 12      2        2          2          2       0       0       0
 13      3        3          3          3       0       0       0
 14      2        2          2          2       0       0       0
 15      2        0          2          0       0       2       0
 16      3        2          2          2       0       0       1
 17      3        2          2          2       0       0       1
 18      3        2          3          2       0       1       0
 19      2        2          2          2       0       0       0
 20      1        1          1          1       0       0       0
 21      2        2          2          2       0       0       0

$K = 21$, $N = \sum_{k=1}^{K} n_k = 51$, $y_{ik} = \sum_{j=1}^{n_k} y_{ijk} = 40$,
$y_{i'k} = \sum_{j=1}^{n_k} y_{i'jk} = 46$, $a = \sum_{k=1}^{K}\sum_{j=1}^{n_k} a_{jk} = 39$,
$b = \sum_{k=1}^{K}\sum_{j=1}^{n_k} b_{jk} = 1$, $c = \sum_{k=1}^{K}\sum_{j=1}^{n_k} c_{jk} = 7$,
$d = \sum_{k=1}^{K}\sum_{j=1}^{n_k} d_{jk} = 4$.
(From Ref. [8].)
8. Obuchowski, N.A. On the comparison of correlated
proportions for clustered data. Stat. Med.
1998, 17, 1495–1507.
9. Durkalski, V.; Palesch, Y.; Lipsitz, S.; Rust, P.
The analysis of clustered matched-pair data. Stat.
Med. 2003, 22, 2417–2428.
10. Donner, A. The analysis of intraclass correlation
in multiple samples. Ann. Hum. Genet. 1985, 49,
75–82.
11. Zeger, S.L.; Liang, K. Longitudinal data analysis
for discrete and continuous outcomes. Biometrics
1986, 42, 121–130.
12. Berger, V.W. Pros and cons of permutation tests
in clinical trials. Stat. Med. 2000, 19, 1319–1328.
13. Donner, A.; Donald, A. The statistical analysis of
multiple binary measurements. J. Clin. Epidemiol.
1988, 41 (9), 899–905.
14. Donner, A.; Banting, D. Adjustment of frequently
used chi-square procedures for the effect of
site-to-site dependencies in the analysis of dental data.
J. Dent. Res. 1989, 68 (9), 1350–1354.
15. Donald, A.; Donner, A. Adjustments to the
Mantel–Haenszel chi-square statistic and odds
ratio variance estimator when the data are
clustered. Stat. Med. 1987, 6, 491–499.
16. Donner, A.; Eliasziw, M. Application of matched
pair procedures to site-specific data in periodontal
research. J. Clin. Periodontol. 1991, 18,
755–759.
17. Jung, S.; Ahn, C.; Donner, A. Evaluation of an
adjusted chi-square statistic as applied to observational
studies involving clustered binary data.
Stat. Med. 2001, 20, 2149–2161.
18. Durkalski, V.; Palesch, Y.; Lipsitz, S.; Rust, P.
The analysis of clustered matched-pair data under
a non-inferiority study design. Stat. Med. 2003,
22, 279–290.
19. Nelder, J.A.; Wedderburn, R.W.M. Generalized
linear models. J. R. Stat. Soc. A 1972, 135,
370–384.
20. SAS Statistical Software V8 or V9, Cary, North
Carolina.
21. Ahn, C. Statistical methods for the estimation of
sensitivity and specificity of site-specific diagnostic
tests. J. Periodont. Res. 1997, 32, 351–354.
22. Prentice, R.L. Correlated binary regression with
covariates specific to each binary observation.
Biometrics 1988, 44, 1033–1048.
23. Donner, A.; Klar, N. Analysis of binary outcomes.
In Cluster Randomization Trials; Arnold,
Hodder Headline Group: London, 2000; 86–95
(Chapter 6).
24. Lee, E.W.; Dubin, N. Estimation and sample size
considerations for clustered binary responses.
Stat. Med. 1994, 13, 1241–1252.