Introduction to Survey Sampling, Chapter 7
In combination, the sampling methods discussed in the previous chapters are sufficient to
handle most sampling problems. There are, however, three other design features that are
applicable in certain circumstances and deserve some attention: two-phase sampling,
replicated sampling and panel designs. These three designs are reviewed in this chapter.
Two-Phase Sampling
In two-phase, or double, sampling, certain items of information are collected for an initial, or
first-phase, sample, then further items are collected at the second phase from a subsample of
the initial sample. The method may be extended to more phases (multiphase sampling), but for
most purposes two phases suffice.
One use for two-phase sampling arises when the levels of precision needed for different
estimates from a survey are not compatible, implying that different sample sizes would be
appropriate. In this situation the information required to form the estimates needing the larger
sample could be obtained from the first-phase sample, and that required to form the other
estimates could be obtained only from the second-phase sample. Not only does this two-phase
procedure have the potential for saving data collection and processing costs, it also reduces
the burden placed on some respondents. A familiar example of the use of two-phase sampling
in this way is provided by the U.S. Census of Population and Housing. In recent censuses,
basic demographic and other variables have been collected for the total population (the first-
phase sample thus being a complete enumeration), while additional variables have been
collected only for samples of the population.
Other uses of two-phase sampling arise when the sample designer would like to use certain
population data to produce an efficient design, but when the expense of obtaining those data
for all the population would be too great. In such cases, it may sometimes be economical to
collect the data for a large first-phase sample, and then to use them in selecting the second-
phase sample. The first-phase sample may be used in this way to provide stratification
information, size measures for PPS or PPES selection, or clustering for economies of data
collection for the second-phase sample. In assessing the efficiency of a two-phase design, the
costs of conducting the first-phase survey have to be recognized; because of these costs, the
second-phase sample size is necessarily smaller than a single-phase sample. For this reason,
two-phase designs are usually helpful only when the first-phase element survey costs are
smaller than those for the second phase by a large factor. Sufficiently large differences
between first- and second-phase costs can occur when different data collection procedures are
used—perhaps data taken from records, or collected by mail or telephone for the first phase,
and then face-to-face interviews or expensive measurements taken (as in some medical
surveys) at the second phase.
Two-phase sampling is often used for sampling rare populations—that is, subgroups of the
population for which no separate sampling frame exists, such as Vietnam veterans, blacks, and
the recently retired. The design of good, economical, probability samples for rare populations is
one of the most challenging tasks the survey sampler faces (see Kish, 1965: section 11.4). One
technique to consider is a two-phase design in which the first-phase sample identifies the
members of the rare population inexpensively, and the survey items are then collected from
them at the second phase. In essence, the approach involves the use of two-phase stratified
sampling. The members of the first-phase sample are allocated into two (or more) strata
according to whether they are members of the rare population or not. The strata are then
sampled disproportionately. If the first-phase identification of the rare population is error-free,
then the sampling rates may be set at 1 for the stratum of members and at 0 for the stratum of
nonmembers of the rare population. If the identification is subject to error, however, the
sampling rate in the second stratum needs to be nonzero in order to give members of the rare
population falsely allocated to that stratum some chance of being selected. When the first-
phase screening is imperfect it is preferable, where possible, to err in favor of false positives
rather than false negatives, since the former can be handled more easily. Thus, for example, in
a study of severe hearing loss among children, the initial home-based screening used a less
stringent definition of hearing loss with the aim of ensuring that all children with severe hearing
loss were included in the second phase of the study, which involved measurements made
under controlled laboratory conditions.
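As a minimal sketch of this two-phase screening design, the following simulation (all population sizes, error rates, and sampling fractions are hypothetical) takes every first-phase screen-positive into the second phase but keeps a nonzero rate among screen-negatives, so that falsely screened-out members of the rare population retain a chance of selection:

```python
import random

random.seed(42)

# Hypothetical population of 10,000 units, ~3% belonging to the rare group.
population = list(range(10_000))
truly_rare = set(random.sample(population, 300))

def screen(unit):
    # Imperfect, cheap first-phase classifier (hypothetical error rates):
    # 95% of true members screen positive, but so do 10% of nonmembers.
    if unit in truly_rare:
        return random.random() < 0.95
    return random.random() < 0.10

# Phase 1: a large, inexpensive screening sample, stratified by the
# screening result.
phase1 = random.sample(population, 4_000)
positives = {u for u in phase1 if screen(u)}
negatives = [u for u in phase1 if u not in positives]

# Phase 2: disproportionate stratum rates -- rate 1 for screen-positives,
# a small nonzero rate for screen-negatives so that false negatives
# still have some chance of selection (a rate of 0 would exclude them).
rate_neg = 0.05
phase2 = list(positives) + [u for u in negatives if random.random() < rate_neg]

# Estimation would weight each respondent by the inverse of the product
# of its phase-1 and phase-2 selection probabilities.
```

The key design choice, as the text notes, is erring toward false positives: a lenient screen inflates the all-selected stratum, which is cheap to correct, whereas false negatives can only be recovered through the small `rate_neg` sample.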
An illustration of the use of two-phase sampling for clustering comes from a survey of political
attitudes among electors in a European city. An alphabetical list of electors with their addresses
was available as a sampling frame. Since the city was a large one and the survey was to be
conducted by face-to-face interviewing, some clustering of the sample was desired to reduce
interviewers' travel costs. In theory, the electors' addresses could have been used to allocate
the whole of the electorate to clusters, but that would have been prohibitively expensive.
Instead, a sample of electors ten times larger than required was selected, this sample was
allocated to clusters of equal size based on geographical proximity, and then one-tenth of the
clusters were selected to comprise the final sample.
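The same two-step logic can be sketched under simplified assumptions (random coordinates standing in for addresses, and a crude one-dimensional geographic ordering in place of a real clustering of addresses):

```python
import random

random.seed(1)

# Hypothetical electoral register: (elector_id, x, y) coordinates stand
# in for the addresses that would be used in practice.
register = [(i, random.random(), random.random()) for i in range(50_000)]

n_required = 500
oversample = random.sample(register, 10 * n_required)  # first phase: 10x sample

# Allocate the oversample to equal-sized clusters by geographic
# proximity (here simply by sorting on the x coordinate).
cluster_size = 50
ordered = sorted(oversample, key=lambda e: e[1])
clusters = [ordered[i:i + cluster_size]
            for i in range(0, len(ordered), cluster_size)]

# Second phase: select one-tenth of the clusters as the final sample.
chosen = random.sample(clusters, len(clusters) // 10)
final_sample = [e for cluster in chosen for e in cluster]
```

Only the ten-times-larger first-phase sample needs to be geographically coded, which is what makes the design economical relative to clustering the entire register.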
Replicated Sampling
In replicated sampling, the total sample is made up of a number of subsamples, each of the identical sample design. Replicated sampling may be used for
studying variable nonsampling errors, such as the variability in the results obtained by different
interviewers and coders, and for facilitating the calculation of standard errors. The essential
feature for either use is that each subsample provides independent comparable estimates of
the population parameters.
As a simple example of the use of replicated sampling for studying variable interviewer effects,
suppose that an SRS of 1000 is required, with a team of 20 interviewers conducting the
fieldwork. With an unreplicated design, the sample of 1000 would be selected and allocated
between interviewers on grounds of general convenience and geographical proximity, perhaps
with interviews in the more difficult areas being assigned to the best interviewers. When one
interviewer failed to secure a response, the interview might be reissued to a more experienced
interviewer. As a result of this nonrandom assignment of interviewers, differences in
interviewers' results may arise from interviewer effects, from differences in the subsamples they
interview, or from both; these two sources of differences are confounded and cannot be
disentangled. For a simple replicated design, the sample of 1000 could be selected as 20
independent SRSs, each of size 50, with each interviewer then being responsible for obtaining
the 50 interviews in one of the replicates. Since the replicates are comparable samples, any
differences in the subsample results beyond those likely to arise from sampling fluctuations
can be attributed to systematic differences between interviewers in the responses they obtain.
The approach employed for distinguishing between sampling fluctuations and real differences
is that used in a standard one-way analysis of variance (see, for instance, Iversen and Norpoth,
1976); however, the calculations are different when the replicates employ complex sample
designs.
To describe the calculations of interviewer variance, let ȳ₁, ȳ₂, …, ȳc denote the means obtained from the c subsamples allocated to the different interviewers. The variance of these c means may be estimated by

v₁ = Σ(ȳγ − ȳ)²/(c − 1)

where ȳ = Σȳγ/c is the mean of the sample means. This estimator makes no assumption about the presence or absence of systematic interviewer effects; when they are present, the estimator will be expected to be larger than when they are absent. Under the null hypothesis of no interviewer effects, SRS theory can be used to provide another estimator of the variance of the ȳγ's: ignoring the fpc term, v(ȳγ) = s²γ/r from formula 3, where s²γ is the estimated element variance in the γth subsample and r = n/c is the subsample size. An average of the estimates of v(ȳγ) across the c subsamples is given by v₂ = s̄²/r, where s̄² = Σs²γ/c is the average of the within-subsample element variance estimates. Comparison of v₁ and v₂ then provides a test of the null hypothesis. This comparison is made by taking the ratio F = v₁/v₂ = rv₁/s̄², with a large value of F indicating the presence of interviewer variance. The significance test for F greater than 1 is obtained using a standard F-test with (c − 1) and c(r − 1) = (n − c) degrees of freedom. A useful index of interviewer variance is the intraclass correlation coefficient ρ, measuring the proportion of the total variance in the y-values that is accounted for by interviewer variance. The value of ρ may be estimated by (F − 1)/(F − 1 + r). See Kish (1962) for an example.
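These calculations can be sketched in a few lines. The data below are simulated with a deliberately injected interviewer effect; all numbers are illustrative only:

```python
import random
import statistics

random.seed(0)

c, r = 20, 50  # interviewers (replicates) and interviews per interviewer

# Simulate responses: each interviewer's replicate gets a hypothetical
# systematic shift, so genuine between-interviewer differences exist.
data = []
for _ in range(c):
    effect = random.gauss(0, 0.5)
    data.append([random.gauss(10 + effect, 2) for _ in range(r)])

means = [statistics.mean(rep) for rep in data]
grand = statistics.mean(means)

# v1: variance of the c subsample means (no assumption about effects).
v1 = sum((m - grand) ** 2 for m in means) / (c - 1)

# v2: SRS estimate of the same variance under the null hypothesis of
# no interviewer effects (average within-subsample element variance / r).
s2_bar = statistics.mean(statistics.variance(rep) for rep in data)
v2 = s2_bar / r

F = v1 / v2                      # large F suggests interviewer variance
rho = (F - 1) / (F - 1 + r)      # intraclass correlation estimate
deff = 1 + (r - 1) * rho         # multiplier on the SRS variance
```

The final line anticipates the design-effect interpretation discussed next: even a small ρ, multiplied through a workload of r = 50 interviews per interviewer, can inflate the variance of the overall mean substantially.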
The consequences of variation among interviewers are similar to those of clustering; each
interviewer's assignment is in effect a separate cluster. Thus, equivalent to the design effect for
a cluster sample, the effect of interviewer variance for the replicated design described above is
to multiply the SRS variance of the overall sample mean by [1 + (r − 1)ρ]. As with clustering,
even a small value of ρ leads to a sizable multiplier if r—the number of interviews conducted by
each interviewer—is large. The usual estimator of the variance of the overall mean from SRS
theory (formula 3) does not allow for the effects of clustering or of interviewer variance. An
attraction of the replicated sampling variance estimator based on the variation between
subsamples is that it automatically encompasses the clustering effect of interviewer variance.
As shown below, this variance estimator is in fact v₁/c, which is simply the standard cluster sampling variance estimator, sₐ²/a, from formula 15 in a different guise.
The cost of using replicated sampling to study systematic interviewer effects, or interviewer
variance, comes from the need to give interviewers randomly chosen, rather than the most
efficient, assignments. The ease of conducting interviewer variance studies depends on the
general survey conditions; they are, for instance, much more readily incorporated into
telephone than face-to-face surveys, and in face-to-face surveys they are simpler to conduct
with small, compact populations. A completely random assignment of interviews across a
multistage sample of a widely dispersed population would clearly create excessive interviewer
travel costs, but completely random assignments are not required. Some form of restricted
replication—for instance, random interviewer assignments within PSUs or strata—can still
permit the estimation of interviewer variance.
The second use of replicated sampling, to provide simple variance estimates, employs much
the same reasoning as above. Given c estimates z₁, z₂, …, zc of a parameter Z, obtained from independent replicates of the same design, the variance of the mean of the estimates z̄ = Σzγ/c is given by V(z̄) = V(zγ)/c. Thus

v(z̄) = Σ(zγ − z̄)²/c(c − 1)     (formula 24)

provides a general formula for estimating variances from replicated designs. It can be applied to
any form of statistic (such as index numbers, correlation and regression coefficients, as well as
simple means and percentages), and the subsample design can be of any complex form (such
as a stratified multistage PPS design).
A small problem with the use of formula 24 is that it gives the variance of the average of the replicate values, z̄. This average value is not in general the same as the estimator z obtained by pooling the subsamples into one large sample, and the pooled estimator z is as a rule the preferred one. In practice, however, the difference between z̄ and z is usually slight. A commonly adopted procedure is to compute z and use v(z̄) from formula 24, or a slight variant of it, to provide a
variance estimate for z.
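Formula 24 amounts to only a few lines of code. The replicate estimates below are invented for illustration; they could be means, ratios, or regression coefficients from c independent replicates of any design:

```python
# Ten replicate estimates of the same parameter (illustrative values).
z = [4.2, 3.9, 4.5, 4.1, 4.4, 3.8, 4.0, 4.3, 4.6, 3.7]
c = len(z)

z_bar = sum(z) / c
# Formula 24: estimated variance of the mean of the replicate estimates.
v_zbar = sum((zg - z_bar) ** 2 for zg in z) / (c * (c - 1))
se = v_zbar ** 0.5
```

Note that nothing in the computation depends on how each replicate estimate was produced, which is precisely the generality the text claims for the method.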
A more serious problem centers on the choice of c, the number of replicates to be employed. If a small value of c is chosen, the replicated variance estimator v(z̄) will be imprecise, and this imprecision will affect the width of the confidence interval for the parameter being estimated. With c replicates, v(z̄) has (c − 1) degrees of freedom; hence, in forming confidence intervals, the t distribution with (c − 1) degrees of freedom should be employed. As an illustration of this effect, consider an SRS of 1000 elements. The 95% confidence interval for ȳ obtained from the conventional approach is ȳ ± 1.96s/√n, where 1.96 is taken from a table for the normal distribution. With a replicated design with 10 subsamples of 100 each, the 95% confidence interval is z̄ ± 2.26√v(z̄), where 2.26 is taken from a table for the t distribution with 9 degrees of freedom. With a replicated design with 4 subsamples of 250 each, the 95% confidence interval is z̄ ± 3.18√v(z̄), where 3.18 comes from a table for the t distribution with 3 degrees of freedom. Since in each case the variance estimator is unbiased for the true variance, the replicated variance estimator with 10 replicates leads to a confidence interval that is on average 15% larger, and that with 4 replicates to a confidence interval that is on average 62% larger, than that based on the conventional variance estimator. To obtain a reasonably precise variance
estimator, a relatively large value of c is needed, perhaps around 20 to 30 or more. On the
other hand, the greater the value of c, the less stratification that can be employed. This
situation occurs because each subsample must take at least one selection from each stratum.
The restriction on stratification is especially harmful with multistage designs. As an illustration,
suppose that 60 PSUs are selected. With a conventional design, the PSUs would probably be
divided into 60 strata with one selection per stratum, or perhaps 30 strata with two selections
per stratum. With a replicated design with 10 replicates, the maximum number of strata is
reduced to only 6.
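The widening of the confidence interval can be checked directly from the critical values (hardcoded below from standard normal and t tables):

```python
# Two-sided 95% critical values.
z_crit = 1.96    # standard normal (conventional approach)
t9_crit = 2.262  # t with 9 df  -- replicated design, 10 subsamples
t3_crit = 3.182  # t with 3 df  -- replicated design, 4 subsamples

# Proportional widening of the interval relative to the normal-based one.
widen_10 = t9_crit / z_crit - 1  # about 15% wider with 10 replicates
widen_4 = t3_crit / z_crit - 1   # about 62% wider with 4 replicates
```

This makes the trade-off concrete: below roughly 20 to 30 replicates the t-multiplier penalty is substantial, while above that the loss of stratification begins to dominate.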
In summary, the benefit of a simple and general variance estimator with replicated sampling is
bought at the cost of some loss of precision: Either c is small and the precision of the variance
estimator suffers from limited degrees of freedom, or c is large and the precision of the survey
estimator itself suffers through the loss of stratification. For these reasons, simple replicated
sampling as described is not greatly used in practice. Instead, pseudoreplication techniques
have been developed to enable stratification to be employed to the point of the paired selection
design, yet also to give variance estimates of reasonable precision. These techniques are
described in Chapter 10.
Panel Designs
It has been implicitly assumed thus far that the samples are being designed for cross-sectional
surveys with one round of data collection. There are, however, many survey objectives that
require data to be collected at two or more points of time. While the preceding sample designs
remain applicable, some additional sampling considerations arise when the time dimension is
included.
One purpose of several rounds of data collection is to measure changes over time. An
important distinction needs to be made here between gross and net changes, the former
referring to changes at the element level and the latter to changes in the aggregate. If
measures of individual changes are needed—as for example in a study to examine in detail the
effects of changing leisure activities on blood pressures—then the data must be collected for
the same sampled elements on each round. If only net changes are required—as perhaps in a
study to chart the changes in popularity of a political leader—then the data do not have to be
collected from the same elements. Even with net changes, however, it may be more efficient to
retain the same sample.
Another purpose of conducting surveys at several points of time is to collect information when it
is readily accessible and can be reported accurately. Thus, for instance, in a survey requiring a
detailed accounting of annual household incomes, several interviews may be taken during the
course of the year in order to collect information while it is fresh in the respondents' minds.
Again, a study investigating the association between children's preschool upbringing and
school performance would almost certainly need to collect data on preschool upbringing as it
takes place and then later collect data on school performance. It would be unsafe to rely on
retrospective reports of preschool training, because it would be imperfectly remembered and
because the memory may be distorted by the results of school performance.
A panel or longitudinal survey, in which data are collected from the same sampled elements on
more than one occasion, raises some further issues in addition to those applying with a cross-
sectional survey. One issue is the mobility of the sampled elements. In most panel surveys,
some of the elements—often persons or households—will move during the life of the panel.
These movers need to be retained in the panel in order to keep intact the probability sample
selected at the start, and this requires the development of effective tracing methods. Since
some movers will leave the sampled PSUs of a multistage design, mobility will cause an
increase in data collection costs for later rounds of a survey employing face-to-face
interviewing.
A second issue that needs to be faced with panel surveys is that populations change over time;
some elements in the original population leave while others enter. Consider, for example, a
long-term panel survey of the health of a particular community. At the start, a probability
sample of members of the community is drawn, and they are followed for several years. During
this time the community's population will change: Some of the original members will leave,
through death or because they move out of the community, while new members will enter—
births and movers into the community. The main problem created by leavers is the reduction in
sample size; the panel remains a probability sample of that part of the original population that
still lives in the community. The problem with entrants, on the other hand, is that they are not
represented in the sample. In consequence, the sample is not a probability sample of all the
community's population as it exists at later rounds of data collection. When a population has a
significant proportion of new entrants and when cross-sectional results are needed for the
population present at a later round, a supplement sample of entrants is needed. An added
complexity occurs when the element of analysis is a grouping such as a household or family. A
sizable proportion of households or families is likely to change composition over even such a
short period as a year, creating severe conceptual and practical problems in a panel survey.
Another concern with panel surveys is that repeated interviewing may have adverse effects on
respondents. Some may object to the burden and refuse to remain in the panel, thus causing a
bias in the panel membership (see Chapter 9 on nonresponse). Others may be influenced by
their panel membership with regard to the survey's subject matter so that they give untypical
responses. This panel conditioning effect can, for instance, occur in consumer panels in which
respondents are asked to report their household purchases on a regular basis. The act of
reporting can make respondents more price conscious; hence they may alter their patterns of
purchases. A related risk in a panel study asking for the same information repeatedly is that
respondents may remember their previous responses and attempt to give consistent answers.
A widely used method to alleviate some of the problems of panels is to limit the length of panel
membership by using some form of panel rotation. As a simple example, each member of the
panel might be retained for three rounds of the survey. For each round, one-third of the sample
from the previous round would be dropped, and a new one-third would be added: the new third
would be included also in the following two rounds. Thus, using letters to represent the three
parts of the sample, the first round sample is, say ABC, the second round BCD, the third round
CDE, the fourth round DEF, and so on. In this way there is a two-thirds overlap in sample
membership between adjacent rounds and a one-third overlap between rounds that are one
round apart.
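The rotation scheme can be generated mechanically. A small sketch, using letters for the sample thirds as in the text:

```python
from string import ascii_uppercase

def rotation(round_no, panels_per_round=3):
    """Rotating panel membership for a given round: each third serves
    three consecutive rounds and one third is replaced per round, so
    round 1 is ABC, round 2 is BCD, round 3 is CDE, and so on."""
    return [ascii_uppercase[round_no - 1 + k] for k in range(panels_per_round)]

r1, r2, r3 = rotation(1), rotation(2), rotation(3)
# Adjacent rounds share two-thirds of their membership; rounds two
# apart share one-third.
```
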
As observed earlier, a panel design may be useful but is not essential for estimating net change. Consider the simple estimator ȳ₂ − ȳ₁ of the change in mean level of variable y between times 1 and 2. The variance of this difference is given in general by

V(ȳ₂ − ȳ₁) = V(ȳ₂) + V(ȳ₁) − 2ρ√[V(ȳ₁)V(ȳ₂)]     (formula 25)

where ρ is the product-moment correlation coefficient between the sample means ȳ₁ and ȳ₂. With independent samples on the two rounds of the survey, ρ = 0. With overlapping samples ρ is not 0; it is generally positive, but on occasion it can be negative. The last term in formula 25 reflects the gains (ρ positive) or losses (ρ negative) in the precision of the estimator of change through using a panel design.
To obtain further insight into the effect of sample overlap on the measurement of change,
consider the simple case of a static population and simple random sampling with a sample of size n on each occasion; furthermore, assume—as is often a reasonable approximation—that the element variances on the two occasions are equal (i.e., S₁² = S₂² = S²), and ignore the fpc term. Then, with a partial overlap of a proportion P in the two samples, the general formula 25 reduces to

V(ȳ₂ − ȳ₁) = (2S²/n)(1 − PR)     (formula 26)

where R is the correlation between the elements' y values on the two occasions. The situations of independent samples and of complete overlap are special cases of this formula, the first with P = 0 and the second with P = 1. When P = 0, the variance of the difference is simply 2S²/n, so that the ratio of the variance of the difference with a panel design to that with two independent samples is (1 − PR). As an illustration, suppose that the correlation in individuals' political attitudes (or perhaps blood pressures) for the two occasions is 0.75. Then the completely overlapping panel would reduce the variance of ȳ₂ − ȳ₁ by a multiplying factor of (1 − 0.75) = 0.25. A partial overlap of two-thirds (rotating out one-third) would reduce the variance by a factor of [1 − (0.75 × 2/3)] = 0.50. If the correlations across time are high, the gains of the panel
design in measuring change are thus considerable. In the case of a rotating panel design,
further gains can be achieved by using a more complex estimator of the change (see Kish,
1965: 463–464). Note, however, that if R is negative—as would occur if the y variable were the
purchase of a consumer durable in the last month—then the panel design leads to a loss of
precision in measuring change. If, say, R = -0.2, and with complete overlap P = 1, then (1 − PR)
= 1.2, so that the variance of the change is 20% larger with the panel design than with two
independent samples.
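The variance ratio (1 − PR) from formula 26 makes these comparisons a one-line computation:

```python
def change_var_ratio(P, R):
    """Ratio of Var(ybar2 - ybar1) with overlap proportion P to the
    two-independent-samples variance 2*S^2/n (formula 26: SRS on each
    occasion, equal element variances, fpc ignored)."""
    return 1 - P * R

full_panel = change_var_ratio(1.0, 0.75)     # 0.25: variance cut to a quarter
two_thirds = change_var_ratio(2 / 3, 0.75)   # 0.50: partial overlap, halved
negative_r = change_var_ratio(1.0, -0.2)     # 1.20: panel loses precision
```
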
Finally, we should note that the gains from positive correlations in nonindependent samples
shown in formula 25 are not confined to the situation where the same elements are kept in a
panel. Designs that retain the same clusters but select different elements can also be helpful
for measuring changes, although the degree of correlation will generally be less than that
occurring when the same elements are retained. A useful design for avoiding the need to follow
movers is to sample dwellings rather than households; a household moving out of a sampled
dwelling is then replaced in the panel by the incoming household.
http://dx.doi.org/10.4135/9781412984683.n7