W2 2 Sampling and Generalizability
• Example: a statistical agency gathers information from 2000 rice
farmers to estimate the average yield for farmers in a district.
– Population: all rice farmers in the district, defined as those
growing rice in the previous year
– Sample: 2000 rice farmers
Identifying a Sampling Frame
• Sampling Frame: list of all the units in the
population from which to select the sample
– Ideally a complete list, but it may cover fewer units than the
target population. Example: the target is all rice farmers in HK,
but the frame lists only those who are registered members of a
rice farmers' cooperative.
– It can also contain more units than the target population.
Example: the target is actual voters, so survey all registered
voters and then determine each respondent's likelihood of voting.
– In some situations no sampling frame is available, so one must
be created. Example: bike repair shops in Congo.
Creating a Sampling Frame
• Use an area sample frame: use a grid to
divide an area map into equal sized plots.
• Carry out a listing exercise: survey team
creates a list of sampling units within a
given area.
– Units are numbered and randomly selected.
– Can be time-consuming
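The two approaches above can be sketched together in a few lines. This is only an illustration under assumptions: the grid dimensions, the `plot_id` numbering, and the function names are hypothetical and not part of the original slides.

```python
import random

def build_area_frame(n_rows, n_cols):
    """Divide a map into an n_rows x n_cols grid and number every plot."""
    frame = []
    plot_id = 1
    for row in range(n_rows):
        for col in range(n_cols):
            frame.append({"plot_id": plot_id, "row": row, "col": col})
            plot_id += 1
    return frame

def select_plots(frame, n_plots, seed=None):
    """Randomly select n_plots numbered plots from the frame, without replacement."""
    rng = random.Random(seed)
    return rng.sample(frame, n_plots)

# Hypothetical 4 x 5 grid: 20 equal-sized plots, 3 of them selected.
frame = build_area_frame(4, 5)
selected = select_plots(frame, 3, seed=1)
```

The survey team would then carry out the listing exercise only inside the selected plots, which is what keeps the time cost manageable.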
How Do We Prepare to Sample?
• Sampling units: Units listed at each stage
of a multistage sampling design.
How Do We Prepare to Sample?
Evaluate Generalizability
• Sample generalizability
• Cross-population generalizability
How Do We Prepare to Sample?
Assess the Diversity of the Population
• Sampling is unnecessary if all units in population
are identical.
• Representative sample
– A sample that “looks like” the population from which it
was selected in all respects that are potentially
relevant to the study. The distribution of
characteristics among the elements of a
representative sample is the same as the distribution
of those characteristics among the total population. In
an unrepresentative sample, some characteristics are
overrepresented or underrepresented.
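One rough way to check representativeness on a single characteristic is to compare category shares in the sample with those in the population. The sketch below assumes the population distribution is known; the data and the function names are hypothetical.

```python
from collections import Counter

def proportions(values):
    """Share of each category in a list of category labels."""
    counts = Counter(values)
    return {category: count / len(values) for category, count in counts.items()}

def representation_gap(population, sample):
    """Sample share minus population share for each population category.

    Positive values mean the category is overrepresented in the sample,
    negative values mean it is underrepresented.
    """
    pop_share = proportions(population)
    sample_share = proportions(sample)
    return {category: sample_share.get(category, 0.0) - share
            for category, share in pop_share.items()}

# Hypothetical data: 60% urban in the population, 80% urban in the sample.
population = ["urban"] * 60 + ["rural"] * 40
sample = ["urban"] * 8 + ["rural"] * 2
gap = representation_gap(population, sample)
```

Here urban units are overrepresented by about 20 percentage points and rural units underrepresented by the same amount.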
How Do We Prepare to Sample?
Consider a Census
• Census: research in which information is
obtained through responses from, or
information about, all available members of
an entire population.
• A well-designed sampling strategy can
yield a representative sample of the
same population at far less cost.
What Sampling Method Should We
Use?
• Probability sampling methods: sampling
methods that rely on a random, or chance,
selection method so that the probability of
selection of population elements is known.
• Probability of selection: the likelihood that an
element will be selected from the population for
inclusion in the sample.
– In a census of all the elements of a population, the probability that
any particular element will be selected is 1.0. If half the elements in
the population are sampled on the basis of chance, the probability
of selection for each element is one half or 0.5. As the size of the
sample as a proportion of the population decreases, so does the
probability of selection.
– Random sampling
• Two common problems:
– Incomplete sampling frame
– Nonrespondents
What Sampling Method Should We
Use?
Probability Sampling Methods
• Bias
– Sampling bias occurs when some population
characteristics are over- or underrepresented in
the sample because of particular features of the
method of selecting the sample.
– When the goal is to generalize, probability
samples are more useful than nonprobability
samples.
– Size of the sample and homogeneity of
population affect degree of error.
• That’s why blood testing works—blood is
homogeneous in any one person’s body.
• Trump won 2016 election
What Sampling Method Should We
Use?
Probability Sampling Methods
• Simple Random Sampling
– A method of sampling in which every sample
element is selected purely on the basis of
chance through a random process.
– In a true random sample, probability of
selection is equal for each element.
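A minimal sketch of simple random sampling, assuming the sampling frame is available as a Python list. The frame of 2000 numbered units and the function name are hypothetical; `random.sample` draws without replacement, so every element has the same selection probability n/N.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements from the frame purely by chance, without replacement."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: units numbered 1..2000.
frame = list(range(1, 2001))
sample = simple_random_sample(frame, 100, seed=42)

# Probability of selection is the same for every element: n / N.
p_selection = len(sample) / len(frame)
```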
What Sampling Method Should We
Use?
Probability Sampling Methods
• Systematic Random Sampling
– A method of sampling in which sample elements are
selected from a list or from sequential files, with every
nth element being selected after the first element is
selected randomly.
– Depends on sampling interval
• The number of cases between one sampled case and
another in a systematic random sample.
– Normally yields what is essentially a simple random sample.
– May not be random if the sequence has periodicity.
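The procedure can be sketched as follows, assuming the list is held in memory and the sampling interval is taken as the frame size divided by the sample size, rounded down. The function name and the example frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then every k-th element."""
    k = len(frame) // n              # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)         # random position in the first interval
    return frame[start::k][:n]       # trim in case the frame size is not a multiple of n

# Hypothetical frame: units numbered 1..2000, so the interval is 20.
frame = list(range(1, 2001))
sample = systematic_sample(frame, 100, seed=5)
```

If the list itself repeats with a period related to the interval (for example, every 20th entry is a corner house), this procedure keeps hitting the same kind of unit, which is the periodicity problem noted above.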
What Sampling Method Should We
Use?
Probability Sampling Methods
• Cluster Sampling: Sampling in which elements
are selected in two or more stages, with the first
stage being the random selection of naturally
occurring clusters and the last stage being the
random selection of elements within clusters.
– Process:
• Draw random sample of clusters (schools, counties)
• Draw random sample of elements within clusters (students,
residents).
– Sampling error is greater than in simple random sampling.
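A two-stage version of this process might look like the sketch below. It assumes clusters are stored as a dictionary mapping a cluster name to its list of elements; the school and student names are made up for illustration.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: random sample of clusters. Stage 2: random sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical population: 10 schools with 30 students each.
clusters = {
    f"school_{i}": [f"student_{i}_{j}" for j in range(30)]
    for i in range(10)
}
sample = two_stage_cluster_sample(clusters, n_clusters=3, n_per_cluster=5, seed=7)
```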
Cluster Sampling
• The more clusters you select, with the
fewest individuals in each, the more
representative your sample will be, but
costs will be higher.
What Sampling Method Should We
Use?
Probability Sampling Methods
• Stratified Random Sampling: A method of
sampling in which sample elements are
selected separately from population strata
that the researcher identifies in advance.
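As a sketch, stratified selection amounts to running a separate random draw inside each stratum. The example assumes the strata and the number of units to draw from each are supplied by the researcher; the stratum names and sizes are hypothetical.

```python
import random

def stratified_sample(strata, n_by_stratum, seed=None):
    """Draw a separate random sample from each predefined stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, n_by_stratum[name])
            for name, units in strata.items()}

# Hypothetical strata: 900 rural units and 100 urban units.
strata = {
    "rural": [f"rural_{i}" for i in range(900)],
    "urban": [f"urban_{i}" for i in range(100)],
}
sample = stratified_sample(strata, {"rural": 90, "urban": 10}, seed=3)
```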
What Sampling Method Should We
Use?
Probability Sampling Methods
• Stratified Random Sampling
– Proportionate stratified sampling
– Disproportionate stratified sampling:
Sampling in which elements are selected from
strata in proportions that are purposefully different
from those that appear in the population.
• Commonly used to ensure that cases from smaller
strata are included sufficiently.
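The difference between the two allocations can be illustrated with a short sketch. Equal allocation is used here as one possible disproportionate scheme; it is an assumption for illustration, as are the stratum sizes and function names. Note that rounding in the proportionate case can make the stratum sizes sum to slightly more or less than the requested total.

```python
def proportionate_allocation(stratum_sizes, n_total):
    """Sample each stratum in proportion to its share of the population."""
    population = sum(stratum_sizes.values())
    return {name: round(n_total * size / population)
            for name, size in stratum_sizes.items()}

def equal_allocation(stratum_sizes, n_total):
    """One disproportionate scheme: the same sample size in every stratum."""
    per_stratum = n_total // len(stratum_sizes)
    return {name: per_stratum for name in stratum_sizes}

# Hypothetical population with one large and one small stratum.
stratum_sizes = {"large": 9000, "small": 1000}
proportionate = proportionate_allocation(stratum_sizes, 500)
disproportionate = equal_allocation(stratum_sizes, 500)
```

Under equal allocation the small stratum contributes 250 cases instead of 50, at the price of needing weights to restore the population proportions in any estimate.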
What Sampling Method Should We
Use?
Probability Sampling Methods
• Stratified Random Sampling
– Proportionate stratified sampling
– Disproportionate stratified sampling
• Koreans as a share of the HK population: 13,288/7,400,000 ≈ 0.2%
• A proportionate stratified random sample of 1,000 would include
only about 2 Koreans
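The arithmetic behind the example of Koreans in HK can be checked with a short sketch. The figures are the ones given on the slide; the variable names are illustrative.

```python
koreans_in_hk = 13_288
hk_population = 7_400_000
sample_size = 1_000

# Share of the population in this stratum (about 0.18%, which rounds to 0.2%).
share = koreans_in_hk / hk_population

# Expected number of Koreans in a proportionate sample of 1,000 (about 1.8).
expected_in_sample = sample_size * share
```

With fewer than two expected cases, nothing useful can be said about this group from a proportionate sample, which is the motivation for sampling small strata disproportionately.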
What Sampling Method Should We
Use?
Nonprobability Sampling Methods
• Availability Sampling: Sampling in which
elements are selected on the basis of
convenience.
– Useful in a new setting or in exploratory
studies.
What Sampling Method Should We
Use?
Nonprobability Sampling Methods
• Quota Sampling: A nonprobability sampling
method in which elements are selected to ensure
that the sample represents certain characteristics
in proportion to their prevalence in the population.
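A sketch of how quota filling might work in practice: people are taken in whatever order they become available until each group's quota is full. Nothing here is random, which is what makes it a nonprobability method. The function name, the `sex` field, and the data are hypothetical.

```python
def quota_sample(arrivals, quotas, key):
    """Accept people in arrival order until every group's quota is filled."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person in arrivals:
        group = key(person)
        if group in counts and counts[group] < quotas[group]:
            sample.append(person)
            counts[group] += 1
        if counts == quotas:          # all quotas filled, stop recruiting
            break
    return sample

# Hypothetical stream of available respondents.
arrivals = [
    {"id": 1, "sex": "F"}, {"id": 2, "sex": "F"}, {"id": 3, "sex": "F"},
    {"id": 4, "sex": "M"}, {"id": 5, "sex": "F"}, {"id": 6, "sex": "M"},
    {"id": 7, "sex": "M"},
]
sample = quota_sample(arrivals, {"F": 2, "M": 2}, key=lambda p: p["sex"])
selected_ids = [p["id"] for p in sample]
```

The sample matches the population on the quota characteristic only; it may still be unrepresentative on everything else.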
Sampling Weights
• Numbers used to estimate population
parameters (means, percentages) from
sample statistics, compensating for
“distortions” that may be introduced by
sampling
Sampling Weights
• Calculated as the inverse of the probability
of selection, called “Inverse Probability
Sampling Weights” (IPSW)
• In simple random sampling, where the probability of selection is n/N:
$w = \frac{N}{n}$
• A survey of 100 out of 2000 seniors at a
university:
– N=2000; n=100; w = 20.
– Each senior in the sample represents 20 seniors
in the population
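The seniors example can be reproduced in a couple of lines. The function name is hypothetical; the numbers are those given above.

```python
def srs_weight(population_size, sample_size):
    """Inverse of the selection probability n/N in simple random sampling."""
    return population_size / sample_size

w = srs_weight(2000, 100)        # 20.0: each sampled senior stands for 20 seniors

# Sanity check: the weights of all sampled units sum to the population size.
represented = w * 100
```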
Sampling Weights
• In simple random sampling, the sampling
weight is the same for all units; considered
“self weighted.”
• In stratified or multistage sampling, a sampling
weight is calculated for each stratum or stage.
Single-stage stratified sample
• Single-stage stratified sample. The weight for
observations in stratum $i$ is:
$w_i = \frac{N_i}{n_i}$
• Population: 900,000 rural households; 100,000
urban households
• Sample: 4,000 households divided equally
between urban and rural areas (2,000 each)
– Weight for rural households: 900,000/2,000 = 450
– Weight for urban households: 100,000/2,000 = 50
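A short sketch of the stratum weights for this example. It assumes 2,000 sampled households per stratum, since the 4,000 households are divided equally between rural and urban areas; the variable names are illustrative.

```python
population = {"rural": 900_000, "urban": 100_000}
sample = {"rural": 2_000, "urban": 2_000}   # 4,000 households split equally

# Weight in stratum i is N_i / n_i.
weights = {stratum: population[stratum] / sample[stratum] for stratum in population}

# Sanity check: weighted sample counts add back up to the full population.
represented = sum(weights[stratum] * sample[stratum] for stratum in population)
```

Each sampled rural household stands for 450 households and each sampled urban household for 50, because urban households were sampled at nine times the rural rate.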
Multistage sample
• Three-stage random sample: weight at each stage is
the inverse probability of selection
– Stage 1: 10 out of 50 states
– Stage 2: 5 counties in each of the 10 selected states
– Stage 3: 100 households in each selected county
$w_{c,s} = \frac{50}{10} \times \frac{C_s}{5} \times \frac{H_c}{100}$
• $C_s$: total number of counties in the selected state
• $H_c$: total number of households in the selected county
• Note that the number of terms in the product
equals the number of stages in the sampling.
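The three-stage weight can be sketched as a product of one inverse selection probability per stage. The default arguments are the figures from this example; the county and household totals passed in the call are hypothetical values chosen only to show the arithmetic.

```python
def three_stage_weight(counties_in_state, households_in_county,
                       states_total=50, states_sampled=10,
                       counties_sampled=5, households_sampled=100):
    """Product of the inverse selection probabilities at each of the three stages."""
    stage1 = states_total / states_sampled              # 50 / 10
    stage2 = counties_in_state / counties_sampled       # C_s / 5
    stage3 = households_in_county / households_sampled  # H_c / 100
    return stage1 * stage2 * stage3

# Hypothetical state with 20 counties and county with 5,000 households.
w = three_stage_weight(counties_in_state=20, households_in_county=5000)  # 5 * 4 * 50 = 1000.0
```

Because the county and household totals differ from place to place, households in different counties generally carry different weights.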
Using Sampling Weight
Suppose our variable of interest in a national survey is
household income. We can estimate national income as a
weighted sum of household income across the sample
using the following equation:
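A standard form of this weighted sum is given here as an assumption consistent with the inverse-probability weights defined earlier. The symbols are not defined in the original: $y_i$ denotes the income of sampled household $i$, $w_i$ its sampling weight, and $n$ the number of sampled households.

$$\hat{Y} = \sum_{i=1}^{n} w_i \, y_i$$

Dividing by the sum of the weights gives the corresponding estimate of mean household income, $\hat{\bar{Y}} = \sum_{i} w_i y_i \,/\, \sum_{i} w_i$.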