W2 2 Sampling and Generalizability

Uploaded by

wnpndtl

Lecture 3

Sampling and Generalizability


Key Terms
Population
In statistics, we generally want to study a population.
You can think of a population as a collection of
persons, things, or objects under study.
Key Terms
Sample
To study the population, we select a sample. The idea
of sampling is to select a portion (or subset) of the
larger population and study that portion (the sample) to
gain information about the population.
Key Terms
Parameter
A parameter is a number that is a property of the
population. Because it takes a lot of time and money to
examine an entire population, sampling is a very
practical technique.
Key Terms
Statistic
A statistic is a number that represents a property of the
sample.
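The difference between a parameter and a statistic can be illustrated with a short sketch. The population below is simulated purely for illustration (the yield figures, the seed, and the sample size of 200 are assumptions, not data from the lecture): the population mean is the parameter, and the mean of a random sample is the statistic that estimates it.

```python
import random
import statistics

# Hypothetical population: yields (tonnes/ha) for 2,000 farms,
# simulated for illustration only.
rng = random.Random(42)
population = [rng.gauss(4.0, 0.8) for _ in range(2000)]

# Parameter: a number that is a property of the whole population.
population_mean = statistics.mean(population)

# Statistic: the same quantity computed on a sample drawn by chance.
sample = rng.sample(population, 200)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.3f}")
print(f"statistic (sample mean):     {sample_mean:.3f}")
```

In practice the parameter is usually unknown, which is why the statistic is computed in the first place.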
Table 2.2 Sample of Surveys
• Survey: A polling firm collects information from 1500 likely voters to understand their political views
– Population: All likely voters in the country, defined as those that voted in at least two of the last three elections
– Sample: 1500 likely voters
• Survey: A statistical agency gathers information from 2000 rice farmers to estimate the average yield for farmers in a district
– Population: All rice farmers in the district, defined as those growing rice in the previous year
– Sample: 2000 rice farmers
• Survey: A university carries out a survey of 200 students to explore options for reducing the numbers of students who transfer out
– Population: All full-time undergraduate students at the university in a year
– Sample: 200 students
• Survey: A state government agency carries out a survey of 500 small businesses in a state
– Population: All businesses in the state that have 10 or fewer full time workers
– Sample: 5000 small businesses
Identifying a Sampling Frame
• Sampling Frame: list of all the units in the
population from which to select the sample
– Ideally a complete list, but may cover fewer units than the
target population. Example: the target is all rice farmers in
HK, but the frame lists only those who are registered
members of a cooperative of rice farmers
– Can also contain more units than the target population.
Example: the target is actual voters, so survey all
registered voters and then determine likelihood of voting
– In some situations, no sampling frame is available,
and so it must be created. Example: bike repair shops
in Congo
Creating a Sampling Frame
• Use an area sample frame: use a grid to
divide an area map into equal sized plots.
• Carry out a listing exercise: survey team
creates a list of sampling units within a
given area.
– Units are numbered and randomly selected.
– Can be time-consuming

How Do We Prepare to Sample?
• Sampling units: Units listed at each stage
of a multistage sampling design.

How Do We Prepare to Sample?
Evaluate Generalizability
• Sample generalizability
• Cross-population generalizability

How Do We Prepare to Sample?
Assess the Diversity of the Population
• Sampling is unnecessary if all units in population
are identical.
• Representative sample
– A sample that “looks like” the population from which it
was selected in all respects that are potentially
relevant to the study. The distribution of
characteristics among the elements of a
representative sample is the same as the distribution
of those characteristics among the total population. In
an unrepresentative sample, some characteristics are
overrepresented or underrepresented.
How Do We Prepare to Sample?
Consider a Census
• Research in which information is obtained
through responses from or information
about all available members of an entire
population.
• A well-designed sampling strategy can
result in a representative sample of the
same population at far less cost.
What Sampling Method Should We Use?
• Probability sampling methods: sampling
methods that rely on a random, or chance,
selection method so that the probability of
selection of population elements is known.

• Probability of selection: the likelihood that an
element will be selected from the population for
inclusion in the sample.
– In a census of all the elements of a population, the probability that
any particular element will be selected is 1.0. If half the elements in
the population are sampled on the basis of chance, the probability
of selection for each element is one half or 0.5. As the size of the
sample as a proportion of the population decreases, so does the
probability of selection.

– Random sampling
• Two common problems:
– Incomplete sampling frame
– Nonrespondents

• Nonprobability sampling methods
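The probability of selection described above can be sketched as a one-line calculation. This is a minimal sketch that assumes every element is equally likely to be chosen; the function name and the example sizes are illustrative, not part of the lecture.

```python
def selection_probability(sample_size: int, population_size: int) -> float:
    """Probability that any one element is selected when sample_size of
    population_size elements are drawn by chance, each equally likely."""
    if not 0 < sample_size <= population_size:
        raise ValueError("need 0 < sample_size <= population_size")
    return sample_size / population_size

# A census, a half sample, and a small sample of a 1,000-element population
census = selection_probability(1000, 1000)
half = selection_probability(500, 1000)
small = selection_probability(50, 1000)

print(census, half, small)
```

As the sample shrinks relative to the population, the returned probability shrinks with it.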

What Sampling Method Should We Use?
Probability Sampling Methods
• Bias
– Sampling bias occurs when some population
characteristics are over- or underrepresented in
the sample because of particular features of the
method of selecting the sample.
– When the goal is to generalize, probability
samples are more useful than nonprobability
samples.
– Size of the sample and homogeneity of
population affect degree of error.
• That’s why blood testing works—blood is
homogeneous in any one person’s body.

• Trump won 2016 election

What Sampling Method Should We Use?
Probability Sampling Methods
• Simple Random Sampling
– A method of sampling in which every sample
element is selected purely on the basis of
chance through a random process.
– In a true random sample, probability of
selection is equal for each element.
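A simple random sample can be drawn with Python's standard library. The sketch below assumes a complete sampling frame is available as a list; the frame of 500 labelled units and the seed are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the frame purely by chance.
    Every element has the same probability of selection, n / len(frame)."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame of 500 units
frame = [f"unit-{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame, 50, seed=1)

print(sample[:5])
```

Fixing the seed only makes the draw reproducible; it does not change the equal-probability property.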

What Sampling Method Should We Use?
Probability Sampling Methods
• Systematic Random Sampling
– A method of sampling in which sample elements are
selected from a list or from sequential files, with every
nth element being selected after the first element is
selected randomly.
– Depends on sampling interval
• The number of cases between one sampled case and
another in a systematic random sample.
– Normally yields what is essentially a simple random sample.
– May not be random if sequence has periodicity.
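The procedure above can be sketched as follows. This is one common way to implement it, assuming the frame is an ordered list and the sampling interval is the frame length divided by the sample size (rounded down); the frame of 1,000 numbered cases is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then take
    every k-th element, where k is the sampling interval."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds frame size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random position of the first selected case
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame: cases numbered 1..1000, sample of 100 (interval 10)
frame = list(range(1, 1001))
sample = systematic_sample(frame, 100, seed=7)

print(sample[:5])
```

If the list has a cycle whose length matches the interval (for example, every 10th case is a corner house), the same kind of case is picked every time, which is the periodicity problem noted above.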
What Sampling Method Should We Use?
Probability Sampling Methods
• Cluster Sampling: Sampling in which elements
are selected in two or more stages, with the first
stage being the random selection of naturally
occurring clusters and the last stage being the
random selection of elements within clusters.

– Process:
• Draw random sample of clusters (schools, counties)
• Draw random sample of elements within clusters (students,
residents).
– Sampling error is greater than in a simple random sample.
Cluster Sampling
• The more clusters you select, with the
fewest individuals in each, the more
representative your sample will be,
but the higher the costs will be
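A two-stage cluster sample can be sketched as below, assuming clusters are stored as a dictionary mapping a cluster name to its list of elements. The 20 schools with 30 students each are invented for illustration.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters.
    Stage 2: randomly select elements within each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: 20 schools (clusters) with 30 students each
schools = {
    f"school-{s:02d}": [f"s{s:02d}-student-{i:02d}" for i in range(30)]
    for s in range(20)
}
sample = two_stage_cluster_sample(schools, n_clusters=4, n_per_cluster=5, seed=3)

for school, students in sample.items():
    print(school, students)
```

Raising `n_clusters` while lowering `n_per_cluster` spreads the same total sample over more clusters, which is the trade-off between representativeness and cost described above.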

What Sampling Method Should We Use?
Probability Sampling Methods
• Stratified Random Sampling: A method of
sampling in which sample elements are
selected separately from population strata
that the researcher identifies in advance.

– Ensures that various groups within the sampling
frame will be included.
– Process:
• Distinguish all elements in the population according
to the characteristic that defines the strata
• Determine size of each stratum
HK Population
• Question: attitudes towards the new visa
scheme in HK
• Simple Random Sample: very few, if any,
Koreans in your sample.
• Stratified Random Sample:
– All residents in HK are distinguished according to
their ethnic origin -- the sampling strata.
– Residents are sampled randomly from within
these strata: Chinese, Koreans, Germans, etc.
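Sampling separately within each stratum can be sketched as follows. The stratum sizes and the per-stratum sample sizes below are invented for illustration and are not HK population figures; the point is only that each stratum is guaranteed to appear in the sample.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a separate simple random sample from each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, n_per_stratum[name])
        for name, units in strata.items()
    }

# Hypothetical sampling frame, already divided by ethnic origin
strata = {
    "Chinese": [f"chinese-{i}" for i in range(9000)],
    "Koreans": [f"korean-{i}" for i in range(20)],
    "Germans": [f"german-{i}" for i in range(30)],
}
sample = stratified_sample(
    strata, {"Chinese": 90, "Koreans": 10, "Germans": 10}, seed=5
)

print({name: len(units) for name, units in sample.items()})
```

Because the small strata are sampled at a higher rate than the large one here, this particular allocation is disproportionate and would need sampling weights at the analysis stage.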

What Sampling Method Should We Use?
Probability Sampling Methods
• Stratified Random Sampling
– Proportionate stratified sampling
– Disproportionate stratified sampling:
Sampling in which elements are selected from
strata in proportions that are purposefully different
from those that appear in the population.
• Commonly used to ensure that cases from smaller
strata are included sufficiently.

What Sampling Method Should We Use?
Probability Sampling Methods
• Stratified Random Sampling
– Proportionate stratified sampling
– Disproportionate stratified sampling
• % Koreans in HK: 13,288/7,400,000 ≈ 0.2%
• A proportionate stratified random sample of 1,000
would include only about 2 Koreans
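The arithmetic behind this example can be checked with a short sketch of proportionate allocation. The two population figures come from the slide; the function name and the "Other residents" stratum (everyone who is not Korean) are assumptions made to complete the example.

```python
def proportionate_allocation(stratum_sizes, sample_size):
    """Expected number of sampled elements per stratum when every stratum
    is sampled at the same rate (sample_size / population size)."""
    population_size = sum(stratum_sizes.values())
    return {
        name: sample_size * size / population_size
        for name, size in stratum_sizes.items()
    }

# 13,288 Koreans out of 7,400,000 residents (figures from the slide);
# the "Other residents" label is an assumption
hk_strata = {"Koreans": 13_288, "Other residents": 7_400_000 - 13_288}
expected = proportionate_allocation(hk_strata, 1_000)

print(round(expected["Koreans"], 1))
```

With roughly two Koreans expected in a sample of 1,000, a proportionate design leaves too few cases from the small stratum, which is the usual motivation for sampling it at a higher rate.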

What Sampling Method Should We Use?
Nonprobability Sampling Methods
• Availability Sampling: Sampling in which
elements are selected on the basis of
convenience.
– Useful in a new setting or in exploratory
studies.

What Sampling Method Should We Use?
Nonprobability Sampling Methods
• Quota Sampling: A nonprobability sampling
method in which elements are selected to ensure
that the sample represents certain characteristics
in proportion to their prevalence in the population.

– May be representative on quota characteristics but in
no other way.
– Must know relevant characteristics of entire
population.
– If you can’t draw a random sample, it’s better to use
quotas than no quotas.
What Sampling Method Should We Use?
Nonprobability Sampling Methods
• Purposive Sampling: A nonprobability sampling
method in which elements are selected for a
purpose, usually because of their unique position.
– A purposive sample may be a key informant survey,
which targets individuals who are particularly
knowledgeable about the issues under investigation.
– Informants should be
• Knowledgeable
• Willing to talk
• Representative
What Sampling Method Should We Use?
Nonprobability Sampling Methods
• Snowball Sampling: A method of sampling
in which sample elements are selected as
successive informants or interviewees
identify them.
– Used for hard-to-reach or hard-to-identify
interconnected populations.
– Normally cannot be confident that sample
represents total population of interest

Sampling Weights
• Numbers used to estimate population
parameters (means, percentages) from
sample statistics, compensating for
“distortions” that may be introduced by
sampling

Sampling Weights
• Calculated as the inverse of the probability
of selection, called “Inverse Probability
Sampling Weights” (IPSW)
• In simple random sampling, where the probability of
selection is n/N:
$$w = \frac{N}{n}$$

Sampling Weights
• In simple random sampling, where the probability of
selection is n/N:
$$w = \frac{N}{n}$$
• A survey of 100 out of 2000 seniors at a
university:
– N=2000; n=100; w = 20.
– Each senior in the sample represents 20 seniors
in the population
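The seniors example can be reproduced in a few lines. The function name is illustrative; the numbers are those given on the slide.

```python
def srs_weight(population_size: int, sample_size: int) -> float:
    """Sampling weight under simple random sampling: the inverse of the
    selection probability n/N, that is, N/n."""
    return population_size / sample_size

# 100 seniors sampled out of 2000
w = srs_weight(2000, 100)

# Each sampled senior stands for w seniors, so the weights of the whole
# sample add up to the population size.
represented = w * 100

print(w, represented)
```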

Sampling Weights
• In simple random sampling, the sampling
weight is the same for all units; the sample is
considered “self-weighted.”
• In stratified/multi-stage sampling, a sampling
weight is calculated for each stratum/stage.

Single-stage stratified sample
• Single-stage stratified sample. The weight for
observations in stratum i is:
$$w_i = \frac{N_i}{n_i}$$
• Population: 900,000 rural households; 100,000
urban households
• Sample: 4,000 households divided equally
between urban and rural areas (2,000 in each)
– Weight for rural households: 900,000/2,000 = 450
– Weight for urban households: 100,000/2,000 = 50
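The stratum weights can be computed directly from the population and sample sizes. In this sketch the 4,000 sampled households are split equally, so each stratum contributes 2,000; the function and variable names are illustrative.

```python
def stratum_weights(population_sizes, sample_sizes):
    """Weight for stratum i is N_i / n_i, the inverse of the
    probability of selection within that stratum."""
    return {
        name: population_sizes[name] / sample_sizes[name]
        for name in population_sizes
    }

population_sizes = {"rural": 900_000, "urban": 100_000}
sample_sizes = {"rural": 2_000, "urban": 2_000}  # 4,000 split equally

weights = stratum_weights(population_sizes, sample_sizes)

# Sanity check: weighted sample counts recover the population total
total_represented = sum(weights[s] * sample_sizes[s] for s in weights)

print(weights, total_represented)
```

Each sampled rural household represents 450 households and each sampled urban household represents 50, so unweighted averages would overstate the urban share.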

Multistage sample
• Three-stage random sample: weight at each stage is
the inverse probability of selection
– Stage 1: 10 out of 50 states
– Stage 2: 5 counties in each of the 10 selected states
– Stage 3: 100 households in each selected county
$$w_{c,s} = \frac{50}{10} \times \frac{C_s}{5} \times \frac{H_c}{100}$$
• C_s: total number of counties in the selected state
• H_c: total number of households in the selected county
• The number of terms in the product equals the
number of stages in the sampling.
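The three-stage weight is the product of one inverse selection probability per stage. In the sketch below, the stage sizes (10 of 50 states, 5 counties, 100 households) follow the slide, while the county and household totals passed in the example call are hypothetical.

```python
def three_stage_weight(
    counties_in_state,
    households_in_county,
    states_total=50,
    states_sampled=10,
    counties_sampled=5,
    households_sampled=100,
):
    """Product of the inverse selection probabilities at each stage."""
    stage1 = states_total / states_sampled              # 50 / 10
    stage2 = counties_in_state / counties_sampled       # C_s / 5
    stage3 = households_in_county / households_sampled  # H_c / 100
    return stage1 * stage2 * stage3

# Hypothetical selected state with 60 counties and a selected county
# with 20,000 households
w = three_stage_weight(counties_in_state=60, households_in_county=20_000)

print(w)
```

Because the number of counties and households varies across states and counties, the weight differs from one sampled county to another even though the sample sizes at each stage are fixed.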
Using Sampling Weight
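Suppose our variable of interest in a national survey is
household income. We can estimate national income as a
weighted sum of household income across the sample
using the following equation:
$$\hat{Y} = \sum_{i=1}^{n} w_i y_i$$
where y_i is the income of sampled household i, w_i is its
sampling weight, and n is the number of households in the sample.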
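A weighted sum of this kind can be sketched as follows. The four incomes and their weights are made up for illustration; the weighted mean is included as a common companion estimate and is an addition to what the slide states.

```python
def weighted_total(values, weights):
    """Estimate a population total as the sum of weight * value
    over the sampled units."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for y, w in zip(values, weights))

# Hypothetical sample: two rural households (weight 450 each)
# and two urban households (weight 50 each)
incomes = [30_000, 52_000, 41_000, 75_000]
weights = [450, 450, 50, 50]

total_income = weighted_total(incomes, weights)
mean_income = total_income / sum(weights)  # weighted mean per household

print(total_income, mean_income)
```

Households with larger weights contribute more to the estimate because each one stands in for more households in the population.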
