Lecture 3 - Sampling Design - 2018
• Today we will be
focusing on gathering
data
I. Sampling Methods
II. Sampling Error
III. Power Calculations for
Sample Size
IV. Example
PART I:
SAMPLING
WHY SAMPLE A PORTION?
Measuring the whole
(population) can be too costly!
Data can be obtained more
quickly… with lower
measurement error
Convenience Sampling
choose the easiest respondents available, e.g. the first ten people you meet on the
street
Quota Sampling
select a certain number (i.e. quota) of each type
can produce substantial bias (e.g. 1992 UK election polls)
Still widely used, especially for telephone surveys with high non-response levels
HOW TO DRAW A SAMPLE?
Random Sampling
use a mechanical chance process
that gives each unit a known, positive, probability of being sampled
for instance, given a list of ALL individuals in the population you could:
toss a coin
draw lots out of a basket
use a random number table
use computer software
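As an illustration of the "computer software" option, here is a minimal sketch of a simple random sample drawn without replacement. The sampling frame (ID numbers 1 to 1,000), the sample size and the seed are hypothetical; the function name is ours, not part of the lecture.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement.

    Every unit in the list has the same known, positive probability
    n / len(population) of being selected.
    """
    rng = random.Random(seed)  # seeded so the draw can be reproduced
    return rng.sample(population, n)

# Hypothetical sampling frame: ID numbers for ALL 1,000 individuals.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, n=50, seed=2018)
print(len(sample), "units drawn; inclusion probability =", 50 / len(frame))
```

Fixing the seed makes the draw auditable: anyone with the same list and seed gets the same sample.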
Cluster Sampling
Randomly pick clusters and then (randomly) sample multiple individuals from
each cluster
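A two-stage cluster sample can be sketched as follows. The villages, households and sample sizes are hypothetical, and equal-probability selection at both stages is an assumption of this sketch, not something the slide specifies.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """clusters: dict mapping a cluster id to the list of its members."""
    rng = random.Random(seed)
    # Stage 1: randomly pick clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for c in chosen:
        members = clusters[c]
        # Stage 2: randomly pick individuals within each chosen cluster.
        k = min(n_per_cluster, len(members))
        sample[c] = rng.sample(members, k)
    return sample

# Hypothetical population: 20 villages with 30 households each.
villages = {v: [f"v{v}-h{h}" for h in range(30)] for v in range(20)}
picked = two_stage_cluster_sample(villages, n_clusters=5, n_per_cluster=8, seed=1)
for village, households in picked.items():
    print(village, households)
```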
RECAP
A good sample must be representative of the population
Therefore, draw a random sample
Next up:
A random sample produces reliable information
about the population
PART II:
UNDERSTANDING
SAMPLING ERROR
…AND HOW TO PLAN FOR IT
SAMPLING: A DEEPER CONSIDERATION
• Margin of
sampling error
• Standard error
of estimates
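To make the two quantities concrete, here is a rough sketch that computes the standard error of a sample mean and the corresponding margin of error. The data are made up, and the margin uses a normal critical value (about 1.96 for 95% confidence), which is an approximation for small samples.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def margin_of_error(values, confidence=0.95):
    """Half-width of the confidence interval, using a normal critical value."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * standard_error(values)

# Hypothetical measurements from one sample of 8 units.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print("sample mean    :", statistics.mean(scores))
print("standard error :", round(standard_error(scores), 3))
print("margin of error:", round(margin_of_error(scores), 3))
```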
Next up:
How to determine the correct
sample size for our study
PART III:
DETERMINING
SAMPLE SIZE
IMPORTANT POINTS TO
CONSIDER
In reality
The population mean is unknown
You can only ever draw one sample from the population
Can calculate the sample mean and its standard error (width) but not the population
mean (location)
Often interested in the difference in means, e.g. to measure treatment effects
If the sample is small, sampling error is too large and the sample mean is not
so informative
Can’t reject the “Zero effect” assumption even if the sample mean is large and positive
Want the sample size to be “just right” for the job at hand
Need to do power calculations
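The small-sample problem can be illustrated with a sketch like the one below. It assumes a common outcome standard deviation in both groups and a normal critical value of 1.96; the effect size and group sizes are hypothetical.

```python
import math

def z_statistic(diff, sd, n_treat, n_control):
    """Difference in means divided by its standard error."""
    se = sd * math.sqrt(1 / n_treat + 1 / n_control)
    return diff / se

def rejects_zero_effect(diff, sd, n_treat, n_control, critical=1.96):
    """True if the difference is distinguishable from zero at about the 5% level."""
    return abs(z_statistic(diff, sd, n_treat, n_control)) > critical

# Same observed difference (0.5 standard deviations), two sample sizes.
for n in (10, 100):
    z = z_statistic(0.5, 1.0, n, n)
    print(f"n per group = {n:3d}  z = {z:.2f}  reject zero effect: "
          f"{rejects_zero_effect(0.5, 1.0, n, n)}")
```

With 10 units per group the same difference cannot be told apart from zero; with 100 per group it can. A t critical value would be slightly larger than 1.96 for the small sample, which only strengthens the point.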
WHEN THE TRUE MEAN IS
UNKNOWN…
STATISTICAL POWER (TO
DISTINGUISH BETWEEN
POSSIBLE TRUE STATES)
Minimize overlap!
POWER FORMULA FOR CLUSTERED RCT
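$$\text{EffectSize} = \left(t_{1-\kappa} + t_{\alpha}\right) \sqrt{1 + \rho\,(m-1)} \; \sqrt{\frac{1}{P(1-P)}} \; \sqrt{\frac{\sigma^{2}}{n}}$$
where
EffectSize = effect size
t_{\alpha} = critical value for the significance level
t_{1-\kappa} = critical value for the power level
\sigma^{2} = variance
\rho = ICC (intra-cluster correlation)
m = average cluster size
P = proportion in treatment
n = sample size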
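The ingredients labelled on this slide can be combined in a short calculation. The sketch below assumes a two-sided test and uses normal quantiles in place of t critical values; the function name, the default values and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist

def minimum_detectable_effect(n, sigma, p_treat=0.5, m=1, icc=0.0,
                              alpha=0.05, power=0.80):
    """Smallest effect detectable with the given power in a clustered RCT.

    n: total sample size, sigma: outcome standard deviation,
    p_treat: proportion in treatment, m: average cluster size,
    icc: intra-cluster correlation.
    """
    z = NormalDist()
    t_alpha = z.inv_cdf(1 - alpha / 2)  # significance (two-sided, assumed)
    t_power = z.inv_cdf(power)          # power level
    design_effect = 1 + (m - 1) * icc   # inflation due to clustering
    return ((t_alpha + t_power)
            * math.sqrt(design_effect)
            * math.sqrt(1 / (p_treat * (1 - p_treat)))
            * math.sqrt(sigma ** 2 / n))

# Hypothetical study: 1,000 individuals, outcome standard deviation of 1.
print(minimum_detectable_effect(n=1000, sigma=1.0))                   # no clustering
print(minimum_detectable_effect(n=1000, sigma=1.0, m=20, icc=0.05))   # clustered
```

Holding the total sample fixed, clustering raises the minimum detectable effect by the square root of the design effect 1 + (m - 1) × ICC.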
POWER: MAIN INGREDIENTS
[Figure: control and treatment distributions with the power region shaded; horizontal axis from -4 to 6, vertical axis from 0 to 0.45]
SAMPLE SPLIT: 75% T, 25% C
POWER: 83%
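How the sample split affects power can be explored with a sketch like this one. The slide does not give the effect size, variance or sample size behind its 83% figure, so the inputs below are hypothetical and the output is not meant to reproduce it. The sketch assumes a two-sided test with a normal approximation and no clustering.

```python
import math
from statistics import NormalDist

def power_of_test(effect, sigma, n, p_treat=0.5, alpha=0.05):
    """Approximate power to detect `effect` with total sample size n.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    z = NormalDist()
    se = sigma * math.sqrt(1 / (p_treat * (1 - p_treat) * n))
    return z.cdf(effect / se - z.inv_cdf(1 - alpha / 2))

# Hypothetical inputs: effect of 0.2 standard deviations, 1,000 units in total.
for share in (0.50, 0.75):
    print(f"share in treatment = {share:.2f}  power = "
          f"{power_of_test(0.2, 1.0, 1000, p_treat=share):.3f}")
```

For a fixed total sample, an even split gives the smallest standard error; moving to 75% treatment and 25% control lowers power.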
THE POWER CURVE
Given sampling error, the sample size should give enough power (≥ 80%)
to distinguish “No effect” from alternative scenarios of
interest
Important to choose the appropriate sampling design and sample
size for your problem
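Turning the power requirement around gives the sample size needed to reach a target such as 80%. The following is a minimal sketch under the same assumptions as the clustered power formula (two-sided test, normal quantiles instead of t values); the function name and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(effect, sigma, p_treat=0.5, m=1, icc=0.0,
                         alpha=0.05, power=0.80):
    """Total sample size needed to detect `effect` with the given power."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    design_effect = 1 + (m - 1) * icc  # equals 1 when there is no clustering
    n = (multiplier ** 2 * sigma ** 2 * design_effect
         / (p_treat * (1 - p_treat) * effect ** 2))
    return math.ceil(n)  # round up to a whole unit

# Hypothetical target: detect an effect of 0.2 standard deviations.
print("individual randomization:", required_sample_size(0.2, 1.0))
print("clusters of 20, ICC 0.05:", required_sample_size(0.2, 1.0, m=20, icc=0.05))
```

Halving the effect to be detected roughly quadruples the required sample, because the effect enters the formula squared.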
NEXT LECTURE:
Non-Experimental Evaluation Methods