Formalizing the Concepts:

Simple Random Sampling


Juan Muñoz
Kristen Himelein
March 2013
Purpose of sampling

To study a portion of the population (through
observations at the level of the units selected, such
as households, persons, institutions or physical
objects) and make quantitative statements about
the entire population
Purpose of sampling
• Why sampling?
– Saves cost compared to full enumeration
– Easier to control quality in a sample
– More timely results from sample data
– Measurement can be destructive


Sampling Concepts and Definitions
Unit of analysis
• The level at which a measurement is taken

• Most common units of analysis are persons,
households, farms, and economic establishments
Sampling Concepts and Definitions
Target population or universe
• The complete collection of all the units of
analysis to study.

• Examples: population living in households in a
country; students in primary schools
Sampling Concepts and Definitions
Sampling frame
• List of all the units of analysis whose characteristics
are to be measured
• Must be comprehensive and non-overlapping, and must not
contain irrelevant elements
• Units must be identifiable (often linked to
cartography)
• Should be updated to ensure complete coverage
• Examples: list of establishments; census; civil
registration
Sampling Concepts and Definitions
Parameter / Estimate
• Objective of sampling is to estimate parameters
of a population
• A parameter is a quantity computed from all N values in a
population
• Typically, a descriptive measure of a population,
such as a mean or a variance
– Poverty rate, average income, etc.
Sampling Concepts and Definitions
Unbiased Estimator
• Estimator: a mathematical formula or function using sample
results to produce an estimate for the entire population
$$\hat{\theta} = \hat{\theta}(X_1, X_2, \ldots, X_n)$$
• When the mean of individual sample estimates equals the
population parameter, then the estimator is unbiased
• Formally, an estimator is unbiased if the expected value of the
(sample) estimates is equal to the (population) parameter
being estimated (where k is the number of experiments):
$$\frac{\hat{\theta}_1 + \hat{\theta}_2 + \cdots + \hat{\theta}_k}{k} \longrightarrow \theta \quad \text{as } k \to \infty$$
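The idea of repeating the experiment k times can be checked by simulation. The sketch below is only an illustration: the population (the integers 1 to 1000), the sample size and the number of repetitions are all made-up values, and the sample mean is used as the estimator.

```python
import random
import statistics

def mean_of_sample_means(population, n, k, seed=0):
    """Draw k simple random samples of size n and average their k means."""
    rng = random.Random(seed)
    estimates = [statistics.fmean(rng.sample(population, n)) for _ in range(k)]
    return statistics.fmean(estimates)

# Hypothetical population: the integers 1..1000, whose true mean is 500.5.
population = list(range(1, 1001))
true_mean = statistics.fmean(population)

# Average of k = 2000 sample means, each from an SRS of n = 50 units.
avg_estimate = mean_of_sample_means(population, n=50, k=2000)
print(true_mean, avg_estimate)
```

Individual sample means can be far from 500.5, but their average over many repetitions should land close to it, which is what unbiasedness means.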
Random sampling

• Also known as scientific sampling or probability
sampling
• Each unit has a non-zero and known probability
of selection
• Mathematical theory is available to predict the
probability distribution of the sampling error
(the error caused by observing a sample instead of the
whole population).
Random sampling techniques

• Single stage, equal probability sampling
– Simple Random Sampling (SRS)
– Systematic sampling with equal probability
• Stratified sampling
• Multi-stage sampling
In real life these techniques are usually combined
in various ways; most sampling designs are
complex
Techniques in Random Sampling
Single stage, equal probability sampling

• Random selection of n “units” from a population of N
units, so that each unit has an equal probability of
selection
– N (population) → n (sample)
– Probability of selection (sampling fraction) = f = n/N
• This is the most basic form of probability sampling and
provides the theoretical basis for more complicated
techniques
Techniques in Random Sampling
Single stage, equal probability sampling
(continued)
1. Simple Random Sampling. The investigator mixes
up the whole target population before grabbing “n”
units.

2. Systematic Random Sampling. The N units in the
population are ranked 1 to N in some order (e.g.,
alphabetic). To select a sample of n units, calculate the
step k (k = N/n), take a unit at random from the
first k units, and then take every kth unit after it.
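Both selection methods can be sketched in a few lines. This is a minimal illustration, not a production sampler: the frame of 100 unit ids is hypothetical, and the systematic version assumes N is an exact multiple of n so that the step k is an integer.

```python
import random

def simple_random_sample(units, n, rng):
    """SRS without replacement: every subset of n units is equally likely."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Random start among the first k units, then every k-th unit.

    Assumption for this sketch: N is a multiple of n, so k = N / n is an integer.
    """
    N = len(units)
    if N % n != 0:
        raise ValueError("this sketch assumes N is a multiple of n")
    k = N // n
    start = rng.randrange(k)      # 0-based position within the first k units
    return units[start::k]

rng = random.Random(2013)
frame = list(range(1, 101))       # hypothetical frame: unit ids 1..100
srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
print(sorted(srs))
print(sys_sample)
```

In both cases each unit has selection probability n/N = 0.1, but the systematic sample can only be one of k = 10 possible samples, while the SRS can be any subset of 10 units.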
Techniques in Random Sampling

Single stage, equal probability sampling

• Advantage
– self-weighting (simplifies the calculation of
estimates and variances)
• Disadvantages
– Sampling frame may not be available
– May entail high transportation costs
Techniques in Random Sampling
Stratified sampling
• The population is divided into mutually exclusive
subgroups called strata.
• Then a random sample is selected from each
stratum.
• Common examples: Urban / Rural, Provinces,
Male / Female
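A minimal sketch of stratified selection, assuming an SRS is drawn independently inside each stratum. The frame (60 urban and 40 rural households), the stratum sizes and the function name `stratified_sample` are hypothetical choices made for the illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Group units by stratum, then draw an independent SRS in each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(members, n_per_stratum[name])
            for name, members in strata.items()}

# Hypothetical frame: 60 urban and 40 rural households.
frame = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
rng = random.Random(1)
sample = stratified_sample(frame, lambda u: u[0], {"urban": 6, "rural": 4}, rng)
print(sample)
```

Here the same 10 percent sampling fraction is applied in both strata; different fractions per stratum are equally possible and would require weights at the estimation stage.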
Techniques in Random Sampling
Two-stage sampling
• Units of analysis are divided into groups
called Primary Sampling Units (PSUs)
• A sample of PSUs is selected first
• Then a sample of units is chosen in each of
the selected PSUs

This technique can be generalized
(multi-stage sampling)
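The two stages can be sketched as follows, assuming SRS is used at both stages (real designs often select PSUs with probability proportional to size instead). The villages, household ids and sample sizes are invented for the example.

```python
import random

def two_stage_sample(psus, n_psus, n_units, rng):
    """Stage 1: SRS of PSUs. Stage 2: SRS of units inside each selected PSU."""
    selected = rng.sample(sorted(psus), n_psus)
    return {psu: rng.sample(psus[psu], n_units) for psu in selected}

# Hypothetical frame: 20 villages (PSUs) with 30 households each.
psus = {f"village_{v:02d}": [f"hh_{v:02d}_{h:02d}" for h in range(30)]
        for v in range(20)}

rng = random.Random(7)
sample = two_stage_sample(psus, n_psus=5, n_units=8, rng=rng)
for psu, households in sample.items():
    print(psu, households)
```

Only the 5 selected villages need a household listing, which is the main practical appeal of this design.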
Sample variance & standard error
• Uncertainty is measured by the standard error (ê).
• Variance of the sample mean of an SRS of n units for a
population of size N:
$$\hat{e}^2 = \mathrm{Var}(\bar{x}) = \left(\frac{N-n}{N-1}\right)\frac{\mathrm{Var}(X)}{n} \approx \left(1 - \frac{n}{N}\right)\frac{\mathrm{Var}(X)}{n}$$
• Measure of sampling error. Depends on 3 factors:
– ( 1 - n/N ) = Finite Population Correction (fpc)
– n = sample size
– Var(X) = Population variance. Unknown, but can be
estimated without bias by:
$$\hat{s}^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$
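A small worked example may help, assuming the approximate form (1 - n/N) · ŝ²/n with the population variance replaced by its sample estimate. The five income values and the population size N = 50 are hypothetical.

```python
import math
import statistics

def srs_standard_error(sample, N):
    """Estimated standard error of the mean of an SRS of n units out of N.

    Uses Var(x_bar) ~ (1 - n/N) * s^2 / n, where s^2 is the sample
    variance computed with the n - 1 divisor.
    """
    n = len(sample)
    s2 = statistics.variance(sample)   # divides by n - 1
    fpc = 1 - n / N                    # finite population correction
    return math.sqrt(fpc * s2 / n)

# Hypothetical sample of n = 5 incomes from a population of N = 50.
sample = [12.0, 15.0, 9.0, 14.0, 10.0]
se = srs_standard_error(sample, N=50)
print(round(se, 4))
```

For these values the sample mean is 12, ŝ² = 26/4 = 6.5, the fpc is 0.9, and the estimated variance of the mean is 0.9 × 6.5 / 5 = 1.17.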
Sample Variance in Proportions
• A proportion P (or prevalence) is equal to the mean of
a dummy variable X.
• In this case Var(X) = P(1-P), and
$$\mathrm{Var}(\hat{p}) = \frac{\hat{p}(1-\hat{p})}{n-1}$$
Standard deviation vs standard error
Population: $\sigma^2$ = variance of the population; $\sigma$ = standard deviation around the mean
Sample: $s^2$ = variance of the sample; $s / \sqrt{n}$ = standard error

Difference: The standard deviation is a descriptive statistic. It is the degree to
which individuals in the population differ from the mean of the population. The
standard error is an estimate of how close to the population mean your
sample mean is likely to be.

Standard errors decrease with sample size. Standard deviations are left
unchanged.
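The effect of sample size can be seen with a two-line calculation. In this sketch the standard deviation s = 10 is a hypothetical value held fixed, the two sample sizes are the ones used on the next slide, and the finite population correction is ignored.

```python
import math

def standard_error(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

s = 10.0                          # hypothetical sample standard deviation
se_100 = standard_error(s, 100)
se_750 = standard_error(s, 750)
print(se_100, round(se_750, 3))
```

The standard deviation is the same in both cases; only the standard error shrinks, in proportion to the square root of the sample size.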
Sample Standard Error

[Figure: two panels, n = 100 and n = 750]

Bigger samples have smaller standard errors around the mean


Confidence intervals
o Estimates obtained from random samples can be
accompanied by measures of the uncertainty associated
with the estimate, called confidence intervals.
o It is not sufficient to simply report the sample
proportion obtained by a candidate in the sample
survey; we also need to give an indication of how
accurate the estimate is.
Confidence intervals for averages

$$\bar{x} \pm t_\alpha \cdot \hat{e}(\bar{x})$$
where:
tα = 1.28 for confidence level α = 80%
tα = 1.64 for confidence level α = 90%
tα = 1.96 for confidence level α = 95%
tα = 2.58 for confidence level α = 99%
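The four coefficients listed above coincide with two-sided quantiles of the standard normal distribution, so they can be reproduced rather than looked up. The sketch assumes that large-sample normal approximation; the function names are illustrative only.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided standard normal critical value, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def confidence_interval(estimate, standard_error, confidence=0.95):
    """Return (lower, upper) = estimate -/+ coefficient * standard error."""
    half_width = critical_value(confidence) * standard_error
    return estimate - half_width, estimate + half_width

coefficients = {c: round(critical_value(c), 2) for c in (0.80, 0.90, 0.95, 0.99)}
print(coefficients)
```

For small samples a Student t quantile with n - 1 degrees of freedom would be somewhat larger than these values.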
Confidence intervals for proportions

In a sample of 1,000 electors, 280 of them (28
percent) say they will vote Green.

$$\hat{e} = \sqrt{\mathrm{Var}(\hat{p})} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n-1}} = \sqrt{\frac{0.28 \times 0.72}{999}}$$

Standard error is 1.42 percent.


Confidence intervals
In a sample of 1,000 electors, 280 of them
(28 percent) say they will vote Green.
Standard error is 1.42 percent.
[Figure: horizontal axis from 24 to 32 percent, with the standard error marked]

95 percent confidence interval: 28 ± 1.42 × 1.96

99 percent confidence interval: 28 ± 1.42 × 2.58
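The election example can be reproduced directly. The sketch below uses only the numbers given above (n = 1,000, 28 percent, coefficients 1.96 and 2.58); the variable names are illustrative.

```python
import math

n = 1000
p_hat = 0.28

# Standard error of the sample proportion, using the n - 1 divisor.
se = math.sqrt(p_hat * (1 - p_hat) / (n - 1))

ci95 = (p_hat - 1.96 * se, p_hat + 1.96 * se)
ci99 = (p_hat - 2.58 * se, p_hat + 2.58 * se)

print(f"standard error: {100 * se:.2f} percent")
print(f"95 percent CI: {100 * ci95[0]:.1f} to {100 * ci95[1]:.1f} percent")
print(f"99 percent CI: {100 * ci99[0]:.1f} to {100 * ci99[1]:.1f} percent")
```

The 95 percent interval runs from roughly 25.2 to 30.8 percent and the 99 percent interval from roughly 24.3 to 31.7 percent: more confidence costs a wider interval.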


Sample Size
The required sample size n is determined by
• The variability of the parameter Var(X)
(though this is unknown…)
• The maximum margin of error E we are willing to accept
• How confident we want to be that the error of our estimate
will not exceed that maximum
(for each confidence level α there is a coefficient tα)
• The size of the population N
(not very important)

$$n = \frac{t_\alpha^2 \cdot \mathrm{Var}(X)}{E^2} \qquad n = \frac{t_\alpha^2 \cdot P(1-P)}{E^2} \qquad n_N = \frac{n}{1 + n/N}$$
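A short calculation shows why the population size matters so little. The inputs are hypothetical: a proportion with the most conservative guess P = 0.5, a margin of error E of 5 percentage points, and the 95 percent coefficient 1.96. Rounding up to the next whole unit is a choice made for this sketch.

```python
import math

def required_sample_size(t, variance, E, N=None):
    """n = t^2 * Var(X) / E^2, optionally adjusted for a finite population N."""
    n = t ** 2 * variance / E ** 2
    if N is not None:
        n = n / (1 + n / N)        # finite population adjustment
    return math.ceil(n)            # round up to a whole number of units

P = 0.5                            # hypothetical, most conservative guess
variance = P * (1 - P)

n_infinite = required_sample_size(1.96, variance, 0.05)
n_ten_thousand = required_sample_size(1.96, variance, 0.05, N=10_000)
n_one_million = required_sample_size(1.96, variance, 0.05, N=1_000_000)
print(n_infinite, n_ten_thousand, n_one_million)
```

Going from a population of ten thousand to one of a million changes the required sample by only about 15 units, whereas halving E would roughly quadruple it.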
