Sampling Methods
Recall…
Statistics is a tool for converting data into information:
Data → Statistics → Information
Personal Interview,
Telephone Interview,
Self Administered Questionnaire, and
Internet
Population and Sample
• A population can be defined as including all people
or items with the characteristic one wishes to
understand.
• Because there is very rarely enough time or money
to gather information from everyone or everything
in a population, the goal becomes finding a
representative sample (or subset) of that
population.
Sample: A sample is a small, representative fraction of a population. For
example, to investigate the adoption of a new insecticide among all
the farmers of the country, some farmers are selected and the necessary
data are collected from them; the selected farmers constitute a sample
of the population of farmers.
Important statistical terms
Population:
a set which includes all
measurements of interest
to the researcher
(The collection of all responses,
measurements, or counts that are of interest)
Sample:
A subset of the population
Sampling:
Sampling is the process/method of selecting a sample from a
population
Target Population:
The population to be studied/ to which the investigator wants
to generalize his results
Sampling Unit:
Smallest unit from which sample can be selected
Sampling frame
List of all the sampling units from which sample is drawn
Sampling scheme/sampling plan/sampling method
Method of selecting sampling units from sampling frame
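The relationship between a sampling frame and a sampling scheme can be sketched in a few lines of Python. This is only an illustration: the frame of 200 farmer identifiers, the function name `simple_random_sample`, and the seed are all hypothetical, and simple random sampling is used as just one possible scheme.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct sampling units from the frame, each equally likely."""
    rng = random.Random(seed)      # seeded generator so the draw can be repeated
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: a list of every sampling unit in the target population
frame = [f"farmer_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(frame, 20, seed=42)
```

Here the frame is the list, the sampling unit is one identifier, and the scheme is the rule used to pick units from the list.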
Why sampling?
Get information about large populations
Less costs
Less field time
More accuracy
When it’s impossible to study the whole population
When might you sample the entire population?
When your population is very small
When you have extensive resources
When you don’t expect a very high response
Classification of Sampling Methods
Sampling Methods
• Probability samples: Simple Random, Cluster
• Non-probability samples: Judgment, Quota
Procedure for Drawing a Probability Sample
Disadvantages :
◙ The main demerit of this method is that it is not a random sampling
method in the true sense.
◙ If the correct and complete sampling frame is not known, sampling
in this method is not possible.
◙ If population size is not a multiple of the sample size -
(i) Resulting sample may not be of the required size.
(ii) Sample mean will not be an unbiased estimate of the
population mean.
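The slide does not name the method being criticised. The sketch below assumes it is systematic sampling (taking every k-th unit from the frame after a starting point) and illustrates point (i) only. The frame of 10 units and the function name `systematic_sample` are hypothetical.

```python
def systematic_sample(frame, n, start):
    """Take every k-th unit of the frame, with k = N // n, beginning at index `start`."""
    k = len(frame) // n                  # sampling interval
    if not 0 <= start < k:
        raise ValueError("start must lie in [0, k)")
    return frame[start::k]

frame = list(range(1, 11))               # hypothetical frame, N = 10
# N = 10 is not a multiple of n = 3, so k = 3 and the possible starts are 0, 1, 2
samples = [systematic_sample(frame, 3, s) for s in range(3)]
sizes = [len(s) for s in samples]        # sample size depends on the start
```

With these numbers one starting point yields four units and the other two yield three, so the sample is not always of the required size.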
Stratified Random Sampling
If the population is not homogeneous (the population elements
are not similar) in respect of the characteristic under study, a
simple random sample may not properly represent the
population. In such cases, the whole population is divided into
a number of more or less homogeneous subdivisions, these
subdivisions are called strata. From each of these subdivisions,
separate random selections of elements are made to constitute
a sample. This method of sampling is known as stratified
random sampling.
The strata should be such that -
◙ Elements included in each stratum should be as far as possible of
homogeneous nature, and
◙ Elements of different strata should be as far as possible of different
nature.
Stratified Random Sampling…
After the population has been stratified, we can use simple
random sampling within each stratum to generate the complete sample.
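A minimal sketch of this two-step idea, assuming each unit carries a label that identifies its stratum. The strata names ("north", "south"), the per-stratum sample sizes, and the function name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(units, stratum_of, n_per_stratum, seed=0):
    """Group units into strata, then draw a simple random sample from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for u in units:
        strata.setdefault(stratum_of(u), []).append(u)
    sample = []
    for name, members in strata.items():
        # separate random selection inside every stratum
        sample.extend(rng.sample(members, n_per_stratum[name]))
    return sample

# Hypothetical population: 120 units in one stratum, 80 in another
population = [("north", i) for i in range(120)] + [("south", i) for i in range(80)]
sample = stratified_sample(population, lambda u: u[0], {"north": 6, "south": 4})
```

Every stratum is guaranteed to appear in the sample, which a single simple random sample of the whole population cannot promise.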
Advantages :
◙ Sample units are selected from different strata of the
population on the basis of relative importance, so the sample
drawn in this method is more representative compared to the
sample obtained by other methods.
◙ Administration of stratified random sampling is more
convenient than simple random sampling.
◙ Sampling unit selection is less expensive and less time
consuming in stratified random sampling compared to simple
random sampling.
◙ Supervision is comparatively easier in stratified random
sampling.
Disadvantages :
◙ Stratum selection may sometimes become complicated.
Improper stratification reduces the reliability of the
collected information.
◙ It is not easy to determine the sample components of
different strata without previous experience.
◙ Sampling is not possible if sizes of the different strata are
not known.
Types of Stratified Samples
Proportional Stratified Sample:
The number of sampling units drawn
from each stratum is in proportion to
the relative population size of that
stratum
Disproportional Stratified Sample:
The number of sampling units drawn
from each stratum is allocated
according to analytical considerations
e.g. as variability increases sample
size of stratum should increase
Types of Stratified Samples…
Stratum   Number   Percent
Male      160      80
Female    40       20
Total     200      100
(Lal Das D.K, 2008)
• Now, suppose the investigator has decided to draw a sample of
60; he then has to draw 30 from each sub-stratum.
Disproportionate sampling gives equal weight to each sub-stratum.
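The two allocation rules can be compared on the Male/Female figures above. This is a sketch under assumptions: the function names are hypothetical, and "disproportionate" is taken to mean equal allocation, as in the example of 30 per sub-stratum.

```python
def proportional_allocation(stratum_sizes, n):
    """Sample size per stratum in proportion to the stratum's share of the population."""
    total = sum(stratum_sizes.values())
    # rounding may make the parts sum to slightly more or less than n in general
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def equal_allocation(stratum_sizes, n):
    """Same sample size for every stratum, whatever its population size."""
    share = n // len(stratum_sizes)
    return {name: share for name in stratum_sizes}

sizes = {"Male": 160, "Female": 40}
prop = proportional_allocation(sizes, 60)   # {'Male': 48, 'Female': 12}
equal = equal_allocation(sizes, 60)         # {'Male': 30, 'Female': 30}
```

Proportional allocation keeps the 80/20 split of the population, while equal allocation over-represents the smaller stratum.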
Cluster Random Sampling
In this method the population is divided into a required
number of mutually exclusive groups or classes; these groups
or classes are known as clusters. Then some clusters are
randomly selected and data are collected from all the units
included in these selected clusters.
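A minimal sketch of one-stage cluster sampling as described above: clusters are chosen at random, then every unit in a chosen cluster is observed. The cluster names, unit labels, and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly select whole clusters and return all units inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # random choice of cluster names
    units = [u for name in chosen for u in clusters[name]]
    return chosen, units

# Hypothetical population: 5 mutually exclusive clusters of 4 units each
clusters = {f"Section {i}": [f"s{i}-{j}" for j in range(4)] for i in range(1, 6)}
chosen, units = cluster_sample(clusters, 2)
```

Only a list of clusters is needed to make the selection; lists of individual units are needed only for the clusters that were chosen.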
Cluster Random Sampling
1. Take the population map.
2. Divide it into equal clusters
3. Assign each cluster a random number.
4. Select the cluster/s on the basis of a pre-decided rule.
5. Divide the selected cluster into sub-clusters.
6. Repeat steps 2-4 until you get a manageable sub-cluster.
[Figure: cluster sampling, with the population divided into Sections 1 to 5]
Advantages
Low cost/high frequency of use
Requires list of all clusters, but only of
individuals within chosen clusters
Can estimate characteristics of both cluster and
population
Multistage designs combine the strengths of the methods used
Disadvantages
Larger error for comparable size than other
probability methods
Multistage very expensive
Types of Cluster Sampling
Non probability samples
Convenience samples
sample is selected from elements of a population that
are easily accessible
Snowball sampling (friend of friend….etc.)
Purposive sampling (judgemental)
• You choose who you think should be in the
study
Quota sample
Non-Probability Sampling Methods
Convenience Sample
The sampling procedure used to obtain
those units or people most conveniently
available
Why: speed and cost
External validity?
Internal validity
Is it ever justified?
Judgment or Purposive Sample
The sampling procedure in which an
experienced researcher selects the sample
based on some appropriate characteristic
of sample members… to serve a purpose
Advantages
Moderate cost
Commonly used/understood
Disadvantages
Bias!
Disadvantages
Bias because sampling units not
independent
Projecting data beyond sample not
justified.
Sampling Error…
Sampling error refers to differences between the sample and
the population that exist only because of the observations
that happened to be selected for the sample.
Note: increasing the sample size will reduce this type of
error (unlike nonsampling errors, which a larger sample does not reduce).
Common research designs
• Experimental design
–Subjects are randomly assigned to treatments (=variables) by the
researcher
–Causal inferences are stronger
–Random sampling from the population less important
–Usually conducted in a laboratory (an exception: Moving to Opportunity, MTO)
• Observational design (e.g., surveys)
–Subjects are not randomly assigned to variables
–Random sampling is important.
–Selection bias
–Causal inferences are compromised.
Determining Sample Size
• What data do you need to consider?
–Variance or heterogeneity of population
–The degree of acceptable error (confidence interval)
–Confidence level
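Estimating a single mean or proportion:
Quantitative: $n = \dfrac{Z^2 \sigma^2}{D^2}$
Qualitative: $n = \dfrac{Z^2 \pi (1 - \pi)}{D^2}$
Comparing two groups:
Quantitative: $n = \dfrac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2}$
Qualitative: $n = \dfrac{2 P (1 - P) F}{D^2}$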
Problem 1
A study is to be performed to determine a certain
parameter in a community. From a previous study a
sd of 46 was obtained.
If a sampling error of up to 4 is acceptable, how
many subjects should be included in this study at the
99% level of confidence?
Answer
$n = \dfrac{Z^2 \sigma^2}{D^2} = \dfrac{2.58^2 \times 46^2}{4^2} = 880.3 \approx 881$
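The arithmetic of Problem 1 can be checked with a short script. The function name `n_for_mean` is hypothetical; the inputs (Z = 2.58, sd = 46, D = 4) are the ones used in the answer, and the result is rounded up because a fraction of a subject cannot be recruited.

```python
import math

def n_for_mean(z, sigma, d):
    """Sample size for estimating a mean: n = Z^2 * sigma^2 / D^2, rounded up."""
    return math.ceil((z * sigma / d) ** 2)

n = n_for_mean(z=2.58, sigma=46, d=4)   # 880.3 rounds up to 881
```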
Problem 2
• A study is to be done to determine the effect of two drugs
(A and B) on blood glucose level (BGL). From previous
studies using those drugs, SDs of BGL of 8 and 12
mg/dl were obtained, respectively.
• A confidence level of 95% and a power of 90% are
required to detect a mean difference between the two
groups of 3 mg/dl. How many subjects should be
included in each group?
Answer
$n = \dfrac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2} = \dfrac{(8^2 + 12^2) \times 10.5}{3^2} = 242.7 \approx 243$
in each group
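A sketch that reproduces this calculation. The function names are hypothetical. The factor F = 10.5 is taken from the answer; the helper `f_factor` assumes F is the squared sum of the normal quantiles for a two-sided 5% significance level and 90% power, which gives about 10.5.

```python
import math
from statistics import NormalDist

def f_factor(alpha=0.05, power=0.90):
    """Assumed definition: F = (z_{1 - alpha/2} + z_{power})^2."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2

def n_per_group_means(sd1, sd2, d, f):
    """Subjects per group: n = (sd1^2 + sd2^2) * F / D^2, rounded up."""
    return math.ceil((sd1 ** 2 + sd2 ** 2) * f / d ** 2)

f = f_factor()                                      # about 10.5
n = n_per_group_means(sd1=8, sd2=12, d=3, f=10.5)   # 242.7 rounds up to 243
```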
Problem 3
It was desired to estimate proportion of anaemic
children in a certain preparatory school. In a similar
study at another school a proportion of 30 % was
detected.
Compute the minimal sample size required at a
confidence level of 95%, accepting a difference of
up to 4% from the true population proportion.
Answer
$n = \dfrac{Z^2 \pi (1 - \pi)}{D^2}$
For comparing two proportions:
$n = \dfrac{2 P (1 - P) F}{D^2}$
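The slides give the formula for Problem 3 but no numeric result. The sketch below plugs in the stated values under two assumptions: Z = 1.96 for 95% confidence, and the 4% difference is read as an absolute margin of 0.04 on the proportion. The function name `n_for_proportion` is hypothetical.

```python
import math

def n_for_proportion(z, p, d):
    """Sample size for estimating a proportion: n = Z^2 * p * (1 - p) / D^2, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

# 1.96^2 * 0.30 * 0.70 / 0.04^2 is about 504.2, so 505 children under these assumptions
n = n_for_proportion(z=1.96, p=0.30, d=0.04)
```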