Sampling
⦿ Population refers to the entire group of people, events, or things of
interest with the required characteristics that the researcher wishes
to investigate.
⦿ Because there is very rarely enough time or money to gather
information from everyone or everything in a population, the goal
becomes finding a representative sample (or subset) of that
population.
⦿ A sample is “a smaller & representative collection of units from a
population used to determine truths about that population”.
⦿ Sampling is the process of selecting a sufficient number of elements
from the population, so that a study of the sample and an
understanding of its properties or characteristics would make it
possible for us to generalize such properties or characteristics to the
population elements.
⦿ Sampling involves selecting a relatively small number of
elements (sample) from a larger defined group (population)
and expecting the information gathered from the small group
will enable judgments about the larger group.
Sampling Terms
[Figure: population, theoretical population, sampling frame, sampling units (available elements)]
SAMPLING BREAKDOWN
[Figure: target population, study population, sample]
Terminology
Population
The entire group of people of interest from whom the researcher
needs to obtain information.
Element (population unit)
One unit from a population. If 1,000 employees in a hospital
happen to be the population of interest to a researcher, each
employee therein is an element.
Sampling
The selection of a subset of the population
Sampling Frame
Listing of population from which a sample is chosen
Census
A polling of the entire population
Survey
A polling of the sample
Parameter
A parameter is a number describing a characteristic of the
population, computed from all the elements of the population, e.g.
the population mean or mode. It is a summary description of the
characteristics of the target population.
Statistic
A statistic is a numerical value computed from sample data, i.e.
a function of the sample observations. It describes a sample
(e.g. the sample mean) and is used as an estimator of a
population parameter, which is why the sample should represent
the population. In short, a statistic is a summary value for a
small group drawn from the population, i.e. the sample.
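The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below is made-up data (hypothetical employee ages), and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 1,000 hospital employees (made-up data).
population = [random.randint(20, 60) for _ in range(1000)]

# Parameter: computed from every element of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample and used to estimate the parameter.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two values will usually be close but not identical; the gap between them is the sampling error.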
SAMPLING FRAME
⦿ Source list from which sample is drawn. In the most
straightforward case, such as the sentencing of a batch of
material from production (acceptance sampling by lots), it
is possible to identify and measure every single item in
the population and to include any one of them in our
sample. However, in the more general case this is not
possible. There is no way to identify all rats in the set of
all rats. Where voting is not compulsory, there is no way
to identify in advance which people will actually vote at a
forthcoming election.
⦿ As a remedy, we seek a sampling frame which has the
property that we can identify every single element and
include any of them in our sample.
⦿ The sampling frame must be representative of the
population.
Sampling Methods
Probability sampling
⦿ Simple random sampling
⦿ Systematic random sampling
⦿ Stratified random sampling
⦿ Cluster sampling
⦿ Multistage sampling
Nonprobability sampling
⦿ Convenience sampling
⦿ Judgment sampling
⦿ Quota sampling
⦿ Snowball sampling
PROBABILITY SAMPLING
1. SIMPLE RANDOM SAMPLING
• Applicable when population is small,
homogeneous & readily available
• All subsets of the frame of a given size have an equal
probability of selection. Each element of the frame thus has
an equal probability of selection.
• It provides for the greatest number of possible
samples. Each unit in the sampling frame is assigned
a number.
• A table of random numbers or a lottery system is
used to determine which units are to be
selected.
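As a minimal sketch of the numbering-and-lottery procedure, the Python standard library can stand in for the random number table. The frame of 1,000 numbered units and the function name `simple_random_sample` are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed is optional, used here for repeatability
    return rng.sample(frame, n)

# Hypothetical sampling frame: units numbered 1..1000.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```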
■ Advantages
■ Representativeness and Freedom from Bias
■ Estimates are easy to calculate.
■ Minimal knowledge of population needed
■ Easy to analyze data
■ Disadvantages
■ Time and Labor Requirement
■ Requires sampling frame
■ Does not use researchers’ expertise
■ Larger risk of random error than stratified sampling
2. SYSTEMATIC SAMPLING
Systematic sampling is an EPS (equal probability of selection) method, because
all elements have the same probability of selection (one in ten when every
tenth element is selected after a random start). It is not 'simple random
sampling' because different subsets of the same size have different selection
probabilities - e.g. the set {4,14,24,...,994} has a one-in-ten probability of
selection, but the set {4,13,24,34,...} has zero probability of selection.
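The one-in-ten case can be sketched as follows, assuming a frame of units numbered 1 to 1,000 and a sampling interval of 10 (both inferred from the set {4,14,24,...,994}). The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k units, then take every k-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1, each with probability 1/k
    return frame[start::k]

# Hypothetical sampling frame: units numbered 1..1000, interval k = 10.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 10, seed=0)

# Only 10 distinct samples are possible; a start at unit 4 gives {4, 14, ..., 994}.
start_at_four = frame[3::10]
print(start_at_four[:3], start_at_four[-1])
```

Because the start is the only random choice, a set such as {4,13,24,34,...} can never be produced, which is why this is not simple random sampling.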
ADVANTAGES:
⦿ Sample easy to select
⦿ Suitable sampling frame can be identified easily
⦿ Sample evenly spread over entire reference
population
⦿ Easier and less costly method of sampling
DISADVANTAGES:
⦿ Sample may be biased if hidden periodicity in
population coincides with that of selection.
⦿ Difficult to assess precision of estimate from one
survey.
3. STRATIFIED SAMPLING
⦿ Used where the population embraces a number of distinct
categories. If a population from which a sample is to be
drawn does not constitute a homogeneous group, the stratified
sampling technique is generally applied in order to obtain a
representative sample. The frame can be organized into
separate "strata". Under stratified sampling, the population
is divided into several sub-populations (strata) that are
individually more homogeneous than the total population, and
then we select items from each stratum to constitute a sample.
Since each stratum is more homogeneous than the total
population, we are able to get more precise estimates for each
stratum, and by estimating each of the component parts more
accurately, we get a better estimate of the whole.
⦿ Every unit in a stratum has the same chance of being selected.
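A minimal sketch of stratified sampling with proportional allocation is given below. Proportional allocation is one possible choice and is an assumption here, as are the department names, the stratum sizes, and the function name `stratified_sample`.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then draw a simple random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Proportional allocation: same sampling fraction in every stratum.
        n = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical frame of 1,000 employees: (employee_id, department).
units = (
    [(i, "nursing") for i in range(600)]
    + [(i, "admin") for i in range(600, 900)]
    + [(i, "lab") for i in range(900, 1000)]
)
sample = stratified_sample(units, lambda u: u[1], 0.1, seed=7)
print(Counter(dept for _, dept in sample))
```

Within each stratum every unit has the same chance of selection, and each stratum is guaranteed to appear in the sample.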
4. CLUSTER SAMPLING
⦿ If the total area of interest happens to be a big one, a
convenient way in which a sample can be taken is to divide the
area into a number of smaller non-overlapping areas and then
to randomly select a number of these smaller areas (usually
called clusters), with the ultimate sample consisting of all (or
samples of) units in these small areas or clusters.
⦿ Thus in cluster sampling the total population is divided into a
number of relatively small subdivisions, which are themselves
clusters of still smaller units, and then some of these clusters are
randomly selected for inclusion in the overall sample.
⦿ The population is divided into clusters of homogeneous units,
usually based on geographical contiguity.
⦿ Sampling units are groups rather than individuals.
⦿ A sample of such clusters is then selected.
⦿ All units from the selected clusters are studied.
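The selection of whole clusters can be sketched as follows. The villages and households are made-up data, and `cluster_sample` is a hypothetical helper name.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters, then study every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sampling units are clusters
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical population: 10 villages (clusters) of 20 households each.
clusters = {
    f"village_{i}": [f"household_{i}_{j}" for j in range(20)]
    for i in range(10)
}
chosen, units = cluster_sample(clusters, 3, seed=3)
print(chosen, len(units))
```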
5. MULTISTAGE SAMPLING
⦿ Multi-stage sampling is a further development of the principle
of cluster sampling. Suppose we want to investigate the
working efficiency of nationalized banks in India and we want
to take a sample of a few banks for this purpose. The first stage
is to select large primary sampling units, such as states in a
country. Then we may select certain districts and interview all
banks in the chosen districts. This would represent a two-stage
sampling design, with the ultimate sampling units being clusters
of districts. If, instead of taking a census of all banks within the
selected districts, we select certain towns and interview all
banks in the chosen towns, this would represent a three-stage
sampling design. If, instead of taking a census of all banks
within the selected towns, we randomly sample banks from
each selected town, then it is a case of using a four-stage
sampling plan. If we select randomly at all stages, we will have
what is known as a ‘multi-stage random sampling design’.
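A simplified sketch of random selection at every stage is shown below. To keep it short it uses only three levels (states, districts, banks) rather than the four in the example above; the nested frame, the counts per stage, and the name `multistage_sample` are all assumptions made for illustration.

```python
import random

def multistage_sample(frame, n_states, n_districts, n_banks, seed=None):
    """Randomly select states, then districts within them, then banks within those."""
    rng = random.Random(seed)
    selected = []
    for state in rng.sample(sorted(frame), n_states):                 # stage 1
        districts = frame[state]
        for district in rng.sample(sorted(districts), n_districts):   # stage 2
            banks = districts[district]
            selected.extend(rng.sample(banks, n_banks))                # stage 3
    return selected

# Hypothetical frame: 6 states, 5 districts per state, 8 banks per district.
frame = {
    f"state_{s}": {
        f"district_{s}_{d}": [f"bank_{s}_{d}_{b}" for b in range(8)]
        for d in range(5)
    }
    for s in range(6)
}
sample = multistage_sample(frame, n_states=2, n_districts=2, n_banks=3, seed=11)
print(sample)
```

Only the selected states and districts need a detailed list of their banks, which is the practical appeal of the design.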
Non Probability Sampling
1. CONVENIENCE SAMPLING
⦿ Sometimes known as opportunity sampling or accidental
sampling.
⦿ A type of non probability sampling which involves the sample
being drawn from that part of the population which is close to
hand. That is, readily available and convenient.
⦿ The researcher using such a sample cannot scientifically make
generalizations about the total population from this sample
because it would not be representative enough.
⦿ For example, if the interviewer conducts a survey at a
hospital early in the morning on a given day, the patients that
he/she could interview would be limited to those present at
that time. Their views would not represent those of the other
patients who would be reached if the survey were
conducted at different times of day and several times per week.
⦿ This type of sampling is most useful for pilot testing.
■ Advantages
■ Very low cost
■ Extensively used/understood
■ No need for list of population elements
■ Disadvantages
■ Variability and bias cannot be measured
or controlled
■ Projecting data beyond sample not
justified.
2. Judgmental sampling or Purposive sampling
⦿ The researcher chooses the sample based on who they think would be
appropriate for the study. This is used primarily when there is a limited
number of people who have expertise in the area being researched.
⦿ This type of sampling technique might be the most appropriate if the
population to be studied is difficult to locate or if some members are thought
to be better (more knowledgeable, more willing, etc.) than others to
interview. This determination is often made on the advice and with the
assistance of the client. For instance, if you wanted to interview incentive
travel organizers within a specific industry to determine their needs or
destination preferences, you might find that not only are there relatively few,
they are also extremely busy and may well be reluctant to take time to talk
to you. Relying on the judgment of some knowledgeable experts may be far
more productive in identifying potential interviewees than trying to develop
a list of the population in order to randomly select a small number.
■ Advantages
■ Moderate cost
■ Commonly used/understood
■ Sample will meet a specific objective
■ Disadvantages
■ Bias!
■ Projecting data beyond sample not
justified.
3. Snowball Sampling
▪ Also called chain referral sampling.
▪ Selection of additional respondents is based on referrals
from the initial respondents (e.g. friends of friends).
▪ Thus the sample group appears to grow like a rolling
snowball
▪ Used to sample from low incidence or rare populations.
▪ This sampling technique is often used in hidden
populations which are difficult for researchers to access;
example populations would be drug users or sex
workers.
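The referral chain can be sketched as a breadth-first walk over a referral network. The network, the seed respondent "A", and the function name `snowball_sample` are hypothetical; real studies decide who to follow up in many different ways.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Start from seed respondents and add the people they refer, wave by wave."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # do not recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return sampled

# Hypothetical referral network: each respondent names further contacts.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
sample = snowball_sample(referrals, ["A"], max_size=5)
print(sample)  # ['A', 'B', 'C', 'D', 'E']
```

Everyone in the sample is reachable from the seed, which illustrates why the sampling units are not independent.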
[Figure: Method of snowball sampling]
■ Disadvantages
■ Bias because sampling units not independent
■ Projecting data beyond sample not justified.
■ Not Random
4. Quota Sampling
[Figure: steps in the sampling design process, including "Determine Appropriate Sample Size" and "Execute Sampling Design"]
1. Define the Target Population
It addresses the question “Ideally, who do you want to
survey?”, i.e. those who have the information sought.
What are their characteristics? Who should be
excluded?
– age, gender, product use, those in industry
– Geographic area
It involves
– defining population units
– setting population boundaries
– Screening (e.g. security questions, product use)
▪ phone book
⦿ 0.04 = 1.96 × √(0.2 × 0.8 / n)
⦿ n = (0.2 × 0.8 × 1.96²) / 0.04²
⦿ n = 384.2
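The arithmetic above can be checked with a short sketch. The function name `sample_size_for_proportion` is hypothetical; the inputs (p = 0.2, margin of error 0.04, z = 1.96) are taken from the calculation above.

```python
import math

def sample_size_for_proportion(p, margin_of_error, z=1.96):
    """Solve ME = z * sqrt(p * (1 - p) / n) for n."""
    return z**2 * p * (1 - p) / margin_of_error**2

n = sample_size_for_proportion(0.2, 0.04)
print(round(n, 2))   # 384.16
print(math.ceil(n))  # 385 if rounded up to a whole respondent
```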
If population range is given
⦿ ME = Z × (s / √n)
⦿ n = (s × Z / ME)²
⦿ The standard deviation s is approximately 1/6 of the
range.
⦿ Use the z value, e.g. 1.96 for a 95%
confidence interval.
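A small sketch of the range-based calculation follows. The range of 60 and the margin of error of 2 are made-up inputs chosen only to show the steps, and `sample_size_for_mean` is a hypothetical name.

```python
import math

def sample_size_for_mean(value_range, margin_of_error, z=1.96):
    """Estimate s as range / 6, then apply n = (s * z / ME)^2."""
    s = value_range / 6  # rule of thumb from the text above
    return (s * z / margin_of_error) ** 2

# Hypothetical inputs: values span a range of 60, desired margin of error 2.
n = sample_size_for_mean(60, 2)
print(round(n, 2))   # 96.04
print(math.ceil(n))  # 97 if rounded up to a whole respondent
```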
Slovin’s formula
⦿ Used when the population size is known
⦿ n = N / (1 + N × e²)
Where:
• n = Sample size
• N = Total population
• e = Margin of error
• Suppose the population size is 1,000 and the margin of error is 0.05
• n = N / (1 + N × e²) =
1,000 / (1 + 1,000 × 0.05²) = 285.714286
• ≈ 286
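The worked example can be reproduced with a few lines of Python; the function name `slovin` is an illustrative choice.

```python
import math

def slovin(population_size, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2)."""
    return population_size / (1 + population_size * margin_of_error**2)

n = slovin(1000, 0.05)
print(round(n, 6))   # 285.714286
print(math.ceil(n))  # 286
```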
Sample size calculation for case control studies
Suppose a researcher wants to see the association between childhood
violence and psychiatric disorder in adulthood. He will take a sample
of adults with psychiatric disorders and another
sample of adults with no psychiatric disorders. He will then look
retrospectively at the history of childhood violence in both groups.
Exposure in both groups will be compared and the odds ratio will be
calculated.
So if the researcher wants to calculate the sample size for the
above-mentioned case control study of the link between childhood
violence and psychiatric disorder in adulthood, and he wants to fix
the power of the study at 80%, assuming the expected proportions in the case
group and control group are 0.35 and 0.20 respectively, and he
wants to have equal numbers of cases and controls, then the sample
size per group will be
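The formula and result for this slide are not present in the text. As a hedged sketch only, one commonly used formula for equal-sized groups is n = 2 × p̄(1 − p̄) × (Z_α/2 + Z_β)² / (p1 − p2)², where p̄ is the average of the two proportions. The 5% two-sided significance level is an assumption (the text fixes only the power at 80%), and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def case_control_sample_size(p_cases, p_controls, power=0.80, alpha=0.05):
    """Per-group size for equal groups, using a pooled-proportion formula.

    alpha = 0.05 (two-sided) is an assumption; it is not stated in the text.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
    z_beta = NormalDist().inv_cdf(power)           # about 0.84
    p_bar = (p_cases + p_controls) / 2
    numerator = 2 * p_bar * (1 - p_bar) * (z_alpha + z_beta) ** 2
    return numerator / (p_cases - p_controls) ** 2

n = case_control_sample_size(0.35, 0.20)
print(round(n, 1), math.ceil(n))
```

Other formulas (for example, ones that do not pool the two proportions) give slightly different numbers, so the value printed here should be read as an approximation under the stated assumptions.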
Sample size estimation in intervention studies
Sample size estimation in intervention studies (single
group)