9-1 ASAP Statistics - Sampling-1
Business Analytics
Plan procedure for selecting sampling units
Conduct fieldwork
Why Sample?
Sampling provides:
• Lower cost
• Greater speed
• Greater accuracy
• Availability of elements
Merits and Demerits of Sampling
Population
Population: A complete group of entities sharing
some common set of characteristics.
A census is an investigation of all the individual
elements that make up the population.
Sampling Frame
The list of elements from which a sample may be
drawn, also called the working population.
Target Population: The specific, complete group
relevant to the research project.
Sampling Frame Error: Error that occurs when
certain sample elements are not listed or available
and are not represented in the sampling frame.
Simple Random
1. The lottery method
Drawing names from a hat and selecting
the winning raffle ticket from a large drum
are examples of this type of sampling.
2. The use of Table of Random Numbers
3. Use of computers
Probability of selection = sample size / population size
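As a minimal sketch of the computer-based approach (item 3), the following uses Python's standard `random.sample` to draw a simple random sample without replacement and applies the selection-probability formula. The population of 100 numbered elements, the sample size of 10, and the fixed seed are illustrative assumptions, not values from the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement; every element is equally likely."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return rng.sample(population, n)

def selection_probability(sample_size, population_size):
    """Probability of selection = sample size / population size."""
    return sample_size / population_size

# Hypothetical population: elements numbered 1..100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
prob = selection_probability(len(sample), len(population))
print(sample, prob)
```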
Simple Random
Advantages
• Easy to implement with random dialing
Disadvantages
• Requires list of population elements
• Time consuming
• Uses larger sample sizes
• Produces larger errors
• High cost
Stratified
Advantages
• Control of sample size in strata
• Increased statistical efficiency
• Provides data to represent and analyze subgroups
• Enables use of different methods in strata
Disadvantages
• Increased error will result if subgroups are selected at different rates
• Especially expensive if strata on population must be created
• High cost
Systematic Sampling
In this method every kth element in the
population is selected, beginning with
a random start of an element in the
range of 1 to k. The skip interval k is
determined by:
k = population size / sample size
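The procedure above can be sketched as follows. This is an illustration under stated assumptions: the population of 100 numbered elements, the sample size of 20, and the seed are hypothetical, and integer division is used for k, which is exact only when the population size is a multiple of the sample size.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select every kth element after a random start in the range 1..k."""
    k = len(population) // sample_size   # skip interval k = N / n (assumed to divide evenly)
    rng = random.Random(seed)
    start = rng.randint(1, k)            # random start, counted from 1
    # Convert the 1-based start to a 0-based index, then step by k
    chosen = population[start - 1::k][:sample_size]
    return chosen, k

# Hypothetical population: elements numbered 1..100
population = list(range(1, 101))
sample, k = systematic_sample(population, 20, seed=7)
print(k, sample)
```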
Cluster Sampling
If the total area of interest is large, a convenient
way to take a sample is to divide the area into a
number of smaller non-overlapping areas and then
to randomly select a number of these smaller areas
(called clusters), with the ultimate sample
consisting of all units in these areas or clusters.
Then, from each selected sampling unit, a
sample of population elements is drawn by either
simple random selection or stratified random
selection.
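A minimal sketch of both variants described above: taking all units in the randomly selected clusters, or drawing a simple random sample within each selected cluster. The cluster names, unit names, and sizes are hypothetical, and `cluster_sample` is an illustrative helper rather than a standard library function.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """clusters maps a cluster name to its list of units.

    Randomly select n_clusters clusters. Take all units in each selected
    cluster, or a simple random sample of per_cluster units if that is given.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        units = clusters[name]
        if per_cluster is None:
            sample.extend(units)  # one-stage: every unit in the cluster
        else:
            sample.extend(rng.sample(units, min(per_cluster, len(units))))
    return chosen, sample

# Hypothetical area divided into 5 non-overlapping clusters of 4 units each
clusters = {f"area_{i}": [f"unit_{i}_{j}" for j in range(4)] for i in range(5)}
chosen, one_stage = cluster_sample(clusters, 2, seed=1)
_, two_stage = cluster_sample(clusters, 2, per_cluster=2, seed=1)
print(chosen, one_stage, two_stage)
```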
Cluster Sampling
Advantages
• Provides an unbiased estimate of population parameters if properly done
• Economically more efficient than simple random
• Lowest cost per sample
• Easy to do without list
Disadvantages
• Often lower statistical efficiency due to subgroups being homogeneous rather than heterogeneous
• Moderate cost
Area Sampling
If the clusters happen to be geographic
subdivisions, cluster sampling is better
known as area sampling.
In large field surveys, clusters
consisting of specific geographical
areas such as districts, taluks, villages, or
blocks in a city are randomly drawn.
Multi-Stage Sampling
Multi-stage sampling involves two or
more steps that combine some of the
probability sampling techniques
already described.
Multi-stage sampling is applied in big
inquiries extending over a considerably
large geographical area.
Non-Probability sampling
Some of the popular non-
probability sampling techniques
are:
1. Convenience Sampling
2. Judgment Sampling
3. Quota Sampling
4. Snowball Sampling
Convenience Sampling
This is a non-probability sampling method in
which the interviewers will decide the choice of
sampling units based on their convenience.
In most situations, the following may be
true:
➢ The sampling units may be distributed sparsely
(thinly)
➢ Many respondents will refuse to fill in the
questionnaire
➢ Interviewers may not be serious about selecting
sampling units as per the sampling plan, etc.
Judgment Sampling (Purposive
Sampling)
This is a non-probability sampling method
in which the sampling units are selected
on the advice of an expert or by the
intuition or opinion of the researcher.
There is a chance of personal bias.
If done carefully, it can lead to better results.
This is called purposive sampling because the
samples are identified selectively, which
prevents the inclusion of other sampling units.
Quota Sampling
Quota sampling is a non-probability
sampling method in which the population
is classified into a number of groups
based on some criterion, say age of the
members of the population, viz., old age,
middle age, and young age.
In this method, the proportion of sampling
units selected from each group is the same as
that group's proportion in the population.
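The proportional allocation can be sketched as below. The age groups follow the example in the text, but the group counts and the sample size of 50 are hypothetical. The sketch only computes how many units each quota should contain; choosing the units within a quota is left to the interviewer, which is what makes the method non-probability. Rounding may make the quotas sum to slightly more or less than the target when the proportions do not divide evenly.

```python
def quota_sizes(group_counts, sample_size):
    """Allocate the sample so each group's share matches its population share."""
    total = sum(group_counts.values())
    return {group: round(sample_size * count / total)
            for group, count in group_counts.items()}

# Hypothetical population counts by age group
population_by_age = {"young age": 500, "middle age": 300, "old age": 200}
quotas = quota_sizes(population_by_age, 50)
print(quotas)
```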
Snowball Sampling
Snowball sampling is a restrictive multi-stage
sampling method in which a certain number
of sampling units are initially selected at random.
Later, additional sampling units are selected
through a referral process.
This means that the initially selected
respondents provide addresses of additional
respondents for interviewers.
Initial respondents may be selected randomly,
for example, from the information in the
telephone directories.
Population Parameter
The population mean (µ), standard deviation (σ),
and proportion (p) are called the parameters of
a distribution.
• Variables in a population
• Measured characteristics of a population
• Greek lower-case letters as notation
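Standard error of the mean:
$$S_{\bar{X}} = \frac{S}{\sqrt{n}}$$
Confidence interval for the mean:
$$\mu = \bar{X} \pm Z_{cl} \frac{S}{\sqrt{n}}$$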
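As a rough illustration, the standard error of the mean, S/√n, and the confidence interval X̄ ± Z_cl·S/√n can be computed as follows. The sample mean of 50, standard deviation of 10, and sample size of 100 are hypothetical values chosen for the sketch, and the function names are illustrative.

```python
import math

def standard_error_mean(s, n):
    """Standard error of the mean: S / sqrt(n)."""
    return s / math.sqrt(n)

def confidence_interval_mean(x_bar, s, n, z_cl):
    """Interval x_bar +/- z_cl * S / sqrt(n)."""
    margin = z_cl * standard_error_mean(s, n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, standard deviation 10, n = 100, 95% level (z = 1.96)
se = standard_error_mean(10, 100)
low, high = confidence_interval_mean(50, 10, 100, 1.96)
print(se, low, high)
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.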
Random Sampling Error
and Sample Size are
Related
Sample Size
Sample size calculation uses:
1. Variance (standard deviation)
2. Magnitude of error
3. Confidence level
Sample Size Formula
$$n = \left(\frac{zs}{E}\right)^2$$
Sample Size Formula - Example
Suppose a survey researcher,
studying expenditures on lipstick,
wishes to have a 95 percent confidence
level (Z) and a range of error (E) of
less than $2.00. The estimate of the
standard deviation is $29.00.
Sample Size Formula - Example
$$n = \left(\frac{zs}{E}\right)^2 = \left(\frac{(1.96)(29.00)}{2.00}\right)^2 = \left(\frac{56.84}{2.00}\right)^2 = (28.42)^2 \approx 808$$
Sample Size Formula - Example
$$n = \left(\frac{zs}{E}\right)^2 = \left(\frac{(1.96)(29.00)}{4.00}\right)^2 = \left(\frac{56.84}{4.00}\right)^2 = (14.21)^2 \approx 202$$
Calculating Sample Size
99% Confidence
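With E = 2:
$$n = \left(\frac{(2.57)(29)}{2}\right)^2 = \left(\frac{74.53}{2}\right)^2 = [37.265]^2 \approx 1389$$
With E = 4:
$$n = \left(\frac{(2.57)(29)}{4}\right)^2 = \left(\frac{74.53}{4}\right)^2 = [18.6325]^2 \approx 347$$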
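The four worked examples can be checked with a short sketch of n = (zs/E)². The function name is illustrative, and rounding to the nearest whole number is an assumption made to match the figures on the slides; some texts round up instead.

```python
def sample_size_mean(z, s, e):
    """n = (z * s / E)^2, rounded to the nearest whole number as on the slides."""
    return round((z * s / e) ** 2)

# z = 1.96 for 95% confidence and z = 2.57 for 99%, as used in the examples
n_95_e2 = sample_size_mean(1.96, 29.00, 2.00)
n_95_e4 = sample_size_mean(1.96, 29.00, 4.00)
n_99_e2 = sample_size_mean(2.57, 29.00, 2.00)
n_99_e4 = sample_size_mean(2.57, 29.00, 4.00)
print(n_95_e2, n_95_e4, n_99_e2, n_99_e4)
```

Doubling the acceptable error from $2.00 to $4.00 cuts the required sample size to about one quarter at either confidence level.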
Standard Error of the Proportion
$$S_p = \sqrt{\frac{pq}{n}} \quad \text{or} \quad S_p = \sqrt{\frac{p(1-p)}{n}}$$
Confidence Interval for a
Proportion
$$p \pm Z_{cl} S_p$$
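A minimal sketch of the standard error of the proportion, √(p(1−p)/n), and the resulting interval p ± Z_cl·S_p. The sample proportion of 0.6 and sample size of 100 are hypothetical, and the function names are illustrative.

```python
import math

def standard_error_proportion(p, n):
    """S_p = sqrt(p * q / n), where q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

def confidence_interval_proportion(p, n, z_cl):
    """Interval p +/- z_cl * S_p."""
    margin = z_cl * standard_error_proportion(p, n)
    return p - margin, p + margin

# Hypothetical sample: proportion 0.6 observed in n = 100, 95% level (z = 1.96)
s_p = standard_error_proportion(0.6, 100)
low, high = confidence_interval_proportion(0.6, 100, 1.96)
print(s_p, low, high)
```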
Sample Size for a Proportion
$$n = \frac{Z^2 pq}{E^2}$$
Where:
n = Number of items in the sample
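The proportion formula n = Z²pq/E² can be sketched in the same way. The inputs (p = 0.5, E = 0.05, 95 percent confidence) are hypothetical, and rounding to the nearest whole number is an assumption carried over from the worked examples for the mean.

```python
def sample_size_proportion(z, p, e):
    """n = z^2 * p * q / E^2, with q = 1 - p, rounded to the nearest whole number."""
    q = 1 - p
    return round(z ** 2 * p * q / e ** 2)

# Hypothetical inputs: estimated proportion 0.5, allowable error 0.05, z = 1.96
n_prop = sample_size_proportion(1.96, 0.5, 0.05)
print(n_prop)
```

Setting p = 0.5 maximizes the product pq, so it gives the most conservative sample size when no prior estimate of the proportion is available.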