Chapter 7
Sampling
Chapter 7 –Sampling V.E.S College of Arts, Science & Commerce
Advantages of Sampling
1. Sampling is cheaper than a census survey. It is obviously more economical, for
instance, to cover a sample of households than all households in a territory
although the cost per unit of study may be higher in a sample survey than in a
census.
2. Since the magnitude of operations involved in a sample survey is small, both the
execution of the fieldwork and the analysis of the results can be carried out
speedily.
3. Sampling results in greater economy of effort, as a relatively small staff is
required to carry out the survey and to tabulate and process the survey data.
4. A sample survey enables the researcher to collect more detailed information
than would otherwise be possible in a census survey. Also, information of a
more specialised type can be collected, which would not be possible in a
census survey on account of availability of a small number of specialists.
5. Since the scale of operations involved in a sample survey is small, the quality of
interviewing, supervision and other related activities can be better than the
quality in a census survey.
Limitations of Sampling
1. When the information is needed on every unit in the population such as
individuals, dwelling units or business establishments, a sample survey cannot
be of much help for it fails to provide information on individual count.
2. Sampling gives rise to certain errors. If these errors are too large, the results of
the sample survey will be of extremely limited use.
3. While in a census survey it may be easy to check the omissions of certain units
in view of complete coverage, this is not so in the case of sample survey.
Step: Description
1. Define the population: The population is defined in terms of (a) elements,
(b) sampling units, (c) extent and (d) time.
2. Specify sampling frame: The means of representing the elements of the
population (for example, a telephone book, map, or city directory) are described.
3. Specify sampling unit: The unit for sampling (for example, city block,
company, or household) is selected. The sampling unit may contain one or
several population elements.
4. Specify sampling method: The method by which sampling units are to be
selected is described.
5. Determine sample size: The number of elements of the population to be
sampled is chosen.
6. Specify sampling plan: The operational procedures for selection of the
sampling units are specified.
7. Select the sample: The office work and fieldwork necessary for the selection
of the sample are carried out.
Step 1: Define the population
The population is the aggregate of all elements, defined prior to selection of the
sample. A population must be defined in terms of
• elements,
• sampling units,
• extent and
• time.
Eliminating any one of these specifications leaves an incomplete definition of the
population that is to be sampled.
It may be pointed out that these four criteria come into conflict with each other in
most of the cases, and the researcher should carefully balance the conflicting
criteria so that he is able to select a really good sample design.
Sampling Techniques
Sampling techniques may be broadly classified as non-probability and probability
sampling techniques.
from which the sample was drawn. Probability sampling techniques are
classified based on:
− Element versus cluster sampling
− Equal unit probability versus unequal probabilities
− Unstratified versus stratified selection
− Random versus systematic selection
− Single-stage versus multistage techniques
Non-probability techniques:
Convenience Sampling
Definition
A non-probability sampling technique that attempts to obtain a sample of convenient
elements. The selection of sampling units is left primarily to the interviewer.
Explanation
1. It is a form of Non-Probability sampling.
2. It is mainly used for Dipstick studies. This type of sampling is normally used to
get basic information to take elementary decisions.
3. Convenience samples are often used in exploratory situations when there is a
need to get only an approximation of the actual value quickly and inexpensively.
4. Commonly used Convenience samples are associates and “the man on the
street”. Such samples are often used in the pre-test phase of the study, such as
pre-testing of a questionnaire.
Examples:
• Use of students, church groups, and members of social organizations,
• Mall-intercept interviews without qualifying the respondents,
• Department stores using charge account lists
• Tear-out questionnaires included in magazines, and
• "People on the street" interviews
Advantages
• Convenience sampling is the least expensive and least time consuming of all
sampling techniques.
• The sampling units are accessible, easy to measure and co-operative.
• This technique is used in exploratory research for generating ideas, insight or
hypothesis.
Disadvantages
• Convenience samples contain unknown amounts of both variable and
systematic selection errors.
• These errors can be very large when compared to the variable error in a simple
random sample of the same size.
Judgmental sampling
Definition
A form of convenience sampling in which the population elements are purposively
selected based on the judgment of the researcher.
Explanation
A judgment sample is one in which there is an attempt to draw a representative
sample of the population using judgmental selection procedures. Judgment samples
are common in industrial market research.
Example
A sample of addresses taken by the municipal agency to which questionnaires on
bicycle riding habits were sent. A judgment sample was taken after researchers
looked at traffic maps of the city, considered the tax assessment on houses and
apartment buildings (per unit), and kept location of schools and parks in mind.
Advantages
• Judgmental sampling is low cost, convenient and quick.
• Judgmental sampling is subjective and its value depends entirely on the
researcher's judgment, expertise and creativity.
• It is useful if broad population inferences are not required.
Disadvantage
• It does not allow direct generalization to a specific population, usually because
the population is not defined explicitly.
Quota Sampling
Definition
A non-probability sampling technique that is a two-stage restricted judgmental
sampling. The first stage consists of developing control categories, or quotas, of
population elements. In the second stage, sample elements are selected based on
convenience or judgment.
Explanation
• It is a form of Non-Probability sampling.
• In Quota Sampling, the samples are selected in such a way that the interest
parameters represented in the sample are in the same proportion as they are in
the universe/ population.
• Quota Sampling is widely used in consumer panels.
• The following aspects must be kept in mind while choosing the control variables:
− The variables must be available and should be recent.
− They should be easy for the interviewer to classify.
− They should be closely related to the variable being measured in the study.
− The number of variables must be kept to a reasonable number so as to avoid
confusion while analyzing the data.
The cost of sample per unit is directly proportional to the number of control variables.
In order to have a check mechanism about the quality of samples taken so as to
reduce the selection errors, Quota Samples are “validated” after they are taken.
The process of validation involves a comparison of the sample and the population
with respect to characteristics not used as control variables. For example, in a quota
sample taken from a consumer panel for which income, education, and age group
are used as control variables, a comparison of the panel and the population
might be made with respect to such characteristics as average number of children,
occupation of the chief wage earner and home ownership. If the panel differed
significantly from the population with respect to any of these characteristics, it would
be an indication of potential bias in the selection procedures. It should be noted
that similarity does not necessarily mean the absence of bias.
Example
Suppose one wants to select a quota sample of persons for a test of flavored tea and
wants to control it by ethnic background, income bracket, age group and
geographical area (control variables are the parameters by which the researcher
would like to classify the universe). The sample taken would then have the same
proportion of people in each ethnic background, income bracket, age group and
geographical area as the population.
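The proportional allocation described in this example can be sketched in a few lines of Python. This is only an illustration: the function name `quota_allocation`, the age-group labels and the population counts are hypothetical, and a single control variable is used for brevity. Leftover units are assigned by largest-remainder rounding, which is one reasonable choice among several.

```python
def quota_allocation(population_counts, sample_size):
    """Split sample_size across cells in proportion to population counts.

    Largest-remainder rounding is used so the quotas add up to sample_size.
    """
    total = sum(population_counts.values())
    raw = {cell: sample_size * count / total
           for cell, count in population_counts.items()}
    quotas = {cell: int(value) for cell, value in raw.items()}
    shortfall = sample_size - sum(quotas.values())
    # Give the leftover units to the cells with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda cell: raw[cell] - quotas[cell],
                          reverse=True)
    for cell in by_remainder[:shortfall]:
        quotas[cell] += 1
    return quotas


# Hypothetical population counts for one control variable (age group).
age_counts = {"18-29": 3000, "30-44": 3500, "45-59": 2000, "60+": 1500}
quotas = quota_allocation(age_counts, sample_size=200)
print(quotas)  # {'18-29': 60, '30-44': 70, '45-59': 40, '60+': 30}
```

With several control variables, the same function could be applied to the cross-classified cells (for example, age group by income bracket), provided the population count of each cell is known.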
Disadvantages
• Scope for high variances
• Scope for sizable selection errors.
• Selection errors arise from the way interviewers select the persons to fill the
quota, from incorrect information on the proportions of the population in each of
the control variables, from biases in the relationship of the control variables to the
variables being measured, and from other sources.
Probability Techniques:
Probability sampling techniques vary in terms of sampling efficiency. Sampling
efficiency is a concept that reflects a trade-off between sampling cost and precision.
Precision refers to the level of uncertainty about the characteristic being measured.
The greater the precision, the greater the cost, and most studies require a trade-off.
In simple random sampling (SRS), every element is selected independently of every other element. The sample is
drawn by a random procedure from a sampling frame. This method is equivalent to a
lottery system in which names are placed in a container, the container is shaken, and
the names of the winners are then drawn out in an unbiased manner.
To draw a simple random sample, the researcher first compiles a sampling frame in
which each element is assigned a unique identification number. Then random
numbers are generated to determine which element to include in the sample. The
random numbers may be generated with a computer routine or a table.
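The procedure above (number the frame, then let random numbers pick the elements) can be sketched with Python's standard `random` module. The frame of 1,000 identification numbers, the sample size of 50 and the function name `simple_random_sample` are assumptions made purely for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, each equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame: identification numbers 1..1000.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, n=50, seed=7)
print(sorted(sample)[:5])
```

`random.sample` draws without replacement, which matches the lottery analogy: once a name is drawn it is not returned to the container.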
Advantages
• It is easy to understand
• The sample result may be projected to the target population.
Disadvantages
• It is often difficult to construct a sampling frame that will permit a simple random
sample to be drawn.
• SRS can result in samples that are very large or spread over large geographic
areas, thus increasing the time and cost of data collection.
• SRS often results in lower precision with larger standard errors than other
probability sampling techniques.
• SRS may or may not result in a representative sample. Although samples drawn
will represent the population well on average, a given simple random sample
may grossly misrepresent the target population. This is more likely if the size of
the sample is small.
Systematic sampling
Definition
A probability sampling technique in which the sample is chosen by selecting a
random starting point and then picking every ith element in succession from the
sampling frame.
Explanation
In systematic sampling, the sample is chosen by selecting a random starting point
and then picking every ith element in succession from the sampling frame. The
sampling interval, i, is determined by dividing the population size N by the sample
size n and rounding to the nearest integer.
Example
Suppose there are 100,000 elements in the population and a sample of 1,000 is
desired. In this case the sampling interval, i, is 100. A random number between 1 and
100 is selected. If, say, number 23 is selected, the sample will then consist of
elements 23, 123, 223, 323, 423, 523, and so on.
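The example can be reproduced with a short Python sketch. The function name `systematic_sample` and its parameters are hypothetical; elements are identified by their position (1 to N) in the sampling frame. Note that Python's `round` resolves exact halves to the nearest even integer, which may differ from other rounding conventions.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the positions (1-based) selected by systematic sampling."""
    # Sampling interval i = N / n, rounded to the nearest integer.
    interval = round(population_size / sample_size)
    if start is None:
        # Random starting point between 1 and the interval, inclusive.
        start = random.Random(seed).randint(1, interval)
    return list(range(start, population_size + 1, interval))


# The example from the text: N = 100,000, n = 1,000, random start 23.
sample = systematic_sample(100_000, 1_000, start=23)
print(sample[:6])   # [23, 123, 223, 323, 423, 523]
print(len(sample))  # 1000
```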
Systematic sampling is similar to SRS in that each population element has a known
and equal probability of selection. However, it is different from SRS in that only the
permissible samples of size n that can be drawn have a known and equal probability
of selection. The remaining samples of size n have a zero probability of being
selected.
For systematic sampling, the researcher assumes that the population elements are
ordered in some respect. In some cases the ordering (alphabetic listing in a
telephone book) is unrelated to the characteristic of interest. In other instances, the
ordering is directly related to the characteristic under investigation (credit card
customers may be listed in order of outstanding balances). If the population elements
are arranged in a manner unrelated to the characteristic of interest, systematic
sampling will yield results quite similar to SRS.
On the other hand, when the ordering of the elements is related to the characteristic
of interest, systematic sampling increases the representativeness of the sample.
Advantages
• Systematic sampling is less costly and easier than SRS, because random
selection is done only once.
• The random numbers do not have to be matched with individual elements as in
SRS. Since some lists contain millions of elements, considerable time can be
saved. This in turn reduces the cost.
would be taken separately for sampling purposes. That is, the total population could
be divided into age groups and a separate sample is drawn from each group.
Cluster Sampling
Definition
The target population is divided into mutually exclusive and collectively exhaustive
subpopulations called clusters. Then a random sample of clusters is selected based
on probability sampling techniques such as simple random sampling. For each
selected cluster, either all the elements are included in the sample or a sample of
elements is drawn probabilistically.
Explanation
• If all the elements in each selected cluster are included in the sample, the
procedure is called one-stage cluster sampling.
• If a sample of elements is drawn probabilistically from each selected cluster, the
procedure is called two-stage cluster sampling.
• The key distinction between cluster sampling and stratified sampling is that in
cluster sampling only a sample of subpopulations (clusters) is chosen, whereas
in stratified sampling all the subpopulations are selected.
• The objective of the cluster sampling is to increase the sampling efficiency by
decreasing costs.
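The difference between one-stage and two-stage cluster sampling can be sketched as follows. Everything concrete here is hypothetical: the city of 50 blocks with 40 households each, the function name `cluster_sample`, and the use of simple random sampling at both stages.

```python
import random


def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Select clusters at random, then take all or some of their elements.

    per_cluster=None -> one-stage: every element of each chosen cluster.
    per_cluster=k    -> two-stage: a random sample of k elements per cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        elements = clusters[name]
        if per_cluster is None:
            sample.extend(elements)
        else:
            sample.extend(rng.sample(elements, min(per_cluster, len(elements))))
    return chosen, sample


# Hypothetical city: 50 blocks, each containing 40 households.
city = {
    f"block-{b:02d}": [f"block-{b:02d}/house-{h:02d}" for h in range(1, 41)]
    for b in range(1, 51)
}

blocks_1, one_stage = cluster_sample(city, n_clusters=10, seed=1)
blocks_2, two_stage = cluster_sample(city, n_clusters=10, per_cluster=10, seed=1)
print(len(one_stage))  # 400 households: all 40 in each of 10 blocks
print(len(two_stage))  # 100 households: 10 in each of 10 blocks
```

In both cases only the chosen blocks need to be visited, which is where the cost saving comes from.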
Example
If the study requires studying the households in a city, then in cluster sampling the
whole city is divided into blocks and every household on each selected block is
taken, so as to obtain a representative picture of the universe.
Advantages
• Low population heterogeneity / high population homogeneity
• Low expected cost of errors.
• The main advantage of cluster sampling is the low cost per sampling unit as
compared to other sampling methods.
Disadvantage
• High potential of sampling error as compared to other methods.
• For example, the lower cost per unit and higher sampling error potential of a
cluster sample can be illustrated by considering a sample of 100 households to be
selected for personal interviews from a particular city. In this method the city
would be divided into blocks, and 10 households from each of 10 selected blocks
would be selected and interviewed. The cost of personal interviews per unit will be
low because of the close proximity of the units in each cluster. However, this
sample may not be an exact representation of the entire city, so there is a
possibility of sampling error.
Probability Sampling
In exploratory research the findings are treated as preliminary and the use of
probability sampling may not be warranted. On the other hand, in conclusive
research in which the researcher wishes to use the results to estimate overall market
shares or the size of the total market, probability sampling is favored. Probability
samples allow statistical projection of the results to a target population.
This method concentrates on the cost of the information and is not concerned
about its value. Although cost always has to be considered in any systematic
approach to sample size determination, one also needs to give consideration to
how much the information to be provided by the sample will be worth. This
approach produces sample sizes that are larger than required as well as sizes
that are smaller than optimal.
3. Required Size Per Cell: This method of determining sample size can be used on
simple random, stratified random, purposive and quota samples. For example,
in a study of attitudes with respect to fast food establishments in a local
marketing area it was decided that information was desired for two occupational
groups and for each of the four age groups. This resulted in 2 x 4 = 8 sample
cells. A sample size of 30 was needed per cell for the types of statistical
analyses that were to be conducted. The overall sample size was therefore 8 x
30 = 240.
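The arithmetic of the required-size-per-cell method generalises to any number of classification variables: multiply the number of levels of each variable to get the number of cells, then multiply by the size needed per cell. A minimal sketch, with the hypothetical function name `required_sample_size`:

```python
import math


def required_sample_size(levels_per_variable, per_cell):
    """Number of cells (product of levels) times the size needed per cell."""
    cells = math.prod(levels_per_variable)
    return cells, cells * per_cell


# The example from the text: 2 occupational groups x 4 age groups, 30 per cell.
cells, total = required_sample_size([2, 4], per_cell=30)
print(cells, total)  # 8 240
```

Adding a third variable with, say, 3 levels would triple the requirement, which is why the number of classification variables is usually kept small.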
4. Use of Traditional Statistical Model: The formula for traditional statistical model
depends upon the type of sample to be taken and it always incorporates three
common variables
• an estimate of the variance in the population from which the sample is to be
drawn,
• the error from sampling that the researcher will allow, and
• the desired level of confidence that the actual sampling error will be within the
allowable limits.
The statistical models for simple random sampling include estimation of means
and estimation of proportion.
5. Use of Bayesian Statistical Model: The Bayesian model involves finding, for
each sample size considered, the difference between the expected value of the
information to be provided by the sample and the cost of taking it. This
difference is known as the expected net gain from sampling (ENGS). The sample
size with the largest positive ENGS is chosen.
The Bayesian model is not as widely used as the traditional statistical models
for determining sample size, even though it incorporates the cost of sampling
and the traditional models do not. The reasons for the relatively infrequent use of
the Bayesian model are related to its greater complexity and the perceived
difficulty of making the estimates required for it as compared to the
traditional models.
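The selection rule can be sketched in Python, taking ENGS as the expected value of the sample information minus the cost of sampling. All figures below are invented for illustration; in practice, estimating the expected value of information for each candidate size is the difficult part, and the sketch does not attempt it. The function name `best_sample_size` is hypothetical.

```python
def best_sample_size(candidates):
    """Pick the sample size with the largest positive ENGS.

    candidates maps sample size -> (expected value of information, cost).
    Returns (n, ENGS), or (None, None) if no size has a positive ENGS.
    """
    engs = {n: value - cost for n, (value, cost) in candidates.items()}
    best_n = max(engs, key=engs.get)
    if engs[best_n] <= 0:
        return None, None  # sampling is not worthwhile at any candidate size
    return best_n, engs[best_n]


# Hypothetical figures in rupees: (expected value of information, cost).
candidates = {
    100: (5000, 3000),
    200: (8000, 5000),
    400: (10000, 9000),
    800: (11000, 17000),
}
print(best_sample_size(candidates))  # (200, 3000)
```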
The standard deviation of a sampling distribution of the mean is called the standard
error of the mean, to indicate that it applies to a distribution of sample means and
not to a single sample or a population.
A basic characteristic of a sampling distribution is that the area under it
(between any two points) can be calculated so long as each point is defined by the
number of standard errors it is away from the mean. The number of standard errors a
point is away from the mean is referred to as the Z value for that point.
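As an illustration, the Z value of a point and the area between two Z values can be computed with Python's standard library. The sketch assumes the sampling distribution is normal; the function names and the numbers in the usage lines are hypothetical.

```python
from statistics import NormalDist


def z_value(point, mean, standard_error):
    """Number of standard errors the point lies away from the mean."""
    return (point - mean) / standard_error


def area_between(z_low, z_high):
    """Area under the standard normal curve between two Z values."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


# Hypothetical sampling distribution: mean 100, standard error 5.
print(z_value(110, mean=100, standard_error=5))  # 2.0
print(round(area_between(-1.96, 1.96), 3))       # 0.95
```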
FORMULA:
The estimated standard error of the proportion (given a large sample size that is a
small proportion of the population) is
FORMULA:
1. Specification of error (e) that can be allowed –how close must the estimate be
(how accurate do we need to be)?
2. Specification of confidence coefficient –what level of confidence is required that
the actual sampling error does not exceed that specified (how sure do we want
to be that we have achieved our desired accuracy)?
3. Estimate of the population standard deviation (σ) –what is the standard deviation
of the population (how “spread out” or diverse is the population)?
FORMULA:
The only unknown variable is sample size (n). A simpler formula for the size of
simple random samples can be derived from the above equation.
FORMULA:
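The formula itself is not legible in this copy. A sketch using the usual textbook form, n = Z²σ²/e² (an assumption here, obtained by setting Z = e / (σ/√n) and solving for n), shows how the three specifications combine. The function name and the example figures are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(e, sigma, confidence=0.95):
    """Sample size for estimating a mean, assuming n = (Z * sigma / e) ** 2.

    e          allowable error, in the units of the variable
    sigma      estimate of the population standard deviation
    confidence confidence coefficient, e.g. 0.95
    """
    # Two-sided Z value implied by the confidence coefficient.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / e) ** 2)  # round up to a whole element


# Hypothetical problem: allowable error 2, sigma 20, 95% confidence.
print(sample_size_for_mean(e=2, sigma=20))  # 385
```

Halving the allowable error roughly quadruples the required sample size, because e enters the formula squared.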
The specifications that must be made to determine the sample size for an estimation
problem involving a proportion are very similar to those for a mean. They are
1. Specification of error (e) that can be allowed –how close must the estimate be?
2. Specification of confidence coefficient –what level of confidence is required that
the actual sampling error does not exceed that specified?
3. Estimate of the population proportion (P) using prior information –what is the
approximate or estimated population proportion?
Specifications, along with the sample size, collectively determine the sampling
distribution for the problem. Because sample size is the only remaining unknown, it
can be calculated. The above mentioned three specifications are related as follows:
Number of standard errors implied by confidence coefficient = allowable error / standard error
or, in symbols,
$Z = e / \sigma_p$
where $\sigma_p$ denotes the standard error of the proportion.
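Solving this relation for the sample size gives n = Z²P(1−P)/e², assuming the usual textbook form of the standard error of a proportion, √(P(1−P)/n). That form is an assumption here, since the formula is not legible in this copy. A minimal Python sketch, with a hypothetical function name and example figures:

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(e, p, confidence=0.95):
    """Sample size for estimating a proportion: n = Z**2 * P * (1 - P) / e**2.

    e          allowable error, as a proportion (0.05 = 5 percentage points)
    p          prior estimate of the population proportion
    confidence confidence coefficient, e.g. 0.95
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


# Hypothetical problem: P estimated at 0.5, allowable error 0.05, 95% confidence.
print(sample_size_for_proportion(e=0.05, p=0.5))  # 385
```

When no prior information on P is available, P = 0.5 is the conservative choice, because P(1−P) is largest there.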
In order to determine the sample size in a hypothesis testing problem involving
proportions, the following specifications must be made:
1. the hypotheses to be tested: A null and an alternate hypothesis are involved in
each hypothesis test. A null hypothesis, designated by Ho, is one that, if
accepted, will result in no opinion being formed and/or action being taken that is
different from those currently held or being used. The null hypothesis in the
problem just described is
Ho: order rate = 3.5%
The alternate hypothesis, designated by H1, is one that will lead to opinions
being formed and/or actions being taken that are different from those currently
held or being used. The alternate hypothesis here is
H1: order rate = 5.0%
Although the null hypothesis is always explicitly stated, this is sometimes not true of
the alternate hypothesis. In those instances when it is not stated it is understood
that it consists of all values of the proportion not reserved by the null hypothesis.
In this situation if the alternate hypothesis were not explicitly stated, it would be
understood that it would be
H1: order rate ≠ 3.5%
2. the level of sampling error permitted in the test of each hypothesis: Two types of
error can be made in hypothesis testing problems. A Type I error is made when the
null hypothesis is true but the conclusion is reached that the alternate hypothesis
should be accepted. A Type II error is made when the alternate hypothesis is true
but the conclusion is reached that the null hypothesis should be accepted.
3. the test statistic to be used.
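Once these specifications are made, the sample size follows. The sketch below assumes a one-tailed Z test and the common textbook formula n = [(Zα√(P0(1−P0)) + Zβ√(P1(1−P1))) / (P1 − P0)]². Neither the formula nor the error levels appear in this copy of the notes: α = 0.05 and β = 0.10 are hypothetical choices, and only the two order rates come from the hypotheses stated above.

```python
import math
from statistics import NormalDist


def sample_size_for_test(p0, p1, alpha, beta):
    """Sample size for a one-tailed Z test of H0: P = p0 against H1: P = p1.

    alpha  permitted probability of a Type I error
    beta   permitted probability of a Type II error
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    z_beta = std_normal.inv_cdf(1 - beta)
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p1 * (1 - p1)))
    return math.ceil((numerator / abs(p1 - p0)) ** 2)


# Order rates from the text; alpha and beta are assumed for illustration.
n = sample_size_for_test(p0=0.035, p1=0.05, alpha=0.05, beta=0.10)
print(n)
```

Tightening either error level, or bringing the two hypothesised rates closer together, increases the required sample size.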