OL-Chapter 4

Chapter 4 discusses the importance of sampling design in research, outlining the processes of selecting a sample from a population to draw conclusions. It defines key concepts such as population, sample, and various sampling methods, including probability and non-probability sampling techniques. Additionally, it highlights the significance of sample size, accuracy, and precision in ensuring that the sample effectively represents the population being studied.

Uploaded by

newaybeyene5
CHAPTER 4

SAMPLING DESIGN
Once the researcher has clearly specified the problem and developed the appropriate
design and data-collection instrument, the next step is to select the elements from which
information is collected. To undertake a research project effectively, the researcher must
have data; without data the analysis will not be convincing. Data can be collected from
either the entire population or a sample.
The basic idea of sampling is that by selecting some of the elements in the population, we
may draw conclusions about the entire population.
Population: The totality of objects or phenomena under consideration for a specific study. A population is the total collection of elements about which we wish to make some inferences.
Census: A census is a count of all the elements in a population. It is a survey that includes the totality of objects, subjects or phenomena. It is not always possible to undertake a census (a complete enumeration of all items in the population), particularly when the population is very large, so one has to resort to a sample survey to generate the data required for the investigation.
Sample: A proper subset, or part, of the population that is used to represent the population. A population element is the subject on which the measurement is being taken. It is the unit of study.
Sampling: The procedure of selecting a sample from a population. Sampling aims at obtaining consistent and unbiased estimates of the population parameters. The aim of a sample survey is to learn not just the characteristics of the sample but also those of the population from which the sample has been drawn.
Why sample?
There are several reasons for sampling:
➢ Lower cost
➢ Greater accuracy of results
➢ Greater speed of data collection
➢ Availability of population elements.
SAMPLING CONCEPTS AND TERMINOLOGIES:
Sampling element: The unit of analysis or case in the population. It is the unit from which information is collected and which provides the basis for analysis, that is, the subject on which measurement is being taken. It can be a person, a group or an organization.
Sampling unit: The element or set of elements considered for selection at some stage of sampling. Ex: commercial banks in Ethiopia are sampling units.
Sampling frame: The actual list of sampling units from which the sample is drawn. It is closely related to the population.

Sampling ratio: Size of the sample / size of population.
Steps in sampling design:
1. Define the target population
It is first necessary to define the target population, the collection of elements about which the researcher wishes to make an inference. Research elements are the objects on which the measurements are taken. The target population may be children, individuals, households, business firms, etc. The simpler the definition of the target population, the higher the incidence and the easier and less costly it is to find the sample. Incidence refers to the percentage of the general population that satisfies the criteria defining the target population. When incidence is high, the cost and time to collect data are minimized.
2. Identify the sampling frame
It is the list of the elements from which the actual sample will be drawn. Example: a telephone directory.
3. Select a sampling procedure
The choice of sampling method depends largely on what the researcher can develop for a
sampling frame. For example; a simple random sample requires that a complete, accurate
list of population elements by name or other identification code be available.
4. Determine the sample size
It depends on the homogeneity of the population (its dispersion or variance), the degree of confidence and precision required, the number of subgroups to be studied, and cost and time factors.
5. Select the sample elements
The researcher needs to choose the elements that will be included in the study. This
depends upon the type of sample being used and the sampling method.
6. Collect the data from the designated elements
Appropriate data collection methods are adopted to gather information from the sample elements, which supports the further analysis and the conclusions of the study.
Characteristics of good sampling:
The logic of the theory of sampling is the logic of induction, i.e., one proceeds from the particular (the sample) to the general (the population), and all results are expressed in terms of probability. If the population is homogeneous, there is no need for a careful sampling procedure; any sample would be sufficient. But when faced with variation or heterogeneity in the population, more controlled sampling procedures are required. A good sample must be as representative of the entire population as possible, and ideally it must provide the whole of the information about the population from which it has been drawn. So the ultimate test of a sample design is how well it represents the characteristics of the population it purports to represent. In measurement terms the

sample must be valid. Validity of the sample depends on two factors: accuracy and
precision.
a. Accuracy: The degree to which bias is absent from the sample. When the sample is drawn properly, some sample elements underestimate the population values being studied and others overestimate them. An accurate (unbiased) sample is one in which the underestimations and overestimations are balanced among the members of the sample. There is no systematic variance with an accurate sample. Systematic variance is defined as the variation in measures due to some known or unknown influences that cause the scores to lean in one direction more than another.
b. Precision: The degree to which the standard error is minimized. No sample will fully represent its population in all respects. A sample statistic may be expected to differ from its parameter as a result of random fluctuations inherent in the sampling process. This is referred to as error variance or sampling error. Precision is measured by the standard error of estimate, a type of standard deviation measurement; the smaller the standard error of estimate, the higher the precision of the sample. The ideal sample design produces a small standard error of estimate.
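The distinction between accuracy (absence of bias) and precision (small standard error) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population of the integers 1 to 1000, the sample sizes and the helper name `sampling_distribution` are all hypothetical and do not come from the text.

```python
import random
import statistics

def sampling_distribution(population, n, draws, seed=0):
    """Draw `draws` simple random samples of size n and return their means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(draws)]

# Hypothetical population: the integers 1..1000.
population = list(range(1, 1001))
true_mean = statistics.mean(population)

means_small = sampling_distribution(population, n=25, draws=2000)
means_large = sampling_distribution(population, n=100, draws=2000)

# Accuracy: the average of the sample means should sit close to the true mean.
bias_small = statistics.mean(means_small) - true_mean

# Precision: the spread of the sample means (an empirical standard error)
# should shrink as the sample size grows.
se_small = statistics.stdev(means_small)
se_large = statistics.stdev(means_large)

print(f"bias (n=25): {bias_small:.2f}")
print(f"standard error n=25: {se_small:.1f}, n=100: {se_large:.1f}")
```

Both designs are unbiased, so the two differ mainly in precision: the larger sample gives the smaller standard error.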
Categories of sampling procedures:
There are two categories of sampling procedures: Random (Probabilistic) and Non-
random (Non-Probabilistic).
Probability sampling: A sampling technique in which every member of the population has a known, non-zero probability of selection; in simple random sampling these probabilities are equal.
Non-probability sampling: A sampling technique in which units of the sample are selected on the basis of personal judgment or convenience.
I. Probability sampling techniques:
1. Simple random sampling: A simple random sample is a sample selected in such a way that every element in the population has the same chance of being chosen, and every sample of size ‘n’ has the same chance of being chosen. Each population element has a known and equal chance of being selected. It is the same as the practice of picking lottery winners. The disadvantages of this method are that it requires a listing of the entire population of interest and that it is too expensive for a national face-to-face survey.
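As a minimal sketch of simple random sampling, the following draws n units without replacement from a complete sampling frame using Python's standard `random.sample`. The frame of 100 bank identifiers and the function name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Return n distinct units drawn with equal probability from the frame."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: a complete list of 100 unit identifiers.
frame = [f"bank_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

The need for `frame` to be a complete list is exactly the practical drawback noted above: without a full listing of the population, this method cannot be applied.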
2. Systematic sampling: A systematic random sample is a sample which contains every i-th element of the population. The first element is chosen randomly, the rest systematically. It is one of the most widely used probability sampling techniques. Its major advantages are its simplicity and flexibility, and it is less expensive and easy to carry out. It may result in error when the population is not uniform or homogeneous, or when the list has a systematic arrangement. Example: to take four elements from 20, the sampling interval is i = 20/4 = 5; the first element is chosen at random, and every 5th element after it is then included in our sample.
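The 20-element example can be sketched in a few lines. This is an illustrative sketch that assumes the population size is an exact multiple of the sample size and that the random start is taken within the first interval; the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then every k-th element."""
    k = len(frame) // n              # sampling interval, e.g. 20 // 4 = 5
    rng = random.Random(seed)
    start = rng.randrange(k)         # random position within the first interval
    return [frame[start + j * k] for j in range(n)]

frame = list(range(1, 21))           # elements numbered 1..20
sample = systematic_sample(frame, 4, seed=1)
print(sample)
```

Whatever the random start, consecutive selected elements are always exactly five positions apart.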

3. Stratified random sampling: A stratified sample is a probability sample that is distinguished by the following two-step procedure:
(1) The parent population is divided into mutually exclusive and exhaustive subsets.
(2) A simple random sample of elements is chosen independently from each group or subset.
The subsets into which the universe elements are divided are called strata or subpopulations. The division is mutually exclusive and exhaustive. This means that every population element must be assigned to one and only one stratum and that no population elements are omitted in the assignment procedure.
Most populations can be segregated into a number of mutually exclusive subpopulations. Dividing the population into non-overlapping groups (strata) is called stratification. After a population is divided into the appropriate strata, a simple random sample can be taken from each stratum. The stratified sampling technique is particularly useful when we have heterogeneous populations. With stratification each stratum is homogeneous internally and heterogeneous with respect to other strata.
The researcher chooses a stratified random sample for the following reasons:
➢ To increase a sample’s statistical efficiency
➢ To provide adequate data for analyzing the various subpopulations
➢ To enable different research methods and procedures to be used in different strata.
The researcher must still decide whether to select
a) a proportionate stratified sample, or
b) a disproportionate stratified sample.
With proportionate stratified sampling the number of observations in the total sample is allocated among the strata in proportion to the relative number of elements in each stratum in the population. The sample drawn is proportionate to the stratum’s share of the total population.
Disproportionate stratified sampling involves balancing the two criteria of strata size and strata variability. With a fixed sample size, strata exhibiting more variability are sampled more than proportionately to their relative size. Conversely, those strata that are very homogeneous are sampled less than proportionately.
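A proportionate stratified sample can be sketched as follows. The three strata, their sizes and the function name `proportionate_stratified_sample` are hypothetical, and the rounding of each stratum's share is a simplifying assumption (with other sizes the rounded shares may not add up exactly to n).

```python
import random

def proportionate_stratified_sample(strata, n, seed=None):
    """Draw an independent simple random sample from every stratum,
    sized in proportion to the stratum's share of the population."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / total)   # proportional share (rounded)
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical population of 1,000 units split into three strata.
strata = {
    "small_firms":  [f"S{i}" for i in range(500)],
    "medium_firms": [f"M{i}" for i in range(300)],
    "large_firms":  [f"L{i}" for i in range(200)],
}
sample = proportionate_stratified_sample(strata, 50, seed=7)
print({name: len(units) for name, units in sample.items()})
```

A disproportionate design would replace the proportional share `n_h` with a share that also reflects each stratum's variability.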
4. Cluster sampling: If the total area of interest happens to be a big one, a convenient way in which a sample can be taken is to divide the area into a number of smaller non-overlapping areas and then to randomly select a number of these areas. More generally, the population can be divided into groups or clusters of elements, and some of the clusters are selected randomly. This is cluster sampling. Cluster sampling reduces cost by concentrating surveys in selected areas.
Cluster sampling differs from stratified sampling in several ways:
Stratified sampling:
➢ We divide the population into a few subgroups, each with many elements in it. The subgroups are selected according to some criterion that is related to the variables under study.
➢ We try to secure homogeneity within subgroups and heterogeneity between subgroups.
➢ We randomly choose elements from within each subgroup.
Cluster sampling:
➢ We divide the population into many subgroups, each with a few elements in it. The subgroups are selected according to some criterion of ease or availability in data collection.
➢ We try to secure heterogeneity within subgroups and homogeneity between subgroups.
➢ We randomly choose a number of subgroups, which we then study in depth.

5. Area sampling
When research involves a population identified with a geographical area, it is advisable to use area sampling. It is an important form of cluster sampling. Suppose the investigator is interested in estimating the consumption per household in the city of Chicago, and how consumption is related to family income. One approach is:
1. Choose a simple random sample of ‘n’ blocks from the population of N blocks.
2. Determine consumption and income for all households in the selected blocks and generalize the sample relationships to the larger population.
This is called one-stage area sampling. In two-stage area sampling the selected areas themselves are subsampled.
Example:
Consider a universe of 100 blocks. Suppose that there are 20 households in each block. Assume that a sample of 80 households is required from this total population of 2,000 households. The overall sampling fraction is thus 80/2,000 = 1/25. There are a number of ways by which the sample can be selected:
1. Selecting 4 blocks and all 20 households per block (one-stage area sampling)
2. Selecting 10 blocks and 8 households per block (two-stage area sampling)
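The two designs in this example can be sketched with one small function. The household labels and the function name `area_sample` are hypothetical; the block and household counts are those of the example (100 blocks of 20 households, 80 households required).

```python
import random

def area_sample(blocks, n_blocks, households_per_block=None, seed=None):
    """Select blocks at random; take every household in them (one-stage)
    or a random subsample within each selected block (two-stage)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(blocks), n_blocks)
    sample = []
    for block_id in chosen:
        households = blocks[block_id]
        if households_per_block is None:
            sample.extend(households)                      # one-stage
        else:
            sample.extend(rng.sample(households, households_per_block))
    return sample

# 100 blocks with 20 households each; a household is labelled (block, number).
blocks = {b: [(b, h) for h in range(20)] for b in range(100)}

one_stage = area_sample(blocks, 4, seed=0)        # 4 blocks x 20 households
two_stage = area_sample(blocks, 10, 8, seed=0)    # 10 blocks x 8 households
print(len(one_stage), len(two_stage))
```

Both designs yield 80 households, a sampling fraction of 80/2,000 = 1/25, but the two-stage sample is spread over more blocks.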

II. Non-probability sampling: The probability of selecting each population element is unknown, so we cannot estimate any range within which to expect the population parameter. There are a variety of ways to choose persons or cases to include in the sample. Despite the accepted superiority of probability sampling methods, non-probability sampling may be used when probability sampling is prohibitively expensive and when precise representation is not necessary.
• It is used because of cost and time requirements.
• It is used if there is no desire to generalize to a population parameter.
• The total population may not be available for the study in certain cases.
• It involves personal judgment somewhere in the selection process.
Different non-probability methods could be identified as follows:
1. Convenience sampling (accidental samples): The researcher selects anyone who is convenient. It can produce an ineffective, highly unrepresentative sample and is not recommended, since it is prone to high bias and systematic errors. It is the least reliable design but is cheap and easy to carry out. There is no control to ensure precision. Ex: the person on the street interviewed for a television programme.
2. Purposive sampling: When one draws a non-probability sample whose members are picked to conform to certain criteria, it is called purposive sampling. Purposive sampling can be of two types: judgment sampling and quota sampling.
a) Judgment sampling: It uses the judgment of experts in selecting cases, or it selects cases with a specific purpose in mind. However, the researcher does not know whether the cases selected represent the population. Ex: in a study of the standard of living, the cost of electricity, a refrigerator, a video recorder or a satellite dish cannot be included for all people in Ethiopia.
b) Quota sampling: Quotas are assigned to different strata or groups. The logic behind quota sampling is that certain relevant characteristics describe the dimensions of the population. In quota sampling, a researcher first identifies categories of people and then decides how many to get in each category. Thus the number of people in the various categories is fixed. It gives no assurance that the sample is representative on the variables being studied. As there is no element of randomization, the extent of sampling error cannot be estimated.
3. Snowball sampling: Also called network, chain referral or reputational sampling, it is a method for identifying and selecting the cases in a network. It begins with one or a few people or cases and spreads out on the basis of links to the initial cases. Snowball sampling is a judgment sample used to sample special populations. The researcher locates an initial set of respondents. These respondents are used as informants to identify others with the desired characteristics.
Problems in sampling:
a) Non-sampling errors: These refer to:
(i) Non-coverage error: This refers to sampling frame defects. Ex: omission of part of the population; soldiers, students and people in hospital are typically excluded from national surveys (but not series). It is serious in telephone surveys, since those who do not have a telephone are excluded. It also occurs when the lists used for sampling are incomplete.
(ii) The wrong population is sampled: Be sure that the group being sampled is drawn from the population about which conclusions are to be drawn. Ex: drawing a sample of college students to generalize about all college-age persons.
(iii) Non-response error: The response rate is low. Some people refuse to be interviewed because they are too busy, do not trust the interviewer, or are simply not interested.
(iv) Instrument errors: These arise from the instrument used to collect data (Ex: a questionnaire). Ex: when a questionnaire is badly worded, leading or carelessly worded questions may be misinterpreted.
(v) Interviewer errors: These occur when some characteristics of the interviewer (age, sex, etc.) affect the way in which the respondent answers questions. Ex: questions about racial discrimination might be answered differently depending on the racial group of the interviewer.
b) Sampling errors: Sampling error is the random variation in the sample estimates around the true population parameter. It can be calculated only for probability sampling. Random sampling allows unbiased estimates of sampling error. The measurement of sampling error is usually called the precision of the sampling plan.
What sample size is required?
The sample size should be a function of the variation in the population parameters under study and the estimating precision needed by the researcher. Some principles that influence sample size include:
➢ The greater the dispersion or variance within the population, the larger the sample
must be to provide estimation precision.
➢ The greater the desired precision of the estimate, the larger the sample must be.
➢ The narrower the interval range, the larger the sample must be.
➢ The higher the confidence level in the estimate, the larger the sample must be.
➢ The greater the number of subgroups of interest within a sample, the greater the
sample size must be, as each subgroup must meet the minimum sample size
requirements.
➢ If the calculated sample size exceeds 5 percent of the population, sample size may
be reduced without sacrificing precision.

Sample size determination:
In the planning of a sample survey, a stage is always reached at which a decision must be made about the size of the sample. The decision is important. Too large a sample implies a waste of resources, and too small a sample diminishes the utility of the results. The decision cannot always be made satisfactorily; often we do not possess enough information to be sure that our choice of sample size is the best one. Sampling theory provides a framework for the determination of sample size.
Consider a market researcher who is preparing to study the behaviour of customers on some island. Among other things, he wishes to estimate the percentage of customers using brand A. Cooperation has been secured so that it is feasible to take a simple random sample. How large should the sample be?
This question cannot be discussed without first receiving an answer to another question: how accurately does the market researcher wish to know the percentage of customers using brand A? In reply he states that he will be content if the percentage is correct within ±5%, in the sense that, if the sample shows that 43% of customers use brand A, the percentage for the whole island is sure to lie between 38 and 48.
We cannot absolutely guarantee accuracy within 5% except by measuring everyone. However large n is taken, there is a chance of a very unlucky sample that is in error by more than the desired 5%. The market researcher is willing to take a 1 in 20 chance of getting an unlucky sample. Assume that the population is large, so that the finite population correction (fpc) can be ignored, and that the sample percentage p is normally distributed.
In technical terms, p is to lie in the range (P ± 5%), except for a 1 in 20 chance. Since p is assumed normally distributed about P, it will lie in the range (P ± 2σ_p) apart from a 1 in 20 chance. Furthermore, with P and Q = 100 − P expressed as percentages,
$$\sigma_p = \sqrt{PQ/n}$$
Hence, we may put
$$2\sqrt{PQ/n} = 5 \quad\text{or}\quad n = \frac{4PQ}{25}$$
At this stage we should have some idea of the likely value of P. From the historical records of the company, P lies within the range 30 to 60%. Within this range PQ is maximum at P = 50%, which gives n = 4(50)(50)/25 = 400.
Whether the fpc is required depends on the number of people on the island. If the population exceeds 8000, the sampling fraction is less than 5% and no adjustment for the fpc is required. Otherwise we readjust the sample size according to the population size.
The chosen value of n must be appraised to see whether it is consistent with the resources available to take the sample. This demands an estimation of the cost, labour, time, and materials required to obtain the proposed size of sample. When the resources are insufficient, we reduce the sample size by relaxing the permissible error. Let d be the permissible error, N the population size, and t the normal deviate corresponding to the accepted risk α.
The formula for n in sampling for proportions is
$$n = \frac{t^2 PQ / d^2}{1 + \frac{1}{N}\left(\frac{t^2 PQ}{d^2} - 1\right)}$$
For practical use, an advance estimate p of P is substituted in this formula. If N is large, a first approximation is
$$n_0 = \frac{t^2 pq}{d^2}$$

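Consider the brand-preference situation with d = 0.05, p = 0.5, α = 0.05 and t = 2 (the normal area tables give 1.96, rounded here to 2). Thus
$$n_0 = \frac{t^2 pq}{d^2} = \frac{(4)(0.5)(0.5)}{(0.05)^2} = 400$$
Let us assume that there are only 3200 customers in the region. The fpc is needed, and with N = 3200 and n_0 = 400 we find
$$n = \frac{n_0}{1 + (n_0 - 1)/N} = \frac{400}{1 + 399/3200} \approx 356$$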
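The sample size for a proportion, with and without the finite population correction, can be computed with a short helper. This is a sketch of the standard formulas n_0 = t²pq/d² and n = n_0 / (1 + (n_0 − 1)/N); the function name `sample_size_proportion` is hypothetical, and the inputs are those of the brand-preference example.

```python
import math

def sample_size_proportion(p, d, t=1.96, N=None):
    """First approximation n0 = t^2 * p * q / d^2; if the population
    size N is given, apply the finite population correction."""
    n0 = t ** 2 * p * (1 - p) / d ** 2
    if N is None:
        return n0
    return n0 / (1 + (n0 - 1) / N)

n0 = sample_size_proportion(p=0.5, d=0.05, t=2)           # large population
n = sample_size_proportion(p=0.5, d=0.05, t=2, N=3200)    # 3200 customers

print(round(n0), math.ceil(n))
```

Rounding the corrected value up to the next whole unit is a common convention, so that the achieved precision is not worse than the one requested.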

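Sometimes, particularly when estimating the total number NP of units in class C, we wish to control the relative error r instead of the absolute error in Np; for example, we may wish to estimate NP with an error not exceeding 10%. That is, we want
$$\Pr\left(\frac{|Np - NP|}{NP} \ge r\right) = \Pr\left(|p - P| \ge rP\right) = \alpha$$
For this specification, we substitute rP or rp for d in the formulas for n. As a first approximation we get
$$n_0 = \frac{t^2 PQ}{r^2 P^2} = \frac{t^2}{r^2}\,\frac{Q}{P}$$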

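In the case of estimating the population mean $\bar{Y}$ or the population total with relative error r, we use the formula
$$n = \frac{\left(\frac{tS}{r\bar{Y}}\right)^2}{1 + \frac{1}{N}\left(\frac{tS}{r\bar{Y}}\right)^2}$$
In this formula the population characteristic on which n depends is its coefficient of variation $S/\bar{Y}$. This is often more stable and easier to guess in advance than S itself.
As a first approximation we take
$$n_0 = \left(\frac{tS}{r\bar{Y}}\right)^2 = \frac{1}{C}\left(\frac{S}{\bar{Y}}\right)^2$$
substituting an advance estimate of $S/\bar{Y}$. The quantity $C = (r/t)^2$ is the desired (cv)² of the sample estimate.
If $n_0/N$ is appreciable we compute n as
$$n = \frac{n_0}{1 + n_0/N}$$
If, instead of the relative error r, we wish to control the absolute error d in $\bar{y}$, we take
$$n_0 = \frac{t^2 S^2}{d^2} = \frac{S^2}{V}, \qquad V = \frac{d^2}{t^2}$$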

Consider another example. In nurseries that produce young trees for sale it is advisable to estimate, in late winter or early spring, how many healthy young trees are likely to be on hand, since this determines policy toward the solicitation and acceptance of orders. The data that follow were obtained from a bed of silver maple seedlings 1 ft wide and 430 ft long. The sampling unit was 1 ft of the length of the bed, so that N = 430. With the earlier records it was found that […]. With simple random sampling, how many units must be taken to estimate $\bar{Y}$ within 10%, apart from a chance of 1 in 20?
We have
$$n_0 = \frac{t^2 s^2}{r^2 \bar{y}^2}, \qquad t = 2,\ r = 0.10$$
Since $n_0/N$ is not negligible, we take
$$n = \frac{n_0}{1 + n_0/N}$$
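The calculation for a mean with a controlled relative error can be sketched as below. The inputs ȳ = 19 and s² = 85.6 are assumed illustrative values, not figures taken from the text above; the function name `sample_size_mean_relative` is also hypothetical. Only N = 430, r = 0.10 and t = 2 come from the example.

```python
import math

def sample_size_mean_relative(s, ybar, r, t=2.0, N=None):
    """n0 = (t*s / (r*ybar))^2; if N is given, n = n0 / (1 + n0/N)."""
    n0 = (t * s / (r * ybar)) ** 2
    if N is None:
        return n0
    return n0 / (1 + n0 / N)

# ASSUMED advance estimates (illustrative only): mean 19, variance 85.6.
ybar, s = 19.0, math.sqrt(85.6)

n0 = sample_size_mean_relative(s, ybar, r=0.10, t=2.0)
n = sample_size_mean_relative(s, ybar, r=0.10, t=2.0, N=430)
print(math.ceil(n0), math.ceil(n))
```

With these assumed inputs the first approximation is a substantial fraction of N = 430, which is why the corrected value is noticeably smaller.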

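Sample size in stratified random sampling with proportional allocation: Let n be the total size of the sample, $W_h = N_h/N$ the stratum weight, $s_h^2$ the variance within stratum h, and $w_h = n_h/n$ the fraction of the sample allocated to stratum h. A first approximation is
$$n_0 = \sum_h \frac{W_h^2 s_h^2}{w_h V}$$
which under proportional allocation ($w_h = W_h$) reduces to $n_0 = \sum_h W_h s_h^2 / V$. If $n_0/N$ is not negligible then
$$n = \frac{\sum_h W_h s_h^2}{V + \frac{1}{N}\sum_h W_h s_h^2}$$
We compute the $s_h$ values through a pilot survey or past records. Here $V = d^2/t^2$, where d is the permissible error and t is obtained from the area of the normal or t-distribution at the α level of significance.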
In the case of estimating proportions, we obtain the sample size as follows:
$$n_0 = \frac{n_1}{1 + \frac{n_1}{N}}, \quad\text{where } n_1 = \frac{\sum_h W_h p_h q_h}{V} \text{ for proportional allocation}$$
$$n_0 = \frac{n_1}{1 + \frac{1}{NV}\sum_h W_h p_h q_h}, \quad\text{where } n_1 = \frac{\left(\sum_h W_h \sqrt{p_h q_h}\right)^2}{V} \text{ for optimal allocation}$$

In a market research survey estimating the proportion of customers preferring brand A, the values of p_h and N_h are obtained for four strata. Assuming that the estimated population proportion should not differ from the true value by more than 10% with 95% confidence, obtain the required sample size for proportional allocation and optimal allocation.
Stratum (h)   p_h     N_h
1             0.318   108
2             0.205   228
3             0.412   235
4             0.158   80

Here $V = (0.10/1.96)^2 = 2.60308 \times 10^{-3}$.
For proportional allocation n_1 = 76 and n_0 = 68.
For optimal allocation n_1 ≈ 75 (74.8 before rounding) and n_0 = 67.
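The figures in this example can be checked numerically. The sketch below uses the four strata from the table and assumes the formulas n_1 = Σ W_h p_h q_h / V (proportional) and n_1 = (Σ W_h √(p_h q_h))² / V (optimal), with W_h = N_h/N; the variable names are illustrative only.

```python
import math

# (p_h, N_h) for the four strata in the table.
strata = [(0.318, 108), (0.205, 228), (0.412, 235), (0.158, 80)]
d, t = 0.10, 1.96

V = (d / t) ** 2                                   # desired variance
N = sum(N_h for _, N_h in strata)                  # population size, 651

# Weighted sums with W_h = N_h / N.
sum_w_pq = sum(N_h / N * p * (1 - p) for p, N_h in strata)
sum_w_sqrt_pq = sum(N_h / N * math.sqrt(p * (1 - p)) for p, N_h in strata)

# Proportional allocation.
n1_prop = sum_w_pq / V
n0_prop = n1_prop / (1 + n1_prop / N)

# Optimal allocation.
n1_opt = sum_w_sqrt_pq ** 2 / V
n0_opt = n1_opt / (1 + sum_w_pq / (N * V))

print(f"V = {V:.8f}")
print(f"proportional: n1 = {n1_prop:.1f}, n0 = {n0_prop:.1f}")
print(f"optimal:      n1 = {n1_opt:.1f}, n0 = {n0_opt:.1f}")
```

The gain from optimal allocation is small here because the stratum proportions, and hence the within-stratum standard deviations, do not differ greatly.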
If the strata sizes are different, proportional allocation can be used to maintain a steady sampling fraction throughout the population. The total sample size, n, is allocated to the strata in proportion to their sizes. This implies
$$n_h = n\,\frac{N_h}{N} = n W_h$$
Optimum allocation: Optimum allocation takes into consideration both the sizes of
the strata and the variability inside the strata. In order to obtain the minimum sampling
variance the total sample size should be allocated to the strata proportionally to their
sizes and also to the standard deviation of their values, i.e. to the square root of the
variances.
nh = constant × Nh sh

This implies
$$n_h = n\,\frac{N_h s_h}{\sum_h N_h s_h}$$
where n is the total sample size, n_h is the sample size in stratum h, N_h is the size of stratum h and s_h is the square root of the variance in stratum h.
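The difference between proportional and optimum allocation can be seen in a small numerical sketch. The stratum sizes and standard deviations below are hypothetical, as are the function names; the allocation rules are n_h = n·N_h/N and n_h = n·N_h·s_h / Σ N_h·s_h.

```python
def proportional_allocation(n, sizes):
    """n_h = n * N_h / N: every stratum gets the same sampling fraction."""
    total = sum(sizes)
    return [n * N_h / total for N_h in sizes]

def optimum_allocation(n, sizes, sds):
    """n_h = n * N_h * s_h / sum(N_h * s_h): larger and more variable
    strata receive a larger share of the sample."""
    weights = [N_h * s_h for N_h, s_h in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical strata: sizes and within-stratum standard deviations.
sizes = [100, 200, 700]
sds = [10.0, 5.0, 2.0]

prop = proportional_allocation(100, sizes)
opt = optimum_allocation(100, sizes, sds)
print([round(x, 1) for x in prop])
print([round(x, 1) for x in opt])
```

In this illustration the small but highly variable first stratum gets roughly three times its proportional share, while the large homogeneous third stratum gets less.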
Optimum allocation with variable cost: In some sampling situations, the cost of
sampling in terms of time or money is composed of a fixed part and of a variable part
depending on the stratum.
The sampling cost function is thus of the form:
$$C = c_0 + \sum_h c_h n_h$$
where C is the total cost of the sampling, c_0 is an overhead cost and c_h is the cost per sampling unit in stratum h, which may vary from stratum to stratum.
