Bus Research Chapter Six




6.1. Some Fundamental Definitions

Researchers usually cannot make direct observations of every individual in the population they
are studying. Instead, they collect data from a subset of individuals – a sample – and use those
observations to make inferences about the entire population.
Ideally, the sample corresponds to the larger population on the characteristic(s) of interest.
In that case, the researcher's conclusions from the sample are probably applicable to the
entire population.
This type of correspondence between the sample and the larger population is most important
when a researcher wants to know what proportion of the population has a certain
characteristic – like a particular opinion or a demographic feature. Public opinion polls that
try to describe the percentage of the population that plans to vote for a particular candidate,
for example, require a sample that is highly representative of the population.
Before turning to the details and uses of sampling, it is useful to be familiar with some basic
definitions concerning sampling:
1) Population: the theoretically specified aggregation of survey elements from which the
survey sample is actually selected.
2) Sampling frame: the list of elements from which the sample is drawn.
3) Sample: a subset, or some part, of a larger population.
4) Sample design: a definite plan for obtaining a sample from the sampling frame.
5) Sampling: the process of using a small number of items, or a part of a larger population, to
draw conclusions about the whole population.
6) Element: the unit from which information is collected and which provides the basis of
analysis.
7) Statistic: a characteristic of a sample.
8) Parameter: a characteristic of a population.
Example: when a measure such as the mean is worked out from a sample, it is called a
statistic. When the same measure describes a characteristic of the whole population, it is
called a parameter.

6.2. Sampling Procedure
Sample planning involves the following steps:
1) Defining population
2) Census Vs Sample
3) Sampling Design
4) Sample size
5) Estimate Cost of Planning
6) Execute Sampling Process
1) Defining population
The first thing the sample plan must include is a definition of the population to be
investigated. Defining the target population means specifying the subject of the study:
which elements (items) are included, as well as where and when. If the research problem is
not properly defined, defining the population will be difficult. Therefore, the researcher
must begin with a careful specification of his/her population.
2) Census Vs Sample
Once the population has been defined, the researcher must decide whether the survey is to be
conducted among all members of the population or only a subset of the population. That is, a
choice must be made between census and sample.
Advantages of Census
• Reliability: Data derived through a census are highly reliable. Because every element is
examined, there is no sampling error; the remaining errors are mainly those of measurement
and computation.
• Detailed information: Census data yield much more information.
Limitations of Census
• Expense: Investigating every element of the population is expensive for any
individual researcher.
• Excessive time and energy: Besides the cost, a census takes a long time and
consumes a great deal of energy.
Need for Sampling
The use of a sample in a research project has the objective of estimating, testing and making
inferences about a population on the basis of information taken from the sample.
In many cases sampling is the only way to determine something about the population. Some
of the major reasons why sampling is necessary are:
a) The destructive nature of certain tests
b) The physical impossibility of checking all items in the population
c) The cost of studying all the items in a population is often prohibitive
d) The adequacy of sample results
e) Contacting the whole population is often time-consuming
Sampling techniques are used under the following conditions:
• Vast data
• When utmost accuracy is not required
• Infinite population
• When a census is impossible
• Homogeneity
Limitations of Sampling Techniques
• Less accuracy
• Misleading conclusions
• Need for specialized knowledge
Essentials of an Ideal Sample
An ideal sample should have the following four basic characteristics:
a) Representativeness: an ideal sample must adequately represent the whole population. It
should not lack a quality found in the whole population.
b) Independence: each unit should be free to be included in the sample; the selection of one
unit should not depend on the selection of another.
c) Adequacy: the number of units included in the sample should be sufficient to support
conclusions applicable to the whole population.
d) Homogeneity: the elements included in the sample must bear a likeness to the other elements.
3) Sampling Design
Operationally, sample design is the heart of sample planning. Specifying the sample
design, including the method of selecting individual sample units, involves both theoretical
and practical considerations. The sample design should answer the following questions:
• What type of sample should be used?
• What is the appropriate sample unit?
• What frame (list of sampling units) is available for the population?
• How are refusals and non-response to be handled?
4) Sample Size Determination
A researcher is concerned about sample size because sample size (the number of
elements in the sample) and the precision of the study are directly related: other things being
equal, the larger the sample size, the higher the precision. Sample size determination is
largely a statistical activity that requires statistical knowledge. There are a number of
approaches to determining sample size:
1) Personal judgment: in some cases the personal judgment and subjective decision of the
researcher can be used as a basis to determine the size of the sample.
2) Budgetary approach: under this approach the sample size is determined by the funds
available for the proposed study.
3) Traditional inference: this is based on the precision rate and the confidence level. To
estimate the sample size using this approach we need the estimated population variance,
the magnitude of acceptable error and the confidence level.
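As a rough illustration of the traditional inference approach, the sketch below uses the common
textbook formula n = (z * sigma / e)^2 for estimating a mean, where sigma is the estimated
population standard deviation, e is the acceptable error and z is the standard normal value for
the chosen confidence level. The formula, the function name and the example figures are
assumptions made for illustration; the chapter itself does not prescribe a specific formula.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin_of_error, confidence=0.95):
    """Sample size needed to estimate a mean within +/- margin_of_error.

    Uses n = (z * sigma / e)^2, rounded up. Assumes simple random sampling
    from a large population (no finite population correction).
    """
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)


# Hypothetical figures: standard deviation 15, acceptable error of 2 units
n = sample_size_for_mean(sigma=15, margin_of_error=2, confidence=0.95)
n_looser = sample_size_for_mean(sigma=15, margin_of_error=4, confidence=0.95)
print(n, n_looser)
```

Doubling the acceptable error cuts the required sample size to roughly a quarter, which is why
the tolerable error has to be fixed before the sample size can be chosen.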
5) Estimate Cost of Planning
The sample plan must take into account the estimated cost of sampling. Such costs are of two
types: overhead costs and variable costs. In practice, however, it may be difficult, and to
some people unreasonable, to separate the sampling cost from the overall study cost.
6) Execute Sampling Process
The last step in sample planning is the execution of the sampling process (procedure); in short,
the sample is actually chosen. The sampling procedure must meet two requirements:
• The sample must be representative
• The sample must be adequate
6.3. Sampling Techniques
In most surveys, access to the entire population is nearly impossible; however, the results
from a survey with a carefully selected sample will closely reflect those that would
have been obtained had the whole population provided the data. Sampling is therefore a very
important part of the research process. If you have surveyed using an appropriate sampling
technique, you can be reasonably confident that your results can be generalized to the
population in question. If the sample is biased in any way, for example if the selection
technique gives older people more of a chance of selection than younger people, it is
inadvisable to make generalizations from the findings.

There are essentially two types of sampling: probability and non-probability sampling.
6.3.1. Probability Sampling
Probability or random sampling gives all members of the population a known, non-zero chance of
being selected for inclusion in the sample, and this does not depend upon previous events in the
selection process. In other words, the selection of one individual does not affect the chance of
anyone else in the population being selected. Many statistical techniques assume that a
sample was selected on a random basis. There are five basic types of random sampling:
1) Simple Random Sampling
This is the ideal choice as it is a ‘perfect’ random method. Using this method, individuals are
randomly selected from a list of the population and every single individual has an equal
chance of selection. If this method cannot be adopted, one of the following
alternatives may be chosen, possibly with some shortfall in accuracy.
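A minimal sketch of simple random sampling, assuming the sampling frame is available as a list.
The frame of ID numbers, the function name and the seed are hypothetical and serve only to make
the example reproducible.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has an equal chance of selection."""
    rng = random.Random(seed)
    # random.sample draws without replacement
    return rng.sample(frame, n)


# Hypothetical sampling frame: member ID numbers 1..6000
frame = list(range(1, 6001))
sample = simple_random_sample(frame, 200, seed=42)
print(len(sample), sample[:5])
```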
2) Systematic Sampling

Systematic sampling is a frequently used variant of simple random sampling. When
performing systematic sampling, every Kth element from the list is selected from a randomly
selected starting point (K is referred to as the sampling interval). For example, if we have a
listed population of 6000 members and wish to draw a sample of 200, we would select every
30th (6000 divided by 200) person from the list. In practice, we would randomly select a
number between 1 and 30 to act as our starting point.

The one potential problem with this method of sampling concerns the arrangement of
elements in the list. If the list has a periodic pattern that coincides with the sampling
interval, e.g. if every 30th house in a list of houses is smaller than the others, there is a
possibility that the sample produced could be seriously biased.
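The 6000-member example can be sketched as follows. This is one simple way to implement the
idea, assuming the frame is a list and the population size is at least the sample size; the
function name and seed are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every k-th element after a random start, with k = len(frame) // n."""
    k = len(frame) // n            # sampling interval (30 for 6000 / 200)
    rng = random.Random(seed)
    start = rng.randrange(k)       # random start among the first k list positions
    # Take every k-th element from the start; trim in case len(frame) is not a multiple of n
    return frame[start::k][:n]


frame = list(range(1, 6001))       # hypothetical listed population of 6000 members
sample = systematic_sample(frame, 200, seed=1)
print(sample[:3])
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed, which is
why a periodic ordering of the list can bias the result.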

3) Stratified Sampling
Stratified sampling is a variant on simple random and systematic methods and is used when
there are a number of distinct subgroups, each of which must be fully represented.
A stratified sample is constructed by classifying the population into sub-populations
(or strata), based on some well-known characteristics of the population, such as
age, gender or socio-economic status. The selection of elements is then made separately
from within each stratum, usually by random or systematic sampling methods.
Stratified sampling methods come in two types: proportionate and disproportionate
stratified sampling.
In proportionate sampling, the strata sample sizes are made proportional to the strata
population sizes. For example, if the first stratum is made up of males, then as males make up
around 50% of the UK population, the male stratum will need to make up around 50% of the
total sample.
In disproportionate methods, the strata are not sampled in proportion to their population
sizes; instead, higher proportions are selected from some groups than from others. This
technique is typically used in a number of distinct situations:
• The costs of collecting data may differ from subgroup to subgroup.
• We might require more cases in some groups if estimates of population values are likely to
be harder to make, since the larger the sample size (up to certain limits), the more accurate
any estimate is likely to be.
• We may expect different response rates from different groups of people. The less
co-operative groups might therefore be ‘over-sampled’ to compensate.
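The difference between proportionate and disproportionate allocation can be sketched as below.
The three age strata, their sizes and the disproportionate allocation are hypothetical figures
chosen for illustration, and the helper names are not from any particular library.

```python
import random


def proportionate_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to stratum size.

    Rounding means the parts may not always add up to exactly n.
    """
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}


def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample of the allocated size within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}


# Hypothetical population of 1000 people in three age strata
strata = {
    "young": list(range(0, 600)),
    "middle": list(range(600, 900)),
    "old": list(range(900, 1000)),
}
sizes = {name: len(members) for name, members in strata.items()}

proportionate = proportionate_allocation(sizes, 50)
# Disproportionate: the smallest stratum is deliberately over-sampled
disproportionate = {"young": 20, "middle": 15, "old": 15}

sample = stratified_sample(strata, disproportionate, seed=0)
print(proportionate, {name: len(s) for name, s in sample.items()})
```

With a disproportionate design, estimates for the whole population normally have to be weighted
back to the true stratum proportions; that step is not shown here.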

4) Cluster Sampling

This technique samples economically while retaining the characteristics of a probability
sample. In cluster sampling the primary sampling unit is no longer the individual element in
the population but a larger grouping such as a manufacturing unit, a city or a city block.
Cluster sampling reduces costs by concentrating the survey in selected clusters, but it
is usually less precise than simple random sampling. Cluster sampling is used primarily because
of the economic advantage it offers.
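A minimal sketch of one-stage cluster sampling, assuming the population is already grouped into
clusters. The city blocks and households below are made-up data, and the function name is
hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every element in them."""
    rng = random.Random(seed)
    # The sampling unit is the cluster, not the individual element
    chosen = rng.sample(sorted(clusters), n_clusters)
    elements = [e for name in chosen for e in clusters[name]]
    return chosen, elements


# Hypothetical frame: 10 city blocks with 20 households each
clusters = {
    f"block_{b}": [f"block_{b}_house_{h}" for h in range(20)]
    for b in range(10)
}
chosen, households = cluster_sample(clusters, 3, seed=7)
print(chosen, len(households))
```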
5) Multi-Stage Sampling
Sometimes the population is too large and scattered for it to be practical to make a list of
the entire population from which to draw simple random samples.
For instance, when a polling organization samples US voters, it does not draw a simple
random sample. Since voter lists are compiled by counties, it might first draw a sample of
the counties and then sample within the selected counties. This illustrates two stages. In some
instances, even more stages might be used. At each stage, a stratified random
sample might be drawn on sex, race, income level, or any other useful variable on which
information is available before sampling.
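The county example can be sketched as a two-stage design: counties are sampled first, then
voters within each selected county. The county and voter names are invented, and simple random
sampling is assumed at both stages (a real design might stratify at either stage).

```python
import random


def two_stage_sample(units_by_cluster, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample elements within each chosen cluster."""
    rng = random.Random(seed)
    stage_one = rng.sample(sorted(units_by_cluster), n_clusters)
    return {
        cluster: rng.sample(
            units_by_cluster[cluster],
            min(n_per_cluster, len(units_by_cluster[cluster])),
        )
        for cluster in stage_one
    }


# Hypothetical voter lists compiled by county: 12 counties, 100 voters each
counties = {
    f"county_{c}": [f"county_{c}_voter_{v}" for v in range(100)]
    for c in range(12)
}
sample = two_stage_sample(counties, n_clusters=4, n_per_cluster=25, seed=3)
print({county: len(voters) for county, voters in sample.items()})
```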
How does one decide which type of sampling to use?
The formulas in almost all statistics books assume simple random sampling. Unless you are
willing to learn the more complex techniques to analyze the data after it is collected, it is
appropriate to use simple random sampling. To learn the appropriate formulas for the more
complex sampling schemes, look for a book or course on sampling.

Stratified random sampling gives more precise information than simple random sampling for
a given sample size. So, if information on all members of the population is available that
divides them into strata that seem relevant, stratified sampling will usually be used.
If the population is large and enough resources are available, usually one will use multi-stage
sampling. In such situations, usually stratified sampling will be done at some stages.
How do we analyze the results differently depending on the different type of sampling?
The main difference is in the computation of the estimates of the variance (or standard
deviation). An excellent book for self-study is A Sampler on Sampling, by Williams (Wiley).
In it, you see a rather small population and then a complete derivation and description of
the sampling distribution of the sample mean for a particular small sample size. I believe it
is accessible to any student who has had an upper-division mathematical statistics course
and to some strong students who have had a freshman introductory statistics course. A very
simple statement of the conclusion is that the variance of the estimator is smaller if it comes
from a stratified random sample than from a simple random sample of the same size. Since
smaller variance means more precise information from the sample, this is consistent
with stratified random sampling giving better estimators for a given sample size.
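The conclusion can be checked by brute force on a very small population, in the spirit of the
derivation described above. The six values and the two strata below are invented for
illustration; the code lists every possible sample of size 2 under each design and compares the
variance of the resulting sample means.

```python
from itertools import combinations, product
from statistics import mean, pvariance

# Hypothetical population of six values, split into two strata of equal size
stratum_a = [1, 2, 3]
stratum_b = [10, 11, 12]
population = stratum_a + stratum_b

# Every possible simple random sample of size 2 (all equally likely)
srs_means = [mean(s) for s in combinations(population, 2)]

# Every possible stratified sample: one element from each stratum.
# The strata are equal in size, so the plain mean is the correctly weighted estimate.
strat_means = [mean(s) for s in product(stratum_a, stratum_b)]

# Both designs are unbiased: the average of the sample means equals the population mean
print(mean(population), mean(srs_means), mean(strat_means))

# Variance of the sampling distribution of the mean under each design
print(pvariance(srs_means), pvariance(strat_means))
```

In this toy population the values within each stratum are similar and the strata differ sharply,
so the stratified design gives a much smaller variance. When strata are not related to the
variable being measured, the gain is small.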
6.3.2. Non-Probability Sampling
The difference between non-probability and probability sampling is that non-probability
sampling does not involve random selection and probability sampling does. Does that mean
that non-probability samples aren't representative of the population? Not necessarily. But it
does mean that non-probability samples cannot depend upon the rationale of probability
theory. At least with a probabilistic sample, we know the odds or probability that we have
represented the population well. We are able to estimate confidence intervals for the statistic.
With non-probability samples, we may or may not represent the population well, and it will
often be hard for us to know how well we've done so. In general, researchers prefer
probabilistic or random sampling methods over non-probabilistic ones, and consider them to
be more accurate and rigorous. However, in applied social research there may be
circumstances where it is not feasible, practical or theoretically sensible to do random
sampling. Here, we consider a wide range of non-probabilistic alternatives.
We can divide non-probability sampling methods into two broad types: accidental or
purposive. Most sampling methods are purposive in nature because we usually approach the
sampling problem with a specific plan in mind. The most important distinctions among these
types of sampling methods are the ones between the different types of purposive sampling.
1) Accidental, Haphazard or Convenience Sampling
One of the most common methods of sampling goes under the various titles listed here. I
would include in this category the traditional "man on the street" (of course, now it's probably
the "person on the street") interviews conducted frequently by television news programs to
get a quick (although non-representative) reading of public opinion. I would also argue that
the typical use of college students in much psychological research is primarily a matter of
convenience. (You don't really believe that psychologists use college students because they
believe they're representative of the population at large, do you?). In clinical practice, we
might use clients who are available to us as our sample. In many research contexts, we
sample simply by asking for volunteers. Clearly, the problem with all of these types of
samples is that we have no evidence that they are representative of the populations we're
interested in generalizing to -- and in many cases we would clearly suspect that they are not.
2) Purposive Sampling
In purposive sampling, we sample with a purpose in mind. We usually would have one or
more specific predefined groups we are seeking. For instance, have you ever run into people
in a mall or on the street who are carrying a clipboard and who are stopping various people
and asking if they could interview them? Most likely they are conducting a purposive sample
(and most likely they are engaged in market research). They might be looking for Caucasian
females between 30 and 40 years old. They size up the people passing by and anyone who looks
to be in that category they stop to ask if they will participate. One of the first things they're
likely to do is verify that the respondent does in fact meet the criteria for being in the sample.
Purposive sampling can be very useful for situations where you need to reach a targeted
sample quickly and where sampling for proportionality is not the primary concern. With a
purposive sample, you are likely to get the opinions of your target population, but you are
also likely to overweight subgroups in your population that are more readily accessible.
All of the methods that follow can be considered subcategories of purposive sampling
methods. We might sample for specific groups or types of people as in modal instance,
expert, or quota sampling. We might sample for diversity as in heterogeneity sampling. Or,
we might capitalize on informal social networks to identify specific respondents who are hard
to locate otherwise, as in snowball sampling. In all of these methods we know what we want
-- we are sampling with a purpose.
• Modal Instance Sampling
In statistics, the mode is the most frequently occurring value in a distribution. In sampling,
when we do a modal instance sample, we are sampling the most frequent case, or the
"typical" case. In a lot of informal public opinion polls, for instance, they interview a
"typical" voter. There are a number of problems with this sampling approach. First, how do
we know what the "typical" or "modal" case is? We could say that the modal voter is a person
who is of average age, educational level, and income in the population. But, it's not clear that
using the averages of these is the fairest (consider the skewed distribution of income, for
instance). And, how do you know that those three variables -- age, education, income -- are
the only or even the most relevant for classifying the typical voter? What if religion or
ethnicity is an important discriminator? Clearly, modal instance sampling is only sensible for
informal sampling contexts.
• Expert Sampling
Expert sampling involves the assembling of a sample of persons with known or demonstrable
experience and expertise in some area. Often, we convene such a sample under the auspices
of a "panel of experts." There are actually two reasons you might do expert sampling.
First, because it would be the best way to elicit the views of persons who have specific
expertise. In this case, expert sampling is essentially just a specific subcase of purposive
sampling. But the other reason you might use expert sampling is to provide evidence for the
validity of another sampling approach you've chosen. For instance, let's say you do modal
instance sampling and are concerned that the criteria you used for defining the modal instance
are subject to criticism. You might convene an expert panel consisting of persons with
acknowledged experience and insight into that field or topic and ask them to examine your
modal definitions and comment on their appropriateness and validity. The advantage of doing
this is that you aren't out on your own trying to defend your decisions -- you have some
acknowledged experts to back you. The disadvantage is that even the experts can be, and
often are, wrong.
• Quota Sampling
In quota sampling, you select people non-randomly according to some fixed quota. There are
two types of quota sampling: proportional and non-proportional.
In proportional quota sampling you want to represent the major characteristics of the
population by sampling a proportional amount of each. For instance, if you know the
population has 40% women and 60% men, and that you want a total sample size of 100, you
will continue sampling until you get those percentages and then you will stop. So, if you've
already got the 40 women for your sample, but not the sixty men, you will continue to sample
men but even if legitimate women respondents come along, you will not sample them
because you have already "met your quota." The problem here (as in much purposive
sampling) is that you have to decide the specific characteristics on which you will base the
quota. Will it be by gender, age, education, race, religion, etc.?
Non-proportional quota sampling is a bit less restrictive. In this method, you specify the
minimum number of sampled units you want in each category. Here, you're not concerned
with having numbers that match the proportions in the population. Instead, you simply want
to have enough to assure that you will be able to talk about even small groups in the
population. This method is the non-probabilistic analogue of stratified random sampling in
that it is typically used to assure that smaller groups are adequately represented in your
sample.
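The 40 women / 60 men example of proportional quota sampling can be sketched as follows. The
stream of passers-by is invented, and the function name and the `key` argument are hypothetical.
Respondents are taken in arrival order, not at random, which is what makes this a
non-probability method.

```python
def proportional_quota_sample(stream, quotas, key):
    """Accept respondents in arrival order until every category's quota is met."""
    counts = {category: 0 for category in quotas}
    sample = []
    for person in stream:
        category = key(person)
        # Skip anyone whose category is unknown or already full
        if category in counts and counts[category] < quotas[category]:
            sample.append(person)
            counts[category] += 1
        if counts == quotas:
            break
    return sample


# Hypothetical stream of 200 passers-by, alternating between women and men
stream = [{"id": i, "gender": "woman" if i % 2 == 0 else "man"} for i in range(200)]
quotas = {"woman": 40, "man": 60}

sample = proportional_quota_sample(stream, quotas, key=lambda p: p["gender"])
women = sum(1 for p in sample if p["gender"] == "woman")
men = sum(1 for p in sample if p["gender"] == "man")
print(len(sample), women, men)
```

Once the quota of 40 women is filled, later women in the stream are passed over even though they
are legitimate respondents, exactly as described above.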
• Heterogeneity Sampling
We sample for heterogeneity when we want to include all opinions or views, and we aren't
concerned about representing these views proportionately. Another term for this is sampling
for diversity. In many brainstorming or nominal group processes (including concept
mapping), we would use some form of heterogeneity sampling because our primary interest is
in getting a broad spectrum of ideas, not identifying the "average" or "modal instance" ones. In
effect, what we would like to be sampling is not people, but ideas. We imagine that there is a
universe of all possible ideas relevant to some topic and that we want to sample this
population, not the population of people who have the ideas. Clearly, in order to get all of the
ideas, and especially the "outlier" or unusual ones, we have to include a broad and diverse
range of participants. Heterogeneity sampling is, in this sense, almost the opposite of modal
instance sampling.
• Snowball Sampling
In snowball sampling, you begin by identifying someone who meets the criteria for inclusion
in your study. You then ask them to recommend others they know who also meet
the criteria. Although this method would hardly lead to representative samples, there are
times when it may be the best method available. Snowball sampling is especially useful when
you are trying to reach populations that are inaccessible or hard to find. For instance, if you
are studying the homeless, you are not likely to be able to find good lists of homeless people
within a specific geographical area. However, if you go to that area and identify one or two,
you may find that they know very well who the other homeless people in their vicinity are
and how you can find them.
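The referral process can be sketched as a walk through a network of contacts. The referral map
below is invented, and following referrals breadth-first up to a fixed sample size is only one
reasonable way to organize the fieldwork, not a prescribed procedure.

```python
from collections import deque


def snowball_sample(seeds, referrals, max_size):
    """Grow a sample by following referrals outward from the initial contacts."""
    sample = []
    queue = deque(seeds)
    seen = set(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        # Ask this respondent for others who meet the criteria
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sample


# Hypothetical referral map: who each respondent can point us to
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "E": ["F"]}
sample = snowball_sample(["A"], referrals, max_size=5)
print(sample)
```

The sample depends entirely on who the first contact happens to know, which is why the method
rarely yields a representative sample.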

