SAMPLING

⦿ Population refers to the entire group of people, events, or things of
interest with the required characteristics that the researcher wishes
to investigate.
⦿ Because there is very rarely enough time or money to gather
information from everyone or everything in a population, the goal
becomes finding a representative sample (or subset) of that
population.
⦿ A sample is “a smaller & representative collection of units from a
population used to determine truths about that population”.
⦿ Sampling is the process of selecting a sufficient number of elements
from the population, so that a study of the sample and an
understanding of its properties or characteristics would make it
possible for us to generalize such properties or characteristics to the
population elements.
⦿ Sampling involves selecting a relatively small number of
elements (sample) from a larger defined group (population)
and expecting the information gathered from the small group
will enable judgments about the larger group.

⦿ Algebraically, let the population size be N. If a part of size
n (where n < N) of this population is selected according to
some rule for studying some characteristic of the population,
the group consisting of these n units is known as the ‘sample’.

⦿ 4 factors that influence sample representativeness
● Sampling procedure
● Sample size
● Participation (response)
● The extent of variation in the sampling population

⦿ When might you sample the entire population?
● When your population is very small
● When you have extensive resources
● When you don’t expect a very high response

Sampling Terms
[Diagram labels: population; theoretical population; defined target population and study population; sampling units (available elements); sampling frame]
SAMPLING BREAKDOWN
[Diagram labels: target population; study population; sample]
Terminology
Population
The entire group of people of interest from whom the researcher
needs to obtain information.
Element (population unit)
One unit from a population. If 1,000 employees in a hospital
happen to be the population of interest to a researcher, each
employee therein is an element.
Sampling
The selection of a subset of the population

Sampling Frame
Listing of population from which a sample is chosen
Census
A polling of the entire population
Survey
A polling of the sample
Parameter
A parameter is a number describing a characteristic of the
population, computed from all the elements of the population, e.g.
the population mean or mode. It is a summary description of the
characteristics of the target population.

Statistic
A statistic is a numerical value obtained from a sample of data.
It is a descriptive measure and a function of the sample
observations. A statistic describes a sample (e.g. the sample mean)
and is used as an estimator for a population parameter (because
samples should represent populations). In short, the statistic is a
summary value for a small part of the population, i.e. the sample.
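The distinction between a parameter and a statistic can be illustrated with a short sketch. The population below (ages of 1,000 hospital employees) is made-up data, and `sample_mean` is a hypothetical helper introduced only for this illustration.

```python
import random
import statistics

def sample_mean(population, n, seed=None):
    """Return the mean of a simple random sample of size n (a statistic)."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)  # sampling without replacement
    return statistics.mean(sample)

# Hypothetical population: ages of 1,000 hospital employees (made-up values).
rng = random.Random(0)
ages = [rng.randint(20, 65) for _ in range(1000)]

parameter = statistics.mean(ages)            # uses every element of the population
statistic = sample_mean(ages, n=50, seed=1)  # uses only 50 sampled elements
print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The two values will usually be close but not identical; the gap between them is the sampling error that the rest of the slides try to control.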
SAMPLING FRAME
⦿ Source list from which sample is drawn. In the most
straightforward case, such as the sentencing of a batch of
material from production (acceptance sampling by lots), it
is possible to identify and measure every single item in
the population and to include any one of them in our
sample. However, in the more general case this is not
possible. There is no way to identify all rats in the set of
all rats. Where voting is not compulsory, there is no way
to identify which people will actually vote at a forthcoming
election (in advance of the election).
⦿ As a remedy, we seek a sampling frame which has the
property that we can identify every single element and
include any in our sample.
⦿ The sampling frame must be representative of the
population.

Sampling Methods
⦿ Probability sampling
⦿ Nonprobability sampling
PROBABILITY SAMPLING

⦿ A probability sample is one in which every member of
a population has a known, non-zero chance of being
selected in the sample, and this probability can be
accurately determined.
⦿ When every element in the population has the
same probability of selection, this is known as an 'equal
probability of selection' (EPS) design. Such designs are
also referred to as 'self-weighting' because all sampled
units are given the same weight.
NON PROBABILITY SAMPLING

Any sampling method where some elements of the
population have no chance of selection (these are
sometimes referred to as 'out of coverage'/'undercovered'),
or where the probability of selection
can't be accurately determined. It involves the
selection of elements based on assumptions
regarding the population of interest, which form
the criteria for selection. Hence, because the
selection of elements is nonrandom, nonprobability
sampling does not allow the estimation of
sampling errors.
Types of Sampling Methods

Probability
⦿ Simple random sampling
⦿ Systematic random sampling
⦿ Stratified random sampling
⦿ Cluster sampling
⦿ Multistage sampling

Nonprobability
⦿ Convenience sampling
⦿ Judgment sampling
⦿ Quota sampling
⦿ Snowball sampling
1. SIMPLE RANDOM SAMPLING
• Applicable when population is small,
homogeneous & readily available
• All subsets of the frame are given an equal
probability. Each element of the frame thus has
an equal probability of selection.
• It provides for the greatest number of possible
samples. In practice, a number is assigned to
each unit in the sampling frame.
• A table of random numbers or a lottery system is
then used to determine which units are to be
selected.
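As a minimal sketch of the procedure above, the snippet below numbers the units of a frame and lets Python's pseudo-random generator play the role of the random number table or lottery. The frame of 1,000 numbered units and the function name `simple_random_sample` are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: units numbered 1..1000, as in a numbered list.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, n=10, seed=42)
print(sorted(sample))
```

Fixing `seed` only makes the draw repeatable for checking; leaving it as `None` gives a fresh draw each time.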
■ Advantages
■ Representativeness and Freedom from Bias
■ Estimates are easy to calculate.
■ Minimal knowledge of population needed
■ Easy to analyze data

■ Disadvantages
■ Time and Labor Requirement
■ Requires sampling frame
■ Does not use researchers’ expertise
■ Larger risk of random error than stratified
2. SYSTEMATIC SAMPLING

⦿ Systematic sampling relies on arranging the target
population according to some ordering scheme and then
selecting elements at regular intervals through that
ordered list.
⦿ Systematic sampling involves a random start and then
proceeds with the selection of every kth element from
then onwards. In this case, k = (population size / sample
size).
⦿ It is important that the starting point is not automatically
the first in the list, but is instead randomly chosen from
within the first to the kth element in the list.
⦿ A simple example would be to select every 10th name
from the telephone directory (an 'every 10th' sample, also
referred to as 'sampling with a skip of 10').
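The random start and fixed skip described above can be sketched as follows. The function name `systematic_sample` and the frame of 1,000 numbered units are assumptions for illustration; the sketch also assumes the population size is a multiple of the sample size.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first k units, then take every k-th unit."""
    k = len(frame) // n            # sampling interval (the "skip")
    if k < 1:
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)
    start = rng.randrange(k)       # 0-based random start among the first k positions
    # If len(frame) is not a multiple of n, units past position n*k are never selected.
    return frame[start::k][:n]

# Hypothetical ordered frame of 1,000 units; n = 100 gives a skip of k = 10.
frame = list(range(1, 1001))
sample = systematic_sample(frame, n=100, seed=5)
print(sample[:5])
```

Only k different samples are possible under this scheme (one per starting point), which is why it differs from simple random sampling even though each unit has the same chance of selection.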

As described above, systematic sampling is an EPS method, because
all elements have the same probability of selection (in the example
given, one in ten). It is not 'simple random sampling' because different
subsets of the same size have different selection probabilities - e.g. the
set {4,14,24,...,994} has a one-in-ten probability of selection, but the
set {4,13,24,34,...} has zero probability of selection.


ADVANTAGES:
⦿ Sample easy to select
⦿ Suitable sampling frame can be identified easily
⦿ Sample evenly spread over entire reference
population
⦿ Easier and less costly method of sampling
DISADVANTAGES:
⦿ Sample may be biased if hidden periodicity in
population coincides with that of selection.
⦿ Difficult to assess precision of estimate from one
survey.

3. STRATIFIED SAMPLING
⦿ Used where the population embraces a number of distinct
categories, i.e. if the population from which a sample is to be
drawn does not constitute a homogeneous group, stratified
sampling is generally applied in order to obtain a
representative sample. The frame can be organized into
separate "strata". Under stratified sampling the population
is divided into several sub-populations that are individually
more homogeneous than the total population (the different
sub-populations are called ‘strata’) and then we select
items from each stratum to constitute a sample. Since each
stratum is more homogeneous than the total population, we
are able to get more precise estimates for each stratum, and
by estimating each of the component parts more accurately,
we get a better estimate of the whole.
⦿ Every unit in a stratum has the same chance of being selected.
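A minimal sketch of stratified sampling is given below, assuming proportional allocation (the same sampling fraction in every stratum), which the slides do not prescribe. The staff list and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # group units by stratum
    sample = {}
    for name, members in strata.items():
        size = max(1, round(fraction * len(members)))  # at least one unit per stratum
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical population: 600 nurses, 300 doctors, 100 administrators.
staff = ([("nurse", i) for i in range(600)]
         + [("doctor", i) for i in range(300)]
         + [("admin", i) for i in range(100)])
picked = stratified_sample(staff, stratum_of=lambda u: u[0], fraction=0.1, seed=7)
print({name: len(members) for name, members in picked.items()})
```

Because every stratum contributes units, no category can be missed by chance, which is the main difference from drawing one simple random sample from the whole list.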
ADVANTAGES
⦿ Stratified sampling results in more reliable and
detailed information if the population is heterogeneous.
DRAWBACKS
⦿ First, a sampling frame of the entire population has to be
prepared separately for each stratum.
⦿ Second, when examining multiple criteria, stratifying
variables may be related to some, but not to others,
further complicating the design, and potentially
reducing the utility of the strata.
⦿ Finally, in some cases (such as designs with a
large number of strata, or those with a specified
minimum sample size per group), stratified sampling
can potentially require a larger sample than would
other methods.

[Figure: draw a sample from each stratum]
4. CLUSTER SAMPLING
⦿ If the total area of interest happens to be a big one, a
convenient way in which a sample can be taken is to divide the
area into a number of smaller non-overlapping areas and then
to randomly select a number of these smaller areas (usually
called clusters), with the ultimate sample consisting of all (or
samples of) units in these small areas or clusters.
⦿ Thus in cluster sampling the total population is divided into a
number of relatively small subdivisions which are themselves
clusters of still smaller units, and then some of these clusters are
randomly selected for inclusion in the overall sample.
⦿ Population divided into clusters of homogeneous units, usually
based on geographical contiguity.
⦿ Sampling units are groups rather than individuals.
⦿ A sample of such clusters is then selected.
⦿ All units from the selected clusters are studied.
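The one-stage version described above (select clusters at random, then study every unit in them) can be sketched as follows. The villages, household labels, and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and return every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names, not units
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical clusters: 20 villages with 50 households each.
villages = {f"village_{v:02d}": [f"v{v:02d}_h{h:02d}" for h in range(50)]
            for v in range(20)}
chosen, households = cluster_sample(villages, n_clusters=4, seed=3)
print(chosen, len(households))
```

Note that only a list of clusters is needed up front; a list of individual households is required only for the clusters actually selected.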
Advantages
■ Minimal knowledge of population needed
■ Easy to analyze data
■ Cuts down on the cost of preparing a sampling frame.

Disadvantages
■ High cost; low frequency of use
■ Larger risk of random error than stratified
Difference Between Strata and Clusters
Although strata and clusters are both non-overlapping
subsets of the population, they differ in several ways.
⦿ All strata are represented in the sample; but only a
subset of clusters are in the sample.
⦿ With stratified sampling, the best survey results occur
when elements within strata are internally
homogeneous. However, with cluster sampling, the best
results occur when elements within clusters are
internally heterogeneous.

5. MULTISTAGE SAMPLING
⦿ Multi-stage sampling is a further development of the principle
of cluster sampling. Suppose we want to investigate the
working efficiency of nationalized banks in India and we want
to take a sample of a few banks for this purpose. The first stage
is to select large primary sampling units such as states in a
country. Then we may select certain districts and interview all
banks in the chosen districts. This would represent a two-stage
sampling design with the ultimate sampling units being clusters
of districts. If, instead of taking a census of all banks within the
selected districts, we select certain towns and interview all
banks in the chosen towns, this would represent a three-stage
sampling design. If, instead of taking a census of all banks
within the selected towns, we randomly sample banks from
each selected town, then it is a case of using a four-stage
sampling plan. If we select randomly at all stages, we will have
what is known as a ‘multi-stage random sampling design’.
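The idea of sampling at successive levels can be sketched with a simplified three-level version of the bank example (states, then districts, then banks; the town level is left out to keep the sketch short). The nested frame and the function name `multistage_sample` are hypothetical.

```python
import random

def multistage_sample(frame, n_states, n_districts, n_banks, seed=None):
    """Sample states, then districts within them, then banks within those."""
    rng = random.Random(seed)
    selected = []
    for state in rng.sample(sorted(frame), n_states):                 # stage 1
        districts = frame[state]
        for district in rng.sample(sorted(districts), n_districts):   # stage 2
            banks = districts[district]
            for bank in rng.sample(banks, min(n_banks, len(banks))):  # stage 3
                selected.append((state, district, bank))
    return selected

# Hypothetical nested frame: 5 states x 4 districts x 6 banks.
frame = {f"state_{s}": {f"district_{s}{d}": [f"bank_{s}{d}{b}" for b in range(6)]
                         for d in range(4)}
         for s in range(5)}
picked = multistage_sample(frame, n_states=2, n_districts=2, n_banks=3, seed=11)
print(len(picked))
```

Because selection is random at every level, this corresponds to what the text calls a multi-stage random sampling design.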
Non Probability Sampling
1. CONVENIENCE SAMPLING
⦿ Sometimes known as opportunity sampling or accidental
sampling.
⦿ A type of non probability sampling which involves the sample
being drawn from that part of the population which is close to
hand. That is, readily available and convenient.
⦿ The researcher using such a sample cannot scientifically make
generalizations about the total population from this sample
because it would not be representative enough.
⦿ For example, if the interviewer conducts a survey at a
hospital early in the morning on a given day, the patients that
he/she could interview would be limited to those present at
that time. Their views would not represent those of other
patients in the area, which could be captured only if the survey
were conducted at different times of day and several times per week.
⦿ This type of sampling is most useful for pilot testing.
■ Advantages
■ Very low cost
■ Extensively used/understood
■ No need for list of population elements

■ Disadvantages
■ Variability and bias cannot be measured
or controlled
■ Projecting data beyond sample not
justified.
2. Judgmental sampling or Purposive sampling
⦿ The researcher chooses the sample based on who they think would be
appropriate for the study. This is used primarily when there is a limited
number of people who have expertise in the area being researched.
⦿ This type of sampling technique might be the most appropriate if the
population to be studied is difficult to locate or if some members are thought
to be better (more knowledgeable, more willing, etc.) than others to
interview. This determination is often made on the advice and with the
assistance of the client. For instance, if you wanted to interview incentive
travel organizers within a specific industry to determine their needs or
destination preferences, you might find that not only are there relatively few,
they are also extremely busy and may well be reluctant to take time to talk
to you. Relying on the judgment of some knowledgeable experts may be far
more productive in identifying potential interviewees than trying to develop
a list of the population in order to randomly select a small number.
■ Advantages
■ Moderate cost
■ Commonly used/understood
■ Sample will meet a specific objective

■ Disadvantages
■ Bias!
■ Projecting data beyond sample not
justified.
3. Snowball Sampling
▪ Also called chain referral sampling.
▪ Selection of additional respondents is based on referrals
from the initial respondents.
› friends of friends
▪ Thus the sample group appears to grow like a rolling
snowball
▪ Used to sample from low incidence or rare populations.
▪ This sampling technique is often used in hidden
populations which are difficult for researchers to access;
example populations would be drug users or sex
workers.
Method of snowball sampling

⦿ Draft up a participation program (likely to be
subject to change, but indicative).
⦿ Approach stakeholders and ask for contacts.
⦿ Gain contacts and ask them to participate.
⦿ Community issues groups may emerge that can
be included in the participation program.
⦿ Continue the snowballing with contacts to gain
more stakeholders if necessary.
⦿ Ensure a diversity of contacts by widening the
profile of persons involved in the snowballing
exercise.
■ Advantages
■ low cost
■ Useful in specific circumstances
■ Useful for locating rare & hidden populations

■ Disadvantages
■ Bias because sampling units not independent
■ Projecting data beyond sample not justified.
■ Not Random
4. Quota Sampling

› The population is divided into cells on the basis of relevant
control characteristics.
› A quota of sample units is established for each cell.
● e.g. 50 women, 50 men
› A convenience sample is drawn for each cell until the quota
is met (similar to stratified sampling).
METHOD OF QUOTA SAMPLING

⦿ The population is first segmented into mutually exclusive
sub-groups, just as in stratified sampling.
⦿ Then judgment is used to select subjects or units from each
segment based on a specified proportion.
⦿ For example, an interviewer may be told to sample 200
females and 300 males between the ages of 45 and 60.
⦿ It is this second step which makes the technique one of
non-probability sampling.
⦿ In quota sampling the selection of the sample is
non-random.
⦿ For example, interviewers might be tempted to interview
those who look most helpful. The problem is that these
samples may be biased because not everyone gets a
chance of selection. This non-random element is its greatest
weakness, and quota versus probability sampling has been a
matter of controversy for many years.
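The two steps of quota sampling can be sketched as below, using the 200 females / 300 males quota from the example. The stream of passers-by and the function name `quota_sample` are hypothetical; taking people in arrival order stands in for the interviewer's non-random choice.

```python
def quota_sample(arrivals, cell_of, quotas):
    """Accept respondents in arrival order until every cell's quota is full."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:
        cell = cell_of(person)
        if remaining.get(cell, 0) > 0:   # cell still has room
            sample.append(person)
            remaining[cell] -= 1
        if not any(remaining.values()):  # every quota has been met
            break
    return sample

# Hypothetical stream of passers-by in a fixed (non-random) order.
arrivals = [("female" if i % 3 else "male", i) for i in range(2000)]
sample = quota_sample(arrivals, cell_of=lambda p: p[0],
                      quotas={"female": 200, "male": 300})
print(len(sample))
```

No random draw appears anywhere in the sketch: whoever turns up first fills the quota, which is why selection probabilities cannot be worked out for this design.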
■ Advantages
■ Moderate cost
■ Very extensively used/understood
■ No need for list of population elements
■ Introduces some elements of
stratification
■ Disadvantages
■ Variability and bias cannot be measured
or controlled (classification of subjects)
■ Projecting data beyond sample not
justified.
Steps in Sampling Process

1. Define the population
2. Identify the sampling frame
3. Select a sampling design or procedure
4. Determine the sample size
5. Draw the sample
Sampling Design Process
1. Define Population
2. Determine Sampling Frame
3. Determine Sampling Procedure
› Probability Sampling: Simple Random Sampling, Stratified Sampling, Cluster Sampling
› Non-Probability Sampling: Convenience, Judgmental, Quota
4. Determine Appropriate Sample Size
5. Execute Sampling Design
1. Define the Target Population
It addresses the question “Ideally, who do you want to
survey?”, i.e. those who have the information sought.
What are their characteristics? Who should be
excluded?
– Age, gender, product use, those in industry
– Geographic area
It involves
– Defining population units
– Setting population boundaries
– Screening (e.g. security questions, product use)
Define the Target Population
Element: individuals; families; seminar groups
Sampling unit: individuals over 20; families with 2 kids; seminar groups at a ”new” university
Extent: individuals who are unmarried; families who eat fast food; seminar groups doing MR
Timing: working since 3 years


2. Determine the Sampling Frame

Obtaining a “list” of the population (how will you reach the sample?)
▪ Students who eat at McDonalds?
▪ Young people at random in the street?
▪ Phone book
▪ Students union listing
▪ University mailing list
▪ E.g. individuals who have spent two or more hours on the
internet in the last week
3. Selecting a Sampling Design
Probability sampling - every element has a known chance of
being included in the sample (random)
› simple random sampling
› systematic sampling
› stratified sampling
› cluster sampling

Non-probability sampling - the chance of being included in the
sample is unknown or unequal (non-random)
› convenience sampling
› judgment sampling
› snowball sampling
› quota sampling
Determining Sample Size
⦿ What do you need to consider?
› Variance or heterogeneity of the population
› The degree of acceptable error (margin of error)
› Confidence level
› Generally, we need to make judgments on all these
variables
⦿ Variance or heterogeneity of population
› Sources: previous studies, industry expectations, or a pilot
study
› Rule of thumb: the value of the standard
deviation is expected to be about 1/6 of the range.
Determining Sample Size for cross
sectional studies/surveys
Formula:
$\text{M.E.} = Z\sqrt{p(1-p)/n}$

Z at 95% confidence = 1.96
Z at 99% confidence = 2.58
⦿ where n = sample size
⦿ Z = z-value for the chosen confidence level (e.g. 1.96 for 95% confidence)
⦿ M.E. = margin of error (how much error you are willing to accept);
it generally ranges from 1% to 5%
⦿ p = prior estimate of the population proportion
⦿ Use p = 0.5 if no prior estimate is available, because the margin of
error is largest when p = 0.5.
Example
A survey estimated that 20% of all
Americans aged 16 to 20 drove under the
influence of drugs or alcohol. A similar
survey is planned for New Zealand. They
want a 95% confidence interval to have a
margin of error of 0.04.
⦿ (a) Find the necessary sample size if they
expect to find results similar to those in the
United States.
⦿ (b) Suppose instead they used the
conservative formula based on p̂ = 0.5.
What is now the required sample size?
Solution
⦿ (a) $0.04 = 1.96\sqrt{(0.2 \times 0.8)/n}$
⦿ $n = (0.2 \times 0.8 \times 1.96^2)/0.04^2$
⦿ $n \approx 384.2$, which rounds up to a sample of 385.
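Both parts of the example can be checked with a short sketch that solves the margin-of-error formula for n. The function name `sample_size_for_proportion` is hypothetical, and rounding up to the next whole respondent is an assumption of this sketch (the slide stops at the unrounded value).

```python
import math

def sample_size_for_proportion(p, margin_of_error, z=1.96):
    """Solve M.E. = z * sqrt(p * (1 - p) / n) for n, rounded up."""
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    return math.ceil(n)

print(sample_size_for_proportion(0.20, 0.04))  # part (a): expected proportion 0.20
print(sample_size_for_proportion(0.50, 0.04))  # part (b): conservative p = 0.5
```

For part (a) the unrounded value is 384.16, giving 385; for part (b) it is 600.25, giving 601, which shows how the conservative choice p = 0.5 increases the required sample.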
If population range is given

⦿ $\text{M.E.} = Z(s/\sqrt{n})$
⦿ $n = (sZ/\text{M.E.})^2$
⦿ The standard deviation s is approx. 1/6th of the
range.
⦿ Use the z value for the chosen confidence level, e.g. 1.96 for a 95%
confidence interval.
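A minimal sketch of the range-based calculation follows. The inputs (a range of 60 units and a margin of error of 2 units) are made-up numbers for illustration, and `sample_size_for_mean` is a hypothetical helper; rounding up is again an assumption of the sketch.

```python
import math

def sample_size_for_mean(value_range, margin_of_error, z=1.96):
    """n = (s * z / M.E.)^2 with s approximated as range / 6, rounded up."""
    s = value_range / 6                      # rule-of-thumb standard deviation
    return math.ceil((s * z / margin_of_error) ** 2)

# Hypothetical inputs: values span 60 units, desired margin of error 2 units.
print(sample_size_for_mean(value_range=60, margin_of_error=2))
```

Here s = 10, so n = (10 × 1.96 / 2)² = 96.04, which rounds up to 97.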
Slovin’s formula
⦿ When population size is known
⦿ $n = N / (1 + N e^2)$
Where:
• n = Sample size
• N = Total population
• e = Margin of error
• Suppose the population size is 1000 and the margin of error is 0.05
• $n = N / (1 + N e^2) = 1000 / (1 + 1000 \times 0.05^2) = 285.714286$
• which rounds up to n = 286
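The Slovin calculation above can be reproduced with a few lines. The function name `slovin` is hypothetical, and the margin of error of 0.05 is the value used in the worked figure.

```python
import math

def slovin(population_size, margin_of_error):
    """Slovin's formula n = N / (1 + N * e^2), rounded up to a whole unit."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)

print(slovin(1000, 0.05))
```

With N = 1000 and e = 0.05 the unrounded value is about 285.71, so the function returns 286.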
Sample size calculation for case control studies
Suppose a researcher wants to see the association between childhood
violence and psychiatric disorder in adulthood. He will take a sample
of adults with psychiatric disorders and another
sample of adults with no psychiatric disorders. He will then look
retrospectively at the history of childhood violence in both groups.
Exposure in both groups will be compared and the odds ratio will be
calculated.
Suppose the researcher wants to calculate the sample size for the
above-mentioned case control study, fixes the
power of the study at 80%, assumes that the expected proportions in the case
group and the control group are 0.35 and 0.20 respectively, and
wants an equal number of cases and controls; then the sample
size per group will be
Sample size calculation for case control studies
Sample size estimation in intervention studies
Sample size estimation in intervention studies (single
group)
