Chapter Seven
Chapter Goals
After completing this chapter, you are expected to:
• Describe a random sample and why sampling is
important.
• Explain the difference between probability and non-
probability sampling.
• Define the concept of a sampling distribution.
• Determine the mean and standard deviation of the
sampling distributions of the sample mean and the
sample proportion.
7.1 Basic Concepts
• Survey - a method of data collection with no special
control on one or more of the factors.
• Two types of surveys: census (complete enumeration)
and sampling.
• A census is the only option in two situations:
– If information is needed from each and every unit of
the study population;
– If the study is on rare events.
• A population is the set of all items or individuals of
interest possessing some common characteristic.
• Example: All likely voters in the next election.
• A sample is a subset of the population.
• Example: 1000 voters selected at random for interview.
• Parameter: a descriptive measure of a population.
• Statistic: a descriptive measure of a sample.
• Sampling: the process of choosing a portion of a
population to represent that population.
• Sampling Unit: an element or a set of elements
considered for selection at some stage of sampling, e.g.,
animals, persons, households etc.
• Sampling Frame: a list of all the sampling units that
make up a given population.
• Sample Size: the number of elements included in the
sample.
• Sample Design: A set of rules or procedures that
specify how a sample is to be selected.
• Sampling Error: the difference between a sample
statistic and the true population value it estimates.
– Can be reduced by increasing the sample size.
• Non-sampling Error: errors that occur because data are
incorrectly collected, recorded, or analyzed.
– May occur in both a census and a sample survey.
– Potential sources:
• Personal bias
• Poor measurement and/or instrumentation
• Imperfection in specifying the population
7.2 Reasons for Sampling
• Less time consuming than a census.
• Less costly to administer than a census.
• The only option for an infinite population.
• Recommended for destructive experiments, where
testing a unit destroys it.
• Possible to obtain statistical results of a sufficiently
high precision based on samples.
• Disadvantages:
– If the sample is not representative, the results may
be misleading.
– Minority groups may not be properly
represented.
7.3 Types of Sampling Techniques
• Non-probability Samples: The chance of selecting each
element is unknown, and some elements may have no
chance at all. The selection process usually involves
subjectivity.
• Convenience
• Haphazard
• Quota
• Judgment
• Probability (random) Samples: Each element of the
population has a calculable chance of being selected.
• Simple random
• Systematic
• Stratified
• Cluster
Non-probability Sampling
• Widely used as a case selection method in qualitative
research.
• Convenience: Elements are sampled because of ease
and availability.
• Haphazard: The sample is selected haphazardly, e.g.,
by picking a sample of 10 rabbits from a large cage in a
laboratory.
• Quota: Elements are sampled, but not randomly, from
every layer, or stratum, of the population.
• Judgment: Elements are sampled because the
researcher believes the members are representative of
the population.
Probability Sampling
• Random sampling can be done with replacement or
without replacement.
• Simple random:
– Every subject (case) has an equal chance of being
selected.
– Best when a sampling frame exists and the population
is homogeneous.
– Two methods of selection: the lottery method and a
table of random numbers.
– The lottery method is commonly used when the
population is small.
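A minimal sketch of simple random sampling with and without replacement, using Python's standard `random` module in place of a physical lottery. The sampling frame of units numbered 1 to 59, the sample size of 5, and the seed are assumptions chosen for illustration only.

```python
import random

# Hypothetical sampling frame: units numbered 1..59 (an assumption for this sketch).
frame = list(range(1, 60))

# A fixed seed is used only so the sketch is repeatable.
rng = random.Random(7)

# Without replacement: no unit can appear twice in the sample.
without_replacement = rng.sample(frame, k=5)

# With replacement: the same unit may be drawn more than once.
with_replacement = rng.choices(frame, k=5)

print("Without replacement:", without_replacement)
print("With replacement:   ", with_replacement)
```

Both calls give every unit in the frame the same chance of selection on each draw, which is the defining property of simple random sampling.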
Random Number Table
• This consists of a randomly generated series of digits
(0–9).
• All the units of a population should be numbered from 1
to N or from 0 to N-1.
1. Determine the number of digits required based on N.
Example
• Use a table of random numbers to select a sample of
size 5 from the population having 59 elements.
• Number of digits = 2
• Direction: downward
• Starting point: first cell
• Two-digit numbers read in the selected direction: {84,
59, 57, 26, 07, 89, 76, 62, 63, 78, 60, 84, 14}
• Selected sample: {59, 57, 26, 07, 14}, the first five
distinct numbers that do not exceed 59.
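The selection rule in this example can be sketched in Python as follows. The function name `select_from_random_digits` is hypothetical, and the sketch assumes the units are numbered 1 to 59 and that repeated numbers are skipped (sampling without replacement).

```python
def select_from_random_digits(numbers, population_size, sample_size):
    """Return the first `sample_size` distinct labels in 1..population_size."""
    selected = []
    for number in numbers:
        # Skip labels outside the population and labels already chosen.
        if 1 <= number <= population_size and number not in selected:
            selected.append(number)
        if len(selected) == sample_size:
            break
    return selected


# Two-digit numbers read downward from the table, as listed in the example.
table_numbers = [84, 59, 57, 26, 7, 89, 76, 62, 63, 78, 60, 84, 14]

sample = select_from_random_digits(table_numbers, population_size=59, sample_size=5)
print(sample)  # [59, 57, 26, 7, 14]
```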
Random Sampling
• Stratified: Randomly sample elements from every layer,
or stratum, of the population. Best when elements
within strata are homogeneous.
– Sampling error will arise primarily from variability
within the strata.
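A minimal sketch of stratified sampling: a simple random sample is drawn separately within each stratum. The strata names, their sizes, and the per-stratum sample sizes are hypothetical, and the function `stratified_sample` is an illustrative helper, not a library call.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample (without replacement) from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, n_per_stratum[name])
        for name, units in strata.items()
    }


# Hypothetical population of 100 units split into two strata.
strata = {
    "urban": list(range(1, 61)),    # units 1..60
    "rural": list(range(61, 101)),  # units 61..100
}

# Assumed allocation: 6 urban and 4 rural units (proportional to stratum size).
sample = stratified_sample(strata, {"urban": 6, "rural": 4}, seed=1)
print(sample)
```

Because every stratum contributes units, no layer of the population is left out of the sample by chance.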
7.4 Sampling Distribution of the Sample Mean
▪ A sampling distribution is the distribution of all
possible values of a statistic for samples of a given size
selected from a population.
Constructing a Sampling Distribution
• Assume there is a population of size N = 4, with
individuals labeled A, B, C, and D.
• Random variable, X, is the age of individuals.
• Values of X: 18, 20, 22, 24 (years)
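• Mean and standard deviation of the population:
$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$
[Figure: uniform population distribution, with P(x) = .25 at each of x = 18 (A), 20 (B), 22 (C), and 24 (D).]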
▪ Now consider all possible samples of size n = 2.
▪ Total number of samples in selection with replacement = $4^2 = 16$

16 possible samples (sampling with replacement):

1st Obs    2nd Observation
           18       20       22       24
18         18,18    18,20    18,22    18,24
20         20,18    20,20    20,22    20,24
22         22,18    22,20    22,22    22,24
24         24,18    24,20    24,22    24,24

16 sample means:

1st Obs    2nd Observation
           18       20       22       24
18         18       19       20       21
20         19       20       21       22
22         20       21       22       23
24         21       22       23       24
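Sampling Distribution of All Sample Means
$$E(\bar{X}) = \frac{\sum \bar{X}_i}{16} = \frac{18 + 19 + 20 + 21 + \cdots + 24}{16} = 21 = \mu$$
$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu)^2}{16}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$
Here the sums run over all 16 sample means.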
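The construction above can be checked by brute force. The following sketch enumerates all 16 samples of size 2 drawn with replacement from the ages 18, 20, 22, 24 and computes the mean and standard deviation of the resulting sample means. Variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

ages = [18, 20, 22, 24]  # population values of X
n = 2                    # sample size

# All ordered samples of size n, drawn with replacement: 4**2 = 16 of them.
samples = list(product(ages, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(ages)        # population mean
sigma = pstdev(ages)   # population standard deviation (divides by N)

mean_of_means = mean(sample_means)    # E(X-bar)
std_error = pstdev(sample_means)      # standard deviation of the sample means

print(len(samples), mu, round(sigma, 3))
print(mean_of_means, round(std_error, 2), round(sigma / sqrt(n), 2))
```

The last line shows that the standard deviation of the sample means agrees with sigma divided by the square root of n.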
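Comparing the Population with its Sampling Distribution
[Figure: left panel, population distribution of X at 18, 20, 22, 24 (A, B, C, D); right panel, sampling distribution of the sample mean $\bar{X}$ at 18, 19, ..., 24 for n = 2.]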
Expected Value of Sample Mean
$$E(\bar{X}) = \sum_{i} \bar{X}_i \Pr(\bar{X}_i)$$
where the sum runs over all possible values of the sample mean.
Standard Error of the Mean
• Different samples of the same size from the same
population will yield different sample means.
• A measure of the variability in the mean from sample to
sample is given by the Standard Error of the Mean:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
• Note that the standard error of the mean decreases as the
sample size increases.
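A short sketch of how the standard error shrinks as n grows. The value sigma = 2.236 is the population standard deviation from the age example; the sample sizes are arbitrary choices for illustration, and `standard_error` is a hypothetical helper name.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)


sigma = 2.236  # population standard deviation from the age example
sample_sizes = (2, 4, 16, 100)  # arbitrary sample sizes for illustration
errors = [standard_error(sigma, n) for n in sample_sizes]

for n, se in zip(sample_sizes, errors):
    print(f"n = {n:>3}: standard error = {se:.3f}")
```

Quadrupling the sample size halves the standard error, because n enters through its square root.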
Sampling from a Normal Population
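• If the population is normal with mean μ and standard
deviation σ, the sampling distribution of the sample
mean is also normal, with
$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$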
Z-value for Sampling Distribution of the Sample Mean
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
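As an illustration of the Z-value formula, the sketch below computes the probability that a sample mean falls in a given interval. All numbers (mu = 8, sigma = 3, n = 36, and the interval 7.8 to 8.2) are hypothetical and do not come from the chapter; the population is assumed to be normal.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values, chosen only for illustration.
mu, sigma, n = 8.0, 3.0, 36

std_error = sigma / sqrt(n)  # sigma / sqrt(n) = 0.5


def z_value(x_bar):
    """Standardize a sample mean using the standard error of the mean."""
    return (x_bar - mu) / std_error


z_low = z_value(7.8)
z_high = z_value(8.2)

# Probability that the sample mean lies between 7.8 and 8.2.
std_normal = NormalDist()
probability = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(probability, 4))
```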
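Sampling Distribution Properties
• $\mu_{\bar{x}} = \mu$ (i.e., $\bar{x}$ is unbiased)
[Figure: normal population distribution centered at μ, and normal sampling distribution centered at $\mu_{\bar{x}}$, which has the same mean but a different variance.]
[Figure: sampling distributions centered at μ; one curve is labeled "Smaller sample size".]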
Sampling from a Non-normal Population
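• We can apply the Central Limit Theorem: for a
sufficiently large sample size n, the sampling
distribution of the sample mean is approximately
normal, with
$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$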
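A small simulation can make the Central Limit Theorem concrete. The sketch below assumes an exponential population with rate 1 (so mu = 1 and sigma = 1), which is clearly non-normal; the sample size of 30, the 5000 repetitions, and the seed are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

n = 30              # size of each sample
num_samples = 5000  # number of repeated samples

# Assumed non-normal population: exponential with rate 1 (mu = 1, sigma = 1).
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

simulated_mean = mean(sample_means)
simulated_se = pstdev(sample_means)
theoretical_se = 1.0 / sqrt(n)

print(f"mean of sample means: {simulated_mean:.3f} (theory: 1.000)")
print(f"std. dev. of sample means: {simulated_se:.3f} (theory: {theoretical_se:.3f})")
```

The simulated values should land close to mu and sigma / sqrt(n), even though the individual observations are strongly skewed.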
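7.5 Sampling Distribution of the Sample Proportion
• Standard error of the sample proportion $\hat{P}$, where q = 1 − P:
$$\sigma_{\hat{P}} = \sqrt{\frac{Pq}{n}}$$
• When P is unknown, the standard error is estimated by:
$$\sqrt{\frac{\hat{P}\hat{q}}{n}}$$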
Population of five students and their responses:

Student     1     2     3     4     5
Response    Yes   No    No    Yes   Yes

Possible samples of size 4 (without replacement):

Sample      Responses               Proportion of Yes
1,2,3,4     (Yes, No, No, Yes)      2/4
1,2,3,5     (Yes, No, No, Yes)      2/4
1,3,4,5     (Yes, No, Yes, Yes)     3/4
1,2,4,5     (Yes, No, Yes, Yes)     3/4
2,3,4,5     (No, No, Yes, Yes)      2/4

Sampling distribution of the sample proportion:

Sample proportion    Probability
0.5                  3/5
0.75                 2/5
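The sampling distribution in this example can be reproduced by enumerating every sample of four students. The sketch below uses exact fractions; variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Responses of the five students in the example.
responses = {1: "Yes", 2: "No", 3: "No", 4: "Yes", 5: "Yes"}
n = 4  # sample size, drawn without replacement

# Every possible sample of n students (5 choose 4 = 5 samples).
samples = list(combinations(responses, n))

# Proportion of "Yes" answers in each sample.
proportions = [
    Fraction(sum(responses[student] == "Yes" for student in sample), n)
    for sample in samples
]

# Probability of each distinct sample proportion.
counts = Counter(proportions)
distribution = {p: Fraction(c, len(samples)) for p, c in counts.items()}

# Expected value of the sample proportion.
expected = sum(p * prob for p, prob in distribution.items())

for p, prob in sorted(distribution.items()):
    print(f"sample proportion {float(p):.2f}: probability {prob}")
print("expected value:", expected)
```

The expected value equals 3/5, the proportion of "Yes" responses in the whole population, so the sample proportion is unbiased here.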
b) Exercise
c)
$$\Pr(0.16 \le \hat{P} \le 0.18) = \Pr\left(\frac{0.16 - 0.15}{0.015969} \le Z \le \frac{0.18 - 0.15}{0.015969}\right) = \Pr(0.63 \le Z \le 1.88) = 0.2342$$
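The calculation in part c) can be checked numerically. The sketch below takes P = 0.15 and the standard error 0.015969 directly from the calculation; the sample size n = 500 is an inference (it is the value for which sqrt(P(1 − P)/n) matches that standard error) and is not stated in the text. Z-values are rounded to two decimals to mimic a normal table lookup.

```python
from math import sqrt
from statistics import NormalDist

P = 0.15              # population proportion used in the calculation
std_error = 0.015969  # standard error quoted in the calculation

# Assumption: n = 500 is inferred, since sqrt(P * (1 - P) / 500) is about 0.015969.
implied_std_error = sqrt(P * (1 - P) / 500)

# Round to two decimals, as when reading a standard normal table.
z_low = round((0.16 - P) / std_error, 2)
z_high = round((0.18 - P) / std_error, 2)

std_normal = NormalDist()
probability = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(z_low, z_high, round(probability, 4))
```

The result agrees with 0.2342 up to the rounding of four-decimal table values.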