Chapter 4 Sampling Distributions PDF
Chapter 7 Chapter 11
2
In this chapter, you learn:
To distinguish between different sampling
methods
The concept of the sampling distribution
To compute probabilities related to the sample
mean and the sample proportion
The importance of the Central Limit Theorem
3
A sample is a portion or part of the population of interest
[Figure: a sample drawn from a population. The population mean μ is unknown; the sample mean is X̄ = 50.]
4
For the safety of the consumer.
Sampling – A means for gathering useful
information about a population
Data are gathered from samples and
conclusions are drawn about the population as
a part of the inferential statistics process
5
Sampling has advantages over a census:
Sampling can save money.
Sampling can save time.
For given resources, the sample can broaden the scope of
the study.
Because the research process is sometimes destructive,
the sample can save product.
If accessing the population is impossible, the sample is
the only option.
6
Taking a census eliminates the possibility that a random sample is
not representative of the population.
9
Samples
Nonrandom (nonprobability) samples: Judgment, Convenience
Random (probability) samples: Simple Random, Stratified, Systematic, Cluster
10
Nonrandom sampling (nonprobability sampling): not every
unit of the population has the same probability of
being included in the sample. Members of nonrandom
samples are not selected by chance.
11
The statistical methods presented and
discussed in this course are based on the
assumption that the data come from random
samples.
Nonrandom sampling methods are not
appropriate techniques for gathering data to
be analyzed by most of the statistical methods
presented in this course.
13
See pp. 223-225
Simple Random Sample – basis for other random
sampling techniques
Each unit is numbered from 1 to N (N is the population size)
A table of random numbers or a random number generator can
be used to select n (n << N) items from the population
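As a minimal sketch of the random-number-generator approach, the Python snippet below numbers the units from 1 to N and draws n of them without replacement. The function name `simple_random_sample`, the seed, and the values N = 31 and n = 6 are illustrative assumptions, not taken from the textbook.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Draw n distinct unit numbers from a frame numbered 1..N."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    # rng.sample selects without replacement, so no unit is picked twice
    return sorted(rng.sample(range(1, N + 1), n))

# Hypothetical frame of N = 31 units, sample of n = 6
chosen = simple_random_sample(N=31, n=6, seed=12345)
print(chosen)
```

Every unit has the same chance of selection, which is what distinguishes this from judgment or convenience sampling.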
14
See Black (2008), pp. 220-221
Select a sample of six companies
15
Select a sample of six companies
Population frame?
16
Number every member of the population
17
Use a table of random numbers to select the
6 items
18
Use a random number generator to select the 6 items (SPSS 20)
File “ch 04 sampling dist example 1.sav”
▪ Click Transform > Random Number Generators > Active Generator
Initialization > Fixed Value, then enter a numeric seed
▪ Click Data > Select Cases
▪ Under Select, choose Random Sample of Cases. Click on the Sample
box.
▪ Choose Exactly n (10) of the first N (31), click Continue
▪ Click OK. Slashes will appear in the case numbers of the cases not
in the sample, and an indicator variable representing Filter will be
attached to the end of the dataset.
▪ Analyze the data
19
20
Samples are used to estimate population
characteristics (i.e., sample mean is used to
estimate the population mean)
It is unlikely that the sample statistic (i.e., sample
mean) would be exactly equal to its corresponding
population parameter (i.e., population mean)
Sampling error: The difference between a sample
statistic and its corresponding population
parameter
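To make the definition concrete, here is a small sketch that computes the sampling error of a sample mean. The population values, the seed, and the helper name `sampling_error` are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean

# Hypothetical population of 10 values (not from the course data files)
population = [12, 15, 9, 20, 17, 11, 14, 18, 16, 13]

def sampling_error(sample, population):
    """Sample mean minus population mean."""
    return mean(sample) - mean(population)

rng = random.Random(7)               # fixed seed for reproducibility
sample = rng.sample(population, 4)   # simple random sample of n = 4
print("population mean:", mean(population))
print("sample mean:", mean(sample))
print("sampling error:", sampling_error(sample, population))
```

A different seed gives a different sample and, in general, a different sampling error.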
21
See p. 224
File « ch 02 sampling dist example 2.sav»
The population mean is equal to 3.13
24
Assume there is a population …
Population size N = 4 (individuals A, B, C, D)
Random variable, X,
is age of individuals
Values of X: 18, 20,
22, 24 (years)
25
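Summary Measures for the Population Distribution:
$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$
[Figure: Uniform Distribution. P(x) = .25 for each of x = 18 (A), 20 (B), 22 (C), 24 (D).]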
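The population summary measures can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative, and `pstdev` is used because the population formula divides by N rather than N − 1.

```python
from statistics import fmean, pstdev

ages = [18, 20, 22, 24]   # the N = 4 population values

mu = fmean(ages)          # population mean
sigma = pstdev(ages)      # population standard deviation (divides by N)

print(mu)                 # 21.0
print(round(sigma, 3))    # 2.236
```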
26
Example: Developing a Sampling
Distribution
16 possible samples of size n = 2 (sampling with replacement):

1st Obs | 2nd Observation
        | 18     20     22     24
18      | 18,18  18,20  18,22  18,24
20      | 20,18  20,20  20,22  20,24
22      | 22,18  22,20  22,22  22,24
24      | 24,18  24,20  24,22  24,24

16 Sample Means:

1st Obs | 2nd Observation
        | 18  20  22  24
18      | 18  19  20  21
20      | 19  20  21  22
22      | 20  21  22  23
24      | 21  22  23  24
27
Example: Developing a Sampling
Distribution
$$\mu_{\bar{X}} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$
$$\sigma_{\bar{X}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$
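The sampling distribution in this example is small enough to enumerate exhaustively. The sketch below lists all 16 ordered samples of size 2 drawn with replacement from the ages 18, 20, 22, 24 and summarizes their means; the variable names are illustrative.

```python
from itertools import product
from statistics import fmean, pstdev

ages = [18, 20, 22, 24]

# All ordered samples of size 2 drawn with replacement: 4 * 4 = 16
samples = list(product(ages, repeat=2))
sample_means = [fmean(s) for s in samples]

mu_xbar = fmean(sample_means)       # mean of the 16 sample means
sigma_xbar = pstdev(sample_means)   # standard deviation of the 16 sample means

print(len(samples))                 # 16
print(mu_xbar)                      # 21.0
print(round(sigma_xbar, 2))         # 1.58
```

The mean of the sample means matches the population mean of 21, while their spread (1.58) is smaller than the population standard deviation (2.236).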
29
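Population (N = 4): $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$
[Figure: two bar charts side by side. Left: the population distribution P(X) over X = 18, 20, 22, 24 (individuals A, B, C, D). Right: the sampling distribution P(X̄) over X̄ = 18, 19, ..., 24, which peaks at 21.]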
30
Different samples of the same size from the same
population will yield different sample means
A measure of the variability in the mean from sample to
sample is given by the Standard Error of the Mean:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
(This assumes that sampling is with replacement or
sampling is without replacement from an infinite population)
Note that the standard error of the mean decreases as the
sample size increases
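A short sketch of how the standard error shrinks as n grows, using σ = 2.236 from the age example. The helper name `standard_error` and the chosen sample sizes are illustrative.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 2.236  # population standard deviation from the age example
for n in (2, 4, 16, 64):
    print(n, round(standard_error(sigma, n), 3))
```

For n = 2 this gives about 1.58, the same value obtained by enumerating all 16 samples. Quadrupling the sample size halves the standard error.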
31
Proper analysis and interpretation of a sample statistic
requires knowledge of its distribution.
[Figure: Process of Inferential Statistics. Start here: select a random sample from the population; calculate the statistic x̄ from the sample; use x̄ to estimate the population parameter μ.]
32
The sample mean is one of the more common
statistics used in the inferential process.
To compute and assign the probability of occurrence
of a particular value of a sample mean, the researcher
must know the distribution of the sample means.
One way to examine the distribution is to take a
population with a particular distribution, randomly
select samples of a given size, compute the sample
means, and attempt to determine how the means are
distributed.
33
Suppose a small finite population consists of N=8
numbers {54, 55, 59, 63, 64, 68, 69, 70}
34
The distribution of these sample means can be
represented using a histogram
35
Data from a Poisson distribution of values with a population mean of
1.25.
90 samples of size n = 30 are taken randomly from a Poisson
distribution with λ = 1.25, and the mean of each sample is computed.
The resulting distribution of sample means is displayed
36
Although the samples were drawn from a Poisson
distribution, which is skewed to the right, the
distribution of the sample means approaches a symmetrical,
nearly normal shape.
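The Poisson experiment can be imitated with a small simulation. This is a sketch only: it uses Python's standard library, a hand-written Poisson generator (Knuth's multiplication method), and an arbitrary seed, so the numbers will not match the slide's figure exactly. The function names are illustrative.

```python
import math
import random
from statistics import fmean, pstdev

def poisson_draw(lam, rng):
    """One Poisson(lam) value via Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_sample_means(lam=1.25, n=30, num_samples=90, seed=0):
    """Means of num_samples random samples of size n from Poisson(lam)."""
    rng = random.Random(seed)
    return [fmean([poisson_draw(lam, rng) for _ in range(n)])
            for _ in range(num_samples)]

means = simulate_sample_means()
print("mean of sample means:", round(fmean(means), 3))     # should be near 1.25
print("std dev of sample means:", round(pstdev(means), 3)) # theory: sqrt(1.25/30), about 0.204
```

Plotting `means` as a histogram should show a roughly symmetric, bell-like shape even though individual Poisson values are right-skewed.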
37
Notice that even for small sample sizes, the distributions
of sample means for samples taken from the uniformly
distributed population begin to “pile up” in the middle.
38
We examined three populations with different
distributions
The mean of the sample means is exactly equal
to the population mean:
$$\mu_{\bar{X}} = \mu$$
39
The dispersion of the sampling distribution of sample
means is narrower than the population distribution:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
$\sigma_{\bar{X}}$ is called the standard error of the mean
41
An advantage of the central limit theorem is that
sample data drawn from populations that are not
normally distributed, or from populations of unknown
shape, can also be analyzed, because the sample
means are approximately normally distributed for
sufficiently large sample sizes
42
If samples of size n are drawn randomly from a
population that has a mean of μ and a standard
deviation of σ, the sample means are approximately
normally distributed for sufficiently large sample
sizes (n ≥ 30) regardless of the shape of the
population distribution
If the population is normally distributed, the sample
means are normally distributed for any sample size.
43
The central limit theorem creates the potential for applying the
normal distribution to many problems when sample size is
sufficiently large.
Sample means that have been computed for random samples
drawn from normally distributed populations are normally
distributed.
However, the real advantage of the central limit theorem comes
when sample data drawn from populations not normally
distributed or from populations of unknown shape also can be
analyzed by using the normal distribution because the sample
means are normally distributed for sufficiently large sample
sizes.
44
How large must a sample be for the central limit
theorem to apply?
The sample size necessary varies according to the shape of
the population. However, in this text (as in many others), a
sample of size 30 or larger will suffice.
Recall that if the population is normally distributed, the
sample means are normally distributed for sample sizes as
small as n = 2.
Z Formula for Sample Means
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
47
Suppose, for example, that the mean expenditure
per customer at a tire store is $85, with a standard
deviation of $9.
What is the probability that the sample average
expenditure per customer for this sample will be $87 or
more if a random sample of 40 customers
is taken?
49
Because the sample size is greater than 30, the central
limit theorem can be used, and the sample means are
approximately normally distributed
With μ= $85.00, σ = $9.00, and the z formula for
sample means, z is computed as shown next
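$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{9}{\sqrt{40}} = 1.42$$
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{87 - 85}{9/\sqrt{40}} = \frac{2}{1.42} = 1.41$$
[Figure: sampling distribution of X̄ centered at 85 and the standard normal curve centered at 0. The area between 85 and 87 (0 ≤ Z ≤ 1.41) is .4207 out of .5000, leaving equal upper-tail areas of .0793.]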
51
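Population Parameters: μ = 85, σ = 9
Sample Size: n = 40
$$P(\bar{X} \ge 87) = P\left(Z \ge \frac{87 - \mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) = P\left(Z \ge \frac{87 - \mu}{\sigma/\sqrt{n}}\right) = P\left(Z \ge \frac{87 - 85}{9/\sqrt{40}}\right)$$
$$= P(Z \ge 1.41) = .5 - P(0 \le Z \le 1.41) = .5 - .4207 = .0793$$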
7.93% of the samples of size 40 will have an average expenditure per customer
of $87 or more
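The tire store calculation can be reproduced without a z table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative. Because it does not round z to 1.41 before looking up the area, it returns about 0.080 rather than the table-based .0793.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85.0, 9.0, 40     # values from the tire store example
x_bar = 87.0

std_error = sigma / sqrt(n)      # 9 / sqrt(40)
z = (x_bar - mu) / std_error
p_upper = 1 - NormalDist().cdf(z)   # P(sample mean >= 87)

print(round(std_error, 2))   # 1.42
print(round(z, 2))           # 1.41
print(round(p_upper, 3))     # 0.08
```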
52
Suppose that during any hour in a large department
store, the average number of shoppers is 448, with
a standard deviation of 21 shoppers.
What is the probability that a random sample of 49
different shopping hours will yield a sample mean
between 441 and 446 shoppers?
53
24.15% of the samples of
size 49 will have a mean
between 441 and 446
shoppers.
54
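$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{21}{\sqrt{49}} = 3$$
[Figure: normal curve for X̄ centered at 448. The area between 441 and 448 is .4901 and the area between 446 and 448 is .2486, so the area between 441 and 446 is .4901 − .2486 = .2415.]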
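As a check on the shoppers example, this sketch models the sample mean as normal with mean 448 and standard error 21/√49 = 3, then takes the difference of two cumulative probabilities. The variable names are illustrative. The result differs slightly from .2415 because the table-based solution rounds the z values to two decimals.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 448.0, 21.0, 49
std_error = sigma / sqrt(n)                 # 21 / 7 = 3.0

# Sampling distribution of the mean, approximated as normal via the CLT
xbar_dist = NormalDist(mu, std_error)
prob = xbar_dist.cdf(446) - xbar_dist.cdf(441)

print(std_error)        # 3.0
print(round(prob, 3))   # 0.243
```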
56
As the size of the finite population becomes larger
in relation to sample size, the finite correction
factor approaches 1
In theory, whenever researchers are working with a
finite population, they can use the finite correction
factor
If the sample size is less than 5% of the finite
population size (n/N<0.05), the finite correction
factor does not significantly modify the solution
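The slides above discuss the finite correction factor without showing its formula. The sketch below assumes the usual textbook form, √((N − n)/(N − 1)), and prints it for a fixed sample size against hypothetical population sizes; the function name and the values are illustrative.

```python
from math import sqrt

def finite_correction(N, n):
    """Finite population correction factor, assumed form sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

n = 50  # hypothetical sample size
for N in (100, 500, 1000, 10000):
    # columns: population size, sampling fraction n/N, correction factor
    print(N, round(n / N, 3), round(finite_correction(N, n), 4))
```

The factor multiplies the standard error. Once n/N falls below 0.05, it is close enough to 1 that leaving it out changes the answer very little.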
The baggage limit for an airplane is set at 100
pounds per passenger. The weight of the baggage
of an individual passenger is a random variable
with a mean of 95 pounds and a standard deviation
of 35 pounds.
If we select a random sample of 50 passengers
on a particular flight and compute the mean weight of
their baggage, what is the probability that the sample
mean will exceed the 100-pound limit? Interpret.
The manufacturer of cans of salmon that are supposed
to have a net weight of 6 ounces tells you that the net
weight is actually a normal random variable with a
mean of 6.05 ounces and a standard deviation of 0.18
ounces. Suppose that you draw a random sample of 36
cans.
Find the probability that the mean weight of the sample
is less than 5.97 ounces.
Suppose your random sample of 36 cans of salmon
produced a mean weight that is less than 5.97 ounces.
Comment on the statement made by the manufacturer.
60
p = the proportion of the population having
a characteristic of interest
0≤p≤1
p is usually unknown
N is usually unknown
Examples: proportion of students having GPA ≥2.0
61
p = the proportion of the population having
a characteristic of interest
Sample proportion (p̂) provides an estimate
of p:
$$\hat{p} = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$
62
Sampling Distribution of p̂
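Approximated by a
normal distribution if:
$$np \ge 5 \quad \text{and} \quad n(1 - p) \ge 5$$
[Figure: Sampling Distribution of p̂, with P(p̂) from 0 to .3 on the vertical axis and p̂ from 0 to 1 on the horizontal axis.]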
64
Standardize p̂ to a Z value with the formula:
$$Z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
65
If the true proportion of voters who support
Proposition A is p = 0.4, what is the
probability that a sample of size 200 yields a
sample proportion between 0.40 and 0.45?
66
[Figure: the sampling distribution of p̂ is standardized to the standard normal distribution; the area between p̂ = 0.40 and p̂ = 0.45 is 0.4251.]
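The Proposition A example can be worked through in a few lines. This sketch first checks the np ≥ 5 and n(1 − p) ≥ 5 rule of thumb, then standardizes both endpoints. The variable names are illustrative, and the computed probability is close to, but not identical with, the table value 0.4251 because z is not rounded.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 200                   # values from the Proposition A example

# Rule of thumb for the normal approximation: np >= 5 and n(1 - p) >= 5
assert n * p >= 5 and n * (1 - p) >= 5

sigma_p_hat = sqrt(p * (1 - p) / n)        # standard error of p-hat
z_low = (0.40 - p) / sigma_p_hat           # lower endpoint equals p, so z = 0
z_high = (0.45 - p) / sigma_p_hat

std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(sigma_p_hat, 4))   # 0.0346
print(round(z_high, 2))        # 1.44
print(prob)                    # close to the table value 0.4251
```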
69
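Population Parameters: P = 0.10, Q = 1 − P = 1 − .10 = .90
Sample: n = 80, X = 12
$$\hat{p} = \frac{X}{n} = \frac{12}{80} = 0.15$$
$$P(\hat{p} \ge .15) = P\left(Z \ge \frac{.15 - P}{\sigma_{\hat{p}}}\right) = P\left(Z \ge \frac{.15 - P}{\sqrt{\dfrac{PQ}{n}}}\right) = P\left(Z \ge \frac{.15 - .10}{\sqrt{\dfrac{(.10)(.90)}{80}}}\right)$$
$$= P\left(Z \ge \frac{0.05}{0.0335}\right) = P(Z \ge 1.49) = .5 - P(0 \le Z \le 1.49) = .5 - .4319 = .0681$$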
70
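$$\sigma_{\hat{p}} = 0.0335$$
[Figure: sampling distribution of p̂ centered at 0.10 and the standard normal curve centered at 0. The area between 0.10 and 0.15 (0 ≤ Z ≤ 1.49) is .4319 out of .5000 in the upper half.]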
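The same steps for the example with P = 0.10, n = 80, and X = 12 are sketched below. The variable names are illustrative, and the upper-tail probability is computed directly from the normal CDF rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

P, n, x = 0.10, 80, 12            # population proportion, sample size, count
Q = 1 - P
p_hat = x / n                     # 12 / 80 = 0.15

sigma_p_hat = sqrt(P * Q / n)     # standard error of the sample proportion
z = (p_hat - P) / sigma_p_hat
prob = 1 - NormalDist().cdf(z)    # upper-tail area beyond z

print(round(sigma_p_hat, 4))   # 0.0335
print(round(z, 2))             # 1.49
print(round(prob, 3))          # 0.068
```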
72
In the last election, a state representative received
52% of the votes cast. One year after the election,
the representative organized a survey that asked a
random sample of 300 people whether they would
vote for him in the next election. If we assume that
his popularity has not changed,
what is the probability that more than half of the
sample would vote for him?
What is the probability that less than 20% of the
sample would vote for him?
73
Discussed probability and nonprobability samples
Introduced sampling distributions
Described the sampling distribution of the mean
For normal populations
Using the Central Limit Theorem
Described the sampling distribution of a proportion
Calculated probabilities using sampling distributions
74