
Lecture 5 - Sampling Distribution

Lecture 6

BUSINESS STATISTICS
SAMPLING DISTRIBUTION

Advanced Educational Program

Reading materials: Chap 9 (Keller)

Outline

• Distribution of sample means
• The central limit theorem

Distribution of Samples: example (1)

• Data were collected on the time taken for a pizza order to be
completed in minutes (from order taken to pizza handed over
to customer). Below is a histogram of 50 observations and
some summary statistics.

[Histogram of 50 observations. x-axis: Pizza time; y-axis: Frequency]

Variable    N   Mean    Median  StDev
Pizza time  50  17.256  17.041  3.743

Another 50 observations on the time to complete a pizza order (2)

[Histogram of 50 observations. x-axis: Pizza time; y-axis: Frequency]

Variable    N   Mean    Median  StDev
Pizza time  50  17.585  17.374  3.872

1000 observations, 10,000 observations on the time to complete a
pizza order (3)

[Two histograms, one of 1000 and one of 10,000 observations. x-axis: Pizza time; y-axis: Frequency]

Variable    N      Mean    Median  StDev
Pizza time  1000   17.934  17.627  4.009
Pizza time  10000  18.046  17.744  4.006
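The way the summary statistics settle down as N grows can be reproduced with a short simulation. This is a minimal sketch only: the slides do not say which model generated the pizza times, so the gamma distribution with mean 18 and standard deviation 4, the seed, and the helper name `summarise` are all assumptions made for illustration.

```python
import random
import statistics

# Assumed model (not stated in the slides): gamma with mean 18, sd 4.
MEAN, SD = 18.0, 4.0
SCALE = SD**2 / MEAN   # gamma scale = variance / mean
SHAPE = MEAN / SCALE   # gamma shape = mean / scale


def summarise(n, rng):
    """Draw n simulated pizza times and return (mean, median, stdev)."""
    times = [rng.gammavariate(SHAPE, SCALE) for _ in range(n)]
    return (statistics.mean(times),
            statistics.median(times),
            statistics.stdev(times))


rng = random.Random(1)  # fixed seed so the run is repeatable
results = {n: summarise(n, rng) for n in (50, 1000, 10000)}
for n, (mean, median, sd) in results.items():
    print(f"N={n:>5}  mean={mean:.3f}  median={median:.3f}  stdev={sd:.3f}")
```

With small N the three statistics move around from run to run; with N = 10,000 they should sit close to the assumed model values.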
General notice

• When the sample size gets large (towards infinity), the
distribution of the sample settles towards the distribution of
the underlying model, which here is approximately normal.

Distribution of sample means

• One thousand datasets, each with 10 observations (that is,
1000 samples of size 10), are generated (simulated data) from
this model. For each sample, the average (sample mean), the
median (sample median) and the sample standard deviation are
calculated and recorded.

Variable  N     Mean    Median  StDev
average   1000  18.007  18.020  1.231
median    1000  17.757  17.804  1.433

[Histograms of the 1000 sample averages and the 1000 sample medians, samples of size 10. y-axis: Frequency]
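The experiment described above (1000 samples of size 10, recording the average, median and standard deviation of each) might be sketched as follows. The generating model is again an assumption: a gamma distribution with mean 18 and standard deviation 4 stands in for the unstated model, and the variable names are illustrative.

```python
import math
import random
import statistics

# Assumed model (not stated in the slides): gamma with mean 18, sd 4.
MEAN, SD = 18.0, 4.0
SCALE = SD**2 / MEAN
SHAPE = MEAN / SCALE
N_SAMPLES, SAMPLE_SIZE = 1000, 10

rng = random.Random(2)  # fixed seed so the run is repeatable
averages, medians, stdevs = [], [], []
for _ in range(N_SAMPLES):
    sample = [rng.gammavariate(SHAPE, SCALE) for _ in range(SAMPLE_SIZE)]
    averages.append(statistics.mean(sample))
    medians.append(statistics.median(sample))
    stdevs.append(statistics.stdev(sample))

# Spread of each statistic across the 1000 samples.
sd_of_averages = statistics.stdev(averages)
sd_of_medians = statistics.stdev(medians)
print(f"mean of averages = {statistics.mean(averages):.3f}")
print(f"sd of averages   = {sd_of_averages:.3f}")
print(f"sd of medians    = {sd_of_medians:.3f}")
print(f"sigma / sqrt(n)  = {SD / math.sqrt(SAMPLE_SIZE):.3f}")
```

The last line prints the model standard deviation divided by the square root of the sample size, for comparison with the simulated spread of the averages.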

S.D for the 1000 random samples of size 10

[Histogram of the 1000 sample standard deviations (stdev), samples of size 10. y-axis: Frequency]

Variable  N     Mean    Median  StDev
stdev     1000  3.8183  3.7282  0.9505

More random numbers

• Another thousand datasets are generated from the same model,
but this time each dataset has 25 observations.

[Histograms of the 1000 sample averages and the 1000 sample medians, samples of size 25. y-axis: Frequency]

Variable  N     Mean    Median  StDev
average   1000  17.991  17.982  0.814
median    1000  17.711  17.675  1.017

S.D for samples of size 25

[Histogram of the 1000 sample standard deviations (stdev), samples of size 25. y-axis: Frequency]

Variable  N     Mean    Median  StDev
stdev     1000  3.9637  3.9391  0.6048

Notice, as we take larger samples…

• The histograms for all three statistics (sample mean,
sample median and sample standard deviation) are
becoming more and more symmetric and bell-shaped,
and less variable, particularly those for the sample
mean.
• Also notice that the estimated standard deviation of
the sample mean is not only decreasing as sample
size increases, but is also approximately the same for
the same sample sizes.
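The shrinking spread of the sample mean can be checked against the figures already reported in the tables. The sketch below compares the observed standard deviations of the 1000 averages (1.231 for n = 10, 0.814 for n = 25) with the model standard deviation divided by the square root of n. One assumption is made: the sample standard deviation of the 10,000 observations (4.006) is used as a stand-in for the true model value, which the slides do not give.

```python
import math

# Assumption: the StDev of the 10,000-observation sample stands in for sigma.
sigma = 4.006

# StDev of the 1000 sample averages, taken from the tables in the slides.
observed = {10: 1.231, 25: 0.814}

comparison = {}
for n, obs in observed.items():
    predicted = sigma / math.sqrt(n)
    comparison[n] = (predicted, obs)
    print(f"n={n:>2}  sigma/sqrt(n)={predicted:.3f}  observed={obs:.3f}")
```

The predicted values come out at about 1.267 and 0.801, close to the observed 1.231 and 0.814.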
A general result of great importance

• No matter what model a random sample is taken
from, as the sample size (number of random
observations) increases, the distribution of the
sample mean becomes closer and closer to the
normal distribution. And
• No matter what model a random sample is taken
from, and for any sample size n, the standard
deviation of the sample mean is the model standard
deviation, σ (the theoretical standard deviation),
divided by √n, that is, σ/√n. This is called the
standard error of the mean (SE).

The Central Limit Theorem (1)

• Whatever the population distribution looks like (normal
or not), when the sample size is large enough, the
distribution of sample means will be approximately normal,
and we can use the Z-statistic to calculate the probability
of any range of values of the sample mean.
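The standard error formula follows from the rules for the variance of a sum. A short derivation, assuming the n observations X_1, ..., X_n are independent draws from a model with variance σ² (the symbols X_i and X̄ are notation introduced here for illustration):

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{n\sigma^2}{n^2}
  = \frac{\sigma^2}{n},
\qquad
\mathrm{SE} = \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}.
\]
```

Independence is what lets the variance of the sum split into the sum of the variances in the second step.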

The Central Limit Theorem (2)

• If X is a random variable with a mean µ and
variance σ², then in general,

\[
\bar{X} \to N\!\left(\mu, \frac{\sigma^2}{n}\right)
\]

\[
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \to Z \sim N(0, 1) \quad \text{as } n \to \infty.
\]

This is the Central Limit Theorem.
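One way to see the standardisation at work is to simulate many sample means, convert each to a Z value, and check how often Z falls within ±1.96, which a standard normal variable does about 95% of the time. A minimal sketch, assuming a gamma model with mean 18 and standard deviation 4 as a stand-in for the pizza-time model (the slides do not name the model) and samples of size 25:

```python
import math
import random
import statistics

# Assumed model (not stated in the slides): gamma with mean 18, sd 4.
MEAN, SD = 18.0, 4.0
SCALE = SD**2 / MEAN
SHAPE = MEAN / SCALE
N_SAMPLES, SAMPLE_SIZE = 2000, 25

rng = random.Random(3)  # fixed seed so the run is repeatable
standard_error = SD / math.sqrt(SAMPLE_SIZE)  # sigma / sqrt(n)

# Standardise each sample mean: Z = (x_bar - mu) / (sigma / sqrt(n)).
z_values = []
for _ in range(N_SAMPLES):
    sample = [rng.gammavariate(SHAPE, SCALE) for _ in range(SAMPLE_SIZE)]
    z_values.append((statistics.mean(sample) - MEAN) / standard_error)

within = sum(abs(z) < 1.96 for z in z_values) / N_SAMPLES
print(f"mean of Z = {statistics.mean(z_values):.3f}")
print(f"sd of Z   = {statistics.stdev(z_values):.3f}")
print(f"share of |Z| < 1.96 = {within:.3f}")
```

If the normal approximation is working, the mean of Z should be near 0, its standard deviation near 1, and the share near 0.95.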

So, how large does n need to be?
• It depends on the original distribution of X.
– If X has a normal distribution, then the sample mean has a
normal distribution for all sample sizes.
– If X has a distribution that is close to normal, the
approximation is good for small sample sizes (e.g. n=20).
– If X has a distribution that is far from normal, the
approximation requires larger sample sizes (e.g. n=50).
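The dependence on the original distribution can be illustrated with a population that is far from normal. The sketch below is an assumption-laden illustration, not part of the lecture: it uses an exponential distribution (strongly right-skewed) and measures the skewness of 2000 simulated sample means for n = 5, 20 and 50. Skewness near 0 indicates a symmetric, more normal-looking shape. The helper `skewness` is defined here for the example.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


rng = random.Random(4)  # fixed seed so the run is repeatable
N_SAMPLES = 2000

skew_by_n = {}
for n in (5, 20, 50):
    # 2000 sample means, each from n exponential observations.
    means = [statistics.mean([rng.expovariate(1.0) for _ in range(n)])
             for _ in range(N_SAMPLES)]
    skew_by_n[n] = skewness(means)
    print(f"n={n:>2}  skewness of sample means = {skew_by_n[n]:.3f}")
```

The skewness of the sample means should shrink towards 0 as n grows, which is why a heavily skewed population needs a larger n before the normal approximation is acceptable.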

In general

Activity 1

• The average height of Vietnamese women is 1.6m,
with a standard deviation of 0.2m. If I choose 25
women at random, what is the probability that their
average height is less than 1.53m?
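One way to work this activity is to standardise the sample mean and look up the normal probability. The sketch below assumes the normal approximation is adequate for n = 25 (reasonable if heights are themselves close to normal) and uses `statistics.NormalDist` from the Python standard library in place of a Z table. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1.6, 0.2, 25   # population mean, population sd, sample size
threshold = 1.53              # sample-mean height of interest (metres)

standard_error = sigma / math.sqrt(n)     # 0.2 / 5 = 0.04
z = (threshold - mu) / standard_error     # (1.53 - 1.60) / 0.04 = -1.75
probability = NormalDist().cdf(z)         # P(Z < -1.75)

print(f"SE = {standard_error:.2f}, z = {z:.2f}, P = {probability:.4f}")
# prints: SE = 0.04, z = -1.75, P = 0.0401
```

So, under these assumptions, there is roughly a 4% chance that the average height of the 25 women is below 1.53m.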
