Lab1 FA09 Model Ans
For the example solutions here I'll use the random seed 619 (area code of San Diego).
[Plot 1: histogram of Frequency by Bin (bins 14, 18, 22, 26, 30)]
Ok, Plots 1 and 2 are both approximations of the Uniform(10,30) distribution. We knew that, of course, because that's how we made them.
A histogram, don't forget, is a statistic calculated from the sample data (like the sample mean and SD) and is therefore subject to random variation.
And, just as a sample mean will get closer to the true population mean as the sample size increases, so too will the histogram get closer in shape to the true distribution of the population (which, in this case, is Uniform(10,30)).
Consider this analogy. If you rolled a die thirty times, you would expect to get 5 ones, 5 twos, 5 threes, etc. But if the result was somewhat different from that, you wouldn't be that surprised, because of the randomness of throwing dice. However, if you rolled the die several thousand times, you'd expect the proportion of ones to be near 1/6, the proportion of twos to be near 1/6, etc. And the more times you roll the die, the closer those proportions will get to 1/6.
That is the Law of Large Numbers (which, by the way, has almost nothing to do with means).
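The die analogy can be tried out directly. The following is a minimal sketch, not the lab's spreadsheet procedure: the function names are made up for illustration, and although it reuses seed 619, Python's random stream will not reproduce the spreadsheet's numbers.

```python
import random
from collections import Counter

def face_proportions(n_rolls, seed=619):
    """Roll a fair six-sided die n_rolls times; return the proportion of each face."""
    # Assumption: seed 619 only echoes the lab; the stream differs from the spreadsheet's.
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

def max_gap_from_sixth(props):
    """Largest distance between any face's proportion and 1/6."""
    return max(abs(p - 1 / 6) for p in props.values())

# Thirty rolls can stray noticeably from 1/6; many more rolls should stray far less.
for n in (30, 3000, 300000):
    print(n, round(max_gap_from_sixth(face_proportions(n)), 4))
```

The printed gap should generally shrink as the number of rolls grows, which is the behaviour the analogy describes.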
Let me redo Plots 1 and 2, but this time with proportions on the y-axis, not frequencies. Since I'm using 5 bins, I'd expect each bin to capture 20% of the values.
With smaller sample sizes, random variation plays a big role, but with larger sample sizes random variation is cut dramatically.
See the plots below for a more visual description.
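To see the 20%-per-bin expectation numerically, here is a small sketch that draws from Uniform(10,30) and reports the proportion landing in each of 5 equal-width bins. The sample sizes (20 and 2000) and the function name are assumptions for illustration; the lab's actual sample sizes are not stated in this text.

```python
import random

def bin_proportions(n, seed=619, low=10.0, high=30.0, n_bins=5):
    """Proportion of n Uniform(low, high) draws falling in each equal-width bin."""
    rng = random.Random(seed)
    width = (high - low) / n_bins  # 4.0, so the upper bin edges are 14, 18, 22, 26, 30
    counts = [0] * n_bins
    for _ in range(n):
        x = rng.uniform(low, high)
        idx = min(int((x - low) / width), n_bins - 1)  # guard the x == high edge case
        counts[idx] += 1
    return [c / n for c in counts]

# Assumed sample sizes: a small sample and a large one.
for n in (20, 2000):
    print(n, [round(p, 3) for p in bin_proportions(n)])
```

With the small sample the proportions can sit well away from 0.2; with the large one they should cluster close to it.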
[Plot 1, redone with proportions: histogram of Proportion by Bin (bins 14, 18, 22, 26, 30)]
Moving on: 250 samples of 20 values each from Normal(25,5^2), same seed.
SD of means             1.16
number of rejections    14
too high                15
too low                 7
total rejections        22
pvalues
tail area right         0.72  0.58  0.63  0.49  0.63  0.63
tail area left          0.28  0.42  0.37  0.51  0.37  0.37
pvalue                  0.57  0.84  0.73  0.98  0.75  0.75
number of rejections    22
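The p-value row looks like twice the smaller of the two tail areas (for the first column, 2 x 0.28 = 0.56, which agrees with the listed 0.57 up to rounding of the displayed tail areas). The sketch below shows that calculation for a z statistic; treating the test as a two-sided z-test is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist

def tail_areas_and_pvalue(z):
    """Return (right tail area, left tail area, two-sided p-value) for a z statistic."""
    left = NormalDist().cdf(z)       # area to the left of z under the standard Normal
    right = 1.0 - left               # area to the right of z
    p_value = 2.0 * min(left, right) # assumed two-sided rule: double the smaller tail
    return right, left, p_value

right, left, p = tail_areas_and_pvalue(-0.58)
print(round(right, 2), round(left, 2), round(p, 2))
```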
Ok, Plot 4 approximates the sampling distribution of the means. It is Normal with mean = 25 and SE = 5/sqrt(20) = 1.118.
When I calculated the standard deviation of the means I got 1.14, which is not far from 1.118 (what I expected).
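As a rough check on the 1.118 figure, the sketch below simulates sample means and compares their standard deviation with sigma/sqrt(n). It assumes the setup stated earlier (samples of 20 from Normal(25,5^2)); the function name is illustrative, and Python's generator will not reproduce the spreadsheet's draws even with seed 619.

```python
import math
import random
import statistics

def sd_of_sample_means(n_samples=250, n=20, mu=25.0, sigma=5.0, seed=619):
    """Standard deviation of the means of n_samples Normal(mu, sigma^2) samples of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)

expected_se = 5.0 / math.sqrt(20)  # theoretical standard error, about 1.118
print(round(expected_se, 3), round(sd_of_sample_means(), 3))
```

With only 250 samples the simulated value will wobble around 1.118 from seed to seed, much as the 1.14 reported above does.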
Plot 6 is positive and slightly right-skewed but is NOT representative of a Chi-Square distribution.
But when you apply some minor changes to the variances, as in the formula for the Chi-Square statistic, you get Plot 7, which approximates the Chi-Square distribution WITH 19 df. Note the right-skew is a bit more pronounced in this histogram as well.
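The "minor changes" are presumably the usual scaling (n - 1) * s^2 / sigma^2, which with n = 20 and sigma^2 = 25 gives 19 df; that formula is an assumption here, since the text does not spell it out. A minimal sketch under that assumption, with a hypothetical function name:

```python
import random
import statistics

def chi_square_stats(n_samples=250, n=20, mu=25.0, sigma=5.0, seed=619):
    """Scale each sample variance to (n - 1) * s^2 / sigma^2 (assumed Chi-Square statistic)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)          # sample variance (n - 1 in the denominator)
        stats.append((n - 1) * s2 / sigma ** 2)   # scaled statistic with n - 1 = 19 df
    return stats

values = chi_square_stats()
# A Chi-Square distribution with 19 df has mean 19, so the average should land near 19.
print(round(statistics.fmean(values), 2))
```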
Check both ways that I calculated how many rejections I had and make sure you understand each method.
We expect 25 out of the 250 tests to (erroneously) reject the null hypothesis. I happened to get 22 rejections.
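The two counting methods can be sketched as follows: one compares each test statistic with a critical value (tallying "too high" and "too low"), the other compares each p-value with the level. Several things here are assumptions rather than facts from the lab: a two-sided z-test with known sigma = 5, a 10% level (inferred from "25 out of the 250"), and all names in the code.

```python
import math
import random
import statistics
from statistics import NormalDist

ALPHA = 0.10  # assumed level, inferred from expecting 25 rejections in 250 tests

def count_rejections(n_samples=250, n=20, mu0=25.0, sigma=5.0, seed=619, alpha=ALPHA):
    """Return (too_high, too_low, by_pvalue) rejection counts when the null is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.645 for alpha = 0.10
    se = sigma / math.sqrt(n)
    too_high = too_low = by_pvalue = 0
    for _ in range(n_samples):
        xbar = statistics.fmean(rng.gauss(mu0, sigma) for _ in range(n))
        z = (xbar - mu0) / se
        # Method 1: compare the statistic with the critical values.
        if z > z_crit:
            too_high += 1
        elif z < -z_crit:
            too_low += 1
        # Method 2: compare the two-sided p-value with the level.
        left = std_normal.cdf(z)
        if 2 * min(left, 1 - left) < alpha:
            by_pvalue += 1
    return too_high, too_low, by_pvalue

high, low, by_p = count_rejections()
print(high, low, high + low, by_p)
```

Both methods should give the same total, and over many repetitions that total should hover around 10% of the tests.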
Plot 4 (histogram of Frequency by Bin)
Bin    Frequency
22     0
23     8
24     31
25     72
26     57
27     21
28     10
29     1

Plot 6 (histogram of Frequency by Bin)
Bin    Frequency
4      0
11     4
18     31
25     64
32     64
39     27
46     8
53     2
Plot 3 looks different becaus
And, in this case, it is the No
We know this because we kn
comes from has mean = 20
see below
mean
SD Unif
SD Norm
The Central Limit Theorem s
the means is approximately
But let's be very clear here.
So a sample size of 15 is ok
normal distribution. 150 sam
to have a fairly accurate hist
Don't confuse the two.
[Plot 2, redone with proportions: histogram of Proportion by Bin (bins 14, 18, 22, 26, 30)]
16.81 27.45 21.29 29.12 23.53 27.47 21.57
33.64 25.08 29.29 15.65 26.72 25.4 25.58
27.32 21.37 23.63 26.78 26.81 20.36 19.41
17.61 22.89 25.56 32.4 19.39 28.98 25.73
25.12 22.75 23.07 18.02 22.63 22.34 30.07
35.44 23.15 18.74 28.18 20.92 20.36 20.79
17.68 21.26 22.2 19.68 18.72 20.89 28.36
23.28 26.44 26.86 25.75 22.18 27.51 21
22.01 26.9 21.49 27.86 30.13 22.57 21.65
22.11 20 29.72 24.58 24.74 25.67 29.06
25.61 34.42 25.67 29.26 32.58 28.52 25.37
24.31 30.45 24.83 19.13 29.33 20 30.16
23.84 20.3 21.15 18.93 26.9 35.59 32.18
22.65 28.69 27.45 21.02 15.37 21.38 27.62
30.29 21.59 28.81 15.4 23.41 27.56 19.96
27.33 14.85 18.6 34.96 21.21 27.59 26.18
25.37 23.4 30.84 24.79 18.9 15.26 14.68
29.65 30.44 18.05 25.71 29.35 22 35.96
12.46 27.01 21.43 28.9 19.31 18.99 27.75
17.32 28.93 23.47 25.39 18.28 23.71 34.04
sample has the population mean = 25 and so the null hypothesis is true for all samples).
the level of the test at the beginning to be the probability of committing this
12.5
4 bins 7 5
11 10
18 15
25 20
32 25
39 30
46 35
53 40
Plot 5 (histogram of Frequency by Bin)
Bin    Frequency
-3     0
-2     9
-1     27
0      75
1      60
2      24
3      4
4      1
Plot 7 (histogram of Frequency by Bin)
Bin    Frequency
5      0
10     10
15     40
20     59
25     62
30     20
35     7
40     2
21.5 26.9 10.18 27.5 27.23 23.49 26.76
10.3 12.22 16.63 28.26 22.63 10.22 10.04
22.91 22.58 29.91 28.48 22.55 26.66 18.29
27.67 29 22.74 11.34 16.99 12.99 14.15
29.92 24.52 12.38 24.71 17.22 29.53 18.1
29.79 11.27 18.71 28.9 26.26 24.2 29.07
21.21 29.18 20.55 20.46 29.12 16.48 28.84
25.65 21.31 20.1 17.21 22.91 12.35 21.48
11.29 18.3 23.48 15.6 27.56 22.85 24.98
27.26 28.66 11.44 13.19 14.65 29.18 21.68
23.8 16.34 11.68 16.97 26.73 11.87 29.29
17.88 23.11 14.75 26.46 15.5 13.86 18.45
19.34 24.55 22.29 20.53 26.78 29.08 17.07
20.46 26.71 21.15 22.03 25.37 11.79 10.14
12.02 28.75 23.24 16.17 26.76 21.02 27.75
[Plot 3: histogram by Bin (bins 18, 19, 20, 21, 22, 23, 24)]
mean       20
SD Unif    5.77
SD Norm    1.29
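These three values are consistent with the mean and SD of Uniform(10,30) and with the SD of sample means for samples of 20. The sample size of 20 is an assumption: it is the value that reproduces 1.29, not something stated next to these numbers. A short check under that assumption:

```python
import math

low, high = 10.0, 30.0
n = 20  # assumed sample size; chosen because it reproduces the 1.29 above

mean = (low + high) / 2                   # mean of Uniform(low, high)
sd_unif = (high - low) / math.sqrt(12)    # SD of Uniform(low, high)
sd_means = sd_unif / math.sqrt(n)         # SD of the sample means (standard error)

print(round(mean, 2), round(sd_unif, 2), round(sd_means, 2))
```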