MAST20005 Statistics, Assignment 2

Brendan Hill - Student 699917 (Tutorial Thursday 10am)

November 19, 2016

Question 1
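To determine the size of the new sample that gives a $100(1-\alpha)\%$ confidence interval of width $\pm\varepsilon$, we must solve the following equation for $n$:

$$\varepsilon = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$

We want a 95% confidence interval ($\alpha = 0.05$) of width $\pm\varepsilon$ with $\varepsilon = 0.5$, and from the previous experiment we can assume $\sigma = \sqrt{34.9}$. Hence:

$$0.5 = z_{0.025} \cdot \frac{\sqrt{34.9}}{\sqrt{n}}$$

$$\Rightarrow n = \left(1.96 \cdot \frac{\sqrt{34.9}}{0.5}\right)^2$$

$$\Rightarrow n = 536.2677$$

Rounding up, the sample size required is $n = 537$.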
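As a quick check of the arithmetic, the calculation can be sketched in Python using only the standard library. The function name `required_sample_size` is illustrative, and the sketch assumes the previous experiment supplies a variance of 34.9 (so $\sigma = \sqrt{34.9}$), which is the reading consistent with $n = 536.2677$.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma: float, half_width: float, confidence: float = 0.95) -> int:
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= half_width."""
    alpha = 1.0 - confidence
    # Upper alpha/2 point of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * sigma / half_width) ** 2)


# Assumption: 34.9 is a variance, so sigma = sqrt(34.9).
n_required = required_sample_size(sigma=math.sqrt(34.9), half_width=0.5)
print(n_required)
```

Rounding up rather than to the nearest integer guarantees the interval is no wider than requested.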

Question 2
I will use the textbook convention $y_i = \alpha + \beta(x_i - \bar{x}) + \varepsilon_i$. Note that $\bar{x} = 23.0667$.

(a)
The least squares regression line is:

y = 26.33333 + 0.5062(x − 23.0667)

(b)
The scatterplot with regression line is:

[Figure: scatterplot of the data with the fitted regression line (assignment 2 Q2b.PNG)]
Although the points scatter considerably about the fitted line, the linear model may still be appropriate.

(c)
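Point estimates for the linear model (where $\hat{\sigma}^2$ is calculated using $(n-2)$ degrees of freedom):

$$\hat{\alpha} = \bar{y} = 26.3333$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n} y_i (x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = 0.5062$$

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left(y_i - \hat{\alpha} - \hat{\beta}(x_i - \bar{x})\right)^2 = 16.29896$$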
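The point estimates in the centred parameterisation can be sketched as below. The assignment's data set is not reproduced here, so the `xs` and `ys` values are hypothetical toy numbers, and `fit_centered` is an illustrative name rather than anything from the assignment.

```python
def fit_centered(xs, ys):
    """Least squares fit of y = alpha + beta * (x - x_bar) + error.

    Returns (alpha_hat, beta_hat, sigma2_hat), where sigma2_hat divides
    the residual sum of squares by (n - 2).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    alpha_hat = sum(ys) / n  # in the centred model alpha_hat is y_bar
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sum(y * (x - x_bar) for x, y in zip(xs, ys)) / sxx
    rss = sum((y - alpha_hat - beta_hat * (x - x_bar)) ** 2 for x, y in zip(xs, ys))
    return alpha_hat, beta_hat, rss / (n - 2)


# Hypothetical toy data, not the assignment's sample.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
alpha_hat, beta_hat, sigma2_hat = fit_centered(xs, ys)
print(alpha_hat, beta_hat, sigma2_hat)
```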

(d)
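The 95% confidence intervals for $\alpha$, $\beta$ and $\sigma^2$ are given by:

$$\alpha: \quad \hat{\alpha} \pm t_{0.025}(n-2) \frac{\hat{\sigma}}{\sqrt{n}} = [24.08137, 28.58530]$$

$$\beta: \quad \hat{\beta} \pm t_{0.025}(n-2) \left[\frac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}\right] = [0.0445, 0.9678]$$

$$\sigma^2: \quad \left[\frac{n\hat{\sigma}^2}{\chi^2_{0.975}(n-2)}, \frac{n\hat{\sigma}^2}{\chi^2_{0.025}(n-2)}\right] = [9.88390, 48.81145]$$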

(e)
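Let $x_0 = 25$. Then, the 95% confidence interval for the mean score is:

$$\hat{y}_c: \quad \hat{\alpha} + \hat{\beta}(x_0 - \bar{x}) \pm t_{0.025}(n-2) \cdot \hat{\sigma} \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = [24.88953, 29.73430]$$

And the 95% prediction interval:

$$\hat{y}_p: \quad \hat{\alpha} + \hat{\beta}(x_0 - \bar{x}) \pm t_{0.025}(n-2) \cdot \hat{\sigma} \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = [18.25994, 36.36390]$$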
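The two intervals differ only by the extra "1 +" under the square root, which a short sketch makes explicit. The inputs below are hypothetical summary statistics (the assignment's $n$ and $\sum (x_i - \bar{x})^2$ are not stated in the text), the function name is illustrative, and the critical value $t_{0.025}(3) = 3.182446$ is hard-coded from tables because the Python standard library has no t quantile function.

```python
import math


def regression_intervals(alpha_hat, beta_hat, sigma_hat, n, x_bar, sxx, x0, t_crit):
    """Return (confidence interval for the mean, prediction interval) at x0."""
    centre = alpha_hat + beta_hat * (x0 - x_bar)
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    half_mean = t_crit * sigma_hat * math.sqrt(leverage)
    half_pred = t_crit * sigma_hat * math.sqrt(1.0 + leverage)
    return (centre - half_mean, centre + half_mean), (centre - half_pred, centre + half_pred)


# Hypothetical inputs: n = 5, x_bar = 3, Sxx = 10, sigma_hat^2 = 0.8.
T_025_3 = 3.182446  # t_{0.025}(3), taken from tables
mean_ci, pred_int = regression_intervals(
    alpha_hat=4.0, beta_hat=0.6, sigma_hat=math.sqrt(0.8),
    n=5, x_bar=3.0, sxx=10.0, x0=4.0, t_crit=T_025_3,
)
print(mean_ci, pred_int)
```

The prediction interval is always wider, since it accounts for the variability of a new observation as well as the uncertainty in the fitted mean.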

Question 3
(a)
Given $H_0: \theta = 2$, the probability of a Type I error is:

$$\alpha = P(X > 3 \mid \theta = 2) = 1 - (1 - e^{-3/\theta}) = e^{-3/2} = 0.223130$$

(b)
Given $H_1: \theta = 5$, the probability of a Type II error is:

$$\beta = P(X \le 3 \mid \theta = 5) = 1 - e^{-3/\theta} = 1 - e^{-3/5} = 0.451188$$

(c)
The power of the test is:
1 − β = 0.548812

(d)
Note that under $H_0$:

$$P(X > 5.991465 \mid \theta = 2) = 1 - (1 - e^{-5.991465/2}) \approx 0.05$$

So the following test of $H_0$ against $H_1$ has a significance level of 0.05:

Reject $H_0$ if the observed value $x > 5.991465$.
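The four quantities in this question follow directly from the exponential tail probability $P(X > x) = e^{-x/\theta}$, as the following sketch shows. The helper name `exp_tail` is illustrative, and the 0.05-level cutoff is obtained by solving $e^{-c/2} = 0.05$ for $c$.

```python
import math


def exp_tail(x: float, theta: float) -> float:
    """P(X > x) for an exponential distribution with mean theta."""
    return math.exp(-x / theta)


type_i = exp_tail(3, 2)        # P(reject H0 | theta = 2)
type_ii = 1 - exp_tail(3, 5)   # P(fail to reject H0 | theta = 5)
power = 1 - type_ii
# Cutoff c with P(X > c | theta = 2) = 0.05, i.e. c = -2 * ln(0.05).
critical_value = -2 * math.log(0.05)

print(round(type_i, 6), round(type_ii, 6), round(power, 6), round(critical_value, 6))
```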

Question 4
(a)
Assume that $X \sim N(\mu, \sigma^2)$. Given $H_0: \mu = 0.5$, the test against $H_1: \mu > 0.5$ with significance level 0.05 rejects $H_0$ when:

$$t = \frac{\bar{X} - 0.5}{s/\sqrt{n}} \ge t_{0.05}(n-1)$$

(b)
The sample provided yields $n = 10$, $\bar{x} = 0.484$, $s = 0.2398$, so:

$$t = \frac{0.484 - 0.5}{0.2398/\sqrt{10}} = -0.210973$$

$$t_{0.05}(9) = 1.833113$$

Since it is not the case that $-0.210973 \ge 1.833113$, this sample does not provide enough evidence to reject $H_0$.

(c)
The two-sided 95% confidence interval is given by the following formula:

$$\bar{x} \pm t_{0.025}(n-1) \cdot \frac{s}{\sqrt{n}}$$

So the two-sided confidence interval given by the sample is:

$$\begin{aligned}
&0.484 \pm t_{0.025}(9) \cdot \frac{0.2398}{\sqrt{10}} \\
&= 0.484 \pm 2.262157 \cdot 0.075839 \\
&= 0.484 \pm 0.1715597 \\
&= [0.3124403, 0.6555597]
\end{aligned}$$

(d)
The test statistic $t$ did not fall in the (one-sided) critical region, so we fail to reject $H_0$ in favour of the alternative hypothesis $H_1: \mu > 0.5$ at the 0.05 significance level.

Additionally, the hypothesized value $\mu = 0.5$ falls within the 95% (two-sided) confidence interval for $\mu$, so $H_0$ would also not be rejected against an alternative hypothesis $H_2: \mu \neq 0.5$ at the 0.05 significance level.
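The test and the interval in this question can be reproduced from the rounded summary statistics with a short sketch. The function names are illustrative, and the critical values $t_{0.05}(9)$ and $t_{0.025}(9)$ are hard-coded from the values quoted above because the Python standard library has no t quantile function. Small differences in the last digits relative to the figures above come from using the rounded $s = 0.2398$.

```python
import math


def one_sample_t(x_bar: float, mu0: float, s: float, n: int) -> float:
    """t statistic for H0: mu = mu0 from summary statistics."""
    return (x_bar - mu0) / (s / math.sqrt(n))


def t_interval(x_bar: float, s: float, n: int, t_crit: float):
    """Two-sided confidence interval x_bar +/- t_crit * s / sqrt(n)."""
    half = t_crit * s / math.sqrt(n)
    return x_bar - half, x_bar + half


T_05_9 = 1.833113    # t_{0.05}(9), quoted in part (b)
T_025_9 = 2.262157   # t_{0.025}(9), quoted in part (c)

t_stat = one_sample_t(x_bar=0.484, mu0=0.5, s=0.2398, n=10)
reject_h0 = t_stat >= T_05_9          # one-sided test against mu > 0.5
lower, upper = t_interval(0.484, 0.2398, 10, T_025_9)
print(t_stat, reject_h0, lower, upper)
```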

Question 5
(a)
The test statistic $t$ and critical region are given by the following inequality:

$$t = \frac{\bar{W} - 0}{s/\sqrt{n}} \le -t_{0.05}(n-1)$$

(b)
The sample provided yields $n = 20$, $\bar{w} = -0.325$, $s = 0.6463$, so the rejection condition is:

$$t = \frac{-0.325}{0.6463/\sqrt{20}} \le -t_{0.05}(19)$$

Hence:

$$t = -2.248709$$

$$-t_{0.05}(19) = -1.729133$$

Since $-2.248709 \le -1.729133$, the observed value of $\bar{w}$ is more extreme than we would expect under $H_0$ at the 0.05 significance level, so we reject the null hypothesis.

(c)
At the 0.01 significance level we have the critical value:

$$-t_{0.01}(19) = -2.539483$$

However, since $-2.539483 < -2.248709 = t$, we cannot reject $H_0$ at this significance level.

(d)
The p-value is 0.018295.
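The p-value is the lower-tail probability $P(T \le t)$ for a t distribution with 19 degrees of freedom. The Python standard library has no t distribution, so the sketch below integrates the density numerically with composite Simpson's rule; this is an illustrative substitute for a statistics package, not the method used in the assignment, and the names `t_pdf` and `t_cdf` are assumptions. Because it uses the rounded $s = 0.6463$, the result agrees with the quoted p-value only to about four decimal places.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x) via composite Simpson's rule on [0, |x|] plus symmetry.

    steps must be even.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


t_stat = -0.325 / (0.6463 / math.sqrt(20))
p_value = t_cdf(t_stat, df=19)  # lower-tail p-value for H1: mu < 0
print(t_stat, p_value)
```

The p-value lies between 0.01 and 0.05, which matches the decisions in parts (b) and (c): reject at the 0.05 level but not at the 0.01 level.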

Question 6
We shall assume that the plant growth rates are normally distributed.

So the growth rate of plants exposed to normal air is distributed according to $N(\mu_X, \sigma_X^2)$, and the growth rate of plants exposed to enriched air is distributed according to $N(\mu_Y, \sigma_Y^2)$.

The sample standard deviations are $s_X = 0.9562$ and $s_Y = 1.6098$. Given the difference, we will not assume that the variances are equal.

Let the null hypothesis be H0 : µX = µY and the alternative hypothesis be H1 : µX < µY .

The test statistic and critical region at the 0.05 significance level are given by:

$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\frac{S_X^2}{n} + \frac{S_Y^2}{m}}} \le -t_{0.05}(n + m - 2)$$

The sample yields $n = 12$, $m = 8$, $\bar{x} = 4.16333$, $\bar{y} = 5.105$, $s_X = 0.9562$ and $s_Y = 1.6098$, so:

$$t = \frac{4.16333 - 5.105}{\sqrt{\frac{0.9562^2}{12} + \frac{1.6098^2}{8}}} = -1.488675$$

$$-t_{0.05}(18) = -1.734064$$

Since it is not the case that $t \le -1.734064$, we cannot reject $H_0$ at the 0.05 significance level.

Hence, there is not enough evidence from this sample to conclude that the enriched air increased plant growth.
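The test statistic can be recomputed from the summary statistics with a minimal sketch. The function name is illustrative, and the critical value $t_{0.05}(18) = 1.734064$ is hard-coded from the value quoted above, following the assignment's choice of $n + m - 2$ degrees of freedom.

```python
import math


def two_sample_t(x_bar, y_bar, s_x, s_y, n, m):
    """t statistic with unpooled standard error sqrt(s_x^2/n + s_y^2/m)."""
    return (x_bar - y_bar) / math.sqrt(s_x ** 2 / n + s_y ** 2 / m)


T_05_18 = 1.734064  # t_{0.05}(18), as quoted in the text

t_stat = two_sample_t(x_bar=4.16333, y_bar=5.105, s_x=0.9562, s_y=1.6098, n=12, m=8)
reject_h0 = t_stat <= -T_05_18  # lower-tailed test of mu_X < mu_Y
print(t_stat, reject_h0)
```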

Question 7
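Suppose the null hypothesis is $H_0: \sigma^2 = \sigma_0^2$ and the alternative hypothesis is $H_1: \sigma^2 > \sigma_0^2$.

Then, the usual test statistic $t$ and critical region at significance level $\alpha$ are:

$$t = \frac{(n-1)s^2}{\sigma_0^2} \ge \chi^2_\alpha(n-1)$$

In general, a $\chi^2$ distribution approaches a normal distribution as the degrees of freedom $v$ become large, according to the following relationship:

$$\frac{\chi^2(v) - v}{\sqrt{2v}} \approx N(0, 1), \quad \text{as } v \to \infty$$

Hence for large enough $n$, given degrees of freedom $v = n - 1$, the following test statistic can be used:

$$z = \frac{\frac{(n-1)s^2}{\sigma_0^2} - (n-1)}{\sqrt{2(n-1)}} \ge z_\alpha$$

So for large enough $n$, an approximate critical region for testing $H_0$ against $H_1$ at the $\alpha$ significance level is given by:

$$\begin{aligned}
& \frac{\frac{(n-1)s^2}{\sigma_0^2} - (n-1)}{\sqrt{2(n-1)}} \ge z_\alpha \\
\Rightarrow\ & \frac{(n-1)s^2}{\sigma_0^2} - (n-1) \ge z_\alpha \sqrt{2(n-1)} \\
\Rightarrow\ & \frac{(n-1)s^2}{\sigma_0^2} \ge (n-1) + z_\alpha \sqrt{2(n-1)} \\
\Rightarrow\ & \frac{s^2}{\sigma_0^2} \ge \frac{(n-1)}{(n-1)} + z_\alpha \frac{\sqrt{2(n-1)}}{(n-1)} \\
\Rightarrow\ & \frac{s^2}{\sigma_0^2} \ge 1 + z_\alpha \sqrt{\frac{2}{n-1}} \\
\Rightarrow\ & s^2 \ge \sigma_0^2 \left(1 + z_\alpha \sqrt{\frac{2}{n-1}}\right)
\end{aligned}$$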
Question 8
(a)
Given the large sample size, the normal distribution can be used instead of the t distribution.

So the test statistic and critical region are:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \ge z_{0.05} = 1.64$$

where $\hat{p} = \frac{Y_1 + Y_2}{n_1 + n_2}$, given that under the null hypothesis $p_1 = p_2$.

(b)
Note that $\hat{p} = (135 + 77)/(900 + 700) = 0.1325$, $\hat{p}_1 = 135/900 = 0.15$ and $\hat{p}_2 = 77/700 = 0.11$. So the test statistic is:

$$z = \frac{0.15 - 0.11}{\sqrt{0.1325(1 - 0.1325)\left(\frac{1}{900} + \frac{1}{700}\right)}} = 2.3411$$

Since $2.3411 > 1.64$, we reject $H_0$ at the 0.05 significance level.

(c)
If $\alpha = 0.01$ then the critical region is given by $z > z_{0.01} = 2.3263$.

Since $z = 2.3411 > 2.3263$, we reject the null hypothesis at the 0.01 significance level as well.

(d)
The p-value of this test is 0.009613
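The pooled two-proportion test can be checked with the standard library's `statistics.NormalDist`. This is a sketch under the same pooling assumption stated in part (a), and the function name is illustrative.

```python
import math
from statistics import NormalDist


def two_proportion_z(y1: int, n1: int, y2: int, n2: int) -> float:
    """z statistic for H0: p1 = p2 using the pooled proportion estimate."""
    p1_hat, p2_hat = y1 / n1, y2 / n2
    p_hat = (y1 + y2) / (n1 + n2)
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


std_normal = NormalDist()
z_stat = two_proportion_z(135, 900, 77, 700)
p_value = 1 - std_normal.cdf(z_stat)            # upper-tail p-value
reject_at_05 = z_stat > std_normal.inv_cdf(0.95)
reject_at_01 = z_stat > std_normal.inv_cdf(0.99)
print(z_stat, p_value, reject_at_05, reject_at_01)
```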

Question 9
Given a random sample of size $n$ from a population distributed according to $N(\mu, \sigma^2)$ with known $\sigma^2$, the sample mean $\bar{X}$ follows a normal distribution (for large enough $n$). Under $H_0: \mu = \mu_0$, the standardized sample mean is:

$$t_N = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$$

Since the sum of the squares of $k$ independent standard normal random variables has a $\chi^2(k)$ distribution, and we have a single standard normal random variable on the left-hand side, squaring both sides gives the following test statistic:

$$t_{\chi^2} = \left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right)^2 \sim \chi^2(1)$$

Hence for large enough $n$, the hypothesis $H_0: \mu = \mu_0$ can be tested against the alternative $H_1: \mu \neq \mu_0$ using the following test statistic and critical region:

$$t_{\chi^2} = \left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right)^2 \ge \chi^2_\alpha(1)$$

Squaring the standard normal variable maps both the left and right tails (each of area $\alpha/2$) to the right tail of the $\chi^2$ distribution. Since the rejection region reduces to a single tail, the full significance level $\alpha$ is used for the $\chi^2$ critical value.
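The equivalence between the two-sided z test and the one-tailed $\chi^2(1)$ test rests on the identity $\chi^2_\alpha(1) = z_{\alpha/2}^2$, which the following sketch uses directly. The sample values passed to the functions are hypothetical, and the function names are illustrative.

```python
import math
from statistics import NormalDist

ALPHA = 0.05
z_half = NormalDist().inv_cdf(1 - ALPHA / 2)  # z_{alpha/2}
chi2_crit = z_half ** 2                        # chi^2_alpha(1) = z_{alpha/2}^2


def standardized_mean(x_bar, mu0, sigma, n):
    """(x_bar - mu0) / (sigma / sqrt(n)), standard normal under H0."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))


def z_test_rejects(x_bar, mu0, sigma, n):
    """Two-sided z test: reject when |z| >= z_{alpha/2}."""
    return abs(standardized_mean(x_bar, mu0, sigma, n)) >= z_half


def chi2_test_rejects(x_bar, mu0, sigma, n):
    """Chi-square form: reject when z^2 >= chi^2_alpha(1)."""
    return standardized_mean(x_bar, mu0, sigma, n) ** 2 >= chi2_crit


# Hypothetical samples: mu0 = 10, sigma = 1.5, n = 50.
for x_bar in (10.4, 10.5, 9.5):
    print(x_bar, z_test_rejects(x_bar, 10, 1.5, 50), chi2_test_rejects(x_bar, 10, 1.5, 50))
```

Both forms always reach the same decision, since $|z| \ge z_{\alpha/2}$ holds exactly when $z^2 \ge z_{\alpha/2}^2$.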
