
W10PS

Uploaded by

polar neckson

Statistics for Data Science - 2

Week 10 practice Assignment


Hypothesis testing

1. Consider nine samples from Normal$(100, 2^2)$. Suppose we wish to test $H_0: \mu = 100$ against
$H_A: \mu \neq 100$.

(i) If the acceptance region is defined as $98.5 \le \bar{X} \le 101.5$, find the significance level.
Write your answer correct to two decimal places.
(Use $P(-2.25 < Z < 2.25) = 0.975$)

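Solution:
Given that
\[ H_0: \mu = 100, \qquad H_A: \mu \neq 100, \]
the acceptance region is defined as $98.5 \le \bar{X} \le 101.5$. Under $H_0$, $\bar{X} \sim \text{Normal}(100, 2^2/9)$, so the standard deviation of $\bar{X}$ is $2/3$. Now,
\[
\begin{aligned}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
&= P(\bar{X} > 101.5 \text{ or } \bar{X} < 98.5 \mid \mu = 100) \\
&= P(|\bar{X} - 100| > 1.5) \\
&= P\left( \left| \frac{\bar{X} - 100}{2/3} \right| > \frac{1.5}{2/3} \right) \\
&= P(|Z| > 2.25) \\
&= 1 - P(-2.25 < Z < 2.25) \\
&= 1 - 0.975 = 0.025
\end{aligned}
\]
The hint value 0.975 is rounded; with an unrounded table value, $P(|Z| > 2.25) \approx 0.0244$, so the answer to two decimal places is 0.02.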

(ii) Find the power of the test against an alternative that the mean is 103. Write your
answer correct to two decimal places.
(Use P (−6.75 < Z < −2.25) = 0.012)

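Solution:
Under the alternative, $\bar{X} \sim \text{Normal}(103, 2^2/9)$, so
\[
\begin{aligned}
1 - \beta &= P(\text{reject } H_0 \mid H_A \text{ is true}) \\
&= P(\bar{X} > 101.5 \text{ or } \bar{X} < 98.5 \mid \mu = 103) \\
&= P(\bar{X} < 98.5) + P(\bar{X} > 101.5) \\
&= P(\bar{X} - 103 < -4.5) + P(\bar{X} - 103 > -1.5) \\
&= P\left( \frac{\bar{X} - 103}{2/3} < \frac{-4.5}{2/3} \right) + P\left( \frac{\bar{X} - 103}{2/3} > \frac{-1.5}{2/3} \right) \\
&= P(Z < -6.75) + P(Z > -2.25) \\
&= 1 - P(-6.75 < Z < -2.25) \\
&= 1 - 0.012 = 0.988 \approx 0.99
\end{aligned}
\]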
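The two probabilities in Problem 1 can be checked numerically. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the helper name `rejection_probability` is hypothetical and not part of the assignment.

```python
from statistics import NormalDist


def rejection_probability(mu, sigma=2.0, n=9, lower=98.5, upper=101.5):
    """P(sample mean falls outside [lower, upper]) when the true mean is mu."""
    # Sampling distribution of the mean: Normal(mu, sigma / sqrt(n)).
    xbar = NormalDist(mu, sigma / n ** 0.5)
    return xbar.cdf(lower) + (1.0 - xbar.cdf(upper))


alpha = rejection_probability(100.0)  # significance level (H0 true)
power = rejection_probability(103.0)  # power against mu = 103
print(f"alpha = {alpha:.4f}, power = {power:.4f}")
```

Because this uses unrounded normal probabilities, the results can differ in the third decimal place from values obtained with the rounded hints.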

2. Air crew escape systems are powered by a solid propellant. The mean burning rate of
this propellant must be 50 centimeters per second. We know that the standard deviation
of burning rate is σ = 2 centimeters per second. An engineer suspects that the mean
burning rate is greater than 50. The engineer decides to test at a significance level of
0.05 and selects a random sample of n = 25 and obtains a sample average burning rate
of 51.3 centimeters per second.

(i) State the null and alternative hypotheses.

(a) $H_0: \mu = 50$, $H_A: \mu \neq 50$
(b) $H_0: \mu = 50$, $H_A: \mu < 50$
(c) $H_0: \mu = 50$, $H_A: \mu > 50$
(d) $H_0: \bar{X} = 50$, $H_A: \bar{X} > 50$
Solution:
The mean burning rate of the propellant must be 50 centimeters per second, and the engineer suspects that it is greater than 50. Therefore, the null and alternative hypotheses are
\[ H_0: \mu = 50, \qquad H_A: \mu > 50, \]
which is option (c).

(ii) What is the critical value $c$ if the acceptance region is $\bar{X} \le c$? Write your answer
correct to two decimal places.
(Use $F_Z(1.64) = 0.95$)

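Solution:
With $n = 25$ and $\sigma = 2$, the standard deviation of $\bar{X}$ is $2/\sqrt{25} = 2/5$. If the significance level of the test is 0.05, then
\[
\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.05 \\
\Rightarrow\ & P(\bar{X} > c \mid \mu = 50) = 0.05 \\
\Rightarrow\ & P(\bar{X} - 50 > c - 50) = 0.05 \\
\Rightarrow\ & P\left( \frac{\bar{X} - 50}{2/5} > \frac{c - 50}{2/5} \right) = 0.05 \\
\Rightarrow\ & P\left( Z > \frac{c - 50}{2/5} \right) = 0.05 \\
\Rightarrow\ & 1 - F_Z\left( \frac{c - 50}{2/5} \right) = 0.05 \\
\Rightarrow\ & F_Z\left( \frac{c - 50}{2/5} \right) = 0.95 \\
\Rightarrow\ & \frac{c - 50}{2/5} = 1.64 \\
\Rightarrow\ & c = 50 + (1.64)\frac{2}{5} \\
\Rightarrow\ & c = 50.656 \approx 50.66
\end{aligned}
\]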

(iii) What conclusions should be drawn from the selected sample?


(a) The mean burning rate of the propellant is 50.
(b) The mean burning rate of the propellant is greater than 50.
(c) The mean burning rate of the propellant is less than 50.
(d) No conclusion can be drawn from the given sample.
Solution:
Given that $\bar{X} = 51.3$.
We reject $H_0$ if $\bar{X} > c$, where $c = 50 + (1.64)\frac{2}{5} = 50.656$. Since $51.3 > 50.656$, we reject the null hypothesis.
At the 0.05 significance level, the sample supports the conclusion that the mean burning rate of the propellant is greater than 50, which is option (b).
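The critical value and the decision for Problem 2 can be reproduced in a few lines. This is an illustrative sketch only, assuming Python 3.8+; the variable names are hypothetical, and `z_table` is the table value 1.64 given in the problem, while `z_exact` is the unrounded quantile.

```python
from statistics import NormalDist

mu0, sigma, n, alpha = 50.0, 2.0, 25, 0.05
xbar_observed = 51.3

z_table = 1.64                             # value supplied in the problem
z_exact = NormalDist().inv_cdf(1 - alpha)  # unrounded 0.95 quantile

se = sigma / n ** 0.5                      # standard deviation of the sample mean
c_table = mu0 + z_table * se               # critical value from the table
c_exact = mu0 + z_exact * se               # critical value from the exact quantile

reject = xbar_observed > c_table           # one-sided (upper-tail) decision
print(f"c_table = {c_table:.3f}, c_exact = {c_exact:.3f}, reject H0: {reject}")
```

Both critical values lie well below the observed mean of 51.3, so the decision does not depend on which quantile is used.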

3. Suppose a manufacturer of memory chips observes that the probability of chip failure
is p = 0.05. A new procedure is introduced to improve the design of chips and lower
the probability of chip failure. To test this new procedure, 200 chips are produced using
this new procedure and tested. We would accept the new procedure if the total number
of failed chips is less than 5 out of 200. Find the significance level of the test. Use the
normal approximation. Write your answer correct to three decimal places.
(Use P (Z < −1.62) = 0.052)
Solution:
A new procedure is introduced to improve the design of chips and lower the probability
of chip failure. Therefore, the null and alternative hypotheses are
\[ H_0: p = 0.05, \qquad H_A: p < 0.05. \]

Define a test statistic $T$ as $T$ = number of failed chips out of 200.

Given that: We would accept the new procedure if the total number of failed chips is
less than 5 out of 200.

It implies that we will reject the null hypothesis if $T < 5$.

Notice that $T \sim \text{Binomial}(200, p)$.

When the null hypothesis is true, $E[T] = 200p = 200(0.05) = 10$ and
$\text{Var}(T) = 200p(1 - p) = 200(0.05)(0.95) = 9.5$.

By the CLT, we can say that, approximately,
\[ \frac{T - 10}{\sqrt{9.5}} \sim \text{Normal}(0, 1). \]
Now, the significance level is given by
\[
\begin{aligned}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
&= P(T < 5) \\
&= P\left( \frac{T - 10}{\sqrt{9.5}} < \frac{5 - 10}{\sqrt{9.5}} \right) \\
&= P(Z < -1.62) \\
&= F_Z(-1.62) \\
&= 0.052
\end{aligned}
\]
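As a rough check on the normal approximation used in Problem 3, the sketch below computes the same tail probability two ways: with the approximation and with the exact Binomial(200, 0.05) sum. It assumes Python 3.8+ (`math.comb`, `statistics.NormalDist`). The exact computation is an addition for comparison and is not part of the assignment's solution, which asks for the normal approximation.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p0, cutoff = 200, 0.05, 5               # reject H0 when T < cutoff

mean = n * p0                              # E[T] under H0 = 10
var = n * p0 * (1 - p0)                    # Var(T) under H0 = 9.5

# Normal approximation without continuity correction, as in the solution.
alpha_normal = NormalDist().cdf((cutoff - mean) / sqrt(var))

# Exact binomial tail P(T <= 4) for comparison.
alpha_exact = sum(comb(n, k) * p0 ** k * (1 - p0) ** (n - k) for k in range(cutoff))

print(f"normal approximation: {alpha_normal:.4f}")
print(f"exact binomial tail:  {alpha_exact:.4f}")
```

The exact tail comes out smaller than the approximation here, which is common when the cutoff sits far out in the tail of a skewed binomial distribution.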

4. The mean lifetime of a sample of 100 light bulbs produced by a company is computed
to be 1570 hours with a standard deviation of 120 hours. Let $\mu$ be the mean lifetime of all
the bulbs produced by the company.

(i) Test the hypothesis $\mu = 1600$ against the alternative hypothesis $\mu \neq 1600$ at a level
of significance of 0.05.
(a) Reject the null hypothesis
(b) Accept the null hypothesis
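Solution:
Given that
\[ H_0: \mu = 1600, \qquad H_A: \mu \neq 1600. \]
Define a test statistic $T$ as $T = \bar{X}$.
Test: reject $H_0$ if $|\bar{X} - 1600| > c$.
Notice that when the null hypothesis is true, we have (using the sample standard deviation 120 in place of $\sigma$, since $n = 100$ is large)
\[ \frac{\bar{X} - 1600}{120/10} \sim \text{Normal}(0, 1). \]
Now,
\[
\begin{aligned}
& \alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
\Rightarrow\ & P(|\bar{X} - 1600| > c) = 0.05 \\
\Rightarrow\ & P\left( \left| \frac{\bar{X} - 1600}{120/10} \right| > \frac{c}{120/10} \right) = 0.05 \\
\Rightarrow\ & P\left( |Z| > \frac{c}{12} \right) = 0.05 \\
\Rightarrow\ & 2P\left( Z < \frac{-c}{12} \right) = 0.05 \\
\Rightarrow\ & F_Z\left( \frac{-c}{12} \right) = 0.025 \\
\Rightarrow\ & \frac{-c}{12} = -1.96 \\
\Rightarrow\ & c = 12(1.96) = 23.52
\end{aligned}
\]
It implies that we will reject the null hypothesis if $|\bar{X} - 1600| > 23.52$.

Given that $\bar{X} = 1570$,
\[ |\bar{X} - 1600| = |1570 - 1600| = 30 > 23.52. \]
Therefore, we will reject the null hypothesis, which is option (a).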

(ii) Find the P -value. Write your answer correct to three decimal places.
Solution:
The P-value is the minimum significance level at which the null hypothesis is rejected for
the observed test statistic value.
Therefore, the P-value is given by
\[
\begin{aligned}
\alpha &= P(|\bar{X} - 1600| > |1570 - 1600|) \\
&= P(|\bar{X} - 1600| > 30) \\
&= P\left( \left| \frac{\bar{X} - 1600}{120/10} \right| > \frac{30}{120/10} \right) \\
&= P(|Z| > 2.5) \\
&= 2P(Z < -2.5) \\
&= 2(0.0062) = 0.0124 \approx 0.012
\end{aligned}
\]
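The two-sided test and its P-value in Problem 4 can be sketched as follows, assuming Python 3.8+ and treating the sample standard deviation as the population value, as the solution does. All variable names are illustrative.

```python
from statistics import NormalDist

mu0, s, n, xbar, alpha = 1600.0, 120.0, 100, 1570.0, 0.05

std = NormalDist()                      # standard normal
se = s / n ** 0.5                       # standard error of the mean = 12

c = std.inv_cdf(1 - alpha / 2) * se     # half-width of the acceptance region
z = (xbar - mu0) / se                   # observed z-score = -2.5
p_value = 2 * std.cdf(-abs(z))          # two-sided P-value

reject = abs(xbar - mu0) > c
print(f"c = {c:.2f}, z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Rejecting when the deviation exceeds `c` is equivalent to rejecting when `p_value < alpha`, so both parts of the problem lead to the same decision.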

5. The average IQ of the students of a school is reported to be 107 with a standard deviation
of 4. You suspect that the average may be higher, possibly 110, and decide to sample
students to find their IQs. What sample size do you need for a test at the significance
level 0.05 and power 0.95?
(Use: FZ (1.64) = 0.95 and FZ (−1.64) = 0.05)

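Solution:
According to the question, we have
\[ H_0: \mu = 107, \qquad H_A: \mu > 107. \]

Define a test statistic $T$ as $T = \bar{X}$.

Test: reject $H_0$ if $\bar{X} > c$.
Notice that when the null hypothesis is true, we have
\[ \frac{\bar{X} - 107}{4/\sqrt{n}} \sim \text{Normal}(0, 1). \]

Now, the significance level of the test is given to be 0.05. It implies that
\[
\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.05 \\
\Rightarrow\ & P(\bar{X} > c) = 0.05 \\
\Rightarrow\ & P\left( \frac{\bar{X} - 107}{4/\sqrt{n}} > \frac{c - 107}{4/\sqrt{n}} \right) = 0.05 \\
\Rightarrow\ & P\left( Z > \frac{c - 107}{4/\sqrt{n}} \right) = 0.05 \\
\Rightarrow\ & 1 - P\left( Z \le \frac{c - 107}{4/\sqrt{n}} \right) = 0.05 \\
\Rightarrow\ & P\left( Z \le \frac{c - 107}{4/\sqrt{n}} \right) = 0.95 \\
\Rightarrow\ & \frac{c - 107}{4/\sqrt{n}} = 1.64 \\
\Rightarrow\ & c = 107 + (1.64)\frac{4}{\sqrt{n}} \qquad \ldots(1)
\end{aligned}
\]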

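Again, when the alternative hypothesis is true (with $\mu = 110$), we have
\[ \frac{\bar{X} - 110}{4/\sqrt{n}} \sim \text{Normal}(0, 1). \]

Now, the power of the test is given to be 0.95. It implies that
\[
\begin{aligned}
& 1 - \beta = P(\text{reject } H_0 \mid H_A \text{ is true}) = 0.95 \\
\Rightarrow\ & P(\bar{X} > c) = 0.95 \\
\Rightarrow\ & P\left( \frac{\bar{X} - 110}{4/\sqrt{n}} > \frac{c - 110}{4/\sqrt{n}} \right) = 0.95 \\
\Rightarrow\ & P\left( Z > \frac{c - 110}{4/\sqrt{n}} \right) = 0.95 \\
\Rightarrow\ & 1 - P\left( Z \le \frac{c - 110}{4/\sqrt{n}} \right) = 0.95 \\
\Rightarrow\ & P\left( Z \le \frac{c - 110}{4/\sqrt{n}} \right) = 0.05 \\
\Rightarrow\ & \frac{c - 110}{4/\sqrt{n}} = -1.64 \\
\Rightarrow\ & c = 110 - (1.64)\frac{4}{\sqrt{n}} \qquad \ldots(2)
\end{aligned}
\]

From equations (1) and (2), we have
\[
\begin{aligned}
& 107 + (1.64)\frac{4}{\sqrt{n}} = 110 - (1.64)\frac{4}{\sqrt{n}} \\
\Rightarrow\ & 2(1.64)\frac{4}{\sqrt{n}} = 3 \\
\Rightarrow\ & \sqrt{n} = \frac{2 \times 1.64 \times 4}{3} = 4.37 \\
\Rightarrow\ & n = 19.12
\end{aligned}
\]
Since the sample size must be a whole number, we round up to $n = 20$.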
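Equating (1) and (2) gives the closed form $\sqrt{n} = (z_\alpha + z_\beta)\,\sigma / |\mu_1 - \mu_0|$. The sketch below evaluates it for Problem 5, using the table value 1.64 given in the problem for both quantiles. The function name `sample_size` is hypothetical.

```python
from math import ceil


def sample_size(mu0, mu1, sigma, z_alpha, z_beta):
    """Smallest n for a one-sided z-test with the given size and power quantiles."""
    root_n = (z_alpha + z_beta) * sigma / abs(mu1 - mu0)
    return ceil(root_n ** 2)  # round up: n must be a whole number


n = sample_size(mu0=107, mu1=110, sigma=4, z_alpha=1.64, z_beta=1.64)
print(n)
```

Rounding up rather than to the nearest integer keeps both the significance level and the power at or beyond their targets.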

6. An instructor gives a quiz involving 10 true-false questions. To test the hypothesis that
the student is guessing, the following decision rule is decided: (i) If 7 or more are correct,
the student is not guessing; (ii) if fewer than 7 are correct, the student is guessing. Find
the significance level of the test. Write your answer correct to two decimal places.
(Hint: If student is guessing then, probability of getting a question correct is p = 0.5)
Solution:
If a student is guessing, each question is answered correctly with probability
$p = 0.5$. If the student is not guessing, the probability of answering a question
correctly is more than 0.5, that is, $p > 0.5$.
It implies that
\[ H_0: p = 0.5, \qquad H_A: p > 0.5. \]
Define a test statistic $T$ as $T$ = number of correct answers out of ten.

As per the given information, we will reject the null hypothesis if $T \ge 7$.

Notice that if the null hypothesis is true, then $T \sim \text{Binomial}(10, 0.5)$.

Now,
\[
\begin{aligned}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
&= P(T \ge 7) \\
&= \sum_{i=7}^{10} \binom{10}{i} (0.5)^{10} \\
&= \left( \binom{10}{7} + \binom{10}{8} + \binom{10}{9} + \binom{10}{10} \right) (0.5)^{10} \\
&= (120 + 45 + 10 + 1)(0.00097) \\
&= 0.17
\end{aligned}
\]
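The binomial tail sum in Problem 6 is small enough to evaluate exactly. A minimal sketch, assuming Python 3.8+ for `math.comb`:

```python
from math import comb

n_questions, threshold = 10, 7

# Number of ways to get 7, 8, 9 or 10 answers correct out of 10.
favourable = sum(comb(n_questions, i) for i in range(threshold, n_questions + 1))

# Under H0 (pure guessing) every one of the 2**10 answer patterns is equally likely.
alpha = favourable / 2 ** n_questions

print(favourable, alpha)  # 176 0.171875
```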

7. A cricket ball production line must produce balls with a mean weight of 163 g and a standard
deviation of 4 g in order to get the top rating. To test the hypothesis that the mean weight of the
balls is 163 g, a sample of 16 balls is considered. If we want a 0.01 level of significance,
what will be the acceptance region?
(Use $F_Z(2.57) = 0.995$)

(a) [162.43, 164.57]


(b) [158.13, 166.57]
(c) [160.43, 165.57]
(d) [162.13, 164.98]

Solution:
Since the production line must produce balls with a mean weight of 163 g, the null and
alternative hypotheses are given by
\[ H_0: \mu = 163, \qquad H_A: \mu \neq 163. \]

Define a test statistic $T$ as $T = \bar{X}$.

Test: reject the null hypothesis if $|\bar{X} - 163| > c$.

Notice that when the null hypothesis is true,
\[ \frac{\bar{X} - 163}{4/4} = \bar{X} - 163 \sim \text{Normal}(0, 1). \]

Now, the significance level of the test is given to be 0.01. It implies that
\[
\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.01 \\
\Rightarrow\ & P(|\bar{X} - 163| > c) = 0.01 \\
\Rightarrow\ & P(|Z| > c) = 0.01 \\
\Rightarrow\ & 2P(Z < -c) = 0.01 \\
\Rightarrow\ & F_Z(-c) = 0.005 \\
\Rightarrow\ & -c = -2.57 \\
\Rightarrow\ & c = 2.57
\end{aligned}
\]

Therefore, the acceptance region is $[163 - 2.57,\ 163 + 2.57] = [160.43, 165.57]$, which is option (c).
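The acceptance region in Problem 7 is the interval $\mu_0 \pm z\,\sigma/\sqrt{n}$. The sketch below computes it with the table value $z = 2.57$ given in the problem. The helper `acceptance_region` is a hypothetical name used for illustration.

```python
def acceptance_region(mu0, sigma, n, z):
    """Two-sided acceptance region for the sample mean: mu0 +/- z * sigma / sqrt(n)."""
    half_width = z * sigma / n ** 0.5
    return mu0 - half_width, mu0 + half_width


low, high = acceptance_region(mu0=163.0, sigma=4.0, n=16, z=2.57)
print(f"[{low:.2f}, {high:.2f}]")
```

With $\sigma = 4$ and $n = 16$ the standard error is exactly 1, which is why the half-width equals the quantile itself.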

8. A researcher has recently come into contact with a number of left-handed artists and
wonders whether artists are more likely to be left-handed than people in the general
population. She selects a random sample of 150 artists and asks each
whether they are left-handed or not. The sample proportion (who are left-handed) is
0.15. Suppose that 10% of people are left-handed in the general population.

(i) Does the data provide strong evidence that artists are more likely than the general
public to be left-handed if she decides a significance level of 0.05?
(a) Yes
(b) No
Solution:
10% of people are left-handed in the general population, but the researcher wonders
whether artists are more likely to be left-handed, that is, whether the probability of an
artist being left-handed is more than 0.1. Therefore, the null and alternative hypotheses are
given by
\[ H_0: p = 0.1, \qquad H_A: p > 0.1. \]
Define a test statistic $T$ as
\[ T = \bar{X} = \frac{X_1 + X_2 + \ldots + X_{150}}{150}, \]
where each $X_i \sim \text{Bernoulli}(0.1)$ if the null hypothesis is true.
Therefore,
\[ E[\bar{X}] = p = 0.1 \quad \text{and} \quad \text{Var}(\bar{X}) = \frac{p(1 - p)}{n} = \frac{(0.1)(0.9)}{150} = \frac{0.09}{150}. \]
Then, by the CLT,
\[ \frac{\bar{X} - 0.1}{\sqrt{0.09/150}} \sim \text{Normal}(0, 1). \]

Test: reject $H_0$ if $\bar{X} > c$.

Now, the significance level of the test is given to be 0.05. It implies that
\[
\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.05 \\
\Rightarrow\ & P(\bar{X} > c) = 0.05 \\
\Rightarrow\ & P\left( \frac{\bar{X} - 0.1}{\sqrt{0.09/150}} > \frac{c - 0.1}{\sqrt{0.09/150}} \right) = 0.05 \\
\Rightarrow\ & P\left( Z > \frac{c - 0.1}{\sqrt{0.09/150}} \right) = 0.05 \\
\Rightarrow\ & 1 - P\left( Z \le \frac{c - 0.1}{\sqrt{0.09/150}} \right) = 0.05 \\
\Rightarrow\ & F_Z\left( \frac{c - 0.1}{\sqrt{0.09/150}} \right) = 0.95 \\
\Rightarrow\ & c = 0.1 + (1.64)\frac{0.3}{\sqrt{150}} \\
\Rightarrow\ & c = 0.14
\end{aligned}
\]

Since $\bar{X} = 0.15 > 0.14$, we reject the null hypothesis. At the 0.05 significance
level, the data provide evidence that artists are more likely than the general public to be
left-handed, so the answer is (a) Yes.

(ii) Find the P -value. Write your answer correct to three decimal places.

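Solution:
The P-value is the minimum significance level at which the null hypothesis is rejected for
the observed test statistic value.
Therefore, the P-value is given by
\[
\begin{aligned}
\alpha &= P(\bar{X} > 0.15) \\
&= P(\bar{X} - 0.1 > 0.15 - 0.1) \\
&= P\left( \frac{\bar{X} - 0.1}{\sqrt{0.09/150}} > \frac{0.05}{\sqrt{0.09/150}} \right) \\
&= P(Z > 2.04) \\
&= P(Z < -2.04) \\
&= 0.0207 \approx 0.021
\end{aligned}
\]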
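Both parts of Problem 8 follow from the same standard error under $H_0$. The sketch below computes the critical value and the one-sided P-value, assuming Python 3.8+ and unrounded normal quantiles, so the third decimal may differ slightly from hand calculations with table values. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, n, p_hat, alpha = 0.10, 150, 0.15, 0.05

std = NormalDist()                      # standard normal
se = sqrt(p0 * (1 - p0) / n)            # standard error of the proportion under H0

c = p0 + std.inv_cdf(1 - alpha) * se    # critical value for the sample proportion
z = (p_hat - p0) / se                   # observed z-score
p_value = 1 - std.cdf(z)                # upper-tail P-value

reject = p_hat > c
print(f"c = {c:.4f}, z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Note that the standard error uses the null value $p_0 = 0.1$, not the observed proportion 0.15, because the test is carried out under $H_0$.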

9. A cereal manufacturer tests its equipment weekly to be assured that the correct weight of
cereal is in each box. The company wants to test if the weight differs from the expected
weight. The weight of each box is expected to be 500g with a standard deviation of
100g. The manufacturer takes a random sample of 100 boxes and finds that the average
weight is 520g. What is the sample's P-value? Write your answer correct to two decimal
places.
(Use $F_Z(-2) = 0.022$)

Solution:
The company wants to test if the weight differs from the expected weight, and the weight
of each box is expected to be 500g. So, the null and alternative hypotheses are given by
\[ H_0: \mu = 500, \qquad H_A: \mu \neq 500. \]

Define a test statistic $T$ as $T = \bar{X}$.

Test: reject the null hypothesis if $|\bar{X} - 500| > c$.

By the CLT, when the null hypothesis is true, we can say that
\[ \frac{\bar{X} - 500}{100/\sqrt{100}} = \frac{\bar{X} - 500}{10} \sim \text{Normal}(0, 1). \]
The P-value is the minimum significance level at which the null hypothesis is rejected for the
observed test statistic value.
Therefore, the P-value is given by
\[
\begin{aligned}
\alpha &= P(|\bar{X} - 500| > |500 - 520|) \\
&= P(|\bar{X} - 500| > 20) \\
&= P\left( \left| \frac{\bar{X} - 500}{10} \right| > 2 \right) \\
&= P(|Z| > 2) \\
&= 2P(Z < -2) \\
&= 2(0.022) = 0.044 \approx 0.04
\end{aligned}
\]
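The P-value in Problem 9 is sensitive to the rounding of the table value. The sketch below, assuming Python 3.8+, computes it once with the hint $F_Z(-2) = 0.022$ and once with the unrounded standard normal CDF. The comparison is an addition and not part of the assignment's solution.

```python
from statistics import NormalDist

mu0, sigma, n, xbar = 500.0, 100.0, 100, 520.0

z = (xbar - mu0) / (sigma / n ** 0.5)        # observed z-score = 2.0

p_table = 2 * 0.022                          # using the hint supplied in the problem
p_exact = 2 * NormalDist().cdf(-abs(z))      # using the unrounded normal CDF

print(f"z = {z}, P-value (hint) = {p_table:.3f}, P-value (unrounded) = {p_exact:.4f}")
```

Since the unrounded value of $F_Z(-2)$ is slightly above 0.022, the two P-values can round differently at two decimal places; the answer expected here is the one based on the supplied hint.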

10. A machine produces iron rods of mean weight 12kg with a standard deviation of 2kg. An
engineer suspects that the average weight is less than 12kg, probably 10kg. So, he collects
the weights of $n$ iron rods. He wants the significance level to be less than $10^{-4}$ and
the probability of type two error to be less than $10^{-8}$.
(Use $F_Z(-3.74) = 10^{-4}$ and $F_Z(5.61) = 1 - 10^{-8}$)

(i) Find the required sample size.

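Solution:
According to the question, we have
\[ H_0: \mu = 12, \qquad H_A: \mu < 12. \]

Define a test statistic $T$ as $T = \bar{X}$.

Test: reject $H_0$ if $\bar{X} < c$.
Notice that when the null hypothesis is true, we have
\[ \frac{\bar{X} - 12}{2/\sqrt{n}} \sim \text{Normal}(0, 1). \]

Now, the significance level of the test is required to be less than $10^{-4}$. It implies that
\[
\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) \le 10^{-4} \\
\Rightarrow\ & P(\bar{X} < c) \le 10^{-4} \\
\Rightarrow\ & P\left( \frac{\bar{X} - 12}{2/\sqrt{n}} < \frac{c - 12}{2/\sqrt{n}} \right) \le 10^{-4} \\
\Rightarrow\ & P\left( Z < \frac{c - 12}{2/\sqrt{n}} \right) \le 10^{-4} \\
\Rightarrow\ & \frac{c - 12}{2/\sqrt{n}} \le -3.74 \\
\Rightarrow\ & c \le 12 - (3.74)\frac{2}{\sqrt{n}} \qquad \ldots(1)
\end{aligned}
\]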

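Again, when the alternative hypothesis is true (with $\mu = 10$), we have
\[ \frac{\bar{X} - 10}{2/\sqrt{n}} \sim \text{Normal}(0, 1). \]

Now, the probability of type two error is required to be less than $10^{-8}$. It implies that
\[
\begin{aligned}
& \beta = P(\text{accept } H_0 \mid H_A \text{ is true}) \le 10^{-8} \\
\Rightarrow\ & P(\bar{X} \ge c) \le 10^{-8} \\
\Rightarrow\ & P\left( \frac{\bar{X} - 10}{2/\sqrt{n}} \ge \frac{c - 10}{2/\sqrt{n}} \right) \le 10^{-8} \\
\Rightarrow\ & P\left( Z \ge \frac{c - 10}{2/\sqrt{n}} \right) \le 10^{-8} \\
\Rightarrow\ & 1 - P\left( Z < \frac{c - 10}{2/\sqrt{n}} \right) \le 10^{-8} \\
\Rightarrow\ & P\left( Z < \frac{c - 10}{2/\sqrt{n}} \right) \ge 1 - 10^{-8} \\
\Rightarrow\ & \frac{c - 10}{2/\sqrt{n}} \ge 5.61 \\
\Rightarrow\ & c \ge 10 + (5.61)\frac{2}{\sqrt{n}} \qquad \ldots(2)
\end{aligned}
\]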

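From equations (1) and (2), a valid $c$ exists only if
\[ 10 + (5.61)\frac{2}{\sqrt{n}} \le c \le 12 - (3.74)\frac{2}{\sqrt{n}}. \]
The smallest such $n$ is found by setting the two bounds equal:
\[
\begin{aligned}
& 12 - (3.74)\frac{2}{\sqrt{n}} = 10 + (5.61)\frac{2}{\sqrt{n}} \\
\Rightarrow\ & (5.61 + 3.74)\frac{2}{\sqrt{n}} = 2 \\
\Rightarrow\ & \sqrt{n} = 9.35 \\
\Rightarrow\ & n = 87.42
\end{aligned}
\]
Since the sample size must be a whole number, we round up to $n = 88$.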

(ii) Find the critical value (for the acceptance region to be defined as $\bar{X} \ge c$, where
$\bar{X}$ is the mean weight of the rods). Write your answer correct to two decimal places.

Solution:
Putting $n = 88$ in equation (1), we have
\[ c \le 12 - (3.74)\frac{2}{\sqrt{88}} \ \Rightarrow\ c \le 11.203 \qquad \ldots(3) \]

Putting $n = 88$ in equation (2), we have
\[ c \ge 10 + (5.61)\frac{2}{\sqrt{88}} \ \Rightarrow\ c \ge 11.196 \qquad \ldots(4) \]

From (3) and (4), $11.196 \le c \le 11.203$, so to two decimal places $c = 11.20$.
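The sample size and the bounds on the critical value in Problem 10 can be checked together. This is a minimal sketch that uses the table values 3.74 and 5.61 supplied in the problem. Variable names are illustrative.

```python
from math import ceil, sqrt

mu0, mu1, sigma = 12.0, 10.0, 2.0
z_alpha, z_beta = 3.74, 5.61      # table values given in the problem

# Smallest n for which the two constraints on c can hold at once.
root_n = (z_alpha + z_beta) * sigma / (mu0 - mu1)
n = ceil(root_n ** 2)

# Admissible range for the critical value at that n.
c_upper = mu0 - z_alpha * sigma / sqrt(n)   # from the significance constraint (1)
c_lower = mu1 + z_beta * sigma / sqrt(n)    # from the type II error constraint (2)

print(f"n = {n}, {c_lower:.4f} <= c <= {c_upper:.4f}")
```

The admissible interval for `c` is narrow because `n` was rounded up only slightly from the break-even value, so any `c` reported to two decimals should be checked against both bounds.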
