STA-CM 121 Lecture 4

The document provides an overview of statistical hypothesis testing, focusing on one-sample tests using the Student t distribution and examples of testing population means. It includes detailed examples, calculations, and explanations of confidence intervals, prediction intervals, and tolerance intervals. Additionally, it discusses tests concerning differences between population means and includes examples with solutions for clarity.

Statistics

A Virtual Lecture Facilitated by

J. N. Onyeka-Ubaka (Ph.D)
[email protected] +2348059839937
One-Sample Test Cont’d
▪ We use the Student t distribution when the sample size is small (fewer than 30 data points) and the variance of the distribution is not known.
▪ To test a hypothesis about the population mean μ when the variance σ² is unknown, we use the Student t distribution with the test statistic
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$, that is, t with n − 1 degrees of freedom (df).
NB: You will see the properties of the t distribution on Page 187.
Example 1
In 12 test runs over a marked course, a newly designed motorboat
averaged 33.6 seconds with a standard deviation of 2.3 seconds.
Assuming that it is reasonable to treat the data as a random sample from
a normal population, test the null hypothesis μ = 35 and alternative
hypothesis μ < 35 at the 0.05 level of significance.
Solution
Given: n = 12, $\bar{x}$ = 33.6, s = 2.3
The null hypothesis $H_0: \mu = 35$
against the alternative hypothesis $H_1: \mu < 35$
The level of significance α = 0.05
The sample size is small, so we use the t statistic $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$ df
Computing the test statistic, we have $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{33.6 - 35}{2.3/\sqrt{12}} = -2.11$

From the alternative hypothesis, this is a left-tailed test, so the critical value is
$-t_{\alpha,n-1} = -t_{0.05,11} = -1.796$
Since the calculated value falls in the rejection region (−2.11 < −1.796), we reject $H_0$
and conclude that the newly designed motorboat averages less than 35
seconds.
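The arithmetic in Example 1 can be checked with a short script. This is only a sketch using the Python standard library: the function name `one_sample_t` is made up for illustration, and the critical value is copied from the t table as quoted on the slide, not computed.

```python
import math


def one_sample_t(xbar, mu0, s, n):
    """Return the one-sample t statistic (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


# Summary values from Example 1
t_stat = one_sample_t(xbar=33.6, mu0=35.0, s=2.3, n=12)

# Left-tailed critical value -t_{0.05, 11}, read from a t table (as on the slide)
t_crit = -1.796

# Left-tailed test: reject H0 when the statistic is below the critical value
reject_h0 = t_stat < t_crit

print(f"t = {t_stat:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```

For a right-tailed or two-tailed test, only the comparison against the critical value changes; the statistic itself is computed in the same way.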
Test of Hypothesis Concerning a Population Mean
(Two sided Alternatives)
Example 2
A manufacturer of extinguisher systems used for fire protection in office
buildings claims that the true average system-activation temperature is
130°F. A sample of 9 systems, when tested, yields a sample average
activation temperature of 131.08°F. If the distribution of activation temperatures
is normal with standard deviation 1.5°F, do the data contradict the
manufacturer’s claim at significance level α = 0.01?
Solution
Let X be the activation temperature, X ~ N(μ, σ²).
Given: n = 9, $\mu_0$ = 130°F, $\bar{x}$ = 131.08°F, σ = 1.5°F
Null hypothesis, $H_0: \mu = 130$ (null value $\mu_0 = 130$)
Alternative hypothesis, $H_1: \mu \neq 130$
Significance level, α = 0.01
Test statistic, $Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

Computing the test statistic, we have
$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{131.08 - 130}{1.5/\sqrt{9}} = \frac{1.08}{0.5} = 2.16$

From the alternative hypothesis, this is a two-tailed test, so
$Z_{\alpha/2} = Z_{0.01/2} = Z_{0.005} = \pm 2.575$ (i.e. reject $H_0$ if z ≥ 2.575 or z ≤ −2.575)

Decision: The computed value z = 2.16 does not fall in the rejection
region (-2.575 < 2.16 < 2.575), so the null hypothesis is not
rejected at 1% significance level.
Conclusion: The data does not give strong support to the claim that the
true average differs from the design value of 130.
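Because σ is known here, the critical value can be computed as well as read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `one_sample_z` is an illustrative assumption. The computed critical value is about 2.576, which the slide's table rounds to 2.575.

```python
import math
from statistics import NormalDist


def one_sample_z(xbar, mu0, sigma, n):
    """Return the z statistic (xbar - mu0) / (sigma / sqrt(n)) for known sigma."""
    return (xbar - mu0) / (sigma / math.sqrt(n))


# Values from Example 2
z_stat = one_sample_z(xbar=131.08, mu0=130.0, sigma=1.5, n=9)

alpha = 0.01
std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed critical value z_{alpha/2}
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Two-sided p-value: probability of a value at least as extreme as |z|
p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))

reject_h0 = abs(z_stat) >= z_crit
print(f"z = {z_stat:.2f}, z_crit = {z_crit:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

The p-value is larger than 0.01, which agrees with the decision not to reject the null hypothesis at the 1% level.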
Remarks:
(i) The critical value for a hypothesis test is the threshold against which the value
of the test statistic in a sample is compared to determine whether or not
the null hypothesis should be rejected. It is obtained from a statistical
table.

(ii) The critical region (or rejection region) is the set of values of the test
statistic for which the null hypothesis is rejected in a hypothesis test.

(iii) The noncritical region (or acceptance region) is the set of values of the test
statistic for which the null hypothesis is not rejected in a hypothesis test.
Prediction Interval
A 100(1 − α)% prediction interval for a single observation to be selected
from a normal population distribution is
$\bar{x} \pm t_{\alpha/2,n-1} \cdot s\sqrt{1 + \frac{1}{n}}$
Example 3
An investigator observed that people from a certain city take pork meat as part
of their weekly menu. He recorded the fat content (in percentage) of a sample of
10 randomly selected hot pork as follows:
17 19.5 16 25.2 25.5 22.8 21.3 21 20.9 29.8
Assuming that these were selected from a normal population distribution,
construct
(i) a 99.5% confidence interval for (interval estimate of) the population mean
fat content.
(ii) a 99.5% prediction interval for the fat content of a single hot pork.
Solution
n = 10, $\bar{x}$ = 21.9, s = 4.134, $t_{0.0025,9}$ = 3.690

(i) The 99.5% confidence interval (CI) for the population mean μ is
$\bar{x} \pm t_{0.0025,9} \cdot \frac{s}{\sqrt{n}} = 21.9 \pm 3.690 \cdot \frac{4.134}{\sqrt{10}} = 21.9 \pm 4.82 = (17.08, 26.72)$
(ii) A 99.5% prediction interval (PI) for a single hot pork is
$\bar{x} \pm t_{0.0025,9} \cdot s\sqrt{1 + \frac{1}{n}} = 21.9 \pm (3.690)(4.134)\sqrt{1 + \frac{1}{10}} = 21.9 \pm 15.999 = (5.901, 37.899)$
▪ This interval is quite wide, indicating substantial uncertainty about fat
contents. Notice that the width of the PI is more than three times that
of the CI.
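The summary statistics and both intervals in Example 3 can be reproduced from the raw data. The following is a sketch using only the Python standard library; the variable names are illustrative, and the critical value 3.690 is taken from the t table as quoted in the solution, not computed.

```python
import math
import statistics

# Fat content (%) of the 10 sampled items from Example 3
fat = [17, 19.5, 16, 25.2, 25.5, 22.8, 21.3, 21, 20.9, 29.8]

n = len(fat)
xbar = statistics.mean(fat)   # sample mean
s = statistics.stdev(fat)     # sample standard deviation (divisor n - 1)

# t_{0.0025, 9} from the t table, as quoted in the solution
t_crit = 3.690

# Half-widths of the confidence interval and the prediction interval
ci_half = t_crit * s / math.sqrt(n)
pi_half = t_crit * s * math.sqrt(1 + 1 / n)

ci = (xbar - ci_half, xbar + ci_half)
pi = (xbar - pi_half, xbar + pi_half)

print(f"mean = {xbar:.1f}, s = {s:.3f}")
print(f"99.5% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"99.5% PI: ({pi[0]:.2f}, {pi[1]:.2f})")
print(f"PI width / CI width = {pi_half / ci_half:.2f}")
```

The ratio of the two half-widths is $\sqrt{n + 1}$, which is about 3.32 for n = 10. This is why the prediction interval is more than three times as wide as the confidence interval.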
Tolerance Intervals
▪ Let k be a number between 0 and 100. A tolerance interval for capturing at
least k% of the values in a normal population distribution with a confidence
level of 95% has the form
$\bar{x} \pm (\text{tolerance critical value}) \cdot s$
▪ Tolerance critical values for k = 90, 95 and 99 in combination with various
sample sizes are given in your Statistical Table.
Example 4
Given n = 16, $\bar{x}$ = 14532.5, s = 2055.67 and a normal probability plot of the
data indicating that population normality is quite plausible, find a two-sided
tolerance interval, with a confidence level of 95%, for capturing at least 95% of the
modulus of elasticity values for specimens of lumber in the population sampled.
Solution
The tolerance critical value is 2.903. Therefore, the resulting tolerance interval
with a 95% confidence level is 14532.5 ± (2.903)(2055.67) = 14532.5 ± 5967.6
= (8564.9, 20500.1)
We can be highly confident that at least 95% of all lumber specimens have
modulus of elasticity values between 8564.9 and 20500.1.
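The tolerance interval in Example 4 is a single multiplication and two additions, sketched below. The function name `tolerance_interval` is hypothetical, and the tolerance critical value 2.903 is copied from the solution (it comes from the statistical table, not from the code).

```python
def tolerance_interval(xbar, s, critical_value):
    """Return (lower, upper) = xbar -/+ critical_value * s."""
    half_width = critical_value * s
    return xbar - half_width, xbar + half_width


# Example 4: n = 16, 95% confidence, capturing at least 95% of the population.
# The critical value 2.903 is the table value quoted in the solution.
lower, upper = tolerance_interval(xbar=14532.5, s=2055.67, critical_value=2.903)

print(f"Tolerance interval: ({lower:.1f}, {upper:.1f})")
```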
Tests Concerning Differences Between Population Means
▪ To test a hypothesis about the difference between two means (μ1 − μ2)
when the sample sizes are large OR when both variances ($\sigma_1^2$ and $\sigma_2^2$)
are known, we use the test statistic (TS)
$z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}}$, which is N(0, 1).
Example 5
A study of the number of business lunches that executives in the
insurance and banking industries claim as deductible expenses per
month was based on random samples and yielded the following results:
$n_1 = 40$, $\bar{x}_1 = 9.1$, $s_1 = 1.9$
$n_2 = 50$, $\bar{x}_2 = 8.0$, $s_2 = 2.1$
Test the null hypothesis $\mu_1 - \mu_2 = 0$ against the alternative hypothesis
$\mu_1 - \mu_2 \neq 0$ at the 5% significance level.
Solution
Null hypothesis, 𝐻0 : 𝜇1 − 𝜇2 =0
Alternative hypothesis, 𝐻1 : 𝜇1 − 𝜇2 ≠ 0
Significance level: 𝛼 = 0.05
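Since both sample sizes are large, the sample variances $s_1^2$ and $s_2^2$ are used in place of $\sigma_1^2$ and $\sigma_2^2$, and we use the test statistic (with m = $n_1$ = 40 and n = $n_2$ = 50)
$z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}} = \frac{9.1 - 8.0 - 0}{\sqrt{1.9^2/40 + 2.1^2/50}} = \frac{1.1}{0.4224} = 2.60$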
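From the alternative hypothesis, this is a two-tailed test, so $Z_{\alpha/2} = Z_{0.05/2} = Z_{0.025} = \pm 1.96$

[Figure: standard normal curve with the critical values −1.96 and +1.96 marked; the computed z = 2.60 lies in the upper rejection region.]

Decision Rule: If the test statistic (TS) falls in the rejection region (RR), reject $H_0$.

Conclusion: Since 2.60 > 1.96, reject $H_0$ and conclude that there is a significant difference between the average number of
lunches.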
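The calculation in Example 5 can be sketched in a few lines of standard-library Python. The function name `two_sample_z` and its parameter names are assumptions made for illustration; the sample standard deviations stand in for the population values because both samples are large.

```python
import math
from statistics import NormalDist


def two_sample_z(x1, x2, s1, s2, n1, n2, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0, using large-sample variances."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1 - x2 - delta0) / standard_error


# Summary values from Example 5
z_stat = two_sample_z(x1=9.1, x2=8.0, s1=1.9, s2=2.1, n1=40, n2=50)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a two-tailed test

reject_h0 = abs(z_stat) > z_crit
print(f"z = {z_stat:.2f}, z_crit = {z_crit:.2f}, reject H0: {reject_h0}")
```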
Tests Concerning Differences Between Population Means Cont’d
▪ To test a hypothesis about the difference between two means (μ1 − μ2)
when the sample sizes are small and both variances
($\sigma_1^2$ and $\sigma_2^2$) are unknown but assumed to have a common
value σ², we use the test statistic (TS)

$t = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_p^2}{m} + \frac{S_p^2}{n}}} = \frac{\bar{X} - \bar{Y} - \mu^*}{S_p\sqrt{\frac{1}{m} + \frac{1}{n}}}$, where $\mu^* = \mu_1 - \mu_2$,

which has a Student t distribution with (m − 1) + (n − 1) degrees of
freedom (that is, m + n − 2 degrees of freedom),

where $S_p^2 = \frac{\sum (X_i - \bar{X})^2 + \sum (Y_j - \bar{Y})^2}{(m - 1) + (n - 1)}$ OR $S_p^2 = \frac{(m - 1)s_1^2 + (n - 1)s_2^2}{m + n - 2}$
Example 6
▪ In a study of gain in weight of rats, 10 males and 9 females are
selected for study under a special diet. The experiment yielded the
following information, assuming normality of gain in weight.

Females: $n_1 = 9$, $\bar{x}_1 = 20$, $S_1 = 1.2$
Males: $n_2 = 10$, $\bar{x}_2 = 16.5$, $S_2 = 1.5$

(a) Test the equality of mean gain in weight for the two groups at the 0.5% level
of significance. Assume equality of variance.
(b) Find the 95% confidence interval for the difference of means of gain in
weight for the two groups.
Solution
(a) This can be performed using the pooled sample variance. Let X be
female and Y be male. The hypothesis to be tested is
Null hypothesis, $H_0: \mu_1 - \mu_2 = 0$
Alternative hypothesis, $H_1: \mu_1 - \mu_2 \neq 0$
Significance level: α = 0.005
Since both $\sigma_1^2$ and $\sigma_2^2$ are unknown but assumed to be equal,
the test statistic is
$t = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_p^2}{n_1} + \frac{S_p^2}{n_2}}} = \frac{\bar{X} - \bar{Y} - \mu^*}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

Here $\mu^* = \mu_1 - \mu_2 = 0$. Calculating the pooled variance, we have
$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(9 - 1)(1.2)^2 + (10 - 1)(1.5)^2}{9 + 10 - 2} = \frac{8(1.2)^2 + 9(1.5)^2}{17} = 1.87$
Therefore, $S_p = \sqrt{1.87} = 1.367$
▪ Substituting in the test statistic gives
$t = \frac{\bar{X} - \bar{Y} - \mu^*}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{20 - 16.5}{1.367\sqrt{\frac{1}{9} + \frac{1}{10}}} = 5.57$
▪ From the alternative hypothesis, this is a two-tailed test, so
$t_{\alpha/2} = t_{0.005/2} = t_{0.0025,17} = \pm 3.222$
▪ Decision Rule: If the test statistic (TS) falls in the rejection region (RR), reject $H_0$.
▪ Conclusion: Since 5.57 > 3.222 (i.e. it falls in the rejection
region), reject $H_0$.
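Part (a) of Example 6 can be verified with the sketch below, which uses only the Python standard library. The function name `pooled_t` is hypothetical, and the critical value 3.222 is the table value $t_{0.0025,17}$ quoted in the solution.

```python
import math


def pooled_t(x1, x2, s1, s2, n1, n2, delta0=0.0):
    """Return (t, pooled variance, df) for H0: mu1 - mu2 = delta0, equal variances."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (x1 - x2 - delta0) / standard_error
    return t, sp2, df


# Example 6: females (group 1) and males (group 2)
t_stat, sp2, df = pooled_t(x1=20.0, x2=16.5, s1=1.2, s2=1.5, n1=9, n2=10)

# Two-tailed critical value t_{0.0025, 17}, read from a t table (as on the slide)
t_crit = 3.222

reject_h0 = abs(t_stat) > t_crit
print(f"Sp^2 = {sp2:.2f}, df = {df}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```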
Solution to Example 6 Cont’d
(b) The 95% confidence interval for the difference of means of gain in
weight for the two groups is
$\bar{X} - \bar{Y} \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
= 20 − 16.5 ± 2.110(0.628), using $t_{0.025,17} = 2.110$ for a 95% interval
= 3.5 ± 1.325
Therefore, the 95% confidence interval for $\mu_1 - \mu_2$ is
(2.175, 4.825).
Assignment
The marks scored in STA 121 in 2024 by the males and females in the
class are normally distributed with means 𝜇1 and 𝜇2 respectively but
with unknown common variance 𝜎 2 . A random sample of some males
and females in this class gave the following marks in the 2024
examination.
Males: 60, 30, 70, 77, 45
Females: 40, 50, 25, 60
Test the null hypothesis, 𝐻0 : 𝜇1 = 𝜇2 against the alternative
hypothesis, 𝐻1 : 𝜇1 > 𝜇2 at 0.5% level of significance.
Tests Concerning Proportions
▪ For a one-sample proportion, the test statistic is $Z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$, where q = 1 − p.

▪ For two-sample proportions (say A and B), the test statistic is
$Z_{cal} = \frac{(\hat{p}_A - \hat{p}_B) - (p_A - p_B)}{\sqrt{\frac{p_A q_A}{n_A} + \frac{p_B q_B}{n_B}}}$
Example 7
A trainer claims that 80 percent of his audience usually enjoys his teaching. His
supervisor is of the opinion that the claim cannot be true. He then selected a
sample of 40 persons from the trainer’s audience and found out that 35 persons
claim to enjoy the teaching. Can you conclude that the trainer’s claim is valid at
5 percent level of significance?
Solution
Given p = 0.8; q = 0.2; n = 40; $n_f$ = 35 ⇒ $\hat{p} = \frac{35}{40} = 0.875$
$H_0$: p = 0.8
$H_1$: p ≠ 0.8
▪ $Z_{cal} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}} = \frac{0.875 - 0.8}{\sqrt{\frac{(0.8)(0.2)}{40}}} = \frac{0.075}{0.0632} = 1.186$
From the alternative hypothesis, this is a two-tailed test, so
$Z_{\alpha/2} = Z_{0.05/2} = Z_{0.025} = \pm 1.96$
Since the calculated statistic does not fall into the critical region (−1.96 < 1.186 < 1.96), we fail to
reject the null hypothesis and conclude that the data do not contradict the trainer’s claim.
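The one-sample proportion test in Example 7 can be sketched as follows, using only the Python standard library. The function name `one_proportion_z` is an illustrative assumption; the standard error uses the hypothesised proportion p = 0.8, as in the solution.

```python
import math
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, with standard error sqrt(p0 * (1 - p0) / n)."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# Example 7: 35 of 40 sampled persons enjoy the teaching; claimed proportion 0.8
z_stat = one_proportion_z(successes=35, n=40, p0=0.8)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a two-tailed test

reject_h0 = abs(z_stat) > z_crit
print(f"z = {z_stat:.3f}, z_crit = {z_crit:.2f}, reject H0: {reject_h0}")
```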
Example 8
A random sample of size 40 workers was selected from a group of
workers in company A and 8 of them were found to be performing below
expectation. A second sample of 50 workers was selected from a group
of workers in company B and 9 were found to be performing below
expectation. Test at 5% level of significance the hypothesis that the
proportion of workers performing below expectation in the two
companies is not the same.
Solution
$n_A = 40$; freq. of A = 8; ⇒ $\hat{p}_A = \frac{8}{40} = 0.2$
$n_B = 50$; freq. of B = 9; ⇒ $\hat{p}_B = \frac{9}{50} = 0.18$
Null hypothesis, $H_0: p_A = p_B$
Alternative hypothesis, $H_1: p_A \neq p_B$
Significance level: α = 0.05
▪ Test statistic
$Z_{cal} = \frac{\hat{p}_A - \hat{p}_B}{\sqrt{\frac{p_A q_A}{n_A} + \frac{p_B q_B}{n_B}}} = \frac{0.2 - 0.18}{\sqrt{\frac{(0.2)(0.8)}{40} + \frac{(0.18)(0.82)}{50}}} = \frac{0.02}{\sqrt{0.006952}} = \frac{0.02}{0.0834} = 0.24$

From the alternative hypothesis, this is a two-tailed test, so
$Z_{\alpha/2} = Z_{0.05/2} = Z_{0.025} = \pm 1.96$

▪ Since $Z_{cal}$ falls in the acceptance region (−1.96 < 0.24 < 1.96), we fail to reject the null
hypothesis and conclude that there is no significant difference between the two proportions.
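The two-sample calculation in Example 8 can be sketched in the same way. The function name `two_proportion_z` is hypothetical. The sketch follows the solution in using each sample's own proportion in the standard error (the unpooled form); a pooled estimate of the common proportion is another widely used choice under the null hypothesis and would give a slightly different value.

```python
import math
from statistics import NormalDist


def two_proportion_z(x_a, n_a, x_b, n_b):
    """Unpooled z statistic for H0: p_A = p_B, as used in the worked solution."""
    p_a = x_a / n_a
    p_b = x_b / n_b
    standard_error = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b) / standard_error


# Example 8: 8 of 40 workers in company A, 9 of 50 workers in company B
z_stat = two_proportion_z(x_a=8, n_a=40, x_b=9, n_b=50)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a two-tailed test

reject_h0 = abs(z_stat) > z_crit
print(f"z = {z_stat:.2f}, z_crit = {z_crit:.2f}, reject H0: {reject_h0}")
```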
Questions and Answers
Conclusion
▪ A test statistic is a statistic, calculated from the sample data, which
is used to test the hypothesis.
▪ The rejection region consists of those values of the test statistic that lead to
the rejection of the null hypothesis.

ATTEMPT ANY TWO Questions on 11.11 Exercise, pages 204 and 205


References

▪ Onyeka-Ubaka, J. N. (2023). Multi-Level Statistics: An Academic Companion for Interdisciplinary Professional Competence. Third Edition, Printvillamedia, Lagos.

▪ Multi-Level Statistical Table

Thank you!
