Topic11.Inferential Stat. Hypo Tests - Corrected
Topic 11
Statistical Inference – Hypothesis Testing
[Figure: a population with 𝜇 = 140 and three samples, each of size 500, with sample means 𝑋̅ = 146, 𝑋̅ = 135, and 𝑋̅ = 140.]
Due to variation in our samples, we have different values for our estimates, and now we have doubts
about the true value of the parameter. We want to cast these doubts away by testing our hypothesis.
We define hypothesis testing (HT) as the process of determining which of the possible given
hypotheses is more acceptable as true or which hypothesis is more likely to be false. To start, we
write:
𝐻0 is called the null hypothesis or “H naught”. This is the hypothesis that we want to test. This is a
hypothesis of equality or sameness or a hypothesis of no difference or a hypothesis of no relationship.
This is a hypothesis that is assumed to be true. On the other hand, 𝐻𝐴 is called the alternative
hypothesis, the hypothesis that contradicts 𝐻0 . It is the hypothesis that is accepted as true when 𝐻0
is rejected.
hvvvalle
Valid statements (𝐻𝐴 uses <, >, or ≠):
𝐻0 : 𝜇1 = 𝜇2 𝐻0 : 𝜇1 = 𝜇2 𝐻0 : 𝜇1 = 𝜇2
𝐻𝐴 : 𝜇1 > 𝜇2 𝐻𝐴 : 𝜇1 < 𝜇2 𝐻𝐴 : 𝜇1 ≠ 𝜇2
Invalid statements (𝐻𝐴 must not contain an equality):
𝐻0 : 𝜇1 = 𝜇2 𝐻0 : 𝜇1 = 𝜇2 𝐻0 : 𝜇1 = 𝜇2
𝐻𝐴 : 𝜇1 ≥ 𝜇2 𝐻𝐴 : 𝜇1 ≤ 𝜇2 𝐻𝐴 : 𝜇1 = 𝜇2
4. The signs in the 𝐻𝐴 may be implied from the words used in the problem. Pay attention to words
like “the same”, “significant difference”, “less than”, “greater than”, “exceeds”, “more than”,
“below”, “above”, “equal to”, “not equal to”, “below par”, “above par”, “different”, “differ”,
“better than”, etc.
Based on the statement of the null and alternative hypotheses, we can determine if the test is one-
tailed or two-tailed. For example, we want to test that 𝐻0 : 𝜇 = 140. Then we have three possible
alternative hypotheses:
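Using the signs <, >, and ≠ that this handout assigns to 𝐻𝐴, the three alternatives for 𝐻0 : 𝜇 = 140 can be sketched as follows. The tail labels follow the usual convention and are stated here as an assumption, since the original display is not reproduced in the text.

```latex
\begin{aligned}
H_A &: \mu < 140 && \text{(one-tailed test, left tail)} \\
H_A &: \mu > 140 && \text{(one-tailed test, right tail)} \\
H_A &: \mu \neq 140 && \text{(two-tailed test)}
\end{aligned}
```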
It is therefore important that you know the signs in the alternative hypothesis.
Example: State the appropriate null and alternative hypotheses in the following situations.
1. The average number of heartbeats for an adult human male is 72 beats per minute
𝑯𝟎 : 𝝁 = 𝟕𝟐 𝑯𝑨 : 𝝁 ≠ 𝟕𝟐 two-tailed test
2. Female students perform better than male students in the midterm Biostatistics exam, on the
average. The scores follow a normal distribution.
𝑯𝟎: 𝝁𝑭 = 𝝁𝑴 𝑯𝑨: 𝝁𝑭 > 𝝁𝑴 one-tailed test
3. The average systolic blood pressure of female athletes falls below 120.
𝑯𝟎: 𝝁 = 𝟏𝟐𝟎 𝑯𝑨: 𝝁 < 𝟏𝟐𝟎 one-tailed test
4. A Psychology student at USM hypothesizes that the average GPA of USM academic scholars is
better than 1.75.
𝑯𝟎: 𝝁 = 𝟏.𝟕𝟓 𝑯𝑨: 𝝁 < 𝟏.𝟕𝟓 one-tailed test (assuming a grading scale on which a lower GPA is better)
5. The average weight of newborn children differs from 6.1 pounds.
𝑯𝟎 : ___________ 𝑯𝑨 : ____________
1. Reject 𝐻0. When 𝐻0 is rejected, there is sufficient evidence in the sample data to suggest that 𝐻0 is more likely to be false.
2. Fail to reject 𝐻0. When 𝐻0 is not rejected, it does not mean that 𝐻0 is true. It only means that there is insufficient evidence in the sample data to suggest that 𝐻0 is false, so 𝐻0 is retained (“accepted” only in this loose sense).
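When a decision is reached in HT, two types of errors can be incurred as shown in the table:
Decision             𝐻0 is actually true    𝐻0 is actually false
Reject 𝐻0            Type I error           Correct decision
Fail to reject 𝐻0    Correct decision       Type II error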
• A Type I error is made when we reject a true 𝐻0 . The probability of committing this error is called
the level of significance 𝛼. The value of 𝛼 is kept small, usually at or near 0.05. Those in
pharmaceuticals and allied fields use smaller values of 𝛼. Why is this?
• A Type II error is made when we fail to reject a false 𝐻0 . The probability of committing this error
is denoted by 𝛽. The power of the test is 1 − 𝛽, the probability of rejecting a false 𝐻0 .
Example
Suppose the level of significance in HT is 𝛼 = 0.05. This means that, when the null hypothesis is
actually true, there are 5 chances out of 100 that it is rejected.
Suppose the probability of a Type II error in HT is 𝛽 = 0.01. This means that, when the null hypothesis
is actually false, there is 1 chance out of 100 that it is not rejected. The power of the test is then 1 − 𝛽 = 0.99.
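The meaning of 𝛼 can be checked with a small simulation. The sketch below repeatedly samples from a normal population in which 𝐻0 is true and counts how often a two-tailed 𝑍 test rejects it. The values sigma = 15, n = 50, the number of trials, and the seed are illustrative assumptions, not values from this handout.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(mu0=140.0, sigma=15.0, n=50, alpha=0.05,
                      trials=4000, seed=1):
    """Estimate how often a two-tailed Z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen 𝛼, which is what "5 chances out of 100" describes.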
Example
Romeo is on trial for rape. Judge Julieta has to make a decision based on the evidence presented.
Here 𝐻0 : Romeo is innocent and 𝐻𝐴 : Romeo is guilty of the crime.
What are the consequences if Judge Julieta makes a Type I error? a Type II error? a correct decision?
Type I error
Judge Julieta decides to reject 𝐻0 , so she accepts 𝐻𝐴 : Romeo is guilty of the crime. Since
Romeo is judged guilty, he is sent to jail. But 𝐻0 is actually true (Romeo is innocent).
Consequence: Romeo is sent to jail although he is innocent of the crime.
Type II error
Judge Julieta decides not to reject 𝐻0 . Since Romeo is presumed innocent, he is set free. But
𝐻0 is actually false (Romeo is guilty).
Consequence: Romeo is set free although he is guilty of the crime.
Correct decision #1
Judge Julieta decides to reject 𝐻0 and 𝐻0 is actually false.
Consequence: Romeo is sent to jail because he really is guilty of the crime.
Correct decision #2
Judge Julieta decides not to reject 𝐻0 and 𝐻0 is actually true.
Consequence: Romeo is set free because he really is innocent of the crime.
This test compares a sample mean 𝑋̅ to a hypothesized population mean 𝜇0 to determine whether there is a
significant difference between the two or whether one exceeds or falls short of the other. Either the
𝒁 test or the 𝒕 test is used, depending on the circumstances given. Some assumptions must be
observed: the observations must be independent and must come from a normal distribution.
The format of the HT process is given below. Each 𝐻𝐴 has a corresponding Decision Rule.
(Observe the signs of the 𝑯𝑨 : <, >, ≠ )
1. 𝐻0 : 𝜇 = 𝜇0
𝐻𝐴 : 𝜇 < 𝜇0 or 𝐻𝐴 : 𝜇 > 𝜇0 or 𝐻𝐴 : 𝜇 ≠ 𝜇0 (Possible alternative hypotheses. Choose the appropriate one.)
3. Test Statistic:
$Z_c = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ if 𝜎 is known/given
$Z_c = \frac{\bar{X} - \mu}{s / \sqrt{n}}$ if 𝜎 is not known/not given but 𝑛 > 30
$t_c = \frac{\bar{X} - \mu}{s / \sqrt{n}}$ if 𝜎 is not known/not given and 𝑛 ≤ 30
4. Decision Rule:
For 𝐻𝐴 : 𝜇 < 𝜇0
If 𝑍 test is used: Reject 𝐻0 if 𝑍𝑐 < −𝑍𝛼 . Fail to reject 𝐻0 otherwise.
If 𝑡 test is used: Reject 𝐻0 if 𝑡𝑐 < −𝑡𝛼,𝑑𝑓 . Fail to reject 𝐻0 otherwise. (Note: 𝑑𝑓 = 𝑛 − 1)
For 𝐻𝐴 : 𝜇 > 𝜇0
If 𝑍 test is used: Reject 𝐻0 if 𝑍𝑐 > 𝑍𝛼 . Fail to reject 𝐻0 otherwise.
If 𝑡 test is used: Reject 𝐻0 if 𝑡𝑐 > 𝑡𝛼,𝑑𝑓 . Fail to reject 𝐻0 otherwise. (Note: 𝑑𝑓 = 𝑛 − 1)
For 𝐻𝐴 : 𝜇 ≠ 𝜇0
If 𝑍 test is used: Reject 𝐻0 if 𝑍𝑐 < −𝑍𝛼/2 or 𝑍𝑐 > 𝑍𝛼/2 . Fail to reject 𝐻0 otherwise.
If 𝑡 test is used: Reject 𝐻0 if 𝑡𝑐 < −𝑡𝛼/2,𝑑𝑓 or 𝑡𝑐 > 𝑡𝛼/2,𝑑𝑓 . Fail to reject 𝐻0 otherwise. (Note: 𝑑𝑓 = 𝑛 − 1)
5. Computation:
Compute 𝑍𝑐 or 𝑡𝑐 .
7. Conclusion: At 𝛼 = ______ ,
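The format above can be sketched in code. The two helper functions below are hypothetical names introduced only for illustration: `compute_statistic` follows the selection rule in step 3, and `decide` applies the decision rule in step 4 to a critical value read from a 𝑍 or 𝑡 table. The numbers in the usage lines are made up for the sketch and do not come from this handout.

```python
from math import sqrt

def compute_statistic(xbar, mu0, n, sigma=None, s=None):
    """Return (test name, computed value) using the rule in step 3."""
    if sigma is not None:                 # sigma known/given: Z test
        return "Z", (xbar - mu0) / (sigma / sqrt(n))
    if s is None:
        raise ValueError("provide sigma (known) or s (sample std. dev.)")
    name = "Z" if n > 30 else "t"         # sigma unknown: Z if n > 30, else t
    return name, (xbar - mu0) / (s / sqrt(n))

def decide(stat, critical, alternative):
    """Apply the decision rule in step 4.

    `critical` is the positive table value: Z_alpha or t_alpha,df for a
    one-tailed test, Z_alpha/2 or t_alpha/2,df for a two-tailed test.
    """
    if alternative == "less":             # HA: mu < mu0
        reject = stat < -critical
    elif alternative == "greater":        # HA: mu > mu0
        reject = stat > critical
    elif alternative == "two-sided":      # HA: mu != mu0
        reject = stat < -critical or stat > critical
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return "Reject H0" if reject else "Fail to reject H0"

# Illustrative (made-up) data: n = 16, sample mean 52, s = 4, H0: mu = 50.
name, value = compute_statistic(xbar=52.0, mu0=50.0, n=16, s=4.0)
decision = decide(value, critical=1.753, alternative="greater")  # t_{0.05,15}
print(name, round(value, 3), decision)
```

Passing the table value in as an argument keeps the sketch free of third-party libraries. A statistics package could supply the critical value instead.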
Example
Suppose the average temperature (in degrees Celsius) of COVID-19 patients at the onset of infection
is claimed to be 38.5 °C with a variance of 0.16 °C². It is also known that these temperatures follow a
normal distribution. A researcher at the Department of Health wants to refute this claim
because he believes that the true average is much higher than this value. He gathers a random sample of 25
COVID-19 patients who are confined in different hospitals in the city and suburbs and finds that the
patients’ average temperature at the onset of the infection is 38.9 °C. At the 5% level of significance,
is the researcher’s belief justified?
[Figure: standard normal curve centered at 0, with the rejection region to the right of the critical value 1.64.]
5. Computation:
$$Z_c = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{38.9 - 38.5}{0.4 / \sqrt{25}} = 5$$
6. Decision: Reject 𝐻0 .
7. Conclusion: At 𝛼 = 0.05, there is sufficient evidence to show that the average temperature of
COVID-19 patients at the onset of infection is higher than 38.5 °C.
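The computation in this example can be reproduced with a short sketch. It assumes a right-tailed alternative (𝐻𝐴 : 𝜇 > 38.5), which is consistent with the conclusion above, and obtains the critical value from Python's standard library instead of a printed 𝑍 table.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, variance, n, alpha = 38.9, 38.5, 0.16, 25, 0.05

sigma = sqrt(variance)                       # population std. dev. = 0.4
z_c = (xbar - mu0) / (sigma / sqrt(n))       # computed test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)     # right-tail critical value Z_alpha

# Right-tailed rule (assumed HA: mu > 38.5): reject H0 if z_c > Z_alpha.
decision = "Reject H0" if z_c > z_crit else "Fail to reject H0"
print(f"Z_c = {z_c:.2f}, Z_alpha = {z_crit:.3f}, decision: {decision}")
```

The library gives a critical value of about 1.645, which tables often list as 1.64 or 1.65. The decision is the same either way because the computed statistic is far beyond it.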
Example
It is known that the mean number of days between injury and initial magnetic resonance imaging (MRI) is
15 days. Mr. Hanayori Dango and other scientists would like to find out if there is a significant
difference between this value and the mean of 13 days that they have computed from a random
sample of 10 subjects with standard deviation of 3 days. At the 1% level of significance, what can they
conclude? Assume that the data come from a normal distribution.
1. 𝐻0 : 𝜇 = 15
𝐻𝐴 : 𝜇 ≠ 15
3. Test Statistic: 𝑡 test (since 𝜎 is not known/not given and 𝑛 ≤ 30)
4. Decision Rule:
Reject 𝐻0 if 𝑡𝑐 < −𝑡𝛼/2,𝑑𝑓 or 𝑡𝑐 > 𝑡𝛼/2,𝑑𝑓 . Fail to reject 𝐻0 otherwise.
Rewriting (Note: 𝑑𝑓 = 𝑛 − 1 = 10 − 1 = 9):
Reject 𝐻0 if 𝑡𝑐 < −𝑡0.01/2,(9) or 𝑡𝑐 > 𝑡0.01/2,(9) . Fail to reject 𝐻0 otherwise.
Reject 𝐻0 if 𝑡𝑐 < −𝑡0.005,(9) or 𝑡𝑐 > 𝑡0.005,(9) . Fail to reject 𝐻0 otherwise.
Reject 𝐻0 if 𝑡𝑐 < −3.250 or 𝑡𝑐 > 3.250. Fail to reject 𝐻0 otherwise.
[Figure: 𝑡 curve centered at 0, with rejection regions to the left of −3.250 and to the right of 3.250.]
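5. Computation:
$$t_c = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{13 - 15}{3 / \sqrt{10}} = -2.108$$
6. Decision: Fail to reject 𝐻0 (since −3.250 < −2.108 < 3.250).
7. Conclusion: At 𝛼 = 0.01, there is insufficient evidence to show that the mean number of days between
injury and initial MRI differs from 15 days; that is, there is no significant difference between the two values.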
This test is used to compare the means of two independent populations, say, male and female
populations. It is also called the 𝒕 test for independent samples or independent samples 𝒕 test. Like
the previous test, either the 𝒁 test (𝜎1 and 𝜎2 are known) or the 𝒕 test (𝜎1 and 𝜎2 are unknown) can
be used depending on the circumstances given. More often than not, the population standard
deviations are unknown, which is why the 𝒕 test is preferred. In this particular 𝒕 test, we assume
that the observations come from normal distributions with equal and unknown variances. This is to
distinguish it from the 𝒕 test for two means with unequal and unknown variances.
The format of the HT process is given below. Corresponding 𝐻𝐴 and Decision Rules are color-coded.
3. Test Statistic:
$$t_c = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
5. Computation:
Compute 𝑡𝑐 .
7. Conclusion: At 𝛼 = ______,
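The test statistic in step 3 uses a pooled standard deviation 𝑠𝑝, which the format does not spell out. Under the equal-variance assumption stated earlier, the usual pooled estimate is sketched below. It is given here as the standard textbook formula, not as a quotation from this handout, though it reproduces the values 0.457 and 1.323 used in the worked examples.

```latex
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}},
\qquad df = n_1 + n_2 - 2
```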
Example
Suppose the average temperature (in degrees Celsius) of COVID-19 patients at the onset of infection
is 38.5 °C for a sample of 12 male patients with a variance of 0.36 °C² and 38.9 °C for a
sample of 15 female patients with a variance of 0.09 °C². It is also known that these temperatures
follow normal distributions with equal population variances. At the 5% level of significance, is it safe to say that,
on the average, male patients have lower temperatures at the onset of infection compared with female
patients?
Given:
Male patients Female patients
𝑋̅1 = 38.5 𝑋̅2 = 38.9
𝑛1 = 12 𝑛2 = 15
𝑠1² = 0.36 𝑠2² = 0.09
𝛼 = 0.05
1. 𝐻0 : 𝜇1 = 𝜇2
𝐻𝐴 : 𝜇1 < 𝜇2
[Figure: 𝑡 curve centered at 0, with the rejection region to the left of the critical value −1.708.]
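5. Computation:
$$t_c = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{38.5 - 38.9}{0.457 \sqrt{\frac{1}{12} + \frac{1}{15}}} = \frac{-0.4}{0.457 \sqrt{\frac{1}{12} + \frac{1}{15}}} = -2.26$$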
6. Decision: Reject 𝐻0 .
7. Conclusion: At 𝛼 = 0.05, it is safe to say that, on the average, male patients have lower temperatures
at the onset of infection compared with female patients.
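The numbers in this example can be checked with a short sketch. The helper `pooled_t` is a hypothetical name, and it assumes the usual equal-variance pooled standard deviation, which matches the 0.457 in the computation above. The critical value 1.708 is taken from a 𝑡 table for 𝑑𝑓 = 25.

```python
from math import sqrt

def pooled_t(xbar1, var1, n1, xbar2, var2, n2):
    """Return (s_p, t_c, df) for the two-sample t test with equal variances."""
    df = n1 + n2 - 2
    # Pooled standard deviation: weighted average of the two sample variances.
    s_p = sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)
    t_c = (xbar1 - xbar2) / (s_p * sqrt(1 / n1 + 1 / n2))
    return s_p, t_c, df

# Male patients: mean 38.5, variance 0.36, n = 12.
# Female patients: mean 38.9, variance 0.09, n = 15.
s_p, t_c, df = pooled_t(38.5, 0.36, 12, 38.9, 0.09, 15)

t_table = 1.708  # t_{0.05, 25} from a t table
# Left-tailed rule for HA: mu1 < mu2: reject H0 if t_c < -t_table.
decision = "Reject H0" if t_c < -t_table else "Fail to reject H0"
print(f"s_p = {s_p:.3f}, t_c = {t_c:.2f}, df = {df}, decision: {decision}")
```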
Example
A test designed to measure mothers’ attitude towards their labor and delivery experiences was given
to two groups of new mothers. Sample 1 (attenders) had attended prenatal classes held at the local
health department. Sample 2 (non-attenders) did not attend the classes. The means and standard
deviations of the test scores were as follows:
Sample 𝒏 𝑿̅ 𝒔
1 15 4.75 1.0
2 22 3.00 1.5
Is there a significant difference in the mean scores of the attenders and non-attenders? Use 𝛼 = 0.05.
Assume that the variances of the two populations are equal.
Given:
Sample 𝒏 𝑿̅ 𝒔
1 15 4.75 1.0
2 22 3.00 1.5
𝜶 = 𝟎.𝟎𝟓
1. 𝐻0 : 𝜇1 = 𝜇2
𝐻𝐴 : 𝜇1 ≠ 𝜇2
4. Decision Rule: Reject 𝐻0 if 𝑡𝑐 < −𝑡𝛼/2,𝑑𝑓 or 𝑡𝑐 > 𝑡𝛼/2,𝑑𝑓 . Fail to reject 𝐻0 otherwise.
Rewriting (Note: 𝑑𝑓 = 15 + 22 − 2 = 35):
Reject 𝐻0 if 𝑡𝑐 < −𝑡0.05/2,(35) or 𝑡𝑐 > 𝑡0.05/2,(35) . Fail to reject 𝐻0 otherwise.
Reject 𝐻0 if 𝑡𝑐 < −𝑡0.025,(35) or 𝑡𝑐 > 𝑡0.025,(35) . Fail to reject 𝐻0 otherwise.
Reject 𝐻0 if 𝑡𝑐 < −2.030 or 𝑡𝑐 > 2.030. Fail to reject 𝐻0 otherwise.
5. Computation:
$$t_c = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{4.75 - 3.00}{1.323 \sqrt{\frac{1}{15} + \frac{1}{22}}} = \frac{1.75}{1.323 \sqrt{\frac{1}{15} + \frac{1}{22}}} = 3.95$$
6. Decision: Reject 𝐻0 .
7. Conclusion: At 𝛼 = 0.05, there is a significant difference in the mean scores of the attenders and
non-attenders.
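The value 𝑠𝑝 = 1.323 in the computation is used without derivation. Assuming the usual pooled standard deviation for two samples with equal population variances, it can be worked out from the given data as follows.

```latex
s_p = \sqrt{\frac{(15 - 1)(1.0)^2 + (22 - 1)(1.5)^2}{15 + 22 - 2}}
    = \sqrt{\frac{14 + 47.25}{35}}
    = \sqrt{1.75}
    \approx 1.323
```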
This test is also called the dependent samples t test or matched pairs t test. It determines whether
there is a significant difference between two sets of observations that are in pairs (think of before-
after, pretest-posttest, etc). Let us illustrate: Does a new exercise routine lower the weights of five
male obese patients? Obese patients are those whose Body Mass Indexes (𝐵𝑀𝐼) are greater than 30.
We let 𝑋1 be the weight of a patient before exercise and 𝑋2 be his weight after exercising for 5 weeks.
To know if there is a decrease in the weights, we get the difference between the 𝑋1 and 𝑋2 of patient
1, the difference between the 𝑋1 and 𝑋2 of patient 2, etc. We DO NOT get the difference between
the 𝑋1 of patient 1 and 𝑋2 of patient 2 nor the difference between the 𝑋1 of patient 2 and 𝑋2 of
patient 5, and so on. An arrow indicates that 𝑋1 and 𝑋2 are paired observations (taken from the
same patient).
Patient 𝑋1 𝑋2 𝑑𝑖 = 𝑋1 − 𝑋2
1
2
3
4
5
The concern of this test is whether the mean of the differences between 𝑋1 and 𝑋2 is significantly
different from a specific value 𝑑0. The value 𝑑0 = 0 is most commonly used; in this particular example, it
corresponds to no significant difference between the two weight measurements. Given below
are the assumptions of this test as stated on www.jmp.com (Statistical Discovery from SAS):
• Subjects must be independent. Measurements for one subject do not affect measurements for any
other subject.
• Each of the paired measurements must be obtained from the same subject. For example, the
before-and-after weight for an obese patient in the example above must be from the same person.
• The measured differences are normally distributed.
Format:
3. Test statistic:
$$t_c = \frac{\bar{d} - \mu_D}{s_d / \sqrt{n}}$$
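5. Computation:
$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$, where $d_i = X_1 - X_2$ for $i = 1, 2, \ldots, n$
$s_d^2 = \frac{n \sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}{n(n-1)} \;\rightarrow\; s_d = +\sqrt{s_d^2}$
$t_c = \frac{\bar{d} - \mu_D}{s_d / \sqrt{n}}$
7. Conclusion: At 𝛼 = _____ ,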
Example
Let us go back to the illustration given before. Suppose the weights before the new exercise routine
(𝑋1) and weights after exercising for 6 weeks (𝑋2) of the five male obese patients are given below:
Patient 𝑋1 𝑋2
1 88 87
2 90 88
3 79 74
4 98 99
5 100 97
At the 5% level of significance, can we claim that the new exercise routine is effective in lowering the
weights of obese patients?
Solution:
1. 𝐻0 : 𝜇𝐷 = 0 (There is no significant difference in the weight before the new exercise routine
and weight after the new exercise routine.) or
(The mean of the differences between the weight before the new exercise routine
and the weight after the new exercise routine is equal to 0.)
𝐻𝐴 : 𝜇𝐷 > 0 (The new exercise routine is effective in lowering the weights.) or
(The mean of the differences between the weight before the new exercise routine
and the weight after the new exercise routine is greater than 0.)
You may ask why we use 𝐻𝐴 : 𝜇𝐷 > 0. Take note that 𝑑𝑖 = 𝑋1 − 𝑋2. If the new exercise routine
is effective in lowering the weights, then 𝑋1 > 𝑋2 (weight before is greater than weight after)
which makes 𝑑𝑖 > 0 (the difference between weight before and weight after is positive). If we
solve $\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$, then we expect that 𝑑̅ > 0 also. Thus, we use 𝐻𝐴 : 𝜇𝐷 > 0.
3. Test statistic:
$$t_c = \frac{\bar{d} - \mu_D}{s_d / \sqrt{n}}$$
4. Decision Rule:
Reject 𝐻0 if 𝑡𝑐 > 𝑡𝛼,𝑑𝑓. Fail to reject 𝐻0 otherwise.
Rewriting: (𝑁𝑜𝑡𝑒: 𝑑𝑓 = 𝑛 − 1 = 5 − 1 = 4)
Reject 𝐻0 if 𝑡𝑐 > 𝑡0.05,(4). Fail to reject 𝐻0 otherwise.
Reject 𝐻0 if 𝑡𝑐 > 2.132. Fail to reject 𝐻0 otherwise.
5. Computation:
Patient 𝑋1 𝑋2 𝑑𝑖 = 𝑋1 − 𝑋2
1 88 87 1
2 90 88 2
3 79 74 5
4 98 99 −1
5 100 97 3
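$\sum_{i=1}^{5} d_i = 10$
$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} = \frac{10}{5} = 2$
$s_d^2 = \frac{n \sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}{n(n-1)} = \frac{5[1^2 + 2^2 + 5^2 + (-1)^2 + 3^2] - [1 + 2 + 5 + (-1) + 3]^2}{5(5-1)} = 5$
$s_d = +\sqrt{s_d^2} = +\sqrt{5} = 2.236$
$t_c = \frac{\bar{d} - \mu_D}{s_d / \sqrt{n}} = \frac{2 - 0}{2.236 / \sqrt{5}} = 2$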
6. Decision: Fail to reject 𝐻0 (since 𝑡𝑐 = 2 is not greater than 2.132).
7. Conclusion: At 𝛼 = 0.05, there is insufficient evidence in the data; thus, we cannot claim that the
new exercise routine is effective in lowering the weights of male obese patients.
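The paired computation above can be reproduced with a short sketch that follows the same formulas for 𝑑̅, 𝑠𝑑², and 𝑡𝑐. The variable names are chosen for illustration, and the critical value 2.132 is the table value 𝑡0.05,(4) used in the decision rule.

```python
from math import sqrt

before = [88, 90, 79, 98, 100]   # X1: weights before the exercise routine
after = [87, 88, 74, 99, 97]     # X2: weights after the exercise routine

# Differences are taken within each patient (paired observations).
d = [x1 - x2 for x1, x2 in zip(before, after)]
n = len(d)

d_bar = sum(d) / n
# Computational formula for the variance of the differences.
var_d = (n * sum(x * x for x in d) - sum(d) ** 2) / (n * (n - 1))
s_d = sqrt(var_d)
t_c = (d_bar - 0) / (s_d / sqrt(n))   # mu_D = 0 under H0

t_table = 2.132  # t_{0.05, 4} from a t table
# Right-tailed rule for HA: mu_D > 0: reject H0 if t_c > t_table.
decision = "Reject H0" if t_c > t_table else "Fail to reject H0"
print(f"d = {d}, d_bar = {d_bar}, s_d^2 = {var_d}, t_c = {t_c:.3f}, {decision}")
```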