ECT702 Lecture6 Hypothesis Testing-1
where 𝑛 is the sample size and 𝑝̅ is the proportion of successes in the sample.
Reject 𝐻0 at 𝛼 level of significance if $|Z_c| > z_{\alpha/2}$ (for a two-sided alternative).
Example:
The Dean of Students Office at a local university is conducting a survey to determine the proportion of
incoming first years that will need financial aid. A survey on housing needs, financial aid and academic
interests is collected from 400 of the incoming first years. The Dean of Students Office hypothesized that 30%
of the first years will need financial aid and the sample from the survey indicated that 101 would need financial
aid. At 5% significance level, is this an accurate guess?
Solution:
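$$H_0: p = 0.30 \qquad H_1: p \neq 0.30$$
$$\bar{p} = \frac{101}{400} = 0.2525$$
$$Z_c = \frac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.2525 - 0.30}{\sqrt{\dfrac{0.30(1-0.30)}{400}}} = -2.0731$$
[Figure: standard normal curve with 2.5% rejection regions in each tail, critical values −1.96 and 1.96]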
Since |𝑍𝑐| = 2.0731 > 1.96, we reject the null hypothesis (that 𝑝 = 0.30) and conclude that the proportion of
first years needing financial aid is significantly different from 30%.
Since the test statistic is negative, we can conclude at 5% significance level that in the population of incoming
first years, fewer than 30% of the students will need financial aid.
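The calculation above can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function name `one_proportion_z_test` and its return layout are illustrative choices, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-sided z test of H0: p = p0 using the large-sample normal approximation."""
    p_bar = successes / n                              # sample proportion
    z_c = (p_bar - p0) / sqrt(p0 * (1 - p0) / n)       # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # two-sided critical value
    return z_c, z_crit, abs(z_c) > z_crit


# Financial-aid example: 101 of 400 first years, hypothesized p0 = 0.30
z_c, z_crit, reject = one_proportion_z_test(101, 400, 0.30)
print(f"Z_c = {z_c:.4f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

Passing a different `alpha` changes only the critical value, so the same call can be reused for other significance levels.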
Note: For the 1% level of significance, each tail has area
$$\frac{1\%}{2} = 0.5\% = 0.005$$
We need to find the point z such that the total area to the left of that point is 100% − 0.5% = 99.5%, that is,
$$\frac{99.5}{100} = 0.995$$
which gives z = 2.58. Since
$$|Z_c| = 2.0731 < 2.58$$
we fail to reject the null hypothesis: at the 1% level, the data do not provide sufficient evidence that the
proportion of students in the population in need of financial aid differs from 30%.
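The critical value 2.58 is normally read from a standard normal table. As a rough check, it can also be obtained from the inverse normal CDF; the helper name `two_sided_critical_value` below is an assumption made for illustration.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha):
    """Point z with area 1 - alpha/2 to its left under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: z = {two_sided_critical_value(alpha):.4f}")
```

The values agree with the table entries 1.96 and 2.58 once rounded to two decimal places.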
Exercise: Repeat the above example using a 1% instead of a 5% level of significance. What is your conclusion?
Now repeat it with a 10% significance level. What is your conclusion?
Example:
The MoE report indicated that in 2018, 75% of University students aged 17 to 22 saw a counsellor in the past
year. An investigator wants to assess whether the use of counselling services is similar in students of a local
university. A sample of 125 university students aged 17 to 22 from the local university are surveyed and 64
reported seeing a counsellor over the past 12 months. At 1% level of significance, is there a significant
difference in use of counselling services between the students of the local university and the national data?
Solution:
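$$H_0: p = 0.75 \qquad H_1: p \neq 0.75$$
$$\bar{p} = \frac{64}{125} = 0.512$$
The test statistic is given by
$$Z_c = \frac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.512 - 0.75}{\sqrt{\dfrac{0.75 \times 0.25}{125}}} = \frac{-0.238}{0.0387} = -6.1451$$
[Figure: standard normal curve with 0.5% rejection regions in each tail, critical values −2.58 and 2.58]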
Since |𝑍𝑐| = 6.1451 > 2.58, we reject the null hypothesis at 1% significance level and conclude that the
proportion of students using counselling services at the local university is significantly different from the
proportion nationally.
Exercise:
The NACADA report indicated that in 2019 the prevalence of Marijuana smoking among Kenyan adults was
21.1%. Data on prevalent smoking in 3536 participants who attended a medical camp indicated that 482 of the
respondents were currently smoking Marijuana at the time of the camp. Suppose we want to assess whether the
prevalence of Marijuana smoking is lower in the medical camp sample given the focus on health in that
community. At 5% significance level, is there evidence of a statistically lower prevalence of Marijuana
smoking in the Medical Camp study as compared to the prevalence among all Kenyans?
Ans. |𝑍𝑐 | = 10.93 > 1.65; reject 𝐻0 and conclude that we have statistically significant evidence at 5%
significance level to show that the prevalence of Marijuana smoking in the Medical Camp is lower than the
prevalence nationally.
HYPOTHESIS TESTS ABOUT THE POPULATION PROPORTIONS: TWO SAMPLE INFERENCE
Here we consider the situation where there are two independent comparison groups and the outcome of interest
is dichotomous (e.g., Successes/Failure; Yes/No). The goal of the analysis is to compare proportions of
successes between the two groups.
Suppose we wish to test the null hypothesis 𝐻0 : 𝑝1 = 𝑝2 against the alternative 𝐻1 : 𝑝1 ≠ 𝑝2 at 𝛼 level of
significance.
The test statistic is given by
$$Z_c = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\dfrac{\bar{p}_c(1-\bar{p}_c)}{n_1} + \dfrac{\bar{p}_c(1-\bar{p}_c)}{n_2}}}$$
where
$\bar{p}_1$ is the proportion of successes in sample 1; i.e., $\bar{p}_1 = \dfrac{x_1}{n_1}$
$x_1$ is the number of successes in sample 1
$n_1$ is the size of sample 1
$\bar{p}_2$ is the proportion of successes in sample 2; i.e., $\bar{p}_2 = \dfrac{x_2}{n_2}$
$x_2$ is the number of successes in sample 2
$n_2$ is the size of sample 2
$\bar{p}_c$ is the proportion of successes in the pooled sample; i.e., $\bar{p}_c = \dfrac{x_1 + x_2}{n_1 + n_2}$
Note:
The above formula is appropriate for large samples, defined as at least 5 successes (𝑛𝑝 ≥ 5) and at least 5
failures (𝑛(1 − 𝑝) ≥ 5) in each of the two samples. If there are fewer than 5 successes or failures in either
comparison group, then alternative procedures, called exact methods, must be used to compare the
population proportions.
Example:
The following table summarizes data from 3799 participants who attended a Medical Camp. The outcome of
interest is prevalent cardiovascular disease and we want to test whether the prevalence of cardiovascular disease
is significantly higher in smokers as compared to non-smokers.
                 Free of CVD   History of CVD   Total
Non-smoker              2757              298    3055
Current smoker           663               81     744
Total                   3420              379    3799
Let sample 1 (resp. 2) be the one corresponding to non-smokers (resp. smokers).
We wish to test the hypothesis
$$H_0: p_1 = p_2 \quad \text{vs} \quad H_1: p_1 < p_2$$
$$n_1 = 3055; \quad x_1 = 298; \quad \bar{p}_1 = \frac{x_1}{n_1} = \frac{298}{3055} = 0.0975$$
$$n_2 = 744; \quad x_2 = 81; \quad \bar{p}_2 = \frac{x_2}{n_2} = \frac{81}{744} = 0.1089$$
$$\bar{p}_c = \frac{x_1 + x_2}{n_1 + n_2} = \frac{298 + 81}{3055 + 744} = \frac{379}{3799} = 0.0998$$
The test statistic is given by
$$Z_c = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\dfrac{\bar{p}_c(1-\bar{p}_c)}{n_1} + \dfrac{\bar{p}_c(1-\bar{p}_c)}{n_2}}} = \frac{0.0975 - 0.1089}{\sqrt{\dfrac{0.0998 \times 0.9002}{3055} + \dfrac{0.0998 \times 0.9002}{744}}} = \frac{-0.0114}{\sqrt{0.0000294 + 0.0001208}} = -0.930$$
[Figure: standard normal curve with a 5% one-tailed rejection region, critical value 1.65]
Since |𝑍𝑐| = 0.930 < 1.65, we fail to reject the null hypothesis and conclude that the data do not provide
sufficient evidence that the prevalence of cardiovascular disease is higher in smokers than in
non-smokers.
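The pooled two-proportion test can be sketched in a few lines of Python. The function name `two_proportion_z` is hypothetical, and the large-sample check simply encodes the "at least 5 successes and 5 failures" rule from the note above. Because the script carries full precision, its statistic (about −0.92) differs slightly from a hand calculation with rounded proportions; the conclusion is the same.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 (large-sample approximation)."""
    # Large-sample rule: at least 5 successes and 5 failures in each sample
    for x, n in ((x1, n1), (x2, n2)):
        if x < 5 or n - x < 5:
            raise ValueError("fewer than 5 successes or failures; use an exact method")
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)                        # pooled proportion
    se = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    return (p1 - p2) / se


# CVD example: sample 1 = non-smokers, sample 2 = current smokers
z_c = two_proportion_z(298, 3055, 81, 744)
z_crit = NormalDist().inv_cdf(0.05)                    # lower-tail critical value for H1: p1 < p2
reject = z_c < z_crit
print(f"Z_c = {z_c:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```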
Exercise:
The makers of a new drug for ADHD reported that 26 of the 374 subjects who took the drug (experimental
group) experienced vomiting as a side effect, compared to 8 of the 210 subjects who were on the placebo
(control group). Note that patients did not know which treatment they were given. At 5% level of significance,
is there sufficient evidence to suggest that the entire population on the drug would experience more vomiting?
Ans.: |𝑍𝑐 | = 1.60; do not reject 𝐻0
Let sample 1 (resp. 2) be the one corresponding to experimental (resp. control) group.
We wish to test the hypothesis
𝐻0 : 𝑝1 = 𝑝2 ; vs 𝐻1 : 𝑝1 > 𝑝2
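$$n_1 = 374; \quad x_1 = 26; \quad \bar{p}_1 = \frac{x_1}{n_1} = \frac{26}{374} = 0.070$$
$$n_2 = 210; \quad x_2 = 8; \quad \bar{p}_2 = \frac{x_2}{n_2} = \frac{8}{210} = 0.038$$
$$\bar{p}_c = \frac{x_1 + x_2}{n_1 + n_2} = \frac{26 + 8}{374 + 210} = 0.058$$
The test statistic is given by
$$Z_c = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\dfrac{\bar{p}_c(1-\bar{p}_c)}{n_1} + \dfrac{\bar{p}_c(1-\bar{p}_c)}{n_2}}} = \frac{0.070 - 0.038}{\sqrt{\dfrac{0.058 \times 0.942}{374} + \dfrac{0.058 \times 0.942}{210}}} = \frac{0.032}{0.020} = 1.60$$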
Example:
The Energy Regulatory Authority (ERA) reported in June 2021, that the mean price of a liter of Kerosene was
Sh. 103.70. A random sample of 25 petrol stations had a mean price of Sh. 103.90. Assuming normality, and a
population standard deviation of Sh. 0.5, test using a 5% level of significance whether the population mean
price of kerosene has risen since June 2021.
Solution:
We wish to test the hypothesis 𝐻0 : 𝜇 = 103.7 against the alternative 𝐻1 : 𝜇 > 103.7 at 5% level of significance
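A worked sketch of the calculation, assuming the standard one-sample z statistic for a known population standard deviation and the upper-tail 5% critical value of 1.65 used elsewhere in these notes:

```latex
Z_c = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}
    = \frac{103.90 - 103.70}{0.5 / \sqrt{25}}
    = \frac{0.20}{0.10}
    = 2.00
```

Under these assumptions 2.00 > 1.65, so 𝐻0 would be rejected at the 5% level, which suggests that the mean price has risen.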
Example:
The Ministry of Health reports that in 2020, the mean cost of a stay in a hospital for Kenyan women aged 18-44
was Sh. 15,200. A random sample of 400 hospital stays for women aged 18-44 showed a mean cost of Sh.
16,000, with a standard deviation of Sh. 5000. Test whether the population mean cost has increased since 2020,
using a 5% level of significance.
Solution:
We wish to test the hypothesis 𝐻0 : 𝜇 = 15,200 against the alternative 𝐻1 : 𝜇 > 15,200 at 5% level of
significance
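The test can be sketched numerically as follows. This assumes that, because 𝑛 = 400 is large, the sample standard deviation may stand in for the population value in a z statistic; the function name `one_sample_z` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(x_bar, mu0, sd, n):
    """z statistic for H0: mu = mu0; sd is sigma, or S when n is large."""
    return (x_bar - mu0) / (sd / sqrt(n))


# Hospital-stay example: n = 400, sample mean 16,000, sample s.d. 5,000
z_c = one_sample_z(16_000, 15_200, 5_000, 400)
z_crit = NormalDist().inv_cdf(0.95)        # upper-tail critical value at 5%
print(f"Z_c = {z_c:.2f}, critical value = {z_crit:.3f}, reject H0: {z_c > z_crit}")
```

Under these assumptions the statistic is 800/250 = 3.2, well above the critical value, so the data would support an increase in mean cost.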
Example:
The program director for an accounting program wishes to test, at 5% level of significance, the hypothesis that
her students score higher than national average of 615 on national final exam. She randomly selects 11 recent
graduates of a two-year program and discovers that 𝑋̅ = 630 and 𝑆 = 23.
In this case, we wish to test 𝐻0 : 𝜇 = 615 vs. 𝐻1 : 𝜇 > 615 at 5% level of significance.
The test statistic is given by
$$t_c = \frac{\sqrt{n}\,(\bar{X} - \mu_0)}{S} = \frac{\sqrt{11}\,(630 - 615)}{23} = 2.1630$$
Since 𝑡𝑐 = 2.1630 > 1.812 (the upper 5% point of the t distribution with 𝑛 − 1 = 10 degrees of freedom), we
reject 𝐻0 at 5% level of significance and conclude that her students score higher than the national average of 615.
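A minimal sketch of the same t calculation in Python. The standard library has no t distribution, so the critical value is entered by hand from a t table; `one_sample_t` is a hypothetical helper name.

```python
from math import sqrt


def one_sample_t(x_bar, mu0, s, n):
    """t statistic for H0: mu = mu0 with n - 1 degrees of freedom."""
    return sqrt(n) * (x_bar - mu0) / s


# Accounting-program example: n = 11, sample mean 630, sample s.d. 23
t_c = one_sample_t(630, 615, 23, 11)
t_crit = 1.812                             # upper 5% point, 10 d.f., read from a t table
print(f"t_c = {t_c:.4f}, reject H0: {t_c > t_crit}")
```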
Exercise:
1. Thirteen data values are observed in a fire-prevention study of sprinkler activation time (in seconds).
27 41 22 27 23 35 30 33 24 27 28 22 24
Actual average activation time is supposed to be 25 seconds. Test if it is more than this at 5% level of
significance.
Ans: 𝑡𝑐 = 1.876 > 1.782
2. A large software company gives job applicants a test of programming ability and the mean for that test
has been 160 in the past. Twenty-five job applicants are randomly selected from one large university and
they produce a mean score and standard deviation of 183 and 12, respectively. At 5% level of
significance, test the claim that this sample comes from a population with a mean score greater than 160.
Ans: 𝑡𝑐 = 9.583 > 1.711
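When the population variances are known, the test statistic for comparing two population means is
$$Z_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
When they are unknown and both samples are large, the sample variances are used instead:
$$Z_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$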
Example:
Many people take ginkgo supplements advertised to improve memory. Are these over-the-counter supplements
effective? In a study, elderly adults were assigned to the treatment group or control group. The 104 participants
who were assigned to the treatment group took 40 mg of ginkgo 3 times a day for 6 weeks. The 115 participants
assigned to the control group took a placebo pill 3 times a day for 6 weeks. At the end of 6 weeks, the Wechsler
Memory Scale was administered. Higher scores indicate better memory function. Summary values are given in
the following table:
$$t_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}}$$
where
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
Reject 𝐻0 at 𝛼 level of significance if $|t_c| > t_{n_1+n_2-2,\,\alpha/2}$
Example:
Daily protein intake (in grams) is measured on a sample of individuals living below the poverty level and another sample
living above the poverty level with the results:
Suppose we wish to see whether these two groups differ significantly in their mean protein intake. Our hypotheses would
then be
𝐻0 : 𝜇1 = 𝜇2 against the alternative 𝐻1 : 𝜇1 ≠ 𝜇2, where 𝜇1 and 𝜇2 represent the mean protein intake of those below
and above the poverty level, respectively.
We first must find the sample mean and s.d. for each sample. These are:
              Below poverty level    Above poverty level
Sample mean   $\bar{x}_1 = 66.29$    $\bar{x}_2 = 77.49$
Sample s.d.   $S_1 = 9.17$           $S_2 = 11.34$
Sample size   $n_1 = 15$             $n_2 = 10$
The pooled variance is given by
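$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{14 \times 9.17^2 + 9 \times 11.34^2}{15 + 10 - 2} = 101.50$$
The test statistic is given by
$$t_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}} = \frac{66.29 - 77.49}{\sqrt{\dfrac{101.50}{15} + \dfrac{101.50}{10}}} = -2.72$$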
At 5% significance level, the two-sided critical value with 15 + 10 − 2 = 23 degrees of freedom is
$$t_{23,\,0.025} = 2.069$$
Since |𝑡𝑐| = 2.72 > 2.069, we reject the null hypothesis and conclude that the mean protein intakes in the two
groups differ significantly.
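The pooled-variance calculation can be sketched in Python from the summary statistics alone. The function name `pooled_t` and its return layout are assumptions for illustration; the critical value still comes from a t table.

```python
from math import sqrt


def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled-variance two-sample t statistic; returns (Sp^2, t_c, degrees of freedom)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance
    t_c = (x1 - x2) / sqrt(sp2 / n1 + sp2 / n2)
    return sp2, t_c, df


# Protein-intake example: below vs. above the poverty level
sp2, t_c, df = pooled_t(66.29, 9.17, 15, 77.49, 11.34, 10)
t_crit = 2.069                             # two-sided 5% critical value, 23 d.f., from a t table
print(f"Sp^2 = {sp2:.2f}, t_c = {t_c:.2f}, df = {df}, reject H0: {abs(t_c) > t_crit}")
```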
TESTS FOR GOODNESS-OF-FIT AND CONTINGENCY TABLES
The chi-square (read kie square) test for goodness of fit is one of the most common nonparametric
procedures. The test evaluates whether there is a statistically significant difference between
observed scores for a sample and expected or hypothesized scores in a population. The test is
based on the assumption that observed scores fall randomly into one category or another, and the
chance of a score falling into a particular category can be estimated. In other words, the test may be
used to check the extent to which a distribution of observed (sample) scores fits an expected or
theoretical distribution.
The test is designed to assess the extent of agreement between the observed and expected
outcomes in each category. Unless the researcher has information or a rationale to the contrary, the
expected frequencies represent an equal proportion of cases in each category. Thus, the expected
frequency may be calculated by
𝐸𝑖 = 𝑁⁄𝑘
where 𝑁 represents the total number of cases, and 𝑘 represents the total number of categories. The
test statistic for the chi-square goodness-of-fit is given by
$$\chi_c^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where 𝑂𝑖 and 𝐸𝑖 are the observed and expected frequencies, respectively, for each category. Reject
𝐻0 at 𝛼 level of significance if $\chi_c^2 > \chi^2_{k-1,\,\alpha}$
Example:
The following data on absenteeism was collected from a manufacturing plant. At 1% level of
significance, can we support the claim that there is a difference in the absence rate by day of the
week?
Day Frequency
Monday 95
Tuesday 65
Wednesday 60
Thursday 80
Friday 100
Solution:
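We are testing the following hypotheses:
𝐻0 : The absence rate is the same on every day of the week
𝐻1 : The absence rate differs by day of the week
Under 𝐻0, the expected frequency for each day is
$$E_i = \frac{95 + 65 + 60 + 80 + 100}{5} = 80$$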
Day         O_i   E_i   O_i − E_i   (O_i − E_i)^2   (O_i − E_i)^2 / E_i
Monday       95    80          15             225                2.8125
Tuesday      65    80         −15             225                2.8125
Wednesday    60    80         −20             400                5
Thursday     80    80           0               0                0
Friday      100    80          20             400                5
Total                                                           15.625
Since $\chi_c^2 = 15.625 > \chi^2_{4,\,0.01} = 13.277$, we reject 𝐻0 and conclude that there is a difference in
absenteeism due to day of the week.
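The goodness-of-fit statistic is simple to compute directly. The sketch below assumes equal expected frequencies 𝐸𝑖 = 𝑁/𝑘 unless others are supplied; `chi_square_gof` is a hypothetical helper name, and the critical value is taken from a chi-square table.

```python
def chi_square_gof(observed, expected=None):
    """Chi-square goodness-of-fit statistic and its degrees of freedom (k - 1)."""
    n, k = sum(observed), len(observed)
    if expected is None:
        expected = [n / k] * k             # equal proportions in every category
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, k - 1


# Absenteeism example: Monday to Friday
absences = {"Monday": 95, "Tuesday": 65, "Wednesday": 60, "Thursday": 80, "Friday": 100}
chi2_c, df = chi_square_gof(list(absences.values()))
chi2_crit = 13.277                         # 1% critical value with 4 d.f., from a chi-square table
print(f"chi2_c = {chi2_c:.3f}, df = {df}, reject H0: {chi2_c > chi2_crit}")
```

The optional `expected` argument covers cases where the hypothesized proportions are not equal, such as comparing a sample against national proportions.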
Example:
During the COVID-19 containment period, the State Department for Transport collected data
from a random sample of 43,000 employees across the country on how they were getting to work.
Suppose you wanted to know if people who live in Nairobi commute with similar proportions as
the entire nation. To do this, you pick a random sample of 1000 employees in Nairobi County from
which you obtain the following data:
Example:
Suppose we wish to establish whether employees in the County Government of Embu represent
the sub counties in Embu County in equal proportions. To do this, we randomly select a group
of 95 employees of the County Government of Embu, and find that 60 of them come from
Embu county. We then examine the home sub-county for each of the 60 employees and the
result is as follows:
Sub-County No. of Employees
Embu East 15
Embu West 8
Embu North 12
Mbeere North 10
Mbeere South 15
Clearly, there are sub-county-to-sub-county differences. The issue is whether the divergence
from the expected equality is sufficient to be statistically significant, or could any differences be
just the random variations that can occur any time one draws a sample.
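The chi-square statistic for this table can be worked out as follows, assuming the null hypothesis is that the five sub-counties are equally represented, so that every expected frequency is 𝐸𝑖 = 𝑁/𝑘:

```latex
E_i = \frac{60}{5} = 12

\chi_c^2 = \frac{(15-12)^2 + (8-12)^2 + (12-12)^2 + (10-12)^2 + (15-12)^2}{12}
         = \frac{9 + 16 + 0 + 4 + 9}{12}
         = \frac{38}{12}
         = 3.1667
```

With 𝑘 = 5 categories the statistic has 𝑘 − 1 = 4 degrees of freedom.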
Since $\chi_c^2 = 3.1667 < \chi^2_{4,\,0.05} = 9.488$, we fail to reject 𝐻0 and conclude that the
sub-county-to-sub-county differences are not statistically significant.
Chi-Square Test of Independence/Association
The chi-square test of independence/association determines whether there is an association
between categorical variables; that is, whether the variables are independent or related. The test
utilizes a contingency table to analyze the data. A contingency table (also known as a cross-
tabulation) is an arrangement in which data are classified according to two categorical variables.
The categories for one variable appear in rows, and the categories for the other variable appear in
columns. Each variable must have two or more categories. Each cell reflects the total count of
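cases (the observed frequency) for a specific pair of categories. The expected frequency for the cell
in row 𝑖 and column 𝑗 is computed as follows:
$$E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}$$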
Example:
A researcher conducted a poll with 1000 teachers on their opinion about re-introduction of
corporal punishment in secondary schools. The data are presented in the following table:
Opinion on CP        Males   Females   Total
Re-introduce           270       230     500
Don’t re-introduce     205       245     450
Unsure                  25        25      50
Total                  500       500    1000
At 5% level of significance, can we conclude that gender and the opinion about re-introducing
corporal punishment are dependent events?
Solution:
We wish to test the following hypotheses
𝐻0 : Gender and opinion are independent
𝐻1 : Gender and opinion are dependent
Since $\chi_c^2 = 6.7556 > \chi^2_{2,\,0.05} = 5.991$, we reject 𝐻0 and conclude that gender and opinion are
dependent variables. Male teachers are more likely to support re-introduction of corporal
punishment.
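The statistic for a contingency table can be reproduced with a short sketch. It assumes the usual expected count for each cell, (row total × column total) / grand total, and (rows − 1)(columns − 1) degrees of freedom; `chi_square_independence` is a hypothetical helper name.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Corporal-punishment poll: rows = opinion, columns = (males, females)
opinions = [[270, 230], [205, 245], [25, 25]]
chi2_c, df = chi_square_independence(opinions)
chi2_crit = 5.991                          # 5% critical value with 2 d.f., from a chi-square table
print(f"chi2_c = {chi2_c:.4f}, df = {df}, reject H0: {chi2_c > chi2_crit}")
```

The table is passed without its margin totals, since the function recomputes them.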
Example:
A marriage counselor is interested in whether men or women are more likely to seek professional
counseling for conflicts within their marriages, to ignore conflicts, or to try to resolve conflicts
themselves. The counselor collects data from a sample of 110 men and women and asks the
participants to answer questions about conflict resolution in a manner that reflects their true
feelings. The 3 × 2 contingency table is displayed in the following table.
Question: What would have been the conclusion if we tested at 5% significance level?
Example:
Consider the following data, collected from 88 school children. The data are categorized by gender
and by whether the child had repeated or has never repeated a grade.
                  No repeat   Repeated   Total
Gender   Male            45         10      55
         Female          31          2      33
Total                    76         12      88
At 5% level of significance, can we conclude that there is a significant relationship between gender
and repeating a grade?
Solution:
Observed frequencies, with expected frequencies in parentheses:
                  No repeat    Repeated   Total
Gender   Male     45 (47.5)    10 (7.5)      55
         Female   31 (28.5)     2 (4.5)      33
Total                    76          12      88

O_i    E_i     (O_i − E_i)^2   (O_i − E_i)^2 / E_i
45     47.5    6.25            0.131579
10     7.5     6.25            0.833333
31     28.5    6.25            0.219298
2      4.5     6.25            1.388889
Total                          2.5731
Since $\chi_c^2 = 2.5731 < \chi^2_{1,\,0.05} = 3.841$, we fail to reject 𝐻0 and conclude that there is no
statistically significant relationship between gender and repeating a grade.
ERRORS IN HYPOTHESIS TESTING
- Type I error occurs when we reject a true 𝐻0
- Type II error occurs when we fail to reject a false 𝐻0
Decision