ECT702 Lecture6 Hypothesis Testing-1

The document discusses hypothesis testing in statistics, outlining its purpose, steps, and methods for testing population proportions and means. It includes examples demonstrating how to apply hypothesis testing to real-world scenarios, such as assessing financial aid needs and counseling service usage among university students. The document emphasizes the importance of determining statistical significance in research claims.

Uploaded by

erastuzkipyegon

UNIVERSITY OF EMBU

ECT 702: STATISTICS FOR EDUCATIONAL RESEARCH

LECTURE 6: HYPOTHESIS TESTING
Introduction
The goal of hypothesis testing is to bring data to bear on an assertion or claim, or to examine the credibility
of a specific research result. In other words, hypothesis testing helps determine whether a given data set
contains enough evidence to conclude that an assertion is supported at a statistically significant level
(that is, beyond what chance alone would produce).
Hypothesis testing starts with the assumption that the null hypothesis is true, which implies that the
hypothesized effect or relationship does not exist in the population.
Hypothesis testing is usually formulated as a test of a reference null hypothesis, 𝐻0 against an alternative
hypothesis, 𝐻1 .
The method of hypothesis testing can be summarized in four steps:
1. Identify a hypothesis or claim that needs to be tested; e.g., the mean number of minutes per day that the
youths spend on social media is 120 minutes.
2. Select a criterion (the level of significance, 𝛼) upon which we decide whether the hypothesis being tested should be rejected or not.
3. Select a sample from the population and measure the relevant sample statistic (e.g., the sample mean).
4. Compare what we observe in the sample to what we expect to observe if the claim we are testing is true.
HYPOTHESIS TESTS ABOUT THE POPULATION PROPORTION: ONE SAMPLE INFERENCE
Suppose we wish to test the hypothesis 𝐻0 : 𝑝 = 𝑝0 against the alternative 𝐻1 : 𝑝 ≠ 𝑝0 at 𝛼 level of significance.
The test statistic is given by
$$Z_c = \frac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$

where 𝑛 is the sample size and 𝑝̅ is the proportion of successes in the sample.
Reject 𝐻0 at 𝛼 level of significance if $|Z_c| > Z_{\alpha/2}$

Example:
The Dean of Students Office at a local university is conducting a survey to determine the proportion of
incoming first years that will need financial aid. A survey on housing needs, financial aid and academic
interests is collected from 400 of the incoming first years. The Dean of Students Office hypothesized that 30%
of the first years will need financial aid and the sample from the survey indicated that 101 would need financial
aid. At 5% significance level, is this an accurate guess?
Solution:
𝐻0 : 𝑝 = 0.30
𝐻1 : 𝑝 ≠ 0.3
$$\bar{p} = \frac{101}{400} = 0.2525$$
$$Z_c = \frac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} = \frac{0.2525 - 0.30}{\sqrt{\dfrac{0.30(1 - 0.30)}{400}}} = -2.0731$$

At 5% significance level, 𝑍0.025 = 1.96

[Figure: standard normal curve with a 2.5% rejection region in each tail, beyond -1.96 and 1.96]
Since $|Z_c| = 2.0731 > 1.96$, we reject the null hypothesis (that 𝑝 = 0.30) and conclude that the proportion of
first years needing financial aid is significantly different from 30%.
Since the test statistic is negative, the sample suggests at 5% significance level that, in the population of
incoming first years, fewer than 30% of the students will need financial aid.
Note: For 1% level of significance, each tail has area
$$\frac{1\%}{2} = 0.5\% = 0.005$$
We need to find the value of the point z such that the total area to the left of that point is
100% - 0.5% = 99.5% = 0.995, which gives $Z_{0.005} = 2.58$.
Since $|Z_c| = 2.0731 < 2.58$, we fail to reject the null hypothesis at 1% significance level: the data do not
provide sufficient evidence that the proportion of students in need of financial aid differs from 30%.
Exercise: Repeat the above example using 1% instead of 5% level of significance. What is your conclusion?
Now repeat it with 10% significance level, what is your conclusion?
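The one-sample proportion test above can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names `one_proportion_z` and `two_tailed_critical` are illustrative assumptions, and the critical values come from Python's standard-library normal distribution rather than from a printed table (so 1.96 appears as about 1.960 and 2.58 as about 2.576).

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Z statistic for H0: p = p0, using the standard error implied by H0."""
    p_bar = successes / n
    return (p_bar - p0) / sqrt(p0 * (1 - p0) / n)


def two_tailed_critical(alpha):
    """Critical value Z_{alpha/2} of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Financial aid example: 101 of 400 first years, hypothesized p0 = 0.30.
z = one_proportion_z(101, 400, 0.30)
for alpha in (0.05, 0.01):
    crit = two_tailed_critical(alpha)
    decision = "reject H0" if abs(z) > crit else "fail to reject H0"
    print(f"alpha={alpha:.2f}: Z_c={z:.4f}, critical={crit:.3f}, {decision}")
```

The same two functions can be reused for the counselling example by passing 64, 125 and 0.75.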
Example:
The MoE report indicated that in 2018, 75% of University students aged 17 to 22 saw a counsellor in the past
year. An investigator wants to assess whether the use of counselling services is similar in students of a local
university. A sample of 125 university students aged 17 to 22 from the local university are surveyed and 64
reported seeing a counsellor over the past 12 months. At 1% level of significance, is there a significant
difference in use of counselling services between the students of the local university and the national data?
Solution:
𝐻0 : 𝑝 = 0.75
𝐻1 : 𝑝 ≠ 0.75
$$\bar{p} = \frac{64}{125} = 0.512$$
The test statistic is given by
$$Z_c = \frac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} = \frac{0.512 - 0.75}{\sqrt{\dfrac{0.75 \times 0.25}{125}}} = \frac{-0.238}{0.0387} = -6.1451$$

At 1% significance level, 𝑍0.005 = 2.58

[Figure: standard normal curve with a 0.5% rejection region in each tail, beyond -2.58 and 2.58]
Since $|Z_c| = 6.1451 > 2.58$, we reject the null hypothesis at 1% significance level and conclude that the
proportion of students using counselling services in the local university is significantly different from the
proportion nationally.
Exercise:
The NACADA report indicated that in 2019 the prevalence of Marijuana smoking among Kenyan adults was
21.1%. Data on prevalent smoking in 3536 participants who attended a medical camp indicated that 482 of the
respondents were currently smoking Marijuana at the time of the camp. Suppose we want to assess whether the
prevalence of Marijuana smoking is lower in the medical camp sample given the focus on health in that
community. At 5% significance level, is there evidence of a statistically lower prevalence of Marijuana
smoking in the Medical Camp study as compared to the prevalence among all Kenyans?
Ans. $Z_c \approx -10.9 < -1.65$ (equivalently, $|Z_c| > 1.65$ in the lower tail); reject 𝐻0 and conclude that we
have statistically significant evidence at 5% significance level to show that the prevalence of Marijuana
smoking in the Medical Camp is lower than the prevalence nationally.
HYPOTHESIS TESTS ABOUT THE POPULATION PROPORTIONS: TWO SAMPLE INFERENCE
Here we consider the situation where there are two independent comparison groups and the outcome of interest
is dichotomous (e.g., Successes/Failure; Yes/No). The goal of the analysis is to compare proportions of
successes between the two groups.
Suppose we wish to test the null hypothesis 𝐻0 : 𝑝1 = 𝑝2 against the alternative 𝐻1 : 𝑝1 ≠ 𝑝2 at 𝛼 level of
significance.
The test statistic is given by
$$Z_c = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\dfrac{\bar{p}_c(1 - \bar{p}_c)}{n_1} + \dfrac{\bar{p}_c(1 - \bar{p}_c)}{n_2}}}$$

where
- $\bar{p}_1$ is the proportion of successes in sample 1; i.e., $\bar{p}_1 = x_1 / n_1$
- $x_1$ is the number of successes in sample 1
- $n_1$ is the size of sample 1
- $\bar{p}_2$ is the proportion of successes in sample 2; i.e., $\bar{p}_2 = x_2 / n_2$
- $x_2$ is the number of successes in sample 2
- $n_2$ is the size of sample 2
- $\bar{p}_c$ is the proportion of successes in the pooled sample; i.e., $\bar{p}_c = (x_1 + x_2) / (n_1 + n_2)$

Reject 𝐻0 at 𝛼 level of significance if |𝑍𝑐 | > 𝑍𝛼⁄2

Note:
The above formula is appropriate for large samples, defined as at least 5 successes (𝑛𝑝 ≥ 5) and at least 5
failures (𝑛(1 − 𝑝) ≥ 5) in each of the two samples. If there are fewer than 5 successes or failures in either
comparison group, then alternative procedures, called exact methods must be used to estimate the difference in
population proportions.
Example:
The following table summarizes data from 3799 participants who attended a Medical Camp. The outcome of
interest is prevalent cardiovascular disease and we want to test whether the prevalence of cardiovascular disease
is significantly higher in smokers as compared to non-smokers.
                  Free of CVD   History of CVD   Total
Non-smoker            2757            298         3055
Current smoker         663             81          744
Total                 3420            379         3799
Let sample 1 (resp. 2) be the one corresponding to non-smokers (resp. current smokers), with a success defined
as a history of CVD.
We wish to test the hypothesis
$$H_0: p_1 = p_2 \quad \text{vs.} \quad H_1: p_1 < p_2$$
$$n_1 = 3055;\quad x_1 = 298;\quad \bar{p}_1 = \frac{x_1}{n_1} = \frac{298}{3055} = 0.0975$$
$$n_2 = 744;\quad x_2 = 81;\quad \bar{p}_2 = \frac{x_2}{n_2} = \frac{81}{744} = 0.1089$$
$$\bar{p}_c = \frac{x_1 + x_2}{n_1 + n_2} = \frac{298 + 81}{3055 + 744} = \frac{379}{3799} = 0.0998$$
The test statistic is given by
$$Z_c = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\dfrac{\bar{p}_c(1 - \bar{p}_c)}{n_1} + \dfrac{\bar{p}_c(1 - \bar{p}_c)}{n_2}}} = \frac{0.0975 - 0.1089}{\sqrt{\dfrac{0.0998 \times 0.9002}{3055} + \dfrac{0.0998 \times 0.9002}{744}}} = \frac{-0.0114}{\sqrt{0.0000294 + 0.0001208}} \approx -0.930$$
This is a one-tailed (lower-tail) test, so at 5% significance level the critical value is $Z_{0.05} = 1.65$.
Since $|Z_c| = 0.930 < 1.65$, we fail to reject the null hypothesis and conclude that the data do not provide
statistically significant evidence that the prevalence of cardiovascular disease is higher in smokers than in
non-smokers.
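As a cross-check on the two-sample proportion calculation above, here is a minimal sketch (the function name `two_proportion_z` is an illustrative assumption, not from the notes). Because it does not round the intermediate proportions, the result can differ from the hand calculation in the second decimal place.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample Z statistic for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)  # pooled proportion of successes
    se = sqrt(pc * (1 - pc) / n1 + pc * (1 - pc) / n2)
    return (p1 - p2) / se


# CVD example: sample 1 = non-smokers, sample 2 = current smokers.
z = two_proportion_z(298, 3055, 81, 744)
critical = 1.65  # table value Z_0.05 for a one-tailed test at the 5% level
decision = "reject H0" if z < -critical else "fail to reject H0"
print(f"Z_c = {z:.4f}; {decision}")
```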
Exercise:
The makers of a new drug for ADHD reported that 26 of the 374 subjects who took the drug (experimental
group) experienced vomiting as a side effect, compared to 8 of the 210 subjects who were on the placebo
(control group). Note that patients did not know which treatment they were given. At 5% level of significance,
is there sufficient evidence to suggest that the entire population on the drug would experience more vomiting?
Ans.: |𝑍𝑐 | = 1.60; do not reject 𝐻0
Let sample 1 (resp. 2) be the one corresponding to experimental (resp. control) group.
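We wish to test the hypothesis
$$H_0: p_1 = p_2 \quad \text{vs.} \quad H_1: p_1 > p_2$$
$$n_1 = 374;\quad x_1 = 26;\quad \bar{p}_1 = \frac{x_1}{n_1} = \frac{26}{374} = 0.070$$
$$n_2 = 210;\quad x_2 = 8;\quad \bar{p}_2 = \frac{x_2}{n_2} = \frac{8}{210} = 0.038$$
$$\bar{p}_c = \frac{x_1 + x_2}{n_1 + n_2} = \frac{26 + 8}{374 + 210} = 0.058$$
The test statistic is given by
$$Z_c = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\dfrac{\bar{p}_c(1 - \bar{p}_c)}{n_1} + \dfrac{\bar{p}_c(1 - \bar{p}_c)}{n_2}}} = \frac{0.070 - 0.038}{\sqrt{\dfrac{0.058 \times 0.942}{374} + \dfrac{0.058 \times 0.942}{210}}} = \frac{0.032}{0.020} = 1.60$$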

At 5% significance level, 𝑍0.05 = 1.65

Since 1.60 < 1.65, we fail to reject the null hypothesis and conclude that vomiting is not experienced
significantly more by those taking this drug when compared to a placebo.
HYPOTHESIS TESTS ABOUT THE POPULATION MEAN: ONE SAMPLE INFERENCE
One-Sample Z-test, 𝝈 known
Suppose we wish to test the hypothesis 𝐻0 : 𝜇 = 𝜇0 against the alternative 𝐻1 : 𝜇 ≠ 𝜇0 at 𝛼 level of significance.
The test statistic is given by
$$Z_c = \frac{\sqrt{n}(\bar{X} - \mu_0)}{\sigma}$$

Reject 𝐻0 at 𝛼 level of significance if |𝑍𝑐 | > 𝑍𝛼⁄2

Example:
The Energy Regulatory Authority (ERA) reported in June 2021, that the mean price of a liter of Kerosene was
Sh. 103.70. A random sample of 25 petrol stations had a mean price of Sh. 103.90. Assuming normality, and a
population standard deviation of Sh. 0.5, test using a 5% level of significance whether the population mean for
kerosene has risen since June 2021.
Solution:
We wish to test the hypothesis 𝐻0 : 𝜇 = 103.7 against the alternative 𝐻1 : 𝜇 > 103.7 at 5% level of significance
$$Z_c = \frac{\sqrt{n}(\bar{X} - \mu_0)}{\sigma} = \frac{\sqrt{25}(103.90 - 103.70)}{0.5} = 2$$
At 5% significance level, 𝑍0.05 = 1.65
Since 𝑍𝑐 = 2 > 1.65, we reject the null hypothesis and conclude that the mean price of kerosene has risen from
Sh. 103.70 in June.
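The kerosene calculation can be reproduced in a few lines. This is a minimal sketch under the same assumptions as the example (normal population, known standard deviation); the function name `one_sample_z` is an illustrative assumption, and the one-tailed critical value is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(x_bar, mu0, sigma, n):
    """Z statistic for H0: mu = mu0 when the population sd (sigma) is known."""
    return sqrt(n) * (x_bar - mu0) / sigma


z = one_sample_z(103.90, 103.70, 0.5, 25)
critical = NormalDist().inv_cdf(1 - 0.05)  # upper-tail critical value, about 1.645
decision = "reject H0" if z > critical else "fail to reject H0"
print(f"Z_c = {z:.2f}, critical = {critical:.3f}; {decision}")
```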
Exercise:
(a) A random sample of 100 watermelons is obtained and the mean circumference is found to be 40.5 cm.
Assuming that the population standard deviation is known to be 1.6 cm, use a 5% level of significance to
test the claim that the mean circumference of all watermelons is equal to 39.9 cm.
Ans: |𝑍𝑐 | = 3.75 > 1.96; reject 𝐻0 .
(b) A simple random sample of 15-year old boys from one city is obtained and their weights (in pounds) are
listed below: 147 138 162 151 134 189 157 144 175 127 164
At 1% level of significance, test the claim that these sample weights come from a population with a
mean equal to 149 pounds. Assume that the standard deviation of the weights of all 15-year old boys in
the city is known to be 16.2 pounds
Ans: $|Z_c| = 0.912 < 2.58$ (the two-tailed critical value $Z_{0.005}$); fail to reject 𝐻0 .

One-Sample Z-test, 𝝈 unknown, 𝒏 > 30

Suppose we wish to test the hypothesis 𝐻0 : 𝜇 = 𝜇0 against the alternative 𝐻1 : 𝜇 ≠ 𝜇0 at 𝛼 level of significance.

The test statistic is given by
$$Z_c = \frac{\sqrt{n}(\bar{X} - \mu_0)}{S}$$
where $S = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$ is the sample standard deviation.

Reject 𝐻0 at 𝛼 level of significance if $|Z_c| > Z_{\alpha/2}$

Example:
The Ministry of Health reports that in 2020, the mean cost of a stay in a hospital for Kenyan women aged 18-44
was Sh. 15,200. A random sample of 400 hospital stays for women aged 18-44 showed a mean cost of Sh.
16,000, with a standard deviation of Sh. 5000. Test whether the population mean cost has increased since 2020,
using a 5% level of significance.
Solution:
We wish to test the hypothesis 𝐻0 : 𝜇 = 15,200 against the alternative 𝐻1 : 𝜇 > 15,200 at 5% level of
significance
$$Z_c = \frac{\sqrt{n}(\bar{X} - \mu_0)}{S} = \frac{\sqrt{400}(16000 - 15200)}{5000} = 3.2$$
At 5% significance level, 𝑍0.05 = 1.65
Since 𝑍𝑐 = 3.2 > 1.65, we reject the null hypothesis and conclude that the population mean cost has increased
since 2020.
One-Sample t-test, 𝝈 unknown, 𝒏 ≤ 30
Suppose we wish to test the hypothesis 𝐻0 : 𝜇 = 𝜇0 against the alternative 𝐻1 : 𝜇 ≠ 𝜇0 at 𝛼 level of significance.

The test statistic is given by
$$t_c = \frac{\sqrt{n}(\bar{X} - \mu_0)}{S}$$
where $S = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$ is the sample standard deviation.

Reject 𝐻0 at 𝛼 level of significance if |𝑡𝑐 | > 𝑡𝑛−1,𝛼⁄2

Example:
The program director for an accounting program wishes to test, at 5% level of significance, the hypothesis that
her students score higher than national average of 615 on national final exam. She randomly selects 11 recent
graduates of a two-year program and discovers that 𝑋̅ = 630 and 𝑆 = 23.
In this case, we wish to test 𝐻0 : 𝜇 = 615 vs. 𝐻1 : 𝜇 > 615 at 5% level of significance.
The test statistic is given by
$$t_c = \frac{\sqrt{n}(\bar{X} - \mu_0)}{S} = \frac{\sqrt{11}(630 - 615)}{23} = 2.1630$$

At 5% level of significance, 𝑡10,0.05 = 1.812.

Since 2.1630 > 1.812, we reject 𝐻0 at 5% level of significance and conclude that her students score higher
than national average of 615.
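The t statistic above can be computed either from summary statistics or directly from raw data. The sketch below is illustrative only: the helper names `t_from_summary` and `t_from_data` are assumptions, and the critical value is taken from a t table (as in the notes) because the Python standard library does not provide the t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def t_from_summary(x_bar, s, n, mu0):
    """One-sample t statistic from the sample mean, sample sd and sample size."""
    return sqrt(n) * (x_bar - mu0) / s


def t_from_data(data, mu0):
    """One-sample t statistic from raw data; stdev() uses the n - 1 divisor."""
    return t_from_summary(mean(data), stdev(data), len(data), mu0)


# Accounting program example: n = 11, mean 630, S = 23, H0: mu = 615.
t_c = t_from_summary(630, 23, 11, 615)
critical = 1.812  # table value t_{10, 0.05}
decision = "reject H0" if t_c > critical else "fail to reject H0"
print(f"t_c = {t_c:.4f}; {decision}")
```

For raw data such as the sprinkler activation times in the exercise below, `t_from_data(times, 25)` applies the same formula after computing the sample mean and standard deviation.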
Exercise:
1. Thirteen data values are observed in a fire-prevention study of sprinkler activation time (in seconds).
27 41 22 27 23 35 30 33 24 27 28 22 24
Actual average activation time is supposed to be 25 seconds. Test if it is more than this at 5% level of
significance.
Ans: 𝑡𝑐 = 1.876 > 1.782
2. A large software company gives job applicants a test of programming ability and the mean for that test
has been 160 in the past. Twenty-five job applicants are randomly selected from one large university and
they produce a mean score and standard deviation of 183 and 12, respectively. At 5% level of
significance, test the claim that this sample comes from a population with a mean score greater than 160.
Ans: 𝑡𝑐 = 9.583 > 1.711

HYPOTHESIS TESTS ABOUT THE MEANS: TWO SAMPLE INFERENCE

Independent Samples Z-test 𝝈𝟏 , 𝝈𝟐 known
Suppose we wish to test the hypothesis 𝐻0 : 𝜇1 = 𝜇2 against the alternative 𝐻1 : 𝜇1 ≠ 𝜇2 at 𝛼 level of
significance.
The test statistic is given by
$$Z_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Reject 𝐻0 at 𝛼 level of significance if $|Z_c| > Z_{\alpha/2}$

Independent Samples Z-test 𝝈𝟏 , 𝝈𝟐 unknown, 𝒏𝟏 > 𝟑𝟎 and 𝒏𝟐 > 𝟑𝟎

Suppose we wish to test the hypothesis 𝐻0 : 𝜇1 = 𝜇2 against the alternative 𝐻1 : 𝜇1 ≠ 𝜇2 at 𝛼 level of
significance.
The test statistic is given by
$$Z_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

Reject 𝐻0 at 𝛼 level of significance if |𝑍𝑐 | > 𝑍𝛼⁄2

Example:
Many people take ginkgo supplements advertised to improve memory. Are these over-the-counter supplements
effective? In a study, elderly adults were assigned to the treatment group or control group. The 104 participants
who were assigned to the treatment group took 40 mg of ginkgo 3 times a day for 6 weeks. The 115 participants
assigned to the control group took a placebo pill 3 times a day for 6 weeks. At the end of 6 weeks, the Wechsler
Memory Scale was administered. Higher scores indicate better memory function. Summary values are given in
the following table:

Group      Sample size   Mean   Standard deviation
Ginkgo         104        5.7          0.6
Placebo        115        5.5          0.5
Based on these results, is there evidence that taking 40mg of ginkgo 3 times a day is effective in increasing
mean performance on the Wechsler Memory Scale? (use 5% level of significance)
Solution:
We wish to test the hypothesis 𝐻0 : 𝜇𝐺 = 𝜇𝑃 against the alternative 𝐻1 : 𝜇𝐺 > 𝜇𝑃 at 5% level of significance.
The test statistic is given by
$$Z_c = \frac{\bar{X}_G - \bar{X}_P}{\sqrt{\dfrac{S_G^2}{n_G} + \dfrac{S_P^2}{n_P}}} = \frac{5.7 - 5.5}{\sqrt{\dfrac{0.6^2}{104} + \dfrac{0.5^2}{115}}} = 2.6642$$

At 5% significance level, 𝑍0.05 = 1.65

Since 2.6642 > 1.65, we reject the null hypothesis and conclude that there is a significant difference in the
Ginkgo group when compared to a placebo. In other words, we conclude that Ginkgo does improve the memory
score.
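A short numerical check of the ginkgo calculation follows. It is a sketch only; `two_sample_z` is an illustrative name, and the sample standard deviations stand in for the unknown population values, which the notes treat as acceptable because both samples exceed 30.

```python
from math import sqrt


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample Z statistic for H0: mu1 = mu2 with unpooled variances."""
    return (mean1 - mean2) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


# Ginkgo (group 1) versus placebo (group 2).
z = two_sample_z(5.7, 0.6, 104, 5.5, 0.5, 115)
critical = 1.65  # table value Z_0.05 for a one-tailed test at the 5% level
decision = "reject H0" if z > critical else "fail to reject H0"
print(f"Z_c = {z:.4f}; {decision}")
```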

Independent Samples t-test 𝝈𝟏 = 𝝈𝟐 but unknown, 𝒏𝟏 ≤ 𝟑𝟎 or 𝒏𝟐 ≤ 𝟑𝟎

Suppose we wish to test the hypothesis 𝐻0 : 𝜇1 = 𝜇2 against the alternative 𝐻1 : 𝜇1 ≠ 𝜇2 at 𝛼 level of
significance.
The test statistic is given by
$$t_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}}$$
where
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
Reject 𝐻0 at 𝛼 level of significance if $|t_c| > t_{n_1 + n_2 - 2,\ \alpha/2}$

Example:
Daily protein intake (in grams) is measured on a sample of individuals living below the poverty level and another sample
living above the poverty level with the results:

Below poverty level:

51.4, 49.7, 72.0, 76.7, 65.8, 55.0, 73.7, 62.1, 79.7, 66.2, 75.8, 65.4, 65.5, 62.0, 73.3

Above poverty level:

86.0, 69.0, 59.7, 80.2, 68.6, 78.1, 98.6, 69.8, 87.7, 77.2.

Suppose we wish to see whether these two groups differ significantly in their mean protein intake. Our hypotheses would
then be
𝐻0 : 𝜇1 = 𝜇2 against the alternative 𝐻1 : 𝜇1 ≠ 𝜇2 where $\mu_1$ and $\mu_2$ represent the mean protein intake of those below
and above the poverty level, respectively.

We first must find the sample mean and s.d. for each sample. These are:
               Below poverty level     Above poverty level
Sample mean    $\bar{X}_1 = 66.29$     $\bar{X}_2 = 77.49$
Sample s.d.    $S_1 = 9.17$            $S_2 = 11.34$
Sample size    $n_1 = 15$              $n_2 = 10$
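The pooled variance is given by
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{14 \times 9.17^2 + 9 \times 11.34^2}{15 + 10 - 2} = 101.50$$
The test statistic is given by
$$t_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}} = \frac{66.29 - 77.49}{\sqrt{\dfrac{101.50}{15} + \dfrac{101.50}{10}}} = -2.72$$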

At 5% significance level (two-tailed, so 𝛼/2 = 0.025 in each tail, with 23 degrees of freedom),
$$t_{23,\ 0.025} = 2.069$$

Since |𝑡𝑐 | = 2.72 > 2.069, we reject the null hypothesis and conclude that the mean protein intakes in the two
groups differ significantly.
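The pooled-variance t-test above can be reproduced from the raw protein intake data. The following is a minimal sketch; the function name `pooled_t` is an illustrative assumption, and the critical value is the table value quoted in the example, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t statistic, pooled variance, degrees of freedom)."""
    n1, n2 = len(sample1), len(sample2)
    # variance() is the sample variance with the n - 1 divisor.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2)) / sqrt(sp2 / n1 + sp2 / n2)
    return t, sp2, n1 + n2 - 2


below = [51.4, 49.7, 72.0, 76.7, 65.8, 55.0, 73.7, 62.1, 79.7, 66.2,
         75.8, 65.4, 65.5, 62.0, 73.3]
above = [86.0, 69.0, 59.7, 80.2, 68.6, 78.1, 98.6, 69.8, 87.7, 77.2]

t_c, sp2, df = pooled_t(below, above)
critical = 2.069  # table value t_{23, 0.025}
decision = "reject H0" if abs(t_c) > critical else "fail to reject H0"
print(f"S_p^2 = {sp2:.2f}, t_c = {t_c:.2f}, df = {df}; {decision}")
```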
TESTS FOR GOODNESS-OF-FIT AND CONTINGENCY TABLES
The chi-square (read kie square) test for goodness of fit is one of the most common nonparametric
procedures. The test evaluates whether there is a statistically significant difference between
observed scores for a sample and expected or hypothesized scores in a population. The test is
based on the assumption that observed scores fall randomly into one category or another, and the
chance of a score falling into a particular category can be estimated. In other words, the test may be
used to check the extent to which a distribution of observed (sample) scores fits an expected or
theoretical distribution.

The test is designed to assess the extent of agreement between the observed and expected
outcomes in each category. Unless the researcher has information or a rationale to the contrary, the
expected frequencies represent an equal proportion of cases in each category. Thus, the expected
frequency may be calculated by
𝐸𝑖 = 𝑁⁄𝑘
where 𝑁 represents the total number of cases, and 𝑘 represents the total number of categories. The
test statistic for the chi-square goodness-of-fit is given by
$$\chi_c^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ and $E_i$ are the observed and expected frequencies, respectively, for each category. Reject
𝐻0 at 𝛼 level of significance if $\chi_c^2 > \chi^2_{k-1,\ \alpha}$

Example:
The following data on absenteeism was collected from a manufacturing plant. At 1% level of
significance, can we support the claim that there is a difference in the absence rate by day of the
week?

Day Frequency
Monday 95
Tuesday 65
Wednesday 60
Thursday 80
Friday 100
Solution:
We are testing the following hypotheses:

𝐻0 : There is no difference in absenteeism due to day of the week

𝐻1 : There is a difference in absenteeism due to day of the week

Assuming equal expected frequency,

$$E_i = \frac{95 + 65 + 60 + 80 + 100}{5} = 80$$
Day          O_i    E_i    O_i - E_i    (O_i - E_i)^2    (O_i - E_i)^2 / E_i
Monday        95     80       15            225              2.8125
Tuesday       65     80      -15            225              2.8125
Wednesday     60     80      -20            400              5
Thursday      80     80        0              0              0
Friday       100     80       20            400              5
Total                                                       15.625

Since $\chi_c^2 = 15.625 > \chi^2_{4,\ 0.01} = 13.277$, we reject 𝐻0 and conclude that there is a difference in
absenteeism due to day of the week.

Example:
During the COVID-19 containment period, the State Department for Transport collected data
from a random sample of 43,000 employees across the country on how they were getting to work.

Method of Commuting Percentage

Driving a Private Car 22%
Public Service Vehicles (Matatus) 49.5%
Public Motorcycles (Boda bodas) 11%
Motorcycle Commuting 1.5%
Bicycle Commuting 1.2%
Walking 6%
Working from Home 5.4%
Other Means 3.4%

Suppose you wanted to know if people who live in Nairobi commute with similar proportions as
the entire nation. To do this, you pick a random sample of 1000 employees in Nairobi County from
which you obtain the following data:

Method of Commuting Frequency

Driving a Private Car 245
Public Service Vehicles (Matatus) 418
Public Motorcycles (Boda bodas) 122
Motorcycle Commuting 18
Bicycle Commuting 18
Walking 75
Working from Home 58
Other Means 46
Total 1000
What is your conclusion?
Solution:
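The expected frequencies are the national percentages applied to the Nairobi sample of 1000 (e.g., 22% of 1000 = 220).

Method of Commuting                  O_i    E_i    O_i - E_i    (O_i - E_i)^2    (O_i - E_i)^2 / E_i
Driving a Private Car                245    220       25            625              2.8409
Public Service Vehicles (Matatus)    418    495      -77           5929             11.9778
Public Motorcycles (Boda bodas)      122    110       12            144              1.3091
Motorcycle Commuting                  18     15        3              9              0.6000
Bicycle Commuting                     18     12        6             36              3.0000
Walking                               75     60       15            225              3.7500
Working from Home                     58     54        4             16              0.2963
Other Means                           46     34       12            144              4.2353
Total                               1000   1000                                     28.0094

With 𝑘 - 1 = 7 degrees of freedom, the table values are $\chi^2_{7,\ 0.05} = 14.067$ and $\chi^2_{7,\ 0.01} = 18.475$.
Since $\chi_c^2 = 28.0094$ exceeds both, we reject 𝐻0 and conclude that the commuting proportions in Nairobi
differ significantly from the national proportions.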

Example:
Suppose we wish to establish whether employees in the County Government of Embu represent
the sub counties in Embu County in equal proportions. To do this, we randomly select a group
of 95 employees of the County Government of Embu, and find that 60 of them come from
Embu county. We then examine the home sub-county for each of the 60 employees and the
result is as follows:
Sub-County No. of Employees
Embu East 15
Embu West 8
Embu North 12
Mbeere North 10
Mbeere South 15
Clearly, there are sub-county-to-sub-county differences. The issue is whether the divergence
from the expected equality is sufficient to be statistically significant, or could any differences be
just the random variations that can occur any time one draws a sample.

Sub-County      O_i    E_i    O_i - E_i    (O_i - E_i)^2    (O_i - E_i)^2 / E_i
Embu East        15     12        3              9              0.75
Embu West         8     12       -4             16              1.3333
Embu North       12     12        0              0              0
Mbeere North     10     12       -2              4              0.3333
Mbeere South     15     12        3              9              0.75
Total                                                           3.1667

Since $\chi_c^2 = 3.1667 < \chi^2_{4,\ 0.05} = 9.488$, we fail to reject 𝐻0 and conclude that the sub-county-to-
sub-county differences are not statistically significant.
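The goodness-of-fit examples above all follow the same arithmetic, which can be sketched as a single function. This is an illustrative sketch, not the notes' own procedure: the name `chi_square_gof` and the optional `expected_props` argument are assumptions. When no proportions are supplied, the expected frequency is N/k for every category; otherwise each expected frequency is N times the hypothesized proportion. Critical values still come from a chi-square table.

```python
def chi_square_gof(observed, expected_props=None):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    n = sum(observed)
    k = len(observed)
    if expected_props is None:
        expected = [n / k] * k  # equal expected frequencies, E_i = N / k
    else:
        expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, k - 1


# Absenteeism by day of the week (equal expected frequencies).
stat, df = chi_square_gof([95, 65, 60, 80, 100])
print(f"absenteeism: chi-square = {stat:.3f}, df = {df}")

# Nairobi commuting sample against the national proportions.
national = [0.22, 0.495, 0.11, 0.015, 0.012, 0.06, 0.054, 0.034]
nairobi = [245, 418, 122, 18, 18, 75, 58, 46]
stat, df = chi_square_gof(nairobi, national)
print(f"commuting: chi-square = {stat:.3f}, df = {df}")
```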
Chi-Square Test of Independence/Association
The chi-square test of independence/association determines whether there is an association
between categorical variables; that is, whether the variables are independent or related. The test
utilizes a contingency table to analyze the data. A contingency table (also known as a cross-
tabulation) is an arrangement in which data are classified according to two categorical variables.
The categories for one variable appear in rows, and the categories for the other variable appear in
columns. Each variable must have two or more categories. Each cell reflects the total count of
cases (the observed frequency) for a specific pair of categories. The expected frequency is
computed as follows:

$$E_i = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$
The degrees of freedom are given by
𝑑𝑓 = (𝑟 − 1)(𝑐 − 1)
where 𝑟 is the number of rows and 𝑐 is the number of columns.

Example:
A researcher conducted a poll with 1000 teachers on their opinion about re-introduction of
corporal punishment in secondary schools. The data are presented in the following table:
Opinion on CP Males Females Total
Re-introduce 270 230 500
Don’t re-introduce 205 245 450
Unsure 25 25 50
Total 500 500 1000
At 5% level of significance, can we conclude that gender and the opinion about re-introducing
corporal punishment are dependent events?
Solution:
We wish to test the following hypotheses
𝐻0 : Gender and opinion are independent
𝐻1 : Gender and opinion are dependent

The observed frequencies are the reported data.

The expected frequency for each cell is
$$E_i = \frac{(\text{Row Total})(\text{Column Total})}{\text{Grand Total}}$$

Observed frequencies, with expected frequencies in parentheses:

Opinion on CP          Males        Females      Total
Re-introduce           270 (250)    230 (250)      500
Don’t re-introduce     205 (225)    245 (225)      450
Unsure                  25 (25)      25 (25)        50
Total                  500          500           1000

O_i    E_i    (O_i - E_i)^2    (O_i - E_i)^2 / E_i
270    250        400              1.6
230    250        400              1.6
205    225        400              1.777778
245    225        400              1.777778
 25     25          0              0
 25     25          0              0
Total                              6.7556

Since $\chi_c^2 = 6.7556 > \chi^2_{2,\ 0.05} = 5.991$, we reject 𝐻0 and conclude that gender and opinion are
dependent variables. Male teachers are more likely to support re-introduction of corporal
punishment.

Example:
A marriage counselor is interested in whether men or women are more likely to seek professional
counseling for conflicts within their marriages, to ignore conflicts, or to try to resolve conflicts
themselves. The counselor collects data from a sample of 110 men and women and asks the
participants to answer questions about conflict resolution in a manner that reflects their true
feelings. The 3 × 2 contingency table is displayed in the following table.

Preferred Method Men Women Total

Mediator 14 31 45
Self-Negotiating 21 18 39
Ignoring 17 9 26
Total 52 58 110
At 1% significance level, what was the counsellor’s conclusion?
Solution:
Observed frequencies, with expected frequencies in parentheses:

Preferred Method     Men           Women         Total
Mediator             14 (21.27)    31 (23.73)      45
Self-Negotiating     21 (18.44)    18 (20.56)      39
Ignoring             17 (12.29)     9 (13.71)      26
Total                52            58             110

O_i    E_i      (O_i - E_i)^2    (O_i - E_i)^2 / E_i
14     21.27       52.85             2.48
31     23.73       52.85             2.23
21     18.44        6.55             0.36
18     20.56        6.55             0.32
17     12.29       22.18             1.8
 9     13.71       22.18             1.62
Total                                8.81

Since $\chi_c^2 = 8.81 < \chi^2_{2,\ 0.01} = 9.21$, we fail to reject 𝐻0 and conclude that gender and choice of
method of conflict resolution are independent variables. In other words, the choice of method for
conflict resolution is independent of group membership (gender).

Question: What would have been the conclusion if we tested at 5% significance level?
Example:
Consider the following data, collected from 88 school children. The data are categorized by gender
and by whether the child had repeated or has never repeated a grade.

Gender     No repeat    Repeated    Total
Male           45          10         55
Female         31           2         33
Total          76          12         88
At 5% level of significance, can we conclude that there is a significant relationship between gender
and repeating a grade?
Solution:
Observed frequencies, with expected frequencies in parentheses:

Gender     No repeat     Repeated     Total
Male       45 (47.5)     10 (7.5)       55
Female     31 (28.5)      2 (4.5)       33
Total      76            12             88

O_i    E_i     (O_i - E_i)^2    (O_i - E_i)^2 / E_i
45     47.5        6.25             0.131579
10      7.5        6.25             0.833333
31     28.5        6.25             0.219298
 2      4.5        6.25             1.388889
Total                               2.5731

With (2 - 1)(2 - 1) = 1 degree of freedom, the critical value at 5% significance level is $\chi^2_{1,\ 0.05} = 3.841$.
Since $\chi_c^2 = 2.5731 < 3.841$, we fail to reject 𝐻0 and conclude that gender and repeating
a grade are independent variables.
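The contingency-table examples above share one computation: expected counts from the row and column totals, then the chi-square sum over all cells. A minimal sketch follows; the function name `chi_square_independence` is an illustrative assumption, and the critical values are the table values used in the examples.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


examples = {
    "corporal punishment": ([[270, 230], [205, 245], [25, 25]], 5.991),  # chi2_{2,0.05}
    "conflict resolution": ([[14, 31], [21, 18], [17, 9]], 9.210),       # chi2_{2,0.01}
    "repeating a grade": ([[45, 10], [31, 2]], 3.841),                   # chi2_{1,0.05}
}
for name, (table, critical) in examples.items():
    stat, df = chi_square_independence(table)
    decision = "reject H0" if stat > critical else "fail to reject H0"
    print(f"{name}: chi-square = {stat:.4f}, df = {df}; {decision}")
```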
ERRORS IN HYPOTHESIS TESTING
- Type I error occurs when we reject a true 𝐻0
- Type II error occurs when we fail to reject a false 𝐻0

                          Decision
Reality                   Fail to reject 𝐻0     Reject 𝐻0
𝐻0 is actually true       Correct decision      Type I error
𝐻0 is actually false      Type II error         Correct decision

Correct decisions and errors in hypothesis testing
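To make the idea of a Type I error concrete, the sketch below simulates many samples from a population in which 𝐻0 is actually true and counts how often a two-tailed Z-test at level 𝛼 rejects it. Everything here is an illustrative assumption (the function name `type_i_error_rate`, the population mean 100, standard deviation 15, sample size 25 and the random seed); the point is only that the long-run rejection rate under a true 𝐻0 is close to 𝛼.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(mu0=100.0, sigma=15.0, n=25, alpha=0.05, trials=5000, seed=0):
    """Estimate how often a two-tailed Z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw the sample from a population whose mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = sqrt(n) * (x_bar - mu0) / sigma
        if abs(z) > critical:
            rejections += 1  # a Type I error: rejecting a true H0
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```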

No ratings yet
St. Paul University Philippines
14 pages
2 - Forecasting and Demand Planning
100% (2)
2 - Forecasting and Demand Planning
51 pages
CH 5 HP Testing
100% (1)
CH 5 HP Testing
29 pages
Statistical Estimation
No ratings yet
Statistical Estimation
37 pages
SOA Essential Playbook
No ratings yet
SOA Essential Playbook
7 pages
Scientific Method Quiz 1-1
No ratings yet
Scientific Method Quiz 1-1
5 pages
AP Stats Chapter 11 Notes
No ratings yet
AP Stats Chapter 11 Notes
10 pages
Testing Hypotheses About Proportions
No ratings yet
Testing Hypotheses About Proportions
26 pages
Hypothesis Testing Is A Procedure of Making Decision
No ratings yet
Hypothesis Testing Is A Procedure of Making Decision
4 pages
Hypothesis
No ratings yet
Hypothesis
11 pages
Why Has IDEO Been So Successful? Answer:: Assignment #2 Priya Aslam (42730) - Edc Campus
No ratings yet
Why Has IDEO Been So Successful? Answer:: Assignment #2 Priya Aslam (42730) - Edc Campus
2 pages
White Box Testing: Anuja Arora Cse / It Jiitu, Noida
No ratings yet
White Box Testing: Anuja Arora Cse / It Jiitu, Noida
24 pages
Chapter V PR2
No ratings yet
Chapter V PR2
3 pages
Tests of Hypothesis-Large Samples
No ratings yet
Tests of Hypothesis-Large Samples
7 pages
This is The Statistics Handbook your Professor Doesn't Want you to See. So Easy, it's Practically Cheating...
From Everand
This is The Statistics Handbook your Professor Doesn't Want you to See. So Easy, it's Practically Cheating...
S. Deviant
4.5/5 (6)

|𝑍𝑐| > 𝑍𝛼⁄2

At 5% significance level, 𝑍0.025 = 1.96

At 1% significance level, 𝑍0.005 = 2.58

Reject 𝐻0 at 𝛼 level of significance if |𝑍𝑐 | > 𝑍𝛼⁄2

At 5% significance level, 𝑍0.05 = 1.65

At 5% significance level, 𝑍0.05 = 1.65
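
The critical values quoted above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `z_critical` and `reject_h0` are illustrative and do not come from the lecture.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_tailed: bool = True) -> float:
    """Upper critical value of the standard normal distribution.

    For a two-tailed test the tail area is alpha/2; for a one-tailed
    test it is alpha.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def reject_h0(z_c: float, alpha: float, two_tailed: bool = True) -> bool:
    """Apply the decision rule: compare the computed Z with the critical value."""
    crit = z_critical(alpha, two_tailed)
    if two_tailed:
        return abs(z_c) > crit   # |Zc| > Z_{alpha/2}
    return z_c > crit            # right-tailed rule: Zc > Z_alpha


print(round(z_critical(0.05), 2))                    # two-tailed, 5%
print(round(z_critical(0.01), 2))                    # two-tailed, 1%
print(round(z_critical(0.05, two_tailed=False), 3))  # one-tailed, 5%
```

The one-tailed 5% value is approximately 1.645, which statistical tables commonly round to 1.65.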

Reject 𝐻0 at 𝛼 level of significance if |𝑍𝑐 | > 𝑍𝛼⁄2

𝑍𝑐 = √𝑛(𝑋̅ − 𝜇0)/𝜎 = √25(103.90 − 103.70)/𝜎

One-Sample Z-test, 𝝈 unknown, 𝒏 > 30

𝑍𝑐 = √𝑛(𝑋̅ − 𝜇0)/𝑆, where 𝑆² = ∑(𝑋 − 𝑋̅)²/(𝑛 − 1)

Reject 𝐻0 at 𝛼 level of significance if 𝑍𝑐 > 𝑍𝛼⁄2

𝑍𝑐 = √𝑛(𝑋̅ − 𝜇0)/𝑆 = √400(16000 − 15200)/𝑆

𝑡𝑐 = √𝑛(𝑋̅ − 𝜇0)/𝑆, where 𝑆² = ∑(𝑋 − 𝑋̅)²/(𝑛 − 1)

Reject 𝐻0 at 𝛼 level of significance if |𝑡𝑐 | > 𝑡𝑛−1,𝛼⁄2

At 5% level of significance, 𝑡10,0.05 = 1.812.
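
A one-sample t statistic can be computed directly from raw scores. The sketch below is illustrative only: the sample of 11 scores and the hypothesized mean of 65 are hypothetical values chosen so that the degrees of freedom match the 𝑡10,0.05 = 1.812 quoted above, and the comparison shown is the one-tailed (right-tailed) rule.

```python
import math
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """One-sample t statistic: sqrt(n) * (sample mean - mu0) / S.

    statistics.stdev uses the (n - 1) divisor, i.e. the sample
    standard deviation S.
    """
    n = len(sample)
    return math.sqrt(n) * (mean(sample) - mu0) / stdev(sample)


# Hypothetical data (not from the lecture): 11 scores, so df = 10.
sample = [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71]
MU0 = 65          # hypothetical hypothesized mean
T_CRIT = 1.812    # t_{10, 0.05}, as quoted in the lecture

t_c = t_statistic(sample, MU0)
print("Reject H0" if t_c > T_CRIT else "Fail to reject H0")
```

For a two-tailed test, compare |𝑡𝑐| with 𝑡𝑛−1,𝛼⁄2 instead.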

HYPOTHESIS TESTS ABOUT THE MEANS: TWO SAMPLE INFERENCE

Reject 𝐻0 at 𝛼 level of significance if 𝑍𝑐 > 𝑍𝛼⁄2

Independent Samples Z-test 𝝈𝟏 , 𝝈𝟐 unknown, 𝒏𝟏 > 𝟑𝟎 and 𝒏𝟐 > 𝟑𝟎

Reject 𝐻0 at 𝛼 level of significance if |𝑍𝑐 | > 𝑍𝛼⁄2

Group Sample size mean Standard deviation

At 5% significance level, 𝑍0.05 = 1.65

Independent Samples t-test 𝝈𝟏 = 𝝈𝟐 but unknown, 𝒏𝟏 ≤ 𝟑𝟎 or 𝒏𝟐 ≤ 𝟑𝟎

Below poverty level:

Above poverty level:

𝐻0 : There is no difference in absenteeism due to day of the week

Assuming equal expected frequency,

Method of Commuting Percentage

Method of Commuting Frequency

Sub-County   𝑂𝑖   𝐸𝑖   𝑂𝑖 − 𝐸𝑖   (𝑂𝑖 − 𝐸𝑖)²   (𝑂𝑖 − 𝐸𝑖)²/𝐸𝑖

Expected frequency, 𝐸 = (row total)(column total) / (grand total)
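
In a chi-square test of independence, each expected count is the product of its row total and column total divided by the grand total. The sketch below shows the calculation; the 2 × 2 table of counts is hypothetical and the function names are illustrative.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Hypothetical counts (not from the lecture): rows are groups, columns are responses.
observed = [[30, 20],
            [20, 30]]

expected = expected_frequencies(observed)
chi2 = chi_square_statistic(observed, expected)
print(expected)
print(chi2)
```

The computed statistic is then compared with the chi-square critical value for (rows − 1)(columns − 1) degrees of freedom.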

The observed is the reported data

Opinion on CP Males Females Total

Preferred Method Men Women Total

Decision:                        Fail to reject 𝐻0     Reject 𝐻0

Reality: 𝐻0 is actually true     Correct               Type I error

Reality: 𝐻0 is actually false    Type II error         Correct

Correct decisions and errors in hypothesis testing
