Chapter 4
HYPOTHESIS TESTING
Lecturer: Dr Ruzanita Mat Rani
Prepared by:
HAZIYAH BINTI MD JASMIN
CONTENTS
4.0 Introduction / Concept of Hypothesis Testing
4.1 Hypothesis Tests about a Population Mean ($\mu$)
4.1.1 Variance ($\sigma^2$) or Std. Deviation ($\sigma$) is KNOWN
4.2 Hypothesis Testing for the Difference between Two Population Means ($\mu_1 - \mu_2$)
4.2.1 Independent Sample - Variances ($\sigma_1^2, \sigma_2^2$) or Std. Deviations ($\sigma_1, \sigma_2$) are KNOWN
4.2.2 Independent Sample - Variances ($\sigma_1^2, \sigma_2^2$) or Std. Deviations ($\sigma_1, \sigma_2$) are UNKNOWN
4.2.2.1 Large Sample
4.2.2.2 Small Sample - Equal and Unequal Variances assumed ($\sigma_1^2 = \sigma_2^2$ or $\sigma_1^2 \neq \sigma_2^2$)
4.2.3 Dependent Sample
4.3 Testing for the Difference among more than Two Means (ANOVA)
1. Making an assumption about a population parameter. This assumption may or may not be true.
2. Collecting sample data.
3. Calculating a sample statistic.
4. Using the sample statistic to evaluate the hypothesis.
Types of Hypothesis
HYPOTHESES
Two-tailed Test
One-tailed Test
𝐻1 : ≠
Right-tailed test
𝐻1 : >
Left-tailed test
𝐻1 : <
[Figure: two-tailed test with a rejection region of area 0.025 in each tail]
Definition of Terms
Terms: Definition
Significance level (α): Probability of rejecting the null hypothesis when it is true. The level of confidence is 1 − α. Thus, a confidence level of 95% corresponds to an alpha of 5%.
Test statistic: A single number calculated from the sample data, used as the basis for deciding whether or not to reject the null hypothesis.
Regions: The region consisting of values that support H1 and lead to rejecting H0 is called the rejection region (or critical region). The region consisting of values that support H0 is called the acceptance region (or non-rejection region).
Critical value: The value of the test statistic that divides the non-rejection region from the rejection region (left-tailed, right-tailed or two-tailed).
Significance Level (α)
❖Deciding on a criterion for accepting or rejecting H0.
❖Significance level refers to the percentage of sample means that is outside certain prescribed limits.
❖EXAMPLE: Testing a hypothesis at the 5% level of significance (α = 0.05) with a two-tailed test.
-We will reject H0 if the test statistic falls in either of the two regions of area 0.025.
-We will not reject H0 if the test statistic falls within the region of area 0.95.
❖The higher the level of significance, the higher the probability of rejecting the null hypothesis when it is true (the acceptance region is narrower).
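The 5% example can be illustrated numerically. The sketch below is not part of the original slides: it assumes the test statistic follows the standard normal distribution (a Z test), and the function name `in_rejection_region` is only illustrative. It finds the two critical values that leave an area of 0.025 in each tail.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-tailed test: alpha is split equally between the two tails (0.025 each).
lower_crit = std_normal.inv_cdf(alpha / 2)
upper_crit = std_normal.inv_cdf(1 - alpha / 2)

def in_rejection_region(z: float) -> bool:
    """Return True when z falls in either tail, i.e. when H0 is rejected."""
    return z < lower_crit or z > upper_crit

print(round(lower_crit, 2), round(upper_crit, 2))  # -1.96 1.96
```

Raising `alpha` moves both critical values towards zero, which is why a higher significance level narrows the acceptance region.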
How to Make a Conclusion based on Hypothesis Testing?
◦ Once the test has been carried out, the conclusion is always given in terms of the null hypothesis. We either REJECT H0 or FAIL TO REJECT H0.
◦ If we conclude that “We failed to reject H0”, this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence against H0.
Two-tailed Test
𝐻1 : ≠
How to Determine the p-value? (Step 2)
SPSS reports “Sig. 2-tailed”, the p-value for 𝐻0 : 𝝁 = …
Example: given “Sig. 2-tailed” = 0.000
Alternative hypothesis (H1) | p-value
𝐻1 : > OR 𝐻1 : < | p-value = (Sig. 2-tailed) / 2 = 0.000 / 2 = 0.000
𝐻1 : ≠ | p-value = Sig. 2-tailed = 0.000
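The halving rule can be written as a small helper. This is a sketch, not part of the original slides: the function name `one_tailed_p` and its arguments are hypothetical. It also adds one assumption the slide leaves implicit: halving is only appropriate when the sign of the test statistic agrees with the direction of H1; otherwise the one-tailed p-value is 1 minus the half.

```python
def one_tailed_p(sig_two_tailed: float, statistic: float, alternative: str) -> float:
    """Convert a reported two-tailed p-value into the p-value for H1.

    alternative: "two-sided" (H1: !=), "greater" (H1: >) or "less" (H1: <).
    """
    if alternative == "two-sided":
        return sig_two_tailed
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    half = sig_two_tailed / 2
    # The statistic must point in the direction claimed by H1 for halving to apply.
    agrees = statistic > 0 if alternative == "greater" else statistic < 0
    return half if agrees else 1 - half

print(one_tailed_p(0.04, 2.1, "greater"))  # 0.02
```

H0 is then rejected when the returned p-value is smaller than α.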
How to Write the Decision Rule (Step 4)?
1. Example for a one-tailed test [right-tailed test (>)]: 𝑯1: >
Decision Rule: Reject H0 if test statistic ($Z_{cal}$ or $t_{cal}$) > critical value
Decision : If test statistic > critical value, we reject H0; otherwise we fail to reject H0.
2. Example for a one-tailed test [left-tailed test (<)]: 𝑯1: <
Decision Rule: Reject H0 if test statistic ($Z_{cal}$ or $t_{cal}$) < negative critical value
Decision : If test statistic < negative critical value, we reject H0; otherwise we fail to reject H0.
Step 1: $H_0: \mu = 24$
        $H_1: \mu < 24$
Step 2: Test Statistics : $Z_{cal} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{23 - 24}{3/\sqrt{36}} = -2$
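The Step 2 arithmetic can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `z_statistic` is illustrative, and the left-tail p-value assumes the statistic is standard normal under H0.

```python
import math
from statistics import NormalDist

def z_statistic(x_bar: float, mu0: float, sigma: float, n: int) -> float:
    """One-sample Z statistic: (sample mean - hypothesised mean) / standard error."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))

z_cal = z_statistic(x_bar=23, mu0=24, sigma=3, n=36)
print(z_cal)  # -2.0

# Left-tailed test (H1: mu < 24): the p-value is the area to the left of z_cal.
p_value = NormalDist().cdf(z_cal)
print(round(p_value, 4))  # 0.0228
```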
[SPSS output, annotated: the test statistic (can either be $Z_{cal}$ or $t_{cal}$), the p-value, and $\bar{x} - \mu$]
𝛼 = 0.02
a) If the researcher would like to test whether the mean weight of the solid waste is different from 6.1kg,
what will be the null and alternative hypothesis?
b) Based on the p-value, can the researcher conclude that the mean weight of the solid waste is different
from 6.1kg?
Solution: 𝜇 = 6.1 , 𝑥ҧ = 5.2714 𝑠 = 1.11871, 𝑛 = 35 , 𝛼 = 0.02
a) H0 : μ = 6.1
H1 : μ ≠ 6.1
*If in question (b) you were asked to use the test statistic, then here is the solution:
b) Test statistic : $Z_{cal} = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{5.2714 - 6.1}{1.11871/\sqrt{35}} = -4.382$
Critical Value : $Z_{\alpha/2} = Z_{0.02/2} = Z_{0.01} = 2.3263$
Decision Rule : Reject $H_0$ if $|Z_{cal}| > Z_{\alpha/2}$
Decision : Since $|Z_{cal}| = 4.382 > Z_{0.01} = 2.3263$, we reject $H_0$.
Conclusion : Therefore, the researcher has enough evidence to conclude that the mean weight
of the solid waste is different from 6.1 kg.
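The numbers in this solution can be reproduced from the values on the Solution line. The sketch below is not part of the original slides and assumes a Z test with the sample standard deviation standing in for σ (n = 35 is a large sample). Note that the computed statistic is negative, because the sample mean is below 6.1, so the two-tailed comparison uses its absolute value.

```python
import math
from statistics import NormalDist

x_bar, mu0, s, n, alpha = 5.2714, 6.1, 1.11871, 35, 0.02

z_cal = (x_bar - mu0) / (s / math.sqrt(n))    # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value Z_{alpha/2}
reject_h0 = abs(z_cal) > z_crit               # two-tailed decision rule

print(round(z_cal, 3), round(z_crit, 4), reject_h0)  # -4.382 2.3263 True
```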
4.1.2
($\sigma^2$) OR $\sigma$
IS UNKNOWN
($n < 30$)
EXAMPLE 4
The speed limit along the Ipoh-Lumut highway is 90 km/h. The highway patrol centre suspects that
cars travelling along the highway exceed this speed limit. A sample of 15 cars had their speeds measured
by radar. The sample mean was 98 km/h and the standard deviation was 15 km/h. At the 5% level of
significance, is there evidence to indicate that cars travelling along this highway exceed the speed limit?
Solution: 𝜇 = 90 , 𝑥ҧ = 98, 𝑠 = 15, 𝑛 = 15 , 𝛼 = 0.05
𝐻0 :
Step 1: 𝐻1 :
Decision :
Step 5: Conclusion : Therefore,
EXAMPLE 5
The R & D department of a company requires that the mean life of the light bulbs produced should
exceed 4000 hours, with a standard deviation of less than 150 hours, before they can be supplied to
the market. A sample of 15 bulbs was tested and their lifetimes were measured. The data were
analysed using SPSS and the output is shown below:
𝜇 = 4000 , 𝑥ҧ = 4350, 𝑠 = 124.748, 𝑛 = 15
a) Write the null and alternative hypothesis.
H0 :
H1 :
b) Show that the value of the test statistics is 10.866.
Test statistics =
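One way to check the stated value is to compute the one-sample t statistic directly from the summary values given above. This is a sketch, not the intended worked answer: the function name `t_statistic` is illustrative, and it assumes μ = 4000 is the hypothesised mean under H0.

```python
import math

def t_statistic(x_bar: float, mu0: float, s: float, n: int) -> float:
    """One-sample t statistic with n - 1 degrees of freedom."""
    return (x_bar - mu0) / (s / math.sqrt(n))

t_cal = t_statistic(x_bar=4350, mu0=4000, s=124.748, n=15)
df = 15 - 1

print(round(t_cal, 3), df)  # 10.866 14
```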
4.2.1
INDEPENDENT SAMPLE
($\sigma_1^2, \sigma_2^2$) OR ($\sigma_1, \sigma_2$)
ARE KNOWN
EXAMPLE 6
An experiment was conducted in which two types of engines, A and B were compared. Gas mileage in
miles per gallon was measured. 75 experiments were conducted using engine type A and 50 experiments
were done for engine type B. The gasoline used and other conditions were held constant. The average
gas mileage for engine A was 42 miles per gallon and the average for engine B was 36 miles per gallon.
Test at 10% significance level to determine whether the average gas mileage for engine A is greater than
engine B. Assume that the population standard deviations are 8 and 6 for engine A and B respectively.
So, 𝑛𝐴 = 75 𝑛𝐵 = 50
𝑥𝐴ҧ = 42 𝑥ҧ𝐵 = 36
𝜎𝐴 = 8 𝜎𝐵 = 6
Step 1: $H_0:$
        $H_1:$
Step 2: Test Statistics : $Z_{cal} = \dfrac{(\bar{x}_A - \bar{x}_B) - (\mu_A - \mu_B)}{\sqrt{\dfrac{\sigma_A^2}{n_A} + \dfrac{\sigma_B^2}{n_B}}} =$
Decision :
Step 5:
Conclusion : Therefore, there is ______________________________ to conclude that the
average gas mileage for engine A is greater than engine B.
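For readers who want to check their working for Example 6, the two-sample Z statistic can be computed as below. This is a sketch under stated assumptions, not the official solution: it takes $\mu_A - \mu_B = 0$ under H0 and treats the test as right-tailed (engine A greater than engine B). The function name is illustrative.

```python
import math
from statistics import NormalDist

def two_sample_z(x1, x2, sigma1, sigma2, n1, n2, diff0=0.0):
    """Z statistic for two independent samples with known population sigmas."""
    std_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((x1 - x2) - diff0) / std_error

alpha = 0.10
z_cal = two_sample_z(x1=42, x2=36, sigma1=8, sigma2=6, n1=75, n2=50)
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value Z_alpha

print(round(z_cal, 3), round(z_crit, 4), z_cal > z_crit)
```

The last printed value is the decision under the right-tailed rule "reject H0 if $Z_{cal} > Z_{\alpha}$".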
4.2.2
INDEPENDENT SAMPLE
($\sigma_1^2, \sigma_2^2$) OR ($\sigma_1, \sigma_2$)
ARE UNKNOWN
INDEPENDENT SAMPLE
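$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$ or $\mu_1 > \mu_2$ or $\mu_1 < \mu_2$
$\sigma_1, \sigma_2$ or $\sigma_1^2, \sigma_2^2$ BOTH UNKNOWN:
- Large sample: $n_1$ and $n_2 \geq 30$
- Small sample: $n_1$ or $n_2 < 30$
  - Unequal variances ($\sigma_1^2 \neq \sigma_2^2$): $t_{cal} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$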
LARGE SAMPLE
BOTH $n_1$ & $n_2 \geq 30$
EXAMPLE 7 – DEC’19 Q.7
A researcher wants to determine whether there is a significant difference in Body Mass Index (BMI)
between males and females. A survey was conducted on 80 patients at Tawakal Health Centre. The
collected data were analysed using SPSS. The partial output is shown in the following table.
Std. error difference = $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} =$
Decision :
Step 5: Conclusion : Therefore,
SMALL SAMPLE
𝑛1 𝑂𝑅 𝑛2 < 30
&
EQUAL VARIANCES ASSUMED
($\sigma_1^2 = \sigma_2^2$)
THE ASSUMPTION OF EQUALITY OF VARIANCES
Example of question:
1) What is the assumption for the equality of variances? OR
2) Based on the Levene’s test, what is the assumption for the variances?
LEVENE’S TEST
Step 4: Conclusion
If we reject $H_0$, we conclude $\sigma_1^2 \neq \sigma_2^2$
If we fail to reject $H_0$, we conclude $\sigma_1^2 = \sigma_2^2$
EXAMPLE 8
An analysis is done to find out whether a new drug lowers blood pressure compared with the conventional treatment.
Patients with high blood pressure were randomly assigned to two groups, a placebo group and a treatment
group. The placebo group received the conventional treatment while the treatment group received a new drug
that is expected to lower blood pressure. After treatment for a couple of months, the data were recorded and analysed
using SPSS. Assume that the population variances are equal.
[SPSS independent-samples test output, annotated:
- p-value for $H_0: \sigma_1^2 = \sigma_2^2$ (Levene's test)
- p-value for $H_0: \mu_1 = \mu_2$
- mean difference: $\bar{x}_1 - \bar{x}_2$
- standard error of the difference: $s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$ (equal variances assumed) or $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ (equal variances not assumed)]
Let 𝜇1 = Mean blood pressure for treatment group
𝜇2 = Mean blood pressure for placebo group
b) Test whether there is any evidence to support the claim using the 5% level of significance.
*If SPSS output is given in the question and the question does not specify a method, it is highly recommended
to use the p-value method instead of the test statistic method to conduct the hypothesis test*
𝐻0 :
Step 1:
𝐻1 :
Step 2: p-value:
Decision :
At the 1% level of significance is there sufficient evidence in the samples to support Amir’s claim?
Solution:
EXAMPLE 10
A cigarette manufacturer claims that the average of tar content in the Brand B cigarettes is lower than that of Brand A.
To test the claim, the determinations of tar content, in milligrams, were recorded and the data was analyzed using
SPSS. Assume that the tar content of cigarettes in Brand A and Brand B are normally distributed.
Using the p-value, do the data provide sufficient evidence to indicate that the mean of tar content in the
Brand B cigarettes is lower than that of Brand A at 5% significance level?
𝐻0 :
Step 1:
𝐻1 :
Decision :
a) Based on p-value in the Levene’s Test, test the equality of variances in this study. Use 𝜶 = 𝟎. 𝟎𝟓.
b) State the null and alternative hypotheses for the mean test.
c) Using p-value, justify there is a significant difference in the mean enrolment of those
specializing in research and primary care, at 5% significance level.
(8 marks)
SMALL SAMPLE
𝑛1 𝑂𝑅 𝑛2 < 30
&
UNEQUAL VARIANCES ASSUMED
($\sigma_1^2 \neq \sigma_2^2$)
EXAMPLE 12
The breaking strengths of 11 bundles of wool fibres have a sample mean 436.5 and a sample standard
deviation of 11.90. In addition, the breaking strengths of another 12 bundles of synthetic fibres have a
sample mean 452.8 and a sample standard deviation 3.61. Assume the breaking strengths of the two
populations are normally distributed with unequal variances. Test at the 5% level of significance whether the
mean breaking strength of wool fibres is less than that of synthetic fibres.
So, 𝑛1 = 11 𝑛2 = 12
𝑥ҧ1 = 436.5 𝑥ҧ2 = 452.8
𝑠1 = 11.90 𝑠2 = 3.61
Step 1: $H_0:$
        $H_1:$
Step 2: Test Statistics : $t_{cal} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} =$
Decision :
Step 5:
Conclusion : Therefore, there is ______________________________ to conclude that the
mean breaking strength of wool fibres is less than that of synthetic fibres.
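To check working for Example 12, the unequal-variances t statistic can be computed from the summary values. This is a sketch, not the official solution. It assumes $\mu_1 - \mu_2 = 0$ under H0. The degrees of freedom use the Welch-Satterthwaite approximation, which is an assumption here: the slides do not show which degrees-of-freedom formula the course uses.

```python
import math

def welch_t(x1, x2, s1, s2, n1, n2, diff0=0.0):
    """t statistic and approximate degrees of freedom, variances not assumed equal."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # per-sample variance of the mean
    t = ((x1 - x2) - diff0) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumed, not taken from the slides).
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t_cal, df = welch_t(x1=436.5, x2=452.8, s1=11.90, s2=3.61, n1=11, n2=12)
print(round(t_cal, 3), round(df, 2))
```

For the left-tailed alternative in this example, H0 is rejected when `t_cal` is below the negative critical value from the t table for the chosen degrees of freedom.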
EXAMPLE 13
A set of facilitation tools to help with data analysis for problem solving is being developed by a group of
statisticians at UiTM. In order to test effectiveness of these tools, a group of research officers were asked
to analyze and produce a built-in report for a set of data on the computer. Twelve equally capable
research officers were randomly selected and six were randomly assigned a standard procedure to
complete the task. The other six were asked to do the task using the developed facilitation tools. The
response measured was the time to completion (in minutes). The output of statistical analysis is shown in
the following tables.
At the 5% significance level, can it be concluded that the mean completion time using the
standard procedure is greater than that using the facilitation tools?
𝐻0 :
Step 1:
𝐻1 :
Decision :
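[SPSS paired-samples test output, annotated: mean difference $\bar{d}$; standard deviation $s_d$; standard error $s_d/\sqrt{n}$; test statistic $t_{cal}$; degrees of freedom $n - 1$; p-value]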
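The quantities in the paired-samples output are related by $t_{cal} = \bar{d} / (s_d/\sqrt{n})$ with $n - 1$ degrees of freedom. The sketch below shows the computation on made-up numbers: the `before` and `after` lists are hypothetical illustration data, not taken from any example in this chapter, and the function name is illustrative.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic using differences d = before - after."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    d_bar = mean(d)            # mean of the differences
    s_d = stdev(d)             # sample standard deviation of the differences
    std_error = s_d / math.sqrt(n)
    return d_bar / std_error, n - 1

# Hypothetical scores for four subjects (illustration only).
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]

t_cal, df = paired_t(before, after)
print(round(t_cal, 3), df)  # -4.899 3
```

With d defined as before minus after, an increase after treatment shows up as a negative `t_cal`.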
Solution: Let 𝜇𝑑 = 𝑀𝑒𝑎𝑛 𝑏𝑒𝑓𝑜𝑟𝑒 − 𝑀𝑒𝑎𝑛 𝑎𝑓𝑡𝑒𝑟
Decision Rule :
Decision :
Conclusion : Therefore, there is __________________ to conclude that attending the workshop increases the test scores.
Step 3: Critical value :
Step 4: Decision Rule :
        Decision :
Step 5: Conclusion : Therefore, there is __________________ to conclude that attending the workshop increases the test scores.
EXAMPLE 15 – DEC’18 Q.4
The manager of an insurance company hired a marketing executive officer to advise on the best advertising
strategies to improve the sales of the company's insurance policies. In order to investigate
whether the sales had improved, the manager recorded the sales of the insurance policies a month
before and after the officer was hired. The data were analysed using SPSS and the results are as follows.
b) H0 : μd = 0
H1 :
Critical value :
Decision Rule :
Decision :
$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
$H_1:$ At least one pair of treatment means differs
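Computing formulas:
$N$ = the total number of observations
$\Sigma x$ = the sum of all the observations
$\Sigma x^2$ = the sum of squares of all the observations
Total Sum of Squares, $SST = \Sigma x^2 - CF$ ; where $CF = \dfrac{(\Sigma x)^2}{N}$
Sum of Squares of Treatments, $SS_{Treatment} = SSR = \dfrac{T_1^2}{n_1} + \dfrac{T_2^2}{n_2} + \dots + \dfrac{T_k^2}{n_k} - \dfrac{(\Sigma x)^2}{N}$ ; where $T_i$ = total for treatment $i$, and $\dfrac{T_1^2}{n_1} + \dfrac{T_2^2}{n_2} + \dots + \dfrac{T_k^2}{n_k} = \sum_{i=1}^{k} \dfrac{T_i^2}{n_i}$
ANOVA table:
Source | Sum of squares | Degrees of freedom | Mean square | F
Treatment | $SS_{Treatment}$ / $SSR$ | $k - 1$ ($v_1$) | $MS_{Treatment}$ / $MSR = \dfrac{SSR}{k - 1}$ | $F_{cal} = \dfrac{MSR}{MSE}$
Error | $SSE$ | $N - k$ ($v_2$) | $MSE = \dfrac{SSE}{N - k}$ |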
STEP 5 : Conclusion
There is enough / not enough evidence to conclude that (refer to the question) ………
EXAMPLE 16
Fifteen fourth-grade students were randomly assigned to three groups to experiment with three different
methods of teaching arithmetic. At the end of the semester, the same test was given to all 15 students.
The table gives the scores of students in the three groups.
Method 1 48 73 51 65 87 𝑇1 = 324
Method 2 55 85 70 69 90 𝑇2 = 369
Method 3 84 68 95 74 67 𝑇3 = 388
At the 1% level of significance, can we reject the null hypothesis that the mean arithmetic scores of all
fourth-grade students taught by the three methods are the same?
Solution: Let $\mu_1$ = mean arithmetic score of students taught by Method 1
$\mu_2$ = mean arithmetic score of students taught by Method 2
$\mu_3$ = mean arithmetic score of students taught by Method 3
STEP 1 : $H_0: \mu_1 = \mu_2 = \mu_3$
$H_1:$ At least one pair of treatment means differs
$CF = \dfrac{(\Sigma x)^2}{N} = \dfrac{(1081)^2}{15} = 77904.067$
STEP 5: Conclusion : Therefore, there is not enough evidence to reject the null hypothesis that the mean
arithmetic scores of all fourth-grade students taught by the three methods are the
same.
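The intermediate ANOVA quantities for Example 16 can be reproduced from the score table with the computing formulas. This is a minimal sketch rather than the slides' own working: the variable names are illustrative, and SSE is obtained as SST minus SSR, the usual one-way ANOVA identity. The critical value $F_{0.01, 2, 12}$ still has to be read from an F table.

```python
groups = {
    "Method 1": [48, 73, 51, 65, 87],
    "Method 2": [55, 85, 70, 69, 90],
    "Method 3": [84, 68, 95, 74, 67],
}

all_x = [x for scores in groups.values() for x in scores]
N, k = len(all_x), len(groups)

cf = sum(all_x) ** 2 / N                 # correction factor (sum x)^2 / N
sst = sum(x * x for x in all_x) - cf     # total sum of squares
ssr = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cf  # treatment SS
sse = sst - ssr                          # error sum of squares

msr = ssr / (k - 1)                      # df v1 = k - 1
mse = sse / (N - k)                      # df v2 = N - k
f_cal = msr / mse

print(round(cf, 3), round(sst, 3), round(ssr, 3), round(sse, 3))
print(round(f_cal, 4))
```

A small `f_cal` relative to the table value is consistent with the conclusion above that H0 is not rejected.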
EXAMPLE 17
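Reconsider the previous example (Example 16) and carry out the ANOVA hypothesis test using the p-value method:
[SPSS ANOVA table, annotated: the Factor/Treatment row gives $SSR$ and $MSR$; the Error row gives $SSE$ and $MSE$; the Total row gives $SST$; the output also shows $F_{cal}$ and the p-value]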
STEP 1: $H_0: \mu_1 = \mu_2 = \mu_3$
$H_1:$ At least one pair of treatment means differs
STEP 4: Conclusion : Therefore, there is not enough evidence to reject the null hypothesis that the mean arithmetic scores of all
fourth-grade students taught by the three methods are the same.
EXAMPLE 18
A group of researchers is studying 3 new materials for teeth filling. The hardening indexes (in percentage)
are shown in the following table:
The hardening indexes are normally distributed with the same variance. Do the data indicate that the
mean indexes are the same for the three types of materials? Use 5% significance level.
Solution: Let $\mu_A$ = mean hardening index for Material A
$\mu_B$ = mean hardening index for Material B
$\mu_C$ = mean hardening index for Material C
STEP 1 : $H_0: \mu_A = \mu_B = \mu_C$
$H_1:$ At least one pair of treatment means differs
$CF = \dfrac{(\Sigma x)^2}{N} =$
$SST = \Sigma x^2 - CF =$
STEP 5: Conclusion : Therefore, there is _________________________ to conclude that the data does
indicate the mean indexes are the same for the three types of materials.
EXAMPLE 19 – DEC’19 Q.1
A team of researchers is interested in comparing the yield (in kilograms) of four different varieties (A, B, C, D) of
rambutan tree in the Kg Hutan Kampung orchard. The researchers obtained a random sample of four trees of
each variety from the same orchard. The data were analysed using IBM SPSS Statistics. The results are given
below.
𝐻0 :
𝐻1 :
c) Based on the p-value, test at the 5% level of significance whether the mean yield differs among the four
different varieties.
𝑝 − 𝑣𝑎𝑙𝑢𝑒 =
Decision :
Conclusion :
4.4
TEST OF INDEPENDENCE
The test of independence is used to analyse categorical variables; in
particular, it tests whether two categorical variables are independent
(e.g. are GENDER and BRAND OF CAR related or independent of each other?).
It tests the null hypothesis that the two characteristics of the elements of a
given population are not related.
STEP 5 : Conclusion
There is enough / not enough evidence to conclude that (refer to the question) ………
How to Calculate the Expected Frequencies (E)?
Total no. of rows, r = 3; total no. of columns, c = 2
Hair colour | Temper: Vile | Temper: Mild | Total
Red | 40 ; E=27 | 20 ; E=33 | 60
Brown | 80 ; E=81 | 100 ; E=99 | 180
Black | 60 ; E=72 | 100 ; E=88 | 160
Total | 180 | 220 | 400
Each expected frequency is E = (row total × column total) / grand total; for example, for Red and Vile, E = (60 × 180) / 400 = 27.
Actual data, O = 40, 80, 60, 20, 100, 100
Since there are 6 values of actual data (O), there will be 6 values of expected frequency (E), as each O
has its own E value.
STEP 4 : Decision Rule: Reject $H_0$ if $\chi^2_{cal} > \chi^2_{\alpha,(r-1)(c-1)}$
Decision : Since $\chi^2_{cal} = 15.039 > \chi^2_{0.05,2} = 5.991$, we reject $H_0$.
STEP 5 : Conclusion:
There is enough evidence to conclude that temper and hair colour are related (not independent).
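The expected frequencies and the chi-square statistic for the hair colour and temper table can be reproduced as follows. This is a sketch, not the slides' own working: the variable names are illustrative, and the critical value 5.991 is the table value quoted in the decision step.

```python
# Observed counts: rows are Red, Brown, Black; columns are Vile, Mild.
observed = [
    [40, 20],
    [80, 100],
    [60, 100],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E = (row total x column total) / grand total for every cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2_cal = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(expected)
print(round(chi2_cal, 3), df, chi2_cal > 5.991)
```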
EXAMPLE 21 – JUNE’19 Q.4
A manager at Company Brilliant wishes to determine whether the employees’ work satisfaction is
related to their respective department. The results obtained are shown below.
[SPSS crosstabulation and chi-square test output, annotated: actual data ($O$), expected frequency ($E$), $\chi^2_{cal}$ and p-value]
$df = (r - 1)(c - 1) = (2 - 1)(4 - 1) = 3$, not 2
a) Using an appropriate formula, compute the values of D and E.
Solution:
= 2.152
c) Based on the p-value, is there sufficient evidence to conclude that employees’ work satisfaction is
related to their respective department? Use 𝛼 = 0.05.
STEP 4 : Conclusion:
There is not enough evidence to conclude that employees’ work satisfaction is related to
their respective department.
EXAMPLE 22 – JULY’17 Q.6
The management of a train company wants to study if there is any association between the train station’s
crowd and the train delay in Klang Valley. Hence, the number of trains on time and the number of trains
that were late were observed at three different stations in Klang Valley. The crosstabulation of the
observation and the chi-square test are displayed below.
a) Calculate the values of A and B.
c) Based on the p-value, can we conclude that there is an association between station’s crowd and train
delay?
END OF CHAPTER 4