
Chapter 4

This document outlines the key steps in hypothesis testing for a population mean. It discusses determining the critical value, calculating the p-value, writing the decision rule, and making a conclusion. The steps are: 1) state the null and alternative hypotheses; 2) calculate the test statistic; 3) determine the critical value from statistical tables; 4) write the decision rule to either reject or fail to reject the null hypothesis based on where the test statistic falls relative to the critical value; and 5) make a conclusion based on the decision of whether to reject or fail to reject the null hypothesis. Examples are provided for one-tailed and two-tailed tests.

Uploaded by

Afifa Suib

CHAPTER 4

HYPOTHESIS TESTING
Lecturer: Dr Ruzanita Mat Rani
Prepared by: HAZIYAH BINTI MD JASMIN
CONTENTS
4.0 Introduction / Concept of Hypothesis Testing
4.1 Hypothesis Tests about a Population Mean (μ)
4.1.1 Variance (σ²) or Std. Deviation (σ) is KNOWN
4.1.2 Variance (σ²) or Std. Deviation (σ) is UNKNOWN – (Large & Small Sample)
4.2 Hypothesis Testing for Difference between Two Population Means (μ1 − μ2)
4.2.1 Independent Sample - Variances (σ1², σ2²) or Std. Deviations (σ1, σ2) are KNOWN
4.2.2 Independent Sample - Variances (σ1², σ2²) or Std. Deviations (σ1, σ2) are UNKNOWN
4.2.2.1 Large Sample
4.2.2.2 Small Sample – Equal and Unequal Variances assumed (σ1² = σ2² or σ1² ≠ σ2²)
4.2.3 Dependent Sample
4.3 Testing for the Difference among more than Two Means (ANOVA)
4.4 Test of Independence


4.0
INTRODUCTION
Concept of Hypothesis Testing

• A hypothesis is a statement or claim made about the value of a population parameter.

• Hypothesis testing (test of significance) is a procedure whereby sample information is used to decide whether or not to reject the statement made regarding the value of the population parameter.

• Hypothesis testing refers to:

1. Making an assumption about a population parameter. This assumption may or may not be true.
2. Collecting sample data.
3. Calculating a sample statistic.
4. Using the sample statistic to evaluate the hypothesis.
Types of Hypothesis

HYPOTHESES

Null Hypothesis (H0)
1. A null hypothesis is a claim (or statement) about a population parameter that is assumed to be true until it is declared false.
2. State the hypothesised value of the parameter before sampling.
3. Always stated using an equal sign (=).
Example: H0: µ = 90

Alternative Hypothesis (H1)
1. An alternative hypothesis is a claim about a population parameter that will be true if the null hypothesis is false.
2. Can be in 2 forms (Directional or Non-directional).
Example: H1: µ ≠ 90 (Non-directional)
H1: µ > 90 or H1: µ < 90 (Directional)
Null Hypothesis (H0)
◦ For example: In a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug. Then, we would write the null hypothesis as:

H0: There is no difference between the two drugs on average

OR
H0: µ1 = µ2
Alternative Hypothesis (H1)
◦ For example, in the clinical trial of a new drug, the alternative hypothesis might be that the new drug has a different effect, on average, compared to that of the current drug. We would write the alternative hypothesis as:

H1: The two drugs have different effects, on average.

OR
H1: µ1 ≠ µ2
EXAMPLE 1
State the null and alternative hypotheses for the following problems.
Suppose that the average number of km per liter obtained with carburetors is 20 km/liter.
A. If you want to show that the average number of km/liter obtained with a certain carburetor is less than 20 km/liter, then
ANSWER: H0:
H1:
B. If you want to show that the mean number of km/liter obtained with a certain carburetor is different from 20 km/liter, then
ANSWER: H0:
H1:
Hypothesis-Testing Common Phrases
Type of Tests
Alternative Hypothesis (H1)

Directional Hypothesis: One-tailed Test
◦ Right-tailed test, H1: >
◦ Left-tailed test, H1: <

Non-Directional Hypothesis: Two-tailed Test, H1: ≠
[Figure: two-tailed test with a rejection region of area 0.025 in each tail]
Definition of Terms
Terms : Definition

Significance level (α) : Probability of rejecting the null hypothesis when it is true. The level of confidence is 1 – α. Thus, a confidence level of 95% corresponds to an alpha of 5%.
Test statistic : A single number calculated from the sample data, used as the basis for deciding whether or not to reject the null hypothesis.
Regions : The region consisting of values that support H1 and lead to rejecting H0 is called the rejection region (or critical region). The other region, consisting of values that support H0, is called the acceptance region.
Critical value : Value of the test statistic that divides the non-rejection region from the rejection region (left-tailed, right-tailed, two-tailed).
Significance Level (𝜶)
❖ Deciding on a criterion for rejecting or not rejecting H0.
❖ Significance level refers to the percentage of sample means that is outside certain prescribed limits.
❖ EXAMPLE: Testing a hypothesis at the 5% level of significance (α = 0.05) with a two-tailed test.
- We will reject H0 if the test statistic falls in either of the two regions of area 0.025.
- We will not reject H0 if the test statistic falls within the region of area 0.95.
❖ The higher the level of significance, the higher the probability of rejecting the null hypothesis when it is true (the acceptance region is narrower).
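The two rejection regions of area 0.025 can be located numerically. The sketch below is only an illustration, assuming the test statistic follows the standard normal distribution; the names `std_normal` and `in_rejection_region` are hypothetical helpers, not part of the course material.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed test: alpha is split equally between the two tails,
# so each rejection region has area alpha / 2 = 0.025.
upper = std_normal.inv_cdf(1 - alpha / 2)
lower = -upper


def in_rejection_region(z):
    """Return True when z falls in either tail of area alpha / 2."""
    return z < lower or z > upper


print(round(lower, 2), round(upper, 2))  # -1.96 1.96
print(in_rejection_region(2.5))          # True
print(in_rejection_region(0.9))          # False
```

Raising `alpha` moves both boundaries towards zero, which is the narrowing of the acceptance region described above.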
How to Make Conclusion based on
Hypothesis Testing?
◦ The conclusion once the test has been carried out is always given in terms of the null hypothesis. We either REJECT H0 or FAIL TO REJECT H0.

◦ REMEMBER!! We never conclude REJECT H1 or even ACCEPT H0.

◦ If we conclude that “We failed to reject H0”, this does not necessarily mean that the null hypothesis is true; it only suggests that there is not enough evidence against H0.

◦ Rejecting H0 suggests that the alternative hypothesis may be true.


4.1
HYPOTHESIS TESTS ABOUT A POPULATION MEAN (μ)
Steps in Hypothesis Testing for A Population Mean

By using Test Statistics
STEP 1: State the null and alternative hypotheses
H0: μ = μ0
H1: μ ≠ μ0 or μ > μ0 or μ < μ0
STEP 2: Calculate the test statistic, Zcal or tcal
STEP 3: Determine the critical value (use Statistical Table: Table 4 or Table 7)
STEP 4: Decision rule
Decision: “Reject H0” or “Fail to Reject H0”
STEP 5: Conclusion: Therefore, there is enough evidence or not enough evidence to conclude that …

By using P-Value
STEP 1: State the null and alternative hypotheses
H0: μ = μ0
H1: μ ≠ μ0 or μ > μ0 or μ < μ0
STEP 2: Determine the p-value (column labelled “Sig.” or “Sig. 2-tailed”)
STEP 3: Decision rule: Reject H0 if p-value ≤ α
Decision: “Reject H0” or “Fail to Reject H0”
STEP 4: Conclusion: Therefore, there is enough evidence or not enough evidence to conclude that …
How to Determine the Critical Value (Step 3)?
Alternative Hypothesis (H1):
◦ Right-tailed test, H1: >
◦ Left-tailed test, H1: <
◦ Two-tailed test, H1: ≠
How to Determine the p-value? (Step 2)

Given “Sig. 2-tailed” (the p-value for H0: μ = …) equal to 0.000:

Alternative hypothesis (H1) : p-value
H1: > OR H1: < : p-value = (Sig. 2-tailed)/2 = 0.000/2 = 0.000
H1: ≠ : p-value = Sig. 2-tailed = 0.000
How to Write the Decision Rule (Step 4)?
1. Example for one-tailed test [Right-tailed test (>)]: H1: >
Decision Rule: Reject H0 if test statistic (Zcal or tcal) > critical value
Decision: If test statistic > critical value, we reject H0; otherwise we fail to reject H0.

2. Example for one-tailed test [Left-tailed test (<)]: H1: <
Decision Rule: Reject H0 if test statistic (Zcal or tcal) < (-ve) critical value
Decision: If test statistic < (-ve) critical value, we reject H0; otherwise we fail to reject H0.

3. Example for two-tailed test (≠): H1: ≠
Decision Rule: Reject H0 if |test statistic| (|Zcal| or |tcal|) > critical value
Decision: If |test statistic| > critical value, we reject H0; otherwise we fail to reject H0.
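The three decision rules can be collected into one small function. This is a minimal sketch rather than a prescribed procedure: `decide` and its `tail` argument are hypothetical names, and the critical value is assumed to be passed in as the positive table value. The two calls reuse the numbers from Examples 1 and 2 of this chapter.

```python
def decide(test_stat, critical_value, tail):
    """Apply the decision rule; critical_value is the positive table value."""
    if tail == "right":
        reject = test_stat > critical_value
    elif tail == "left":
        reject = test_stat < -critical_value
    elif tail == "two":
        reject = abs(test_stat) > critical_value
    else:
        raise ValueError("tail must be 'right', 'left' or 'two'")
    return "Reject H0" if reject else "Fail to reject H0"


print(decide(-2.0, 1.6449, "left"))  # Reject H0
print(decide(0.943, 1.96, "two"))    # Fail to reject H0
```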
4.1.1
(σ²) OR σ IS KNOWN
EXAMPLE 1
A company producing 3A batteries claims that its batteries last an average of 24 months with a standard
deviation of 3 months. A sample of 36 batteries was tested. The mean life of these batteries was 23
months. Using the 5% level of significance, is there evidence to indicate that the mean lifetime of 3A
batteries is below 24 months?
Solution: μ = 24, σ = 3, n = 36, x̄ = 23, α = 0.05

Step 1: H0: μ = 24
H1: μ < 24

Step 2: Test Statistics : $Z_{cal} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{23 - 24}{3 / \sqrt{36}} = -2$

Step 3: Critical Value : $-Z_{\alpha} = -Z_{0.05} = -1.6449$

Step 4: Decision Rule : Reject H0 if $Z_{cal} < -Z_{\alpha}$
Decision : Since $Z_{cal} = -2 < -Z_{0.05} = -1.6449$, we reject H0
Step 5: Conclusion : Therefore, there is enough evidence to indicate that the mean lifetime of 3A batteries is below 24 months.
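The arithmetic of this example can be checked with a few lines of Python. This is a sketch that assumes the standard normal distribution from Python's `statistics` module in place of the statistical table; the left-tail p-value is an extra that the worked solution does not ask for.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar, alpha = 24, 3, 36, 23, 0.05

z_cal = (x_bar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(alpha)  # left-tail critical value, -Z_alpha
p_value = NormalDist().cdf(z_cal)     # P(Z < z_cal) for a left-tailed test

print(z_cal)               # -2.0
print(round(z_crit, 4))    # -1.6449
print(round(p_value, 4))   # 0.0228
print("Reject H0" if z_cal < z_crit else "Fail to reject H0")  # Reject H0
```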
4.1.2
(σ²) OR σ IS UNKNOWN
(n ≥ 30)
EXAMPLE 2
A pharmaceutical manufacturer purchases a particular material from a supplier. The manufacturer selects
30 shipments from the supplier and measures the percentage of impurities in the raw material from each
shipment. The sample mean and variance are 𝑥ҧ = 1.89 and 𝑠 2 = 0.273 respectively. Test at 5% level of
significance whether the average percentage of impurities is different from 1.8.
Solution: μ = 1.8, x̄ = 1.89, s² = 0.273, n = 30, α = 0.05

Step 1: H0: μ = 1.8
H1: μ ≠ 1.8

Note: Use the modulus (| |) symbol, as in |Zcal| or |tcal|, if and only if your H1 has an unequal symbol (≠).

Step 2: Test Statistics : $Z_{cal} = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{1.89 - 1.8}{\sqrt{0.273} / \sqrt{30}} = 0.943$

Step 3: Critical Value : $Z_{\alpha/2} = Z_{0.05/2} = Z_{0.025} = 1.96$

Step 4: Decision Rule : Reject H0 if $|Z_{cal}| > Z_{\alpha/2}$
Decision : Since $|Z_{cal}| = 0.943 < Z_{0.025} = 1.96$, we fail to reject H0
Step 5: Conclusion : Therefore, there is not enough evidence to conclude that the average percentage of impurities is different from 1.8.
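As a check on the numbers above, the following sketch recomputes the test statistic from the sample variance and compares its absolute value with the two-tailed critical value. It assumes the large-sample normal approximation used in this example, with `statistics.NormalDist` standing in for the statistical table.

```python
from math import sqrt
from statistics import NormalDist

mu0, x_bar, s_squared, n, alpha = 1.8, 1.89, 0.273, 30, 0.05

# s is the square root of the sample variance s^2.
z_cal = (x_bar - mu0) / (sqrt(s_squared) / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}

print(round(z_cal, 3))   # 0.943
print(round(z_crit, 2))  # 1.96
print("Reject H0" if abs(z_cal) > z_crit else "Fail to reject H0")  # Fail to reject H0
```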
EXAMPLE 3
Based on the information given by the housekeepers, it was found that the hotel produced 6.1 kilograms of solid waste daily. The following tables show the results obtained from a further analysis of the study.
[SPSS output tables. Annotations on the output: t = test statistic (Step 2), which can be either Zcal or tcal; Test value = μ; the p-value; and x̄ − μ]
α = 0.02
a) If the researcher would like to test whether the mean weight of the solid waste is different from 6.1kg,
what will be the null and alternative hypothesis?
b) Based on the p-value, can the researcher conclude that the mean weight of the solid waste is different
from 6.1kg?
Solution: μ = 6.1, x̄ = 5.2714, s = 1.11871, n = 35, α = 0.02
a) H0 : μ = 6.1
H1 : μ ≠ 6.1

b) p-value = 0.000

Decision Rule : Reject H0 if p-value ≤ α
Decision : Since p-value = 0.000 < α = 0.02, we reject H0
Conclusion : Therefore, the researcher has enough evidence to conclude that the mean weight of the solid waste is different from 6.1kg.

*If in question (b) you were asked to use the test statistic, then here is the solution:
b) Test statistics : |Zcal| = 4.382
Critical Value : Zα/2 = Z0.02/2 = Z0.01 = 2.3263
Decision Rule : Reject H0 if |Zcal| > Zα/2
Decision : Since |Zcal| = 4.382 > Z0.01 = 2.3263, we reject H0
Conclusion : Therefore, the researcher has enough evidence to conclude that the mean weight of the solid waste is different from 6.1kg.
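The test statistic quoted above can be reproduced from the summary values in the solution line. The sketch below is illustrative only; it shows that the signed statistic is negative (the sample mean is below 6.1), which is why the two-tailed rule compares its absolute value with the critical value.

```python
from math import sqrt
from statistics import NormalDist

mu0, x_bar, s, n, alpha = 6.1, 5.2714, 1.11871, 35, 0.02

z_cal = (x_bar - mu0) / (s / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{0.01}

print(round(z_cal, 3))       # -4.382
print(round(z_crit, 4))      # 2.3263
print(abs(z_cal) > z_crit)   # True, so H0 is rejected
```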
4.1.2
(σ²) OR σ IS UNKNOWN
(n < 30)
EXAMPLE 4
The speed limit along the Ipoh-Lumut highway is 90km/h. The highway patrol centre suspects that cars travelling along the highway exceed this speed limit. A sample of 15 cars had their speeds measured by radar. The sample mean was 98km/h and the standard deviation was 15km/h. At the 5% level of significance, is there evidence to indicate that cars travelling along this highway exceed the speed limit?
Solution: 𝜇 = 90 , 𝑥ҧ = 98, 𝑠 = 15, 𝑛 = 15 , 𝛼 = 0.05

𝐻0 :
Step 1: 𝐻1 :

Step 2: Test Statistics :

Step 3: Critical Value :

Step 4: Decision Rule :

Decision :
Step 5: Conclusion : Therefore,
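One way to check a worked answer for this example is sketched below. It assumes a right-tailed test (H1: μ > 90) and takes the critical value 1.761 for t with 14 degrees of freedom at α = 0.05 from a standard t table; neither is filled in on the slide, so both are assumptions of this sketch.

```python
from math import sqrt

mu0, x_bar, s, n = 90, 98, 15, 15

t_cal = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

# Assumption: tabulated t(0.05, 14), not a value given in the example.
t_crit = 1.761

print(round(t_cal, 3), df)  # 2.066 14
print("Reject H0" if t_cal > t_crit else "Fail to reject H0")  # Reject H0
```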
EXAMPLE 5
The R & D department of an industry requires that the mean life of the light bulbs produced should exceed 4000 hours, with a standard deviation of less than 150 hours, before they can be supplied to the market. A sample of 15 bulbs was tested and their lifetimes were measured. The data were analysed using SPSS and the output is shown below:
𝜇 = 4000 , 𝑥ҧ = 4350, 𝑠 = 124.748, 𝑛 = 15
a) Write the null and alternative hypothesis.
H0 :
H1 :
b) Show that the value of the test statistics is 10.866.
Test statistics =

c) Write the decision rule and decision. Use 5% significance level.

d) Write the conclusion.
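Part (b) can be verified directly from the summary values listed for this example. The sketch below only checks the arithmetic; the variable names are illustrative.

```python
from math import sqrt

mu0, x_bar, s, n = 4000, 4350, 124.748, 15

standard_error = s / sqrt(n)
t_cal = (x_bar - mu0) / standard_error

print(round(t_cal, 3))  # 10.866
```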


4.2
HYPOTHESIS TESTING FOR DIFFERENCE BETWEEN TWO POPULATION MEANS (μ1 − μ2)
Note: in these tests, the hypothesised value of μ1 − μ2 is always equal to zero (0).
Steps in Hypothesis Testing for Two Population Means
- INDEPENDENT SAMPLE (μ1 − μ2) -

By using Test Statistics
STEP 1: State the null and alternative hypotheses
H0: μ1 = μ2
H1: μ1 ≠ μ2 or μ1 > μ2 or μ1 < μ2
STEP 2: Calculate the test statistic, Zcal or tcal
STEP 3: Determine the critical value (use Statistical Table: Table 4 or 7)
STEP 4: Decision rule: _______________
Decision: “Reject H0” or “Fail to Reject H0”
STEP 5: Conclusion: Therefore, there is enough evidence or not enough evidence to conclude that …

By using P-Value
STEP 1: State the null and alternative hypotheses
H0: μ1 = μ2
H1: μ1 ≠ μ2 or μ1 > μ2 or μ1 < μ2
STEP 2: Determine the p-value (column labelled “Sig.” or “Sig. 2-tailed”)
STEP 3: Decision rule: Reject H0 if p-value ≤ α
Decision: “Reject H0” or “Fail to Reject H0”
STEP 4: Conclusion: Therefore, there is enough evidence or not enough evidence to conclude that …
4.2.1
INDEPENDENT SAMPLE
(σ1², σ2²) OR (σ1, σ2) ARE KNOWN
EXAMPLE 6
An experiment was conducted in which two types of engines, A and B were compared. Gas mileage in
miles per gallon was measured. 75 experiments were conducted using engine type A and 50 experiments
were done for engine type B. The gasoline used and other conditions were held constant. The average
gas mileage for engine A was 42 miles per gallon and the average for engine B was 36 miles per gallon.
Test at 10% significance level to determine whether the average gas mileage for engine A is greater than
engine B. Assume that the population standard deviations are 8 and 6 for engine A and B respectively.

Solution: Let 𝜇𝐴 = Population mean gas mileage for Engine A


𝜇𝐵 = Population mean gas mileage for Engine B

So, 𝑛𝐴 = 75 𝑛𝐵 = 50
𝑥𝐴ҧ = 42 𝑥ҧ𝐵 = 36
𝜎𝐴 = 8 𝜎𝐵 = 6
Step 1: H0:
H1:

Step 2: Test Statistics : $Z_{cal} = \dfrac{(\bar{x}_A - \bar{x}_B) - (\mu_A - \mu_B)}{\sqrt{\dfrac{\sigma_A^2}{n_A} + \dfrac{\sigma_B^2}{n_B}}} =$

Step 3: Critical Value : 𝑍𝛼 = 𝑍0.10 =

Step 4: Decision Rule :

Decision :

Step 5:
Conclusion : Therefore, there is ______________________________ to conclude that the
average gas mileage for engine A is greater than engine B.
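A worked answer for this example can be checked with the sketch below. It assumes a right-tailed test (H1: μA > μB) with a hypothesised difference of zero, and uses `statistics.NormalDist` in place of the statistical table; the slide leaves these steps blank, so the output here is a check rather than the official solution.

```python
from math import sqrt
from statistics import NormalDist

n_a, n_b = 75, 50
x_a, x_b = 42, 36
sigma_a, sigma_b = 8, 6
alpha = 0.10

standard_error = sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)
z_cal = ((x_a - x_b) - 0) / standard_error  # hypothesised difference is 0
z_crit = NormalDist().inv_cdf(1 - alpha)    # Z_{0.10}

print(round(z_cal, 3), round(z_crit, 4))  # 4.783 1.2816
print("Reject H0" if z_cal > z_crit else "Fail to reject H0")  # Reject H0
```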
4.2.2
INDEPENDENT SAMPLE
(σ1², σ2²) OR (σ1, σ2) ARE UNKNOWN
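INDEPENDENT SAMPLE
H0: μ1 = μ2
H1: μ1 ≠ μ2 or μ1 > μ2 or μ1 < μ2

σ1, σ2 or σ1², σ2² BOTH UNKNOWN

◦ n1 & n2 ≥ 30:
$Z_{cal} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

◦ n1 OR n2 < 30, with σ1² = σ2²:
$t_{cal} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$

◦ n1 OR n2 < 30, with σ1² ≠ σ2²:
$t_{cal} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$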
LARGE SAMPLE
BOTH n1 & n2 ≥ 30
EXAMPLE 7 – DEC’19 Q.7
A researcher wants to determine whether there is a significant difference in the Body Mass Index (BMI) between males and females. A survey was conducted on 80 patients at Tawakal Health Centre. The collected data were analyzed using SPSS. The partial output is shown in the following table.

a) Show that the standard error difference is 0.9425.

b) State the null and alternative hypotheses for the above study.
c) Given that the z-statistic is 3.205, do the data provide sufficient evidence to indicate that there is a significant difference in the Body Mass Index (BMI) between males and females? Use α = 0.05.
(8 marks)
Let 𝜇1 = Mean BMI for male patient
𝜇2 = Mean BMI for female patient
a) Show that the standard error difference is 0.9425.

Std. error difference = $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ =

b) Write the null and alternative hypothesis.


H0 :
H1 :
c) Given the test statistics, 𝒁𝒄𝒂𝒍 = 𝟑. 𝟐𝟎𝟓

Step 3: Critical Value :

Step 4: Decision Rule :

Decision :
Step 5: Conclusion : Therefore,
SMALL SAMPLE
n1 OR n2 < 30
&
EQUAL VARIANCES ASSUMED
(σ1² = σ2²)
THE ASSUMPTION OF EQUALITY OF VARIANCES
Example of question:
1) What is the assumption for the equality of variances? OR
2) Based on the Levene’s test, what is the assumption for the variances?
LEVENE’S TEST

Step 1: H0: σ1² = σ2²
H1: σ1² ≠ σ2²

Step 2: State the p-value

Step 3: Decision Rule : Reject H0 if p-value ≤ α
Decision : Since p-value ≤ α, we reject H0 OR Since p-value > α, we fail to reject H0

Step 4: Conclusion
◦ If we reject H0, we conclude σ1² ≠ σ2²
◦ If we fail to reject H0, we conclude σ1² = σ2²
EXAMPLE 8
An analysis is done to find out whether a new drug lowers blood pressure compared to the conventional treatment. Patients with high blood pressure were randomly assigned to two groups, a placebo group and a treatment group. The placebo group received the conventional treatment while the treatment group received a new drug that is expected to lower blood pressure. After treatment for a couple of months, the data were recorded and analyzed using SPSS. Assume that the population variances are equal.
[SPSS output table. Annotations on the output: p-value for H0: σ1² = σ2²; p-value for H0: μ1 = μ2; x̄1 − x̄2; and the standard error differences $s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$ and $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$]
Let 𝜇1 = Mean blood pressure for treatment group
𝜇2 = Mean blood pressure for placebo group

a) Find the value of A.

Std. error difference = $s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$ =

b) Test whether there is any evidence to support the claim using the 5% level of significance.
*If there is SPSS output given in the question, it is highly recommended to use p-value method instead of test
statistics when the question did not specify any specific method to be used to conduct the hypothesis testing*
𝐻0 :
Step 1:
𝐻1 :

Step 2: p-value:

Step 3: Decision Rule :

Decision :

Step 4: Conclusion : Therefore,


EXAMPLE 9
Amir and Ranjit work for the same company in Kuala Lumpur. They both live in the same neighborhood in Subang Jaya. Ranjit takes the Klang Valley Expressway to work, while Amir uses the North-South Expressway. Amir claims that his route gets him to work faster. Both Amir and Ranjit recorded the time (in minutes) it took them to reach their office for the last 10 working days. Assume that the data were drawn from a normal distribution with equal variances.

At the 1% level of significance is there sufficient evidence in the samples to support Amir’s claim?
Solution:
EXAMPLE 10
A cigarette manufacturer claims that the average of tar content in the Brand B cigarettes is lower than that of Brand A.
To test the claim, the determinations of tar content, in milligrams, were recorded and the data was analyzed using
SPSS. Assume that the tar content of cigarettes in Brand A and Brand B are normally distributed.
Using the p-value, do the data provide sufficient evidence to indicate that the mean of tar content in the
Brand B cigarettes is lower than that of Brand A at 5% significance level?

𝐻0 :
Step 1:
𝐻1 :

Step 2: p-value = 0.270/2 = 0.135

Step 3: Decision Rule :

Decision :

Step 4: Conclusion : Therefore, there is ______________________________ to indicate that the mean tar content in the Brand B cigarettes is lower than that of Brand A.
EXAMPLE 11 – JULY’17 Q.5
A researcher wants to know if there is a significant difference in the mean enrolment of those specializing in research
and primary care. The SPSS outputs for the statistics of enrolments and the independent samples test of those
specializing in research and primary care are given below.
The enrolments of those specializing in research and primary care are assumed to be normally
distributed.

Let 𝜇1 = Mean enrolment of those specializing in research


𝜇2 = Mean enrolment of those specializing in primary care

a) Based on p-value in the Levene’s Test, test the equality of variances in this study. Use 𝜶 = 𝟎. 𝟎𝟓.

b) State the null and alternative hypotheses for the mean test.

c) Using p-value, justify there is a significant difference in the mean enrolment of those
specializing in research and primary care, at 5% significance level.

(8 marks)
SMALL SAMPLE
n1 OR n2 < 30
&
UNEQUAL VARIANCES ASSUMED
(σ1² ≠ σ2²)
EXAMPLE 12
The breaking strengths of 11 bundles of wool fibres have a sample mean of 436.5 and a sample standard deviation of 11.90. In addition, the breaking strengths of another 12 bundles of synthetic fibres have a sample mean of 452.8 and a sample standard deviation of 3.61. Assume the breaking strengths of the two populations are normally distributed with unequal variances. Test at the 5% level of significance whether the mean breaking strength for wool fibres is less than that of synthetic fibres.

Solution: Let 𝜇1 = Population mean breaking strength for wools fibres


𝜇2 = Population mean breaking strength for synthetic fibres

So, 𝑛1 = 11 𝑛2 = 12
𝑥ҧ1 = 436.5 𝑥ҧ2 = 452.8
𝑠1 = 11.90 𝑠2 = 3.61
Step 1: H0:
H1:

Step 2: Test Statistics : $t_{cal} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} =$

Step 3: Critical Value : $-t_{\alpha,df} = -t_{0.05,11} =$

Step 4: Decision Rule : Reject H0 if $t_{cal} < -t_{\alpha,df}$

Decision :

Step 5: Conclusion : Therefore, there is ______________________________ to conclude that the mean breaking strength for wool fibres is less than that of synthetic fibres.
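The blanks in this example can be checked with the sketch below. Two points are assumptions rather than part of the slide: the critical value 1.796 is the standard table value of t with 11 degrees of freedom at α = 0.05, and the Welch-Satterthwaite formula is shown only because, rounded down, it happens to agree with the df = 11 used in Step 3. The slide does not say how its degrees of freedom were obtained.

```python
from math import floor, sqrt

n1, n2 = 11, 12
x1, x2 = 436.5, 452.8
s1, s2 = 11.90, 3.61

v1, v2 = s1**2 / n1, s2**2 / n2  # variance of each sample mean
t_cal = ((x1 - x2) - 0) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation for the degrees of freedom (assumption).
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Assumption: tabulated t(0.05, 11), not a value given in the example.
t_crit = 1.796

print(round(t_cal, 2))   # -4.36
print(floor(df_welch))   # 11
print("Reject H0" if t_cal < -t_crit else "Fail to reject H0")  # Reject H0
```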
EXAMPLE 13
A set of facilitation tools to help with data analysis for problem solving is being developed by a group of
statisticians at UiTM. In order to test effectiveness of these tools, a group of research officers were asked
to analyze and produce a built-in report for a set of data on the computer. Twelve equally capable
research officers were randomly selected and six were randomly assigned a standard procedure to
complete the task. The other six were asked to do the task using the developed facilitation tools. The
response measured was the time to completion (in minutes). The output of statistical analysis is shown in
the following tables.
At the 5% significance level, can it be concluded that the mean completion time for the standard procedure is more than that for the facilitation tools?

Step 1: H0:
H1:

Step 2: p-value = 0.000/2 = 0.000

Step 3: Decision Rule :

Decision :

Step 4: Conclusion : Therefore, there is ______________________________ to conclude that the mean completion time for the standard procedure is more than that for the facilitation tools.
4.2.3
DEPENDENT SAMPLE
Steps in Hypothesis Testing for Two Population Means
- DEPENDENT SAMPLE (μd) -

By using Test Statistics
STEP 1: State the null and alternative hypotheses
H0: μd = 0
H1: μd ≠ 0 or μd > 0 or μd < 0
STEP 2: Calculate the test statistic, tcal
STEP 3: Determine the critical value (use Statistical Table: Table 7)
STEP 4: Decision rule: _______________
Decision: “Reject H0” or “Fail to Reject H0”
STEP 5: Conclusion: Therefore, there is enough evidence or not enough evidence to conclude that …

By using P-Value
STEP 1: State the null and alternative hypotheses
H0: μd = 0
H1: μd ≠ 0 or μd > 0 or μd < 0
STEP 2: Determine the p-value (column labelled “Sig.” or “Sig. 2-tailed”)
STEP 3: Decision rule: Reject H0 if p-value ≤ α
Decision: “Reject H0” or “Fail to Reject H0”
STEP 4: Conclusion: Therefore, there is enough evidence or not enough evidence to conclude that …
EXAMPLE 14
Many engineering students have problems with data analysis using statistical software. A professor who teaches statistics for an engineering course offered a two-day workshop on this topic. The following table gives the test scores of seven engineering students before and after they attended the workshop. Test at the 5% significance level whether attending the workshop increases the test scores.

[SPSS output table. Annotations on the output: $\bar{d}$; $s_d$; $s_d / \sqrt{n}$; $t_{cal}$; $n - 1$; p-value]

Solution: Let μd = Mean before − Mean after

Test Statistics method
Step 1: H0: μd = 0
H1:
Step 2: Test statistics, $t_{cal} = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}} =$
Step 3: Critical value :
Step 4: Decision Rule :
Decision :
Step 5: Conclusion : Therefore, there is __________________ to conclude that attending the workshop increases the test scores.

P-value method
Step 1: H0: μd = 0
H1:
Step 2: p-value =
Step 3: Decision Rule :
Decision :
Step 4: Conclusion : Therefore, there is __________________ to conclude that attending the workshop increases the test scores.
EXAMPLE 15 – DEC’18 Q.4
The manager of an insurance company hired a marketing executive officer to advise on the best advertising strategies to improve the sales of the insurance policies under the company. In order to investigate whether the sales had improved, the manager recorded the sales of the insurance policies a month before and after the officer was hired. The data were analyzed using SPSS and the result is as follows.

a) Show how the value of test statistics, t is obtained.


b) State the null and alternative hypotheses.
c) At the 5% level of significance, do the data provide sufficient evidence to indicate that the sales had
improved?
Solution: Let 𝜇𝑑 = 𝑆𝑎𝑙𝑒𝑠 𝑏𝑒𝑓𝑜𝑟𝑒 − 𝑆𝑎𝑙𝑒𝑠 𝑎𝑓𝑡𝑒𝑟

a) Test statistics, t cal =

b) H0 : μd = 0

H1 :

c) Test statistics : t cal =

Critical value :

Decision Rule :
Decision :

Conclusion : Therefore, there is ______________________________ to conclude that the sales had improved.
4.3
ANALYSIS OF VARIANCE
( ANOVA )
WHAT IS ANOVA?
Analysis of Variance (ANOVA) is a statistical technique that can be used to test for the equality of THREE or MORE POPULATION MEANS. The technique is called "Analysis of Variance" rather than "Analysis of Means" because inferences about means are made by analyzing variance.

ANOVA is used to assess potential differences in ONE DEPENDENT VARIABLE (quantitative variable) across the categories of an INDEPENDENT VARIABLE or FACTOR (nominal variable) having 2 or more categories.

We want to use the sample results to test the following hypotheses:

H0: μ1 = μ2 = μ3 = ⋯ = μk
H1: At least one pair of treatment means differs
DEFINITION OF TERMS
Response Variable / Dependent Variable / Outcome : Variable of interest to be measured in the experiment (describes the measurements, usually on a continuous scale, of the variable of interest)

Factors / Independent Variable : Variable whose effect on the response variable is of interest

Factor Level (k) : Values of the factor utilized in the experiment

Treatment : A specific combination of factor levels

Experimental Unit : The object on which the measurement is taken


EXAMPLE OF TERMS USED
A greeting company wanted to use a coupon offer to increase sales. They developed four different coupon designs and used each design with a number of customers. They took a sample of 8 customers for each design and noted their purchase amount as a result of the coupon. Did the coupons have different effects on sales?
Factor Level: 4 Designs
Experimental Unit: Customers
Independent Var/Treatments: Coupon Designs
Dependent Var: Purchase Amount
Steps in Hypothesis Testing for ANOVA
STEP 1 : State the Null and Alternative Hypotheses

H0: μ1 = μ2 = μ3 = ⋯ = μk
H1: At least one pair of treatment means differs

STEP 2 : ANOVA Table and Test Statistics

Computing formulas:
N = the total number of observations
Σx = the sum of all the observations
Σx² = the sum of squares of all the observations
Ti = total for treatment i (i = 1, …, k)

Total Sum of Squares, $SST = \sum x^2 - CF$ ; where $CF = \dfrac{(\sum x)^2}{N}$

Sum of Squares of Treatments, $SS_{Treatment} = SSR = \dfrac{T_1^2}{n_1} + \dfrac{T_2^2}{n_2} + \cdots + \dfrac{T_k^2}{n_k} - \dfrac{(\sum x)^2}{N} = \sum_{i} \dfrac{T_i^2}{n_i} - CF$

Sum of Squares of Error, $SSE = SST - SSR$

ANOVA table:
Source of Variation | Sum of Squares (SS) | Degrees of Freedom | Mean Square (MS) | Fcal (Test Statistics)
Treatment | SSTreatment (SSR) | k − 1 (v1) | MSTreatment (MSR) = SSR/(k − 1) | Fcal = MSR/MSE
Error | SSE | N − k (v2) | MSE = SSE/(N − k) |
Total | SST | N − 1 | |

STEP 3 : Critical Value (Table 9)

$F_{\alpha, k-1, N-k} = F_{\alpha, v_1, v_2}$

STEP 4 : Decision Rule and Decision

Decision Rule: Reject H0 if $F_{cal} > F_{\alpha, v_1, v_2}$
Decision : We reject H0 or fail to reject H0 …

STEP 5 : Conclusion
There is enough / not enough evidence to conclude that (refer to the question) ………
EXAMPLE 16
Fifteen fourth-grade students were randomly assigned to three groups to experiment with three different
methods of teaching arithmetic. At the end of the semester, the same test was given to all 15 students.
The table gives the scores of students in the three groups.

Method 1 : 48 73 51 65 87 (T1 = 324)
Method 2 : 55 85 70 69 90 (T2 = 369)
Method 3 : 84 68 95 74 67 (T3 = 388)

k = 3, Σx = 324 + 369 + 388 = 1081

At the 1% level of significance, can we reject the null hypothesis that the mean arithmetic scores of all fourth-grade students taught by the three methods are the same?
Solution: Let 𝜇1 = 𝑀𝑒𝑎𝑛 𝑜𝑓 𝑠𝑡𝑢𝑑𝑒𝑛𝑡𝑠′ 𝑎𝑟𝑖𝑡ℎ𝑚𝑒𝑡𝑖𝑐 𝑠𝑐𝑜𝑟𝑒 𝑡𝑎𝑢𝑔ℎ𝑡 𝑏𝑦 𝑀𝑒𝑡ℎ𝑜𝑑 1
𝜇2 = 𝑀𝑒𝑎𝑛 𝑜𝑓 𝑠𝑡𝑢𝑑𝑒𝑛𝑡𝑠′ 𝑎𝑟𝑖𝑡ℎ𝑚𝑒𝑡𝑖𝑐 𝑠𝑐𝑜𝑟𝑒 𝑡𝑎𝑢𝑔ℎ𝑡 𝑏𝑦 𝑀𝑒𝑡ℎ𝑜𝑑 2
𝜇3 = 𝑀𝑒𝑎𝑛 𝑜𝑓 𝑠𝑡𝑢𝑑𝑒𝑛𝑡𝑠′ 𝑎𝑟𝑖𝑡ℎ𝑚𝑒𝑡𝑖𝑐 𝑠𝑐𝑜𝑟𝑒 𝑡𝑎𝑢𝑔ℎ𝑡 𝑏𝑦 𝑀𝑒𝑡ℎ𝑜𝑑 3
STEP 1 : H0: μ1 = μ2 = μ3
H1: At least one pair of treatment means differs

STEP 2 : N = 15, Σx = 1081, Σx² = 80709 (obtained using a calculator)

$CF = \dfrac{(\sum x)^2}{N} = \dfrac{(1081)^2}{15} = 77904.067$

$SST = \sum x^2 - CF = 80709 - 77904.067 = 2804.933$

$SS_{Method} = SSR = \dfrac{T_1^2}{n_1} + \dfrac{T_2^2}{n_2} + \dfrac{T_3^2}{n_3} - CF = \dfrac{324^2}{5} + \dfrac{369^2}{5} + \dfrac{388^2}{5} - CF = 78336.2 - 77904.067 = 432.133$

$SSE = SST - SSR = 2804.933 - 432.133 = 2372.8$

Source of variation | Sum of squares | Degrees of freedom | Mean square | Fcal = MSR/MSE
Method | SSR = 432.133 | k − 1 = 2 | MSR = 432.133/2 = 216.067 | Fcal = 216.067/197.733 = 1.093
Error | SSE = 2372.8 | N − k = 12 | MSE = 2372.8/12 = 197.733 |
Total | SST = 2804.933 | N − 1 = 14 | |

STEP 3: $F_{\alpha, k-1, N-k} = F_{\alpha, v_1, v_2} = F_{0.01, 2, 12} = 6.93$

STEP 4: Decision Rule: Reject H0 if $F_{cal} > F_{\alpha, v_1, v_2}$

Decision : Since $F_{cal} = 1.093 < F_{0.01, 2, 12} = 6.93$, we fail to reject H0

STEP 5: Conclusion : Therefore, there is not enough evidence to reject the null hypothesis that the mean arithmetic scores of all fourth-grade students taught by the three methods are the same.
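The ANOVA quantities in this example can be recomputed from the raw scores with the computing formulas from Step 2. The sketch below is one possible way to organise the arithmetic; the dictionary `groups` and the variable names are illustrative choices, and the critical value 6.93 is copied from Step 3 rather than computed.

```python
groups = {
    "Method 1": [48, 73, 51, 65, 87],
    "Method 2": [55, 85, 70, 69, 90],
    "Method 3": [84, 68, 95, 74, 67],
}

all_scores = [x for scores in groups.values() for x in scores]
N = len(all_scores)  # total number of observations
k = len(groups)      # number of treatments

cf = sum(all_scores) ** 2 / N                   # correction factor
sst = sum(x * x for x in all_scores) - cf       # total sum of squares
ssr = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cf  # treatments
sse = sst - ssr                                 # error sum of squares

msr = ssr / (k - 1)
mse = sse / (N - k)
f_cal = msr / mse

f_crit = 6.93  # F(0.01, 2, 12), taken from Step 3 of the example

print(round(sst, 3), round(ssr, 3), round(sse, 3))  # 2804.933 432.133 2372.8
print(round(f_cal, 3))                               # 1.093
print("Reject H0" if f_cal > f_crit else "Fail to reject H0")  # Fail to reject H0
```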
EXAMPLE 17
Reconsider the previous example (Example 16) and solve the hypothesis test for ANOVA by using the p-value method:
[SPSS ANOVA output table. Annotations on the output: Factor or Treatment row (SSR, MSR); Error row (SSE, MSE); Total (SST); Fcal; p-value]
STEP 1: H0: μ1 = μ2 = μ3
H1: At least one pair of treatment means differs

STEP 2: p-value = 0.366

STEP 3: Decision Rule: Reject H0 if p-value ≤ α

Decision : Since p-value = 0.366 > α = 0.01, we fail to reject H0

STEP 4: Conclusion : Therefore, there is not enough evidence to reject the null hypothesis that the mean arithmetic scores of all fourth-grade students taught by the three methods are the same.
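The p-value of 0.366 can be reproduced without statistical software in this particular case. The sketch below relies on a property of the F distribution that is not stated in the slides: when the numerator degrees of freedom equal 2, the upper-tail probability has the closed form (1 + 2F/v2)^(−v2/2). It does not apply to other values of v1.

```python
# Mean squares from the ANOVA table of Example 16.
msr, mse = 216.067, 197.733
v1, v2 = 2, 12
alpha = 0.01

f_cal = msr / mse

# Closed form for P(F > f_cal), valid only because v1 == 2 (assumption
# drawn from the F distribution, not from the course notes).
p_value = (1 + v1 * f_cal / v2) ** (-v2 / 2)

print(round(p_value, 3))  # 0.366
print("Reject H0" if p_value <= alpha else "Fail to reject H0")  # Fail to reject H0
```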
EXAMPLE 18
A group of researchers is studying 3 new materials for teeth filling. The hardening indexes (in percentage)
are shown in the following table:

The hardening indexes are normally distributed with the same variance. Do the data indicate that the
mean indexes are the same for the three types of materials? Use 5% significance level.
Solution: Let 𝜇𝐴 = 𝑀𝑒𝑎𝑛 ℎ𝑎𝑟𝑑𝑒𝑛𝑖𝑛𝑔 𝑖𝑛𝑑𝑒𝑥𝑒𝑠 𝑓𝑜𝑟 𝑀𝑎𝑡𝑒𝑟𝑖𝑎𝑙 𝐴
𝜇𝐵 = 𝑀𝑒𝑎𝑛 ℎ𝑎𝑟𝑑𝑒𝑛𝑖𝑛𝑔 𝑖𝑛𝑑𝑒𝑥𝑒𝑠 𝑓𝑜𝑟 𝑀𝑎𝑡𝑒𝑟𝑖𝑎𝑙 𝐵
𝜇𝐶 = 𝑀𝑒𝑎𝑛 ℎ𝑎𝑟𝑑𝑒𝑛𝑖𝑛𝑔 𝑖𝑛𝑑𝑒𝑥𝑒𝑠 𝑓𝑜𝑟 𝑀𝑎𝑡𝑒𝑟𝑖𝑎𝑙 𝐶

STEP 1 : H0: μA = μB = μC
H1: At least one pair of treatment means differs

STEP 2 : N = 15, Σx = _____________, Σx² = _______________

$CF = \dfrac{(\sum x)^2}{N} =$

$SST = \sum x^2 - CF =$

$SS_{Material} = SSR = \dfrac{T_A^2}{n_A} + \dfrac{T_B^2}{n_B} + \dfrac{T_C^2}{n_C} - CF =$

$SSE = SST - SSR =$

STEP 3: $F_{\alpha, k-1, N-k} = F_{\alpha, v_1, v_2} =$

STEP 4: Decision Rule: Reject 𝐻0 𝑖𝑓 𝐹𝑐𝑎𝑙 > 𝐹𝛼,𝑣1,𝑣2


Decision :

STEP 5: Conclusion : Therefore, there is _________________________ to conclude that the data does
indicate the mean indexes are the same for the three types of materials.
EXAMPLE 19 – DEC’19 Q.1
A team of researchers is interested in comparing the yield (in kilograms) of four different varieties (A, B, C, D) of
rambutan tree in the Kg Hutan Kampung orchard. The researchers obtained a random sample of four trees of
each variety from the same orchard. The data were analyzed using IBM SPSS Statistics. The results are given
below.

a) Compute the values of R, S, T and U.


Solution: R=
S=
T=
U=
b) State the null and alternative hypotheses for this study.

𝐻0 :

𝐻1 :

c) Based on the p-value, test at the 5% level of significance whether the mean yields differ among the four
varieties.

𝑝 − 𝑣𝑎𝑙𝑢𝑒 =

Decision Rule: Reject 𝐻0 𝑖𝑓 𝑝 − 𝑣𝑎𝑙𝑢𝑒 ≤ 𝛼

Decision :

Conclusion :
4.4 TEST OF INDEPENDENCE
The test of independence is used to analyze categorical variables. In particular, it
tests whether two categorical variables are independent.
(e.g.: Are GENDER and BRAND OF CAR related, or
independent of each other?)

Test the null hypothesis that the two characteristics of the elements of a
given population are not related.

We want to use the sample results to test the following hypotheses:


$H_0$ : The two variables are independent of / not associated with / not related to each other
$H_1$ : The two variables are dependent on / associated with / related to each other
Steps in Hypothesis Testing for “Test of Independence”
STEP 1 : State the Null and Alternative Hypotheses
$H_0$ : The two variables are independent of / not associated with / not related to each other
$H_1$ : The two variables are dependent on / associated with / related to each other

STEP 2 : Test Statistic

$\chi^2_{cal} = \sum \frac{(O-E)^2}{E}$ ; where O = the observed frequencies (actual data)
E = the expected frequency for a cell, $E = \frac{(\text{row total}) \times (\text{column total})}{\text{sample size}}$

STEP 3 : Critical Value (Table 8)
$\chi^2_{\alpha,(r-1)(c-1)}$ ; where r = total number of rows
c = total number of columns

STEP 4 : Decision Rule and Decision

Decision Rule: Reject $H_0$ if $\chi^2_{cal} > \chi^2_{\alpha,(r-1)(c-1)}$
Decision : We reject $H_0$ or fail to reject $H_0$ …

STEP 5 : Conclusion
There is enough / not enough evidence to conclude that (refer to the question) ………
How to Calculate the Expected Frequencies (E)?

Hair colour   Temper: Vile     Temper: Mild      Total
Red           40 ; E = 27      20 ; E = 33       60
Brown         80 ; E = 81      100 ; E = 99      180
Black         60 ; E = 72      100 ; E = 88      160
Total         180              220               400

Total no. of rows, r = 3 ; total no. of columns, c = 2
Actual data, O = 40, 80, 60, 20, 100, 100

Since there are 6 observed values (O), there are 6 expected frequencies (E): each O
has its own E value.

Take the first observed value, O = 40 (Red, Vile). Its expected frequency is
$E = \frac{(\text{row total}) \times (\text{column total})}{\text{sample size}} = \frac{60 \times 180}{400} = 27$

Next, take O = 100 (Black, Mild). Its expected frequency is
$E = \frac{(\text{row total}) \times (\text{column total})}{\text{sample size}} = \frac{160 \times 220}{400} = 88$
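The same rule can be applied to every cell at once. The following is a minimal sketch (not from the slides) that computes all six expected frequencies for the temper and hair colour table; the function name `expected_frequencies` and the list-of-rows layout are illustrative assumptions.

```python
def expected_frequencies(observed):
    """Return the table of E = (row total * column total) / sample size."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)                       # sample size
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: Red, Brown, Black; columns: Vile, Mild
observed = [[40, 20], [80, 100], [60, 100]]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Each row of the result lines up with a row of the observed table, so the values can be compared directly with the E entries shown beside each cell.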
EXAMPLE 20
A random sample of 400 people is selected from all the 16-year-olds in a town. The variables recorded
were temper (vile or mild) and hair colour (red, brown or black). The observed frequencies are
displayed in the table below. Test the hypothesis that temper and hair colour are independent at the 5%
level of significance.
Hair colour   Temper: Vile   Temper: Mild   Total
Red           40             20             60
Brown         80             100            180
Black         60             100            160
Total         180            220            400
STEP 1 : $H_0$ : Temper and hair colour are independent of each other
$H_1$ : Temper and hair colour are dependent on each other

STEP 2 : Test Statistics,

$\chi^2_{cal} = \sum \frac{(O-E)^2}{E} = \frac{(40-27)^2}{27} + \frac{(80-81)^2}{81} + \frac{(60-72)^2}{72} + \frac{(20-33)^2}{33} + \frac{(100-99)^2}{99} + \frac{(100-88)^2}{88} = 15.039$

STEP 3 : Critical Value,
$\chi^2_{\alpha,(r-1)(c-1)} = \chi^2_{0.05,(3-1)(2-1)} = \chi^2_{0.05,2} = 5.991$

STEP 4 : Decision Rule: Reject $H_0$ if $\chi^2_{cal} > \chi^2_{\alpha,(r-1)(c-1)}$
Decision : Since $\chi^2_{cal} = 15.039 > \chi^2_{0.05,2} = 5.991$, thus we reject $H_0$.

STEP 5 : Conclusion:
Since $H_0$ is rejected, there is enough evidence to conclude that temper and hair colour are not independent (they are related).
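The whole of Example 20 can be reproduced in a few lines. This is a sketch under stated assumptions rather than the method used in the slides: the function name `chi_square_statistic` is illustrative, and the critical value is computed with the closed form that holds only for 2 degrees of freedom, where P(χ² > x) = exp(−x/2) and therefore the critical value is −2 ln(α). For other degrees of freedom the table (Table 8) is still needed.

```python
import math


def chi_square_statistic(observed):
    """Chi-square statistic: sum of (O - E)^2 / E over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected frequency
            stat += (o - e) ** 2 / e
    return stat


observed = [[40, 20], [80, 100], [60, 100]]   # rows: Red, Brown, Black
chi_cal = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
# With df = 2 the upper tail is exp(-x / 2), so the critical value is -2 ln(alpha).
critical = -2 * math.log(0.05)
print(f"chi-square = {chi_cal:.3f}, df = {df}, critical value = {critical:.3f}")
print("Reject H0" if chi_cal > critical else "Fail to reject H0")
```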
EXAMPLE 21 – JUNE’19 Q.4
A manager at Company Brilliant wishes to determine whether the employees’ work satisfaction is
related to their respective department. The results obtained are shown below.

[Crosstabulation and chi-square test output. Annotated labels: actual data, O; expected frequency, E; $\chi^2_{cal}$; p-value.]

$df = (r-1)(c-1) = (2-1)(4-1) = 3$, not 2
a) Using an appropriate formula, compute the values of D and E.

Solution:

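$E = \frac{(\text{row total}) \times (\text{column total})}{\text{sample size}} = \frac{30 \times 57}{93} = 18.4$

$D = \chi^2_{cal} = \sum \frac{(O-E)^2}{E} = \frac{(12-12.9)^2}{12.9} + \frac{(38-38.6)^2}{38.6} + \frac{(5-5.4)^2}{5.4} + \frac{(8-6.1)^2}{6.1} + \frac{(7-6.1)^2}{6.1} + \frac{(19-18.4)^2}{18.4} + \frac{(3-2.6)^2}{2.6} + \frac{(1-2.9)^2}{2.9} = 2.152$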

b) State the null and alternative hypotheses for this study.

𝐻0 : Employees’ work satisfaction is not related to their respective department


𝐻1 : Employees’ work satisfaction is related to their respective department
Using p-value method

c) Based on the p-value, is there sufficient evidence to conclude that employees’ work satisfaction is
related to their respective department? Use 𝛼 = 0.05.

STEP 1 : 𝐻0 : Employees’ work satisfaction is not related to their respective department


𝐻1 : Employees’ work satisfaction is related to their respective department

STEP 2 : 𝑝 − 𝑣𝑎𝑙𝑢𝑒 = 0.541

STEP 3 : Decision Rule : Reject 𝐻0 if 𝑝 − 𝑣𝑎𝑙𝑢𝑒 ≤ 𝛼


Decision : Since 𝑝 − 𝑣𝑎𝑙𝑢𝑒 = 0.541 > 𝛼 = 0.05, thus we fail to reject 𝐻0 .

STEP 4 : Conclusion:
There is not enough evidence to conclude that employees’ work satisfaction is related to
their respective department.
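The value of D and the reported p-value can be cross-checked from the (O, E) pairs listed in part (a). This sketch is an addition, not part of the exam solution: it uses the rounded expected frequencies shown above, and the helper `chi2_sf_df3` is a hypothetical name for the closed-form upper tail of the chi-square distribution that is valid only for 3 degrees of freedom, P(χ² > x) = erfc(√(x/2)) + √(2x/π)·exp(−x/2).

```python
import math

# (O, E) pairs as listed in the solution to part (a)
cells = [(12, 12.9), (38, 38.6), (5, 5.4), (8, 6.1),
         (7, 6.1), (19, 18.4), (3, 2.6), (1, 2.9)]
chi_cal = sum((o - e) ** 2 / e for o, e in cells)


def chi2_sf_df3(x):
    """Upper-tail probability P(chi-square > x) for 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


alpha = 0.05
p_value = chi2_sf_df3(chi_cal)
print(f"chi-square = {chi_cal:.3f}, p-value = {p_value:.3f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

Because the expected frequencies are rounded to one decimal place, small differences in the third decimal from the software output are possible.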
EXAMPLE 22 – JULY’17 Q.6
The management of a train company wants to study if there is any association between the train station’s
crowd and the train delay in Klang Valley. Hence, the number of trains on time and the number of trains
that were late were observed at three different stations in Klang Valley. The crosstabulation of the
observation and the chi-square test are displayed below.
a) Calculate the values of A and B.

b) State the hypotheses for this study.

c) Based on the p-value, can we conclude that there is an association between station’s crowd and train
delay?
END OF CHAPTER 4
