
Copy of STA404_Students Module_v4

The document provides examples and explanations for constructing confidence intervals for various statistical scenarios, including the mean breaking strengths of wool fibers and the percentage of impurities in pharmaceutical materials. It also discusses the confidence intervals for the difference between two population means, covering both independent and dependent samples, as well as cases where variances are known or unknown. Additionally, it includes examples of real-world applications, such as comparing gas mileage and tensile strength of different products.

STA404 STATISTICS FOR BUSINESS AND SOCIAL SCIENCES

Example 3.4: The breaking strengths of 11 bundles of wool fibers have a sample mean 436.5
and a sample standard deviation of 11.90. Assume the breaking strengths of the populations
are normally distributed. Construct a 90% confidence interval for the mean breaking strength
of wool fibers.
Note: if $\sigma$ is unknown, use the $t$ table; if $\sigma$ is known, use the $z$ table.

$n = 11$ bundles, $\bar{x} = 436.5$, $s = 11.90$

Step 1: Find $\alpha$
$\alpha = 1 - 0.90 = 0.10$, $\alpha/2 = 0.05$

Step 2: Find $t_{\alpha/2,\,n-1}$
$t_{0.05,\,11-1} = t_{0.05,\,10} = 1.812$

Step 3: Find CI
$\mu = \bar{x} \pm t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}}$
$\mu = 436.5 \pm 1.812\left(\dfrac{11.90}{\sqrt{11}}\right)$
$\mu = 436.5 \pm 6.5014$
$429.9986 < \mu < 443.0014$

Step 4: Conclusion
Therefore, we are 90% confident that the average breaking strength
of wool fibre is between 429.9986 and 443.0014.
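The steps of Example 3.4 can be checked with a short script. This is a minimal sketch, not part of the module: the function name `t_interval` is made up for illustration, and the critical value 1.812 is typed in from the t table rather than computed.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Example 3.4: n = 11 bundles, t(0.05, 10) = 1.812 read from the t table
lower, upper = t_interval(436.5, 11.90, 11, 1.812)
print(round(lower, 4), round(upper, 4))
```

The same function covers Example 3.5 if it is called with the standard deviation `math.sqrt(0.273)` and the table value 1.860.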

Example 3.5: A pharmaceutical manufacturer purchases a particular material from a supplier.


The manufacturer selects nine shipments from the supplier and measures the percentage of
impurities in the raw material from each shipment. The sample mean and variance are
$\bar{x} = 1.89$ and $s^2 = 0.273$.
Find a 90% confidence interval for $\mu$ and interpret your results.
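$\bar{x} = 1.89$, $s^2 = 0.273$, $n = 9$ shipments

Step 1: Find $\alpha$
$\alpha = 1 - 0.90 = 0.10$, $\alpha/2 = 0.05$

Step 2: Find $t_{\alpha/2,\,n-1}$
$t_{0.05,\,9-1} = t_{0.05,\,8} = 1.860$

Step 3: Find CI
$\mu = \bar{x} \pm t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}}$
$\mu = 1.89 \pm 1.860\left(\dfrac{\sqrt{0.273}}{\sqrt{9}}\right)$
$\mu = 1.89 \pm 0.3239$
$1.5661 < \mu < 2.2139$

Step 4: Conclusion (the conclusion must include the unit of measurement)
We are 90% confident that the mean percentage of impurities is between 1.5661% and 2.2139%.

Note on the t table: if the required df is not listed (for example 35), take the next lower listed value (30), that is, round down.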

NZZ & NHNMS, UiTM SHAH ALAM

Example 3.6: A sample of 15 bulbs was tested and the lengths of life (in hours) are as follows:

4300 4302 4415 4483 4301 4446 4478 4319
3985 4483 4377 4401 4346 4261 4353

One-Sample Statistics

         N    Mean      Std. Deviation   Std. Error Mean
hours    15   4350.00   124.748          32.210

One-Sample Test (Test Value = 0)

                                                            98% Confidence Interval
                                                            of the Difference
         t         df   Sig. (2-tailed)   Mean Difference   Lower      Upper
hours    135.052   14   .000              4350.000          4265.47    4434.53

Use the tables above to answer the following questions.

a) Show that the mean length of life of the bulbs is 4350 hours.
b) Show that the 98% confidence interval of the true mean is the same as in the computer
output.
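Both parts of Example 3.6 can be verified from the raw data. The sketch below uses only the Python standard library; the critical value 2.624 for $t_{0.01,\,14}$ is an assumed table value, so the bounds agree with the computer output only up to rounding.

```python
import math
import statistics

hours = [4300, 4302, 4415, 4483, 4301, 4446, 4478, 4319,
         3985, 4483, 4377, 4401, 4346, 4261, 4353]

n = len(hours)
mean = statistics.mean(hours)      # part a): sample mean
sd = statistics.stdev(hours)       # sample standard deviation (divides by n - 1)
se = sd / math.sqrt(n)             # standard error of the mean

t_crit = 2.624                     # assumed table value for t(0.01, 14)
ci = (mean - t_crit * se, mean + t_crit * se)   # part b): 98% interval

print(mean, round(sd, 3), round(se, 3))
print(round(ci[0], 2), round(ci[1], 2))
```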


3.7 CONFIDENCE INTERVAL FOR THE DIFFERENCE BETWEEN TWO POPULATION MEANS

In this section we discuss estimation procedures for the difference between two population
means. There are two types of interval estimation for the difference between two means,
depending on whether the samples are independent or dependent.

Example:

1) We may want to find an interval estimate of the difference between the mean lengths
of insects measured by two different microscopes.
2) We may want to find an interval estimate of the difference between the mean pH of
rainfall in two different areas.

Let $\mu_1$ and $\mu_2$ be the means of the first and second populations respectively. We want to find the
confidence interval for the difference between the two population means, $\mu_1 - \mu_2$. Then $\bar{x}_1 - \bar{x}_2$
is the sample statistic used to construct the confidence interval.

3.7.1 Confidence Interval for Difference between Two Population Means -Independent
samples

Two samples are independent if they are drawn from two different populations and the elements
of the first sample have no relationship to the elements of the second sample.

Example: To estimate the difference between the weights of male and female students, we
select two samples, one from the population of male students and another from the population
of female students. These two samples are independent because they are chosen from two
different populations, and the samples have no effect on each other.

• Levene's Test for the equality of variances

Levene's test is used to test whether samples have equal variances. Equal variances across
samples is called homogeneity of variance. Some statistical tests, for example the analysis of
variance, assume that variances are equal across groups or samples. Levene's test can
be used to verify that assumption.

$H_0: \sigma_1^2 = \sigma_2^2$ (Assume equal variances)
$H_1: \sigma_1^2 \neq \sigma_2^2$ (Assume unequal variances)

Decision:
If p-value (Sig.) ≤ α, reject $H_0$. Hence, we assume unequal variances.
If p-value (Sig.) > α, do not reject $H_0$. Hence, we assume equal variances.

Note: if $\sigma$ is known, use $z$ for the interval for $\mu_1 - \mu_2$. If $\sigma$ is unknown, first find out whether the variances are equal or not, because the formula differs. Whether the variances are equal can be known from:
1) a statement in the question,
2) the SPSS output (Levene's test), or
3) calculation.

a) Confidence Interval for the Difference between Two Population Means (variances are
known) - for both Small and Large Sample Sizes

Assumptions:

i. Either $n \geq 30$ or $n < 30$
ii. Populations are normally distributed
iii. Population variances $\sigma_1^2$ and $\sigma_2^2$ are known

The $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Example 3.7: An experiment was conducted in which two types of engines, A and B, were
compared. Gas mileage in miles per gallon was measured. 75 experiments were conducted
using engine type A and 50 experiments were done for engine type B. The gasoline used and
other conditions were held constant. The average gas mileage for engine A was 42 miles per
gallon and the average for engine B was 36 miles per gallon. Find a 96% confidence interval
on $\mu_A - \mu_B$, where $\mu_A$ and $\mu_B$ are the population mean gas mileages for engine A and engine B,
respectively. Assume that the population standard deviations are 8 and 6 for engine A and B
respectively.

$n_A = 75$ experiments, $n_B = 50$ experiments
$\bar{x}_A = 42$ miles per gallon, $\bar{x}_B = 36$ miles per gallon
$\sigma_A = 8$ miles per gallon, $\sigma_B = 6$ miles per gallon

Step 1: Find $\alpha$
$\alpha = 1 - 0.96 = 0.04$

Step 2: Find $z_{\alpha/2}$ (from statistical Table 5)
$z_{0.04/2} = z_{0.02} = 2.0537$

Step 3: Find CI
$CI = (\bar{x}_A - \bar{x}_B) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_A^2}{n_A} + \dfrac{\sigma_B^2}{n_B}}$
$CI = (42 - 36) \pm 2.0537\sqrt{\dfrac{8^2}{75} + \dfrac{6^2}{50}}$
$CI = 6 \pm 2.5760$
$3.4240 < \mu_A - \mu_B < 8.5760$, or $CI = (3.4240, 8.5760)$

Step 4: Conclusion
Therefore, the 96% confidence interval for the difference between the mean gas mileage of engine A
and engine B is between 3.4240 and 8.5760 miles per gallon.

b) From the answer in a), can we conclude that Engine A and B have equal performance? Explain
your answer.

No, because 0 is not in the interval. The whole interval is positive, so the mean gas mileage of engine A is higher than that of engine B.
Rule: the means are equal if 0 is in the interval; the means are unequal if 0 is not in the interval.
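The calculation in Example 3.7 can be reproduced as follows. This is a sketch under stated assumptions: the function name `z_interval_two_means` is hypothetical, and the critical value is computed with `statistics.NormalDist().inv_cdf` from the Python standard library instead of being read from Table 5.

```python
import math
from statistics import NormalDist

def z_interval_two_means(x1, x2, sigma1, sigma2, n1, n2, conf):
    """Interval for mu1 - mu2 when both population variances are known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# Example 3.7: engines A and B, 96% confidence
lo, hi = z_interval_two_means(42, 36, 8, 6, 75, 50, 0.96)
print(round(lo, 4), round(hi, 4))
equal_performance = lo <= 0 <= hi   # part b): is 0 inside the interval?
print(equal_performance)
```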


b) Confidence Interval for the Difference between Two Population Means (variances are
unknown) - for Large Sample Size

Assumptions:

i. Both sample sizes are more than 30
ii. Populations are normally distributed
iii. Population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown

The $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Example 3.8: Two kinds of thread are being compared for strength. Fifty pieces of each type
of thread are tested under similar conditions. Brand A had an average tensile strength of 78.3
kilograms with a standard deviation of 5.6 kilograms, while Brand B had an average tensile
strength of 87.2 kilograms with a standard deviation of 6.3 kilograms. Construct a 95%
confidence interval for the difference of the population means.
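Step 1: Find $\alpha$
$\alpha = 1 - 0.95 = 0.05$

Step 2: Find the critical value
Both samples are large ($n_A = n_B = 50$), so the interval for this case uses $z_{\alpha/2} = z_{0.025} = 1.96$.

Note on Table 7 ($t_{\alpha/2,\,df}$): when the required df is not listed in the table, round the df down. Example of a table value: $t_{0.025,\,11} = 2.201$.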
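The handwritten solution for Example 3.8 stops before the interval is computed. A possible completion, assuming the large-sample formula of part b) with $z_{0.025} = 1.96$ and taking brand A as population 1 (the choice of order is an assumption), is:

```latex
\begin{aligned}
(\bar{x}_A - \bar{x}_B) \pm z_{0.025}\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}
  &= (78.3 - 87.2) \pm 1.96\sqrt{\frac{5.6^2}{50} + \frac{6.3^2}{50}} \\
  &= -8.9 \pm 1.96\sqrt{1.421} \\
  &= -8.9 \pm 2.3364 \\
  &\Rightarrow\; -11.2364 < \mu_A - \mu_B < -6.5636
\end{aligned}
```

Since the whole interval is negative, 0 is not in the interval, which suggests that the mean tensile strength of brand B is higher than that of brand A.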



c) Confidence Interval for the Difference between Two Population Means (variances are
unknown and equal) - for small samples

Assumptions:

i. One or both sample sizes are less than 30
ii. Populations are normally distributed
iii. Population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown but the variances are assumed to be
equal

The $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \quad\text{where}\quad df = n_1 + n_2 - 2$$

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Example 3.9: An insurance company wants to know if the average speed at which men drive
cars is greater than that of women drivers. The company took a random sample of 26 cars
driven by men on a highway and found the mean speed to be 72 miles per hour with a standard
deviation of 2.2 miles per hour. Another sample of 16 cars driven by women on the same
highway gave a mean speed of 68 miles per hour with standard deviation of 2.5 miles per
hour. Assume that the speeds at which all men and all women drive cars on this highway are
both normally distributed with the same population standard deviation. Construct a 98%
confidence interval for the difference between the mean speeds of cars driven by all men and
all women on this highway. (Ans: $s_p = 2.317$, (2.216, 5.784))

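$n_{male} = 26$ cars, $n_{female} = 16$ cars
$\bar{x}_{male} = 72$ mph, $\bar{x}_{female} = 68$ mph
$s_{male} = 2.2$ mph, $s_{female} = 2.5$ mph

Note: for equal variances, $df = n_1 + n_2 - 2$; for unequal variances, $df = v$ must be computed from the formula.

Step 1: Find $\alpha$
$\alpha = 1 - 0.98 = 0.02$

Step 2: Find $t_{\alpha/2,\,df}$ (Table 7)
$df = n_1 + n_2 - 2 = 26 + 16 - 2 = 40$
$t_{0.02/2,\,df} = t_{0.01,\,40} = 2.423$

Step 3: Find $s_p$
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{(26 - 1)2.2^2 + (16 - 1)2.5^2}{26 + 16 - 2}} = 2.3171$

Step 4: Find CI
$CI = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\; s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
$CI = (72 - 68) \pm 2.423\,(2.3171)\sqrt{\dfrac{1}{26} + \dfrac{1}{16}}$
$CI = 4 \pm 1.7839$
$CI = (2.2161, 5.7839)$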
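The pooled calculation of Example 3.9 can be sketched in a few lines. The function names `pooled_sd` and `pooled_t_interval` are illustrative only, and the critical value 2.423 for $t_{0.01,\,40}$ is typed in from the t table.

```python
import math

def pooled_sd(s1, s2, n1, n2):
    """Pooled standard deviation s_p for two samples with equal variances."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """Interval for mu1 - mu2 with df = n1 + n2 - 2 (t_crit from the table)."""
    sp = pooled_sd(s1, s2, n1, n2)
    margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# Example 3.9: men (sample 1) versus women (sample 2), 98% confidence
sp = pooled_sd(2.2, 2.5, 26, 16)
lo, hi = pooled_t_interval(72, 68, 2.2, 2.5, 26, 16, 2.423)
print(round(sp, 4), round(lo, 3), round(hi, 3))
```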


Example 3.10: The manufacturer of a small battery-powered tape recorder decides to include


four alkaline batteries with its product. Two battery suppliers are being considered; each has
its own brand (brand 1 and brand 2). The supervising inspector of incoming quality wants to
know if the average lifetimes of two brands are the same. A sample experiment is conducted:
each of ten batteries (five of each brand) is connected to a test device that places a small
drain on the battery power and records the battery lifetimes. The following results (in hours) are
obtained:

Group Statistics

        Brand    N   Mean    Std. Deviation   Std. Error Mean
hours   brand1   5   44.20   5.263            2.354
        brand2   5   31.60   4.159            1.860

Independent Samples Test

                                      Levene's Test for
                                      Equality of Variances   t-test for Equality of Means
                                                                                                                                 95% Confidence Interval
                                                                                                                                 of the Difference
                                      F      Sig.             t       df      Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower   Upper
hours   Equal variances assumed       .605   .459             4.200   8       .003              12.600            3.000                   5.682   19.518
        Equal variances not assumed                           4.200   7.594   .003              12.600            3.000                   5.617   19.583

Note: if the variances are equal, read the "Equal variances assumed" row; if they are not equal, read the "Equal variances not assumed" row.

a) Based on the p-value in the Levene’s Test, test the equality of variances in this study. Use
α = 0.05
b) State the 95% confidence interval on the differences between the average lifetimes of the
two brands.
c) Based on the confidence interval, can we conclude that the average lifetimes of the two
brands are equal?
a) Identify whether brand 1 and brand 2 have equal variances.

Step 1: State the hypotheses
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$

Step 2: Find α
α = 0.05

Step 3: Find the p-value (refer to Sig. in the table)
p-value = 0.459

Step 4: Decision rule
Reject $H_0$ if p-value ≤ α.
Since p-value = 0.459 > α = 0.05, fail to reject $H_0$.

Step 5: Conclusion
Brand 1 and brand 2 have equal variances.
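The way the Levene's test result decides which row of the output to read can be written as a small helper. This is only a sketch: the function name `choose_interval` and the row labels are illustrative, and the numbers are copied from the output of Example 3.10.

```python
def choose_interval(levene_p, alpha, ci_equal, ci_unequal):
    """Pick the confidence-interval row according to Levene's test."""
    if levene_p <= alpha:
        # Reject H0: variances are assumed unequal
        return "Equal variances not assumed", ci_unequal
    # Fail to reject H0: variances are assumed equal
    return "Equal variances assumed", ci_equal

# Example 3.10: Sig. = 0.459 for Levene's test, alpha = 0.05
row, ci = choose_interval(0.459, 0.05, (5.682, 19.518), (5.617, 19.583))
means_equal = ci[0] <= 0 <= ci[1]   # part c): is 0 inside the interval?
print(row, ci, means_equal)
```

For Example 3.10 the helper selects the first row, and because 0 lies outside (5.682, 19.518) the average lifetimes of the two brands would be judged not equal.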

d) Confidence Interval for the Difference between Two Population Means (variances are
unknown and unequal) - for small samples

Assumptions:

i. One or both sample sizes are less than 30
ii. Populations are normally distributed
iii. Population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown but the variances are assumed to be
different

The $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \quad\text{where}\quad df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

Example 3.11: The breaking strengths of 11 bundles of wool fibres have a sample mean of 436.5
and a sample standard deviation of 11.90. In addition, the breaking strengths of another 12
bundles of synthetic fibres have a sample mean of 452.8 and a sample standard deviation of 3.61.
Assume the breaking strengths of the two populations are normally distributed with unequal
variances. Construct a 95% confidence interval on the mean difference of breaking strengths
between wool fibres and synthetic fibres. Explain your answer. (Ans: (-24.6245, -7.9755) OR
(-24.5235, -8.0764))
Step 1: Find $\alpha$
$\alpha = 1 - 0.95 = 0.05$

Step 2: Find $t_{\alpha/2,\,df}$ (Table 7, with df from Step 3)
$t_{0.05/2,\,11} = t_{0.025,\,11} = 2.201$

Step 3: Find df
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\dfrac{11.90^2}{11} + \dfrac{3.61^2}{12}\right)^2}{\dfrac{\left(11.90^2/11\right)^2}{11 - 1} + \dfrac{\left(3.61^2/12\right)^2}{12 - 1}} = 11.6828 \approx 11$$

Step 4: Find CI
$CI = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$CI = (436.5 - 452.8) \pm 2.201\sqrt{\dfrac{11.90^2}{11} + \dfrac{3.61^2}{12}}$
$CI = -16.3 \pm 8.2235$
$CI = (-24.5235, -8.0765)$

Step 5: Conclusion
The 95% confidence interval for the mean difference of breaking strength
between wool fibres and synthetic fibres is between -24.5235 and -8.0765.
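The degrees-of-freedom formula for unequal variances is easy to mistype by hand, so a short check may help. The sketch below assumes the table convention used in this module (round the df down) and a critical value typed in from Table 7; the function names `welch_df` and `welch_interval` are illustrative.

```python
import math

def welch_df(s1, s2, n1, n2):
    """Degrees of freedom for the unequal-variances interval."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def welch_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """Interval for mu1 - mu2 when the variances are assumed unequal."""
    margin = t_crit * math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# Example 3.11: wool (sample 1) versus synthetic (sample 2)
df = welch_df(11.90, 3.61, 11, 12)
df_table = math.floor(df)            # round down before using the t table
lo, hi = welch_interval(436.5, 452.8, 11.90, 3.61, 11, 12, 2.201)  # t(0.025, 11)
print(round(df, 4), df_table, round(lo, 4), round(hi, 4))
```

Statistical software keeps the fractional df instead of rounding down, so its interval is usually slightly narrower than the table-based one.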

Example 3.12: A set of facilitation tools to help with data analysis for problem solving is being
developed by a group of statisticians at UiTM. In order to test effectiveness of these tools, a
group of research officers were asked to analyze and produce a built-in report for a set of data
on the computer. Twelve equally capable research officers were randomly selected and six
were randomly assigned a standard procedure to complete the task. The other six were asked
to do the task using the developed facilitation tools. The response measured was the time to
completion (in minutes). The output of statistical analysis is shown in the following tables.

Group Statistics

                  Tool                 N   Mean    Std. Deviation   Std. Error Mean
time completion   Standard Procedure   6   65.50   5.891            2.405
                  Facilitation Tool    6   36.50   4.087            1.668

Independent Samples Test

                                                Levene's Test for
                                                Equality of Variances   t-test for Equality of Means
                                                                                                                                           95% Confidence Interval
                                                                                                                                           of the Difference
                                                F       Sig.            t       df      Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower    Upper
Time completion   Equal variances assumed       1.231   .003            9.908   10      .000              29.000            2.927                   22.478   35.522
                  Equal variances not assumed                           9.908   8.908   .000              29.000            2.927                   22.368   35.632

a) Based on the p-value in the Levene's Test, test the equality of variances in this study. Use
α = 0.05
b) State the 95% confidence interval to estimate the difference between the average
completion times for the two procedures.
c) Based on the confidence interval, can we conclude that the average completion times
for the two procedures differ?
d) Show that the degrees of freedom for unequal variances is 8.908.
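Let 1 = standard procedure
Let 2 = facilitation tool

a) Levene's Test
1. $H_0: \sigma_1^2 = \sigma_2^2$
   $H_1: \sigma_1^2 \neq \sigma_2^2$
2. p-value (Sig.) = 0.003
3. α = 0.05
4. Reject $H_0$ if p-value ≤ α.
   Since p-value = 0.003 < α = 0.05, reject $H_0$.
5. Conclusion: the variances of the two procedures are not equal.

Step 1: Find $\alpha$
$\alpha = 1 - 0.95 = 0.05$

Step 2: Find $t_{\alpha/2,\,df}$ (Table 7, with df from Step 3)
$t_{0.05/2,\,8} = t_{0.025,\,8} = 2.306$

Step 3: Find df
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\dfrac{5.891^2}{6} + \dfrac{4.087^2}{6}\right)^2}{\dfrac{\left(5.891^2/6\right)^2}{6 - 1} + \dfrac{\left(4.087^2/6\right)^2}{6 - 1}} = 8.9079 \approx 8$$

Step 4: Find CI
$CI = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$CI = (65.50 - 36.50) \pm 2.306\sqrt{\dfrac{5.891^2}{6} + \dfrac{4.087^2}{6}}$
$CI = 29 \pm 6.7499$
$CI = (22.2501, 35.7499)$

Step 5: Conclusion
The 95% confidence interval for the difference between the average completion times
for the standard procedure and the facilitation tool is between 22.2501 and 35.7499 minutes.
The computer output (22.368, 35.632) is slightly narrower because it uses df = 8.908 instead of the rounded-down table value df = 8.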

3.7.2 Confidence Interval for Difference Between Two Population Means - Dependent
samples

Matched or paired samples involve a procedure whereby pairs of observations are matched
as close as possible according to certain relevant characteristics. The two sets of observations
are then subjected to two different treatments. Now, the pairs of observations selected are
similar in characteristics. Hence, if there is any difference in the two sets of observations, this
must be attributed to the treatment alone.

The point estimate for the mean difference between two observations from matched samples
is:

$\hat{\mu}_d = \bar{d}$

The $(1 - \alpha)100\%$ confidence interval for the mean difference between two observations from
matched samples, $\mu_d$, is:

$$\bar{d} \pm t_{\alpha/2,\,n-1}\frac{s_d}{\sqrt{n}}$$

Where $\mu_d$ = population mean difference in the observations

$\bar{d}$ = sample mean difference in the observations

$s_d$ = standard deviation of the differences

$n$ = number of differences

The formulas used to compute the values of $\bar{d}$ and $s_d$ are as follows:

$$\bar{d} = \frac{\sum d}{n} \quad\text{and}\quad s_d = \sqrt{\frac{1}{n - 1}\left[\sum d^2 - \frac{\left(\sum d\right)^2}{n}\right]}$$

Where $d$ is the difference between the two observations in each matched pair.


Example 3.13: The manufacturer of a gasoline additive claimed that the use of this additive
increases gasoline mileage. A random sample of six cars was selected and these cars were
driven for one week without the gasoline additive and then for one week with the gasoline
additive. The following table gives the miles per gallon for these cars without and with the
gasoline additive.

Without   24.6   28.3   18.9   23.7   15.4   29.5
With      26.3   31.7   18.2   25.3   18.3   30.9

Construct a 95% confidence interval for the difference in mean mileage per gallon for cars
without and with the gasoline additive. (Ans:(-3.2150,-0.2184))

Solution:

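Difference (with - without) = [26.3 - 24.6, 31.7 - 28.3, 18.2 - 18.9, 25.3 - 23.7, 18.3 - 15.4, 30.9 - 29.5]
= [1.7, 3.4, -0.7, 1.6, 2.9, 1.4]

$\bar{d} = \dfrac{1.7 + 3.4 + (-0.7) + 1.6 + 2.9 + 1.4}{6} = \dfrac{10.3}{6} \approx 1.717$

variance $= \dfrac{(1.7 - 1.717)^2 + (3.4 - 1.717)^2 + (-0.7 - 1.717)^2 + (1.6 - 1.717)^2 + (2.9 - 1.717)^2 + (1.4 - 1.717)^2}{6 - 1} = \dfrac{10.188}{5} = 2.0376$

$s_d = \sqrt{2.0376} \approx 1.4274$

Standard error of the mean: $\dfrac{1.4274}{\sqrt{6}} \approx 0.5827$

$df = n - 1 = 6 - 1 = 5$

$\alpha = 1 - 0.95 = 0.05$, $t_{0.025,\,5} = 2.571$

With - without: $1.7167 \pm 2.571\,(0.5827) \approx 1.7167 \pm 1.4983 = (0.2184, 3.2150)$ (using unrounded intermediate values)

The question asks for without - with, which only reverses the signs: $CI = (-3.2150, -0.2184)$.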
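The paired interval of Example 3.13 can also be computed directly from the two rows of data. This sketch takes the differences as without minus with, to match the sign of the answer given in the question; the variable names are illustrative and 2.571 is the assumed table value for $t_{0.025,\,5}$.

```python
import math
import statistics

without = [24.6, 28.3, 18.9, 23.7, 15.4, 29.5]
with_additive = [26.3, 31.7, 18.2, 25.3, 18.3, 30.9]

# Paired differences: one value per car
d = [a - b for a, b in zip(without, with_additive)]
n = len(d)
d_bar = statistics.mean(d)          # sample mean difference
s_d = statistics.stdev(d)           # standard deviation of the differences
t_crit = 2.571                      # assumed table value for t(0.025, 5)
margin = t_crit * s_d / math.sqrt(n)
ci = (d_bar - margin, d_bar + margin)
print(round(d_bar, 4), round(s_d, 4), round(ci[0], 4), round(ci[1], 4))
```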


Example 3.14: Ariff is the Human Resources Director at the head office of a reputable bank
in Ipoh. Ariff finds that absenteeism among the bank's employees is quite high, leading to poor
morale and slow performance. In order to boost employee performance and lower absenteeism
among his employees, he sent the bank's employees to attend "The Innersole of Highly
Effective People", a training program conducted by Top Performers Sdn.Bhd. In order to test
the effectiveness of the training program, he selected a random sample of 12 employees and
gathered data on the number of days these employees were absent from work in the six months
before the training program. He then collected the same data for the six months after the training
program. The data are shown in the table.

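           Number of days absent from work
Employee   Before Program   After Program   d (Before - After)   d²
A          14               8               6                    36
B          9                7               2                    4
C          10               6               4                    16
D          6                3               3                    9
E          7                8               -1                   1
F          9                5               4                    16
G          11               6               5                    25
H          5                3               2                    4
I          7                4               3                    9
J          12               10              2                    4
K          10               5               5                    25
L          12               6               6                    36
Total                                       41                   185

Determine and interpret the 95% confidence interval for the mean difference in number of days
employees were absent before and after the training program. (Ans: (2.1326, 4.7008))

Solution:

$\bar{d} = \dfrac{\sum d}{n} = \dfrac{41}{12} = 3.4167$

$s_d = \sqrt{\dfrac{1}{n - 1}\left[\sum d^2 - \dfrac{\left(\sum d\right)^2}{n}\right]} = \sqrt{\dfrac{1}{11}\left[185 - \dfrac{41^2}{12}\right]} = 2.0207$

Standard error of the mean: $\dfrac{s_d}{\sqrt{n}} = \dfrac{2.0207}{\sqrt{12}} = 0.5833$

$\alpha = 0.05$, $\alpha/2 = 0.025$, $df = 12 - 1 = 11$

$t_{0.025,\,11} = 2.201$

$\mu_d = \bar{d} \pm t_{0.025,\,11}\left(\dfrac{s_d}{\sqrt{n}}\right)$
$= 3.4167 \pm 2.201\,(0.5833)$
$= 3.4167 \pm 1.2838$
$= (2.1329, 4.7005)$

This agrees with the given answer up to rounding. We are 95% confident that the mean reduction in the number of days absent after the training program is between 2.13 and 4.70 days.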
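When only the column totals $\sum d$ and $\sum d^2$ are available, the shortcut formula for $s_d$ can be applied directly. The sketch below is illustrative: the function name `paired_interval_from_sums` is made up, the totals 41 and 185 are those of the Example 3.14 table, and 2.201 is the table value for $t_{0.025,\,11}$.

```python
import math

def paired_interval_from_sums(sum_d, sum_d2, n, t_crit):
    """Mean difference, s_d and interval from the totals of d and d squared."""
    d_bar = sum_d / n
    s_d = math.sqrt((sum_d2 - sum_d**2 / n) / (n - 1))   # shortcut formula
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar, s_d, (d_bar - margin, d_bar + margin)

# Example 3.14: sum of d = 41, sum of d squared = 185, n = 12
d_bar, s_d, ci = paired_interval_from_sums(41, 185, 12, 2.201)
print(round(d_bar, 4), round(s_d, 4), round(ci[0], 4), round(ci[1], 4))
```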


Example 3.15: A random sample of nine local banks shows their deposits (in billions of dollars)
3 years ago and their deposits (in billions of dollars) today. Assume the variable is normally
distributed. The data are shown in the table, where d = (3 years ago) - (today).

Bank          1         2        3        4        5        6        7        8      9        Total     Mean
3 years ago   11.42     8.41     3.98     7.37     2.28     1.10     1.00     0.9    1.35     37.81     4.20111
Today         16.69     9.44     6.53     5.58     2.92     1.88     1.78     1.5    1.22     47.54     5.28222
d             -5.27     -1.03    -2.55    1.79     -0.64    -0.78    -0.78    -0.6   0.13     -9.73
d²            27.7729   1.0609   6.5025   3.2041   0.4096   0.6084   0.6084   0.36   0.0169   40.5437

The data collected were analyzed and the output is shown below.

Paired Differences

                                                    95% Confidence Interval
                                                    of the Difference
                             Mean      Std. Error   Lower       Upper     t        df
                                       Mean
Pair   3 years ago - Today   -1.0811   0.6458       -2.57024    .40802    -1.674   8

a) Show that the mean difference is -1.0811.
b) Show that the standard error of the mean is 0.6458.
c) State the 95% confidence interval for the mean difference in deposits for the
banks.
d) Based on the confidence interval, can we conclude that the mean deposits
of the banks differ?
a) Mean (3 years ago) $= \dfrac{11.42 + \dots + 1.35}{9} = 4.20111$
Mean (today) $= \dfrac{16.69 + \dots + 1.22}{9} = 5.28222$
$\bar{d} = 4.20111 - 5.28222 = -1.08111$ (shown)

b) variance $= \dfrac{\sum (d_i - \bar{d})^2}{n - 1} = \dfrac{(-5.27 - (-1.0811))^2 + \dots + (0.13 - (-1.0811))^2}{9 - 1} = \dfrac{30.0245}{8} = 3.7531$

$s_d = \sqrt{3.7531} \approx 1.9373$

$SE = \dfrac{s_d}{\sqrt{n}} = \dfrac{1.9373}{\sqrt{9}} \approx 0.6458$ (shown)

Example 3.16: Many engineering students are having problems in data analysis using
statistical software. A professor who teaches statistics for engineering course offered a two
day workshop on this topic. The following table gives the test scores of seven engineering
students before and after they attended the workshop.

Before   56   69   48   74   65   71   58
After    62   73   44   85   71   70   69

The data collected were analyzed and the output is shown below.

Paired Samples Test

                          Paired Differences
                                                                 95% Confidence Interval
                                                                 of the Difference
                          Mean     Std. Deviation   Std. Error   Lower     Upper    t        df   Sig. (2-tailed)
                                                    Mean
Pair 1   before - after   -4.714   5.648            2.135        -9.938    .510     -2.208   6    .069

a) Show that the 95% confidence interval for the difference in mean test scores before and
after attending the workshop is between -9.94 and 0.51.
b) Can we conclude whether attending the workshop increases the test score?


EXERCISE CHAPTER 3

1. A study on the compressive strength of concrete is conducted by a civil engineer.


Compressive strength is normally distributed with a variance of 1000 (psi)². A random
sample of 12 specimens is tested and the mean compressive strength is 3250 psi.
a) What is the best point estimate of the mean compressive strength?
b) Construct a 95% confidence interval for the mean compressive strength

2. A random sample of 15 items is taken, producing a sample mean of 2.364 with a sample
variance of 0.81. Assume x is normally distributed and construct 95% confidence interval
for the population mean.

3. The ACT scores from a random sample of 61 high school seniors were analyzed and found
to have a mean of 25.1 and a standard deviation of 3.6. Find a 95% confidence interval
for the population mean.

4. The drying times (in hours) of a certain brand of latex paint were recorded as follows:

3.4 2.5 4.8 2.9 3.6 2.8 3.3 5.6


3.7 2.8 4.4 4.0 5.2 3.0 4.8

a) Estimate the values of the population mean and population standard deviation
b) Construct a 95% confidence interval for the population mean.

5. To determine the flow characteristics of oil through a valve, the inlet oil temperature is
measured in degrees Fahrenheit. The following are a sample of 8 readings:

97, 93, 91, 94, 93, 92,89, 90

Construct a 99% confidence interval for the mean inlet oil temperature.

6. A dietitian wishes to see if a person's cholesterol level will change if the diet is
supplemented by a certain mineral. Six randomly selected subjects were pretested, and
then they took the mineral supplement for a 6-week period.

Paired Samples Statistics

                Mean     N   Std. Deviation   Std. Error Mean
Pair   Before   209.83   6   26.940           10.998
       After    193.17   6   22.257           9.086

Paired Samples Test

                          Paired Differences
                          Mean     Std. Deviation
Pair 1   Before - After   16.667   25.390

Construct a 90% confidence interval for the mean difference in cholesterol level before
and after diet.


7. A new treatment was proposed to fight breast cancer. Six randomly selected new breast
cancer patients were treated with the new treatment. For comparison, five patients with
the old treatments were also selected at random. The survival times, in years from the time
treatments started are recorded as follows.

Group Statistics

           Treatment       N   Mean   Std. Deviation   Std. Error Mean
Survival   new treatment   6   3.83   1.472            .601
           old treatment   5   2.40   1.140            .510

Independent Samples Test

                                         Levene's Test for
                                         Equality of Variances   t-test for Equality of Means
                                                                                                                                    95% Confidence Interval
                                                                                                                                    of the Difference
                                         F      Sig.             t       df      Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower   Upper
Survival   Equal variances assumed       .505   .495             1.773   9       .110              1.433             Y                       -.395   3.262
           Equal variances not assumed                           1.819   8.976   .102              1.433             .788                    -.350   3.217

a) Show that the mean difference is 1.433.
b) Based on the p-value in the Levene's Test, test the equality of variances in this study. Use
α = 0.05
c) Find the value of Y.
d) Construct a 99% confidence interval for the mean difference in survival time for the new
and old treatment.
a) $\bar{x}_1 - \bar{x}_2 = 3.83 - 2.40 = 1.43$

Notes:
1) Standard error of the mean: $SE = \dfrac{s}{\sqrt{n}}$
2) $t_{cal} = \dfrac{\bar{d}}{s_d/\sqrt{n}}$

CHAPTER 4: HYPOTHESIS TESTING
When the value of a population parameter is unknown, it is estimated,
either by a point estimate or by an interval estimate. When a statement or claim (a hypothesis)
is made about the value of a population parameter, we test whether the statement
is true or not. The procedure for doing this is called hypothesis testing or a test of significance.
Since the statement is about a population parameter, to carry out the test we take a random
sample from that population and calculate the sample statistics. Then, based on the
information from the sample, we decide whether or not to reject the
statement.

The following are some definitions:

A hypothesis is a statement or claim made about the value of a population parameter.

Hypothesis testing is a procedure whereby sample information is used to decide whether to
accept or reject the statement made regarding the value of the population parameter.

Null hypothesis is a statement that the population parameter has a specific value. We will
use $H_0$ to represent the null hypothesis. The null hypothesis is always stated using the
equal sign.

$H_0: \theta = \theta_0$

Alternative hypothesis is the hypothesis opposite to $H_0$, and this hypothesis will be accepted
if $H_0$ is rejected. It is also known as the research hypothesis. The alternative hypothesis can
take two forms: directional and non-directional.

a) Non-directional
$H_1: \theta \neq \theta_0$

b) Directional
$H_1: \theta < \theta_0$
$H_1: \theta > \theta_0$

The two forms of the alternative hypothesis give two types of tests: one-tailed and two-tailed tests.

a) One-tailed test
$H_1: \theta < \theta_0$ (Left-tailed test)
$H_1: \theta > \theta_0$ (Right-tailed test)

b) Two-tailed test
$H_1: \theta \neq \theta_0$

Significance level is the maximum probability of committing a Type I error. This probability is
symbolized by $\alpha$ (Greek letter alpha).


Test statistic is a single number calculated from the sample data, used as the basis for deciding whether to
reject or not to reject the null hypothesis.

The entire set of values that the test statistic may assume is divided into two regions. The
region consisting of values that support H1 and lead to rejecting H0 is called the rejection region
(critical region). The other region, consisting of values that support H0, is called the acceptance
region.

The value of the test statistic that divides the non-rejection region from the rejection region is the
critical value.

a) TYPE I AND TYPE II ERRORS

A Type I error occurs when the null hypothesis is rejected when it is true. The value of
$\alpha$ represents the probability of committing this Type I error, that is,

$P(\text{Type I Error}) = \text{significance level} = \alpha$

A Type II error occurs when the null hypothesis is not rejected when it is false. The
value of $\beta$ represents the probability of committing this Type II error.

Power of test is the probability of rejecting H0 given that a specific alternative is true, that is, of
making a correct decision when H0 is false. It is the probability of not committing a Type II error.

$\text{Power} = 1 - \beta$

The following table gives a summary of possible results of any hypothesis test.

                  Decision
Null Hypothesis   Reject H0          Accept H0
H0 is true        Type I error       Correct Decision
H0 is false       Correct Decision   Type II error
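The summary table can be expressed as a small lookup, which may help when checking exercise answers. This is only an illustrative sketch: the function names `classify_outcome` and `power` are hypothetical, and the value 0.20 for $\beta$ is a made-up number used purely to show the arithmetic.

```python
def classify_outcome(h0_is_true, reject_h0):
    """Name the result of a test given the truth of H0 and the decision."""
    if h0_is_true and reject_h0:
        return "Type I error"
    if not h0_is_true and not reject_h0:
        return "Type II error"
    return "Correct decision"

def power(beta):
    """Power of the test: probability of not committing a Type II error."""
    return 1 - beta

print(classify_outcome(True, True))     # H0 true, H0 rejected
print(classify_outcome(False, False))   # H0 false, H0 not rejected
print(power(0.20))                      # beta = 0.20 is a made-up value
```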

Hypothesis-testing common phrases

>                      <
Is greater than        Is less than
Is above               Is below
Is higher than         Is lower than
Is longer than         Is shorter than
Is bigger than         Is smaller than
Is increased           Is decreased or reduced from

=                      ≠
Is equal to            Is not equal to
Is the same as         Is different from
Has not changed from   Has changed from
Is the same as         Is not the same as


b) STEPS IN HYPOTHESIS TESTING

The following are the required steps in conducting a hypothesis test

Step 1 State the null and alternative hypothesis

Situation A
A contractor wishes to lower heating bills by using a special type of insulation
in houses. The average of the monthly heating bill is RM78.
The hypotheses for this situation are:
H0: µ = 78
H1: µ < 78

Situation B
A chemist invents an additive to increase the life of an automobile battery. It is
known that the mean lifetime of the automobile battery is 36 months.
The hypotheses for this situation are:
H0: µ = 36
H1: µ > 36

Situation C
A medical researcher is interested in finding out whether a new medication will
have any undesirable side effects. He is particularly concerned with the pulse
rate of the patients who take the medication. He knows that the mean pulse
rate for the population under study is 82 beats per minute.
The hypotheses for this situation are:
H0: µ = 82
H1: µ ≠ 82

Step 2 Determine the appropriate test statistic and calculate its value

Step 3 State the level of significance, specify the decision rule and determine the
critical value

Critical value(s) ~ separate the critical region/rejection region from the
non-critical region/acceptance region
Critical or rejection region ~ the range of values of the test statistic that indicates
there is a significant difference and the null hypothesis should be rejected
Non-critical or non-rejection region ~ the range of values of the test statistic that
indicates the difference was probably due to chance and the null hypothesis
should not be rejected
Decision rule ~ a statement of when to reject or not reject the null
hypothesis, based on the comparison between the critical value and the test
statistic

*Note: to obtain a critical value, the α-level must be chosen first. For a two-tailed test,
α is divided into two equal parts.

Step 4 Make a decision

Make an inference regarding the population parameter

If the value of the test statistic lies in the critical region/rejection region, reject
H0
If the value of the test statistic does not lie in the critical region/rejection
region, do not reject H0

Step 5 State the conclusion

a) Reject H0 and conclude that there is sufficient evidence to support H1
(test is significant)
b) Do not reject H0/Fail to reject H0 and conclude that there is insufficient
evidence to support H1 (test is not significant)

c) MAKE A DECISION RULE

i. By the traditional (critical value) method

Test                                    Decision Rule: Reject H0
Right-tail test                         If Z/t calculated > Z/t critical value
(alternative hypothesis involves ">")
Left-tail test                          If Z/t calculated < −Z/−t critical value
(alternative hypothesis involves "<")
Two-tail test                           If Z/t calculated > Z/t critical value OR
(alternative hypothesis involves "≠")   Z/t calculated < −Z/−t critical value

ii. By P-Value
Reject H0 if p-value ≤ α

iii. By Confidence Interval Method
Reject H0 if the hypothesised value (zero, when testing the difference between two
means) is not included in the interval

4.1 HYPOTHESIS TESTS ABOUT A POPULATION MEAN

4.1.1 Hypothesis Test for Mean 𝝁 (Variance is known)


Assumption:

i. The sample is a random sample
ii. Either n ≥ 30 or the population is normally distributed when n < 30
iii. Variance or standard deviation for the population is known

Test procedure:

H0: µ = µ0
H1: µ ≠ µ0     H1: µ > µ0     H1: µ < µ0

Two-tail test (≠): reject H0 if z_cal > z_{α/2} or z_cal < −z_{α/2}
Right-tail test (>): reject H0 if z_cal > z_α
Left-tail test (<): reject H0 if z_cal < −z_α

Test statistic:

\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \]

Example 4.1: A company producing 3A batteries claims that its batteries last an average of
24 months with a standard deviation of 3 months. A sample of 36 batteries was tested. The
mean life of these batteries was 23 months. Using the 5% level of significance, is there
evidence to indicate that the mean lifetime of 3A batteries is below 24 months?
Solution:

Given: µ0 = 24, σ = 3, n = 36, x̄ = 23, α = 0.05

Method 1: Hypothesis test (critical value)

Step 1: Hypothesis
H0: µ = 24
H1: µ < 24 (one tail; "below" means the left tail)

Step 2: Significance level
α = 0.05

Step 3: Test statistic
\[ z_{cal} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{23 - 24}{3 / \sqrt{36}} = -2 \]

Step 4: Critical value
−z_{0.05} = −1.6449 (negative because the test is left-tailed)

Step 5: Decision rule
Reject H0 if z_cal < −z_cv
Since z_cal = −2 < −z_cv = −1.6449, reject H0

Step 6: Conclusion
There is sufficient evidence to indicate that the mean lifetime of 3A batteries is
< 24 months.

Method 2: P-value

Steps 1 and 2 are as in Method 1. For a one-tail test α is not divided by 2.

Step 3: p-value
p-value = P(Z < −2) = 0.0228

Step 4: Decision rule
Reject H0 if p-value < α
Since p-value = 0.0228 < α = 0.05, reject H0

Method 3: Confidence interval

Step 3: Confidence interval
\[ CI = \bar{x} \pm z_{\alpha} \frac{\sigma}{\sqrt{n}} = 23 \pm 1.6449 \left( \frac{3}{\sqrt{36}} \right) = (22.1776, 23.8225) \]

Step 4: Decision rule
Reject H0 if 24 is not in the interval
Since 24 is not in the interval, reject H0
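
The three decision methods applied to Example 4.1 can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the module.

```python
from math import sqrt
from statistics import NormalDist

# Values from Example 4.1: H0: mu = 24 against H1: mu < 24.
mu0, sigma, n, x_bar, alpha = 24, 3, 36, 23, 0.05
std_normal = NormalDist()
se = sigma / sqrt(n)

# Method 1: critical value (left tail, so the critical value is negative).
z_cal = (x_bar - mu0) / se
z_cv = std_normal.inv_cdf(alpha)
reject_by_critical_value = z_cal < z_cv

# Method 2: p-value = P(Z < z_cal) for a left-tailed test.
p_value = std_normal.cdf(z_cal)
reject_by_p_value = p_value <= alpha

# Method 3: interval x_bar +/- z_alpha * se, then check whether mu0 lies inside.
half_width = -z_cv * se
ci = (x_bar - half_width, x_bar + half_width)
reject_by_interval = not (ci[0] <= mu0 <= ci[1])

print(z_cal, round(z_cv, 4), round(p_value, 4), ci)
```

All three flags agree here, which is expected because the three rules are different views of the same comparison.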

4.1.2 Hypothesis Test for Mean 𝝁 (Variance is unknown and large sample)

Assumption:

i. The sample is a random sample


ii. The population is normally distributed
iii. Sample size is large, n ≥ 30
iv. Variance or standard deviation for the population is unknown

Test procedure:

H0: µ = µ0
H1: µ ≠ µ0     H1: µ > µ0     H1: µ < µ0

Test statistic:

\[ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

Example 4.2: A pharmaceutical manufacturer purchases a particular material from a supplier.
The manufacturer selects 30 shipments from the supplier and measures the percentage of
impurities in the raw material from each shipment. The sample mean and variance are x̄ =
1.89 and s² = 0.273. Test at the 5% level of significance whether the average percentage of
impurities is different from 1.8.

Solution:
x̄ = 1.89, s² = 0.273, s = 0.5225, n = 30

Step 1: Hypothesis
H0: µ = 1.8
H1: µ ≠ 1.8 (two tail)

Step 2: Significance level
α = 0.05; α/2 = 0.025

Step 3: Test statistic
\[ t_{cal} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{1.89 - 1.8}{0.5225 / \sqrt{30}} = 0.9434 \]

Step 4: Critical value
t_cv = t_{α/2, v} = t_{0.025, 29} = 2.045

Step 5: Decision rule
Reject H0 if t_cal > t_cv or t_cal < −t_cv
Since t_cal = 0.9434 < t_cv = 2.045, fail to reject H0

Step 6: Conclusion
Therefore, there is insufficient evidence to indicate that µ is not equal to 1.8.

Example 4.3: Based on the information given by the housekeepers, it was found that the hotel
has produced 6.1 kilograms of solid waste daily. The following tables show the results
obtained from a further analysis of the study.

One-Sample Statistics

         N     Mean      Std. Deviation
hours    35    5.2714    1.11871

One-Sample Test (Test Value = 6.1)

                                                        98% Confidence Interval
                                                        of the Difference
         t        df    Sig. (2-tailed)   Mean Difference   Lower      Upper
hours    -4.382   34    .000              -0.82857          -1.2129    -0.4443

a) If the researcher would like to test whether the mean weight of the solid waste is
different from 6.1 kg, what will be the null and alternative hypotheses?
b) Based on the p-value, can the researcher conclude that the mean weight of the solid
waste is different from 6.1 kg?
c) Construct a 95% confidence interval for the mean weight of solid waste.

a) Hypothesis
H0: µ = 6.1 kg
H1: µ ≠ 6.1 kg

b) Step 1: Hypothesis
H0: µ = 6.1 kg
H1: µ ≠ 6.1 kg

Step 2: Significance level
α = 0.02

Step 3: p-value
p-value = 0.000

Step 4: Decision rule
Reject H0 if p-value < α
Since p-value = 0.000 < α = 0.02, reject H0.

Step 5: Conclusion
There is sufficient evidence to indicate that the mean weight of the solid waste is
different from 6.1 kg.

c) Confidence interval
\[ CI = \bar{x} \pm t_{\alpha/2, v} \frac{s}{\sqrt{n}} = 5.2714 \pm t_{0.025, 34} \left( \frac{1.11871}{\sqrt{35}} \right) = (4.885, 5.6578) \]
Conclusion: the mean weight of solid waste is between 4.885 kg and 5.6578 kg.

4.1.3 Hypothesis Test for Mean 𝝁 (Variance is unknown and small sample)

Assumption:

i. The sample is a random sample


ii. The population is normally distributed
iii. Sample size is small 𝑛 < 30
iv. Variance or standard deviation for the population is unknown

Test procedure:

H0: µ = µ0
H1: µ ≠ µ0     H1: µ > µ0     H1: µ < µ0

Test statistic:

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad df = n - 1 \]

Example 4.4: The speed limit along the Ipoh-Lumut highway is 90 km/h. The highway
patrol centre suspects that cars travelling along the highway exceed this speed limit. A sample
of 15 cars had their speeds measured by radar. The sample mean was 98 km/h and the
standard deviation was 15 km/h. At the 5% level of significance, is there evidence to indicate
that cars travelling along this highway exceed the speed limit?

Solution: n = 15, µ0 = 90

Step 1: Hypothesis
H0: µ = 90 km/h
H1: µ > 90 km/h (one tail)

Step 2: Significance level
α = 0.05

Step 3: Test statistic
\[ t_{cal} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{98 - 90}{15 / \sqrt{15}} = 2.0656 \]

Step 4: Critical value
df = 15 − 1 = 14
t_cv = t_{α, v} = t_{0.05, 14} = 1.761

Step 5: Decision rule
Reject H0 if t_cal > t_cv
Since t_cal = 2.0656 > t_cv = 1.761, reject H0

Step 6: Conclusion
There is sufficient evidence to indicate that the mean speed is > 90 km/h.

Example 4.5: The R & D department of an industry requires that the mean life of the light
bulbs produced should exceed 4000 hours, with a standard deviation of less than 150
hours, before they can be supplied to the market. A sample of 15 bulbs was tested and the
lengths of life are as follows (hours):

4300 4302 4415 4483 4301 4446 4478 4319
3985 4483 4377 4401 4346 4261 4353

The data were analysed using SPSS and the output is shown below:

One-Sample Statistics

         N     Mean       Std. Deviation   Std. Error Mean
hours    15    4350.00    124.748          32.210

One-Sample Test (Test Value = 4000)

                                                        98% Confidence Interval
                                                        of the Difference
         t        df    Sig. (2-tailed)   Mean Difference   Lower     Upper
hours    10.866   14    .000              350.000           265.47    434.53

a) Write the null and alternative hypotheses
b) Show that the value of the test statistic is 10.866
c) Write the decision rule. Use the 5% significance level
d) Write the decision and the conclusion

a) Hypothesis
H0: µ = 4000 hours
H1: µ > 4000 hours

b) Test statistic
\[ t_{cal} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{4350 - 4000}{124.748 / \sqrt{15}} = 10.8663 \]

c) Significance level
α = 0.05

Critical value
df = 15 − 1 = 14
t_cv = t_{α, v} = t_{0.05, 14} = 1.761

Decision rule
Reject H0 if t_cal > t_cv

d) Decision
Since t_cal = 10.8663 > t_cv = 1.761, reject H0

Conclusion
There is sufficient evidence to indicate that the mean life of the light bulbs is
> 4000 hours.
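
As a cross-check on the SPSS output in Example 4.5, the sample mean, standard deviation and t statistic can be recomputed from the 15 raw values. This is a minimal sketch with the standard library; the critical value 1.761 is taken from the t table as in the worked examples rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Bulb lifetimes (hours) from Example 4.5.
hours = [4300, 4302, 4415, 4483, 4301, 4446, 4478, 4319,
         3985, 4483, 4377, 4401, 4346, 4261, 4353]
mu0 = 4000      # hypothesised mean under H0
t_cv = 1.761    # t(0.05, 14) read from a t table, not computed here

n = len(hours)
x_bar = mean(hours)
s = stdev(hours)                  # sample standard deviation (divisor n - 1)
std_error = s / sqrt(n)
t_cal = (x_bar - mu0) / std_error

reject_h0 = t_cal > t_cv          # right-tailed decision rule
print(x_bar, round(s, 3), round(std_error, 3), round(t_cal, 3), reject_h0)
```

The recomputed values should agree with the One-Sample Statistics and One-Sample Test tables up to rounding.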

4.2 HYPOTHESIS TESTING FOR DIFFERENCE BETWEEN TWO POPULATION MEANS

4.2.1 Hypothesis test for difference between two population means – independent
samples

a) Hypothesis testing for Differences between Two Population Means (variances are
known) - for both Small and Large Sample Size

Assumptions:

i. Either n ≥ 30 or n < 30
ii. Populations are normally distributed
iii. Population variances σ1² and σ2² are known

Test procedure:

H0: µ1 − µ2 = 0
H1: µ1 − µ2 ≠ 0     H1: µ1 − µ2 > 0     H1: µ1 − µ2 < 0

Test statistic:

\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

Example 4.6: An experiment was conducted in which two types of engines, A and B were
compared. Gas mileage in miles per gallon was measured. 75 experiments were conducted
using engine type A and 50 experiments were done for engine type B. The gasoline used and
other conditions were held constant. The average gas mileage for engine A was 42 miles per
gallon and the average for engine B was 36 miles per gallon. Test at the 5% significance level
whether there is a significant difference in gas mileage between engine types A and B. Assume
that the population standard deviations are 8 and 6 for engines A and B respectively.
Step 1: Hypothesis
H0: µ1 = µ2
H1: µ1 ≠ µ2

Step 2: Significance level
α = 0.05; α/2 = 0.025

Step 3: Test statistic
\[ z_{cal} = \frac{(42 - 36) - 0}{\sqrt{\dfrac{8^2}{75} + \dfrac{6^2}{50}}} = 4.7834 \]

Step 4: Critical value
z_cv = z_{0.025} = 1.96

Step 5: Decision rule
Reject H0 if z_cal > z_cv or z_cal < −z_cv
Since z_cal = 4.7834 > z_cv = 1.96, reject H0

Step 6: Conclusion
There is sufficient evidence to indicate a difference in mean gas mileage between
engine types A and B.

b) Hypothesis testing for Differences between Two Population Means (variances are
unknown) - for Large Sample Size

Assumptions:

i. Both sample sizes are more than 30
ii. Populations are normally distributed
iii. Population variances σ1² and σ2² are unknown

Test procedure:

H0: µ1 − µ2 = 0
H1: µ1 − µ2 ≠ 0     H1: µ1 − µ2 > 0     H1: µ1 − µ2 < 0

Test statistic:

\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

Example 4.7: Two kinds of thread are being compared for strength. Fifty pieces of each type
of thread are tested under similar conditions. Brand A had an average tensile strength of 78.3
kilograms with a standard deviation of 5.6 kilograms, while Brand B had an average tensile
strength of 87.2 kilograms with a standard deviation of 6.3 kilograms. Test at the 5% level of
significance whether the mean tensile strengths of brand A and brand B differ.
Step 1: Hypothesis
H0: µ1 = µ2
H1: µ1 ≠ µ2

Step 2: Significance level
α = 0.05; α/2 = 0.025

Step 3: Test statistic
\[ z_{cal} = \frac{(78.3 - 87.2) - 0}{\sqrt{\dfrac{5.6^2}{50} + \dfrac{6.3^2}{50}}} = \frac{-8.9}{\sqrt{1.421}} = -7.4661 \]

(Because the two sample sizes are equal, the pooled variance
s_p² = [49(5.6²) + 49(6.3²)] / 98 = 35.525 gives the same standard error.
A t-test would be used instead if the question stated "assume the populations
are normally distributed" with small samples.)

Step 4: Critical value
z_cv = z_{0.025} = 1.96

Step 5: Decision rule
Reject H0 if z_cal > z_cv or z_cal < −z_cv
Since z_cal = −7.4661 < −z_cv = −1.96, reject H0

c) Hypothesis test for difference between two population means (Variances are
unknown and assumed equal) – for small sample

Assumptions:

i. The samples are random samples
ii. The sample data are independent of one another
iii. When the sample sizes are less than 30, the populations must be normally or
approximately normally distributed
iv. Variances of the populations are unknown but are assumed to be equal

Test procedure:

H0: µ1 − µ2 = 0
H1: µ1 − µ2 ≠ 0     H1: µ1 − µ2 > 0     H1: µ1 − µ2 < 0

Test statistic (t-test, equal variances):

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad df = n_1 + n_2 - 2 \]

where

\[ s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \]

Example 4.8: An insurance company wants to know if the average speed at which men drive
cars is greater than that of women drivers. The company took a random sample of 26 cars
driven by men on a highway and found the mean speed to be 72 miles per hour with a standard
deviation of 2.2 miles per hour. Another sample of 16 cars driven by women on the same
highway gave a mean speed of 68 miles per hour with standard deviation of 2.5 miles per
hour. Assume that the speeds at which all men and all women drive cars on this highway are
both normally distributed with the same population standard deviation.

Test at 2.5% significance level whether the average speed at which men drive cars is greater
than that of women drivers.

Solution:
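
The module leaves this solution to be worked by hand. As a way of checking a hand calculation, the pooled-variance t statistic for Example 4.8 can be computed as in the sketch below. The function name is illustrative, and the critical value t(0.025, 40) = 2.021 is quoted from a standard t table rather than taken from the module.

```python
from math import sqrt


def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled-variance t statistic and degrees of freedom for H0: mu1 - mu2 = 0."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    std_error = sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    return (x1 - x2) / std_error, n1 + n2 - 2


# Sample 1 = men (n = 26), sample 2 = women (n = 16), from Example 4.8.
t_cal, df = pooled_t(72, 2.2, 26, 68, 2.5, 16)

t_cv = 2.021                  # t(0.025, 40) from a standard t table (assumed here)
reject_h0 = t_cal > t_cv      # right-tailed test, H1: mu1 - mu2 > 0
print(round(t_cal, 4), df, reject_h0)
```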

Example 4.9: The manufacturer of a small battery-powered tape recorder decides to include
four alkaline batteries with its product. Two battery suppliers are being considered; each has
its own brand (brand 1 and brand 2). The supervising inspector of incoming quality wants to
know if the average lifetimes of the two brands are the same. Based on past experience, she
believes that the battery lifetimes follow a normal distribution with equal variances. A sample
experiment is conducted: each of ten batteries (five of each brand) is connected to a test
device that places a small drain on the battery power and records the battery lifetime. The
following results (in hours) are obtained:

Brand 1    43    48    38    41    51
Brand 2    30    26    37    31    34

Group Statistics

         brand     N    Mean     Std. Deviation   Std. Error Mean
hours    brand 1   5    44.20    5.263            2.354
         brand 2   5    31.60    4.159            1.860

Independent Samples Test (hours)

                               Levene's Test for
                               Equality of Variances   t-test for Equality of Means
                                                                                                                   95% Confidence Interval
                                                                                Mean         Std. Error            of the Difference
                               F       Sig.            t       df      Sig. (2-tailed)   Difference   Difference   Lower     Upper
Equal variances assumed        .605    .459            4.200   8       .003              12.600       3.000        5.682     19.518
Equal variances not assumed                            4.200   7.594   .003              12.600       3.000        5.617     19.583

Note: the Levene's test Sig. value is 0.459 > 0.05, so equal variances are assumed and only
the first row needs to be read.
Based on the output, answer the following questions:

a) State the null and alternative hypotheses
b) By referring to the p-value, would you reject or accept the null hypothesis at the 5%
significance level? State your reason.
c) Can the supervising inspector of incoming quality conclude that the average lifetimes
of the two brands are not equal?

Define: 1 = brand 1, 2 = brand 2

Levene's test for equality of variances

S1: Hypothesis
H0: σ1² = σ2²
H1: σ1² ≠ σ2²

S2: Significance level
α = 0.05

S3: p-value
p-value = 0.459

S4: Decision rule
Reject H0 if p-value < α
Since p-value = 0.459 > α = 0.05, fail to reject H0

S5: Conclusion
There is insufficient evidence to indicate that the variances are unequal.

a) S1: Hypothesis
H0: µ1 = µ2
H1: µ1 ≠ µ2

b) S2: Significance level
α = 0.05

S3: p-value
p-value = 0.003

S4: Decision rule
Reject H0 if p-value < α
Since p-value = 0.003 < α = 0.05, reject H0.

c) S5: Conclusion
Yes (H0 is rejected). There is sufficient evidence to indicate that the mean lifetimes
of the two brands are unequal.

d) Hypothesis test for difference between two population means (Variances are
unknown and assumed unequal) – for small sample

Assumptions:

i. The samples are random samples
ii. The sample data are independent of one another
iii. When the sample sizes are less than 30, the populations must be normally or
approximately normally distributed
iv. Variances of the populations are unknown but are assumed to be different

Test procedure:

H0: µ1 − µ2 = 0
H1: µ1 − µ2 ≠ 0     H1: µ1 − µ2 > 0     H1: µ1 − µ2 < 0

Test statistic (t-test, unequal variances):

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

where

\[ df = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}} \]

Example 4.10: The breaking strengths of 11 bundles of wool fibres have a sample mean 436.5
and a sample standard deviation of 11.90. In addition, the breaking strengths of another 12
bundles of synthetic fibres have a sample mean 452.8 and a sample standard deviation 3.61.
Assume the breaking strengths of the two populations are normally distributed with unequal
variances. Test at the 5% level of significance whether the mean breaking strength of wool
fibres is less than that of synthetic fibres.

Solution:
Define: 1 = wool fibre, 2 = synthetic fibre

S1: Hypothesis
H0: µ1 = µ2
H1: µ1 < µ2 (one tail)

S2: Significance level
α = 0.05

S3: Test statistic
\[ t_{cal} = \frac{(436.5 - 452.8) - 0}{\sqrt{\dfrac{11.90^2}{11} + \dfrac{3.61^2}{12}}} = -4.3627 \]

Critical value
\[ df = \frac{\left( \dfrac{141.61}{11} + \dfrac{13.0321}{12} \right)^2}{\dfrac{(141.61/11)^2}{11 - 1} + \dfrac{(13.0321/12)^2}{12 - 1}} \approx 11.68, \text{ taken as } 11 \]
t_cv = t_{α, df} = t_{0.05, 11} = 1.796

S4: Decision rule
Reject H0 if t_cal < −t_cv
Since t_cal = −4.3627 < −t_cv = −1.796, reject H0

S5: Conclusion
There is sufficient evidence to indicate that the mean breaking strength of wool fibres
is less than that of synthetic fibres.
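
The unequal-variance t statistic and its degrees of freedom are tedious to evaluate by hand, so a short sketch of the calculation for Example 4.10 may be useful. The function name is illustrative; rounding the degrees of freedom down to a whole number is the convention assumed here so that a t table can be used.

```python
from math import floor, sqrt


def unequal_variance_t(x1, s1, n1, x2, s2, n2):
    """t statistic and (non-integer) degrees of freedom when variances are unequal."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (x1 - x2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Sample 1 = wool fibre, sample 2 = synthetic fibre, from Example 4.10.
t_cal, df = unequal_variance_t(436.5, 11.90, 11, 452.8, 3.61, 12)
df_table = floor(df)          # assumed convention: round down for table lookup

t_cv = 1.796                  # t(0.05, 11) from the t table
reject_h0 = t_cal < -t_cv     # left-tailed test
print(round(t_cal, 4), round(df, 2), df_table, reject_h0)
```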

Example 4.11: A set of facilitation tools to help with data analysis for problem solving is being
developed by a group of statisticians at UiTM. In order to test the effectiveness of these tools, a
group of research officers were asked to analyse and produce a built-in report for a set of data
on the computer. Twelve equally capable research officers were randomly selected and six
were randomly assigned a standard procedure to complete the task. The other six were asked
to do the task using the developed facilitation tools. The response measured was the time to
completion (in minutes). The output of the statistical analysis is shown in the following tables.

Group Statistics

                   Tool                 N    Mean     Std. Deviation   Std. Error Mean
time completion    Standard Procedure   6    65.50    5.891            2.405
                   Facilitation Tool    6    36.50    4.087            1.668

Independent Samples Test (time completion)

                               Levene's Test for
                               Equality of Variances   t-test for Equality of Means
                                                                                                                   95% Confidence Interval
                                                                                Mean         Std. Error            of the Difference
                               F       Sig.            t       df      Sig. (2-tailed)   Difference   Difference   Lower     Upper
Equal variances assumed        1.231   .003            9.908   10      .000              29.000       2.927        22.478    35.522
Equal variances not assumed                            9.908   8.908   .000              29.000       2.927        22.368    35.632

a) Based on the p-value in the Levene's Test, test the equality of variances in this study. Use
α = 0.05
b) State the null and alternative hypotheses.
c) At the 5% significance level, can it be concluded that the mean completion time for the
standard procedure is more than that for the facilitation tools?

Define: 1 = standard procedure, 2 = facilitation tool

a) S1: Hypothesis
H0: σ1² = σ2²
H1: σ1² ≠ σ2²

S2: Significance level
α = 0.05

S3: p-value
p-value = 0.003

S4: Decision rule
Reject H0 if p-value < α
Since p-value = 0.003 < α = 0.05, reject H0

S5: Conclusion
There is sufficient evidence to indicate that the variances are unequal.

b) S1: Hypothesis
H0: µ1 = µ2
H1: µ1 > µ2

c) S2: Significance level
α = 0.05

S3: p-value
p-value = 0.000 / 2 = 0.000 (the SPSS value is two-tailed; the test is one-tailed)

S4: Decision rule
Reject H0 if p-value < α
Since p-value = 0.000 < α = 0.05, reject H0

S5: Conclusion
There is sufficient evidence to indicate that the mean completion time for the standard
procedure is more than that for the facilitation tool.

4.3 Hypothesis test for difference between two population means - Dependent
samples

Assumptions:

i. The sample or samples are random


ii. The sample data are dependent
iii. When the sample size is less than 30, the population of differences must be
normally or approximately normally distributed.

Test procedure:

H0: µd = 0
H1: µd ≠ 0     H1: µd > 0     H1: µd < 0

Test statistic:

\[ t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} \]

Where:

\[ \bar{d} = \frac{\sum d}{n} \]

\[ s_d = \sqrt{\frac{1}{n - 1} \left[ \sum d^2 - \frac{\left( \sum d \right)^2}{n} \right]} \]

df = n − 1, where n is the number of pairs

Note: this has the same form as the one-sample statistic t = (x̄ − µ)/(s/√n), with the
paired differences d taking the place of the observations x.

Example 4.12: Many engineering students have problems with data analysis using
statistical software. A professor who teaches a statistics for engineering course offered a
two-day workshop on this topic. The following table gives the test scores of seven engineering
students before and after they attended the workshop.

Before    56    69    48    74    65    71    58
After     62    73    44    85    71    70    69

Test at the 5% significance level whether attending the workshop increases the test scores.

Paired Samples Test

                          Paired Differences
                                                            95% Confidence Interval
                                                            of the Difference
                          Mean     Std. Deviation   Std. Error Mean   Lower     Upper    t        df   Sig. (2-tailed)
Pair 1   before - after   -4.714   5.648            2.135             -9.938    .510     -2.208   6    .069

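Solution:

Define: 1 = after, 2 = before, so that d = after − before

Step 1: Hypothesis
H0: µd = 0
H1: µd > 0

Step 2: Significance level
α = 0.05

Step 3: Test statistic
t_cal = +2.208 (the SPSS output is for before − after, so the sign is reversed)

Step 4: Critical value
t_cv = t_{0.05, 6} = 1.943

Step 5: Decision rule
Reject H0 if t_cal > t_cv
Since t_cal = 2.208 > t_cv = 1.943, reject H0

Step 6: Conclusion
There is sufficient evidence to indicate that attending the workshop increases the test
scores.

Note: the Sig. value in the SPSS output is for a two-tail test, so for this one-tail test it
must be divided by 2: 0.069 / 2 = 0.0345 < α = 0.05.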
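
The paired-sample statistic in Example 4.12 can be reproduced directly from the before and after scores. The sketch below takes d = after − before, so the statistic is positive; the critical value 1.943 is the t table value t(0.05, 6) used in the worked solution.

```python
from math import sqrt
from statistics import mean, stdev

# Test scores from Example 4.12.
before = [56, 69, 48, 74, 65, 71, 58]
after = [62, 73, 44, 85, 71, 70, 69]

# Paired differences, defined as after - before.
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = mean(d)
s_d = stdev(d)                         # sample standard deviation of the differences
t_cal = (d_bar - 0) / (s_d / sqrt(n))  # mu_d = 0 under H0

t_cv = 1.943                           # t(0.05, 6) from the t table
reject_h0 = t_cal > t_cv               # right-tailed test, H1: mu_d > 0
print(round(d_bar, 3), round(s_d, 3), round(t_cal, 3), reject_h0)
```

The magnitudes match the Paired Samples Test table; only the sign differs because SPSS used before − after.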

4.4 TESTING FOR THE DIFFERENCE AMONG MORE THAN TWO MEANS

4.4.1 ONE WAY ANALYSIS OF VARIANCE (ANOVA)

Analysis of Variance (ANOVA) is a method in which the total variation in a set of data is
partitioned into several components. The main reason for performing ANOVA is
to test the equality of more than two population means. These
components can be used to assess the effect of a factor on the response variable of interest.

Some terminology:

Terminology                              Definition
Response Variable/Dependent Variable     Variable of interest to be measured in the experiment
Factor/Independent Variable              Variable whose effect on the response variable is of interest
Factor Level                             Values of the factor utilized in the experiment
Treatment                                A factor level or combination of factor levels
Experimental Unit                        The object on which a measurement is taken

The following assumptions should be satisfied when applying one-way ANOVA:

i. Each of the groups must be a random sample from a normal population
ii. Population variances must be equal
iii. The samples are selected randomly and independently of one another

How does ANOVA work? The idea behind ANOVA is to compare the ratio of the between-group
variance to the within-group variance. If the variance between the
groups is much larger than the variance within each group, this
suggests that the means are not all the same. In order to test the equality of three or more
population means, we use the ANOVA F-test. The test statistic for the F-test is obtained
from the ANOVA table, so we first have to construct the ANOVA table.
The F-test only tells us whether there is a difference in the population
means; it does not tell us which pairs of means differ.
How to construct the ANOVA table:

Source of variation           Sum of squares   Degree of freedom   Mean of squares       F
Between groups (Treatment)    SSTr             k − 1               MSTr = SSTr/(k − 1)   F = MSTr/MSE
Within groups (Error)         SSE              N − k               MSE = SSE/(N − k)
Total                         SST              N − 1

Here k is the number of treatments. The critical value is F_cv = F_{α, k−1, N−k} (from the
F table), and H0 is rejected if F_cal > F_cv.

Note: ANOVA is used to compare at least three groups (for example, the CGPA of students
from three programmes); for two groups the independent t-test is used. If at least one mean
differs from the others, H0 is rejected.

\[ SS(Treatment) = SSTr = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - \frac{\left( \sum x \right)^2}{N} \]

\[ SS(Total) = SST = \sum x^2 - \frac{\left( \sum x \right)^2}{N} \]

\[ SS(Error) = SSE = SST - SSTr \]

Where,
k = Number of treatments
n_i = The size of sample i
T_i = The sum of the values in sample i
N = The number of values in all samples
  = n_1 + n_2 + ⋯ + n_k
∑x = The sum of the values in all samples
   = T_1 + T_2 + ⋯ + T_k
∑x² = The sum of the squares of the values in all samples

Note: the response (dependent) variable must be measured on an interval or ratio scale,
and the treatments are the groups being compared.

The procedure to perform the ONE-WAY ANOVA

Step 1: State the hypothesis

H0: The treatment means are equal (µ1 = µ2 = µ3 = ⋯)

H1: At least two treatment means differ

Step 2: Test statistic

F_cal (obtained from the ANOVA table)

Step 3: Decision rule: Reject H0 if F_cal > F_{α, k−1, N−k}

Where α is the level of significance, k − 1 is the degree of freedom for the numerator of the F
ratio and N − k is the degree of freedom for the denominator of the F ratio

Step 4: Decision

Step 5: Conclusion

Example 4.13: Fifteen fourth-grade students were randomly assigned to three groups to
experiment with three different methods of teaching arithmetic. At the end of the semester, the
same test was given to all 15 students. The table gives the scores of students in the three
groups.

         Method 1   Method 2   Method 3
         48         55         84
         73         85         68
         51         70         95
         65         69         74
         87         90         67
Total    324        369        388

At the 1% level of significance, can we reject the null hypothesis that the mean arithmetic scores of
all fourth-grade students taught by the three methods are the same?

Solution:

∑x = 324 + 369 + 388 = 1081, N = 15, k = 3

\[ SS(Method) = \frac{324^2}{5} + \frac{369^2}{5} + \frac{388^2}{5} - \frac{1081^2}{15} = 432.1333 \]

\[ SST = (48^2 + 73^2 + 51^2 + 65^2 + 87^2 + \cdots + 67^2) - \frac{1081^2}{15} = 80709 - 77904.0667 = 2804.9333 \]

\[ SSE = SST - SS(Method) = 2804.9333 - 432.1333 = 2372.8 \]

Source of variation   Sum of squares   Degree of freedom   Mean of squares             F
Method                432.1333         3 − 1 = 2           432.1333/2 = 216.0667       216.0667/197.7333 = 1.093
Error                 2372.8           15 − 3 = 12         2372.8/12 = 197.7333
Total                 2804.9333        15 − 1 = 14

Step 1: Hypothesis
H0: µ1 = µ2 = µ3
H1: At least two treatment means are different

Step 2: Significance level
α = 0.01

Step 3: Test statistic
F_cal = 1.093

Step 4: Critical value
F_cv = F_{0.01, 2, 12} = 6.93

Step 5: Decision rule
Reject H0 if F_cal > F_cv
Since F_cal = 1.093 < F_cv = 6.93, fail to reject H0.

Step 6: Conclusion
There is insufficient evidence to indicate that the mean arithmetic scores under the three
teaching methods are different.
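
The sums of squares in Example 4.13 follow mechanically from the formulas for SSTr, SST and SSE, so the ANOVA table can be verified with a few lines of code. This is a minimal sketch using plain Python; the variable names are illustrative and the critical value 6.93 is the F table value F(0.01, 2, 12) used in the worked solution.

```python
# Scores from Example 4.13, one list per teaching method.
groups = {
    "Method 1": [48, 73, 51, 65, 87],
    "Method 2": [55, 85, 70, 69, 90],
    "Method 3": [84, 68, 95, 74, 67],
}

k = len(groups)                                    # number of treatments
N = sum(len(v) for v in groups.values())           # total number of observations
grand_total = sum(sum(v) for v in groups.values())
correction = grand_total ** 2 / N                  # (sum of x)^2 / N

sst = sum(x ** 2 for v in groups.values() for x in v) - correction
sstr = sum(sum(v) ** 2 / len(v) for v in groups.values()) - correction
sse = sst - sstr

mstr = sstr / (k - 1)
mse = sse / (N - k)
f_cal = mstr / mse

f_cv = 6.93                                        # F(0.01, 2, 12) from the F table
reject_h0 = f_cal > f_cv
print(round(sstr, 4), round(sse, 4), round(sst, 4), round(f_cal, 3), reject_h0)
```

The same layout works for any number of groups, including groups of unequal size.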

Reconsider the previous example and conduct the F-test based on the SPSS output given
below.

ANOVA
Score

                  Sum of squares   df   Mean square   F       Sig.
Between Groups    432.133          2    216.067       1.093   .366
Within Groups     2372.800         12   197.733
Total             2804.933         14

Example 4.14: In a comparison of the strengths of concrete produced by four experimental
mixes, three specimens were prepared from each type of mix. Each of the 12 specimens was
subjected to increasingly compressive loads until breakdown. The accompanying table gives
the compressive loads, in tons per square inch, attained at breakdown.

Mix A    Mix B    Mix C    Mix D
2.30     2.20     2.15     2.25
2.20     2.10     2.15     2.15
2.25     2.20     2.20     2.25

Given SSE = 0.02, MSE = 0.003 with degree of freedom of 8,

a) Construct an ANOVA table for the above experiment.
b) At the 5% level of significance, can it be concluded that at least one of the concretes differs in
average strength from the others?

Solution:

SSE = 0.02, MSE = 0.02/8 = 0.0025 (0.003 when rounded), F_cal = 2.000

\[ SSTr = \frac{6.75^2}{3} + \frac{6.50^2}{3} + \frac{6.50^2}{3} + \frac{6.65^2}{3} - \frac{26.4^2}{12} = 0.015 \]

\[ SST = \sum x^2 - \frac{\left( \sum x \right)^2}{N} = 58.115 - \frac{26.4^2}{12} = 0.035 \]

(n is used for the size of each sample and N for the total number of values.)

a) ANOVA table

Source of variation   Sum of squares   Degree of freedom   Mean of squares     F
Mix                   0.015            4 − 1 = 3           0.015/3 = 0.005     0.005/0.0025 = 2.000
Error                 0.02             12 − 4 = 8          0.02/8 = 0.0025
Total                 0.035            12 − 1 = 11

b) Step 1: Hypothesis
H0: µ1 = µ2 = µ3 = µ4
H1: At least one treatment mean is different

Step 2: Significance level
α = 0.05

Step 3: Test statistic
F_cal = 2.000

Step 4: Critical value
F_cv = F_{0.05, 3, 8} = 4.07

Step 5: Decision rule
Reject H0 if F_cal > F_cv
Since F_cal = 2.000 < F_cv = 4.07, fail to reject H0
(with the rounded MSE = 0.003, F_cal = 1.667, and the decision is the same)

Step 6: Conclusion
There is insufficient evidence to indicate that at least one treatment mean is different.

4.5 TEST OF INDEPENDENCE

In the previous chapters, data were always assumed to follow a certain distribution such
as the Binomial, Poisson or Normal. In this subtopic, we discuss the test of independence,
which tests whether two categorical variables are independent, that is, whether there is an
association between them. The test uses the Chi-square distribution.

The test of independence is performed using the contingency table (cross tabulation). In a test
of independence for a contingency table, we test the null hypothesis that the two characteristics
of the elements of a given population are not related (i.e. they are independent) against the
alternative hypothesis that the two characteristics are related (i.e. they are dependent).

The assumptions for the Chi-square test

1. The data are obtained from a random sample
2. The expected frequency in each cell must be 5 or more. If any expected frequency is
less than 5, categories should be combined.

The Chi-square test statistic is given by the following formula:

\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]

With degrees of freedom, df = (r − 1)(c − 1)

Where O is the observed frequency (actual data) and
E is the expected frequency for a cell:

\[ E = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}} \]

The procedure to perform the test of independence:

Step 1: State the hypothesis

H0: The variables are independent (there is no association/relationship between them)

H1: The variables are dependent (there is an association/relationship between them)

Step 2: Test statistic

\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]

Step 3: Decision rule: Reject H0 if χ² > χ²_{α, (r−1)(c−1)}

Where r is the total number of rows and c is the total number of columns

Step 4: Decision

Step 5: Conclusion
Worked illustration: ethnicity and political party

Step 1: Hypothesis
H0: There is no association between ethnicity and political parties.
H1: There is an association between ethnicity and political parties.

Step 2: Significance level
α = 0.05

Observed table

Ethnicity    Political party (three categories)    Total
A            70      20      10                     100
B            30      80      90                     200
Total        100     100     100                    300

Expected table, E = (row total × column total)/grand total

Ethnicity    Political party (three categories)    Total
A            33.4    33.3    33.3                   100
B            66.6    66.7    66.7                   200
Total        100     100     100                    300

For example, (100 × 100)/300 = 33.3 and (100 × 200)/300 = 66.7. One decimal in each row is
adjusted so that the expected counts add up to the row and column totals.

Step 3: Test statistic
\[ \chi^2_{cal} = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(70 - 33.4)^2}{33.4} + \frac{(20 - 33.3)^2}{33.3} + \cdots + \frac{(90 - 66.7)^2}{66.7} = 92.6268 \]

Step 4: Critical value
χ²_cv = χ²_{α, (r−1)(c−1)} = χ²_{0.05, 2} = 5.991 (from the Chi-square table)

Step 5: Decision rule
Reject H0 if χ²_cal > χ²_cv
Since χ²_cal = 92.6268 > χ²_cv = 5.991, reject H0

Step 6: Conclusion
There is an association between ethnicity and political parties.

Example 4.15: A random sample of 400 people is selected from all the 16-year-olds in a town.
The variables recorded were temper (vile or mild) and hair colour (red, brown or black).
The observed frequencies are displayed in the table below. Test the hypothesis that temper and
hair colour are independent at the 5% level of significance.

                    Temper
Hair colour     Vile     Mild     Total
Red              40       20        60
Brown            80      100       180
Black            60      100       160
Total           180      220       400
Solution:

                                        Temper
Hair colour                         Vile     Mild     Total
Red         Count                    40       20        60
            Expected Count           27       33
Brown       Count                    80      100       180
            Expected Count           81       99
Black       Count                    60      100       160
            Expected Count           72       88
Total                               180      220       400
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

$$\chi^2 = \frac{(40-27)^2}{27} + \frac{(20-33)^2}{33} + \frac{(80-81)^2}{81} + \frac{(100-99)^2}{99} + \frac{(60-72)^2}{72} + \frac{(100-88)^2}{88}$$

$\chi^2 = 6.2593 + 5.1212 + 0.0123 + 0.0101 + 2 + 1.6364$

$\chi^2 = 15.0393$

Step 1: State the hypothesis

H0: Hair colour is independent of temper.

H1: Hair colour is dependent on temper.

Step 2: Test statistic

$\chi^2 = 15.0393$

Step 3: Decision rule: Reject H0 if $\chi^2 > \chi^2_{\alpha,(r-1)(c-1)} = \chi^2_{0.05,(3-1)(2-1)} = 5.991$

Step 4: Decision

Since $\chi^2 = 15.0393 > \chi^2_{0.05,(3-1)(2-1)} = 5.991$, we reject H0.

Step 5: Conclusion

There is enough evidence to conclude that hair colour is dependent on temper.
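
The arithmetic in Example 4.15 can be checked with a short Python sketch. The variable names are illustrative assumptions. The critical value uses the fact that, for df = 2 only, P(χ² > x) = exp(−x/2), so the upper-α critical value equals −2 ln α; for other degrees of freedom a table is needed. The sketch also groups the cell contributions by hair colour to show which row drives the result.

```python
import math

# Observed counts from Example 4.15: rows are hair colours, columns are (Vile, Mild).
observed = {"Red": (40, 20), "Brown": (80, 100), "Black": (60, 100)}

col_totals = [sum(col) for col in zip(*observed.values())]  # [180, 220]
grand_total = sum(col_totals)                               # 400

contributions = {}
for colour, row in observed.items():
    row_total = sum(row)
    # E = (row total x column total) / grand total
    expected = [row_total * c / grand_total for c in col_totals]
    # Each cell contributes (O - E)^2 / E to the statistic.
    contributions[colour] = sum((o - e) ** 2 / e for o, e in zip(row, expected))

statistic = sum(contributions.values())

# Valid for df = (3 - 1)(2 - 1) = 2 only: critical value = -2 ln(alpha).
alpha = 0.05
critical = -2 * math.log(alpha)

for colour, value in contributions.items():
    print(f"{colour}: {value:.4f}")
print(f"statistic = {statistic:.4f}, critical = {critical:.3f}, "
      f"reject H0: {statistic > critical}")
```

The Red row alone contributes about 11.38 of the 15.04 total, so most of the departure from independence comes from red-haired respondents.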


Example 4.16: The attendance and examination results of a random sample of 60 pupils are given
in the table below. The data were analyzed using SPSS and the output is as follows:

Attendance * Exam result Crosstabulation

                                            Exam result
Attendance                              Pass     Failed    Total
Excellent       Count                    25        10        35
                Expected Count            K       11.7      35.0
Satisfactory    Count                    10         5        15
                Expected Count          10.0       5.0      15.0
Poor            Count                     5         5        10
                Expected Count           6.7       3.3      10.0
Total           Count                    40        20        60
                Expected Count          40.0      20.0      60.0

Chi-Square Tests

                                   Value     df    Asymp. Sig. (2-sided)
Pearson Chi-Square                   L        2          .448
Likelihood Ratio                   1.544      2          .462
Linear-by-Linear Association       1.422      1          .233
N of Valid Cases                    60

a. 1 cells (16.7%) have expected count less than 5. The minimum
expected count is 3.33.

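a. Find the value K and L

$K = E = \dfrac{(\text{row total}) \times (\text{column total})}{\text{grand total}} = \dfrac{35 \times 40}{60} = 23.33$

$$L = \chi^2 = \sum \frac{(O - E)^2}{E}$$

$$L = \chi^2 = \frac{(25-23.33)^2}{23.33} + \frac{(10-11.7)^2}{11.7} + \frac{(10-10)^2}{10} + \frac{(5-5)^2}{5} + \frac{(5-6.7)^2}{6.7} + \frac{(5-3.3)^2}{3.3}$$

$L = \chi^2 = 0.1191 + 0.2470 + 0 + 0 + 0.4313 + 0.8758$

$L = \chi^2 = 1.6732$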

b. State the null and alternative hypotheses for this study

H0: Examination result and attendance are independent of each other
H1: Examination result and attendance are dependent on each other

c. By using the p-value, test at the 5% level of significance whether examination result is
independent of attendance.

Step 2: Test statistic

p-value = 0.448

Step 3: Decision rule: Reject H0 if p-value < α

Step 4: Decision

Since p-value = 0.448 > α = 0.05, we fail to reject H0.

Step 5: Conclusion

There is not enough evidence to conclude that examination result and attendance are
dependent on each other.
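
As a cross-check on Example 4.16, the Python sketch below recomputes K and L from the observed counts. Variable names are illustrative assumptions. It evaluates L twice: once with the expected counts as printed to one decimal place (the hand calculation in part a), and once with unrounded expected counts. The p-value uses the identity P(χ² > x) = exp(−x/2), which holds for df = 2 only.

```python
import math

# Observed counts from the crosstabulation in Example 4.16:
# rows Excellent, Satisfactory, Poor; columns Pass, Failed.
observed = [[25, 10], [10, 5], [5, 5]]

row_totals = [sum(row) for row in observed]        # [35, 15, 10]
col_totals = [sum(col) for col in zip(*observed)]  # [40, 20]
n = sum(row_totals)                                # 60

# Unrounded expected counts, E = (row total x column total) / grand total.
exact = [[r * c / n for c in col_totals] for r in row_totals]
K = exact[0][0]                                    # (35 x 40) / 60


def chi_square(obs, exp):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(obs, exp)
               for o, e in zip(obs_row, exp_row))


L_exact = chi_square(observed, exact)

# Expected counts as printed in the output (one decimal place), with K left
# unrounded as in the hand calculation.
printed = [[K, 11.7], [10.0, 5.0], [6.7, 3.3]]
L_printed = chi_square(observed, printed)

# df = (3 - 1)(2 - 1) = 2, so the p-value is exp(-x / 2).
p_value = math.exp(-L_exact / 2)
alpha = 0.05

print(f"K = {K:.2f}")
print(f"L from printed expected counts   = {L_printed:.4f}")
print(f"L from unrounded expected counts = {L_exact:.4f}")
print(f"p-value = {p_value:.3f}, reject H0: {p_value < alpha}")
```

With the printed expected counts the sketch reproduces L = 1.6732. With unrounded expected counts the statistic is about 1.6071, and it is this value that reproduces the Asymp. Sig. of .448 shown in the output; the decision at the 5% level is the same either way.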
