Copy of STA404_Students Module_v4
Example 3.4: The breaking strengths of 11 bundles of wool fibers have a sample mean of 436.5
and a sample standard deviation of 11.90. Assume the breaking strengths of the population
are normally distributed. Construct a 90% confidence interval for the mean breaking strength
of wool fibers.
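Solution:
σ is unknown and the sample is small (n = 11), so the t distribution (Table 7) is used.
Step 1: Find α
α = 1 - 0.90 = 0.10, α/2 = 0.05
Step 2: Find $t_{\alpha/2,\,n-1}$
$t_{0.05,\,10} = 1.812$
Step 3: Find CI
$\mu = \bar{x} \pm t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}} = 436.5 \pm 1.812\left(\dfrac{11.90}{\sqrt{11}}\right) = 436.5 \pm 6.5015$
$429.9985 < \mu < 443.0015$
Step 4: Conclusion
We are 90% confident that the mean breaking strength of wool fibers is between 429.9985 and 443.0015.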
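The interval in Example 3.4 can be checked with a short script. This is a minimal sketch, not part of the module: the function name `t_interval_mean` is made up for illustration, and the critical value 1.812 is assumed to be read from a t table rather than computed.

```python
import math

def t_interval_mean(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Example 3.4: n = 11, 90% confidence, t_{0.05,10} = 1.812 (assumed table value).
lower, upper = t_interval_mean(436.5, 11.90, 11, 1.812)
margin = (upper - lower) / 2
print(round(lower, 4), round(upper, 4), round(margin, 4))
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries; the table lookup stays a manual step, as in the worked solution.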
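Worked steps for a further 90% interval with n = 9:
Step 1: Find α
α = 1 - 0.90 = 0.10
Step 2: Find $t_{\alpha/2,\,n-1}$
$t_{0.05,\,9-1} = t_{0.05,\,8} = 1.860$
Step 3: Find CI
$\mu = \bar{x} \pm t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}}$
$1.5661 < \mu < 2.2140$
Step 4: Conclusion
The conclusion must state the unit of measurement.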
NZZ & NHNMS, UiTM SHAH ALAM
STA404 STATISTICS FOR BUSINESS AND SOCIAL SCIENCES
Example 3.6: A sample of 15 bulbs was tested and the lengths of life (in hours) were recorded.
[Data table and SPSS output (One-Sample Statistics; One-Sample Test with Test Value = 0 and Lower/Upper confidence limits) not reproduced]
In this section we will discuss estimation procedures for the difference between two population
means. There are two different types of interval estimation for difference between two means,
namely independent and dependent samples.
Example:
1) We may want to find interval estimation on the difference between the mean lengths
of insects measured by two different microscopes
2) We may want to find interval estimation on the difference between the mean pH in
rainfall of two different areas.
Let $\mu_1$ and $\mu_2$ be the means of the first and second populations respectively. We want to find the
confidence interval for the difference between the two population means, $\mu_1 - \mu_2$. The difference $\bar{x}_1 - \bar{x}_2$
is the sample statistic used to construct the confidence interval.
3.7.1 Confidence Interval for Difference between Two Population Means -Independent
samples
Two samples are independent if they are drawn from two different populations and the elements
of the first sample have no relationship to the elements of the second sample.
Example: To estimate the difference between the weights of male and female students, we
select two samples, one from the population of male students and another from the population
of female students. These two samples are independent because they are chosen from two
different populations, and the samples have no effect on each other.
Note:
σ known → use Z.
σ unknown → first determine whether the variances are equal or not, because the formulas differ.
To know whether the variances are equal or not:
① it is stated in the question,
② from the SPSS output, or
③ by calculation.
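Assumptions :
i. Either $n \ge 30$ or $n < 30$
ii. Populations are normally distributed
iii. Population variances $\sigma_1^2$ and $\sigma_2^2$ are known
$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$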
Example 3.7: An experiment was conducted in which two types of engines, A and B, were
compared. Gas mileage in miles per gallon was measured. 75 experiments were conducted
using engine type A and 50 experiments were done for engine type B. The gasoline used and
other conditions were held constant. The average gas mileage for engine A was 42 miles per
gallon and the average for engine B was 36 miles per gallon. Find a 96% confidence interval
on $\mu_A - \mu_B$, where $\mu_A$ and $\mu_B$ are the population mean gas mileages for engine A and engine B,
respectively. Assume that the population standard deviations are 8 and 6 for engine A and B
respectively.
Solution:
$n_A = 75$ experiments, $n_B = 50$ experiments
$\bar{x}_A = 42$ miles per gallon, $\bar{x}_B = 36$ miles per gallon
$\sigma_A = 8$ miles per gallon, $\sigma_B = 6$ miles per gallon
Step 1: Find α
α = 1 - 0.96 = 0.04
Step 2: Find $Z_{\alpha/2}$
$Z_{0.04/2} = Z_{0.02} = 2.0537$
Step 3: Find CI
$CI = (42 - 36) \pm 2.0537\sqrt{\dfrac{8^2}{75} + \dfrac{6^2}{50}} = 6 \pm 2.5760$
$3.4240 < \mu_A - \mu_B < 8.5760$, or $CI = (3.4240, 8.5760)$
Step 4: Conclusion
Therefore, at the 96% confidence level, the difference between the mean gas mileage of engine A and engine B is between 3.4240 and 8.5760 miles per gallon.
b) From the answer in a), can we ...
If the interval contains 0, the two means may be equal. If the interval does not contain 0, the two means are unequal.
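The arithmetic in Example 3.7 can be reproduced with Python's standard library. This is a minimal sketch under the stated assumptions (known population standard deviations); the function name `z_interval_two_means` is hypothetical.

```python
import math
from statistics import NormalDist

def z_interval_two_means(x1, x2, sigma1, sigma2, n1, n2, confidence):
    """Z interval for mu1 - mu2 when both population sigmas are known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # Z_{alpha/2}
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # standard error of the difference
    diff = x1 - x2
    return z, diff - z * se, diff + z * se

# Example 3.7: engines A and B, 96% confidence.
z, lower, upper = z_interval_two_means(42, 36, 8, 6, 75, 50, 0.96)
print(round(z, 4), round(lower, 4), round(upper, 4))
```

Because the resulting interval excludes 0, it supports the reading that the two mean mileages differ.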
Assumptions :
$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Example 3.8: Two kinds of thread are being compared for strength. Fifty pieces of each type
of thread are tested under similar conditions. Brand A had an average tensile strength of 78.3
kilograms with a standard deviation of 5.6 kilograms, while Brand B had an average tensile
strength of 87.2 kilograms with a standard deviation of 6.3 kilograms. Construct a 95%
confidence interval for the difference of the population means.
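Solution:
Step 1: Find α
α = 1 - 0.95 = 0.05
Step 2: Find $Z_{\alpha/2}$
Both samples are large (n = 50), so the Z formula above applies: $Z_{0.025} = 1.96$
Step 3: Find CI
$CI = (78.3 - 87.2) \pm 1.96\sqrt{\dfrac{5.6^2}{50} + \dfrac{6.3^2}{50}} = -8.9 \pm 2.3364$
$CI = (-11.2364, -6.5636)$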
Assumptions:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\; s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, where $df = n_1 + n_2 - 2$
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
Example 3.9: An insurance company wants to know if the average speed at which men drive
cars is greater than that of women drivers. The company took a random sample of 26 cars
driven by men on a highway and found the mean speed to be 72 miles per hour with a standard
deviation of 2.2 miles per hour. Another sample of 16 cars driven by women on the same
highway gave a mean speed of 68 miles per hour with standard deviation of 2.5 miles per
hour. Assume that the speeds at which all men and all women drive cars on this highway are
both normally distributed with the same population standard deviation. Construct a 98%
confidence interval for the difference between the mean speeds of cars driven by all men and
all women on this highway. (Ans: $s_p = 2.317$, (2.216, 5.784))
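Solution:
$n_{male} = 26$ cars, $n_{female} = 16$ cars
$\bar{x}_{male} = 72$ mph, $\bar{x}_{female} = 68$ mph
The population variances are equal, so $df = v = n_1 + n_2 - 2$.
Step 1: Find α
α = 1 - 0.98 = 0.02
Step 2: Find $t_{\alpha/2,\,df}$ (Table 7)
$df = n_1 + n_2 - 2 = 26 + 16 - 2 = 40$
$t_{0.02/2,\,40} = t_{0.01,\,40} = 2.423$
Step 3: Find $s_p$
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{(26 - 1)2.2^2 + (16 - 1)2.5^2}{26 + 16 - 2}} = 2.3171$
Step 4: Find CI
$CI = (72 - 68) \pm 2.423(2.3171)\sqrt{\dfrac{1}{26} + \dfrac{1}{16}} = (2.216, 5.784)$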
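The pooled-variance steps of Example 3.9 can be sketched in code as follows. The function names `pooled_sd` and `pooled_t_interval` are hypothetical, and the critical value 2.423 is assumed to come from the t table (df = 40), as in the worked solution.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation s_p for two samples with equal population variances."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Return (s_p, lower, upper) for the equal-variance t interval on mu1 - mu2."""
    sp = pooled_sd(s1, n1, s2, n2)
    margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return sp, diff - margin, diff + margin

# Example 3.9: men (n = 26) vs women (n = 16), 98% confidence, t_{0.01,40} = 2.423.
sp, lower, upper = pooled_t_interval(72, 2.2, 26, 68, 2.5, 16, 2.423)
print(round(sp, 4), round(lower, 3), round(upper, 3))
```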
[SPSS output: Group Statistics table and test table (with Lower and Upper confidence limits) not reproduced]
a) Based on the p-value in the Levene’s Test, test the equality of variances in this study. Use
α = 0.05
b) State the 95% confidence interval on the differences between the average lifetimes of the
two brands.
c) Based on the confidence interval, can we conclude that the average lifetimes of the two
brands are equal?
a) Identify whether brand A and brand B have equal variances.
Step 1: Hypothesis
$H_0: \sigma_A^2 = \sigma_B^2$
$H_1: \sigma_A^2 \neq \sigma_B^2$
Step 2: Find α
α = 0.05
Decision: Fail to reject $H_0$.
Conclusion: Brand A and brand B have equal variances.
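Assumptions:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$, where
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$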
Example 3.11:The breaking strengths of 11 bundles of wool fibres have a sample mean 436.5
and a sample standard deviation of 11.90. In addition, the breaking strengths of another 12
bundles of synthetic fibres have a sample mean 452.8 and a sample standard deviation 3.61.
Assume the breaking strengths of the two populations are normally distributed with unequal
variances. Construct a 95% confidence interval on the mean difference of breaking strengths
between wool fibres and synthetic fibres. Explain your answer. (Ans: (-24.6245, -7.9755) OR
(-24.5235, -8.0764))
Solution:
Step 1: Find α
α = 1 - 0.95 = 0.05
Step 2: Find $t_{\alpha/2,\,df}$ (Table 7)
$t_{0.05/2,\,df} = t_{0.025,\,11} = 2.201$
Step 3: Find df
$df = \dfrac{\left(\dfrac{11.90^2}{11} + \dfrac{3.61^2}{12}\right)^2}{\dfrac{\left(11.90^2/11\right)^2}{11 - 1} + \dfrac{\left(3.61^2/12\right)^2}{12 - 1}} = 11.6828 \approx 11$
Step 4: Find CI
$CI = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$CI = (436.5 - 452.8) \pm 2.201\sqrt{\dfrac{11.90^2}{11} + \dfrac{3.61^2}{12}}$
$CI = -16.3 \pm 8.2235$
$CI = (-24.5235, -8.0764)$
Step 5: Conclusion
The 95% confidence interval of the mean difference of breaking strength between wool fibres and synthetic fibres is between -24.5235 and -8.0764.
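The degrees-of-freedom formula for unequal variances is easy to get wrong by hand, so a short check is useful. This is a sketch only: `welch_df` and `welch_interval` are hypothetical helper names, and the critical value 2.201 is assumed to be read from the t table after rounding df down to 11.

```python
import math

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom for the unequal-variance t interval."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def welch_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Return (lower, upper) for (x1 - x2) +/- t_crit * sqrt(s1^2/n1 + s2^2/n2)."""
    margin = t_crit * math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# Example 3.11: wool (n = 11) vs synthetic (n = 12) fibres, 95% confidence.
df = welch_df(11.90, 11, 3.61, 12)
lower, upper = welch_interval(436.5, 11.90, 11, 452.8, 3.61, 12, 2.201)
print(round(df, 4), round(lower, 4), round(upper, 4))
```

The upper limit printed by this sketch can differ from the listed answer in the fourth decimal place because of rounding in the hand calculation.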
Example 3.12:A set of facilitation tools to help with data analysis for problem solving is being
developed by a group of statisticians at UiTM. In order to test effectiveness of these tools, a
group of research officers were asked to analyze and produce a built-in report for a set of data
on the computer. Twelve equally capable research officers were randomly selected and six
were randomly assigned a standard procedure to complete the task. The other six were asked
to do the task using the developed facilitation tools. The response measured was the time to
completion (in minutes). The output of statistical analysis is shown in the following tables.
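Group Statistics
[table not reproduced]
                                              F      Sig.   t      df     Sig.        Mean        Std. Error  95% Confidence Interval
                                                                          (2-tailed)  Difference  Difference  Lower     Upper
Time completion  Equal variances assumed      1.231  .003   9.908  10     .000        29.000      2.927       22.478    35.522
                 Equal variances not assumed                9.908  8.908  .000        29.000      2.927       22.368    35.632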
a) Based on the p-value in the Levene’s Test, test the equality of variances in this study. Use
α = 0.05
b) State the 95% confidence interval to estimate the difference between the average
completion times for the two procedures.
c) Based on the confidence interval, can we conclude that the average completion times
for the two procedures differ?
d) Show that the degrees of freedom for unequal variances is 8.908.
Let 1 = standard procedure
Let 2 = facilitation tool
a) Levene's Test
1. $H_0: \sigma_1^2 = \sigma_2^2$
   $H_1: \sigma_1^2 \neq \sigma_2^2$
2. p-value (Sig.) = 0.003
3. α = 0.05
4. Reject $H_0$ if p-value ≤ α.
   Since p-value = 0.003 < α = 0.05, reject $H_0$. The variances are not equal.
Step 1: Find α
α = 1 - 0.95 = 0.05
Step 2: Find $t_{\alpha/2,\,df}$ (Table 7)
$t_{0.05/2,\,8} = t_{0.025,\,8} = 2.306$
Step 3: Find df
$df = \dfrac{\left(\dfrac{5.891^2}{6} + \dfrac{4.087^2}{6}\right)^2}{\dfrac{\left(5.891^2/6\right)^2}{6 - 1} + \dfrac{\left(4.087^2/6\right)^2}{6 - 1}} = 8.9079 \approx 8$
Step 4: Find CI
$CI = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$CI = (66.50 - 36.50) \pm 2.306\sqrt{\dfrac{5.891^2}{6} + \dfrac{4.087^2}{6}}$
$CI = 30 \pm 6.7499$
$CI = (23.2501, 36.7499)$
Step 5: Conclusion
The 95% confidence interval of the mean difference between the average completion times
for the standard procedure and the facilitation tool is between 23.2501 and 36.7499 minutes.
3.7.2 Confidence Interval for Difference Between Two Population Means - Dependent
samples
Matched or paired samples involve a procedure whereby pairs of observations are matched
as closely as possible according to certain relevant characteristics. The two sets of observations
are then subjected to two different treatments. Because the pairs of observations selected are
similar in characteristics, any difference between the two sets of observations is
attributed to the treatment alone.
The point estimate for the mean difference between two observations from matched samples
is:
$\hat{\mu}_d = \bar{d}$
The (1-α) 100% confidence interval for the mean difference between two observations from
matched samples, $\mu_d$, is:
$\bar{d} \pm t_{\alpha/2,\,n-1}\dfrac{s_d}{\sqrt{n}}$
where
$n$ = number of differences,
$\bar{d} = \dfrac{\sum d}{n}$ and $s_d = \sqrt{\dfrac{1}{n-1}\left[\sum d^2 - \dfrac{\left(\sum d\right)^2}{n}\right]}$,
and $d$ is the difference between the two observations in each matched pair.
Example 3.13: The manufacturer of a gasoline additive claimed that the use of this additive
increases gasoline mileage. A random sample of six cars was selected and these cars were
driven for one week without the gasoline additive and then for one week with the gasoline
additive. The following table gives the miles per gallon for these cars without and with the
gasoline additive.
Construct a 95% confidence interval for the difference in mean mileage per gallon for cars
without and with the gasoline additive. (Ans:(-3.2150,-0.2184))
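Solution:
Differences:
d = [26.3 - 24.6, 31.7 - 28.3, 18.2 - 18.9, 25.3 - 23.7, 18.3 - 15.4, 30.9 - 29.5]
  = [1.7, 3.4, -0.7, 1.6, 2.9, 1.4]
$\bar{d} = \dfrac{1.7 + 3.4 + (-0.7) + 1.6 + 2.9 + 1.4}{6} \approx 1.717$
$s_d = \sqrt{\dfrac{(1.7 - 1.717)^2 + \dots + (1.4 - 1.717)^2}{6 - 1}} = \sqrt{2.0376} \approx 1.4274$
Standard error of the mean: $\dfrac{s_d}{\sqrt{n}} = \dfrac{1.4274}{\sqrt{6}} \approx 0.5827$
df = n - 1 = 6 - 1 = 5
$t_{0.025,\,5} = 2.571$
$CI = \bar{d} \pm t_{0.025,\,5}\dfrac{s_d}{\sqrt{n}} = 1.7167 \pm 1.4983 = (0.2184, 3.2150)$
Taking the differences in the opposite order reverses the signs and gives (-3.2150, -0.2184), the stated answer.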
Example 3.14: Ariff is the Human Resources Director at the head office of a reputable bank
in Ipoh. Ariff finds that absenteeism among the bank’s employees is quite high, leading to poor
morale and slow performance. In order to boost employee performance and lower absenteeism
among his employees, he sent the bank’s employees to attend “The Innersole of Highly
Effective People”, a training program conducted by Top Performers Sdn.Bhd. In order to test
the effectiveness of the training program, he selected a random sample of 12 employees and
gathered data on the number of days these employees were absent from work in the six months
before the training program. He then collected the same data for the six months after the training
program. The data is shown in the table.
Employee  Before  After  d = Before - After  d²
A         ...     ...    ...                 ...
B         ...     ...    ...                 ...
C         10      6      4                   16
D         6       3      3                   9
E         7       8      -1                  1
F         9       5      4                   16
G         11      6      5                   25
H         5       3      2                   4
I         7       4      3                   9
J         12      10     2                   4
K         10      5      5                   25
L         12      6      6                   36
Determine and interpret the 95% confidence interval for the mean difference in number of days
employees were absent before and after the training program. (Ans: (2.1326, 4.7008))
Solution:
$\bar{d} = \dfrac{\sum d}{n} = \dfrac{41}{12} = 3.4167$
$s_d = \sqrt{\dfrac{1}{n-1}\left[\sum d^2 - \dfrac{\left(\sum d\right)^2}{n}\right]} = \sqrt{\dfrac{1}{11}\left[185 - \dfrac{41^2}{12}\right]} = 2.0207$
$\dfrac{s_d}{\sqrt{n}} = \dfrac{2.0207}{\sqrt{12}} = 0.5833$
α = 0.05, α/2 = 0.025, df = 12 - 1 = 11
$t_{0.025,\,11} = 2.201$
$\mu_d = \bar{d} \pm t_{0.025,\,11}\left(\dfrac{s_d}{\sqrt{n}}\right) = 3.4167 \pm 2.201(0.5833) = 3.4167 \pm 1.2838 = (2.1329, 4.7005)$
This agrees with the stated answer up to rounding.
Example 3.15: A random sample of nine local banks shows their deposit (in billions of dollars)
3 years ago and their deposits (in billions of dollars) today. Assume the variable is normally
distributed. The data is shown in the table.
Bank                     1        2       3       4       5       6       7       8       9       Total    Mean
3 years ago              11.42    8.41    3.98    7.37    2.28    1.10    1.00    0.90    1.35    37.81    4.20111
Today                    16.69    9.44    6.53    5.58    2.92    1.88    1.78    1.50    1.22    47.54    5.28222
d = 3 years ago - today  -5.27    -1.03   -2.55   1.79    -0.64   -0.78   -0.78   -0.60   0.13    -9.73
d²                       27.7729  1.0609  6.5025  3.2041  0.4096  0.6084  0.6084  0.3600  0.0169  40.5437
The data collected was analyzed and the output is shown below
[SPSS output: Paired Differences (with Lower and Upper confidence limits), t and df; values not reproduced]
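Mean 3 years ago: $\bar{x} = 37.81/9 = 4.20111$; mean today: $\bar{x} = 47.54/9 = 5.28222$
Variance: $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$
Mean difference: $\bar{d} = -9.73/9 = -1.0811$
$s_d = \sqrt{\dfrac{(-5.27 - (-1.0811))^2 + \dots + (0.13 - (-1.0811))^2}{9 - 1}} = \sqrt{3.7531} \approx 1.937$
Standard error: $\dfrac{s_d}{\sqrt{n}} = \dfrac{1.937}{\sqrt{9}} \approx 0.646$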
Example 3.16: Many engineering students are having problems in data analysis using
statistical software. A professor who teaches statistics for engineering course offered a two
day workshop on this topic. The following table gives the test scores of seven engineering
students before and after they attended the workshop.
Before 56 69 48 74 65 71 58
After 62 73 44 85 71 70 69
The data collected was analyzed and the output is shown below
                         Paired Differences
                         Mean    Std. Deviation  Std. Error Mean  Lower   Upper  t       df  Sig. (2-tailed)
Pair 1  before - after   -4.714  5.648           2.135            -9.938  .510   -2.208  6   .069
a) Show that the 95% confidence interval for the difference in mean test scores before and
after attending the workshop is between -9.94 and 0.51.
b) Can we conclude that attending the workshop increases the test score?
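Part a) of Example 3.16 can be checked directly from the raw scores. This is a minimal sketch: the function name `paired_t_interval` is hypothetical, and the critical value $t_{0.025,\,6} = 2.447$ is assumed to be read from the t table.

```python
import math
from statistics import mean, stdev

before = [56, 69, 48, 74, 65, 71, 58]
after = [62, 73, 44, 85, 71, 70, 69]

def paired_t_interval(first, second, t_crit):
    """Paired-sample interval: d_bar +/- t_crit * s_d / sqrt(n), with d = first - second."""
    d = [a - b for a, b in zip(first, second)]
    d_bar = mean(d)
    s_d = stdev(d)                      # sample standard deviation (n - 1 divisor)
    se = s_d / math.sqrt(len(d))
    return d_bar, s_d, se, d_bar - t_crit * se, d_bar + t_crit * se

d_bar, s_d, se, lower, upper = paired_t_interval(before, after, 2.447)
print(round(d_bar, 3), round(s_d, 3), round(se, 3), round(lower, 3), round(upper, 3))
```

The interval contains 0, which is consistent with the two-tailed Sig. value of .069 in the output being larger than 0.05.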
EXERCISE CHAPTER 3
2. A random sample of 15 items is taken, producing a sample mean of 2.364 with a sample
variance of 0.81. Assume x is normally distributed and construct 95% confidence interval
for the population mean.
3. The ACT scores from a random sample of 61 high school seniors were analyzed and found
to have a mean of 25.1 and a standard deviation of 3.6. Find a 95% confidence interval
for the population mean.
4. The drying time (in hours) of a certain brand of latex paint were recorded as follows:
a) Estimate the values of the population mean and population standard deviation
b) Construct a 95% confidence interval for the population mean.
5. To determine the flow characteristics of oil through a valve, the inlet oil temperature is
measured in degrees Fahrenheit. The following are a sample of 8 readings:
Construct a 99% confidence interval for the mean inlet oil temperature.
6. A dietitian wishes to see if a person’s cholesterol level will change if the diet is
supplemented by a certain mineral. Six randomly selected subjects were pretested, and
then they took the mineral supplement for a 6-week period.
[SPSS output: Paired Differences table not reproduced]
Construct a 90% confidence interval for the mean difference in cholesterol level before
and after diet.
7. A new treatment was proposed to fight breast cancer. Six randomly selected new breast
cancer patients were treated with the new treatment. For comparison, five patients with
the old treatments were also selected at random. The survival times, in years from the time
treatments started are recorded as follows.
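Group Statistics
[table not reproduced]
                                       F     Sig.  t      df     Sig.        Mean        Std. Error  95% Confidence Interval
                                                                 (2-tailed)  Difference  Difference  Lower     Upper
Survival  Equal variances assumed      .505  .495  1.773  9      .110        1.433       Y           -.395     3.262
          Equal variances not assumed              1.819  8.976  .102        1.433       .788        -.350     3.217
Note: $SE = \dfrac{s_d}{\sqrt{n}}$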
CHAPTER 4:
HYPOTHESIS TESTING
When the value of a population parameter is unknown, it is estimated
either by a point estimate or by an interval estimate. When a statement (hypothesis) or
claim is made about the value of a population parameter, we test whether the statement
is true or not. The procedure for doing this is called hypothesis testing or a test of significance.
Since the statement is about a population parameter, we take a random
sample from that population and calculate the sample statistics. Based on the
information from the sample, we decide whether or not to reject the
statement.
Null hypothesis is a statement that the population parameter has a specific value. We
use $H_0$ to represent the null hypothesis. The null hypothesis is always stated using the
equal sign.
$H_0: \theta = \theta_0$
Alternative hypothesis is the hypothesis opposite to $H_0$, and it is accepted
if $H_0$ is rejected. It is also known as the research hypothesis. The alternative hypothesis can
take two forms: directional and non-directional.
a) Non-directional
$H_1: \theta \neq \theta_0$
b) Directional
$H_1: \theta < \theta_0$
$H_1: \theta > \theta_0$
The two forms of the alternative hypothesis give two types of tests: two-tailed (non-directional) and one-tailed (directional) tests.
Significance level is the maximum probability of committing a Type I error. This probability is
symbolized by $\alpha$ (Greek letter alpha).
Test statistic is a single number calculated from the sample data and used as the basis for deciding whether to
reject or not to reject the null hypothesis.
The entire set of values that the test statistic may assume is divided into two regions. The
region consisting of values that support $H_1$ and lead to rejecting $H_0$ is called the rejection region
(critical region). The region consisting of values that support $H_0$ is called the acceptance
region (non-rejection region).
The value of the test statistic that separates the non-rejection region from the rejection region is the
critical value.
A Type I error occurs when the null hypothesis is rejected when it is true. The value of
$\alpha$ represents the probability of committing a Type I error.
A Type II error occurs when the null hypothesis is not rejected when it is false. The
value of $\beta$ represents the probability of committing a Type II error.
Power of test is the probability of rejecting $H_0$ given that a specific alternative is true, that is, of
making a correct decision. It is the probability of not committing a Type II error.
$Power = 1 - \beta$
The following table gives a summary of possible results of any hypothesis test.
                   Decision
Null Hypothesis    Reject $H_0$        Accept $H_0$
$H_0$ is true      Type I error        Correct decision
$H_0$ is false     Correct decision    Type II error
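To make α, β and power concrete, the sketch below computes them for a left-tailed z test. The numbers (a null mean of 24, σ = 3, n = 36 and a true mean of 23) are illustrative assumptions, and `power_left_tailed_z` is a hypothetical helper name, not something defined in this module.

```python
from statistics import NormalDist

def power_left_tailed_z(mu0, mu_true, sigma, n, alpha):
    """P(reject H0) for H1: mu < mu0 when the true mean is mu_true."""
    se = sigma / n ** 0.5
    # H0 is rejected when the sample mean falls below this cut-off.
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se
    return NormalDist(mu_true, se).cdf(cutoff)

alpha = 0.05
type_i = power_left_tailed_z(24, 24, 3, 36, alpha)  # H0 true: rejection probability equals alpha
power = power_left_tailed_z(24, 23, 3, 36, alpha)   # assumed true mean of 23
beta = 1 - power                                    # probability of a Type II error
print(round(type_i, 4), round(power, 4), round(beta, 4))
```

When the true mean equals the null value, the rejection probability is exactly α, which is the Type I error cell of the table. For any other true mean, the rejection probability is the power, 1 - β.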
Situation A
A contractor wishes to lower heating bills by using a special type of insulation
in houses. The average of the monthly heating bill is RM78.
The hypotheses for this situation are:
H0: µ = 78
H1: µ < 78
Situation B
A chemist invents an additive to increase the life of an automobile battery. It is
known that the mean lifetime of the automobile battery is 36 months.
The hypotheses for this situation are:
H0: µ = 36
H1: µ > 36
Situation C
A medical researcher is interested in finding out whether a new medication will
have any undesirable side effects. He is particularly concerned with the pulse
rate of the patients who take the medication. He knows that the mean pulse
rate for the population under study is 82 beats per minute.
The hypotheses for this situation are:
H0: µ = 82
H1: µ ≠ 82
*Note: to obtain a critical value, the α-level must be chosen first. For a two-tailed test,
α is divided into two equal parts.
If the value of the test statistic lies in the critical region/rejection region, reject
H0
If the value of the test statistic does not lie in the critical region/rejection
region, do not reject H0
i. By the traditional (critical value) method
ii. By p-value
Reject H0 if $p\text{-value} \le \alpha$
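Test procedure:
$H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$ (two-tailed)   $H_1: \mu > \mu_0$ (right-tailed)   $H_1: \mu < \mu_0$ (left-tailed)
Critical values: $z_{\alpha}$ or $t_{\alpha,\,v}$ for a one-tailed test, $z_{\alpha/2}$ or $t_{\alpha/2,\,v}$ for a two-tailed test, where $v$ = degrees of freedom.
Reject $H_0$ if $z_{cal} > z_{\alpha}$ (right tail).
Reject $H_0$ if $z_{cal} < -z_{\alpha}$ (left tail).
Reject $H_0$ if $|z_{cal}| > z_{\alpha/2}$ (two-tailed, "not equal").
Test statistic:
$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$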
Example 4.1: A company producing 3A batteries claims that its batteries last an average of
24 months with a standard deviation of 3 months. A sample of 36 batteries was tested. The
mean life of these batteries was 23 months. Using the 5% level of significance, is there
evidence to indicate that the mean lifetime of 3A batteries is below 24 months?
Solution:
$\mu_0 = 24$, $\sigma = 3$, $n = 36$, $\bar{x} = 23$, $\alpha = 0.05$
"Below" means less than, so this is a one-tailed test. For a "less than" alternative the critical value is negative; for a "greater than" alternative it is positive.

Method 1: Critical value
Step 1: Hypothesis
$H_0: \mu = 24$
$H_1: \mu < 24$
Step 2: Significance level
α = 0.05
Step 3: Test statistic
$z_{cal} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{23 - 24}{3/\sqrt{36}} = -2$
Step 4: Critical value
$z_{cv} = -z_{\alpha} = -z_{0.05} = -1.6449$
Step 5: Decision rule
Reject $H_0$ if $z_{cal} < z_{cv}$. Since $z_{cal} = -2 < z_{cv} = -1.6449$, reject $H_0$.
Step 6: Conclusion
There is sufficient evidence to indicate that the mean lifetime of 3A batteries is below 24 months.

Method 2: p-value
Reject $H_0$ if p-value < α. If the reported p-value is two-tailed and the test is one-tailed, halve it first: for example, a two-tailed p-value of 0.06 gives 0.06/2 = 0.03, and since 0.03 < α = 0.05, reject $H_0$. For a two-tailed test the critical value uses α/2 = 0.025.

Method 3: Confidence interval
For a one-tailed test at α = 0.05 the interval uses $z_{0.05}$ (α × 2 = 0.10):
$CI = \bar{x} \pm z_{\alpha}\dfrac{\sigma}{\sqrt{n}} = 23 \pm 1.6449\left(\dfrac{3}{\sqrt{36}}\right) = (22.1776, 23.8225)$
The value 24 lies outside the interval, so $H_0$ is rejected.
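The critical-value and p-value routes for Example 4.1 can be compared in a few lines. This is a sketch using only the standard library; the function name `one_sample_z_test` is hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma, n, alpha):
    """Left-tailed z test of H0: mu = mu0 against H1: mu < mu0."""
    z_cal = (xbar - mu0) / (sigma / math.sqrt(n))
    z_cv = NormalDist().inv_cdf(alpha)   # negative critical value for the left tail
    p_value = NormalDist().cdf(z_cal)    # one-tailed p-value, P(Z <= z_cal)
    return z_cal, z_cv, p_value, z_cal < z_cv

# Example 4.1: 3A batteries, claimed mean 24 months, sigma = 3, n = 36, sample mean 23.
z_cal, z_cv, p_value, reject = one_sample_z_test(23, 24, 3, 36, 0.05)
print(z_cal, round(z_cv, 4), round(p_value, 4), reject)
```

Both rules lead to the same decision here: the test statistic lies below the critical value, and the one-tailed p-value is below α.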
4.1.2 Hypothesis Test for Mean $\mu$ (Variance is unknown and large sample)
Assumption:
Test procedure:
$H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$   $H_1: \mu > \mu_0$   $H_1: \mu < \mu_0$
Test statistic:
$z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Solution:
$\bar{x} = 1.89$, $s^2 = 0.273$, $s = 0.5225$
Step 1: Hypothesis
$H_0: \mu = 1.8$
$H_1: \mu \neq 1.8$
Step 2: Significance level
α = 0.05; α/2 = 0.025
Step 3: Test statistic
$t_{cal} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{1.89 - 1.8}{0.5225/\sqrt{30}} = 0.9434$
Step 4: Critical value
$t_{cv} = t_{\alpha/2,\,v} = t_{0.025,\,29} = 2.045$
Step 5: Decision rule
Since $t_{cal} = 0.9434 < t_{cv} = 2.045$, fail to reject $H_0$.
Step 6: Conclusion
Therefore, there is no sufficient evidence to conclude that the mean differs from 1.8.
Example 4.3: Based on the information given by the housekeepers, it was found that the hotel
produced 6.1 kilograms of solid waste daily. The following tables show the results
obtained from a further analysis of the study.
[SPSS output: One-Sample Statistics and One-Sample Test tables (with Lower and Upper confidence limits) not reproduced]
a) If the researcher would like to test whether the mean weight of the solid waste is
different from 6.1kg, what will be the null and alternative hypothesis?
b) Based on the p-value, can the researcher conclude that the mean weight of the solid
waste is different from 6.1kg?
c) Construct a 95% confidence interval for the mean weight of solid waste.
a) Hypothesis
$H_0: \mu = 6.1$ kg
$H_1: \mu \neq 6.1$ kg
b) Step 1: Hypothesis
$H_0: \mu = 6.1$ kg
$H_1: \mu \neq 6.1$ kg
Step 2: Significance level
α = 0.05
Step 3: p-value
p-value = 0.000
Step 4: Decision rule
Reject $H_0$ if p-value < α. Since p-value = 0.000 < α = 0.05, reject $H_0$.
Step 5: Conclusion
There is sufficient evidence that the mean weight of the solid waste is different from 6.1 kg.
c) $CI = \bar{x} \pm t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}} = (4.885, 5.6578)$
Conclusion: the mean weight of solid waste is between 4.885 kg and 5.6578 kg.
4.1.3 Hypothesis Test for Mean $\mu$ (Variance is unknown and small sample)
Assumption:
Test procedure:
$H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$   $H_1: \mu > \mu_0$   $H_1: \mu < \mu_0$
Test statistic:
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Example 4.4: The speed limit along the Ipoh-Lumut highway is 90 km/h. The highway
patrol centre suspects that cars travelling along the highway exceed this speed limit. A sample
of 15 cars had their speeds measured by radar. The sample mean was 98 km/h and the
standard deviation was 15 km/h. At the 5% level of significance, is there evidence to indicate
that cars travelling along this highway exceed the speed limit? (one-tailed test)
Solution: $n = 15$, $\mu_0 = 90$
Step 1: Hypothesis
$H_0: \mu = 90$ km/h
$H_1: \mu > 90$ km/h
Step 2: Significance level
α = 0.05
Step 3: Test statistic
$t_{cal} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{98 - 90}{15/\sqrt{15}} = 2.0656$
Step 4: Critical value
$t_{cv} = t_{0.05,\,14} = 1.761$
Step 5: Decision rule
Reject $H_0$ if $t_{cal} > t_{cv}$. Since $t_{cal} = 2.0656 > t_{cv} = 1.761$, reject $H_0$.
Step 6: Conclusion
There is sufficient evidence to indicate that the mean speed is greater than 90 km/h.
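The test statistic in Example 4.4 can be verified as follows. This is a minimal sketch: `one_sample_t_statistic` is a hypothetical name, and the critical value 1.761 is assumed to be taken from the t table (df = 14, one tail, α = 0.05).

```python
import math

def one_sample_t_statistic(xbar, mu0, s, n):
    """t = (xbar - mu0) / (s / sqrt(n)) for a small sample with unknown variance."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Example 4.4: 15 cars, sample mean 98 km/h, s = 15 km/h, speed limit 90 km/h.
t_cal = one_sample_t_statistic(98, 90, 15, 15)
t_cv = 1.761              # assumed table value t_{0.05,14}
reject = t_cal > t_cv     # right-tailed decision rule
print(round(t_cal, 4), reject)
```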
Example 4.5: The R & D department of an industry requires that the mean life of the light
bulbs produced should exceed 4000 hours, with a standard deviation of less than 150
hours, before they can be supplied to the markets. A sample of 15 bulbs was tested and the
lengths of life (in hours) were recorded.
The data was analysed using SPSS and the output is shown below:
[SPSS output: One-Sample Statistics and One-Sample Test tables (with Lower and Upper confidence limits) not reproduced]
Step 1: Hypothesis
$H_0: \mu = 4000$ hours
$H_1: \mu > 4000$ hours
Step 2: Significance level
α = 0.05
Step 3: Test statistic
$t_{cal} = \dfrac{4350 - 4000}{124.748/\sqrt{15}} = 10.8663$
Step 4: Critical value
df = 15 - 1 = 14
$t_{cv} = t_{0.05,\,14} = 1.761$
Step 5: Decision rule
Reject $H_0$ if $t_{cal} > t_{cv}$. Since $t_{cal} = 10.8663 > t_{cv} = 1.761$, reject $H_0$.
Step 6: Conclusion
There is sufficient evidence to indicate that the mean life of the light bulbs exceeds 4000 hours.
4.2.1 Hypothesis test for difference between two population means – independent
samples
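Assumptions :
Test procedure:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$   $H_1: \mu_1 - \mu_2 > 0$   $H_1: \mu_1 - \mu_2 < 0$
Test statistic:
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$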
Example 4.6: An experiment was conducted in which two types of engines, A and B were
compared. Gas mileage in miles per gallon was measured. 75 experiments were conducted
using engine type A and 50 experiments were done for engine type B. The gasoline used and
other conditions were held constant. The average gas mileage for engine A was 42 miles per
gallon and the average for engine B was 36 miles per gallon. Test at the 5% significance level
whether there is a significant difference in gas mileage between engine types A and B. Assume
that the population standard deviations are 8 and 6 for engine A and B respectively.
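Solution:
Step 1: Hypothesis
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Step 2: Significance level
α = 0.05
Step 3: Test statistic
$z_{cal} = \dfrac{(42 - 36) - 0}{\sqrt{\dfrac{8^2}{75} + \dfrac{6^2}{50}}} = 4.7834$
Step 4: Critical value
$z_{cv} = z_{0.025} = 1.96$
Step 5: Decision rule
Reject $H_0$ if $|z_{cal}| > z_{cv}$. Since $z_{cal} = 4.7834 > z_{cv} = 1.96$, reject $H_0$.
Step 6: Conclusion
There is sufficient evidence of a significant difference in gas mileage between engine types A and B.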
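The two-sample z statistic of Example 4.6 can be reproduced with the standard library. This is a sketch under the example's assumption of known population standard deviations; the function name `two_sample_z_test` is hypothetical.

```python
import math
from statistics import NormalDist

def two_sample_z_test(x1, x2, sigma1, sigma2, n1, n2, alpha):
    """Two-tailed z test of H0: mu1 - mu2 = 0 with known population sigmas."""
    z_cal = (x1 - x2) / math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z_cv = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z_cal)))
    return z_cal, z_cv, p_value, abs(z_cal) > z_cv

# Example 4.6: engine A (n = 75) vs engine B (n = 50) at the 5% level.
z_cal, z_cv, p_value, reject = two_sample_z_test(42, 36, 8, 6, 75, 50, 0.05)
print(round(z_cal, 4), round(z_cv, 2), reject)
```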
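Assumptions :
Test procedure:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$   $H_1: \mu_1 - \mu_2 > 0$   $H_1: \mu_1 - \mu_2 < 0$
Test statistic:
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$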
Example 4.7: Two kinds of thread are being compared for strength. Fifty pieces of each type
of thread are tested under similar conditions. Brand A had an average tensile strength of 78.3
kilograms with a standard deviation of 5.6 kilograms, while Brand B had an average tensile
strength of 87.2 kilograms with a standard deviation of 6.3 kilograms. Test at the 5% level of
significance whether the mean tensile strengths of brand A and brand B differ.
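Solution:
Note: a t-test applies when the question includes a statement such as "Assume ... normally distributed" and the samples are small. Here both samples are large ($n_1 = n_2 = 50$), so the Z statistic above is used.
Step 1: Hypothesis
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Step 2: Significance level
α = 0.05
Step 3: Test statistic
$z_{cal} = \dfrac{(78.3 - 87.2) - 0}{\sqrt{\dfrac{5.6^2}{50} + \dfrac{6.3^2}{50}}} = \dfrac{-8.9}{1.1921} = -7.4661$
With equal sample sizes, the pooled estimate $s_p^2 = \dfrac{49(5.6^2) + 49(6.3^2)}{98} = 35.525$ gives the same standard error: $\sqrt{35.525}\sqrt{\dfrac{1}{50} + \dfrac{1}{50}} = 1.1921$.
Step 4: Critical value
$z_{cv} = z_{0.025} = 1.96$
Step 5: Decision rule
Reject $H_0$ if $|z_{cal}| > z_{cv}$. Since $|-7.4661| > 1.96$, reject $H_0$.
Step 6: Conclusion
There is sufficient evidence that the mean tensile strengths of brand A and brand B differ.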
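Assumptions:
Test procedure:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$   $H_1: \mu_1 - \mu_2 > 0$   $H_1: \mu_1 - \mu_2 < 0$
Test statistic (t-test, equal variances):
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $df = n_1 + n_2 - 2$
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$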
Example 4.8: An insurance company wants to know if the average speed at which men drive
cars is greater than that of women drivers. The company took a random sample of 26 cars
driven by men on a highway and found the mean speed to be 72 miles per hour with a standard
deviation of 2.2 miles per hour. Another sample of 16 cars driven by women on the same
highway gave a mean speed of 68 miles per hour with standard deviation of 2.5 miles per
hour. Assume that the speeds at which all men and all women drive cars on this highway are
both normally distributed with the same population standard deviation.
Test at 2.5% significance level whether the average speed at which men drive cars is greater
than that of women drivers.
Solution:
Example 4.9: The manufacturer of a small battery-powered tape recorder decides to include
four alkaline batteries with its product. Two battery suppliers are being considered; each has
its own brand (brand 1 and brand 2). The supervising inspector of incoming quality wants to
know if the average lifetimes of the two brands are the same (a two-tailed test). Based on past experience, she
believes that the battery lifetimes follow a normal distribution with equal variances. A sample
experiment is conducted: each of ten batteries (five of each brand) is connected to a test
device that places a small drain on the battery power and records the battery lifetime. The
following results (in hours) are obtained:
Brand 1 43 48 38 41 51
Brand 2 30 26 37 31 34
Group Statistics
[table not reproduced]
                                    F     Sig.  t      df     Sig.        Mean        Std. Error  95% Confidence Interval of the Difference
                                                              (2-tailed)  Difference  Difference  Lower     Upper
hours  Equal variances assumed      .605  .459  4.200  8      .003        12.600      3.000       5.682     19.518
       Equal variances not assumed              4.200  7.594  .003        12.600      3.000       5.617     19.583
Note: Sig. = 0.459 > 0.05, thus the variances are equal, so the bottom row need not be read.
c) Can the supervising inspector of incoming quality conclude that the average lifetimes
of the two brands are not equal? Yes, reject $H_0$.
Define: 1 = brand 1, 2 = brand 2
a) Levene's Test
S1: Hypothesis
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
S2: Sig. value
α = 0.05
S3: p-value
p-value = 0.459
S4: Decision rule
Since p-value = 0.459 > α = 0.05, fail to reject $H_0$.
S5: Conclusion
The variances of the two brands are equal.
c) S1: Hypothesis
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
S2: Sig. value
α = 0.05
S3: p-value
p-value = 0.003
S4: Decision rule
Since p-value = 0.003 < α = 0.05, reject $H_0$.
S5: Conclusion
There is sufficient evidence to indicate that the mean lifetimes of the two brands are unequal.
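The t, df, mean difference and standard error shown in the "Equal variances assumed" row of Example 4.9 can be recomputed from the raw lifetimes. This is a minimal sketch; `pooled_t_test` is a hypothetical helper name and the code is not how SPSS itself is implemented.

```python
import math
from statistics import mean, variance

brand1 = [43, 48, 38, 41, 51]
brand2 = [30, 26, 37, 31, 34]

def pooled_t_test(a, b):
    """Equal-variance two-sample t statistic: returns (difference, std. error, t, df)."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)  # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean(a) - mean(b)
    return diff, se, diff / se, n1 + n2 - 2

diff, se, t_cal, df = pooled_t_test(brand1, brand2)
print(round(diff, 3), round(se, 3), round(t_cal, 3), df)
```

With the table value $t_{0.025,\,8} = 2.306$, the interval 12.6 ± 2.306(3.000) reproduces the reported limits 5.682 and 19.518.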
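Assumptions:
Test procedure:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$   $H_1: \mu_1 - \mu_2 > 0$   $H_1: \mu_1 - \mu_2 < 0$
Test statistic:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, where
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$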
Example 4.10: The breaking strengths of 11 bundles of wool fibres have a sample mean 436.5
and a sample standard deviation of 11.90. In addition, the breaking strengths of another 12
bundles of synthetic fibres have a sample mean 452.8 and a sample standard deviation 3.61.
Assume the breaking strengths of the two populations are normally distributed with unequal
variances. Test at 5% level of significance whether the mean breaking strength of wool
fibres is less than that of synthetic fibres.
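Solution:
Define 1: wool fibre, 2: synthetic fibre (one-tailed test)

\[ df = \frac{\left(\dfrac{141.61}{11} + \dfrac{13.0321}{12}\right)^2}{\dfrac{(141.61/11)^2}{11 - 1} + \dfrac{(13.0321/12)^2}{12 - 1}} = 11.68 \approx 11 \]

Step 1: Hypothesis
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 < \mu_2$
Step 2: Significance level: $\alpha = 0.05$
Step 3: Test statistic
\[ t_{cal} = \frac{(436.5 - 452.8) - 0}{\sqrt{\dfrac{11.90^2}{11} + \dfrac{3.61^2}{12}}} = -4.3627 \]
Step 4: Decision rule
Reject $H_0$ if $t_{cal} < -t_{cv}$, where $t_{cv} = t_{\alpha, df} = t_{0.05, 11} = 1.796$.
Since $t_{cal} = -4.3627 < -t_{cv} = -1.796$, reject $H_0$.
Step 5: Conclusion
There is sufficient evidence to indicate that the mean breaking strength of wool fibres is
less than that of synthetic fibres.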
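The arithmetic in Example 4.10 can be checked from the summary statistics alone. This is a minimal sketch of the unequal-variances test statistic and degrees of freedom given above; the helper name `welch_t` is illustrative, and the critical value 1.796 is taken from the t-table rather than computed.

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2):
    # Unequal-variances t statistic and its approximate degrees of freedom
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics from Example 4.10 (1 = wool, 2 = synthetic)
t_cal, df = welch_t(436.5, 11.90, 11, 452.8, 3.61, 12)

t_cv = 1.796                 # t_{0.05, 11} read from the t-table (df rounded down)
reject_h0 = t_cal < -t_cv    # left-tailed test, H1: mu1 < mu2

print(f"t_cal = {t_cal:.4f}, df = {df:.2f}, reject H0: {reject_h0}")
```

Rounding the degrees of freedom down before reading the table is the conservative choice, since it gives a slightly larger critical value.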
Example 4.11: A set of facilitation tools to help with data analysis for problem solving is being
developed by a group of statisticians at UiTM. In order to test the effectiveness of these tools, a
group of research officers were asked to analyze and produce a built-in report for a set of data
on the computer. Twelve equally capable research officers were randomly selected and six
were randomly assigned a standard procedure to complete the task. The other six were asked
to do the task using the developed facilitation tools. The response measured was the time to
completion (in minutes). The output of the statistical analysis is shown in the following tables.
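Group Statistics

Independent Samples Test (Time completion)
                                   Equal variances   Equal variances
                                   assumed           not assumed
Levene's Test: F                   1.231
Levene's Test: Sig.                .003
t                                  9.908             9.908
df                                 10                8.908
Sig. (2-tailed)                    .000              .000
Mean Difference                    29.000            29.000
Std. Error Difference              2.927             2.927
95% Confidence Interval: Lower     22.478            22.368
95% Confidence Interval: Upper     35.522            35.632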
a) Based on the p-value in the Levene’s Test, test the equality of variances in this study. Use
α = 0.05
b) State the null and alternative hypotheses.
c) At 5% significance level, can it be concluded that the mean time completion using the
standard procedure is more than that using the facilitation tools?
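Solution:
a) Step 1: Hypothesis
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
Step 2: Significance level: $\alpha = 0.05$
Step 3: p-value = 0.003
Step 4: Decision rule: since p-value = 0.003 < $\alpha$ = 0.05, reject $H_0$.
Step 5: Conclusion: there is sufficient evidence to indicate that the variances are unequal.

b) Define 1: standard procedure, 2: facilitation tool
Step 1: Hypothesis
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 > \mu_2$

c) Step 2: Significance level: $\alpha = 0.05$
Step 3: p-value = 0.000 / 2 = 0.000 (the two-tailed value in the "Equal variances not assumed"
row is halved for a one-tailed test)
Step 4: Decision rule: since p-value = 0.000 < $\alpha$ = 0.05, reject $H_0$.
Step 5: Conclusion: there is sufficient evidence to indicate that the mean time completion
using the standard procedure is more than that using the facilitation tools.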
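The way the SPSS output is read in Examples 4.9 and 4.11 follows two small rules: Levene's p-value decides which row to use, and the two-tailed Sig. value is halved for a one-tailed test. The sketch below encodes those rules; the function names are hypothetical, and the handling of a test statistic that points away from the alternative is an assumption added for completeness.

```python
def choose_t_test_row(levene_p, alpha=0.05):
    # Reject equal variances when Levene's p-value is below alpha
    if levene_p < alpha:
        return "equal variances not assumed"
    return "equal variances assumed"

def one_tailed_p(two_tailed_p, t_cal, direction):
    # direction ">" means H1: mu1 > mu2, "<" means H1: mu1 < mu2.
    # Halve the two-tailed p-value only when t_cal points the same way as H1.
    in_direction = t_cal > 0 if direction == ">" else t_cal < 0
    return two_tailed_p / 2 if in_direction else 1 - two_tailed_p / 2

row_4_9 = choose_t_test_row(0.459)    # Example 4.9: Levene's Sig. = .459
row_4_11 = choose_t_test_row(0.003)   # Example 4.11: Levene's Sig. = .003
p_4_11 = one_tailed_p(0.000, 9.908, ">")

print(row_4_9)
print(row_4_11)
print(p_4_11)
```

A reported Sig. of .000 in SPSS means the p-value is smaller than 0.0005, not exactly zero, so the halved value here is only a rounded figure.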
4.3 Hypothesis test for difference between two population means - Dependent
samples
Assumptions:
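Test procedure:
$H_0: \mu_d = 0$
$H_1: \mu_d \neq 0$   or   $H_1: \mu_d > 0$   or   $H_1: \mu_d < 0$
Test statistic:
\[ t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} \]
Where:
\[ \bar{d} = \frac{\sum d}{n} \]
\[ s_d = \sqrt{\frac{1}{n - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right]} \]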
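Note: the test statistic for two dependent samples, $t_{cal} = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$, has
the same form as the one-sample test statistic, $t_{cal} = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$, applied to the
differences $d$.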
Example 4.12: Many engineering students have problems with data analysis using
statistical software. A professor who teaches a statistics for engineering course offered a
two-day workshop on this topic. The following table gives the test scores of seven engineering
students before and after they attended the workshop.
Before   56   69   48   74   65   71   58
After    62   73   44   85   71   70   69
Test at 5% significance level whether attending the workshop increases the test scores.
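Paired Samples Test
Pair 1: before - after
Mean                               -4.714
Std. Deviation                     5.648
Std. Error Mean                    2.135
95% Confidence Interval: Lower     -9.938
95% Confidence Interval: Upper     .510
t                                  -2.208
df                                 6
Sig. (2-tailed)                    .069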
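Solution:
Define 1: after, 2: before, so that $d$ = after - before. The SPSS output is computed as
before - after, so the sign of t is reversed.
Step 1: Hypothesis
$H_0: \mu_d = 0$
$H_1: \mu_d > 0$
Step 2: Significance level: $\alpha = 0.05$
Step 3: Test statistic: $t_{cal} = +2.208$
(The Sig. value in the output is for a two-tailed test, so it must be divided by 2 for this
one-tailed test: 0.069 / 2 = 0.0345.)
Step 4: Decision rule
$t_{cv} = t_{0.05, 6} = 1.943$
Since $t_{cal} = 2.208 > t_{cv} = 1.943$, reject $H_0$.
Step 5: Conclusion
There is sufficient evidence to indicate that attending the workshop increases the test scores.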
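The paired-sample figures in Example 4.12 can be reproduced directly from the scores. The following is a minimal sketch using the formulas for the mean and standard deviation of the differences given in Section 4.3, taking d = after - before; the critical value 1.943 is read from the t-table rather than computed.

```python
import math

# Test scores from Example 4.12
before = [56, 69, 48, 74, 65, 71, 58]
after = [62, 73, 44, 85, 71, 70, 69]

# Paired differences, d = after - before
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = sum(d) / n
s_d = math.sqrt((sum(x * x for x in d) - sum(d) ** 2 / n) / (n - 1))
t_cal = (d_bar - 0) / (s_d / math.sqrt(n))   # mu_d = 0 under H0

t_cv = 1.943                # t_{0.05, 6} read from the t-table
reject_h0 = t_cal > t_cv    # right-tailed test, H1: mu_d > 0

print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t_cal = {t_cal:.3f}, reject H0: {reject_h0}")
```

The magnitudes should match the SPSS output (mean 4.714, standard deviation 5.648, t 2.208); only the sign differs because SPSS subtracts in the order before - after.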
4.4 TESTING FOR THE DIFFERENCE AMONG MORE THAN TWO MEANS
Analysis of Variance (ANOVA) is a method in which the total variation in a set of data is
partitioned into several components. The main reason for performing ANOVA is to test the
equality of more than two population means. These components can be used to assess the
effect of a factor on the response variable of interest.
Terminology                              Definition
Response Variable / Dependent Variable   Variable of interest to be measured in the experiment
Factor / Independent Variable            Variable whose effect on the response variable is of interest
Factor Level                             Values of the factor utilized in the experiment
Treatment                                Factor level or combination of factor levels
Experimental Unit                        The object on which measurement is taken
The following are assumptions that should be satisfied when applying one way ANOVA:
How does ANOVA work? The idea behind ANOVA is to compare the ratio of the between-group
variance to the within-group variance. If the variance between the groups is much larger than
the variance that appears within each group, this indicates that the means are not all the
same. In order to test the equality of three or more population means, we use the ANOVA
F-test. The test statistic for the F-test is obtained from the ANOVA table, so we first have
to construct the ANOVA table. The F-test only tells whether there is a difference among the
population means; it does not provide information on which pairs of means differ.
How to construct ANOVA table:

Source of variation          Sum of squares   Degree of freedom   Mean of squares        F
Between group (Treatment)    SSTr             k - 1               MSTr = SSTr / (k - 1)  F = MSTr / MSE
Within group (Error)         SSE              N - k               MSE = SSE / (N - k)
Total                        SST              N - 1
\[ SS(Treatment) = SSTr = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - \frac{(\sum x)^2}{N} \]
\[ SS(Total) = SST = \sum x^2 - \frac{(\sum x)^2}{N} \]
\[ SS(Error) = SSE = SST - SSTr \]
Where,
$k$ = Number of treatments
$n_i$ = The size of sample $i$
$T_i$ = The sum of values in sample $i$
$N$ = The number of values in all samples
    = $n_1 + n_2 + \cdots + n_k$
$\sum x$ = The sum of the values in all samples
    = $T_1 + T_2 + \cdots + T_k$
$\sum x^2$ = The sum of the squares of the values in all samples

Critical value: $F_{\alpha, k-1, N-k}$
Where $\alpha$ is the level of significance, $k-1$ is the degree of freedom for the numerator of
the F ratio and $N-k$ is the degree of freedom for the denominator of the F ratio.
Step 4: Decision
Step 5: Conclusion
Example 4.13 : Fifteen fourth-grade students were randomly assigned to three groups to
experiment with three different methods of teaching arithmetic. At the end of the semester, the
same test was given to all 15 students. The table gives the scores of students in the three
groups.
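Solution:

Sample totals: $T_1 = 324$, $T_2 = 369$, $T_3 = 388$, with $n_1 = n_2 = n_3 = 5$, $\sum x = 1081$ and $\sum x^2 = 80709$.

Source of variation   Sum of squares   Degree of freedom   Mean of squares            F
Method                432.1333         3 - 1 = 2           432.1333 / 2 = 216.0667    216.0667 / 197.7333 = 1.093
Error                 2372.8000        15 - 3 = 12         2372.8 / 12 = 197.7333
Total                 2804.9333        15 - 1 = 14

\[ SS(Method) = \frac{324^2}{5} + \frac{369^2}{5} + \frac{388^2}{5} - \frac{1081^2}{15} = 432.1333 \]
\[ SS(Total) = 80709 - \frac{1081^2}{15} = 2804.9333 \]
\[ SS(Error) = 2804.9333 - 432.1333 = 2372.8 \]

Step 1: Hypothesis
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: at least two treatment means are different
Step 2: Significance level: $\alpha = 0.01$
Step 3: Test statistic: $F_{cal} = 1.093$
Step 4: Critical value: $F_{cv} = F_{0.01, 2, 12} = 6.93$
Step 5: Decision rule: since $F_{cal} = 1.093 < F_{cv} = 6.93$, fail to reject $H_0$.
Step 6: Conclusion: there is insufficient evidence to indicate that the mean scores of the
three teaching methods are different.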
Reconsider the previous example and conduct the F-test based on the SPSS output given
below.

ANOVA
Score
                  Sum of squares   df   Mean square   F       Sig.
Between Groups    432.133          2    216.067       1.093   .366
Within Groups     2372.800         12   197.733
Total             2804.933         14
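The SPSS ANOVA table for Example 4.13 can be rebuilt from the sample totals used in the hand calculation. This is a minimal sketch of the sum-of-squares formulas from this section; the function name and the dictionary keys are illustrative, and the inputs (totals 324, 369, 388, five scores per group, and a sum of squared scores of 80709) are the values used in the worked solution.

```python
def one_way_anova(totals, sizes, sum_of_squares):
    # totals: T_i for each sample, sizes: n_i, sum_of_squares: sum of x^2 over all values
    k = len(totals)
    N = sum(sizes)
    correction = sum(totals) ** 2 / N            # (sum x)^2 / N
    sstr = sum(T ** 2 / n for T, n in zip(totals, sizes)) - correction
    sst = sum_of_squares - correction
    sse = sst - sstr
    mstr = sstr / (k - 1)
    mse = sse / (N - k)
    return {
        "SSTr": sstr, "SSE": sse, "SST": sst,
        "df_treatment": k - 1, "df_error": N - k,
        "MSTr": mstr, "MSE": mse, "F": mstr / mse,
    }

result = one_way_anova([324, 369, 388], [5, 5, 5], 80709)

f_cv = 6.93                          # F_{0.01, 2, 12} read from the F-table
reject_h0 = result["F"] > f_cv

for name, value in result.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
print("reject H0:", reject_h0)
```

With the p-value approach the same decision follows from the output: Sig. = .366 is larger than 0.01, so the null hypothesis is not rejected.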
Solution (k = 4 treatments with 3 observations each, N = 12):

Source of variation   Sum of squares          Degree of freedom   Mean of squares          F
Method                0.015                   4 - 1 = 3           0.015 / 3 = 0.005        0.005 / 0.0025 = 2.000
Error                 0.035 - 0.015 = 0.020   12 - 4 = 8          0.020 / 8 = 0.0025
Total                 0.035                   12 - 1 = 11

\[ SST = \sum x^2 - \frac{(\sum x)^2}{N} = 58.115 - \frac{(\sum x)^2}{12} = 0.035 \]

Step 1: Hypothesis
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: at least one treatment mean is different
Step 2: Significance level: $\alpha = 0.05$
Step 3: Test statistic: $F_{cal} = 2.000$
Step 4: Critical value: $F_{cv} = F_{0.05, 3, 8} = 4.07$
Step 5: Decision rule: since $F_{cal} = 2.000 < F_{cv} = 4.07$, fail to reject $H_0$.
Step 6: Conclusion: there is insufficient evidence to indicate that at least one treatment
mean is different.
4.5 TEST OF INDEPENDENCE
In the previous chapters, data are always assumed to follow a certain distribution such
as Binomial, Poisson and Normal. In this subtopic, we will discuss the test of independence,
which is used to analyse categorical variables. In particular, it tests whether two
categorical variables are independent (not associated). The test involves the use of the
Chi-square distribution.
The test of independence is performed using the contingency table (cross tabulation). In a test
of independence for a contingency table, we test the null hypothesis that the two characteristics
of the elements of a given population are not related (i.e. they are independent) against the
alternative hypothesis that the two characteristics are related (i.e. they are dependent).
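The Chi-square test statistic is given by the following formula:
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
where $O$ is the observed frequency and $E$ is the expected frequency of a cell,
\[ E = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}} \]
The critical value is $\chi^2_{\alpha, (r-1)(c-1)}$,
where r is the total number of rows and c is the total number of columns.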
Step 4: Decision
Step 5: Conclusion
Step 1: Hypothesis
$H_0$: There is no association between ethnicity and political parties.
$H_1$: There is an association between ethnicity and political parties.
Step 2: Significance level: $\alpha = 0.05$
Step 3: Test statistic

Observed table
         (1)    (2)    (3)    Total
A        70     20     10     100
B        30     80     90     200
Total    100    100    100    300

Expected table (each value is row total x column total / 300, e.g. 100 x 100 / 300 = 33.3,
rounded so that each row adds up to its total)
         (1)    (2)    (3)    Total
A        33.4   33.3   33.3   100
B        66.6   66.7   66.7   200
Total    100    100    100    300

\[ \chi^2_{cal} = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(70 - 33.4)^2}{33.4} + \frac{(20 - 33.3)^2}{33.3} + \cdots + \frac{(90 - 66.7)^2}{66.7} = 92.6268 \]

Step 4: Critical value
\[ \chi^2_{cv} = \chi^2_{0.05, (2-1)(3-1)} = \chi^2_{0.05, 2} = 5.991 \]
Step 5: Decision rule
Reject $H_0$ if $\chi^2_{cal} > \chi^2_{cv}$.
Since $\chi^2_{cal} = 92.6268 > \chi^2_{cv} = 5.991$, reject $H_0$.
Step 6: Conclusion
There is an association between ethnicity and political parties.
Example 4.15: A random sample of 400 people is selected from all the 16-year-olds in a town.
The variables recorded were temper (vile or mild) and hair colour (red, brown or black).
The observed frequencies are displayed in the table below. Test the hypothesis that temper and
hair colour are independent at the 5% level of significance.
                  Temper
Hair colour       Vile    Mild    Total
Red               40      20      60
Brown             80      100     180
Black             60      100     160
Total             180     220     400
Solution:
                                   Temper
Hair colour                        Vile    Mild    Total
Red      Count                     40      20      60
         Expected Count            27      33
Brown    Count                     80      100     180
         Expected Count            81      99
Black    Count                     60      100     160
         Expected Count            72      88
Total                              180     220     400
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
\[ \chi^2 = \frac{(40 - 27)^2}{27} + \frac{(20 - 33)^2}{33} + \frac{(80 - 81)^2}{81} + \frac{(100 - 99)^2}{99} + \frac{(60 - 72)^2}{72} + \frac{(100 - 88)^2}{88} \]
\[ \chi^2 = 15.0393 \]
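Step 4: Decision
The critical value is $\chi^2_{0.05, (3-1)(2-1)} = \chi^2_{0.05, 2} = 5.991$. Since $\chi^2 = 15.0393 > 5.991$, reject $H_0$.
Step 5: Conclusion
There is enough evidence to conclude that hair colour and temper are not independent
(i.e. they are related).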
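The expected counts and the test statistic in Example 4.15 can be verified with a short calculation. The following is a minimal sketch of the test of independence, assuming each expected count is (row total x column total) / grand total; the function name is illustrative and the critical value 5.991 is read from the Chi-square table rather than computed.

```python
def chi_square_independence(table):
    # table: list of rows of observed counts
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, stat, df

# Observed counts from Example 4.15: rows red, brown, black; columns vile, mild
observed = [[40, 20], [80, 100], [60, 100]]
expected, chi2_cal, df = chi_square_independence(observed)

chi2_cv = 5.991                  # chi-square_{0.05, 2} read from the table
reject_h0 = chi2_cal > chi2_cv

print(expected)
print(f"chi2_cal = {chi2_cal:.4f}, df = {df}, reject H0: {reject_h0}")
```

Rejecting the null hypothesis here means that temper and hair colour are related in this sample of 16-year-olds.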
Example 4.16: The attendance and examination results of a random sample of 60 pupils are
given in the table below. The data were analyzed using SPSS and the output is as follows:
                                            Exam result
                                            Pass    Failed   Total
Attendance   Excellent      Count           25      10       35
                            Expected Count  K       11.7     35.0
             Satisfactory   Count           10      5        15
             Poor           Count           5       5        10

K = 35 - 11.7 = 23.3
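\[ L = \chi^2 = \sum \frac{(O - E)^2}{E} \]
\[ L = \chi^2 = \frac{(25 - 23.33)^2}{23.33} + \frac{(10 - 11.7)^2}{11.7} + \frac{(10 - 10)^2}{10} + \frac{(5 - 5)^2}{5} + \frac{(5 - 6.7)^2}{6.7} + \frac{(5 - 3.3)^2}{3.3} \]
\[ L = \chi^2_{cal} = 1.6732 \]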
Step 2: Test statistic
p-value = 0.448
Step 4: Decision
Step 5: Conclusion
There is insufficient evidence to indicate that attendance and exam result are dependent on
each other.