
F-Tests and Anova

Uploaded by

Aayush Giri
Copyright: © All Rights Reserved

F-TESTS AND ANOVA:

TESTING RATIO OF VARIANCES IN TWO NORMAL POPULATIONS IN SMALL SAMPLES:
1) Testing of two independent normal population standard deviations (variances) when the population means ($\mu_1$ and $\mu_2$) are known

Null hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$

*Test statistic:

$$F_{cal} = \frac{\sum (x_1 - \mu_1)^2 / n_1}{\sum (x_2 - \mu_2)^2 / n_2}$$

Alternative hypothesis                  Rejection criterion (reject $H_0$ if)
(1) $H_1: \sigma_1^2 > \sigma_2^2$      (1) $F_{cal} > F_{n_1, n_2, \alpha}$
(2) $H_1: \sigma_1^2 < \sigma_2^2$      (2) $F_{cal} < F_{n_1, n_2, 1-\alpha}$
(3) $H_1: \sigma_1^2 \neq \sigma_2^2$   (3) $F_{cal} > F_{n_1, n_2, \alpha/2}$ OR $F_{cal} < F_{n_1, n_2, 1-\alpha/2}$

*The test statistic has been corrected here. Please make the change in your handout too.
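The known-means statistic can be sketched in a few lines of Python. This is only an illustration: the function name `f_stat_known_means`, the two samples, and the "known" means 12 and 10 are all hypothetical and do not come from the handout.

```python
def f_stat_known_means(x1, mu1, x2, mu2):
    """F_cal = [sum((x1 - mu1)^2) / n1] / [sum((x2 - mu2)^2) / n2]."""
    v1 = sum((x - mu1) ** 2 for x in x1) / len(x1)  # divisor n1, mean is known
    v2 = sum((x - mu2) ** 2 for x in x2) / len(x2)  # divisor n2, mean is known
    return v1 / v2


# Hypothetical data (not from the handout), assumed population means 12 and 10.
sample_1 = [12.0, 15.0, 9.0, 14.0]
sample_2 = [10.0, 11.0, 9.0, 10.0, 10.0]

f_cal = f_stat_known_means(sample_1, 12.0, sample_2, 10.0)
df = (len(sample_1), len(sample_2))  # (n1, n2): no d.f. is lost when means are known
print(f_cal, df)
```

The calculated value is then compared with the tabulated F value for (n1, n2) degrees of freedom, following the rejection criteria listed above.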
2) Testing of two independent normal population standard deviations (variances) when the population means ($\mu_1$ and $\mu_2$) are unknown
Always take the sample with the larger variance as the 1st sample.

Null hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$

*Test statistic:

$$F_{cal} = \frac{n_1 s_1^2 / (n_1 - 1)}{n_2 s_2^2 / (n_2 - 1)}$$

Alternative hypothesis                  Rejection criterion (reject $H_0$ if)
(1) $H_1: \sigma_1^2 > \sigma_2^2$      (1) $F_{cal} > F_{n_1-1, n_2-1, \alpha}$
(2) $H_1: \sigma_1^2 < \sigma_2^2$      (2) $F_{cal} < F_{n_1-1, n_2-1, 1-\alpha}$
(3) $H_1: \sigma_1^2 \neq \sigma_2^2$   (3) $F_{cal} > F_{n_1-1, n_2-1, \alpha/2}$ OR $F_{cal} < F_{n_1-1, n_2-1, 1-\alpha/2}$

*Most books use $F_{cal} = S_1^2 / S_2^2$ as the test statistic, which is also correct: with $S^2 = n s^2 / (n - 1)$ denoting the sample variance computed with divisor $n - 1$, the two forms are identical.
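A short Python sketch of the unknown-means statistic follows. It assumes that s in the formula is the standard deviation computed with divisor n, which is what makes n s²/(n − 1) equal to the usual sample variance S². The function name and the five data values are hypothetical.

```python
import statistics


def f_stat_unknown_means(n1, s1, n2, s2):
    """F_cal = [n1*s1^2/(n1-1)] / [n2*s2^2/(n2-1)], with s computed using divisor n."""
    numerator = n1 * s1 ** 2 / (n1 - 1)
    denominator = n2 * s2 ** 2 / (n2 - 1)
    return numerator / denominator


# Hypothetical sample, used only to check that n*s^2/(n-1) equals S^2.
data = [4.0, 7.0, 6.0, 3.0, 5.0]
n = len(data)
s_sq = statistics.pvariance(data)  # divisor n
S_sq = statistics.variance(data)   # divisor n - 1
same = abs(n * s_sq / (n - 1) - S_sq) < 1e-12
print(s_sq, S_sq, same)
```

Because the two forms agree, either can be used as long as the degrees of freedom are taken as (n1 − 1, n2 − 1).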
1. An insurance company is interested in the length of hospital stays for various illnesses. The
company has randomly selected 20 patients from hospital A and 25 from hospital B who were
treated for the same ailment. The time spent in hospital A had an average of 2.4 days
with an s.d. of 0.6 days. The treatment time in hospital B averaged 2.3 days with an s.d. of 0.9 days. Do
patients at hospital A have significantly less variability in their recovery time? Test at the 0.01
significance level.
Solution: Xi: length of hospital stay at hospitals B and A respectively, where i = 1, 2 (hospital B is taken as the 1st sample because its sample variance is larger than that of hospital A).
Xi ~ N(µi, σi²), where µ₁, µ₂ are unknown.
(i) H₀: σ₁² = σ₂² vs H₁: σ₁² > σ₂² (right-tailed test)
(ii) Since n₁ = 25, n₂ = 20, s₁ = 0.9, s₂ = 0.6, and the population means µ₁, µ₂ are unknown, we use the F-test with (n₁ − 1, n₂ − 1) d.f. (Note that the sample means, given as 2.3 and 2.4 respectively, are not required in calculating the test statistic.)
(iii) Test statistic:

$$F_{cal} = \frac{n_1 s_1^2 / (n_1 - 1)}{n_2 s_2^2 / (n_2 - 1)} = \frac{25 \times 0.81 / (25 - 1)}{20 \times 0.36 / (20 - 1)} = \frac{0.84375}{0.37895} = 2.2266$$

(iv) Decision criterion: reject H₀ if $F_{cal} > F_{tab}$.
$F_{tab} = F_{n_1-1, n_2-1, \alpha} = F_{25-1, 20-1, 0.01} = F_{24, 19, 0.01} = 2.9249$.
Since $F_{cal}$ = 2.2266 < 2.9249, we do not reject H₀ at the 1% level of significance.
(v) Conclusion: No, the data do not show that recovery time at hospital A is significantly less variable than at hospital B.
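The arithmetic of this solution can be re-checked with a few lines of Python. The sample sizes and standard deviations are those given in the problem, and the critical value 2.9249 is the table value quoted in the solution; the variable names are illustrative only.

```python
# Hospital B is the 1st sample (larger s.d.), hospital A the 2nd.
n1, s1 = 25, 0.9
n2, s2 = 20, 0.6

numerator = n1 * s1 ** 2 / (n1 - 1)    # 25 x 0.81 / 24
denominator = n2 * s2 ** 2 / (n2 - 1)  # 20 x 0.36 / 19
f_cal = numerator / denominator

f_tab = 2.9249  # F(24, 19, 0.01), as quoted in the solution
reject_h0 = f_cal > f_tab  # right-tailed test
print(round(numerator, 5), round(denominator, 5), reject_h0)
```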
Note: The table for the sampling distribution of F is given below for levels of significance 0.05 and 0.01, with the degrees of freedom for the numerator in the first row and for the denominator in the first column.

Analysis of variance: ANOVA
(ONE-WAY CLASSIFICATION ONLY)
It is a statistical technique specially designed to test
whether the means of three or more populations are
equal. In one-way classification, the data are classified
according to one criterion only.
Rationale of the test: If the null hypothesis is not true, the
variation between the sample means will tend to be larger
than the variation within the samples.
3) Testing the equality of means of several (say k) independent populations (ANOVA)
Requirements for ANOVA
1. There are k random samples from k populations.
2. The samples are independent.
3. The populations are normally distributed.
4. The populations have the same variance. As a rule of thumb, the largest sample standard deviation should be no
more than twice the smallest sample standard deviation.
H0: μ1 = μ2 = … = μk
H1: at least one μi differs (μi ≠ μj for at least one pair i ≠ j)
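The rule of thumb in requirement 4 is easy to check numerically. The sketch below applies it to the beverage data from problem 8 of this handout; the helper name `sd_ratio` is hypothetical, and the sample standard deviation with divisor n − 1 is assumed (with equal sample sizes the ratio is the same for either divisor).

```python
import statistics


def sd_ratio(groups):
    """Ratio of the largest to the smallest sample standard deviation."""
    sds = [statistics.stdev(g) for g in groups]  # divisor n - 1
    return max(sds) / min(sds)


# Data from problem 8: soda, B-vitamin drink, coffee (hours).
groups = [
    [8, 8, 10, 7, 10],
    [5, 6, 6, 4, 8],
    [7, 6, 7, 7, 9],
]

ratio = sd_ratio(groups)
rule_ok = ratio <= 2  # rule of thumb for the equal-variance requirement
print(round(ratio, 3), rule_ok)
```

This is only a screening check, not a formal test of equal variances.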

Observation       1st sample   2nd sample   …   kth sample
1                 Y11          Y21          …   Yk1
2                 Y12          Y22          …   Yk2
.                 .            .            .   .
.                 .            .            .   .
.                 .            .            .   .
*ni               Y1n1         Y2n2         …   Yknk
Col. total (Ti)   T1           T2           …   Tk

*ni is the size of the ith sample, i = 1, 2, …, k; the sample sizes n1, n2, …, nk need not be equal.

$n = n_1 + n_2 + \dots + n_k$
Grand total: $G = \sum_{i} T_i$
Correction factor: $C.F = \dfrac{G^2}{n}$
Total sum of squares: $TSS = \sum_{i} \sum_{j} Y_{ij}^2 - C.F$, where $i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n_i$
Between sum of squares: $BSS = \sum_{i} \dfrac{T_i^2}{n_i} - C.F$
Error sum of squares: $ESS = TSS - BSS$
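The quantities defined above translate directly into code. The following is a minimal Python sketch, not a reference implementation: the function name `sums_of_squares` and the small unequal-size data set are hypothetical, chosen only to show that the formulas do not require equal sample sizes.

```python
def sums_of_squares(samples):
    """Return G, C.F, TSS, BSS and ESS for a one-way layout (list of samples)."""
    totals = [sum(s) for s in samples]            # column totals T_i
    n = sum(len(s) for s in samples)              # n = n_1 + ... + n_k
    grand_total = sum(totals)                     # G
    cf = grand_total ** 2 / n                     # correction factor G^2 / n
    tss = sum(y * y for s in samples for y in s) - cf
    bss = sum(t * t / len(s) for t, s in zip(totals, samples)) - cf
    ess = tss - bss
    return {"G": grand_total, "CF": cf, "TSS": tss, "BSS": bss, "ESS": ess}


# Hypothetical data with unequal sample sizes (3, 2 and 4 observations).
result = sums_of_squares([[2, 4, 6], [5, 7], [1, 2, 3, 2]])
print(result)
```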
ANOVA

Source    Degrees of freedom (d.f.)   Sum of squares (SS)   Mean sum of squares (MSS = SS/d.f.)   F calculated
Between   k − 1                       BSS                   MBSS = BSS/(k − 1)                    F_cal = MBSS/MESS
Error     n − k                       ESS                   MESS = ESS/(n − k)
Total     n − 1                       TSS

Rejection criterion: reject the null hypothesis if $F_{cal} > F_{tab} = F_{(k-1), (n-k), \alpha}$
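Once the sums of squares are known, the rest of the ANOVA table is simple arithmetic. In the sketch below, the function name `anova_f` and the input values BSS = 30 and ESS = 24 are hypothetical. The critical value 6.9266 is the F(2, 12, 0.01) table value quoted later in this handout; in general it must be read from an F table for the relevant d.f. and α.

```python
def anova_f(bss, ess, k, n):
    """Return (MBSS, MESS, F_cal) for k samples with n observations in total."""
    mbss = bss / (k - 1)  # between mean square, d.f. = k - 1
    mess = ess / (n - k)  # error mean square, d.f. = n - k
    return mbss, mess, mbss / mess


# Hypothetical sums of squares for k = 3 samples and n = 15 observations.
mbss, mess, f_cal = anova_f(bss=30.0, ess=24.0, k=3, n=15)

f_tab = 6.9266  # F(2, 12, 0.01), table value quoted in the handout
reject_h0 = f_cal > f_tab
print(mbss, mess, f_cal, reject_h0)
```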
7. Suppose that a random sample of 5 was selected from the vineyard properties for sale in Sonoma
County, California, in each of 3 years. The following data are the prices per acre (in dollars,
rounded to the nearest thousand) for the 3 years in Sonoma County. Carry out an ANOVA to
determine whether there is evidence to support the claim that the mean price per acre for land was not
the same for the 3 years considered. Use the 5% level of significance.

1996 (nearest thousand)   1997 (nearest thousand)   1998 (nearest thousand)
30                        30                        40
34                        35                        41
36                        37                        43
38                        38                        44
40                        40                        50

8. Over the course of two days, you have tracked how many hours it took each of your fifteen staff
members to complete the same assignment. Here are the results:

Group 1 Soda (hrs.)   Group 2 B-vitamin drink (hrs.)   Group 3 Coffee (hrs.)
8                     5                                7
8                     6                                6
10                    6                                7
7                     4                                7
10                    8                                9

Test at the 1 percent level of significance whether the supplied beverages affect productivity.

Solution: Xi: time taken to complete the assignment (in hrs.) after
consuming soda, B-vitamin drink, and coffee respectively, and
Xi ~ N(µi, σi²), where i = 1, 2, 3.
(i) H₀: µ₁ = µ₂ = µ₃ vs H₁: at least two population means differ
(ii) Since there are three independent samples, each of size ni = 5 (< 30),
where i = 1, 2, 3, we use ANOVA.
(iii) Test statistic: let X₁: group 1 (soda), X₂: group 2
(B-vitamin drink), X₃: group 3 (coffee); k = 3, n = 15.
Group 1 Soda    Group 2 B-vitamin    Group 3 Coffee
(hrs.) X₁       drink (hrs.) X₂      (hrs.) X₃         X₁²          X₂²          X₃²
8               5                    7                 64           25           49
8               6                    6                 64           36           36
10              6                    7                 100          36           49
7               4                    7                 49           16           49
10              8                    9                 100          64           81
T₁ = 43         T₂ = 29              T₃ = 36           ∑X₁² = 377   ∑X₂² = 177   ∑X₃² = 264

The grand total G = ∑Ti = T₁ + T₂ + T₃ = 43 + 29 + 36 = 108
Also, n = n₁ + n₂ + n₃ = 5 + 5 + 5 = 15 (all samples are of the same size).
G² = (108)² = 11664
Correction factor (C.F) = G²/n = 11664/15 = 777.6
Taking the sample totals T₁ = 43, T₂ = 29, T₃ = 36:
Between sum of squares: $BSS = \sum_{i} \dfrac{T_i^2}{n_i} - C.F$
Thus BSS = (T₁² + T₂² + T₃²)/ni − C.F (since every ni = 5)
= [(43)² + (29)² + (36)²]/5 − 777.6
= 797.2 − 777.6
= 19.6
Total sum of squares: $TSS = \sum_{i} \sum_{j} X_{ij}^2 - C.F$, where $i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n_i$
TSS = ∑X₁² + ∑X₂² + ∑X₃² − C.F
= (377 + 177 + 264) − 777.6
= 818 − 777.6 = 40.4
Error sum of squares: ESS = TSS − BSS
= 40.4 − 19.6
= 20.8
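As a cross-check on the subtraction, the error sum of squares can also be obtained directly as the sum of squared deviations of each observation from its own group mean. The Python sketch below recomputes the figures from the beverage data; the variable names are illustrative only.

```python
# Beverage data from the problem (hours per staff member).
groups = [
    [8, 8, 10, 7, 10],  # soda
    [5, 6, 6, 4, 8],    # B-vitamin drink
    [7, 6, 7, 7, 9],    # coffee
]

n = sum(len(g) for g in groups)
G = sum(sum(g) for g in groups)
cf = G ** 2 / n
tss = sum(x * x for g in groups for x in g) - cf
bss = sum(sum(g) ** 2 / len(g) for g in groups) - cf
ess = tss - bss

# Direct computation: squared deviations from each group's own mean.
ess_direct = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
print(round(cf, 1), round(tss, 1), round(bss, 1), round(ess, 1), round(ess_direct, 1))
```

Both routes give the same error sum of squares, which confirms the decomposition TSS = BSS + ESS for these data.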

ANOVA

Source    Degrees of freedom (d.f.)   Sum of squares (SS)   Mean sum of squares (MSS = SS/d.f.)     F calculated
Between   k − 1 = 3 − 1 = 2           BSS = 19.6            MBSS = BSS/(k − 1) = 19.6/2 = 9.8       F_cal = MBSS/MESS = 9.8/1.7333 ≈ 5.654
Error     n − k = 15 − 3 = 12         ESS = 20.8            MESS = ESS/(n − k) = 20.8/12 = 1.7333
Total     n − 1 = 15 − 1 = 14         TSS = 40.4

(iv) Rejection criterion: reject the null hypothesis if
$F_{cal} > F_{tab} = F_{(k-1), (n-k), \alpha} = F_{(3-1), (15-3), 0.01} = F_{2, 12, 0.01} = 6.9266$.
Since $F_{cal}$ ≈ 5.654 < $F_{tab}$ = 6.9266, we do not reject the null
hypothesis at the 1% level of significance.
(v) Conclusion: No, at the 1% level there is not sufficient evidence that the supplied beverages affect
productivity.
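The final step of this solution can be verified in Python as follows. The sums of squares and the critical value 6.9266 are the figures from the solution above; nothing else is assumed, and the variable names are illustrative.

```python
bss, ess = 19.6, 20.8  # sums of squares from the solution
k, n = 3, 15           # number of groups, total number of observations

mbss = bss / (k - 1)   # 19.6 / 2
mess = ess / (n - k)   # 20.8 / 12
f_cal = mbss / mess

f_tab = 6.9266         # F(2, 12, 0.01), as quoted in the solution
reject_h0 = f_cal > f_tab
print(round(mbss, 4), round(mess, 4), round(f_cal, 3), reject_h0)
```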
Do numerical no. 7 as assignment.
