6 - Test of Hypothesis (Part - 2)
Simultaneously
STAT-600 1
Analysis of Variance (ANOVA)
Analysis of Variance is a procedure that partitions the total variability in the data into distinct components.
Each component represents the variation due to a recognized source of variation. In addition, one component represents the variation due to uncontrolled factors and random errors associated with the response measurements.
Total Variation = Explained + Unexplained
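The partition can be written as a sum-of-squares identity. The block below is a sketch for the one-factor case; the notation ($y_{ij}$ for observation $j$ in group $i$, $\bar{y}_i$ for the group mean, $\bar{y}$ for the grand mean, $n_i$ for the group size) is an assumption introduced here, not taken from the slides.

```latex
\underbrace{\sum_{i}\sum_{j}\left(y_{ij}-\bar{y}\right)^{2}}_{\text{Total}}
=
\underbrace{\sum_{i} n_{i}\left(\bar{y}_{i}-\bar{y}\right)^{2}}_{\text{Explained}}
+
\underbrace{\sum_{i}\sum_{j}\left(y_{ij}-\bar{y}_{i}\right)^{2}}_{\text{Unexplained}}
```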
Example:- The milk butterfat percentage of 4 breeds of cows is to be compared. A random sample of 6 mature cows was taken from each of the 4 breeds, and the following data were recorded.

         Breed 1   Breed 2   Breed 3   Breed 4
           3.6       4.6       3.7       5.8
           4.1       4.9       3.6       5.0
           4.0       5.7       3.8       5.3
           3.9       5.9       3.2       5.2
           3.2       4.3       3.9       4.9
           4.3       5.1       3.2       5.8
Total     23.1      30.5      21.4      32.0      Grand total = 107.0

Test the hypothesis that the average milk butterfat percentage is the same for the four breeds.

Explained: Between Breed
Unexplained: Within Breed
Graphical View of the data
[Figure: Dot plot and boxplot of butterfat percentage by breed (Breed 1 to Breed 4)]
The average milk butterfat percentages of Breeds 2 and 4 are almost similar, and so are those of Breeds 1 and 3. Although Breed 2 has the largest variability in the data, the variability across the four breeds is roughly the same.
Statistical Analysis by One Way ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ (the average milk butterfat percentage is the same for the 4 breeds)
$H_1$: At least two means are different
Test statistic: $F = \dfrac{S_b^2}{S_w^2}$
Within Breed SS = Total SS - Between Breed SS = 17.8383 - 13.919 = 3.92
S.O.V            DF    SS         MSS = SS/df             Fcal
Between Breed     3    13.919     4.640 ($S_b^2$)         23.67*
Within Breed     20     3.920     0.196 ($S_w^2$, MSE)
TOTAL            23    17.8383
(* denotes a significant F value)
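The one-way ANOVA quantities can be recomputed from the raw data using the correction-factor method. The following is a minimal sketch in plain Python; the function name `one_way_anova` and the variable names are illustrative assumptions. Recomputing from the data reproduces the total SS of 17.8383 and gives a between-breed SS and F ratio close to the tabulated values (hand-rounded figures may differ slightly in the later digits).

```python
# Butterfat percentages from the example: six cows per breed.
data = {
    "Breed 1": [3.6, 4.1, 4.0, 3.9, 3.2, 4.3],
    "Breed 2": [4.6, 4.9, 5.7, 5.9, 4.3, 5.1],
    "Breed 3": [3.7, 3.6, 3.8, 3.2, 3.9, 3.2],
    "Breed 4": [5.8, 5.0, 5.3, 5.2, 4.9, 5.8],
}


def one_way_anova(groups):
    """Return (ss_between, ss_within, ss_total, f_ratio) for a dict of samples."""
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)
    k = len(groups)

    # Correction factor: (grand total)^2 / number of observations.
    cf = sum(all_values) ** 2 / n_total

    ss_total = sum(x * x for x in all_values) - cf
    ss_between = sum(sum(v) ** 2 / len(v) for v in groups.values()) - cf
    ss_within = ss_total - ss_between  # obtained by subtraction

    ms_between = ss_between / (k - 1)      # df = k - 1
    ms_within = ss_within / (n_total - k)  # df = N - k
    return ss_between, ss_within, ss_total, ms_between / ms_within


ss_b, ss_w, ss_t, f_cal = one_way_anova(data)
print(f"Between SS = {ss_b:.4f}")
print(f"Within SS  = {ss_w:.4f}")
print(f"Total SS   = {ss_t:.4f}")
print(f"F          = {f_cal:.2f}")
```

The within-breed SS is obtained by subtraction, mirroring the hand calculation (Total - Between Breed).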
Breed      Mean
Breed 1    3.850
Breed 2    5.083
Breed 3    3.567
Breed 4    5.333
[Figure: Mean plot for butterfat percentage by breed]
TWO WAY ANOVA
The effective life (in hours) of batteries is compared across three material types and three temperature levels (Low, Medium, High). Batteries are randomly selected from each material type and are then randomly allocated to each temperature level. The resulting life of all batteries is shown below.
Explained:
• Due to material type
• Due to different temperature
Unexplained:
• Error
Temp      Type I   Type II   Type III   Total
Low        180      188       160        528
Medium     215      210       190        615
High        82       90        80        252
Total      477      488       430       1395

Correction Factor: $\text{CF} = \dfrac{(\text{G.T})^2}{\text{Obs}} = \dfrac{(1395)^2}{9} = 216225$
Total SS: $(180)^2 + (215)^2 + \dots + (80)^2 - \text{CF} = 240993 - 216225 = 24768$
Between Material SS: $\dfrac{(477)^2}{3} + \dfrac{(488)^2}{3} + \dfrac{(430)^2}{3} - \text{CF} = 632.67$
Between Temp SS: $\dfrac{(528)^2}{3} + \dfrac{(615)^2}{3} + \dfrac{(252)^2}{3} - \text{CF} = 23946$
Error SS: Total - Material - Temp $= 24768 - 632.67 - 23946 = 189.33$
S.O.V       DF    SS        MSS = SS/df           Fcal        Ftab
Material     2    632.67    316.33 ($S_M^2$)      6.68 ns     $F_{0.05}(2,4) = 6.94$
Temp         2    23946     11973 ($S_T^2$)       252.95*     $F_{0.05}(2,4) = 6.94$
Error        4    189.33    47.33 (MSE)
TOTAL        8    24768
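The two-way ANOVA arithmetic (one observation per cell) can be checked with a short script. This is a minimal sketch in plain Python following the same correction-factor steps as the hand calculation; the variable names are illustrative assumptions.

```python
# Battery life (hours): rows are temperature levels, columns are material
# types I, II and III, as in the data table.
life = {
    "Low":    [180, 188, 160],
    "Medium": [215, 210, 190],
    "High":   [82, 90, 80],
}

rows = list(life.values())
r = len(rows)      # number of temperature levels
c = len(rows[0])   # number of material types

grand_total = sum(sum(row) for row in rows)
cf = grand_total ** 2 / (r * c)  # correction factor

ss_total = sum(x * x for row in rows for x in row) - cf

# Temperature totals are row totals; each is a sum of c observations.
ss_temp = sum(sum(row) ** 2 for row in rows) / c - cf

# Material totals are column totals; each is a sum of r observations.
col_totals = [sum(col) for col in zip(*rows)]
ss_material = sum(t * t for t in col_totals) / r - cf

ss_error = ss_total - ss_material - ss_temp
mse = ss_error / ((r - 1) * (c - 1))

f_material = (ss_material / (c - 1)) / mse
f_temp = (ss_temp / (r - 1)) / mse

f_tab = 6.94  # tabulated F at 0.05 with (2, 4) df, taken from the ANOVA table
print(f"Material: SS = {ss_material:.2f}, F = {f_material:.2f}, significant: {f_material > f_tab}")
print(f"Temp:     SS = {ss_temp:.2f}, F = {f_temp:.2f}, significant: {f_temp > f_tab}")
print(f"Error:    SS = {ss_error:.2f}, MSE = {mse:.2f}")
```

Comparing each F ratio with the tabulated 6.94 reproduces the table's conclusion: the material effect is not significant, while the temperature effect is.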
Temperature   Low   Medium   High
Mean life     176   205      84
[Figure: Plot of average life by temperature level]
Chi-Square Goodness of Fit Test
Example:- Genetic theory suggested that the ratio of different types of
Test statistic: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$
O: Observed frequency
E: Expected frequency, i.e. the frequency expected if $H_0$ is true
The ratio is 9:3:3:1
Categories                    O      E
Yellow with green stigma     110     (9/16) x 195 = 109.688
Yellow with red stigma        40     (3/16) x 195 = 36.563
White with green stigma       30     (3/16) x 195 = 36.563
White with red stigma         15     (1/16) x 195 = 12.188
TOTAL                        195     195
$\chi^2_{cal} = \sum \dfrac{(O - E)^2}{E} = 2.15$
Reject $H_0$ if $\chi^2_{cal} > \chi^2_{0.05}(k-1)$, where k = number of categories.
$\chi^2_{cal} = 2.15 < \chi^2_{0.05}(3) = 7.81$
Don't reject $H_0$.
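The goodness-of-fit calculation can be reproduced in a few lines. This is a minimal sketch in plain Python; the variable names are illustrative assumptions, and the critical value 7.81 is the tabulated value from the example rather than something the code computes.

```python
# Observed counts in the order: yellow/green, yellow/red, white/green, white/red.
observed = [110, 40, 30, 15]
ratio = [9, 3, 3, 1]  # hypothesized ratio under H0

n = sum(observed)

# Expected frequency for each category if H0 (the 9:3:3:1 ratio) is true.
expected = [n * part / sum(ratio) for part in ratio]

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1  # k - 1 degrees of freedom
critical = 7.81         # tabulated chi-square at 0.05 with 3 df

print(f"Expected: {[round(e, 3) for e in expected]}")
print(f"Chi-square = {chi_sq:.2f} with {df} df")
print("Reject H0" if chi_sq > critical else "Don't reject H0")
```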
TEST OF INDEPENDENCE
BETWEEN QUALITATIVE VARIABLES
Observed frequencies for Sugar: Helped 30, Harmed 12, No Effect 10, Sub Total 52
Expected Frequency = (Row Total) x (Column Total) / (Total Observations)

Category   Helped                      Harmed                     No Effect                  Sub Total
Drug       (112x122)/164 = 83.32       (112x22)/164 = 15.02       (112x20)/164 = 13.66       112
Sugar      (52x122)/164 = 38.68        (52x22)/164 = 6.98         (52x20)/164 = 6.34         52
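The expected frequencies for a contingency table follow directly from the row and column totals. The sketch below is a minimal plain-Python illustration. Note one assumption: the observed counts for the Drug row (92, 10, 10) do not appear in the material above and are inferred here from the column totals (122, 22, 20) and the Sugar row (30, 12, 10). The variable names are also illustrative.

```python
# Observed counts by outcome: Helped, Harmed, No Effect.
# ASSUMPTION: the Drug row is inferred as (column total - Sugar count).
observed = {
    "Drug":  [92, 10, 10],
    "Sugar": [30, 12, 10],
}

row_totals = {name: sum(counts) for name, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected frequency = (row total x column total) / total observations.
expected = {
    name: [row_totals[name] * ct / grand_total for ct in col_totals]
    for name in observed
}

for name, values in expected.items():
    print(name, [round(v, 2) for v in values])
```

Each row of expected frequencies sums to its row total, which gives a quick consistency check on the calculation.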