Notes on ANOVA technique
1. ANOVA is short for Analysis of Variance.
2. This technique is useful for comparing the mean value of a numerical variable across various classifications. For example:
   a. We want to compare sales across modes of advertising. Say there are 4 modes of advertising (newspapers, social media, FM radio channels and display hoardings), and sales values are observed for each mode. We want to know whether any one mode leads to significantly higher or significantly lower sales. ANOVA can be used for this.
   b. Similarly, we may want to compare marks across 3 divisions.
   c. Three drug dosages are to be compared to see which dosage is effective in reducing blood pressure.
   d. We want to find which plant locations (out of 8) in a manufacturing company have low productivity, so that corrective managerial action can be taken.
3. In each of the cases above, a numerical variable is compared across the
classifications of a categorical variable.
Sr No   Numerical variable            Categorical variable    Number of classifications
1       Sales                         Mode of advertising     4 modes of advertising
2       Marks                         Division of students    3 divisions
3       Reduction in blood pressure   Dosage of drug          3 levels of dosage
4. Note that each situation above has more than 2 classifications.
5. If there were only two classifications (i.e. only two divisions of students for comparing marks), a two-sample t test would be the appropriate method for evaluating whether one division has higher marks.
6. When 3 divisions are compared, 3 pairs of divisions get compared. For example, with division A, division B and division C, comparison is required for A vs B, B vs C and A vs C, i.e. 3 pairs. Hence 3 t tests would be needed.
7. The advantage of ANOVA is that one ANOVA test can check for a difference among all 3 divisions at once, a task that would otherwise need 3 t tests.
8. ANOVA works on the following logic: if one classification level is significantly better or worse than the other levels, the variation of that level's mean from the overall sample mean will be far higher than the variation of sample values within that level. To illustrate with division-wise marks: if Division A students have scored significantly more marks than the other divisions, the variation between the marks of Division A students and the overall average marks across all divisions will be far higher than the variation of marks within Division A. ANOVA measures this as the ratio of between-group variance to within-group variance (the F statistic).
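The logic in item 8 can be sketched in a few lines of Python. This is only an illustration: the marks are made-up data, and `one_way_anova_f` is a hypothetical helper name, not something from these notes or from SAS. It computes the F ratio by hand and leaves out the p-value, which would come from the F distribution with the two degrees of freedom returned.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Variation of each group mean from the overall mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical marks for three divisions (illustration only).
marks = {
    "A": [78, 82, 85, 80, 75],
    "B": [62, 65, 60, 68, 65],
    "C": [64, 61, 66, 63, 61],
}

f_stat, df_b, df_w = one_way_anova_f(list(marks.values()))
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

With this made-up data the between-group mean square is 455 and the within-group mean square is 9.5, so F is about 47.89. That is far above the 5% critical value of roughly 3.89 for (2, 12) degrees of freedom, which is the situation item 8 describes: Division A stands apart from the others.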
9. Post Hoc tests
   a. ANOVA only shows whether at least one classification level is significantly better or worse than the other levels. It does not identify that level.
   b. Post hoc tests are carried out to identify such classification levels.
   c. There are two basic methods:
      i. Compare all possible pairs
      ii. Compare all other levels with one control level
   d. Such comparisons, if unadjusted, result in loss of significance: the overall chance of wrongly finding a difference rises above the chosen level.
      i. To illustrate:
         1. When two samples are compared at 5% significance, there is a probability of 5% or less of concluding that they are different when they are not. Hence there is a probability of 95% or more that two samples that are not different are found not different. Let us call this test for two samples the test for one pair of samples.
         2. Now if marks for 3 divisions are compared, 3 comparisons are made in total (A with B, A with C and B with C, as mentioned earlier), i.e. 3 pairs are compared. 3 t tests are carried out.
         3. Probability that at least one pair is found different = 1 - probability that no pair is found different
         4. Since 3 pairs are compared, probability that none of the 3 pairs is found different = 0.95^3 (treating the 3 tests as independent)
         5. Hence probability that at least 1 pair is found different = 1 - 0.95^3 = 14.26%
         6. We can tolerate a maximum of 5% for significance across the 3 divisions. 14.26% would not be acceptable.
         7. Hence for each pair the acceptable significance level needs to be adjusted below 5% so that 5% significance is achieved at the aggregate level. (If significance for each pair is reduced from 5% to 1.69%, the probability that a pair is not found different is 1 - 1.69%, i.e. 98.31%. The probability that none of the 3 pairs is found different is 0.9831^3, which is approximately 0.95. Hence the probability that at least one pair is found different is 1 - 0.95 = 0.05. This is the desired significance at the aggregate level.)
         8. This adjustment is done automatically by the SAS procedure by selecting
            a. Bonferroni or Tukey adjustment/correction for pairwise comparison
            b. Dunnett adjustment/correction for comparison with a control level
10. ANOVA assumes that
   a. The sample within each classification level is normally distributed, with observations independent of each other
      i. Normality is tested using the Shapiro-Wilk test or the Jarque-Bera test
   b. The variance within each classification level is the same
      i. This is tested using Levene's test on squared residuals.
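The arithmetic in item 9 (14.26% unadjusted, 1.69% per pair after adjustment) can be reproduced with a short Python sketch. The function names are hypothetical, and the calculation treats the pairwise tests as independent, as the notes do. The 1.69% figure corresponds to solving 1 - (1 - a)^3 = 0.05 for a; the simpler Bonferroni rule of dividing 5% by the number of pairs gives a slightly smaller value and is shown for comparison.

```python
from itertools import combinations


def familywise_error(alpha_per_test, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha_per_test) ** n_tests


def exact_per_test_alpha(alpha_family, n_tests):
    """Per-test level that gives exactly alpha_family overall (independent tests)."""
    return 1 - (1 - alpha_family) ** (1 / n_tests)


def bonferroni_per_test_alpha(alpha_family, n_tests):
    """Bonferroni rule: divide the overall level by the number of tests."""
    return alpha_family / n_tests


divisions = ["A", "B", "C"]
pairs = list(combinations(divisions, 2))  # A-B, A-C, B-C
n_pairs = len(pairs)

unadjusted = familywise_error(0.05, n_pairs)
per_pair = exact_per_test_alpha(0.05, n_pairs)
bonferroni = bonferroni_per_test_alpha(0.05, n_pairs)

print(f"pairs compared: {n_pairs}")
print(f"unadjusted overall error: {unadjusted:.4f}")   # about 0.1426
print(f"adjusted per-pair level: {per_pair:.4f}")      # about 0.0170
print(f"Bonferroni per-pair level: {bonferroni:.4f}")  # about 0.0167
```

The number of pairs grows quickly with the number of levels (8 plant locations would give 28 pairs), which is why the per-pair level has to shrink as more levels are compared.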
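The assumption checks in item 10 can also be sketched by hand, under some stated assumptions. The marks are made-up data and all function names are hypothetical. Levene's test on squared residuals is taken here to be a one-way ANOVA F ratio computed on the squared deviations from each group mean, and the Jarque-Bera statistic is computed on the pooled residuals, with its p-value taken from the chi-square distribution with 2 degrees of freedom. That approximation is only reliable for large samples, so the tiny sample below is purely for showing the mechanics. The Shapiro-Wilk test is not reproduced here.

```python
from math import exp
from statistics import mean


def group_residuals(groups):
    """Deviation of each observation from its own group mean."""
    result = []
    for g in groups:
        m = mean(g)
        result.append([x - m for x in g])
    return result


def f_ratio(groups):
    """One-way ANOVA F ratio: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def levene_squared_f(groups):
    """Levene-type check: F ratio computed on squared residuals."""
    squared = [[r ** 2 for r in res] for res in group_residuals(groups)]
    return f_ratio(squared)


def jarque_bera(values):
    """Return (JB statistic, large-sample p-value) from skewness and kurtosis."""
    n = len(values)
    centre = mean(values)
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Chi-square with 2 degrees of freedom has survival function exp(-x / 2).
    return jb, exp(-jb / 2)


# Hypothetical marks for three divisions (illustration only).
marks = [
    [78, 82, 85, 80, 75],
    [62, 65, 60, 68, 65],
    [64, 61, 66, 63, 61],
]

levene_f = levene_squared_f(marks)
pooled = [r for res in group_residuals(marks) for r in res]
jb_stat, jb_p = jarque_bera(pooled)

print(f"Levene-type F on squared residuals: {levene_f:.3f}")
print(f"Jarque-Bera statistic: {jb_stat:.3f}, approximate p-value: {jb_p:.3f}")
```

A small Levene-type F suggests the within-level variances are similar, and a large Jarque-Bera p-value gives no evidence against normality. In practice the statistical package reports both tests with proper p-values.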