
Advanced Statistics: Analysis of Variance (ANOVA) Dr. P.K.Viswanathan (Professor Analytics)

1) The document discusses analysis of variance (ANOVA), a statistical technique used to compare population means. 2) ANOVA decomposes total variation in a variable into different sources of variation, including treatment variation and error. 3) The document provides an example of how ANOVA can be used to determine the best size of advertisement (quarter, half, or full page) for a product by comparing store traffic under each treatment.


Advanced Statistics

Analysis of Variance (ANOVA)
Dr. P.K.Viswanathan (Professor Analytics)

ANOVA Basics
• This technique is part of the domain called "Experimental Designs".
• It helps establish, in a precise fashion, the cause-effect relationship among variables.
• From the statistical inference point of view, ANOVA is an extension of the independent t test, which tests the equality of two population means.
• When we have to compare more than two population means, we use ANOVA.
• Typically, the null hypothesis ($H_0$) for testing the equality of population means for k populations is as under:
  $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \cdots = \mu_k$

ANOVA-One Way Classification
Assumptions involved in using ANOVA

• The samples drawn from the different populations are independent and random.
• The response variables of all the populations are normally distributed.
• The variances of all the populations are equal.

Hypotheses of One-Way ANOVA

• All population means are equal:
  $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
• For at least one pair, the population means are unequal:
  $H_1$: Not all of the population means are equal
One-Way ANOVA
Null Hypothesis ($H_0$ is true)
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_1$: Not all $\mu_j$ are equal

$\mu_1 = \mu_2 = \mu_3$
One-Way ANOVA
Alternative Hypothesis ($H_1$ is true)
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_1$: Not all $\mu_j$ are equal

$\mu_1 = \mu_2 \neq \mu_3$ or $\mu_1 \neq \mu_2 \neq \mu_3$
ANOVA Basics

• The beauty of ANOVA is that it tests the equality of more than two population means by actually analyzing the variance.
• In simple terms, ANOVA decomposes the total variation into components of variation. That is, it explains the changes in the response variable caused by these components.
• To put it succinctly, the total sum of squares is equal to the sum of the sums of squares due to the individual causes.

Partition of Total Variation (Information Content)
Total Variation [Total Sum of Squares (TSS)] ≡ Treatment Sum of Squares (TRSS) + Error Sum of Squares (ESS)
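The partition can be written out in symbols. The notation below is introduced here for illustration and is not taken from the slides: k treatment groups, $n_j$ observations in group j, observation $x_{ij}$, group mean $\bar{x}_j$, and grand mean $\bar{x}$.

```latex
\[
\underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}\right)^2}_{\text{TSS}}
=
\underbrace{\sum_{j=1}^{k} n_j\left(\bar{x}_j-\bar{x}\right)^2}_{\text{TRSS}}
+
\underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}_j\right)^2}_{\text{ESS}}
\]
```

TRSS measures how far the group means lie from the grand mean, and ESS measures how far the observations lie from their own group mean.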

ANOVA-One Way Classification-Example
• A supermarket is interested in knowing whether it should go for a quarter-page, half-page, or full-page advertisement for a product.
• In order to choose the size of advertisement that will bring in the most store traffic, the supermarket can use the ANOVA technique.
• Here, you are trying to establish a cause-effect relationship between store traffic and the various sizes of advertisement.

ANOVA-One Way Classification
How Does One-Way Classification Work in Practice?

• Total Sum of Squares ≡ Treatment Sum of Squares + Error Sum of Squares.
• The word "treatment" is generic and as such may denote different methods, machines, advertisement copy platforms, strategies, brands, and the like.
• The only explained source of variation in the sum of squares of the response variable (dependent variable) is the treatment; anything left unexplained by the treatment is attributed to the error term.
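The partition above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used to produce the slides: the function name `one_way_anova` and the store-traffic numbers are hypothetical, invented only to show the arithmetic.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the sum-of-squares partition and F statistic for a one-way layout.

    groups: dict mapping a treatment label to a list of response values.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    k = len(groups)        # number of treatments
    n = len(all_values)    # total number of observations

    # Total variation: every observation against the grand mean.
    tss = sum((x - grand_mean) ** 2 for x in all_values)
    # Treatment variation: each group mean against the grand mean.
    trss = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Error variation: every observation against its own group mean.
    ess = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

    df_treatment, df_error = k - 1, n - k
    f_value = (trss / df_treatment) / (ess / df_error)
    return {"TSS": tss, "TRSS": trss, "ESS": ess,
            "df": (df_treatment, df_error), "F": f_value}


# Hypothetical store-traffic counts, invented purely for illustration.
traffic = {
    "quarter-page": [110, 120, 115, 125],
    "half-page": [130, 128, 140, 134],
    "full-page": [150, 145, 155, 158],
}
result = one_way_anova(traffic)
print(result)
```

Whatever data are supplied, TSS should equal TRSS + ESS up to floating-point rounding, which is a quick sanity check on any ANOVA table.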

One-Way ANOVA: Application
A sporting goods manufacturing company wanted to compare the distance traveled by golf balls produced using four different designs. Ten balls were manufactured with each design and were brought to the local golf course for the club professional to test. The order in which the balls were hit with the same club from the first tee was randomized so that the pro did not know which type of ball was being hit. All 40 balls were hit in a short period of time, during which the environmental conditions were essentially the same. The results (distance traveled in yards) for the four designs are stored in Golfball.csv.

At the 0.05 level of significance, is there evidence of a difference in the mean distances traveled by the golf balls with different designs?

Source: Problem 10.64, Chapter 10, page 381 of the textbook Business Statistics: A First Course, 7th Edition, Pearson Education, Indian Edition.
Anova Output

            Df    Sum Sq   Mean Sq   F value     Pr(>F)
Design       3   2990.99    997.00     53.03   2.73E-13
Residuals   36    676.82     18.80
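The columns of this output are linked by simple arithmetic: each mean square is the sum of squares divided by its degrees of freedom, and the F value is the ratio of the two mean squares. The short check below uses only the numbers reported in the table; the variable names are chosen here for illustration.

```python
# Values copied from the one-way ANOVA output for the golf ball designs.
df_design, ss_design = 3, 2990.99
df_resid, ss_resid = 36, 676.82
p_value = 2.73e-13   # Pr(>F) as reported in the output
alpha = 0.05         # significance level stated in the problem

ms_design = ss_design / df_design   # Mean Sq for Design
ms_resid = ss_resid / df_resid      # Mean Sq for Residuals
f_value = ms_design / ms_resid      # F value

print(f"Mean Sq (Design)    = {ms_design:.2f}")
print(f"Mean Sq (Residuals) = {ms_resid:.2f}")
print(f"F value             = {f_value:.2f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

The computed values agree with the table (997.00, 18.80, 53.03). Because the reported p-value is far below 0.05, the null hypothesis of equal mean distances is rejected at the 0.05 level.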
TukeyHSD Test

                     diff       lwr       upr      p adj
Design2-Design1   11.9020    6.6795   17.1245   2.65E-06
Design3-Design1   19.9740   14.7515   25.1965   1.64E-11
Design4-Design1   22.0080   16.7855   27.2305   8.89E-13
Design3-Design2    8.0720    2.8495   13.2945    0.00103
Design4-Design2   10.1060    4.8835   15.3285   4.51E-05
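Four designs give six pairwise comparisons, but the table lists only five; the Design4-Design3 row does not appear in the source. Its difference and interval can be recovered from the reported rows, assuming the table follows the usual Tukey HSD layout for equal group sizes, in which every interval has the same half-width. The sketch below derives those values; they are computed here and are not part of the original output, and no adjusted p-value is derived.

```python
# Differences in mean distance relative to Design1, copied from the Tukey table.
diff_vs_design1 = {"Design2": 11.9020, "Design3": 19.9740, "Design4": 22.0080}

# Every reported interval has the same half-width (diff - lwr), as expected
# for Tukey's HSD when all groups have the same number of observations.
half_width = 11.9020 - 6.6795

# Derived here, not reported in the source: Design4 minus Design3.
diff_43 = diff_vs_design1["Design4"] - diff_vs_design1["Design3"]
lwr_43, upr_43 = diff_43 - half_width, diff_43 + half_width

# Cross-check against a row that IS reported: Design3 minus Design2.
diff_32 = diff_vs_design1["Design3"] - diff_vs_design1["Design2"]

print(f"Design3-Design2 (check): diff={diff_32:.4f}")
print(f"Design4-Design3: diff={diff_43:.4f}, lwr={lwr_43:.4f}, upr={upr_43:.4f}")
print("Interval contains zero:", lwr_43 < 0 < upr_43)
```

Under these assumptions the Design4-Design3 difference is about 2.03 yards with an interval that spans zero, which would suggest that Designs 3 and 4 are not distinguishable at the 5% level, while every reported pair is.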
Interesting Application of Two-Factor ANOVA/ANCOVA
Testing the Effects of Price and Advertising

Newfood Product Management opted to conduct a market test experiment using a balanced two-factor design with three levels of price (low, medium, high) and two levels of advertising (low and high). Each combination of price and advertising was used in four different stores, resulting in a total of 24 observations. The data are given in paul-newfood.csv. Perform ANOVA/ANCOVA and interpret the results.
Newfood-Anova-Main Effect

                Df        Sum Sq       Mean Sq   F value   Pr(>F)
Price            2   600412.5833   300206.2917   13.6640   0.0002
Advertisement    1       32.6667       32.6667    0.0015   0.9696
Residuals       20   439412.5833    21970.6292
Newfood-Interaction Effect
Newfood-Anova-Interaction Effect

                      Df        Sum Sq       Mean Sq   F value   Pr(>F)
Price                  2   600412.5833   300206.2917   14.7819   0.0002
Advertisement          1       32.6667       32.6667    0.0016   0.9685
Price:Advertisement    2    73850.0833    36925.0417    1.8182   0.1909
Residuals             18   365562.5000    20309.0278
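Comparing the main-effects table with the interaction table shows where the interaction sum of squares comes from: adding the Price:Advertisement term moves part of the main-effects residual into its own row. The check below uses only figures from the two tables; the dictionary names are illustrative.

```python
# Sums of squares and degrees of freedom copied from the two Newfood tables.
main_effects_resid = {"df": 20, "ss": 439412.5833}
interaction = {"df": 2, "ss": 73850.0833}
interaction_resid = {"df": 18, "ss": 365562.5000}

# Adding the interaction term splits the old residual into two rows.
ss_gap = main_effects_resid["ss"] - (interaction["ss"] + interaction_resid["ss"])
df_gap = main_effects_resid["df"] - (interaction["df"] + interaction_resid["df"])

# F for the interaction: its mean square over the residual mean square.
f_interaction = (interaction["ss"] / interaction["df"]) / (
    interaction_resid["ss"] / interaction_resid["df"]
)
print(f"SS gap = {ss_gap:.4f}, df gap = {df_gap}, F = {f_interaction:.4f}")
```

Both gaps come out as zero and the F value matches the reported 1.8182. With a reported p-value of 0.1909, the interaction is not significant at the 0.05 level in this model.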
Newfood-Ancova-Interaction Effect

                      Df        Sum Sq       Mean Sq   F value   Pr(>F)
StoreSize              1   191540.9780   191540.9780   18.8113   0.0004
Price                  2   501543.2182   250771.6091   24.6284   0.0000
Advertisement          1   128371.9891   128371.9891   12.6075   0.0025
Price:Advertisement    2    45304.1585    22652.0793    2.2247   0.1386
Residuals             17   173097.4895    10182.2053
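One way to read the ANCOVA table against the earlier ANOVA tables is to compare total and residual variation. The sketch below assumes the ANCOVA rows form a sequential decomposition of the same total sum of squares, which the source does not state; it checks that assumption numerically and compares the residual mean squares.

```python
# Sums of squares copied from the Newfood tables.
# Main-effects ANOVA order: Price, Advertisement, Residuals.
anova_main = [600412.5833, 32.6667, 439412.5833]
# ANCOVA order: StoreSize, Price, Advertisement, Price:Advertisement, Residuals.
ancova = [191540.9780, 501543.2182, 128371.9891, 45304.1585, 173097.4895]

total_anova = sum(anova_main)
total_ancova = sum(ancova)
print(f"Total SS (ANOVA)  = {total_anova:.4f}")
print(f"Total SS (ANCOVA) = {total_ancova:.4f}")

# Residual mean square before and after adding the StoreSize covariate.
mse_anova_interaction = 365562.5000 / 18
mse_ancova = 173097.4895 / 17
print(f"Residual Mean Sq: {mse_anova_interaction:.4f} -> {mse_ancova:.4f}")
```

The two totals agree, which is consistent with the sequential reading. Including StoreSize roughly halves the residual mean square, and in the reported tables Advertisement goes from a p-value of 0.9685 without the covariate to 0.0025 with it.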
Newfood-Ancova-Interaction Effect-Adjusted for Store Size

                      Df        Sum Sq       Mean Sq   F value   Pr(>F)
Price                  2   386945.4676   193472.7338   19.0011   0.0000
Advertisement          1   128371.9891   128371.9891   12.6075   0.0025
Price:Advertisement    2    45304.1585    22652.0793    2.2247   0.1386
Residuals             17   173097.4895    10182.2053
