Anova

This document discusses analysis of variance (ANOVA) techniques. It covers one-way and two-way ANOVA models. Key concepts covered include: (1) Situations where ANOVA is applicable, such as comparing population means across multiple groups; (2) Terminology used in ANOVA such as dependent/independent variables, treatment levels, and experimental units; (3) How ANOVA can be viewed as a regression model with categorical predictors; and (4) How total variation is partitioned into explained and unexplained components in ANOVA. Worked examples are provided to illustrate key ANOVA concepts and calculations.

Uploaded by

xueli li
Copyright
© © All Rights Reserved

Business Analytics

Analysis of Variance
ANOVA

Prof. Dr. Julia Hartmann


Fall Term 2022
Business Analytics

[Figure: course overview. Data Collection: Refinitiv (Bloomberg). Data Analytics: Descriptives, Testing, Regression, ANOVA. Data Mining: Logistic, Clustering. Predictive Analytics: Forecasting, Simulation.]

2
Agenda

1. Fundamentals
2. One-way ANOVA
3. Two-way ANOVA

3
Fundamentals

Situations for ANOVA


ANOVA Types
Terminology

4
(1/3) Situations for ANOVA: A/B Test

• Which one do you like more?

• 65% of respondents prefer the apple

5
(2/3) Situations for ANOVA: Serial A/B Test

• Which one do you like more?

• 80% of respondents prefer cherries


• (Serial) A/B tests do not enable a test for interactions between variables.
• The best combination cannot be found.
• Multivariate testing (MVT) is preferred.
6
(3/3) Situations for ANOVA

• The procedure for analyzing the difference between more than two population
means is commonly called analysis of variance, or ANOVA.
– There are two typical situations where ANOVA is used:
• When there are several distinct populations
– Example: Do graduates from MSc in Management versus Finance versus Digital Marketing
versus Real Estate get different starting salaries?
• In randomized experiments; in this case, a single population is treated in one of
several ways.
– Population: All persons who suffer from long-COVID
– Treatment: receive medication
– Control: receive placebo

7
(1/4) ANOVA Types

– In an observational study, we analyze data already available to us (e.g. student entry salaries after graduation).
• The disadvantage is that it is difficult or impossible to rule out factors over
which we have no control for the effects we observe.
– In a designed experiment, we control for various factors such as age, gender, or
socioeconomic status so that we can learn more precisely what is responsible for
the effects we observe.
• In a carefully designed experiment, we can be fairly sure that any differences
across groups are due to the variables that we purposely manipulate.
• This ability to infer causal relationships is never possible with observational
studies.

8
(2/4) ANOVA Types

• Experimental design is the science (and art) of setting up an experiment so that the
most information can be obtained for the time and money involved.
– Decide which variables to manipulate, which levels and which observations to get
– In a carefully designed experiment, we can be fairly sure that any differences
across groups are due to the variables that we purposely manipulate.
– This ability to infer causal relationships is never possible with observational
studies.
– Unfortunately, managers do not always have the luxury of being able to design a
controlled experiment for obtaining data, but often have to rely on whatever data
are available (that is, observational data).

9
(3/4) ANOVA Types: One-way ANOVA

• Independent variable or factor: Price
– Treatment levels: Low, Medium, High
• Dependent variable: Purchase intention

10
(4/4) ANOVA Types: Two-way ANOVA
• Independent variables or factors:
– Price, with treatment levels Low, Medium, High
– Quality, with treatment levels Low, High
• Dependent variable: Purchase intention

11
(1/2) Terminology

• The variable of primary interest that we wish to measure is called the dependent
variable (or sometimes the response or criterion variable).
– This is the variable we measure to detect differences among groups.
• The groups themselves are determined by one or more factors (sometimes called
independent or explanatory variables), each varied at several treatment levels (often
shortened to levels).
– It is best to think of a factor as a categorical variable, with the possible categories
being its levels.
• The entities measured at each treatment level (or combination of levels) are called
experimental units.

12
(2/2) Terminology

• The number of factors determines the type of ANOVA.


– In one-way ANOVA, a single dependent variable is measured at various levels of a
single factor.
• Each experimental unit is assigned to one of these levels.
– In two-way ANOVA, a single dependent variable is measured at various
combinations of the levels of two factors.
• Each experimental unit is assigned to one of these combinations of levels.
– In three-way ANOVA, there are three factors.
• In a balanced design, an equal number of experimental units is assigned to each
combination of treatment levels.

13
Basic Mechanisms of ANOVA

ANOVA as Regression with Categorical Predictors


Deviations from Means

14
(1/5) ANOVA as Regression with Categorical Predictors

• Independent variable or factor: Price
– Treatment levels: Low, Medium, High
• Dependent variable: Purchase intention

15
(2/5) ANOVA as Regression with Categorical Predictors

Estimated multiple regression equation with dummies denoting the experimental conditions:

• $\text{purchase intention} = b_0 + b_1 \cdot \text{medium price dummy} + b_2 \cdot \text{high price dummy}$

16
(3/5) ANOVA as Regression with Categorical Predictors

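Low price group

• $\text{purchase intention} = b_0 + b_1 \cdot \text{medium price dummy} + b_2 \cdot \text{high price dummy}$
• As in regression using dummy variables for categories, the base category (low price)
is denoted by x1 = 0 and x2 = 0, where x1 is the medium price dummy and x2 is the high price dummy.
• Hence, with $\overline{pi}_{low}$ denoting the mean purchase intention of the low price group,
– $\overline{pi}_{low} = b_0 + b_1 \cdot 0 + b_2 \cdot 0 = b_0$
– $\overline{pi}_{low} = b_0$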

17
(4/5) ANOVA as Regression with Categorical Predictors

High price group

• $\text{purchase intention} = b_0 + b_1 \cdot \text{medium price dummy} + b_2 \cdot \text{high price dummy}$
• $\overline{pi}_{high} = b_0 + b_1 \cdot 0 + b_2 \cdot 1 = b_0 + b_2$
• We already know that $b_0$ is the mean for the low price group. Hence
– $\overline{pi}_{high} = \overline{pi}_{low} + b_2$
• Solving for $b_2$, we get
– $b_2 = \overline{pi}_{high} - \overline{pi}_{low}$
• $b_2$ is the difference in purchase intention between the means of the high and low
price groups.

18
(5/5) ANOVA as Regression with Categorical Predictors

Medium price group

• $\text{purchase intention} = b_0 + b_1 \cdot \text{medium price dummy} + b_2 \cdot \text{high price dummy}$
• $\overline{pi}_{medium} = b_0 + b_1 \cdot 1 + b_2 \cdot 0 = b_0 + b_1$
• We already know that $b_0$ is the mean for the low price group. Hence
– $\overline{pi}_{medium} = \overline{pi}_{low} + b_1$
• Solving for $b_1$, we get
– $b_1 = \overline{pi}_{medium} - \overline{pi}_{low}$
• $b_1$ is the difference in purchase intention between the means of the medium and low
price groups.
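
The relationship between the dummy coefficients and the group means can be checked numerically. The following is a minimal sketch, not part of the original slides: the purchase-intention scores are made-up values, and the small `solve` helper is a hypothetical stand-in for whatever regression tool is actually used. It fits the dummy regression by least squares and prints each coefficient next to the corresponding group-mean expression.

```python
from statistics import mean

# Hypothetical purchase-intention scores per price group (illustrative only).
groups = {
    "low": [6.0, 7.0, 8.0],
    "medium": [5.0, 5.0, 6.0, 4.0],
    "high": [2.0, 3.0, 4.0],
}

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Design matrix rows: [intercept, medium dummy, high dummy]; low is the base category.
X, y = [], []
for level, values in groups.items():
    for v in values:
        X.append([1.0, float(level == "medium"), float(level == "high")])
        y.append(v)

# The normal equations (X'X) b = X'y give the least-squares coefficients.
xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(3)]
b0, b1, b2 = solve(xtx, xty)

means = {level: mean(values) for level, values in groups.items()}
print(b0, means["low"])                     # intercept vs. low-group mean
print(b1, means["medium"] - means["low"])   # b1 vs. medium-minus-low difference
print(b2, means["high"] - means["low"])     # b2 vs. high-minus-low difference
```

With any data set, each printed pair should agree up to rounding, which is exactly the statement derived on the three slides above.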

19
(1/4) Differences in Means

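Total sum of squares (SST) measures the squared deviations of the observed values from the grand mean of the outcome variable (across all observations), where $Y_{ij}$ is observation i in group j:

$SST = \sum_{j=1}^{J} \sum_{i=1}^{n_j} (Y_{ij} - \bar{\bar{Y}})^2$

In ANOVA, this total variation is split into explained and unexplained parts.

[Figure: purchase intention (vertical axis) per participant (horizontal axis) for the high, medium, and low price groups, with the grand mean $\bar{\bar{Y}}$ marked.]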
20
(2/4) Differences in Means

Sum of squares within groups (SSW) measures the squared deviations of the observed values from their group mean, where $Y_{ij}$ is observation i in group j:

$SSW = \sum_{j=1}^{J} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2$

In ANOVA, this is the variation that is left unexplained (because subjects within an experimental condition still vary, but not due to the experimental condition).

[Figure: purchase intention (vertical axis) per participant (horizontal axis) for the high, medium, and low price groups, with the grand mean $\bar{\bar{Y}}$ marked.]
21
(3/4) Differences in Means

Sum of squares between groups (SSB) measures the squared deviations of the group mean values from the grand mean:

$SSB = \sum_{j=1}^{J} n_j (\bar{Y}_j - \bar{\bar{Y}})^2$

In ANOVA, this is the variation explained by the model (i.e. the experimental conditions).

[Figure: purchase intention (vertical axis) per participant (horizontal axis) for the high, medium, and low price groups, with the grand mean $\bar{\bar{Y}}$ marked.]
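
The split of total variation into explained and unexplained parts can be illustrated with a short computation. This is a sketch under stated assumptions: the scores are invented for illustration, and the variable names (`sst`, `ssw`, `ssb`) simply mirror the abbreviations used on the slides.

```python
from statistics import mean

# Hypothetical purchase-intention scores per price group (illustrative only).
groups = {
    "high": [2.0, 3.0, 4.0],
    "medium": [5.0, 5.0, 6.0, 4.0],
    "low": [6.0, 7.0, 8.0],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = mean(all_values)

# SST: squared deviations of every observation from the grand mean.
sst = sum((v - grand_mean) ** 2 for v in all_values)
# SSW: squared deviations of every observation from its own group mean.
ssw = sum((v - mean(values)) ** 2 for values in groups.values() for v in values)
# SSB: squared deviations of group means from the grand mean, weighted by group size.
ssb = sum(len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values())

print(sst, ssb, ssw)   # SST should equal SSB + SSW
```

For these made-up numbers SST = 30, SSB = 24, and SSW = 6, so most of the variation is explained by the price condition.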
22
(4/4) Differences in Means

• This is the essence of the ANOVA procedure:
– Compare variation between the sample means to the variation within the individual
treatment levels.
• If the between variation is large relative to the within variation, we conclude that
there are differences across population means and reject the equal-means
hypothesis, H0.
• Hence, the ratio $\frac{\text{explained variation}}{\text{unexplained variation}} = \frac{SS_{Between}}{SS_{Within}}$
– will be 1 if explained and unexplained variation are balanced
– will be larger than 1 if the explained variation SSB is higher than the unexplained variation SSW
– will be smaller than 1 if the unexplained variation SSW is higher than the explained variation SSB

23
Differences in Means

Equal Means Test (Hypothesis Test)


Confidence Intervals

24
(1/12) The Equal-Means Test

• Set up the first question as a hypothesis test.
– The null hypothesis is that there are no differences in population means across
treatment levels:
$H_0: \mu_1 = \mu_2 = \dots = \mu_J$
– The alternative hypothesis is the opposite, meaning that at least one pair of
population means is not equal.
• If we can reject the null hypothesis at some typical level of significance, then we hunt
further to see which means are different from which others.
– To do this, calculate confidence intervals for differences between pairs of means
and see which of these confidence intervals do not include zero.

25
(2/12) The Equal-Means Test

• The simplest design to analyze is the one-factor design.


– There are basically two situations:
• The data could be observational data, in which case the levels of the single
factor might best be considered as “subpopulations” of an overall population.
• The data could be generated from a designed experiment, where a single
population of experimental units is treated in different ways.
– The data analysis is basically the same in either case.
• First, we ask: Are there any significant differences in the mean of the
dependent variable across the different groups?
• If the answer is “yes,” we ask the second question: Which of the groups differ
significantly from which others?
26
(3/12) The Equal-Means Test

In ANOVA, we do not compare differences in means. Rather, we compare variation within
treatment levels to variation between treatment levels. If between variation is large relative to
within variation, we conclude that there are differences across population means.

27
(4/12) The Equal-Means Test

• Assumptions:
– The population variances are all equal to some common variance σ²
– The populations are normally distributed
• To run the test:
– Let $\bar{Y}_j$, $s_j^2$, and $n_j$ be the sample mean, sample variance, and sample size from treatment
level j.
– Also let n and $\bar{\bar{Y}}$ be the combined number of observations and the sample mean of all n
observations. $\bar{\bar{Y}}$ is called the grand mean.
– Then we can calculate a measure of between variation, MSB (“mean square between”),
and
– a measure of within variation, MSW (“mean square within”).

28
(5/12) The Equal-Means Test

• Mean Squares Between (MSB):

$MSB = \frac{SSB}{df_b} = \frac{\sum_{j=1}^{J} n_j (\bar{Y}_j - \bar{\bar{Y}})^2}{J - 1}$

• Notation
– SSB = sum of squares between groups
– $df_b$ = degrees of freedom between groups (J − 1)
– $\bar{Y}_j$ = sample mean at treatment level j
– $\bar{\bar{Y}}$ = sample mean of all n observations (called the grand mean)
– $n_j$ = sample size at treatment level j
– J = total number of treatments / groups
29
(6/12) The Equal-Means Test

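• Mean Squares Within (MSW):

$MSW = \frac{SSW}{df_w} = \frac{\sum_{j=1}^{J} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2}{n - J} = \frac{\sum_{j=1}^{J} (n_j - 1) s_j^2}{n - J}$

• Notation
– SSW = sum of squares within groups
– $df_w$ = degrees of freedom within groups (n − J)
– $Y_{ij}$ = observation i at treatment level j
– $s_j^2$ = sample variance at treatment level j
– $n_j$ = sample size at treatment level j
– n = total sample size for all treatment levels
– J = total number of treatments / groups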

30
(7/12) The Equal-Means Test

Application in Excel
• In Excel, it is actually easier to implement MSB and MSW calculations using:
– $SSB = SST - SSW$
– $MSB = SSB / df_b$
– $MSW = SSW / df_w$

• Use the DEVSQ function
– Get SST by applying DEVSQ to the whole dataset, which must be organized in unstacked form
– Get SSW by applying DEVSQ to each treatment group (i.e. to each column) and summing the results
– Get SSB by subtracting SSW from SST
– Then divide by the degrees of freedom
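
The same route can be followed outside Excel. The sketch below is an illustration only: `devsq` is a hypothetical helper that mimics what Excel's DEVSQ computes (the sum of squared deviations from the mean), and the three columns contain invented numbers rather than the data from one_way.xlsx.

```python
def devsq(values):
    """Sum of squared deviations from the mean, like Excel's DEVSQ."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

# Hypothetical unstacked data: one list per treatment level (one column each).
columns = [
    [2.0, 3.0, 4.0],
    [5.0, 5.0, 6.0, 4.0],
    [6.0, 7.0, 8.0],
]

n = sum(len(col) for col in columns)   # total number of observations
J = len(columns)                       # number of treatment levels

sst = devsq([v for col in columns for v in col])  # DEVSQ over the whole dataset
ssw = sum(devsq(col) for col in columns)          # DEVSQ per column, then summed
ssb = sst - ssw                                   # SSB = SST - SSW

msb = ssb / (J - 1)   # df between = J - 1
msw = ssw / (n - J)   # df within = n - J
f_ratio = msb / msw
print(msb, msw, f_ratio)
```

Computing SSB by subtraction avoids a separate weighted sum over group means, which is why it is convenient in a spreadsheet.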

31
Exercise using one_way.xlsx

• Objective:
– To use one-way ANOVA to see whether shelf height makes any difference in mean sales of
Brand X, and if so, to discover which shelf heights outperform the others.
• Solution:
– For this experiment, Midway supermarket chain selects 75 of its stores to be as alike as
possible. The stores are divided into three randomly selected groups, and each group of 25
stores places brand X of cereal on a specific shelf for a month.
– The number of boxes of brand X sold is recorded at each of the stores for the last two
weeks of the experiment.
– Does shelf location appear to make a difference in sales? Calculate within and between
variation in average sales for the 3 types of shelf positions.

32
(8/12) The Equal-Means Test

F-test to assess the equal means assumption (H0)

• Remember:
– H0: µ1 = µ2 = µ3 = … = µJ
– Ha: at least one mean is different
• Use an F-test to examine whether at least one pair of means differs significantly.
– The two-sample procedure for a difference between population means depends on
whether the population variances are equal, so it is natural to test first for equal variances.
– This test is referred to as the F test for equality of two variances; its test statistic is
the ratio of the sample variances.
– In ANOVA, the same kind of ratio is formed from the two variance estimates MSB and MSW:
• $F_{calculated} = \frac{MSB}{MSW}$
• If the ratio is close to 1, the two variance estimates are about equal, which is consistent with equal means.

33
(9/12) The Equal-Means Test

F-test to assess the equal means assumption (H0) (continued)

• F-test
– In the test for two variances, assuming that the population variances are equal, the test
statistic has an F distribution with n1 – 1 and n2 – 1 degrees of freedom.
– In ANOVA, under the null hypothesis of equal population means, the test statistic MSB/MSW
has an F distribution with dfB = J – 1 and dfW = n – J degrees of freedom.
– If the null hypothesis is not true (that is, if the alternative hypothesis is true), we expect
MSB to be large relative to MSW.
– The p-value for the test is the probability to the right of the F-ratio in the
F distribution with dfB and dfW degrees of freedom.

34
(10/12) The Equal-Means Test

• Calculate the F-value for the ANOVA groups (F = MSB/MSW)
• Compare it to the critical F-value from the F-distribution.
• If the variance ratio F is 1, there is no evidence of a difference in means. Hence, we test
whether F is significantly greater than 1.
• If Fcalc ≥ Fcrit then reject H0
• Set the confidence level, e.g. 95%, hence α = 0.05
• If p ≤ α then reject H0

[Figure: F density function, with x on the horizontal axis.]

35
(11/12) The Equal-Means Test

Excel’s F-distribution functions for obtaining p-values and critical F-values

• F.DIST(x, degrees of freedom 1, degrees of freedom 2, TRUE)
– Finds the probability of "less than x" (the last argument requests the cumulative distribution)
– Where, in ANOVA
• Degrees of freedom 1 = df between
• Degrees of freedom 2 = df within
• F.DIST.RT(x, degrees of freedom 1, degrees of freedom 2)
– Finds the probability of "greater than x", i.e. the p-value of the F-ratio
• F.INV(p, degrees of freedom 1, degrees of freedom 2)
– Finds the inverse of the left-tailed F-probability
• F.INV.RT(p, degrees of freedom 1, degrees of freedom 2)
– Finds the inverse of the right-tailed F-probability, i.e. the critical F-value for α = p
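
For readers without Excel, the two right-tail functions can be approximated with the Python standard library. The following is a rough sketch, not a reference implementation: the names `f_dist_rt` and `f_inv_rt` are hypothetical and only chosen to echo the Excel functions, the tail probability is computed from the regularized incomplete beta function via a textbook continued fraction, and the critical value is found by simple bisection. In practice a statistics library would normally be used instead.

```python
from math import exp, lgamma, log

def _betacf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:   # last correction factor is ~1: converged
            break
    return h

def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_dist_rt(x, df1, df2):
    """Right-tail probability P(F > x), the quantity F.DIST.RT reports."""
    return _reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * x))

def f_inv_rt(p, df1, df2):
    """Critical value whose right-tail probability is p (like F.INV.RT), by bisection."""
    lo, hi = 0.0, 1.0
    while f_dist_rt(hi, df1, df2) > p:   # expand until the tail is below p
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f_dist_rt(mid, df1, df2) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical ANOVA result: F = 14.0 with df between = 2 and df within = 7.
p_value = f_dist_rt(14.0, 2, 7)
f_crit = f_inv_rt(0.05, 2, 7)
print(p_value, f_crit, 14.0 >= f_crit)
```

The decision rule from the slides then reads: reject H0 if the printed p-value is at most α, or equivalently if the calculated F is at least the critical value.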
36
(12/12) The Equal-Means Test

• The elements of this test are usually presented in an ANOVA table.


– The bottom line in this table is the p-value for the F-ratio.
• If the p-value is sufficiently small, we can conclude that the population means
are not all equal.
• Otherwise, we cannot reject the equal-means hypothesis.

37
Exercise using one_way.xlsx

• Objective:
– To use one-way ANOVA to see whether shelf height makes any difference in mean sales of
brand X, and if so, to discover which shelf heights outperform the others.
• Solution:
– Calculate F and compare it to the critical value from an F-distribution.
– Calculate the associated p-value.
– How do you interpret the results?
– Can you reject H0?

38
(1/8) Confidence Intervals

• The simplest design to analyze is the one-factor design.


– There are basically two situations:
• The data could be observational data, in which case the levels of the single
factor might best be considered as “subpopulations” of an overall population.
• The data could be generated from a designed experiment, where a single
population of experimental units is treated in different ways.
– The data analysis is basically the same in either case.
• First, we ask: Are there any significant differences in the mean of the
dependent variable across the different groups?
• If the answer is “yes,” we ask the second question: Which of the groups differs
significantly from which others?
39
(2/8) Confidence Intervals

• If we can reject the equal-means hypothesis, then it is customary to form confidence
intervals for the differences between pairs of population means.
• The confidence interval for any difference μi − μj is of the form shown in the
expression below:
$\bar{Y}_i - \bar{Y}_j \pm \text{multiplier} \times \sqrt{MSW\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$
– There are several possibilities for the appropriate multiplier in this expression.
– Regardless of the multiplier, we are always looking for confidence intervals that
do not include 0.
– If the confidence interval for μi − μj is all positive, then we can conclude with high
confidence that these two means are not equal and that μi is larger than μj; if it is
all negative, we conclude that μi is smaller than μj.

40
(4/8) Confidence Intervals

Without Correction
• Set the confidence level, e.g. 95%
• Calculate α (1 – confidence level = 0.05)
• Identify the correct number of degrees of freedom
– Total sample size of the two groups to be compared minus 2, i.e. minus the number of groups
to be compared
• Use a t-distribution to obtain the t-value, i.e. the multiplier, at the 0.05 significance level
and the appropriate degrees of freedom
• Multiply the resulting value by the standard error for the difference in means, i.e.
$\sqrt{MSW\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$
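
The steps for an uncorrected interval can be sketched as a small function. Everything numeric here is an assumption made for illustration: the two group means, the MSW value, and the multiplier of 2.0 (roughly the t-value for a 95% interval with many degrees of freedom) are not taken from the exercise file. The standard error is the square root of MSW(1/ni + 1/nj).

```python
from math import sqrt

def mean_difference_ci(mean_i, mean_j, msw, n_i, n_j, multiplier):
    """Confidence interval for mu_i - mu_j: difference +/- multiplier * standard error."""
    diff = mean_i - mean_j
    std_error = sqrt(msw * (1.0 / n_i + 1.0 / n_j))
    return diff - multiplier * std_error, diff + multiplier * std_error

# Hypothetical inputs: two group means, MSW from an ANOVA table, 25 units per group,
# and a t-multiplier of about 2.0 for a 95% interval without correction.
lower, upper = mean_difference_ci(mean_i=430.0, mean_j=380.0, msw=5000.0,
                                  n_i=25, n_j=25, multiplier=2.0)
significant = lower > 0 or upper < 0   # True when the interval excludes zero
print(lower, upper, significant)
```

Swapping in a larger multiplier widens the interval without changing anything else in the calculation.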

41
Exercise using one_way.xlsx

• Objective:
– To use one-way ANOVA to see whether shelf height makes any difference in mean sales of
brand X, and if so, to discover which shelf heights outperform the others.
• Solution:
– Calculate the confidence intervals for the three pairs of means.
– Which ones are significantly different from zero?

42
One-way ANOVA

ANOVA as Regression
Assumptions in One-way ANOVA

43
(1/3) Using Regression to Perform ANOVA

• Most of the same ANOVA results obtained by traditional ANOVA can be obtained by
multiple regression analysis.
– Remember the use of categorical predictors in regression.
– The advantage of using regression is that many people understand regression
better than the formulas used in traditional ANOVA.
– The disadvantage is that some of the traditional ANOVA output can be obtained
with regression only with some difficulty.

44
(2/3) Using Regression to Perform ANOVA

• To perform ANOVA with regression,


– run a regression with the same dependent variable as in ANOVA and
– use dummy variables for the treatment levels as the only explanatory variables.
• In the resulting regression output,
– the ANOVA table will be exactly the same as the ANOVA table we obtain from traditional
ANOVA,
– the coefficients of the dummy variables will be estimates of the mean differences between
the corresponding treatment levels and the reference level.

45
(3/3) Using Regression to Perform ANOVA

• The regression output also provides an R² value, the percentage of the variation of the
dependent variable explained by the various treatment levels of the single factor.
• This R² value is not part of the traditional ANOVA output.
• However, we do not automatically obtain confidence intervals for some mean
differences, and the confidence intervals we do obtain are not of the “Tukey” type we
obtain with ANOVA.

46
Exercise using one_way.xlsx

• Objective:
– To use regression instead of ANOVA and compare the results.
• Solution:
– The treatment levels, i.e. shelf heights, can be treated as a categorical variable.
– To run this in a regression format, you first need to create a dummy-coded
version of the dataset (for the Data Analysis tab).
– Note that the CIs are different from the ANOVA output. The confidence interval indicates that
the mean for next-to-highest is significantly larger than the mean for next-to-lowest.
– This is basically because the Tukey intervals quoted in the ANOVA output are more
“conservative” and typically lead to fewer significant differences.
– How much variance in sales is explained by shelf height?

47
Exercise using one_way.xlsx

• Solution:
– To be able to use regression as ANOVA in Excel, the data need to be organized in a so-called
“long format”, i.e. they need to be recoded from the unstacked sheet into a dummy-coded sheet:

Unstacked form (one column per shelf height):

      A        B                C        D                 E
 1    Lowest   Next-to-lowest   Middle   Next-to-highest   Highest
 2    340      347              444      456               358
 3    376      428              281      471               427
 4    378      219              378      484               325
 5    371      431              425      448               428
 6    395      377              485      330               522
 7    332      238              353      405               455
 8    307      368              332      375               315
 9    333      364              453      546               466
10    239      529              466      489               341
11    301      399              377      502               204
12    298      505              471      373               317
13    358      412              178      486               342
14    373      430              301      513               326
15    387      328              504      346               371
16    351      431              388      319               331
17    235      541              423      475               387
18    307      459              426      242               416
19    278      318              418      424               422
20    455      302              442      425               479
21    346      394              327      274               351
22    355      225              354      358               330
23    202      374              381      411               449
24    389      345              284      564               461
25    417      329              349      395               375
26    250      374              346      546               399

Long format with dummy coding (first rows shown):

      A        B                C        D                 E         F
 1    lowest   next-to-lowest   middle   next-to-highest   highest   sales
 2    1        0                0        0                 0         340
 3    1        0                0        0                 0         376
 4    1        0                0        0                 0         378
 5    1        0                0        0                 0         371
 6    1        0                0        0                 0         395
 7    1        0                0        0                 0         332
 8    1        0                0        0                 0         307
 9    1        0                0        0                 0         333
10    1        0                0        0                 0         239
11    1        0                0        0                 0         301
12    1        0                0        0                 0         298
13    1        0                0        0                 0         358
14    1        0                0        0                 0         373
15    1        0                0        0                 0         387
16    1        0                0        0                 0         351
17    1        0                0        0                 0         235
18    1        0                0        0                 0         307
19    1        0                0        0                 0         278
20    1        0                0        0                 0         455
21    1        0                0        0                 0         346
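
The recoding from the unstacked sheet to the dummy-coded long format can also be scripted. This is a minimal sketch that uses only the first two data rows of the sheet; the variable names are hypothetical, and the column order of the output simply mirrors the layout shown above.

```python
# First two data rows of the unstacked sheet (one column per shelf height).
levels = ["lowest", "next-to-lowest", "middle", "next-to-highest", "highest"]
wide_rows = [
    [340, 347, 444, 456, 358],
    [376, 428, 281, 471, 427],
]

# Long format: one row per store, one 0/1 dummy per shelf height, then sales.
long_rows = []
for col, level in enumerate(levels):
    for row in wide_rows:
        dummies = [1 if level == other else 0 for other in levels]
        long_rows.append(dummies + [row[col]])

print(levels + ["sales"])
for r in long_rows:
    print(r)
```

When the regression is run, one dummy column has to be left out as the reference level, so that the remaining coefficients are differences from that level.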

48
(1/1) Assumptions in ANOVA

• Inferences based on the ANOVA procedure
rely on two assumptions:
– equal variances across treatment levels and
– normally distributed data.
• Often a look at side-by-side box plots can
indicate whether there are serious violations
of these assumptions.
• If the assumptions are seriously violated, you
should not blindly report the ANOVA results.
• In some cases, a transformation of the data
will help.

49
Two-way ANOVA

The Multiple Comparison Problem


Two-way ANOVA
Confidence Intervals for Contrasts
Assumptions in Two-Way ANOVA

50
(1/7) The Multiple Comparison Problem

• In many statistical analyses, including ANOVA studies, we want to make statements
about multiple unknown parameters.
• Any time we make such a statement, there is a chance that we will be wrong; that is,
there is a chance that the true population value will not be inside the confidence
interval.
– For example, if we create a 95% confidence interval, then the error probability is 0.05.
– However, in statistical terms, if we run each confidence interval at the 95% level, the
overall confidence level (of having all statements correct) is much less than 95%.
– This is called the multiple comparison problem.
– It says that if we make a lot of statements, each at a given confidence level such as 95%,
then the chance of making at least one wrong statement is much greater than 5%.

51
(5/8) Confidence Intervals

The multiple comparison problem

• As we assess several differences in means simultaneously, we have several sources of error
when using a significance level of 0.05
– We are 95% certain that the true mean for group 1 is within its CI
– We are 95% certain that the true mean for group 2 is within its CI
– We are 95% certain that the true mean for group 3 is within its CI
– 0.95 * 0.95 * 0.95 = 0.857 (treating the three statements as independent)
→ The overall level of confidence is decreased
• The question is how to get the overall confidence level equal to the desired value,
such as 95%
• The answer is that we need to correct the individual confidence intervals.
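
How quickly the overall confidence level erodes can be seen with a one-line calculation. The sketch below assumes, as the product on the slide does, that the individual statements are independent; the function name is hypothetical.

```python
def overall_confidence(individual_level, n_statements):
    """Chance that all statements hold at once, assuming they are independent."""
    return individual_level ** n_statements

for k in (1, 3, 10):
    print(k, round(overall_confidence(0.95, k), 3))
```

With three statements the overall level is about 0.857, and with ten it drops to roughly 0.6, which is why a correction is needed when many comparisons are reported.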
52
(3/7) The Multiple Comparison Problem

• The three most popular correction methods are:


– Tukey method
– Bonferroni method
– Scheffé method
• All of the correction methods use a multiplier that is larger than the multiplier used for
the “no-correction” method.
• By using a larger multiplier, we get a wider confidence interval, which decreases the
chance that the confidence interval will fail to include the true mean difference.

53
(6/8) Confidence Intervals

With correction
• Tukey
– Uses a q-value from a studentized range (q) table (instead of a t-value from a t-distribution)
– The q-value cannot be calculated in Excel (but it can in StatTools) and has to be looked up in a table instead.
– The general form
$\bar{Y}_i - \bar{Y}_j \pm \text{multiplier} \times \sqrt{MSW\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$
becomes
$\bar{Y}_i - \bar{Y}_j \pm \frac{q\text{-value}}{\sqrt{2}} \times \sqrt{MSW\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$
– In our shelf example, the multiplier to use would be slightly higher, about 2.4 instead of
2.0
– Results in wider CIs (as in the other correction methods).

54
(7/8) Confidence Intervals

With correction
• Bonferroni
– Finds a value for alpha that corrects for the change in confidence level
– To compute the significance level for the Bonferroni correction, simply divide the
(two-sided) significance level by the number of groups J
– $\alpha_{corrected} = \frac{\alpha/2}{J} = \frac{\alpha}{2} \cdot \frac{1}{J} = \frac{\alpha}{2J}$
– In our example: 0.05 / (2 · 3) ≈ 0.0083
– The new confidence level is $(1 - \alpha_{corrected})^J$
– In our shelf example: (1 − 0.0083)³ ≈ 0.9752
– Using this information, we can calculate a corrected t-multiplier to use for CI creation
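
The Bonferroni arithmetic from this slide is easy to reproduce. The sketch below follows the slide's own convention (dividing α/2 by the number of groups J) and uses the shelf example's values α = 0.05 and J = 3; the variable names are hypothetical.

```python
alpha = 0.05   # overall significance level
J = 3          # number of groups, as in the shelf example

alpha_corrected = alpha / (2 * J)             # alpha / 2 split across J statements
overall_level = (1 - alpha_corrected) ** J    # resulting overall confidence level

print(round(alpha_corrected, 4), round(overall_level, 4))
```

The corrected tail probability is then the one to plug into the t-distribution when looking up the multiplier.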

55
(8/8) Confidence Intervals

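With correction
• Scheffé
– Is the most conservative correction method among the three and results in the widest CIs
– Requires the calculation of an F-value for the multiplier, such that the general form
$\bar{Y}_i - \bar{Y}_j \pm \text{multiplier} \times \sqrt{MSW\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$
becomes
$\bar{Y}_i - \bar{Y}_j \pm \sqrt{(J - 1) \cdot F_{\alpha, J-1, n-J}} \times \sqrt{MSW\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$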

56
(5/7) The Multiple Comparison Problem

• The reason there are so many methods has to do with the purpose of the study.
– When a researcher who initiates a study has a particular interest in a few specific
differences, the differences of interest are called planned comparisons.
– Planned contrasts are comparisons that were specified before the study was
conducted.
– Unplanned contrasts are comparisons chosen after the data were collected, usually
multiple pairwise comparisons.
– If there are only a few differences of interest, the no-correction method is usually
acceptable.
– If there are more than a few planned comparisons, then it is better to report
corrected CIs.
57
(6/7) The Multiple Comparison Problem

• Study is interested in a few, specific differences between means
→ Use planned comparisons, i.e. no correction
• Study is interested in examining all differences between means
→ Use unplanned comparisons, i.e. Tukey, Bonferroni

58
(1/5) Two-Way ANOVA

• In two-way ANOVA, we allow two factors, each at several levels.


• Some of the ideas from one-way ANOVA carry over to two-way ANOVA.
• However, there are differences in the data setup, the analysis itself, and perhaps most
important, the types of questions we ask.
• Independent variables or factors:
– Price, with treatment levels Low, Medium, High
– Quality, with treatment levels Low, High
• Dependent variable: Purchase intention

59
(2/5) Two-Way ANOVA – Key Terms

• Full Factorial Design: An experimental design in which observations are made at each
combination of factor levels.
• Incomplete Design: An experimental design in which observations are made only at a subset
of the combinations of factor levels.

60
(3/5) Two-Way ANOVA – Key Terms

• Main effects: Indicate whether there are different means for treatment levels of one
factor when averaged over the levels of the other factors.
• Interactions: Indicate patterns of differences in means that could not be guessed from
the main effects alone. They exist when the effect of one factor on the dependent variable
depends on the level of the other factor.

61
(1/2) Confidence Intervals for Contrasts

• If you find that main effects and/or interactions are significant, then you will probably
want to check which factor levels, or factor level combinations, produce significantly
larger means than others.
• A contrast is a weighted combination of means where the weights sum to 0; used to
contrast one combination of means with another
• An example of a simple contrast is the difference between two means.
• Examples:
– μ3-μ1 You would study this contrast if you were interested in whether μ3 is different from μ1
– (μ1+μ2)∕2-(μ3+μ4+μ5)∕3 You would study this contrast if you were interested in whether the
average of μ1 and μ2 is different from the average of μ3, μ4, and μ5
• Once StatTools has been used to run a two-way ANOVA, you can then form confidence
intervals for any contrasts of interest.
62
(2/2) Confidence Intervals for Contrasts

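• Confidence Intervals for Contrasts

$\text{point estimate of contrast} \pm \text{multiplier} \times \sqrt{MSW \sum_j \frac{c_j^2}{n_j}}$

• Notation
– MSW = mean square error from ANOVA output
– nj = sample size of the group belonging to mean j
– cj = coefficient of the corresponding mean in the contrast
– The point estimate of a contrast is the same weighted combination applied to the sample
means, $\sum_j c_j \bar{Y}_j$ (for a simple contrast, the difference between the two means of interest)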

63
Exercise using two_way

• Assume that there are five major brands of golf balls, labeled A through E.
• A consumer testing service runs an experiment where 60 balls of each brand are
driven under three temperature conditions
– cool (about 40 degrees),
– mild (about 65 degrees), and
– warm (about 90 degrees)
• to see whether some brands differ significantly, on average, from other brands and
what effect temperature has on mean differences between brands.

64
Exercise using two_way

• This is a controlled experiment where the experimental units are the individual golf
balls and the dependent variable is the length (in yards) of each drive.
• There are two factors: brand and temperature. The brand factor has five treatment
levels, A through E, and temperature has three levels: cool, mild, and warm.
• This is called a full factorial two-way design because the golf balls are tested at each
of the 15 possible treatment level combinations.

65
Exercise using two_way, i.e the PDF files

• Solution:
– Once the details of the experiment have been decided
and the golf balls have been hit, there will be 300
observations (yardages) at various conditions.
– The usual way to enter the data in Excel is in the
stacked or long form.
– There must be two categorical variables that represent
the levels of the two factors (Brand and Temperature)
and a measurement variable that represents the
dependent variable (Yards).
– Although many rows are hidden in the figure, there
are actually 300 rows of data, 20 for each of the 15
combinations of Brand and Temperature.

66
Exercise using two_way, i.e the PDF files

• Questions to ask:
– Looking at the column “grand total”, do any brands average significantly more yards than others?
– Looking at the row “grand total”, do average yardages differ significantly across temperatures?
– Looking at the columns, do differences among averages of brands depend on temperature?
– Looking at the rows, do differences among averages of temperatures depend on brand?

Average of Yards by Brand (rows) and Temperature (columns)

Brand          Cool    Mild    Warm    Grand Total
A              218.8   236.5   258.4   237.9
B              224.1   245.1   258.3   242.5
C              228.0   242.7   263.0   244.6
D              215.0   237.6   256.1   236.2
E              224.8   255.7   270.9   250.5
Grand Total    222.1   243.5   261.4   242.3
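
Because the design is balanced (20 balls per combination), the "Grand Total" row and column are simple averages of the cell means. The sketch below, with hypothetical function names, recomputes them from the table and shows how the gap between two brands changes with temperature, which is an informal look at interaction. Since the cell means are themselves rounded, a recomputed margin can differ from the table in the last digit.

```python
# Cell means (average yards) copied from the table of sample means.
CELL_MEANS = {
    "A": {"Cool": 218.8, "Mild": 236.5, "Warm": 258.4},
    "B": {"Cool": 224.1, "Mild": 245.1, "Warm": 258.3},
    "C": {"Cool": 228.0, "Mild": 242.7, "Warm": 263.0},
    "D": {"Cool": 215.0, "Mild": 237.6, "Warm": 256.1},
    "E": {"Cool": 224.8, "Mild": 255.7, "Warm": 270.9},
}
TEMPERATURES = ["Cool", "Mild", "Warm"]


def brand_means(cells):
    """Average over temperatures for each brand (the 'Grand Total' column)."""
    return {b: sum(row.values()) / len(row) for b, row in cells.items()}


def temperature_means(cells):
    """Average over brands for each temperature (the 'Grand Total' row)."""
    return {t: sum(cells[b][t] for b in cells) / len(cells) for t in TEMPERATURES}


def brand_gap(cells, first, second):
    """Difference in mean yards between two brands at each temperature."""
    return {t: cells[first][t] - cells[second][t] for t in TEMPERATURES}


for temp, gap in brand_gap(CELL_MEANS, "E", "A").items():
    print(f"E - A at {temp}: {gap:.1f} yards")
```

If the printed gaps were roughly equal across temperatures, there would be little sign of interaction; unequal gaps suggest that brand differences depend on temperature, which the ANOVA table then tests formally.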
Exercise using two_way, i.e., the PDF files

Exercise using two_way, i.e., the PDF files

• Questions to ask:
• The next question is whether the main effects and interactions in a table of sample means are
statistically significant.
• As in one-way ANOVA, two-way ANOVA collects the information about the different sources
of variation in an ANOVA table.
• However, instead of just two sources of variation as in one-way ANOVA, there are now four sources of variation:
1. one for the main effect of the first factor (Brand),
2. one for the main effect of the second factor (Temperature),
3. one for interactions, and
4. one for variation within treatment level combinations (the remaining error variation).
• Test whether main effects or interactions are statistically significant by examining p-values.
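
As a small worked example, the degrees of freedom attached to each source of variation in a balanced two-way design can be computed as follows. The function name and dictionary keys are illustrative, and the formulas are the standard ones for a full factorial design with equal replication, not output copied from the slides.

```python
def two_way_degrees_of_freedom(levels_a, levels_b, replications):
    """Degrees of freedom for a balanced full factorial two-way ANOVA."""
    n_total = levels_a * levels_b * replications
    return {
        "factor_a": levels_a - 1,
        "factor_b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": levels_a * levels_b * (replications - 1),
        "total": n_total - 1,
    }


# Golf ball experiment: 5 brands, 3 temperatures, 20 balls per combination.
df = two_way_degrees_of_freedom(5, 3, 20)
for source, value in df.items():
    print(f"{source:12s} {value}")
```

The four component values add up to the total, mirroring the way the total sum of squares is split across the rows of the ANOVA table.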

Exercise using two_way, i.e., the PDF files

Exercise using two_way, i.e., the PDF files

• Objective:
– To form and test contrasts for the golf ball data, and to interpret the results.
• Solution:
– One golf ball retail shop would like to test the claims that
• (1) brand C beats the average of the other four brands in cool weather and
• (2) brand E beats the average of the other four brands when it is not cool.
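– Let \mu_{C,W} be the mean yardage for brand C balls hit in warm weather, and define similar means for
the other brands and temperatures (first subscript: brand; second subscript: C = cool, M = mild, W = warm).
– Then the first claim concerns the contrast
\[ \mu_{C,C} - \frac{\mu_{A,C} + \mu_{B,C} + \mu_{D,C} + \mu_{E,C}}{4} \]
– and the second claim concerns the contrast
\[ \frac{\mu_{E,M} + \mu_{E,W}}{2} - \frac{\mu_{A,M} + \mu_{A,W} + \mu_{B,M} + \mu_{B,W} + \mu_{C,M} + \mu_{C,W} + \mu_{D,M} + \mu_{D,W}}{8} \]
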
Exercise using two_way, i.e., the PDF files

• Solution:
1. Record the coefficients of the means in each contrast.
2. The point estimate of a contrast is the SUMPRODUCT of the sample means and these coefficients.
3. Use the t-value to determine the CIs. The degrees of freedom are those of the error variation.
4. Apply the formula.

Coefficients for contrast (1):
Brand   Cool     Mild     Warm
A       -0.25    0        0
B       -0.25    0        0
C        1       0        0
D       -0.25    0        0
E       -0.25    0        0

Coefficients for contrast (2):
Brand   Cool     Mild     Warm
A        0       -0.125   -0.125
B        0       -0.125   -0.125
C        0       -0.125   -0.125
D        0       -0.125   -0.125
E        0        0.5      0.5
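
The steps above can be sketched in Python using the sample means from the earlier table. The point estimate is the sum of coefficient times cell mean. For the interval, the sketch uses the usual standard error of a contrast in a balanced design, sqrt(MSE × Σ c² / n) with n = 20 balls per combination. The error mean square (`mse`) and the t multiplier are inputs the reader must take from the ANOVA output; they are not given on the slides, and all function names here are hypothetical.

```python
import math

CELL_MEANS = {
    "A": {"Cool": 218.8, "Mild": 236.5, "Warm": 258.4},
    "B": {"Cool": 224.1, "Mild": 245.1, "Warm": 258.3},
    "C": {"Cool": 228.0, "Mild": 242.7, "Warm": 263.0},
    "D": {"Cool": 215.0, "Mild": 237.6, "Warm": 256.1},
    "E": {"Cool": 224.8, "Mild": 255.7, "Warm": 270.9},
}
N_PER_CELL = 20

# Contrast (1): brand C versus the other four brands in cool weather.
CONTRAST_1 = {(b, "Cool"): (1.0 if b == "C" else -0.25) for b in "ABCDE"}
# Contrast (2): brand E versus the other four brands in mild and warm weather.
CONTRAST_2 = {(b, t): (0.5 if b == "E" else -0.125)
              for b in "ABCDE" for t in ("Mild", "Warm")}


def point_estimate(coefficients):
    """SUMPRODUCT of the contrast coefficients and the sample cell means."""
    return sum(c * CELL_MEANS[b][t] for (b, t), c in coefficients.items())


def standard_error(coefficients, mse, n_per_cell=N_PER_CELL):
    """Standard error of a contrast; mse is the error mean square from the ANOVA table."""
    return math.sqrt(mse * sum(c * c / n_per_cell for c in coefficients.values()))


def confidence_interval(coefficients, mse, t_multiplier):
    """Estimate plus/minus t_multiplier standard errors (t based on the error df)."""
    estimate = point_estimate(coefficients)
    half_width = t_multiplier * standard_error(coefficients, mse)
    return estimate - half_width, estimate + half_width


print(f"Contrast 1 point estimate: {point_estimate(CONTRAST_1):.3f} yards")
print(f"Contrast 2 point estimate: {point_estimate(CONTRAST_2):.3f} yards")
```

Both sets of coefficients sum to zero, which is what makes them contrasts; the lower limit of each interval is the "beats the competition by at least" figure reported for the claim.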

Exercise using two_way

• Solution:
– Brand C beats the average of the
competition by at least 1.99 yards in
cool weather.
– Brand E beats the average of the
competition by at least 9.86 yards in
weather that is not cool.

(1/3) Assumptions of Two-Way ANOVA

• The assumptions for the two-way ANOVA procedure are basically the same as for one-
way ANOVA.
– If we focus on any particular combination of factor levels, we assume that the:
1. Distribution of values for this combination is normal, and
2. Variance of values at this combination is the same as at any other
combination.
• It is always wise to check for at least gross violations of these assumptions, especially
the equal-variance assumption.
• The StatTools output provides an informal check by providing a table of standard
deviations for the factor level combinations.
• If the variances are very different, a log transformation of the dependent variable is often useful.
(2/3) Assumptions of Two-Way ANOVA

Experimental Design

Fundamentals
Randomization and Blocking
Incomplete Designs

(1/2) Experimental Design

• We can break up the topic of experimental design into two parts:
– The actual design of the experiment
– The analysis of the resulting data

(2/2) Experimental Design

• Experimental design has to do with
1. the selection of factors,
2. the choice of the treatment levels,
3. the way experimental units are assigned to the treatment level combinations, and
4. the conditions under which the experiment is run.
• These decisions must be made before the experiment is performed, and they should
be made very carefully.
• Experiments are typically costly and time-consuming, so the experiment should be
designed (and performed) in a way that will provide the most useful information
possible.

(1/2) Randomization and Blocking

• The purpose of most experiments is to see which of several factors have an effect on a
dependent variable.
• The factors in question are chosen as those that are controllable and most likely to
have some effect.
• Often, however, there are “nuisance” factors that cannot be controlled, at least not
directly.
– One important method for dealing with such nuisance factors is randomization—
the process of randomly assigning experimental units so that nuisance factors are
spread uniformly across treatment levels.

Exercise using randomization.xlsx

• Objective:
– To use randomization of paper types to see whether differences in sharpness are really due to
different brands of printers.
• Solution:
– A computer magazine would like to test sharpness of printed images across three popular brands of
inkjet printers.
– It purchases one printer of each brand, prints several pages on each printer, and measures the
sharpness of image on a 0-100 scale for each page.
– Which printer appears to be best?
– Why might these results be misleading?

Exercise using randomization.xlsx

Exercise using randomization.xlsx

• Objective:
– To use randomization of paper types to see whether differences in sharpness are really due to
different brands of printers.
• Solution:
– The data and analysis indicate that printer A is best on average and printer C is worst.
– Suppose, however, that there is another factor, type of paper, that is not the primary focus of the
study but might affect the sharpness of the image.
– Suppose further that all type 1 paper is used in printer A, all type 2 paper is used in printer B, and
all type 3 paper is used in printer C.
– It is possible that type 1 paper tends to produce the sharpest image, regardless of the printer used.
– The solution is to randomize over paper type: For each sheet to be printed by any printer, randomly
select a paper type.
– This will tend to even out the paper types across the printers, and we can be more
confident that any differences are due to the printers themselves.
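
Randomizing over paper type can be sketched as follows. The printer labels and the three paper types come from the example; the number of sheets per printer is not stated on the slides, so it is a parameter here, and the function name and the optional seed are assumptions made for illustration.

```python
import random

PRINTERS = ["A", "B", "C"]
PAPER_TYPES = [1, 2, 3]


def randomize_paper(sheets_per_printer, seed=None):
    """For every sheet printed on every printer, pick a paper type at random."""
    rng = random.Random(seed)  # a seed makes the plan reproducible
    return {
        printer: [rng.choice(PAPER_TYPES) for _ in range(sheets_per_printer)]
        for printer in PRINTERS
    }


plan = randomize_paper(sheets_per_printer=10, seed=2022)
for printer, papers in plan.items():
    print(printer, papers)
```

With independent random draws, no paper type is systematically tied to one printer, so a paper effect is spread across printers instead of being mistaken for a printer effect.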
Exercise using randomization.xlsx

(2/2) Randomization and Blocking

• Another method for dealing with nuisance factors is blocking.
• Blocking: a technique that groups experimental units into blocks of similar units in order
to decrease error variation.
• There are many forms of blocking designs.
• The simplest is the randomized block design, in which the experimental units are
divided into several “similar” blocks.
– Then each experimental unit within a given block is randomly assigned a different
treatment level.
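
A randomized block assignment can be sketched like this. The block and unit labels in the usage example are made up for illustration, and the function name is hypothetical; the only property taken from the slides is that every treatment level appears exactly once in each block, in random order.

```python
import random


def randomized_block_design(blocks, treatments, seed=None):
    """Assign each treatment exactly once within every block, in random order.

    blocks maps a block label to the list of experimental units in that block;
    each block must contain as many units as there are treatments.
    """
    rng = random.Random(seed)
    assignment = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError(f"block {block!r} needs {len(treatments)} units")
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # fresh random order for every block
        assignment[block] = dict(zip(units, shuffled))
    return assignment


# Hypothetical example: two blocks of three similar units, three treatment levels.
example_blocks = {"block 1": ["u1", "u2", "u3"], "block 2": ["u4", "u5", "u6"]}
design = randomized_block_design(example_blocks, ["T1", "T2", "T3"], seed=7)
for block, units in design.items():
    print(block, units)
```

Because every treatment is observed in every block, differences between blocks cancel out of the treatment comparison, which is what reduces the error variation.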

Exercise using blocking.xlsx

• Objective:
– To use a blocking design with store as the blocking variable to see whether type of
dispenser makes a difference in sales of liquid soap.
• Solution:
– SoftSoap Company is introducing a new liquid hand soap into the market, and four types
of dispensers are being considered.
– It chooses eight supermarkets that have carried its products, and it asks each supermarket
to stock all four versions of its new product for a 2-week test period.
– It records the number of items purchased at each store during this period.

Exercise using blocking.xlsx

• Solution:
– In this experiment, there is a single factor, dispenser type, varied at four levels, and there
are eight observations at each level.
– However, it is very possible that the dependent variable, number of sales, is correlated
with store.
– Therefore, we treat each store as a block, so that the experimental design appears as
shown on the next slide.
– Each treatment level (dispenser type) is assigned exactly once to each block (store).
– To obtain this output, use the StatTools Two-Way ANOVA procedure, with Store and
Dispenser as the two categorical variables.

Exercise using blocking.xlsx

Exercise using blocking.xlsx

• Solution:
– Because there is only one observation per
store/dispenser combination, the ANOVA table
has no Interaction row.
– However, it still provides interaction charts to
check for the no-interaction assumption.
– The F-value and corresponding p-value in row
47 of the ANOVA table are for the main effect
of dispenser type.
– Because the p-value is essentially 0, there are
significant differences across dispenser types.
– If SoftSoap had to market only one dispenser
type, it would almost certainly select type 3.

(1/4) Incomplete Designs

• In a full factorial design, one or more observations are obtained for each combination
of treatment levels.
– This is the preferred way to run an experiment from a statistical point of view, but it
can be very expensive, or even infeasible, if there are more than a few factors.
• As a result, statisticians have devised incomplete, or fractional factorial, designs that
test only a fraction of the possible treatment level combinations.
– Obviously, something is lost by not gaining information from all of the possible
combinations.
– Specifically, different effects are confounded, which means that they cannot be
estimated independently.

(2/4) Incomplete Designs

• A “half-fractional” design with four factors, each at
two levels, is shown at the right.
– If this were a full factorial design, there would
be 2^4 = 16 combinations of treatment levels.
– The “half-fractional” design means that only
half, or eight, of these are used.
– When using only two levels for each factor, it is
customary to label the lower level with −1
and the higher level with +1.
– Each row in the figure represents one of eight
combinations of the factor levels.

(3/4) Incomplete Designs

• To see how the confounding works, it is useful to create new columns by multiplying
the appropriate original A-D columns.
• There is now a column for each possible two-way and three-way interaction, and the
columns come in pairs (e.g., AB is the same as CD).
– When two columns are identical, we say that one is the alias of the other.
• If two effects are aliases of one another, it is impossible to estimate their
separate effects.
• Therefore, we try to design the experiment so that only one of these is likely to
be important and the other is likely to be insignificant.
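
The aliasing can be checked with a short sketch. The figure with the eight runs is not reproduced in this text, so the code assumes the common construction in which A, B, and C take all eight ±1 combinations and D is set to the product A·B·C; this assumption is consistent with the statement that AB is the same as CD. Function names are hypothetical.

```python
from itertools import product


def half_fraction_runs():
    """Eight runs of a four-factor, two-level half fraction, assuming D = A*B*C."""
    return [
        {"A": a, "B": b, "C": c, "D": a * b * c}
        for a, b, c in product((-1, 1), repeat=3)
    ]


def interaction_column(runs, factors):
    """Multiply the +/-1 columns of the named factors, run by run."""
    column = []
    for run in runs:
        value = 1
        for factor in factors:
            value *= run[factor]
        column.append(value)
    return column


runs = half_fraction_runs()
for first, second in (("AB", "CD"), ("AC", "BD"), ("AD", "BC"), ("ABC", "D")):
    same = interaction_column(runs, first) == interaction_column(runs, second)
    print(f"{first} aliased with {second}: {same}")
```

Identical columns mean the data cannot distinguish the two effects, so the analyst has to decide in advance which member of each alias pair is plausibly negligible.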

(4/4) Incomplete Designs

Conclusion

• The business world has also begun to realize the importance of designed experiments
for designing and producing better products, and this trend will undoubtedly continue.
• It is important to keep sight of the overall goal: to see whether variations in one or
more factors have significant effects on a dependent variable of interest.
• The role of experimental design is to set up experiments, using randomization, blocking,
fractional factorial designs, or other techniques, so as to get as much
information from the resulting data as possible.
• Then the techniques of ANOVA indicate whether any main effects or interactions are
significant.
• If there are significant effects, confidence intervals can be formed to measure the
magnitudes of specific differences between means or other contrasts.
• The goal of good experimental design is to identify important factor effects when they
exist.
If you want to learn more:

Advanced Market Research: Experimentation


• How to formulate research questions, managerial
problems, and hypotheses.
• How to use experiments for causal research.
• How to set up, design, and conduct experiments and analyze
the resulting data.
• Taught by Professor Franziska Krause, Assistant
Professor for Customer Demand and Brand Experience
at EBS
