
Analysis of Variance and Covariance

Chapter 16 Marketing Research


Chapter Outline
• Relationship Among Techniques
• One-Way Analysis of Variance
• Statistics Associated with One-Way Analysis of Variance
• Conducting One-Way Analysis of Variance
i. Identification of Dependent and Independent Variables
ii. Decomposition of the Total Variation
iii. Measurement of Effects
iv. Significance Testing
v. Interpretation of Results
• Illustrative Data
• Illustrative Applications of One-Way Analysis of Variance
• Assumptions in Analysis of Variance
• N-Way Analysis of Variance
• Analysis of Covariance
• Issues in Interpretation
i. Interactions
ii. Relative Importance of Factors
iii. Multiple Comparisons
Relationship Among Techniques
• Analysis of variance (ANOVA) is used as a test of
means for two or more populations. The null
hypothesis, typically, is that all means are equal.
• Analysis of variance must have a dependent
variable that is metric (measured using an interval
or ratio scale).
• There must also be one or more independent
variables that are all categorical (nonmetric).
Categorical independent variables are also called
factors.
• A particular combination of factor levels, or
categories, is called a treatment.
• One-way analysis of variance involves only one
categorical variable, or a single factor. In one-way
analysis of variance, a treatment is the same as a
factor level.
• If two or more factors are involved, the analysis is
termed n-way analysis of variance.
• If the set of independent variables consists of both
categorical and metric variables, the technique is
called analysis of covariance (ANCOVA). In this
case, the categorical independent variables are still
referred to as factors, whereas the metric
independent variables are referred to as covariates.
Relationship Among t Test, Analysis of
Variance, Analysis of Covariance, and Regression
One-way Analysis of Variance
• Marketing researchers are often interested in examining
the differences in the mean values of the dependent
variable for several categories of a single independent
variable or factor. For example:

• Do the various segments differ in terms of their volume


of product consumption?
• Do the brand evaluations of groups exposed to different
commercials vary?
• What is the effect of consumers' familiarity with the
store (measured as high, medium, and low) on
preference for the store?
Statistics Associated with One-way
Analysis of Variance
• Eta squared (η²): The strength of the effects of X
(independent variable or factor) on Y (dependent
variable) is measured by η². The value of η² varies
between 0 and 1.
• F statistic: The null hypothesis that the category
means are equal in the population is tested by an F
statistic based on the ratio of mean square related
to X and mean square related to error.
• Mean square: This is the sum of squares divided
by the appropriate degrees of freedom.
• SSbetween: Also denoted as SSx, this is the variation
in Y related to the variation in the means of the
categories of X. This represents variation between
the categories of X, or the portion of the sum of
squares in Y related to X.
• SSwithin: Also referred to as SSerror, this is the
variation in Y due to the variation within each of the
categories of X. This variation is not accounted for
by X.
• SSy: This is the total variation in Y.
Conducting One-way ANOVA
Identify the Dependent and Independent Variables
• The dependent variable is denoted by Y and the
independent variable by X.
• X is a categorical variable having c categories.
• There are n observations on Y for each category of
X, as shown in Figure 16.1.
• The sample size in each category of X is n, and the
total sample size N = n × c.
Decompose the Total Variation
• Decomposition of the total variation: In one-way ANOVA,
separation of the variation observed in the dependent variable
into the variation due to the independent variable plus the
variation due to error.
• The total variation in Y, denoted by SSy, can be decomposed
into two components:

SSy = SSbetween + SSwithin

where the subscripts between and within refer to the categories
of X.
• SSbetween is the variation in Y related to the variation in the
means of the categories of X. For this reason, SSbetween is also
denoted as SSx.
• SSwithin is the variation in Y related to the variation within each
category of X. SSwithin is not accounted for by X, and is
therefore referred to as SSerror.
• The total variation in Y may thus also be written as:

SSy = SSx + SSerror

• In analysis of variance, we estimate two measures of variation:
within groups (SSwithin) and between groups (SSbetween).
• Thus, by comparing the Y variance estimates based on
between-group and within-group variation, we can test the null
hypothesis.
Measure the Effects
• The strength of the effects of X on Y is measured as
follows:

η² = SSx / SSy
   = (SSy – SSerror) / SSy

• The value of η² varies between 0 and 1.
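
As a hand-check of this decomposition (a minimal sketch, not part of the original chapter, assuming NumPy is installed), the three sums of squares and η² can be computed directly; the groups below are the promotion-level sales from the chapter's illustration (Table 16.3, later in this deck):

```python
import numpy as np

# Store sales for the three levels of in-store promotion (Table 16.3).
high = np.array([10, 9, 10, 8, 9, 8, 9, 7, 7, 6])
medium = np.array([8, 8, 7, 9, 6, 4, 5, 5, 6, 4])
low = np.array([5, 7, 6, 4, 5, 2, 3, 2, 1, 2])

groups = [high, medium, low]
y = np.concatenate(groups)
grand_mean = y.mean()

# SSy: total variation of Y around the grand mean.
ss_y = ((y - grand_mean) ** 2).sum()
# SSx (between): variation of the category means around the grand mean.
ss_x = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SSerror (within): variation of observations around their own category mean.
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)

eta_sq = ss_x / ss_y
print(ss_y, ss_x, ss_error)  # 185.867  106.067  79.800
print(eta_sq)                # 0.571
```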
Test the Significance
• In one-way analysis of variance, the interest lies in testing the null
hypothesis that the category means are equal in the population:

H0: µ1 = µ2 = µ3 = ... = µc

• Under the null hypothesis, SSx and SSerror come from the same source
of variation. In other words, the estimate of the population variance
of Y can be based on either between-category variation or
within-category variation:

Sy² = SSx / (c - 1)
    = mean square due to X
    = MSx
or
Sy² = SSerror / (N - c)
    = mean square due to error
    = MSerror

• The null hypothesis may be tested by the F statistic, based on the
ratio of these two estimates:

F = [SSx / (c - 1)] / [SSerror / (N - c)] = MSx / MSerror

• This statistic follows the F distribution with (c - 1) and (N - c)
degrees of freedom (df).
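
The mechanics of this test are easy to script. A sketch, assuming SciPy is available; the sums of squares are the ones computed in the previous sketch:

```python
from scipy import stats

ss_x, ss_error = 106.067, 79.800  # from the decomposition sketch above
c, N = 3, 30                      # number of categories, total sample size

ms_x = ss_x / (c - 1)             # mean square due to X
ms_error = ss_error / (N - c)     # mean square due to error
f_stat = ms_x / ms_error          # 17.944

# Critical value at alpha = 0.05, and the p-value of the observed F.
f_crit = stats.f.ppf(0.95, dfn=c - 1, dfd=N - c)   # about 3.35
p_value = stats.f.sf(f_stat, dfn=c - 1, dfd=N - c)
print(f_stat, f_crit, p_value)
```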
Interpret the Results
• If the null hypothesis of equal category means is
not rejected, then the independent variable does
not have a significant effect on the dependent
variable.
• On the other hand, if the null hypothesis is
rejected, then the effect of the independent
variable is significant.
• A comparison of the category mean values will
indicate the nature of the effect of the independent
variable.
Illustrative Applications of One-way
Analysis of Variance
• We illustrate the concepts discussed in this chapter
using the data presented in Table 16.2.
• The department store is attempting to determine the
effect of in-store promotion (X) on sales (Y).
• For the purpose of illustrating hand calculations, the
data of Table 16.2 are transformed in Table 16.3 to
show the store sales (Yij) for each level of promotion.
• The null hypothesis is that the category means are
equal:
H0: µ1 = µ2 = µ3
Table 16.2
Store Number   Coupon Level   In-Store Promotion   Sales   Clientele Rating
1 1.00 1.00 10.00 9.00
2 1.00 1.00 9.00 10.00
3 1.00 1.00 10.00 8.00
4 1.00 1.00 8.00 4.00
5 1.00 1.00 9.00 6.00
6 1.00 2.00 8.00 8.00
7 1.00 2.00 8.00 4.00
8 1.00 2.00 7.00 10.00
9 1.00 2.00 9.00 6.00
10 1.00 2.00 6.00 9.00
11 1.00 3.00 5.00 8.00
12 1.00 3.00 7.00 9.00
13 1.00 3.00 6.00 6.00
14 1.00 3.00 4.00 10.00
15 1.00 3.00 5.00 4.00
16 2.00 1.00 8.00 10.00
17 2.00 1.00 9.00 6.00
18 2.00 1.00 7.00 8.00
19 2.00 1.00 7.00 4.00
20 2.00 1.00 6.00 9.00
21 2.00 2.00 4.00 6.00
22 2.00 2.00 5.00 8.00
23 2.00 2.00 5.00 10.00
24 2.00 2.00 6.00 4.00
25 2.00 2.00 4.00 9.00
26 2.00 3.00 2.00 4.00
27 2.00 3.00 3.00 6.00
28 2.00 3.00 2.00 10.00
29 2.00 3.00 1.00 9.00
30 2.00 3.00 2.00 8.00
TABLE 16.3
EFFECT OF IN-STORE PROMOTION ON SALES

Store          Level of In-Store Promotion
No.         High      Medium      Low
             (Normalized Sales)
_____________________________________________
1            10          8          5
2             9          8          7
3            10          7          6
4             8          9          4
5             9          6          5
6             8          4          2
7             9          5          3
8             7          5          2
9             7          6          1
10            6          4          2
_____________________________________________
Column
totals       83         62         37
Category
means, ȳj   83/10      62/10      37/10
           = 8.3      = 6.2      = 3.7

Grand mean, ȳ = (83 + 62 + 37)/30 = 6.067
• It can be verified that
SSy = SSx + SSerror
as follows:
185.867 = 106.067 + 79.800
• The strength of the effects of X on Y is measured as
follows:

η² = SSx / SSy
   = 106.067 / 185.867
   = 0.571

• In other words, 57.1% of the variation in sales (Y)
is accounted for by in-store promotion (X),
indicating a modest effect. The null hypothesis
may now be tested.
F = [SSx / (c - 1)] / [SSerror / (N - c)] = MSx / MSerror

F = [106.067 / (3 - 1)] / [79.800 / (30 - 3)]
  = 53.033 / 2.956
  = 17.944

• In the Statistical Appendix we see that for 2 and 27
degrees of freedom, the critical value of F is 3.35 for
α = 0.05. Because the calculated value of F is greater
than the critical value, we reject the null hypothesis.
• We now illustrate the analysis of variance
procedure using a computer program.
• The results of conducting the same analysis by
computer are presented in Table 16.4.
Table 16.4
ONE-WAY ANOVA: EFFECT OF IN-STORE PROMOTION ON STORE SALES

Source of variation          Sum of squares   df   Mean square   F ratio   F prob.
Between groups (Promotion)          106.067    2        53.033    17.944     0.000
Within groups (Error)                79.800   27         2.956
TOTAL                               185.867   29         6.409
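
The text's computer results can also be reproduced in Python; a minimal sketch, assuming SciPy, which returns the same F ratio and probability as Table 16.4:

```python
from scipy.stats import f_oneway

# Sales under high, medium, and low in-store promotion (Table 16.3).
high = [10, 9, 10, 8, 9, 8, 9, 7, 7, 6]
medium = [8, 8, 7, 9, 6, 4, 5, 5, 6, 4]
low = [5, 7, 6, 4, 5, 2, 3, 2, 1, 2]

f_stat, p_value = f_oneway(high, medium, low)
print(f_stat, p_value)  # F = 17.944, p < 0.001
```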
Assumptions in Analysis of Variance
• The salient assumptions in analysis of variance can be
summarized as follows.

1. Ordinarily, the categories of the independent variable are
assumed to be fixed. Inferences are made only to the
specific categories considered. This is referred to as the
fixed-effects model.
2. The error term is normally distributed, with a zero mean
and a constant variance. The error is not related to any of
the categories of X.
3. The error terms are uncorrelated. If the error terms are
correlated (i.e., the observations are not independent), the F
ratio can be seriously distorted.
N-way Analysis of Variance
• In marketing research, one is often concerned with the
effect of more than one factor simultaneously. For example:

• How do advertising levels (high, medium, and low) interact
with price levels (high, medium, and low) to influence a
brand's sales?
• Do educational levels (less than high school, high school
graduate, some college, and college graduate) and age (less
than 35, 35-55, more than 55) affect consumption of a
brand?
• What is the effect of consumers' familiarity with a
department store (high, medium, and low) and store image
(positive, neutral, and negative) on preference for the store?
• Consider the simple case of two factors X1 and X2
having c1 and c2 categories, respectively. The total
variation in this case is partitioned as follows:

SStotal = SS due to X1 + SS due to X2
        + SS due to interaction of X1 and X2 + SSwithin
or
SSy = SSx1 + SSx2 + SSx1x2 + SSerror

• The strength of the joint effect of the two factors,
called the overall effect, or multiple η², is measured
as follows:

multiple η² = (SSx1 + SSx2 + SSx1x2) / SSy
• The significance of the overall effect may be tested by an F test,
as follows:

F = [(SSx1 + SSx2 + SSx1x2) / dfn] / (SSerror / dfd)
  = (SSx1,x2,x1x2 / dfn) / (SSerror / dfd)
  = MSx1,x2,x1x2 / MSerror

where

dfn = degrees of freedom for the numerator
    = (c1 - 1) + (c2 - 1) + (c1 - 1)(c2 - 1)
    = c1c2 - 1
dfd = degrees of freedom for the denominator
    = N - c1c2
MS  = mean square
• If the overall effect is significant, the next step is to examine
the significance of the interaction effect. Under the null
hypothesis of no interaction, the appropriate F test is:

F = (SSx1x2 / dfn) / (SSerror / dfd) = MSx1x2 / MSerror

where

dfn = (c1 - 1)(c2 - 1)
dfd = N - c1c2
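
As an illustrative sketch of n-way ANOVA (assuming pandas and statsmodels, not the package used in the text), a two-way model with interaction can be fit to the Table 16.2 data; anova_lm reports the sum of squares, df, F ratio, and p-value for each effect:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Table 16.2: coupon level (2 levels) and in-store promotion (3 levels).
sales = [10, 9, 10, 8, 9, 8, 8, 7, 9, 6, 5, 7, 6, 4, 5,
         8, 9, 7, 7, 6, 4, 5, 5, 6, 4, 2, 3, 2, 1, 2]
promotion = [1]*5 + [2]*5 + [3]*5 + [1]*5 + [2]*5 + [3]*5
coupon = [1]*15 + [2]*15

df = pd.DataFrame({"sales": sales, "promotion": promotion, "coupon": coupon})

# C() treats the factors as categorical; '*' adds both main effects
# and the promotion x coupon interaction.
model = ols("sales ~ C(promotion) * C(coupon)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```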
Analysis of Covariance
• When examining the differences in the mean values
of the dependent variable related to the effect of the
controlled independent variables, it is often
necessary to take into account the influence of
uncontrolled independent variables.

• For example:
• In determining how different groups exposed to
different commercials evaluate a brand, it may be
necessary to control for prior knowledge.
• In determining how different price levels will affect
a household's cereal consumption, it may be
essential to take household size into account.
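
A parallel ANCOVA sketch, under the same assumptions, adds the clientele rating from Table 16.2 as a metric covariate while the factors stay categorical:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Same 30 stores as the two-way example, plus the clientele rating.
sales = [10, 9, 10, 8, 9, 8, 8, 7, 9, 6, 5, 7, 6, 4, 5,
         8, 9, 7, 7, 6, 4, 5, 5, 6, 4, 2, 3, 2, 1, 2]
promotion = [1]*5 + [2]*5 + [3]*5 + [1]*5 + [2]*5 + [3]*5
coupon = [1]*15 + [2]*15
clientele = [9, 10, 8, 4, 6, 8, 4, 10, 6, 9, 8, 9, 6, 10, 4,
             10, 6, 8, 4, 9, 6, 8, 10, 4, 9, 4, 6, 10, 9, 8]
df = pd.DataFrame({"sales": sales, "promotion": promotion,
                   "coupon": coupon, "clientele": clientele})

# The covariate enters as a metric term; the factors stay categorical.
ancova = ols("sales ~ clientele + C(promotion) * C(coupon)", data=df).fit()
print(sm.stats.anova_lm(ancova, typ=2))
```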
Issues in Interpretation
• Important issues involved in the interpretation of ANOVA
results include interactions, relative importance of factors, and
multiple comparisons.

Interactions
• The different interactions that can arise when conducting
ANOVA on two or more factors are shown in Figure 16.3.
• An interaction effect occurs when the effect of an independent
variable on the dependent variable differs across the
categories, or levels, of another independent variable.
• The interaction may be ordinal or disordinal.
• In ordinal interaction, the rank order of the effects
attributable to one factor does not change across
the levels of the second factor.
• Disordinal interaction involves a change in the rank
order of the effects of one factor across the levels
of another.
Patterns of Interaction (Figure 16.3)
• Case 1: No Interaction
• Case 2: Ordinal Interaction
• Case 3: Disordinal Interaction: Noncrossover
• Case 4: Disordinal Interaction: Crossover
[Each panel plots Y against the levels of one factor (X11, X12, X13)
for two levels of a second factor (X21, X22).]
Relative Importance of Factors
• Experimental designs are usually balanced, in that
each cell contains the same number of
respondents.
• This results in an orthogonal design in which the
factors are uncorrelated.
• Hence, it is possible to determine unambiguously
the relative importance of each factor in
explaining the variation in the dependent variable.
• The most commonly used measure in ANOVA is
omega squared, ω².
• This measure indicates what proportion of the
variation in the dependent variable is related to a
particular independent variable or factor.
• The relative contribution of a factor X is calculated
as follows:

ωx² = [SSx - (dfx × MSerror)] / (SStotal + MSerror)

• Normally, ω² is interpreted only for statistically
significant effects.
• In Table 16.5, the ω² associated with the level of in-store
promotion is calculated as follows:

ωp² = [106.067 - (2 × 0.967)] / (185.867 + 0.967)
    = 104.133 / 186.834
    = 0.557

• Note, in Table 16.5, that
SStotal = 106.067 + 53.333 + 3.267 + 23.2
        = 185.867
• Likewise, the ω² associated with couponing is:

ωc² = [53.333 - (1 × 0.967)] / (185.867 + 0.967)
    = 52.366 / 186.834
    = 0.280

• As a guide to interpreting ω², a large experimental effect produces an
index of 0.15 or greater, a medium effect produces an index of
around 0.06, and a small effect produces an index of 0.01.
• In Table 16.5, while the effects of promotion and couponing are both
large, the effect of promotion is much larger.
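
The ω² computation is simple enough to script; a sketch, using the Table 16.5 values quoted above (MSerror = 23.2 / 24 = 0.967):

```python
def omega_squared(ss_x: float, df_x: int, ms_error: float, ss_total: float) -> float:
    """Proportion of variation in the dependent variable attributable to factor X."""
    return (ss_x - df_x * ms_error) / (ss_total + ms_error)

print(omega_squared(106.067, 2, 0.967, 185.867))  # promotion: 0.557
print(omega_squared(53.333, 1, 0.967, 185.867))   # coupon:    0.280
```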
Multiple Comparisons
• If the null hypothesis of equal means is rejected, we can
only conclude that not all of the group means are equal.
• We may wish to examine differences among specific
means. This can be done by specifying appropriate
contrasts, or comparisons used to determine which of
the means are statistically different.
• A priori contrasts are determined before conducting
the analysis, based on the researcher's theoretical
framework.
• Generally, a priori contrasts are used in lieu of the
ANOVA F test. The contrasts selected are orthogonal
(they are independent in a statistical sense).
• A posteriori contrasts are made after the analysis.
These are generally multiple comparison tests.
• They enable the researcher to construct generalized
confidence intervals that can be used to make
pairwise comparisons of all treatment means.
• These tests, listed in order of decreasing power,
include least significant difference, Duncan's
multiple range test, Student-Newman-Keuls,
Tukey's alternate procedure, honestly significant
difference, modified least significant difference,
and Scheffe's test.
• Of these tests, least significant difference is the
most powerful, Scheffe's the most conservative.
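
As a sketch of one such a posteriori procedure (assuming statsmodels), Tukey's honestly significant difference test compares every pair of promotion-level means from the one-way example:

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

sales = np.array([10, 9, 10, 8, 9, 8, 9, 7, 7, 6,   # high promotion
                  8, 8, 7, 9, 6, 4, 5, 5, 6, 4,     # medium
                  5, 7, 6, 4, 5, 2, 3, 2, 1, 2])    # low
level = ["high"] * 10 + ["medium"] * 10 + ["low"] * 10

# Simultaneous confidence intervals for all pairwise mean differences.
print(pairwise_tukeyhsd(sales, level, alpha=0.05))
```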
REPEATED MEASURES ANOVA
• An ANOVA technique used when respondents are
exposed to more than one treatment condition and
repeated measurements are obtained.
Here,
• SStotal = SSbetween people + SSwithin people
• SSwithin people = SSx + SSerror
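
A sketch of repeated measures ANOVA with statsmodels' AnovaRM; the subjects, conditions, and scores here are hypothetical, invented only to show the required long-format layout:

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: each subject is measured under
# all three treatment conditions.
df = pd.DataFrame({
    "subject":   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "condition": ["a", "b", "c"] * 4,
    "score":     [7, 5, 3, 8, 6, 4, 6, 6, 2, 9, 5, 3],
})

# Within-subject factor 'condition'; variation between people is
# removed before testing the treatment effect.
result = AnovaRM(df, depvar="score", subject="subject",
                 within=["condition"]).fit()
print(result)
```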
NONMETRIC ANALYSIS OF VARIANCE
• Nonmetric ANOVA: An ANOVA technique for
examining the difference in the central tendencies of
more than two groups when the dependent variable is
measured on an ordinal scale.
• k-sample median test: A nonparametric test used to
examine differences among groups when the
dependent variable is measured on an ordinal scale.
• Kruskal-Wallis one-way analysis of variance: A
nonmetric ANOVA test that uses the rank value of
each case, not merely its location relative to the median.
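
Both nonmetric tests have standard implementations; a sketch of the Kruskal-Wallis test in SciPy, applied to the promotion groups as if sales were ordinal:

```python
from scipy.stats import kruskal

high = [10, 9, 10, 8, 9, 8, 9, 7, 7, 6]
medium = [8, 8, 7, 9, 6, 4, 5, 5, 6, 4]
low = [5, 7, 6, 4, 5, 2, 3, 2, 1, 2]

# Rank-based test of equal central tendency across the three groups.
h_stat, p_value = kruskal(high, medium, low)
print(h_stat, p_value)
```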
MULTIVARIATE ANALYSIS OF VARIANCE
• An ANOVA technique using two or more metric
dependent variables.
• In MANOVA, the null hypothesis is that the vectors
of means on the multiple dependent variables are
equal across groups.
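
A closing sketch of MANOVA (assuming pandas and statsmodels), treating sales and clientele rating from Table 16.2 as two metric dependent variables and promotion as the grouping factor:

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

sales = [10, 9, 10, 8, 9, 8, 8, 7, 9, 6, 5, 7, 6, 4, 5,
         8, 9, 7, 7, 6, 4, 5, 5, 6, 4, 2, 3, 2, 1, 2]
clientele = [9, 10, 8, 4, 6, 8, 4, 10, 6, 9, 8, 9, 6, 10, 4,
             10, 6, 8, 4, 9, 6, 8, 10, 4, 9, 4, 6, 10, 9, 8]
promotion = [1]*5 + [2]*5 + [3]*5 + [1]*5 + [2]*5 + [3]*5

df = pd.DataFrame({"sales": sales, "clientele": clientele,
                   "promotion": promotion})

# Null hypothesis: the mean vector (sales, clientele) is equal
# across the three promotion levels.
maov = MANOVA.from_formula("sales + clientele ~ C(promotion)", data=df)
print(maov.mv_test())
```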
