Discriminant Analysis (Student Notes)
Analysis of Dependence
We now focus our discussion on the multivariate techniques that deal with
analysis of dependence.
The purpose of these techniques is to predict a dependent variable from a set
of independent variables. The techniques we cover in this course include
multiple regression and discriminant analysis, which are dependence
techniques, and cluster analysis, which is strictly an interdependence
technique because it has no dependent variable.
Discriminant Analysis
The scatter plot in Figure-I shows two groups, one containing primarily Back
Yard Burgers customers and the other containing primarily households that
patronize other fast-food restaurants.
From this example, it appears that X1 and X2 are critical discriminators of
fast-food restaurant patronage. Although the two areas overlap, the extent of
the overlap does not seem to be substantial.
From a statistical perspective, discriminant analysis involves studying the
direction of group differences by finding a linear combination of the
independent variables (the discriminant function) that shows large differences
in group means.
Figure-I: Discriminant Analysis Scatter Plot of Lifestyle and Income Data for Fast-Food
Restaurant Patronage
We will use a two-group discriminant analysis example in which the dependent
variable, Y, is measured on a nominal scale (i.e., patrons of Back Yard Burgers
versus patrons of other fast-food restaurants).
Now the researcher must find a linear function of the independent variables
that shows large differences in group means.
The discriminant score, or the Z score, is the basis for predicting to which
group the particular individual belongs and is determined by a linear function.
This Z score will be derived for each individual by means of the following
equation:
$$Z_i = b_1 X_{1i} + b_2 X_{2i} + \cdots + b_n X_{ni}$$
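where Z_i is the discriminant score of each individual i, b_1, ..., b_n are the
discriminant weights, and X_{1i}, ..., X_{ni} are that individual's values on
the n independent variables.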
Independent variables with large discriminatory power will have large weights,
and those with little discriminatory power will have small weights.
Discriminant function coefficients: The multipliers of variables in the
discriminant function when the variables are in the original units of
measurement.
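The notes do not say how the weights are estimated. As a rough illustration only, the sketch below uses Fisher's linear discriminant for two groups and two variables, where the weights are the inverse of the pooled within-group scatter matrix times the difference in group means. The data points and the function names (`mean_vector`, `scatter`, `fisher_weights`) are hypothetical and not taken from the fast-food example.

```python
# Minimal sketch of Fisher's two-group linear discriminant (two variables).
# All data below are hypothetical, chosen only to make the arithmetic easy.

def mean_vector(rows):
    """Mean of each of the two variables across the rows of one group."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mean):
    """2x2 sum-of-squares-and-cross-products matrix around the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def fisher_weights(group1, group2):
    """Weights proportional to inverse(within-group scatter) * (mean1 - mean2)."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter(group1, m1), scatter(group2, m2)
    sw = [[s1[a][b] + s2[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    return [inv[a][0] * diff[0] + inv[a][1] * diff[1] for a in range(2)]

# Hypothetical (X1, X2) observations for two groups.
group1 = [(4, 5), (5, 6), (6, 5), (5, 4)]
group2 = [(1, 2), (2, 3), (3, 2), (2, 1)]

weights = fisher_weights(group1, group2)

# Mean discriminant score of each group; a large gap means good separation.
mean_z1 = sum(w * m for w, m in zip(weights, mean_vector(group1)))
mean_z2 = sum(w * m for w, m in zip(weights, mean_vector(group2)))
print(weights, mean_z1, mean_z2)
```

Weights found this way are determined only up to a scale factor, so statistical packages usually rescale them before reporting.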
Returning to our fast-food example, suppose the marketing manager finds the
standardized weights or coefficients in the equation to be:
$$Z_i = b_1 X_{1i} + b_2 X_{2i} = 0.32 X_{1i} + 0.37 X_{2i}$$
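To make the scoring step concrete, here is a small sketch that applies the example weights to a few households. It assumes both weights enter with a plus sign, and the household values are hypothetical standardized scores, not data from the study.

```python
# Weights from the fast-food example (plus signs are an assumption).
WEIGHTS = (0.32, 0.37)

def discriminant_score(x, weights=WEIGHTS):
    """Return Z = b1*X1 + b2*X2 + ... + bn*Xn for one individual."""
    if len(x) != len(weights):
        raise ValueError("one value per weight is required")
    return sum(b * xi for b, xi in zip(weights, x))

# Hypothetical standardized (X1, X2) values for three households.
households = {"A": (1.0, 0.5), "B": (-0.5, -1.0), "C": (0.0, 2.0)}

scores = {name: discriminant_score(x) for name, x in households.items()}
for name, z in scores.items():
    print(f"household {name}: Z = {z:.3f}")
```

Each household would then be assigned to a group by comparing its Z score with a cutoff value, which these notes do not specify.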
Group totals: 110 and 122 (N = 232).
While our example illustrated how discriminant analysis helped classify users
and nonusers of the restaurant based on independent variables, other
applications include the following:
Direct Marketing:
users and light users, or we could see if the perceptions differ depending on
how far customers drove to eat at Deli Depot.
The first thing you do is transfer variable X7 to the Grouping Variable box at
the top, and then click on the Define Range box just below it.
You must tell the program what the minimum and maximum numbers are for
the grouping variable.
In this case the minimum is 0 = female and the maximum is 1 = male, so just
put these numbers in and click on Continue.
Next you must transfer variables X1 through X6 into the Independents box. Then
click on the Statistics box at the bottom, check Means and Univariate
ANOVAs, and click Continue. The Method default is Enter, and we will use this.
Now click on Classify and select Compute from group sizes. We do not know if
the sample sizes are equal, so we must check this option. You should also click
Summary Table and then Continue.
We do not use any options under Save, so click OK to run the program.
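As a rough illustration of what Compute from group sizes means, the sketch below derives prior probabilities from the group sizes reported later in these notes (30 females, 20 males). The variable names are hypothetical, and this is not SPSS code.

```python
# Group sizes as reported in the notes for the Deli Depot sample.
group_sizes = {"female": 30, "male": 20}

total = sum(group_sizes.values())

# Prior probability of each group = its share of the sample.
priors = {group: n / total for group, n in group_sizes.items()}
print(priors)
```

With equal priors each group would instead get 0.5, which is why the option matters when group sizes differ.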
The discriminant analysis procedure gives you a lot of output that you will
not use for a simple analysis like this one. We will look at only five tables.
At the bottom of the Classification Results table we see that the overall
ability of our discriminant function to predict group membership is 90 percent
(45 of 50 respondents correctly classified). This is very good because without
the discriminant function we could predict with only 60 percent accuracy (our
sample sizes are males = 20 and females = 30, so if we said all respondents
were female, we would predict with 60 percent accuracy).
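The two percentages can be reproduced from the classification counts. The sketch below uses the counts from the Classification Results table; the dictionary layout and variable names are just one convenient choice.

```python
# Rows: actual gender; inner keys: predicted gender (counts from the notes).
counts = {
    "female": {"female": 26, "male": 4},
    "male": {"female": 1, "male": 19},
}

n_total = sum(sum(row.values()) for row in counts.values())

# Correct classifications lie on the diagonal (actual == predicted).
n_correct = sum(counts[g][g] for g in counts)
hit_ratio = n_correct / n_total  # 45 / 50 = 0.9

# Baseline: assign everyone to the largest group.
largest_group = max(sum(row.values()) for row in counts.values())
baseline = largest_group / n_total  # 30 / 50 = 0.6

print(f"hit ratio = {hit_ratio:.0%}, baseline = {baseline:.0%}")
```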
Test of Function(s)   Wilks' Lambda   Chi-Square   df   Sig.
1                     .317            51.687       6    .000
Table-II: Wilks' Lambda

Classification Results (rows: actual gender; columns: predicted gender)
Original   Gender   Female   Male    Total
Count      Female   26       4       30
           Male     1        19      20
%          Female   86.7     13.3    100.0
           Male     5.0      95.0    100.0
Table-III: Classification Results
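The chi-square value reported with Wilks' lambda can be roughly cross-checked. The notes do not show how SPSS computes it; the sketch below assumes Bartlett's approximation, chi-square = -(N - 1 - (p + g)/2) ln(lambda), with N = 50 cases, p = 6 independent variables, and g = 2 groups. The function name is hypothetical.

```python
import math

def bartlett_chi_square(wilks_lambda, n_cases, n_vars, n_groups):
    """Bartlett's chi-square approximation for Wilks' lambda (assumed method)."""
    multiplier = n_cases - 1 - (n_vars + n_groups) / 2
    chi_square = -multiplier * math.log(wilks_lambda)
    df = n_vars * (n_groups - 1)
    return chi_square, df

# Values from the notes: lambda = .317, 50 respondents, 6 variables, 2 groups.
chi_square, df = bartlett_chi_square(0.317, n_cases=50, n_vars=6, n_groups=2)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

The result is about 51.7 rather than exactly 51.687 because the lambda entered here is rounded to three decimals.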
Results shown in the table labeled Tests of Equality of Group Means show
which perception variables differ between males and females on a univariate
basis. Note that variables X1, X2, X3, and X6 are all highly statistically
significant. Variables X4 and X5 are not significant. To consider the variables
from a multivariate perspective, we can look at the information in either the
Standardized Canonical Discriminant Function Coefficients table or
the Structure Matrix table.
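The univariate tests can be cross-checked by hand, because for a single variable F follows directly from Wilks' lambda: F = ((1 - lambda) / lambda) * (df2 / df1). The sketch below applies this to the values read from the Tests of Equality of Group Means table; small gaps are expected because the lambdas are rounded to three decimals. The function name is hypothetical.

```python
def f_from_wilks(wilks_lambda, df1, df2):
    """Univariate F implied by Wilks' lambda = SS_within / SS_total."""
    return (1 - wilks_lambda) / wilks_lambda * df2 / df1

# (Wilks' lambda, reported F) per variable, as read from the table.
table = {
    "X1": (0.679, 22.735),
    "X2": (0.855, 8.140),
    "X3": (0.578, 35.109),
    "X4": (1.000, 0.010),
    "X5": (1.000, 0.004),
    "X6": (0.520, 44.377),
}

implied = {name: f_from_wilks(lam, df1=1, df2=48) for name, (lam, _) in table.items()}
for name, (lam, reported) in table.items():
    print(f"{name}: implied F = {implied[name]:.3f}, reported F = {reported:.3f}")
```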
Let's use the information from the Structure Matrix table. First we identify
the numbers in the Function column that are .30 or higher. This cutoff level is
determined in a manner similar to a factor loading.
Variable                 Wilks' Lambda   F        df1   df2   Sig.
X1 Friendly Employees    .679            22.735   1     48    .000
X2 Competitive Price     .855            8.140    1     48    .006
X3 Competent Employees   .578            35.109   1     48    .000
X4 Food Quality          1.000           .010     1     48    .920
X5 Food Variety          1.000           .004     1     48    .947
X6 Speed of Service      .520            44.377   1     48    .000
Table-IV: Tests of Equality of Group Means

Variable                 Function 1
X6 Speed of Service      .655
X3 Competent Employees   .583
X1 Friendly Employees    .469
X2 Competitive Price     .281
X4 Food Quality          .010
X5 Food Variety          .007
Table-V: Structure Matrix
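The cutoff rule described above can be sketched as a simple filter over the structure matrix loadings. The loadings are taken from the Structure Matrix table; comparing absolute values is an assumption that would matter only if some loadings were negative.

```python
# Structure matrix loadings on Function 1, from the notes.
loadings = {
    "X6 Speed of Service": 0.655,
    "X3 Competent Employees": 0.583,
    "X1 Friendly Employees": 0.469,
    "X2 Competitive Price": 0.281,
    "X4 Food Quality": 0.010,
    "X5 Food Variety": 0.007,
}

CUTOFF = 0.30  # threshold stated in the notes

# Keep variables whose loading magnitude meets the cutoff (abs() is assumed).
important = [name for name, value in loadings.items() if abs(value) >= CUTOFF]
print(important)
```

Under this rule X6, X3, and X1 are retained, while X2 falls just short of the cutoff even though it is significant on a univariate basis.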
The End