Discriminant & Logit Analysis Using SAS Enterprise Guide



Similarities and Differences Among ANOVA, Regression, and
Discriminant Analysis
                                      ANOVA        Regression  Discriminant/Logit Analysis
Similarities
Number of dependent variables         One          One         One
Number of independent variables       Multiple     Multiple    Multiple
Differences
Nature of the dependent variables     Metric       Metric      Categorical/binary
Nature of the independent variables   Categorical  Metric      Metric
Discriminant Analysis (1 of 3)
Discriminant analysis is a technique for analyzing data when the
criterion or dependent variable is categorical and the predictor or
independent variables are interval in nature.
Discriminant Analysis (2 of 3)
The objectives of discriminant analysis are as follows:
• Development of discriminant functions, or linear combinations
of the predictor or independent variables, which will best
discriminate between the categories of the criterion or
dependent variable (groups).
• Examination of whether significant differences exist among the
groups, in terms of the predictor variables.
• Determination of which predictor variables contribute to most of
the intergroup differences.
• Classification of cases to one of the groups based on the values
of the predictor variables.
• Evaluation of the accuracy of classification.
Discriminant Analysis (3 of 3)
• When the criterion variable has two categories, the technique is
known as two-group discriminant analysis.
• When three or more categories are involved, the technique is
referred to as multiple discriminant analysis.
• The main distinction is that, in the two-group case, it is possible
to derive only one discriminant function. In multiple discriminant
analysis, more than one function may be computed. In general,
with G groups and k predictors, it is possible to estimate up to
the smaller of G - 1, or k, discriminant functions.
• The first function has the highest ratio of between-groups to
within-groups sum of squares. The second function,
uncorrelated with the first, has the second highest ratio, and so
on. However, not all the functions may be statistically significant.
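The upper bound on the number of discriminant functions can be written as a one-line helper. This is a minimal sketch; the function name `max_discriminant_functions` is an illustrative choice and does not come from the slides.

```python
def max_discriminant_functions(num_groups: int, num_predictors: int) -> int:
    """Upper bound on the number of discriminant functions: min(G - 1, k)."""
    if num_groups < 2 or num_predictors < 1:
        raise ValueError("need at least two groups and one predictor")
    return min(num_groups - 1, num_predictors)


# Two-group and three-group cases with five predictors each.
two_group = max_discriminant_functions(2, 5)
three_group = max_discriminant_functions(3, 5)
print(two_group, three_group)
```

The bound says how many functions can be estimated, not how many turn out to be statistically significant.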
A Geometric Interpretation of Two-Group Discriminant Analysis
[Figure: geometric interpretation of two-group discriminant analysis]
Discriminant Analysis Model
The discriminant analysis model involves linear combinations of
the following form:
D = b0 + b1X1 + b2X2 + b3X3 + . . . + bkXk
Where:
D = discriminant score
b's = discriminant coefficients or weights
X's = predictors or independent variables
• The coefficients, or weights (b), are estimated so that the groups differ
as much as possible on the values of the discriminant function.
• This occurs when the ratio of between-group sum of squares to within-
group sum of squares for the discriminant scores is at a maximum.
Statistics Associated with Discriminant Analysis (1 of 4)
• Canonical correlation. Canonical correlation measures the extent of association
between the discriminant scores and the groups. It is a measure of association
between the single discriminant function and the set of dummy variables that define
the group membership.
• Centroid. The centroid is the mean value of the discriminant scores for a particular
group. There are as many centroids as there are groups, one for each
group. The means for a group on all the functions are the group centroids.
• Classification matrix. Sometimes also called confusion or prediction matrix, the
classification matrix contains the number of correctly classified and misclassified
cases.
Statistics Associated with Discriminant Analysis (2 of 4)
• Discriminant function coefficients. The discriminant function coefficients
(unstandardized) are the multipliers of variables, when the variables are in the original
units of measurement.
• Discriminant scores. The unstandardized coefficients are multiplied by the values of
the variables. These products are summed and added to the constant term to obtain
the discriminant scores.
• Eigenvalue. For each discriminant function, the Eigenvalue is the ratio of between-
group to within-group sums of squares. Large Eigenvalues imply superior functions.
Statistics Associated with Discriminant Analysis (3 of 4)
• F values and their significance. These are calculated from a one-way
ANOVA, with the grouping variable serving as the categorical independent
variable. Each predictor, in turn, serves as the metric dependent variable in the
ANOVA.

• Group means and group standard deviations. These are computed for each
predictor for each group.

• Pooled within-group correlation matrix. The pooled within-group correlation
matrix is computed by averaging the separate covariance matrices for all the
groups.
Statistics Associated with Discriminant Analysis (4 of 4)
• Standardized discriminant function coefficients. The standardized
discriminant function coefficients are the coefficients used as the
multipliers when the variables have been standardized to a mean of 0
and a variance of 1.
• Structure correlations. Also referred to as discriminant loadings, the
structure correlations represent the simple correlations between the
predictors and the discriminant function.
• Total correlation matrix. If the cases are treated as if they were from
a single sample and the correlations computed, a total correlation
matrix is obtained.
• Wilks' λ. Sometimes also called the U statistic, Wilks' λ for each
predictor is the ratio of the within-group sum of squares to the total
sum of squares. Its value varies between 0 and 1. Large values
of λ (near 1) indicate that group means do not seem to be different.
Small values of λ (near 0) indicate that the group means seem to be
different.
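As a rough illustration of the Wilks' λ definition, the sketch below computes λ for a single predictor, using the annual family income values from the resort-visit analysis sample in this deck. The helper names are illustrative, and the conversion of λ into a univariate F ratio assumes the usual one-way ANOVA identity with G − 1 and n − G degrees of freedom.

```python
# Annual family income ($000) from the analysis sample, by resort-visit group.
income_visit_1 = [50.2, 70.3, 62.9, 48.5, 52.7, 75.0, 46.2, 57.0,
                  64.1, 68.1, 73.4, 71.9, 56.2, 49.3, 62.0]
income_visit_2 = [32.1, 36.2, 43.2, 50.4, 44.1, 38.3, 55.0, 46.1,
                  35.0, 37.3, 41.8, 57.0, 33.4, 37.5, 41.3]


def mean(values):
    return sum(values) / len(values)


def sum_of_squares(values, center):
    return sum((v - center) ** 2 for v in values)


groups = [income_visit_1, income_visit_2]
all_values = [v for g in groups for v in g]

# Within-group SS uses each group's own mean; total SS uses the grand mean.
ss_within = sum(sum_of_squares(g, mean(g)) for g in groups)
ss_total = sum_of_squares(all_values, mean(all_values))
wilks_lambda = ss_within / ss_total

# Univariate F implied by lambda, with G - 1 and n - G degrees of freedom.
n, num_groups = len(all_values), len(groups)
f_ratio = ((1 - wilks_lambda) / wilks_lambda) * (n - num_groups) / (num_groups - 1)
print(round(wilks_lambda, 4), round(f_ratio, 2))
```

The result should be close to the INCOME row of the two-group univariate table (λ of about 0.453, F of about 33.8).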
Conducting Discriminant Analysis
Formulate the Problem
• Identify the objectives, the criterion variable, and the
independent variables.
• The criterion variable must consist of two or more mutually
exclusive and collectively exhaustive categories.
• The predictor variables should be selected based on a
theoretical model or previous research, or the experience of the
analyst.
• One part of the sample, called the estimation or analysis
sample, is used for estimation of the discriminant function.
• The other part, called the holdout or validation sample, is
reserved for validating the discriminant function.
• Often the distribution of the number of cases in the analysis and
validation samples follows the distribution in the total sample.
Information on Resort Visits: Analysis Sample (1 of 2)

No. | Resort Visit | Annual Family Income ($000) | Attitude Toward Travel | Importance Attached to Family Vacation | Household Size | Age of Head of Household | Amount Spent on Family Vacation
1 1 50.2 5 8 3 43 M (2)
2 1 70.3 6 7 4 61 H (3)
3 1 62.9 7 5 6 52 H (3)
4 1 48.5 7 5 5 36 L (1)
5 1 52.7 6 6 4 55 H (3)
6 1 75.0 8 7 5 68 H (3)
7 1 46.2 5 3 3 62 M (2)
8 1 57.0 2 4 6 51 M (2)
9 1 64.1 7 5 4 57 H (3)
10 1 68.1 7 6 5 45 H (3)
11 1 73.4 6 7 5 44 H (3)
12 1 71.9 5 8 4 64 H (3)
13 1 56.2 1 8 6 54 M (2)
14 1 49.3 4 2 3 56 H (3)
15 1 62.0 5 6 2 58 H (3)
Information on Resort Visits: Analysis Sample (2 of 2)

[Continued]
No. | Resort Visit | Annual Family Income ($000) | Attitude Toward Travel | Importance Attached to Family Vacation | Household Size | Age of Head of Household | Amount Spent on Family Vacation
16 2 32.1 5 4 3 58 L (1)
17 2 36.2 4 3 2 55 L (1)
18 2 43.2 2 5 2 57 M (2)
19 2 50.4 5 2 4 37 M (2)
20 2 44.1 6 6 3 42 M (2)
21 2 38.3 6 6 2 45 L (1)
22 2 55.0 1 2 2 57 M (2)
23 2 46.1 3 5 3 51 L (1)
24 2 35.0 6 4 5 64 L (1)
25 2 37.3 2 7 4 54 L (1)
26 2 41.8 5 1 3 56 M (2)
27 2 57.0 8 3 2 36 M (2)
28 2 33.4 6 8 2 50 L (1)
29 2 37.5 3 2 3 48 L (1)
30 2 41.3 3 3 2 42 L (1)
Information on Resort Visits: Holdout Sample

No. | Resort Visit | Annual Family Income ($000) | Attitude Toward Travel | Importance Attached to Family Vacation | Household Size | Age of Head of Household | Amount Spent on Family Vacation
1 1 50.8 4 7 3 45 M (2)
2 1 63.6 7 4 7 55 H (3)
3 1 54.0 6 7 4 58 M (2)
4 1 45.0 5 4 3 60 M (2)
5 1 68.0 6 6 6 46 H (3)
6 1 62.1 5 6 3 56 H (3)
7 2 35.0 4 3 4 54 L (1)
8 2 49.6 5 3 5 39 L (1)
9 2 39.4 6 5 3 44 H (3)
10 2 37.0 2 6 5 51 L (1)
11 2 54.5 7 3 3 37 M (2)
12 2 38.2 2 2 3 49 L (1)
Conducting Discriminant Analysis to Estimate the
Discriminant Function Coefficients
• The direct method involves estimating the discriminant function so that all the
predictors are included simultaneously.

• In stepwise discriminant analysis, the predictor variables are entered
sequentially, based on their ability to discriminate among groups.
Results of Two-Group Discriminant Analysis (1 of 6)

Group Means
Visit     Income     Travel     Vacation   Hsize      Age
1         60.52000   5.40000    5.80000    4.33333    53.73333
2         41.91333   4.33333    4.06667    2.80000    50.13333
Total     51.21667   4.86667    4.93333    3.56667    51.93333

Group Standard Deviations
Visit     Income     Travel     Vacation   Hsize      Age
1         9.83065    1.91982    1.82052    1.23443    8.77062
2         7.55115    1.95180    2.05171    0.94112    8.27101
Total     12.79523   1.97804    2.09981    1.33089    8.57395

Pooled Within-Groups Correlation Matrix
          Income     Travel     Vacation   Hsize      Age
INCOME    1.00000
TRAVEL    0.19745    1.00000
VACATION  0.09148    0.08434    1.00000
HSIZE     0.08887    −0.01681   0.07046    1.00000
AGE       −0.01431   −0.19709   0.01742    −0.04301   1.00000
Results of Two-Group Discriminant Analysis (2 of 6)

[Continued]
Wilks’ λ (U-statistic) and univariate F ratio with 1 and 28 degrees of freedom
Variable   Wilks’ λ   F       Significance
INCOME     0.45310    33.80   0.0000
TRAVEL     0.92479    2.277   0.1425
VACATION   0.82377    5.990   0.0209
HSIZE      0.65672    14.64   0.0007
AGE        0.95441    1.338   0.2572

Canonical Discriminant Functions
Function  Eigenvalue  Percent of Variance  Cumulative Percent  Canonical Correlation  After Function  Wilks’ λ  Chi-Square  df  Sig.
                                                                                      0               0.3589    26.130      5   0.0001
1*        1.7862      100.00               100.00              0.8007
*Marks the one canonical discriminant function remaining in the analysis.
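The entries in the canonical discriminant functions table are tied together by a few standard identities, sketched below. The chi-square line uses Bartlett's approximation; that this is how the reported value was produced is an assumption, although the numbers agree closely.

```python
import math

# Values taken from the two-group results in this deck.
eigenvalue = 1.7862
n_cases = 30        # analysis sample size
n_predictors = 5
n_groups = 2

# Canonical correlation and Wilks' lambda implied by the eigenvalue.
canonical_corr = math.sqrt(eigenvalue / (1 + eigenvalue))
wilks_lambda = 1 / (1 + eigenvalue)

# Bartlett's chi-square approximation (assumed to match the reported test).
chi_square = -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(wilks_lambda)
degrees_of_freedom = n_predictors * (n_groups - 1)

print(round(canonical_corr, 4), round(wilks_lambda, 4),
      round(chi_square, 2), degrees_of_freedom)
```

With only one function, the squared canonical correlation (about 0.64) can be read as the share of variance in the discriminant scores associated with group membership.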
Results of Two-Group Discriminant Analysis (3 of 6)

[Continued]
Standardized Canonical Discriminant Function Coefficients
           Func 1
INCOME     0.74301
TRAVEL     0.09611
VACATION   0.23329
HSIZE      0.46911
AGE        0.20922

Structure Matrix
Pooled within-groups correlations between discriminating variables and canonical
discriminant functions (variables ordered by size of correlation within function).
           Func 1
INCOME     0.82202
HSIZE      0.54096
VACATION   0.34607
TRAVEL     0.21337
AGE        0.16354
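One way to see how standardized and unstandardized coefficients relate: multiplying each unstandardized coefficient by the pooled within-group standard deviation of its predictor appears to reproduce the standardized values reported in this deck. The sketch below checks that with the group standard deviations and unstandardized coefficients from the two-group results; the dictionary and function names are illustrative.

```python
import math

# (SD in visit group 1, SD in visit group 2), from the Group Standard Deviations table.
group_sd = {
    "INCOME": (9.83065, 7.55115),
    "TRAVEL": (1.91982, 1.95180),
    "VACATION": (1.82052, 2.05171),
    "HSIZE": (1.23443, 0.94112),
    "AGE": (8.77062, 8.27101),
}
# Unstandardized canonical discriminant function coefficients from the deck.
unstandardized = {
    "INCOME": 0.08476710,
    "TRAVEL": 0.04964455,
    "VACATION": 0.1202813,
    "HSIZE": 0.4273893,
    "AGE": 0.02454380,
}
n1 = n2 = 15  # cases per group in the analysis sample


def pooled_sd(sd1, sd2, size1, size2):
    """Square root of the pooled within-group variance."""
    pooled_var = ((size1 - 1) * sd1 ** 2 + (size2 - 1) * sd2 ** 2) / (size1 + size2 - 2)
    return math.sqrt(pooled_var)


standardized = {
    name: coef * pooled_sd(*group_sd[name], n1, n2)
    for name, coef in unstandardized.items()
}
for name, value in standardized.items():
    print(f"{name:<9}{value:.5f}")
```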
Results of Two-Group Discriminant Analysis (4 of 6)

[Continued]
Unstandardized Canonical Discriminant Function Coefficients
             Func 1
INCOME       0.8476710E-01
TRAVEL       0.4964455E-01
VACATION     0.1202813
HSIZE        0.4273893
AGE          0.2454380E-01
(constant)   −7.975476

Canonical Discriminant Functions Evaluated at Group Means (Group Centroids)
Group   Func 1
1       1.29118
2       −1.29118
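The group centroids can be recovered by evaluating the unstandardized discriminant function at each group's means. The following is a small sketch using the coefficients and group means reported in this deck; the function name `discriminant_score` is illustrative.

```python
# Unstandardized coefficients and constant from the two-group results.
coefficients = {
    "INCOME": 0.08476710,
    "TRAVEL": 0.04964455,
    "VACATION": 0.1202813,
    "HSIZE": 0.4273893,
    "AGE": 0.02454380,
}
constant = -7.975476

# Group means from the Group Means table (visit = 1 and visit = 2).
group_means = {
    1: {"INCOME": 60.52000, "TRAVEL": 5.40000, "VACATION": 5.80000,
        "HSIZE": 4.33333, "AGE": 53.73333},
    2: {"INCOME": 41.91333, "TRAVEL": 4.33333, "VACATION": 4.06667,
        "HSIZE": 2.80000, "AGE": 50.13333},
}


def discriminant_score(case):
    """D = b0 + b1*X1 + ... + bk*Xk for one case (a dict of predictor values)."""
    return constant + sum(coefficients[name] * case[name] for name in coefficients)


centroids = {group: discriminant_score(means) for group, means in group_means.items()}
for group, score in centroids.items():
    print(group, round(score, 5))
```

The same function scores individual cases. With equal group sizes, a natural but assumed decision rule assigns a case to group 1 when its score is above the midpoint of the two centroids (here 0) and to group 2 otherwise.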
Results of Two-Group Discriminant Analysis (5 of 6)

[Continued]

Classification Results
                                 Predicted Group Membership
                         Visit   1       2       Total
Original         Count   1       12      3       15
                         2       0       15      15
                 %       1       80.0    20.0    100.0
                         2       0.0     100.0   100.0
Cross-validated  Count   1       11      4       15
                         2       2       13      15
                 %       1       73.3    26.7    100.0
                         2       13.3    86.7    100.0
a. Cross-validation is done only for those cases in the analysis. In cross-validation, each
case is classified by the functions derived from all cases other than that case.
b. 90.0% of original grouped cases correctly classified.
c. 80.0% of cross-validated grouped cases correctly classified.
Results of Two-Group Discriminant Analysis (6 of 6)

[Continued]
Classification Results for Cases Not Selected for Use in the Analysis
(Holdout Sample)
                              Predicted Group Membership
Actual Group   No. of Cases   1        2
Group 1        6              4        2
                              66.7%    33.3%
Group 2        6              0        6
                              0.0%     100.0%

Percent of grouped cases correctly classified: 83.33%.


Conducting Discriminant Analysis to Determine the Significance of
Discriminant Function
• The null hypothesis that, in the population, the means of all discriminant
functions in all groups are equal can be statistically tested.
• If the null hypothesis is rejected, indicating significant discrimination, one can
proceed to interpret the results.
Conducting Discriminant Analysis
Interpret the Results
• The interpretation of the discriminant weights, or coefficients, is similar
to that in multiple regression analysis.
• Given the multicollinearity in the predictor variables, there is no
unambiguous measure of the relative importance of the predictors in
discriminating between the groups.
• With this caveat in mind, we can obtain some idea of the relative
importance of the variables by examining the absolute magnitude of
the standardized discriminant function coefficients.
• Some idea of the relative importance of the predictors can also be
obtained by examining the structure correlations, also called canonical
loadings or discriminant loadings. These simple correlations between
each predictor and the discriminant function represent the variance
that the predictor shares with the function.
• Another aid to interpreting discriminant analysis results is to develop a
characteristic profile for each group by describing each group in
terms of the group means for the predictor variables.
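The two rankings described above can be produced directly from the reported two-group results. This is a minimal sketch; the values are copied from the standardized coefficients and structure matrix in this deck, and `rank_by_magnitude` is an illustrative helper name.

```python
# Standardized coefficients and structure correlations, two-group analysis.
standardized = {"INCOME": 0.74301, "TRAVEL": 0.09611, "VACATION": 0.23329,
                "HSIZE": 0.46911, "AGE": 0.20922}
structure = {"INCOME": 0.82202, "HSIZE": 0.54096, "VACATION": 0.34607,
             "TRAVEL": 0.21337, "AGE": 0.16354}


def rank_by_magnitude(values):
    """Predictor names ordered from largest to smallest absolute value."""
    return sorted(values, key=lambda name: abs(values[name]), reverse=True)


rank_standardized = rank_by_magnitude(standardized)
rank_structure = rank_by_magnitude(structure)
print(rank_standardized)
print(rank_structure)
```

The two orderings agree on the top three predictors but swap TRAVEL and AGE, which illustrates why neither measure gives an unambiguous ranking when predictors are correlated.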
Conducting Discriminant Analysis
Assess Validity of Discriminant Analysis
• Many computer programs, such as SPSS, offer a leave-one-out cross-validation option.
• The discriminant weights, estimated by using the analysis sample, are multiplied by the values
of the predictor variables in the holdout sample to generate discriminant scores for the cases in
the holdout sample. The cases are then assigned to groups based on their discriminant scores
and an appropriate decision rule. The hit ratio, or the percentage of cases correctly classified,
can then be determined by summing the diagonal elements and dividing by the total number of
cases.
• It is helpful to compare the percentage of cases correctly classified by discriminant analysis to
the percentage that would be obtained by chance. Classification accuracy achieved by
discriminant analysis should be at least 25% greater than that obtained by chance.
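The hit ratio and the comparison with chance can be sketched as follows, using the two-group holdout classification matrix from this deck. Two assumptions are made here: chance accuracy is taken as the proportional chance criterion (the sum of squared group proportions), and "25% greater than chance" is read as 1.25 times the chance accuracy.

```python
# Rows are actual groups, columns are predicted groups (holdout sample).
holdout_matrix = [
    [4, 2],
    [0, 6],
]


def hit_ratio(matrix):
    """Share of cases on the diagonal of the classification matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def proportional_chance(matrix):
    """Chance accuracy as the sum of squared actual-group proportions (assumed criterion)."""
    total = sum(sum(row) for row in matrix)
    return sum((sum(row) / total) ** 2 for row in matrix)


ratio = hit_ratio(holdout_matrix)
chance = proportional_chance(holdout_matrix)
threshold = 1.25 * chance  # assumed reading of "25% greater than chance"
print(f"hit ratio {ratio:.4f}, chance {chance:.4f}, threshold {threshold:.4f}")
```

With two equal groups of six, chance accuracy is 0.50, so the holdout hit ratio of about 0.83 clears the 0.625 threshold under this reading.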
Results of Three-Group Discriminant Analysis (1 of 6)

Group Means
Amount    Income     Travel     Vacation   Hsize      Age
1         38.57000   4.50000    4.70000    3.10000    50.30000
2         50.11000   4.00000    4.20000    3.40000    49.50000
3         64.97000   6.10000    5.90000    4.20000    56.00000
Total     51.21667   4.86667    4.93333    3.56667    51.93333

Group Standard Deviations
Amount    Income     Travel     Vacation   Hsize      Age
1         5.29718    1.71594    1.88856    1.19722    8.09732
2         6.00231    2.35702    2.48551    1.50555    9.25263
3         8.61434    1.19722    1.66333    1.13529    7.60117
Total     12.79523   1.97804    2.09981    1.33089    8.57395

Pooled Within-Groups Correlation Matrix
          Income     Travel     Vacation   Hsize      Age
INCOME    1.00000
TRAVEL    0.05120    1.00000
VACATION  0.30681    0.03588    1.00000
HSIZE     0.38050    0.00474    0.22080    1.00000
AGE       −0.20939   −0.34022   −0.01326   −0.02512   1.00000
Results of Three-Group Discriminant Analysis (2 of 6)

[Continued]
Wilks’ λ (U-statistic) and univariate F ratio with 2 and 27 degrees of
freedom.
Variable   Wilks’ λ   F        Significance
INCOME     0.26215    38.000   0.0000
TRAVEL     0.78790    3.634    0.0400
VACATION   0.88060    1.830    0.1797
HSIZE      0.87411    1.944    0.1626
AGE        0.88214    1.804    0.1840

Canonical Discriminant Functions
Fcn  Eigenvalue  % of Variance  Cum Pct  Canonical Corr  After Fcn  Wilks’ λ  Chi-square  df  Sig.
                                                         0          0.1664    44.831      10  0.00
1*   3.8190      93.93          93.93    0.8902          1          0.8020    5.517       4   0.24
2*   0.2469      6.07           100.00   0.4450
*Marks the two canonical discriminant functions remaining in the analysis.
Results of Three-Group Discriminant Analysis (3 of 6)

[Continued]
Standardized Canonical Discriminant Function Coefficients
           Func 1     Func 2
INCOME     1.04740    −0.42076
TRAVEL     0.33991    0.76851
VACATION   −0.14198   0.53354
HSIZE      −0.16317   0.12932
AGE        0.49474    0.52447

Structure Matrix
Pooled within-groups correlations between discriminating variables and canonical
discriminant functions (variables ordered by size of correlation within function).
Results of Three-Group Discriminant Analysis (4 of 6)

[Continued]
           Func 1     Func 2
INCOME     0.85556*   −0.27833
HSIZE      0.19319*   0.07749
VACATION   0.21935    0.58829*
TRAVEL     0.14899    0.45362*
AGE        0.16576    0.34079*

Unstandardized Canonical Discriminant Function Coefficients
             Func 1           Func 2
INCOME       0.1542658        −0.6197148E-01
TRAVEL       0.1867977        0.4223430
VACATION     −0.6952264E-01   0.2612652
HSIZE        −0.1265334       0.1002796
AGE          0.5928055E-01    0.6284206E-01
(constant)   −11.09442        −3.791600

Canonical Discriminant Functions Evaluated at Group Means (Group Centroids)
Results of Three-Group Discriminant Analysis (5 of 6)

[Continued]
Group   Func 1     Func 2
1       −2.04100   0.41847
2       −0.40479   −0.65867
3       2.44578    0.24020

Classification Results
                                  Predicted Group Membership
                         Amount   1       2       3       Total
Original         Count   1        9       1       0       10
                         2        1       9       0       10
                         3        0       2       8       10
                 %       1        90.0    10.0    0.0     100.0
                         2        10.0    90.0    0.0     100.0
                         3        0.0     20.0    80.0    100.0
Cross-validated  Count   1        7       3       0       10
                         2        4       5       1       10
                         3        0       2       8       10
                 %       1        70.0    30.0    0.0     100.0
                         2        40.0    50.0    10.0    100.0
                         3        0.0     20.0    80.0    100.0
Results of Three-Group Discriminant Analysis (6 of 6)

[Continued]
a. Cross-validation is done only for those cases in the analysis. In cross-validation, each
case is classified by the functions derived from all cases other than that case.
b. 86.7% of original grouped cases correctly classified.
c. 66.7% of cross-validated grouped cases correctly classified.

Classification Results for Cases Not Selected for Use in the Analysis
                              Predicted Group Membership
Actual Group   No. of Cases   1       2       3
Group 1        4              3       1       0
                              75.0%   25.0%   0.0%
Group 2        4              0       3       1
                              0.0%    75.0%   25.0%
Group 3        4              1       0       3
                              25.0%   0.0%    75.0%

Percent of grouped cases correctly classified: 75.0%


All-Groups Scattergram
[Figure: all-groups scattergram]
Territorial Map
[Figure: territorial map]
Stepwise Discriminant Analysis (1 of 2)
• Stepwise discriminant analysis is analogous to stepwise multiple
regression (see Chapter 17) in that the predictors are entered
sequentially based on their ability to discriminate between the
groups.
• An F ratio is calculated for each predictor by conducting a
univariate analysis of variance in which the groups are treated
as the categorical variable and the predictor as the criterion
variable.
• The predictor with the highest F ratio is the first to be selected
for inclusion in the discriminant function, if it meets certain
significance and tolerance criteria.
• A second predictor is added based on the highest adjusted or
partial F ratio, taking into account the predictor already selected.
Stepwise Discriminant Analysis (2 of 2)
• Each predictor selected is tested for retention based on its association with other
predictors selected.
• The process of selection and retention is continued until all predictors meeting the
significance criteria for inclusion and retention have been entered in the discriminant
function.
• The selection of the stepwise procedure is based on the optimizing criterion adopted.
The Mahalanobis procedure is based on maximizing a generalized measure of the
distance between the two closest groups.
• The order in which the variables were selected also indicates their importance in
discriminating between the groups.
The Logit Model
• The dependent variable is binary and there are several independent variables
that are metric.
• The binary logit model commonly deals with the issue of how likely an
observation is to belong to each group.
• It estimates the probability of an observation belonging to a particular group.
Conducting Binary Logit Analysis
Binary Logit Model Formulation
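The probability of success may be modeled using the logit
model as:

$$\log_e\left(\frac{P}{1-P}\right) = a_0 + a_1 X_1 + a_2 X_2 + \cdots + a_k X_k$$

Or

$$\log_e\left(\frac{P}{1-P}\right) = \sum_{i=0}^{k} a_i X_i$$

Model Formulation

$$P = \frac{\exp\left(\sum_{i=0}^{k} a_i X_i\right)}{1 + \exp\left(\sum_{i=0}^{k} a_i X_i\right)}$$

Where
P = Probability of success
Xi = Independent variable i (with X0 = 1, so that a0 is the intercept)
ai = Parameter to be estimated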
Properties of the Logit Model
• Although Xi may vary from −∞ to +∞, P is constrained to lie between 0 and 1.

• When Xi approaches −∞ and its coefficient ai is positive, P approaches 0.

• When Xi approaches +∞ and its coefficient ai is positive, P approaches 1. The two limits swap when ai is negative.

• When OLS regression is used, P is not constrained to lie between 0 and 1.
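These properties can be checked numerically with a short sketch of the logit probability. The coefficients below (intercept 0, slope 1) are hypothetical values chosen only to show the shape of the curve; they are not estimates from the deck.

```python
import math


def logit_probability(coefficients, predictors):
    """P = exp(z) / (1 + exp(z)), where z = a0 + a1*X1 + ... + ak*Xk."""
    intercept, *slopes = coefficients
    z = intercept + sum(a * x for a, x in zip(slopes, predictors))
    # Two algebraically equal forms, chosen to avoid overflow for large |z|.
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1 + exp_z)


hypothetical = (0.0, 1.0)  # a0 = 0, a1 = 1 (illustrative only)
x_values = (-10, -1, 0, 1, 10)
probs = [logit_probability(hypothetical, [x]) for x in x_values]
for x, p in zip(x_values, probs):
    print(x, round(p, 5))
```

However extreme the predictor value, the probability stays strictly between 0 and 1, which a linear OLS fit cannot guarantee.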


Estimation and Model Fit
• The estimation procedure is called the maximum likelihood method.
• Fit: Cox & Snell R Square and Nagelkerke R Square.
• Both these measures are similar to R² in multiple regression.
• The Cox & Snell R Square cannot equal 1.0, even if the fit is perfect.
• This limitation is overcome by the Nagelkerke R Square.
• Compare predicted and actual values of Y to determine the percentage of
correct predictions.
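As a rough check on the two fit measures, the sketch below applies the standard Cox & Snell and Nagelkerke formulas to the brand loyalty example in this deck (30 respondents, 15 loyal, model −2 log likelihood of 23.471). The intercept-only −2 log likelihood is not reported in the slides; here it is derived from the 15/15 split, which is an assumption about how the baseline was fitted.

```python
import math

n = 30                 # respondents in the brand loyalty data
n_loyal = 15           # respondents coded Loyalty = 1
neg2ll_model = 23.471  # -2 log likelihood reported in the Model Summary

# -2 log likelihood of the intercept-only model (derived, not reported).
p = n_loyal / n
neg2ll_null = -2 * (n_loyal * math.log(p) + (n - n_loyal) * math.log(1 - p))

# Cox & Snell R Square and its maximum attainable value.
cox_snell = 1 - math.exp(-(neg2ll_null - neg2ll_model) / n)
max_cox_snell = 1 - math.exp(-neg2ll_null / n)

# Nagelkerke rescales Cox & Snell so that a perfect fit gives 1.0.
nagelkerke = cox_snell / max_cox_snell
print(round(cox_snell, 3), round(max_cox_snell, 3), round(nagelkerke, 3))
```

The maximum Cox & Snell value here is 0.75, which is why that measure cannot reach 1.0 and why the Nagelkerke version divides by it.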
Significance Testing
The significance of the estimated coefficients is based on Wald’s statistic.
$$\text{Wald} = \left(\frac{a_i}{SE_{a_i}}\right)^2$$
Where,
ai = logistic coefficient for that predictor variable
SEai = standard error of the logistic coefficient
The Wald statistic is chi-square distributed with 1 degree of freedom if the
variable is metric, and with degrees of freedom equal to the number of
categories minus 1 if the variable is nonmetric.
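For a metric predictor, the Wald statistic and its significance can be sketched with the standard library alone, because a chi-square variable with 1 degree of freedom is the square of a standard normal. The inputs are the Brand coefficient and standard error reported in the logistic regression results in this deck; `wald_test` is an illustrative name.

```python
import math


def wald_test(coefficient, standard_error):
    """Wald statistic and upper-tail p-value for a chi-square with 1 df."""
    wald = (coefficient / standard_error) ** 2
    # For 1 df: P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2))
    return wald, p_value


brand_wald, brand_p = wald_test(1.274, 0.479)
print(round(brand_wald, 3), round(brand_p, 3))
```

Small differences from the reported Wald value are expected, since the published coefficient and standard error are rounded to three decimals.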
Interpretation of Coefficients
• If Xi is increased by one unit, the log odds will change by ai units, when the
effect of other independent variables is held constant. Equivalently, the odds
are multiplied by exp(ai).

• The sign of ai will determine whether the probability increases (if the sign is
positive) or decreases (if the sign is negative). The size of the change in
probability depends on the starting values of the predictors.
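The link between a coefficient and the odds can be made concrete by exponentiating it. This sketch uses the three slope estimates from the logistic regression results in this deck and should reproduce the Exp(B) column up to rounding.

```python
import math

# Logit coefficients (B) from the Variables in the Equation table.
coefficients = {"Brand": 1.274, "Product": 0.186, "Shopping": 0.590}

# exp(B) is the multiplicative change in the odds for a one-unit increase.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}
for name, ratio in odds_ratios.items():
    print(f"{name:<9}{ratio:.3f}")
```

For example, a one-point increase in Brand multiplies the odds of loyalty by roughly 3.6, holding Product and Shopping constant.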
Explaining Brand Loyalty (1 of 2)
No. Loyalty Brand Product Shopping
1 1 4 3 5
2 1 6 4 4
3 1 5 2 4
4 1 7 5 5
5 1 6 3 4
6 1 3 4 5
7 1 5 5 5
8 1 5 4 2
9 1 7 5 4
10 1 7 6 4
11 1 6 7 2
12 1 5 6 4
13 1 7 3 3
14 1 5 1 4
15 1 7 5 5
Explaining Brand Loyalty (2 of 2)
[Continued]
No. Loyalty Brand Product Shopping
16 0 3 1 3
17 0 4 6 2
18 0 2 5 2
19 0 5 2 4
20 0 4 1 3
21 0 3 3 4
22 0 3 4 5
23 0 3 6 3
24 0 4 4 2
25 0 6 3 6
26 0 3 6 3
27 0 4 3 2
28 0 3 5 2
29 0 5 5 3
30 0 1 3 2
Results of Logistic Regression (1 of 2)
Results of Binary Logit Model or Logistic Regression

Dependent Variable Encoding


Original Value Internal Value
Not loyal 0
Loyal 1

Model Summary
Step   −2 Log Likelihood   Cox & Snell R Square   Nagelkerke R Square
1      23.471 (a)          .453                   .604
a. Estimation terminated at iteration number 6 because parameter estimates changed by
less than .001.
Results of Logistic Regression (2 of 2)
[Continued]
Classification Table (a)
                                            Predicted
                                            Loyalty to the Brand
         Observed                           Not Loyal   Loyal   Percentage Correct
Step 1   Loyalty to the brand   Not loyal   12          3       80.0
                                Loyal       3           12      80.0
         Overall percentage                                     80.0
a. The cut value is .500

Variables in the Equation
                        B        S.E.    Wald    df   Sig.   Exp(B)
Step 1 (a)   Brand      1.274    .479    7.075   1    .008   3.575
             Product    .186     .322    .335    1    .563   1.205
             Shopping   .590     .491    1.442   1    .230   1.804
             Constant   −8.642   3.346   6.672   1    .010   .000
a. Variable(s) entered on step 1: Brand, Product, Shopping.
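To show how the estimated equation turns into a classification, the sketch below computes the predicted probability of loyalty for two respondents from the brand loyalty data (No. 4 and No. 30) and applies the .500 cut value. It relies on the rounded coefficients printed in the table, so borderline cases could come out differently from the software output.

```python
import math

# Rounded estimates from the Variables in the Equation table.
constant = -8.642
coefficients = {"Brand": 1.274, "Product": 0.186, "Shopping": 0.590}
cut_value = 0.5


def predicted_probability(case):
    """Probability of Loyalty = 1 from the fitted binary logit model."""
    z = constant + sum(coefficients[name] * case[name] for name in coefficients)
    return 1 / (1 + math.exp(-z))


respondent_4 = {"Brand": 7, "Product": 5, "Shopping": 5}   # observed Loyalty = 1
respondent_30 = {"Brand": 1, "Product": 3, "Shopping": 2}  # observed Loyalty = 0

p4 = predicted_probability(respondent_4)
p30 = predicted_probability(respondent_30)
predicted_4 = int(p4 >= cut_value)
predicted_30 = int(p30 >= cut_value)
print(round(p4, 4), predicted_4)
print(round(p30, 4), predicted_30)
```

Both respondents are classified into their observed groups, which corresponds to two of the 24 correct cells in the classification table.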
SAS Enterprise Guide
Both two-group and multiple discriminant analysis can be performed using the
Discriminant Analysis task within SAS Enterprise Guide. To select this task, click:

Analyze > Multivariate > Discriminant Analysis

To run logit analysis or logistic regression using SAS Enterprise Guide, click:

Analyze > Regression > Logistic Regression


SAS Enterprise Guide: Two-Group Discriminant (1 of 2)
1. Open SAS Table_18_2 using SAS Enterprise Guide.
2. Select ANALYZE from the menu bar.
3. Click MULTIVARIATE and then DISCRIMINANT ANALYSIS.
4. Select VISIT and move it to the CLASSIFICATION variable task role.
5. Select INCOME, TRAVEL, VACATION, HSIZE, and AGE and move them to the
ANALYSIS variables task role.
6. Click OPTIONS in the box to the left.
7. Select UNIVARIATE test for equality of class means and SUMMARY results of cross
validation classification.
8. Select RESULTS and check SUMMARY STATISTICS AND DISCRIMINANT
FUNCTIONS.
9. Click the PREVIEW CODE button on the bottom left.
SAS Enterprise Guide: Two-Group Discriminant (2 of 2)
10. Check the “Show custom code insertion points” box at the top left.
11. Scroll to CROSSVALIDATE and double-click <insert custom code here> to add code
before the CROSSVALIDATE option.
12. Type CAN and PCORR (or use the pop-up box), then close the code preview box.
13. Click RUN.
SAS Enterprise Guide: Logit Analysis
1. Open SAS Table_18_6 using SAS Enterprise Guide.
2. Select ANALYZE from the menu bar.
3. Click REGRESSION and then LOGISTIC REGRESSION.
4. Select LOYALTY and move it to the DEPENDENT variable
task role.
5. Select BRAND, PRODUCT, and SHOPPING and move them
to the QUANTITATIVE variables task role.
6. Select MODEL in the box to the left, then EFFECTS.
7. Choose BRAND, PRODUCT, and SHOPPING as Main Effects.
8. Select MODEL>OPTIONS and check SHOW
CLASSIFICATION TABLE.
9. Enter 0.5 as the critical probability value.
10. Click RUN.
Thank You
