7B Multiple Regression: Statistical Methods Using IBM SPSS
This chapter will demonstrate how to perform multiple linear regression with IBM SPSS
first using the standard method and then using the stepwise method. We will use the
data file Personality in these demonstrations.
Figure 7b.1 Main Dialog Window for Linear Regression
related measures. We also retained the following defaults: Model fit, which provides
R-square, adjusted R-square, the standard error, and an ANOVA table; R squared change,
which is useful when there are multiple predictors that are being entered in stages so that
we can see where this information is placed in the output; Descriptives, which provides
the means and standard deviations of the variables as well as the correlations table; and
Part and partial correlations, which produces the partial and semipartial correlations
and conveys important information when multiple predictors are used. Clicking
Continue returns us to the main dialog screen.
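Readers who wish to verify these results outside of IBM SPSS can fit the same standard (simultaneous-entry) model in Python with statsmodels. This is only a minimal sketch: the CSV file name is hypothetical (it assumes the Personality data have been exported from SPSS), and the variable names follow the chapter's usage.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Personality data assumed to have been exported from SPSS to CSV
df = pd.read_csv("personality.csv")

# Standard method: all six predictors entered simultaneously
model = smf.ols(
    "esteem ~ negafect + posafect + neoopen + neoextra + neoneuro + tanx",
    data=df,
).fit()

# summary() reports R-squared, adjusted R-squared, the overall F test
# (the ANOVA of Figure 7b.5), and the raw (B) coefficients with their
# t tests; standardize all variables first to obtain the Beta weights.
print(model.summary())
```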
Figure 7b.4 Descriptive Statistics and Correlations Output for Standard Regression
The correlations table is structured in three major rows for each variable: the first contains the Pearson r values, the second contains the probabilities of obtaining those values if the null hypothesis were true, and the third provides the sample size.
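Each cell of that display (r, its probability, and N) can be reproduced for any pair of variables; a minimal sketch assuming the same hypothetical personality.csv export:

```python
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("personality.csv")  # hypothetical export of the data file

# One cell of the correlation matrix: Pearson r, its p value, and N
# (pearsonr reports a two-tailed p; SPSS shows one-tailed probabilities here)
r, p = pearsonr(df["esteem"], df["tanx"])
print(f"r = {r:.3f}, p = {p:.3f}, N = {len(df)}")
```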
The dependent variable esteem is placed by IBM SPSS on the first row and column, and
the other variables appear in the order we entered them into the analysis. The study repre-
sented by our data set was designed for a somewhat different purpose, so our choice of
variables was a bit limited. Thus, the correlations of self-esteem with the predictor variables
in the analysis are higher than we would ordinarily prefer, and many of the predictors are themselves intercorrelated more substantially than is desirable. Nonetheless, the example is still useful for our purposes.
Figure 7b.5 displays the results of the analysis. The middle table shows the test of significance of the model using an ANOVA. There are 419 (N − 1) total degrees of freedom. With six predictors, the Regression effect has 6 degrees of freedom. The Regression effect is statistically significant, indicating that prediction of the dependent variable is accomplished better than can be done by chance.
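With N = 420 cases and k = 6 predictors, the degrees of freedom in that ANOVA table decompose as follows:

\[
df_{\text{Total}} = N - 1 = 419, \qquad df_{\text{Regression}} = k = 6, \qquad df_{\text{Residual}} = N - k - 1 = 413.
\]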
The upper table in Figure 7b.5 labeled Model Summary provides an overview of the
results. Of primary interest are the R Square and Adjusted R Square values, which are .607
and .601, respectively. We learn from these that the weighted combination of the predictor
variables explained approximately 60% of the variance of self-esteem. That so little strength is lost in computing the Adjusted R Square value is primarily due to our relatively large sample size combined with a relatively small set of predictors. Because the standard regression procedure enters all of the predictors into the model simultaneously, R Square Change went from zero before the model was fitted to .607 when the full set of variables was entered.
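The modest shrinkage can be verified with the usual adjustment formula:

\[
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{N - 1}{N - k - 1} = 1 - (1 - .607)\,\frac{419}{413} \approx .601.
\]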
The bottom table in Figure 7b.5 labeled Coefficients provides the details of the results.
The Zero-order column under Correlations lists the Pearson r values of the dependent vari-
able (self-esteem in this case) with each of the predictors. These values are the same as those
shown in the correlation matrix of Figure 7b.4. The Partial column under Correlations lists
the partial correlations for each predictor as it was evaluated for its weighting in the model
(the correlation between the predictor and the dependent variable when the other predictors
are treated as covariates). The Part column under Correlations lists the semipartial correla-
tions for each predictor once the model is finalized; squaring these values informs us of the
percentage of variance each predictor uniquely explains. For example, trait anxiety accounts
uniquely for about 3% of the variance of self-esteem (−.170 × −.170 = .0289, or approximately .03), given the other variables in the model.
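The Partial and Part columns can be reproduced by residualizing, as a check on the interpretation above. The sketch below assumes the same hypothetical DataFrame df with the chapter's variable names; it is not SPSS's own algorithm, just the textbook definitions of the two correlations.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def part_and_partial(df, criterion, predictor, others):
    """Semipartial (part) and partial correlation of `predictor` with
    `criterion`, controlling for the remaining predictors in `others`."""
    X = sm.add_constant(df[others])
    # Residualize the predictor on the other predictors
    pred_resid = sm.OLS(df[predictor], X).fit().resid
    # Part (semipartial) r: raw criterion vs. residualized predictor
    part = np.corrcoef(df[criterion], pred_resid)[0, 1]
    # Partial r: residualize the criterion as well
    crit_resid = sm.OLS(df[criterion], X).fit().resid
    partial = np.corrcoef(crit_resid, pred_resid)[0, 1]
    return part, partial

# Hypothetical usage; part**2 should be roughly .03 for trait anxiety:
# part, partial = part_and_partial(df, "esteem", "tanx",
#     ["negafect", "posafect", "neoopen", "neoextra", "neoneuro"])
```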
The Y intercept of the raw score model is labeled as the Constant and has a value here
of 98.885. Of primary interest here are the raw (B) and standardized (Beta) coefficients, and
their significance levels determined by t tests. With the exception of negative affect and
openness, all of the predictors are statistically significant. As can be seen by examining the beta weights, trait anxiety, followed by neuroticism and then positive affect, made the relatively larger contributions to the prediction model.
The raw regression coefficients are partial regression coefficients because their values
take into account the other predictor variables in the model; they inform us of the pre-
dicted change in the dependent variable for every unit increase in that predictor. For
example, positive affect is associated with a partial regression coefficient of 1.338, which signifies that for every additional point on the positive affect measure, we would predict a gain of 1.338 points on the self-esteem measure. As another example, neuroticism is associated with a partial regression coefficient of −.477, which signifies that for every additional point on the neuroticism measure, we would predict a decrement of .477 points on the self-esteem measure.
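Writing out the raw-score model makes these interpretations concrete. Only the constant and the two coefficients quoted above are shown; the remaining terms (indicated by the ellipsis) are read from the Coefficients table in Figure 7b.5:

\[
\widehat{\text{esteem}} = 98.885 + 1.338\,(\text{posafect}) - .477\,(\text{neoneuro}) + \cdots
\]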
This example serves to illustrate two important related points about multiple
regression analysis. First, it is the model as a whole that is the focus of the analysis.
Variables are treated akin to team players weighted in such a way that the sum of the
squared residuals of the model is minimized. Thus, it is the set of variables in this par-
ticular (weighted) configuration that maximizes prediction—swap out one of these
predictors for a new variable and the whole configuration that represents the best pre-
diction can be quite different.
The second important point about regression analysis that this example illustrates,
which is related to the first, is that a highly predictive variable can be “left out in the cold,”
being “sacrificed” for the “good of the model.” Note that negative affect correlates rather
substantially with self-esteem (r = −.572), and if it were the only predictor, it would have a beta weight of −.572 (recall that in simple linear regression the Pearson r is the beta weight of the predictor), yet in combination with the other predictors it is not a significant predictor in the
multiple regression model. The reason is that its predictive work is being accomplished by
one or more of the other variables in the analysis. But the point is that a modest weight in the model, or a statistically nonsignificant contribution to the prediction, is not by itself a reason to presume that the variable is a poor predictor.
It is also important to note that the IBM SPSS output does not contain the structure
coefficients. These are the correlations of the predictors in the model with the overall predic-
tor variate, and these structure coefficients help researchers interpret the dimension under-
lying the predictor model (see Section 7A.11). They are easy enough to calculate by hand
(the Pearson correlation between the predictor and the criterion variable divided by the
multiple correlation), and we incorporate these structure coefficients into our report of the
results in Section 7B.1.5.
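As an illustration of that hand calculation, take trait anxiety: its correlation with self-esteem is −.724 (see Section 7B.2), and the multiple correlation of the standard model is R = √.607 ≈ .779, so

\[
\text{structure coefficient} = \frac{r}{R} = \frac{-.724}{.779} \approx -.93.
\]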
Table 7b.1 Correlations of the Variables in the Analysis (N = 420)
We now demonstrate a stepwise analysis on the same set of variables that we used in our standard regression analysis in Section 7B.1. We will use the data file Personality in these demonstrations. In the
process of our description, we will point out areas of similarity and difference between the
standard and step methods.
7B.2.1 Main Regression Dialog Window
Select the path Analyze → Regression → Linear. This brings us to the Linear Regression main dialog window displayed in Figure 7b.6. From the variables list panel, we click over esteem to the Dependent panel and negafect, posafect, neoopen, neoextra, neoneuro, and tanx to the Independent(s) panel. The Method drop-down menu contains the set of step methods that IBM SPSS can run. The only one you may not recognize is Remove, which allows a set of variables to be removed from the model together. Choose Stepwise as the Method from the drop-down menu as shown in Figure 7b.6.
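By default, SPSS's Stepwise method enters, at each step, the candidate whose entry would be most statistically significant (probability of F to enter ≤ .05) and removes any entered variable whose probability of F to remove rises to ≥ .10. The following is a minimal sketch of that logic in Python, assuming the same hypothetical df used earlier; statsmodels is not SPSS, so small numerical differences are possible.

```python
import pandas as pd
import statsmodels.api as sm

def stepwise(df, criterion, candidates, p_enter=0.05, p_remove=0.10):
    """Forward entry with backward removal, mimicking SPSS's
    default Stepwise entry/removal probability criteria."""
    selected = []
    while True:
        # Find the remaining candidate with the smallest p value
        remaining = [c for c in candidates if c not in selected]
        pvals = {}
        for c in remaining:
            X = sm.add_constant(df[selected + [c]])
            pvals[c] = sm.OLS(df[criterion], X).fit().pvalues[c]
        best = min(pvals, key=pvals.get) if pvals else None
        if best is None or pvals[best] > p_enter:
            break  # nothing qualifies for entry; the procedure ends
        selected.append(best)
        # Check whether any entered variable should now be removed
        fit = sm.OLS(df[criterion], sm.add_constant(df[selected])).fit()
        worst = fit.pvalues.drop("const").idxmax()
        if fit.pvalues[worst] >= p_remove:
            selected.remove(worst)
    return selected

# Hypothetical usage; per the chapter, this should select tanx,
# posafect, neoneuro, and neoextra, in that order:
# stepwise(df, "esteem",
#     ["negafect", "posafect", "neoopen", "neoextra", "neoneuro", "tanx"])
```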
Figure 7b.7 The Linear Regression Statistics Window

Figure 7b.8 The Linear Regression Options Window
Figure 7b.9 Tests of Significance for Each Step in the Regression Analysis
Each successive model in Figure 7b.9 can be identified by the degrees of freedom of its Regression effect (which equal the number of predictors in the model). We can also deduce that no variables were removed
from the model, since the count of predictors in the model steadily increases from 1 to 4.
This latter deduction is verified by the display shown in Figure 7b.10, which tracks vari-
ables that have been entered and removed at each step. As can be seen, trait anxiety, positive
affect, neuroticism, and extraversion have been entered on Steps 1 through 4, respectively,
without any variables having been removed on any step.
Figure 7b.11, the Model Summary, presents the R Square and Adjusted R Square values
for each step along with the amount of R Square Change. In the first step, as can be seen
from the footnote beneath the Model Summary table, trait anxiety was entered into the
model. The R Square with that predictor in the model was .525. Not coincidentally, that is the square of the correlation between trait anxiety and self-esteem (.724² = .525), and this is also the value of R Square Change.
On the second step, positive affect was added to the model. The R Square with both
predictors in the model was .566; thus, we gained .041 in the value of R Square (.566 –
.525 = .041), and this is reflected in the R Square Change for that step. By the time we
arrive at the end of the fourth step, our R Square value has reached .603. Note that this
value is very close to but not identical to the R² value we obtained under the standard method.

Figure 7b.10 Variables That Were Entered and Removed
The Coefficients table in Figure 7b.12 provides the details of the results. Note that both the raw and standardized regression coefficients are readjusted at each step to reflect the additional variables in the model. Although it is interesting to observe these dynamic changes taking place, we are ordinarily interested in the final model. Note also that the values of the regression coefficients are different from those associated with the same variables in the standard regression analysis. That the differences are not huge is due to the fact that these four variables did almost the same amount of predictive work, in much the same configuration, as the six predictors in the standard method. If economy of model were a consideration, we would probably be quite happy with the trimmed model of four variables replacing the full model containing six.
Figure 7b.13 addresses the
fate of the remaining variables. For each step, IBM SPSS tells us which variables were not
entered. In addition to tests of the statistical significance of each variable, we also see dis-
played the partial correlations. This information together tells us what will happen in the
following step. For example, consider Step 1, which contains the five excluded variables.
Positive affect has the highest partial correlation (.294), and it is statistically significant; thus,
it will be the variable next entered on Step 2. On the second step, with four variables (of the
six) excluded, we see that neuroticism with a statistically significant partial correlation of
−.269 wins the struggle for entry next. By the time we reach the fourth step, no variable in the excluded set has a statistically significant partial correlation for entry at Step 5; thus, the stepwise procedure ends after completing the fourth step.
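That entry decision can be mimicked directly: among the excluded predictors, compute each one's partial correlation with the criterion controlling for the already-entered variables, and enter the (significant) predictor with the largest absolute value. A short sketch under the same assumptions as the earlier snippets:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def next_to_enter(df, criterion, entered, excluded):
    """Rank the excluded predictors by the absolute partial correlation
    with the criterion, controlling for the already-entered predictors."""
    X = sm.add_constant(df[entered])
    crit_resid = sm.OLS(df[criterion], X).fit().resid
    partials = {}
    for c in excluded:
        c_resid = sm.OLS(df[c], X).fit().resid
        partials[c] = np.corrcoef(crit_resid, c_resid)[0, 1]
    return max(partials, key=lambda c: abs(partials[c])), partials

# After Step 1 (tanx entered), this should nominate posafect, whose
# partial correlation the text reports as .294 (the candidate's
# statistical significance still needs checking, as in Figure 7b.13):
# best, partials = next_to_enter(df, "esteem", ["tanx"],
#     ["negafect", "posafect", "neoopen", "neoextra", "neoneuro"])
```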
Figure 7b.13 The Results of the Stepwise Regression Analysis
The prediction model contained four of the six predictors and was reached in four
steps with no variables removed. The model was statistically significant, F(4, 415) =
157.626, p < .001, and accounted for approximately 60% of the variance of self-
esteem (R² = .603, Adjusted R² = .599). Self-esteem was primarily predicted by lower
levels of trait anxiety and neuroticism, and to a lesser extent by higher levels of
positive affect and extraversion. The raw and standardized regression coefficients of
the predictors together with their correlations with self-esteem, their squared semi-
partial correlations, and their structure coefficients are shown in Table 7b.3. Trait
anxiety received the strongest weight in the model followed by neuroticism and
positive affect; extraversion received the lowest of the four weights. With the sizeable
correlations between the predictors, the unique variance explained by each of the variables, indexed by the squared semipartial correlations, was relatively low: trait anxiety, positive affect, neuroticism, and extraversion uniquely accounted for approximately 4%, 2%, 3%, and less than 1% of the variance of self-esteem, respectively. The
latent factor represented by the model appears to be interpretable as well-being.
Inspection of the structure coefficients suggests that trait anxiety and neuroticism were very strong indicators of well-being, positive affect was a relatively strong indicator of well-being, and extraversion was a moderate indicator of well-being.