
Gignac, G. E. (2019). How2statsbook (Online Edition 1). Perth, Australia: Author.

14
Multiple Regression

Contents

Multiple R2 versus Unique Effects
Multiple Regression Equation
Standardized Versus Unstandardized Beta Weights
Multiple R (and Multiple R2)
Multiple Regression – Method Enter: SPSS
Common Mistakes in Interpreting Standardized Beta Weights
Multiple Regression and Semi-Partial Correlations
Multiple Regression Assumptions
Tolerance
Variance Inflation Factor
Dealing with Multi-Collinearity
Sample Size Requirements
Hierarchical Multiple Regression
Hierarchical Multiple Regression: SPSS
Hierarchical Multiple Regression: Comments
Stepwise Multiple Regression
Stepwise Multiple Regression Example: SPSS
Does the total effect between X and Y need to be significant?
Mediation versus Moderation
Advanced Topics
Proving that Multiple R is Really Just a Pearson r
Multiple Regression: Bootstrapping
Multiple R2 Confidence Intervals
Multiple Regression Scatter Plot
Testing the Difference between Two Beta Weights
Koenker Test of Homoscedasticity
Dealing with Heteroscedasticity: Wild Bootstrap
Dealing with Heteroscedasticity: Adjusted Standard Errors
Multiple R2 Increased but p-value Decreased?
Evaluating the Importance of Predictors: Relative Importance Analysis
Suppressor Effects
Cooperative Suppression
Practice Questions
Advanced Practice Questions
References
Multiple R2 versus Unique Effects


In previous chapters, the association between years of education completed and
annual earnings was examined in a number of ways. Based on the Pearson correlation, it was
found that there was a positive association between years of education completed and annual
earnings equal to r = .337, which corresponded exactly to the results reported by Zagorsky
(2007). Thus, 11.4% of the variance in earnings was accounted for by education. Based on a
bivariate regression, it was found that for every additional year of education completed, one
would expect, on average, a person to earn $2,777 more annually. However, it was also found
that a regression equation based solely on number of years of education completed would
yield, on average, prediction errors equal to a whopping $24,086!1 A natural question to ask is
whether it would be possible to increase the predictive accuracy of the regression analysis by
including a second independent variable to the regression equation. For example, Zagorsky
(2007) also found that IQ was related positively to annual earnings (r = .31; r2 = .096). Thus,
perhaps years of education completed and IQ, in combination, may yield a model R2 value
greater than that based on years of education completed alone (i.e., model R2 = .114).
In the context of multiple regression, the model R2 is also known as the multiple R2, because it
is based on two or more predictors. In addition to a multiple R2, multiple regression yields beta
weights unique to each predictor included in the model. The word ‘unique’, in this context,
refers to a unique contribution to the equation; one that is independent of the effects of any
other variable included in the model. IQ and years of education completed were reported by
Zagorsky (2007) to be correlated at r = .57. Thus, higher levels of intelligence were associated
with higher levels of education completed. The fact that IQ and education were correlated
positively with earnings, and IQ and education were correlated with each other, makes it
difficult to estimate the amount of variance in earnings that can be accounted for by IQ and
education combined. That is, it is not simply a process of adding the two r2 values together,
i.e., .114 + .096 = .210. Instead, some of the predictive capacity of years of education
completed and IQ would be expected to be overlapping/redundant. Thus, the two main tasks
of multiple regression are to maximally predict the dependent variable with a regression
equation (model R2), as well as estimate the unique contributions of each predictor variable to
the regression equation.
Essentially, multiple regression combines the attributes of regression and semi-partial
correlation (type A). Semi-partial correlation is relevant to multiple regression, because only
the unique effects associated with the independent variables are included in the multiple
regression equation. This is really important to understand. Multiple regression helps
decompose the common and unique predictive effects of two or more independent variables
on a dependent variable. In some cases, one might have 5 or more hypothesized predictors of
a dependent variable. Multiple regression will help determine which of the 5 hypothesized
predictors contribute unique predictive capacity in predicting the dependent variable. In the
years of education completed, IQ, and annual earnings example, a multiple regression will help
determine whether years of education completed and IQ can both predict annual earnings
statistically significantly and uniquely. If that sounds like a type A semi-partial correlation to
you, you’re on the right track. The main difference is that multiple regression builds a
regression equation that yields beta weights to maximize the prediction of the dependent
variable (i.e., maximize multiple R2); it is not directly concerned with the semi-partial
correlations themselves. So, multiple regression beta weights and semi-partial correlations
(type A) have similarities, but they are also somewhat different, as I describe in further detail
below.

1
65.99*365 = 24,086

Multiple Regression Equation


Mathematically, the outcome of conducting a multiple regression is a regression
equation that predicts maximally the dependent variable. To this effect, matrix algebra is used
to estimate the terms of the regression equation: one intercept and a beta weight for each of
the predictors included in the model. To round out the equation, an error term is also
included, as the vast majority of regression equations are associated with a certain amount of
error (i.e., they do not predict the dependent variable perfectly). Thus, the multiple regression
equation may be specified as follows:

Y = a + b1X1 + b2X2 + ... + bnXn + ε     (1)

Where Y = the predicted value of the dependent variable, a = the intercept, b1 = the beta
weight associated with the first independent variable, b2 = the beta weight associated with the
second independent variable, bn = the beta weight associated with the nth independent
variable, and ε = error. The X1, X2, Xn values correspond to specified values associated with the
independent variables (i.e., particular raw scores). Recall that the purpose of a multiple
regression equation is to predict a particular case’s value on Y. Once the multiple regression
equation has been solved, each case’s predicted value on Y can be estimated based on their
scores on the independent variables and the application of the regression equation.
Consider the example of years of education completed, IQ, and annual earnings.
Suppose someone had 14 years of education completed and an IQ of 112, what would I predict
their annual earnings to be? Suppose someone had 8 years of education completed and an IQ
of 79, what would I predict their annual earnings to be? I can’t answer this until I conduct a
multiple regression, because I need an intercept, and two beta weights: one for years of
education completed and one for IQ. Before I conduct the multiple regression analysis, I’d like
to point out the two different types of beta weights that can be estimated.
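To make the arithmetic of equation (1) concrete, here is a minimal Python sketch that applies a regression equation to one case's raw scores. The intercept and beta weight values below are placeholders only; the actual estimates for the education/IQ example come from the SPSS analysis reported later in the chapter.

# Minimal sketch: applying a multiple regression equation (equation 1).
# The coefficient values below are placeholders, not the estimates from this chapter.

def predict_y(intercept, betas, x_values):
    """Return the predicted Y for one case, given unstandardized coefficients."""
    return intercept + sum(b * x for b, x in zip(betas, x_values))

a = 10.0            # placeholder intercept
b = [2.0, 0.5]      # placeholder unstandardized beta weights for X1 and X2
case = [14, 112]    # raw scores for one case on X1 and X2

print(predict_y(a, b, case))  # 10 + 2*14 + 0.5*112 = 94.0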


Standardized Versus Unstandardized Beta Weights


There are two types of beta weights estimated in a multiple regression: standardized
and unstandardized. Standardized beta weights in multiple regression are associated with the
same interpretation as a standardized beta weight in a bivariate regression: a one standard
deviation increase in the value of X is associated with a change in the value of Y equal to the
beta weight, expressed in standard deviation units. Whether the change in Y is an increase or
decrease will depend on whether the beta weight is positive or negative in direction. Also,
whether the change in Y is large or small will depend on the magnitude of the beta weight. This
interpretation of a standardized beta weight is the same as that described in the bivariate
regression case. However, there is one very important difference to keep in mind: the beta
weights associated with a multiple regression represent unique effects. Recall that in the
bivariate case, the standardized beta weight was equal to the Pearson correlation. In practice,
this no longer holds true in the multiple regression case. Almost invariably, a variable’s beta
weight is smaller than its corresponding Pearson correlation. The beta weight is smaller,
because it represents only the unique effects on the dependent variable not shared with the
other independent variables. So, in the years of education, IQ and annual earnings example,
the years of education and IQ beta weights will definitely be smaller than their corresponding
Pearson correlations, because education and IQ are correlated with each other positively.
Unstandardized beta weights in multiple regression have the same interpretation as
that described in the bivariate regression case, with, again, the important difference that they
represent the unique contributions to the regression equation. Typically, in the behavioral
sciences, only the standardized beta weights are interpreted and reported. The reason is that
many variables in research, particularly the behavioral sciences, do not have a naturally
interpretable scale. However, as I will show below, in this example, it makes a lot of sense to
report both the unstandardized and standardized beta weights, because years of education,
IQ, and annual earnings are measured on a scale that many people understand. One final
aspect of multiple regression I would like to describe, before I conduct the analysis, is multiple
R2.

Multiple R (and Multiple R2)


In bivariate regression, the Pearson r and the model R are always the same values,
because there is only one independent variable in the analysis. However, in multiple
regression, model R will always be larger than the Pearson r between any particular
independent variable and the dependent variable. The reason is that each independent
variable will be expected to contribute some unique variance in predicting the dependent
variable. That unique variance may or may not be statistically significant; however, the model
R will necessarily be larger, at least numerically. In the context of this example, recall that the
model R associated with the bivariate regression with education as the independent variable
and earnings as the dependent variable was estimated at .337 (as described in the Bivariate
Regression chapter). One of the questions posed by conducting a multiple regression is
whether adding IQ to the regression equation will increase model R in a statistically significant
manner. Next, I conduct the multiple regression analysis in SPSS.

Multiple Regression – Method Enter: SPSS


There are several different methods of multiple regression that one can conduct,
including stepwise, hierarchical, and enter, for example. The first type of multiple regression
demonstrated in this chapter is known as method ‘enter’, because all of the independent
variables are forced into the regression equation, irrespective of whether they are found to be
statistically significant contributors to the regression equation or not. In order to demonstrate
a multiple regression (enter) analysis in SPSS, I have simulated data to correspond very closely
to the results reported by Zagorsky (2007) (Data File: education_earnings_N_250). I have
simulated the data to correspond to a sample size of 250 (rather than 5,000+) to make things a
little more interesting with respect to p-values.2 In order to conduct a multiple regression in
SPSS, the ‘Linear’ utility can be used (Watch Video 14.1: Multiple Regression (Method Enter) in
SPSS). The first table outputted by SPSS is entitled ‘Descriptive Statistics’. It provides
information on the variable means, standard deviations, and the sample size.

In order to get additional important descriptive statistics, such as skew and kurtosis
values, you will need to use the Frequency utility in SPSS which will yield the SPSS table
entitled ‘Statistics’ (see Chapter 2).
The main element of the table to point out is that the skew associated with the
earnings variable is both positive and very large (3.45), as was observed when the analysis was
conducted in the bivariate regression example. For the purposes of learning, I will carry on
with the analysis as if the data were sufficiently normally distributed. I will then follow-up with
the bootstrapped version of the analysis, which does not assume normally distributed data.

2
Note that the sample size used in the correlation and bivariate regression chapters was N =
40. Thus, the results reported in this chapter differ slightly from the results reported in chapter
12.

The next table outputted by SPSS is the correlation matrix table entitled ‘Correlations’.

It can be observed in the SPSS table entitled ‘Correlations’ that there were positive
correlations between all three of the variables included in the analysis. In particular, both
education (r = .332) and IQ (r = .307) correlated positively with earnings. Furthermore,
education and IQ were correlated positively with each other (r = .572).
The next table outputted by SPSS is entitled ‘Variables Entered/Removed’. It includes
the names of the variables included in the analysis. Additionally, it specifies the type of
method that was used, which, in this case, was ‘Enter’. The ‘Variables Removed’ column only
comes into play when one uses multiple regression methods such as ‘stepwise’, for example,
rather than method enter.


Next up is the ‘Model Summary’ table which provides information relevant to multiple R.

It can be seen in the SPSS ‘Model Summary’ table that the multiple R was estimated at
.361, which corresponds to a multiple R2 value of .130. Thus, 13.0% of the variance in annual
earnings was accounted for by a regression equation which included both years of education
completed and IQ as predictors. There is also an adjusted R2 value reported which
corresponded to .123. The adjusted R2 represents a more accurate estimate of the effect in the
population (you can square root the adjusted R2 to get the adjusted model R). Thus, from this
more accurate perspective, 12.3% of the variance in earnings would be expected to be
accounted for in the population by the multiple regression equation. Finally, the standard
error of the estimate was estimated at $40,375.37, which corresponds to the standard
deviation associated with the residuals (i.e., difference between observed dependent variable
scores and predicted dependent variable scores). A standard deviation in residuals equal to
$40,375 is massive, considering the mean in annual earnings was only $43,314. Thus, on
average, the prediction equation makes an error in predicting Y (the dependent variable,
annual earnings) equal to $40,375. This is the reality of doing science, in most cases. Recall
that the percentage of variance accounted for in earnings was only about 13%, so a massive 87%
of the variance was not accounted for by education and IQ. Still, it is a numerical improvement
over the inclusion of only education as a predictor, which yielded an adjusted model R2 = .090
(reported in chapter 12).
Next up is the ‘ANOVA‘ table which tests the statistical significance of the multiple R
value:

It can be observed in the ‘ANOVA’ table that the multiple R value of .361 was associated with
an F-value of 18.525. With 2 and 247 degrees of freedom, the F-value was statistically
significant, p < .001. Note that the F-value corresponds to the unadjusted R value (not the
adjusted R value; SPSS does not report an F-value for the adjusted R2 estimate).
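The adjusted R2 and the F-value reported by SPSS can be reproduced from the model R2, the sample size, and the number of predictors, using the standard formulas adjusted R2 = 1 − (1 − R2)(N − 1)/(N − p − 1) and F = (R2/p) / ((1 − R2)/(N − p − 1)). The sketch below is a check on the reported output; small discrepancies arise because R2 is rounded to .130 here.

# Reproducing the 'Model Summary' and 'ANOVA' statistics from R2, N, and p.
R2 = .130   # multiple R squared (rounded, from the SPSS Model Summary table)
N = 250     # sample size
p = 2       # number of predictors (education and IQ)

adj_R2 = 1 - (1 - R2) * (N - 1) / (N - p - 1)
F = (R2 / p) / ((1 - R2) / (N - p - 1))

print(round(adj_R2, 3))  # 0.123, matching the reported adjusted R2
print(round(F, 2))       # about 18.45; SPSS reports 18.525 from the unrounded R2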


Next, SPSS reports a table entitled ‘Coefficients’ which includes the values associated
with the multiple regression equation.

It can be observed in the table entitled ‘Coefficients’ that the intercept (or what SPSS
refers to as the ‘Constant’) was equal to -56261.42 which is an impossible amount of money to
earn in a year (no matter how badly someone does their job). Also, note that none of the
values in the data file were negative. It is simply a reality that multiple regression will
sometimes estimate an “impossible” intercept value, if that is the best way to come up with a
multiple regression equation that maximally predicts the dependent variable based on the
data.
In the same column as the intercept are the unstandardized beta weights. It can be
observed that education was associated with an unstandardized beta weight of b = 3756.289.
Therefore, a one unit increase in education (i.e., one extra year of education completed) was
associated with an approximate unique (not shared with IQ) increase of $3,756 in annual
earnings. To determine if the unstandardized beta weight was statistically significant, you need
to look at the ‘Sig.’ column. You’ll see that the education beta weight was associated with a p-
value of .002, based on a t-value of 3.211. Furthermore, the unstandardized beta weight of
3756.289 was associated with 95% confidence intervals equal to 1451.938 and 6060.639.
The IQ independent variable was also a statistically significant contributor to the
regression equation: t = 2.402, p =.017. Furthermore, the unstandardized beta weight was
equal to b = 501.978. Thus, a one unit increase in IQ was associated with an approximate
unique (not shared with education) increase of $501.98 in annual earnings. The 95%
confidence intervals for the IQ unstandardized beta weight were equal to 90.383 and 913.573.
I haven’t mentioned anything about the ‘Std. Error’ column. The main interesting bit about
that column is that if you divide the unstandardized beta weight by its corresponding standard
error, you get the t-value. So, for IQ, 501.978 / 208.972 = 2.402.
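With the unstandardized coefficients reported in the ‘Coefficients’ table, the two prediction questions posed earlier (14 years of education with an IQ of 112, and 8 years of education with an IQ of 79) can be answered directly. The sketch below simply applies the estimated regression equation; the resulting figures are approximate, given the rounding of the coefficients.

# Applying the estimated regression equation for annual earnings:
# predicted earnings = -56261.42 + 3756.289*education + 501.978*IQ
intercept = -56261.42
b_education = 3756.289
b_iq = 501.978

def predicted_earnings(education_years, iq):
    return intercept + b_education * education_years + b_iq * iq

print(round(predicted_earnings(14, 112), 2))  # about 52548.16
print(round(predicted_earnings(8, 79), 2))    # about 13445.15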
In this example, reporting and interpreting the unstandardized beta weights made a
lot of sense, because they were naturally interpretable. People know what an extra year of
education implies, for example. Also, people understand annual earnings expressed as dollars.
However, all too often, the unstandardized beta weights are not naturally interpretable,
because the independent variables were measured on an unnatural metric (e.g., sum of items
scored on a 5-point Likert scale); therefore, the standardized beta weights become very useful
to interpret in such cases. Also, although the education and IQ unstandardized beta-weights
are readily interpretable, they can’t be compared in terms of size (i.e., relative contribution to
the regression equation). One must consult the standardized beta-weights for such purposes.
In this example, education was associated with a standardized beta weight of β = .232.
You cannot square a standardized beta weight and speak of percentage of variance accounted
for. However, you can say that a one standard deviation increase in years of education
completed (SD = 2.67) corresponded to .232 of a standard deviation increase in annual
earnings (i.e., 43123.58 * .232 = $10,004.67). Note that, in the context of this regression
equation, the .232 of a standard deviation increase in annual earnings is unique to years of
education, i.e., not shared with IQ (same fact applies to the unstandardized beta weight). With
respect to the IQ standardized beta weight results, a one standard deviation increase in IQ (SD
= 14.93) was associated with a .174 of a standard deviation increase in annual earnings (i.e., 43123.58 * .174 =
$7503.50). In practice, the statistical significance associated with standardized beta weights is
implied to be consistent with the unstandardized beta weights. Therefore, researchers simply
interpret a standardized beta weight as statistically significant, if the unstandardized beta
weight was found to be statistically significant (Watch Video 14.2: How to interpret a
standardized beta weight).
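The standardized and unstandardized beta weights are linked through the standard deviations of the variables: β = b × (SD of the predictor / SD of the dependent variable). As a check on the values reported above, the sketch below recovers the standardized beta weights from the unstandardized ones, using the standard deviations given in the text (education SD = 2.67, IQ SD = 14.93, earnings SD = 43,123.58); small differences from the SPSS output reflect rounding.

# Converting unstandardized beta weights (b) to standardized beta weights (beta):
# beta = b * (SD of predictor / SD of dependent variable)
sd_earnings = 43123.58

beta_education = 3756.289 * (2.67 / sd_earnings)
beta_iq = 501.978 * (14.93 / sd_earnings)

print(round(beta_education, 3))  # about 0.233 (reported as .232; the difference is rounding)
print(round(beta_iq, 3))         # about 0.174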
As mentioned, one of the benefits associated with standardized beta weights is that
you can compare their importance in a regression equation. For example, the standardized
beta weight associated with IQ was equal to .174, which was numerically smaller than the
standardized beta weight of .232 associated with education. Thus, at least numerically, years
of education completed was a more substantial, unique contributor to the regression equation
than IQ. Although researchers very frequently interpret a numerical difference between two
standardized beta weights as statistical support for interpreting one predictor as a more
importance contributor to the model, the reality is that such numerical differences can easily
arise due to sampling fluctuations. Consequently, it is necessary to test the numerical
difference between two standardized beta weights statistically, before making any
interpretations. Unfortunately, very few, if any, programs test the difference between two
standardized beta weights statistically. In the Advanced Topics section of this chapter, I
demonstrate an arguably appropriate technique to test the difference between two
standardized beta weights in SPSS, using bootstrapping and confidence interval overlap
estimation.

Common Mistakes in Interpreting Standardized Beta Weights


Although a standardized beta weight does share some similarities to a Pearson
correlation, it falls short in one important respect: a standardized beta weight in multiple
regression cannot be converted into a coefficient of determination. That means that you can’t
square the beta weight and speak of the percentage of variance in the dependent variable
accounted for by the independent variable. In this context, a researcher has the option of
consulting and reporting squared semi-partial (type A) correlations, instead. Consequently, I
recommend that you always report both the standardized beta weights and the semi-partial
(type A) correlations in a multiple regression analysis.3
Another common mistake is to state that the largest standardized beta weight is the
most significant predictor of the dependent variable. In order to make such a statement
justifiably, one would need to test the difference between two beta weights statistically. It is
not enough to simply look at the numerical difference between two weights and assert one to
be the most significant predictor of a dependent variable.
Finally, all too often, researchers fail to specify whether they have reported the
unstandardized (b) or standardized (β) beta weights in their results. In the behavioral sciences,
it usually only makes sense to report the standardized beta weights, as the variables are
typically measured on an inherently meaningless scale. In some disciplines, however, many of
the independent and dependent variables are associated with meaningful scales (e.g.,
economics). Consequently, the unstandardized beta weights can be interpreted meaningfully
in such cases. Ultimately, it is up to the researcher to decide whether it is best to report the
unstandardized or standardized beta weights (or both). So long as the beta weights are
demarcated clearly as either unstandardized (b) or standardized (β), the reader should be able
to follow the results coherently.

Multiple Regression and Semi-Partial Correlations


Although beta weights are informative, they cannot be squared meaningfully.
Consequently, researchers cannot speak of percentage of variance in the dependent variable
accounted for by the independent variables. To overcome this limitation in beta weights,
researchers should report the semi-partial correlations (Type A) associated with the
independent variables included in the multiple regression model. The multiple regression
utility in SPSS allows for the estimation of both partial and semi-partial correlations (Watch
Video 14.3: Semi-Partial Correlations & Multiple Regression in SPSS). The key portion of the
output is included in the SPSS table entitled ‘Coefficients’:

3
Journal articles commonly include only the standardized beta weights. Fortunately, beta-weights can
be converted by a reader into squared semi-partial correlations, based on a simple re-arrangement of
the formula specified by Tzelgov and Stern (1978, p. 330). Thus, to derive a semi-partial correlation from
a beta-weight, the following formula can be used: rY(1.2) = βY1.2 × √(1 − r12²)

SPSS refers to semi-partial correlations (i.e., Type A) as Part correlations. As you can
see in the SPSS table entitled ‘Coefficients’, education was associated with a type A semi-
partial correlation of r = .190, or r2 = .036. Thus, 3.6% of the variance in annual earnings was
accounted for uniquely by years of education completed. IQ was associated with a type A
semi-partial correlation of r = .143, or r2 = .020. Thus, 2.0% of the variance in annual earnings
was accounted for uniquely by IQ. It only really makes sense to report type A semi-partial
correlations in the context of multiple regression, not the type B semi-partial correlations (see
Chapter 13). The statistical significance associated with the type A semi-partial correlations is
implied to be the same as the unstandardized beta weights. Consequently, there is no
additional ‘Sig.’ column for the additional correlations included in the ‘Coefficients’ table.
I think it is good practice to always report the semi-partial correlations associated with
the independent variables included in a multiple regression model. An additional interesting
attribute associated with the squared semi-partial correlations is that they can be summed. In
this example, the sum of the squared semi-partial correlations amounted to: .036 + .020 =
.056. Thus, 5.6% of the variance in earnings was accounted for by education and IQ, uniquely.
However, you may recall that the multiple R2 was equal to .130, or 13.0%. What can account
for the difference between 13.0% and 5.6%? Well, the difference of 7.4% (13.0 – 5.6) may be
described as the percentage in earnings variance that was accounted for by education and IQ
in a shared manner, i.e., not uniquely. Recall that, in this example, education and IQ shared
32.7% (r2 = .327) of their variance. That shared education and IQ variance evidenced an impact
on the model R2 equal to 7.4%.
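The type A semi-partial correlations reported above can also be recovered from the standardized beta weights via the Tzelgov and Stern (1978) relation given in the footnote, and the squared values decompose the multiple R2 into unique and shared portions. A minimal sketch, assuming the values reported above (β = .232 and .174; r between education and IQ = .572; multiple R2 = .130):

import math

# Semi-partial (part) correlation from a standardized beta weight,
# for the two-predictor case: sr = beta * sqrt(1 - r12**2)
r12 = .572                 # correlation between education and IQ
beta_education = .232
beta_iq = .174

sr_education = beta_education * math.sqrt(1 - r12**2)
sr_iq = beta_iq * math.sqrt(1 - r12**2)
print(round(sr_education, 3), round(sr_iq, 3))   # about 0.190 and 0.143

# Decomposing multiple R2 into unique and shared variance:
R2 = .130
unique = sr_education**2 + sr_iq**2
shared = R2 - unique
print(round(unique, 3), round(shared, 3))  # roughly .057 and .073; the text's .056 and .074 reflect rounding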

Multiple Regression Assumptions


Multiple regression is a parametric statistical analysis, consequently, it is associated
with a number of assumptions. All of the assumptions are the same as those described in the
chapter devoted to bivariate regression, in addition to one assumption unique to multiple
regression: the absence of multi-collinearity. Consequently, the assumptions in common with
bivariate regression will only be listed and tested here, rather than described in a substantial
amount of detail.


1: Random sampling
The assumption implies that every case in the population has an equal chance of being
selected into the sample. The data were not obtained via random sampling in Zagorsky (2007).
The impact of such a violation on the results is anyone’s guess.

2: Independence of Observations
See bivariate regression chapter.

3: Dependent Variable Measured on an Interval/Ratio Scale


The assumption that the dependent variable is measured on an interval/ratio scale has
been satisfied, as annual earnings (DV) was measured on a ratio scale.

4: Linearity
See bivariate regression chapter.

5: Normally distributed residuals


As can be seen in Figure C14.1, the assumption that the residuals would be associated
with a normal distribution was not satisfied. Specifically, the skew and kurtosis associated with
the residuals above were 3.49 and 21.00, respectively, which is very substantial. To my
knowledge, in the context of multiple regression, there is no precise rule about how skewed or
kurtotic a distribution of residuals can be, before it becomes problematic. Perhaps the rule of
skew < 2.0 and kurtosis < 9.0 might be applicable here. Fortunately, a multiple regression
analysis can be performed with bootstrapping as an estimation procedure, rather than
asymptotic normal distribution theory. Bootstrapping does not assume any level of normality,
so, it can be potentially applicable to any set of data. I demonstrate a bootstrapped multiple
regression in the Advanced Topics section of this chapter.

Figure C14.1. Histogram of Standardized Residuals
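As a generic illustration of what a bootstrapped multiple regression involves (the SPSS version is demonstrated in the Advanced Topics section), the sketch below resamples cases with replacement, re-estimates the unstandardized beta weights in each resample, and takes percentile confidence intervals. The column names in the usage comment are hypothetical, and this is only a sketch, not the procedure used in the chapter.

import numpy as np

def bootstrap_betas(X, y, n_boot=2000, seed=0):
    """Percentile bootstrap for unstandardized regression coefficients.

    X: (n, p) array of predictor scores; y: (n,) array of outcome scores.
    Returns the 2.5th and 97.5th percentiles for the intercept and each beta.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])          # add intercept column
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # resample cases with replacement
        coefs, *_ = np.linalg.lstsq(Xd[idx], y[idx], rcond=None)
        estimates.append(coefs)
    estimates = np.array(estimates)
    return np.percentile(estimates, [2.5, 97.5], axis=0)

# Example usage (hypothetical variable names):
# ci = bootstrap_betas(data[["education", "IQ"]].to_numpy(), data["earnings"].to_numpy())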


6: Independence of errors
See the bivariate regression chapter.

7: Homoscedasticity
See the bivariate regression chapter for a full discussion of this assumption. I tested
the homoscedasticity assumption by estimating the correlation between the standardized
predicted values and the absolute standardized residuals. The Pearson and Spearman
correlations were r = .17 (p = .006) and rS = .14 (p = .028), respectively. The statistically
significant correlations suggest that the homoscedasticity assumption has not been satisfied
(Watch Video 14.4: Testing Homoscedasticity via Correlation). In the Advanced Topics section
of this chapter, I describe a more commonly used procedure to test the assumption of
homoscedasticity known as the Koenker test. Unfortunately, the Koenker test cannot be
applied within the menu options of SPSS. I also describe in the Advanced Topics section of this
chapter a procedure that can be implemented to deal with the presence of heteroscedasticity.

8: Absence of Multi-collinearity
Strictly speaking, the absence of multicollinearity is not really an assumption of
multiple regression, as the results obtained from a multiple regression analysis in the presence
of multicollinearity will be accurate. However, it is still important to pay attention to the
possibility that multicollinearity may be present in one’s data, as the stability of the model will
be affected negatively.
In the context of multiple regression, I like to distinguish between collinearity and
multicollinearity. Collinearity is observed when one independent variable correlates
approximately r = .95 or higher with another independent variable. By contrast,
multicollinearity is observed when one independent variable is associated with a multiple
correlation of R = .95 or higher, based on a prediction equation with the other independent
variables included in the multiple regression analysis. Thus, collinearity can be evaluated with
a simple examination of the Pearson correlation matrix. By contrast, multicollinearity cannot
be evaluated with a simple eye-ball examination of the correlation matrix. Instead, more
sophisticated analyses need to be performed such as Tolerance and/or the Variance Inflation
Factor. I describe these next.

Tolerance
Tolerance represents the percentage of variance in a particular independent variable
that is not shared with the other independent variables included in the multiple regression
analysis. Stated alternatively, tolerance represents the percentage of variance that is unique to
a particular independent variable. In multiple regression, it is desirable for all of the
independent variables to be associated with some unique variance. The percentage of
variance not shared with the other independent variables is estimated by regressing the
independent variable onto the remaining independent variables. Consequently, tolerance may
be formulated as:
Tolerance = 1 − R2
Tolerance values are calculated for each independent variable included in the analysis.
Therefore, if five independent variables are included in the model, then five tolerance values
will be estimated.
All other things equal, researchers desire higher levels of tolerance. Tolerance values
of .10 or greater are commonly considered acceptable. Such a value would imply that each
independent variable requires a minimum of 10% unique variance. It would also imply that no
independent variable is associated with a multiple R of .95 or greater (i.e., .95² ≈ .90).

Variance Inflation Factor


The variance inflation factor represents the degree to which the sampling variance of a beta
weight is larger than it would otherwise be, had the particular independent variable been
completely uncorrelated with the other independent variables included in the analysis (the
corresponding standard error is inflated by the square root of this factor). Not many people
seem to be aware of this, but the variance inflation factor (VIF) is
correlated perfectly with tolerance. Specifically, the VIF is the reciprocal of tolerance:
VIF = 1 / Tolerance
Consequently, a recommendation for an acceptable value of tolerance needs to be consistent
with a recommendation for the VIF. A minimum tolerance value of .10 was recommended
above, which would imply a maximum VIF of 10. Some have recommended a maximum VIF as
small as 4 (Pan & Jackson, 2008), which would imply a minimum tolerance of .25. Like so many
things in statistical life, there is no real or truly meaningful cut-off value. Personally, I’m fine
with a VIF as large as 10, if the sample size is relatively large (> 100). Sample size comes into
play here, as larger standard errors associated with beta weights reduce the chances of
observing the corresponding beta weights as statistically significant.
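To make the Tolerance and VIF calculations concrete for the two-predictor case, the sketch below uses the education–IQ correlation of r = .572 reported earlier. With only two predictors, the R2 from regressing one predictor on the other is simply r2; with three or more predictors, the R2 would come from an actual auxiliary regression. Small differences from the SPSS output reflect rounding of r.

# Tolerance and VIF for the two-predictor case (education and IQ).
r12 = .572                      # correlation between the two independent variables

R2_aux = r12**2                 # R2 from regressing one predictor on the other
tolerance = 1 - R2_aux          # Tolerance = 1 - R2
vif = 1 / tolerance             # VIF = 1 / Tolerance

print(round(tolerance, 3))      # about 0.673 (SPSS reports .672)
print(round(vif, 3))            # about 1.486 (SPSS reports 1.488)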
SPSS allows for the estimation of both Tolerance and VIF through the linear regression
utility (Watch Video 14.5: Evaluating Multicollinearity (Tolerance & Variance Inflation Factor)
in SPSS). SPSS adds the Tolerance and VIF statistics at the right-side end of the ‘Coefficients’
table:


It can be observed that education and IQ were both associated with the same
Tolerance and VIF values, as expected, because there were only two independent variables in
the analysis. When there are three or more independent variables included in the analysis, the
tolerance/VIF values will almost always be different across the independent variables. In this
example, education and IQ were associated with tolerance of .672, which implies that they
were associated with 67.2% unique variance, i.e., not predicted by the other independent
variable. Such a value exceeds comfortably the minimum acceptable level of tolerance of .10.
Correspondingly, the VIF was equal to 1.488 (1/.672 = 1.488), which implies that the sampling
variances of the beta weights were approximately 1.488 times larger (and the standard errors
approximately √1.488 ≈ 1.22 times larger) than they would otherwise be, if there was no
correlation between the two independent variables included in the analysis. As
the VIF value of 1.488 is much less than the maximum acceptable VIF value of 10, the issue of
multicollinearity (or collinearity, more precisely, in this case) is not a problem, in this example.

Dealing with Multi-Collinearity


The main point of understanding multi-collinearity is that very high levels present in
your data will thwart the chances of identifying a beta-weight as statistically significant. In
practice, there will probably not be much lost by dumping an independent variable from the
analysis, if the independent variable is associated with a large VIF (10+), as the other
predictors will pick-up the variance in the dependent variable that was predicted by that
particular independent variable. However, it should be noted that one does not desire or
expect extremely low levels of VIF either, because one of the main reasons one conducts a
multiple regression analysis is to understand the nature of the effects of two or more inter-
related independent variables on a dependent variable. If the independent variables were not
inter-related (VIF = 0), there would not be much point in conducting the multiple regression
analysis in the first place. For more sophisticated approaches to the evaluation of multi-
collinearity, one might consider condition indices in conjunction with variance-decomposition
proportions (Belsley, Kuh, & Welsch, 2004; see also O’Brien, 2007). SPSS does output such
information after the coefficients table, when the ‘Collinearity diagnostics’ option is selected.
However, I believe that consulting the VIF/tolerance values should suffice, in most cases.


Sample Size Requirements


Researchers commonly ask the question of how many cases they need in order to
conduct a multiple regression. Over the years, a number of rules of thumb have emerged. For
example, Tabachnick and Fidell (1989) recommended a bare minimum of five cases for each
predictor included in the analysis. Therefore, a multiple regression with four predictors would
require a bare minimum of 20 cases. Wampold and Freund (1987) recommended a minimum
of 10 cases per predictor. Finally, Nunnally (1978) advised that a minimum sample size of 300
to 400 would be required, in order to obtain accurate results for a typical multiple regression.
First and foremost, it needs to be acknowledged that the issue of sample size and
multiple regression is a question of statistical power. Consequently, no rule of thumb will ever
be appropriate in all cases, or even many cases. Secondly, one must appreciate that sample
size requirements will depend, in part, on the purpose of conducting the multiple regression.
Specifically, is it simply to estimate a multiple R-value? Or is it to evaluate statistically two or
more hypothesized predictors of a dependent variable? As you will see below, if the purpose is
to address the second question, then much greater sample sizes are required.
Green (1991) evaluated the sample size requirements for multiple regression by
explicitly framing the problem within the context of power. Based on his work, Green (1991)
specified that, if the purpose of conducting the multiple regression is to simply estimate a
multiple R2 value, then the following guideline could be used:
N = 50 + 8*p (2)
where p equals the number of predictors included in the analysis. So, for example, if one
wished to conduct a multiple correlation analysis with four predictors, a sample size of 82
would be required, based on Green’s (1991) guideline. Bear in mind that Green’s (1991)
guideline assumes that a medium sized multiple R2 value of .07 exists in the population. If a
smaller multiple R2 value is hypothesized to exist in the population, then a larger sample size
would be required. Correspondingly, a larger expected multiple R2 value in the population
would entail a smaller required N for the same level of power (i.e., .80).4
In practice, very few researchers are simply interested in the multiple R2 value
associated with a multiple regression analysis. Instead, the vast majority of researchers are
interested in evaluating the statistical significance associated with their hypothesized
predictors included in the model, in addition to the multiple R2. In such cases, again, based on
a hypothesized medium effect size present in the population, Green (1991) proposed the
following guideline:
N = 105 + p

4
Power represents the probability of rejecting the null hypothesis when it is, in fact, false in the
population. A minimum power level of .80 (or 80%) has been recommended (Cohen, 1988).

Thus, based on Green (1991), realistically, a typical multiple regression analysis would require
a minimum of approximately 105 participants. So, for example, if one wished to conduct a
multiple correlation analysis with four predictors, and one was interested in examining the
beta weights, a sample size of 109 would be required, based on Green’s (1991) guidelines.
Green’s (1991) guidelines appear to have become fairly popular over the years, based on the
number of times his paper has been cited.
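Green's (1991) two guidelines are easy to apply in code. The sketch below computes both required sample sizes for a given number of predictors; as noted above, these rules of thumb assume a medium effect size in the population.

# Green's (1991) rule-of-thumb sample sizes for multiple regression.

def n_for_multiple_R2(p):
    """Minimum N if the interest is only in the multiple R2 (equation 2)."""
    return 50 + 8 * p

def n_for_beta_weights(p):
    """Minimum N if the individual beta weights are also of interest."""
    return 105 + p

print(n_for_multiple_R2(4))   # 82, as in the four-predictor example above
print(n_for_beta_weights(4))  # 109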
More recently, Maxwell (2000) extended Green’s (1991) research by pointing out that
the magnitude of the correlations between the predictors will affect the sample size required
to conduct a multiple regression with a specified level of power. Maxwell’s (2000) work is
directly related to the variance inflation factor described in the context of multi-collinearity.
You’ll recall that, as the correlations between the predictors in a multiple regression increase,
so do the standard errors associated with the predictor beta weights. Consequently, Maxwell
(2000) reasoned, larger correlations between the predictors imply the need for greater
sample sizes, all other things equal, if one is interested in both the beta weights and the
multiple R. Maxwell (2000) conducted an extensive simulation study to estimate the sample
sizes required to detect statistically significant beta weights across a variety of conditions,
including the average magnitude of the correlations between the predictors, and a specified
power of .80. Some of the key results reported by Maxwell (2000) are provided in Table C14.1
below.

Table C14.1. Sample Sizes Required for Multiple Regression (5 predictors; Power = .80)
                    pxy
pxx        .20       .30       .40
.30      1,070       419       191
.40      1,731       692       328
.50      2,752     1,117       544
Note. Based on results reported in Maxwell (2000); pxx = mean inter-correlation between
predictors; pxy = mean inter-correlation between predictors and dependent variable.

Clearly, very large sample sizes are required, when the correlations between the
predictors are large. For example, if the average correlation between the predictors and the
dependent variable is r = .30 and the average correlation between the predictors is r = .50, a
sample size of 1,117 is required to achieve power of .80. However, if the number of predictors
is reduced to three, the sample size required to achieve power of .80 for the entire analysis is
reduced to a much more manageable 218 (see Table 5 in Maxwell, 2000). I used the words
‘entire analysis’, because Maxwell (2000) estimated power under the pretense that the
researcher was interested in testing the model R for statistical significance, as well as each of
the predictors (i.e., power for every predictor, not merely for at least one predictor).

Hierarchical Multiple Regression


The simplest approach to conducting a multiple regression is known as “method
enter”, as it forces all of the independent variables into the model, whether they are
statistically significant contributors to the regression equation or not. The preceding
demonstration was performed in the context of method enter multiple regression.
A slightly more sophisticated approach to multiple regression analysis is known as
‘hierarchical multiple regression’. Hierarchical multiple regression is performed in a sequence
specified by the researcher. Essentially, the independent variables are specified to be included
into the model in a series of blocks. Each block of independent variable(s) is added to the
model, and the increase in model R2 as a result of adding the block is tested for statistical
significance. Each block may have one or more independent variables.
In practice, researchers typically use hierarchical multiple regression such that they
add one or more less interesting independent variables in the first block, and then add one or
more theoretically interesting independent variables in subsequent blocks. My impression is
that, typically, there are only two blocks in published hierarchical multiple regression analyses.
The first block will include “control” variables, such as socioeconomic status, age, and gender,
for example. Then, the theoretically more interesting variables are added in the subsequent
block(s).
In the context of the years of education, intelligence, and earnings example, it may be
suggested that education is a relatively well established predictor of earnings. By contrast, the
hypothesis that intelligence may contribute to a regression equation in a unique manner is less
well-established and, arguably, more interesting, theoretically. Thus, a researcher may conduct
a hierarchical multiple regression analysis with years of education completed entered in block
1 and intelligence entered in block 2. The hypothesis that intelligence can add to the
regression model in a statistically significant manner would be tested by the R2 change
observed from block 1 to block 2. If the R2 changes statistically significantly from block
1 to block 2, then intelligence would be determined to be a statistically significant, unique
contributor to the model.

Hierarchical Multiple Regression: SPSS


In order to conduct a hierarchical multiple regression in SPSS, one can use the regular
linear regression utility (Watch Video 14.6: Hierarchical Multiple Regression in SPSS). The first
table outputted by SPSS is entitled ‘Variables Entered/Removed’. It can be seen that Model 1
includes ‘education’ as an entered variable. Although not obvious, Model 2 includes both
education and IQ.


Next, in a SPSS table entitled ‘Model Summary’, you will find the model R, model R2,
model adjusted R2 values for both Model 1 and Model 2. It can be seen that with the addition
of education to the model (Model 1), the model R2 was equal to .110. By contrast, with the
addition of IQ to the model (education and IQ together), the model R2 increased to .130. On
the right side of the ‘Model Summary’ table you will find the ‘change statistics’. These statistics
correspond to the changes in model R2 values and their corresponding statistical significance.
The change statistics for Model 1 correspond to the change from no independent variables in
the model to the addition of one independent variable, which is education in this example. The R2
change from .000 to .110 was statistically significant, F(1, 248) = 30.69, p < .001. Thus, the
percentage of variance accounted for earnings by the inclusion of education into the model
was statistically significant.
The R2 change associated with Model 2 (i.e., the addition of intelligence to the model)
corresponded to .020 and it was statistically significant, F(1, 247) = 5.77, p = .017. Thus, the
addition of intelligence to the model increased the percentage of variance accounted for in
earnings in a statistically significant and unique manner. Clearly, it was not a large increase,
but it was a statistically significant one.
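The R2 change test reported in the ‘Model Summary’ table can be reproduced with the standard F formula for a change in R2: F = ((R2 full − R2 reduced)/k) / ((1 − R2 full)/(N − p − 1)), where k is the number of predictors added in the block and p is the number of predictors in the full model. A minimal sketch using the rounded values reported above (small differences from the SPSS output reflect that rounding):

# F test for the change in R2 from Model 1 (education) to Model 2 (education + IQ).
R2_reduced = .110   # Model 1
R2_full = .130      # Model 2
N = 250
p_full = 2          # predictors in the full model
k = 1               # predictors added in block 2 (IQ)

F_change = ((R2_full - R2_reduced) / k) / ((1 - R2_full) / (N - p_full - 1))
print(round(F_change, 2))   # about 5.68 with the rounded R2 values; SPSS reports 5.77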

Next, a table entitled ‘ANOVA’ includes the F and p-values associated with the model
R2 values reported in the ‘Model Summary’ table. Model 1’s R2 value of .110 was statistically
significant, F(1, 248) = 30.69, p < .001. Additionally, Model 2’s R2 value of .130 was statistically
significant, F(2, 247) = 18.53, p < .001.


Next, a table entitled ‘Coefficients’ includes all of the statistics relevant to the individual
predictors.

Hierarchical Multiple Regression: Comments


You may have noticed that the R2 change p-value associated with the addition
of block 2 to the model (i.e., the intelligence predictor) was exactly the same as the p-value
associated with the intelligence beta weight. Consequently, the R2 change statistical
analysis result and the beta weight analysis result were redundant with each other. This will
always be the case when one variable is included in a block. Ultimately, with just two
predictors included in the whole model, a hierarchical multiple regression will not really yield
any more informative results than a method enter multiple regression. I used this example,
here, for the sake of simplicity and continuity. A more cogent example might include
education, socioeconomic status, age, and gender as predictor variables entered in block 1,
and then IQ and perseverance entered in block 2. The key is to have two or more predictor
variables in each block. In such cases, the R2 change p-value is unlikely to correspond to any of
the predictor p-values associated with a block. Thus, the test of R2 change can yield important
non-redundant information, in such a case.


Stepwise Multiple Regression


In some cases, you have a research scenario where you have a large number of
independent variables to include in a multiple regression analysis. Furthermore, the purpose
of the analysis may be to identify a smaller number of variables that predict the dependent
variable in a unique way, i.e., each independent variable contributes at least some statistically
significant variance to the regression equation. In such cases, it has been argued that stepwise
multiple regression can be a useful strategy to conducting the multiple regression analysis
(Pope & Webster, 1972).
The steps involved in conducting a stepwise multiple regression are purely statistical
and “automated”. That is, the process of selecting which predictor variables will be
included in the regression equation is entirely algorithmic. At step 1, the stepwise analysis will
identify the independent variable that has the largest zero-order Pearson correlation with the
dependent variable. That variable will be added to the regression model as step 1. Then, the
stepwise procedure will look for the independent variable that has the largest partial
correlation with the dependent variable, controlling for the variance accounted for by the first
independent variable selected at step 1. If the second independent variable’s first-order partial
correlation is found to be significant statistically, the second independent variable will be
added to the regression equation. Next, the analysis looks for the independent variable with
the largest second-order partial correlation amongst the remaining independent variables,
controlling for the variance accounted for by the first two predictors (i.e., those already included in the
regression equation). If the second-order partial correlation for the candidate third predictor
is statistically significant, the stepwise procedure will add that predictor to the model.
Eventually, the procedure is terminated, once a non-significant partial correlation is identified
as the next largest.
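If you are curious to see the entry/removal logic outside of SPSS, the sketch below implements
it in Python with the statsmodels library. It is a minimal illustration only, not the exact
algorithm SPSS uses; the DataFrame df, the column names in the usage comment, and the .05
entry and .10 removal criteria are assumptions made purely for the example. Note that testing
a candidate predictor's beta weight within the current model is equivalent to testing its partial
correlation with the dependent variable.

import pandas as pd
import statsmodels.api as sm

def stepwise_selection(df, y_col, candidates, p_enter=0.05, p_remove=0.10):
    """Forward entry with backward removal, based on predictor p-values."""
    selected = []
    while True:
        changed = False
        # Entry step: among the not-yet-entered candidates, find the one with the
        # smallest p-value when added to the current model.
        remaining = [c for c in candidates if c not in selected]
        entry_p = {}
        for c in remaining:
            model = sm.OLS(df[y_col], sm.add_constant(df[selected + [c]])).fit()
            entry_p[c] = model.pvalues[c]
        if entry_p:
            best = min(entry_p, key=entry_p.get)
            if entry_p[best] < p_enter:
                selected.append(best)
                changed = True
        # Removal step: drop a previously entered predictor that has lost its
        # statistically significant unique contribution.
        if selected:
            model = sm.OLS(df[y_col], sm.add_constant(df[selected])).fit()
            worst = model.pvalues[selected].idxmax()
            if model.pvalues[worst] > p_remove:
                selected.remove(worst)
                changed = True
        if not changed:
            return selected

# Illustrative usage (assuming the cholesterol data were exported to a CSV file with
# these hypothetical column names):
# df = pd.read_csv('predicting_cholesterol.csv')
# kept = stepwise_selection(df, 'cholesterol', ['age', 'sex', 'SBP', 'BMI', 'Cig',
#                                               'PhysAct', 'Alc', 'famhistory', 'sleep'])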
It is important to note that, at each step, previously entered predictors are evaluated
to determine whether they have maintained their statistically significant contribution to the
regression equation. If a previously entered variable loses its statistically significant
contribution, it will be dropped from the model. Consequently, you may come across an example
where the first variable entered into the equation/model, i.e., the one with the largest zero-
order correlation with the dependent variable, is not included in the final regression equation.
This would happen if that first variable shared a fairly substantial amount of variance with the
other independent variables. So, just because an independent variable is not included in a
stepwise multiple regression equation, you should not claim that it is unrelated to the
dependent variable: it may, in fact, have a large, or even the largest, Pearson correlation with
the dependent variable.
In practice, a stepwise multiple regression may involve adding 10 candidate
independent variables to the multiple regression analysis; however, only 2 or 3 are kept in the
regression equation, because they are the only ones that are found to be significant
statistically and unique contributors to the regression equation.

Stepwise Multiple Regression Example: SPSS


Heart disease is one of the leading causes of death amongst adults. Much research
money continues to be poured into discovering why heart disease develops, as well as into
identifying variables that predict heart disease. One commonly discussed predictor of heart
disease is cholesterol. Although not everyone agrees5, individual differences in cholesterol
levels are commonly considered a predictor of heart disease. Consequently, many people are
recommended by their medical practitioners to obtain a cholesterol test, which involves
getting a blood test and some analysis of the blood. Another doctor’s appointment is typically
required to obtain the results. This is expensive when multiplied across millions and millions of
people around the world. Imagine if cholesterol levels could be predicted with appreciable
accuracy, based on variables that are easily and inexpensively assessed.
Based on the results reported by Bonaa and Thelle (1991), in addition to a couple of
other studies, I have simulated (N = 1,000) some data (Data File: predicting_cholesterol) to
correspond to a possible study designed to try to predict cholesterol levels with the following
candidate independent variables: age, sex, blood pressure (SBP), body mass index (BMI),
cigarette smoking (Cig), physical activity (PhysAct), alcohol consumption (Alc), family history of
cholesterol problems (famhistory), and hours of sleep per day (sleep). Thus, a total of nine
candidate independent variables. One approach to predicting cholesterol would be to simply
enter (force) all nine independent variables into the model. However, such an approach would
almost certainly lead to the inclusion of several non-significant contributors to the equation.
This is where stepwise multiple regression could be potentially used. That is, why not let the
statistical algorithm decide which variables to enter and exclude in the model on the basis of
the correlations between the variables?
When I conducted the analysis in SPSS, I obtained the key results below (Watch Video
14.7: Stepwise Multiple Regression in SPSS). As can be seen in the SPSS table entitled ‘Model
Summary’, seven models were estimated; furthermore, each successive model incorporated
an additional predictor. In the first model, ‘age’ was included as the predictor variable,
because it had the largest zero-order correlation with the dependent variable (r = .44). The
model R2 was .194, thus, 19.4% of the variance in cholesterol was accounted for by age alone.
Next, the stepwise algorithm selected ‘sleep’ as the next variable to include in the model,
because it had the largest partial correlation with the dependent variable. It can be seen that
the model R2 increased by .070, which implies that an additional 7% of the variance in
cholesterol was accounted for by the inclusion of sleep. The stepwise analysis terminated after
the inclusion of seven predictor variables, which accounted for 36.0% (model R2 = .360) of the
variance in cholesterol. Also note that the final variable included in the model (‘sex’) increased
the percentage of variance accounted for by only 1% (R2 change = .010).

Footnote 5: You might be surprised to learn just how much disagreement there is among
respected researchers with respect to the role of cholesterol in heart disease.

Like any other hierarchical-type regression, SPSS then reports a table entitled
‘ANOVA’, which tests each of the Model R values for statistical significance (from Model 1 R =
.440 to Model 7 R = .600). I have not reported the SPSS table of results, here, as the table is
large. Suffice it to say that all of the Model R values were found to be significant statistically.
Next, SPSS reports the beta-weights associated with the variables included in the model. As
can be seen in the SPSS table entitled ‘Coefficients’, age was associated with a standardized
beta-weight of .430 and the largest t-value. By comparison, sex (coded 1 = female; 2 = male)
was associated with a standardized beta-weight of .102. Most of the semi-partial correlations
are of a similar magnitude to the standardized beta-weights, as the independent variables
were not particularly inter-correlated. An exception to that statement was BMI.


Next, SPSS reports the independent variables excluded from the regression equation in
a table entitled, appropriately enough, ‘Excluded variables’. Again, because the table is so long,
I reported only the top and bottom of the table. It can be seen that, in the final model (model
7), two variables were left out of the regression equation: PhysAct (Physical Activity) and Alc
(Alcohol consumption). It can be seen that the partial correlations are rather small at -.062 and
.016, respectively, neither of which were significant statistically. Consequently, they were
omitted from the regression equation.

Thus, with respect to the prospect of predicting a person’s cholesterol level without
taking a blood test, more work needs to be done, as only 36% of the variance in cholesterol
was accounted for by the regression equation that included seven predictors. Naturally, many
other variables could be potentially added to the regression equation, which may increase the
model R2. Conceivably, hundreds of variables could be added to the regression equation to
help increase model R2. With a sufficient sample size (in the hundreds of thousands), the
results may prove robust. Who knows? Maybe one day.
Importantly, it would be unjustified to suggest that the variables omitted from the
regression equation were unrelated to cholesterol levels. As can be seen in the Pearson
correlation table below, both PhysAct and Alc were correlated statistically significantly with
cholesterol (all correlations greater than |.06| were significant statistically). The multiple
regression results simply suggest that they were not related to cholesterol in a unique way,
i.e., independently of the effects attributable to the other independent variables. I also note
that PhysAct was very nearly included in the model with a p-value of .051. Of course, another
sample of 1,000 people may yield a slightly different correlation matrix, which may have
resulted in PhysAct being included in the model and, possibly, sex being excluded. Thus, strong
arguments about the value of a variable as a predictor should not be made on the basis of a
stepwise multiple regression analysis (or any multiple regression analysis for that matter). Large
sample sizes and replication are crucial, with respect to interpreting the nature and strength of
the predictors. The model R2, on the other hand, would be expected to be more stable from
sample to sample.

Criticisms of Stepwise Multiple Regression


Few statistical procedures have received more critiques than stepwise multiple
regression. Although I can appreciate most of the arguments levelled against the procedure, I
believe contentions to the effect that it should never be used are overstated. Virtually all
statistical analyses have limitations. Correspondingly, most statistical analyses have a time and
place for useful application. From my review, there are a number of arguments that have been
articulated against the use of stepwise regression. I discuss a couple of the most important
ones here.
First, some have argued that stepwise regression is too empirically driven, with no
theory considered in the selection of the variables to the model (e.g., Lewis-Beck, 1978). If
researchers had enough theory to specify and test a specific model, I suspect they would do
so. Instead, researchers tend to use stepwise multiple regression more as an exploratory
approach to the generation of a multiple regression equation, because that is the stage of
research in which they find themselves. In other cases, a researcher may be interested
exclusively in the solution to an applied problem (e.g., predicting cholesterol levels
inexpensively). In my view, that’s fine, so long as one is upfront about it. Curiously, exploratory
factor analysis (see Chapter 18) relatively rarely gets criticized for lacking theory; so, why
criticize exploratory stepwise regression so harshly for lacking theory?
Another criticism of stepwise regression pertains to the inappropriateness of
researchers declaring that one predictor is more important than another, with respect to
predicting the dependent variable (Lewis-Beck, 1978). For example, a researcher may state
that the first variable entered into the model is more important than the second variable
entered into the model. This is a criticism of interpretation, not the stepwise procedure itself.
From an inferential statistical perspective, differences between point-estimates (e.g., beta-
weights) should be tested statistically. Consequently, it is inappropriate to declare the
difference between two beta weights as significant, simply because a stepwise procedure
entered them into a model at different steps. Although entirely true, such a criticism can be
leveled against any approach to multiple regression equation building, from the ‘method:
enter’ approach to the fully specified path analytic approach. Consequently, I believe this
criticism has also been levelled against the stepwise approach, unfairly.
As a general statement, I would encourage users of stepwise multiple regression to,
first, verify that at least two (or more) of the candidate predictors are correlated statistically
significantly with the dependent variable, as data sampled randomly from a population of non-
correlated variables can yield a regression equation with “statistically significant” (p < .05) beta
weights (Antonakis & Dietz, 2011). Again, this recommendation could be applied to virtually
any multiple regression procedure, unless suppressor effects were hypothesized (in which
case, one would probably test suppression directly). My recommendation, here, is consistent
with applying Bartlett’s test of sphericity on a correlation matrix, prior to conducting an
exploratory factor analysis (Dziuban & Shirkey, 1974).

Curvilinear Multiple Regression


In a previous chapter, it was mentioned that bivariate correlation (e.g., Pearson
correlation) and bivariate regression assume that the association between the independent
variable and the dependent variable is linear. A linear effect implies that the direction and
strength of the association is the same across the whole spectrum of the scores. In some
cases, this assumption will not be satisfied. In practice, researchers will examine visually the
scatter plot to detect the possibility of the violation of the assumption of linearity. In other
cases, however, researchers will generate a hypothesis that the association between the
independent and dependent variables is non-linear (a.k.a., curvilinear). The hypothesis may be
generated on the basis of theory or previous research. In this section of the chapter, I will
demonstrate how to test the most commonly observed curvilinear effect: a quadratic effect.
Przybylski and Weinstein (2017) were interested in examining the association between
the amount of time adolescents play video games per day and their mental well-being. Based
on several considerations, Przybylski and Weinstein (2017) were open to the possibility that
the association might be curvilinear. That is, they hypothesized that there would not be any
cost to well-being for those adolescents who reported playing a small to moderate amount
of video games, but there would be a negative impact on mental well-being for those
adolescents who reported large amounts of video game playing. To test the hypothesis,
Przybylski and Weinstein (2017) obtained data from a very large sample of 15-year-old
adolescents who reported the amount of time they played video games per day (in hours).6
The adolescents also completed a 14-item self-report measure of mental well-being
(happiness, life satisfaction, psychological functioning, and social functioning). I have simulated
some data to correspond to the essence of the results reported by Przybylski and Weinstein
(2017) (Data File: video_games_well_being).

Figure C14.2. Idealised Lines of Best Fit: Four Possible Curvilinear Quadratic Effects
Panel A: U-Shape Panel B: Inverted U-Shape

Panel C: Positive Threshold Panel D: Negative Threshold

The key to conducting a curvilinear regression is that you need to transform your
predictor variable in such a way as to reflect the nature of the curvilinear effect in which you
are interested. As mentioned, a quadratic effect is by far the most commonly observed and
tested curvilinear regression. A quadratic curvilinear effect implies one bend in the regression
line of best fit (a linear effect implies a straight line of best fit). The bend in the line of best fit
may reflect a U-shaped pattern (Figure C14.2, Panel A) or an inverted U-shaped pattern (Figure
C14.2, Panel B). Often, the quadratic curvilinear pattern will not represent a perfect U-shape.
Instead, for example, the line of best fit may increase across most of the first half of the scores
on the X-axis and then plateau across the second half of the X-axis (Figure C14.2, Panel C).
Finally, the opposite pattern may also be observed from time to time (Figure C14.2, Panel D).

Footnote 6: Przybylski and Weinstein (2017) also had data relevant to TV watching, computer
use, and smartphone use, if you're interested.
Prior to conducting the curvilinear regression, I’ll note that the Pearson correlation
between hours of video-game playing and mental well-being was estimated at r = -.116, p =
.002. Such a result suggests that higher levels of video game playing are associated with lower
levels of mental well-being. As I show below, it would be a big mistake to end the analysis
there.

The first step to conducting a quadratic curvilinear analysis is to transform the predictor
variable scores into a squared variable (Watch Video 14.8: Square a variable (via transform) in
SPSS). Once you create the new (squared) variable, you can conduct a hierarchical multiple
regression analysis. In the first block, include the original independent variable. In the second
block, include the squared independent variable. If the inclusion of the squared independent
variable into the model is found to be associated with a statistically significant increase in
model R2, then you can suggest that there was a curvilinear effect between the independent
variable and the dependent variable. For the video game playing and mental well-being
example, I obtained the following results in SPSS (Watch Video 14.9: Curvilinear (Non-Linear)
Regression in SPSS).
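For readers who like to verify such analyses outside of SPSS, here is a minimal Python sketch of
the two-block quadratic regression just described. The data are simulated on the spot purely so
the code runs; the variable names gaming_hours and well_being merely mirror this example
and are not the book's data file.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy data generated only so the code runs (an inverted-U shaped effect is built in).
rng = np.random.default_rng(1)
hours = rng.uniform(0, 7, 734)
well_being = 44 + 3.0 * hours - 0.6 * hours**2 + rng.normal(0, 6, 734)
df = pd.DataFrame({'gaming_hours': hours, 'well_being': well_being})

# Create the squared (quadratic) term.
df['gaming_hours2'] = df['gaming_hours'] ** 2

# Block 1: linear term only.
m1 = sm.OLS(df['well_being'], sm.add_constant(df[['gaming_hours']])).fit()

# Block 2: linear plus quadratic terms.
m2 = sm.OLS(df['well_being'], sm.add_constant(df[['gaming_hours', 'gaming_hours2']])).fit()

# R-square change and its F-test (the restricted model is passed to compare_f_test).
r2_change = m2.rsquared - m1.rsquared
f_value, p_value, df_diff = m2.compare_f_test(m1)
print(round(r2_change, 3), round(f_value, 2), round(p_value, 4))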

As can be seen in the SPSS table entitled ‘Model Summary’, the model R and R square
values associated with Model 1 (the inclusion of only video game playing as a predictor) were
estimated at .116 and .014, respectively, which implies that 1.4% of the variance in the
dependent variable (mental well-being) was accounted for by the independent variable (video
game playing hours). Furthermore, the model R2 was significant statistically, F(1, 732) = 10.06,
p = .002. It is not a coincidence that the model R2 p-value was identical to the Pearson
correlation p-value. They correspond to essentially the same analysis. More importantly,
Model 2, which includes both the gaming hours independent variable and the squared gaming
hours independent variable, yielded Model R and R2 values of .252 and .063, respectively,
which implies that 6.3% of the variance in mental well-being was accounted for by the
combination of the linear (video game hours) and quadratic effects (squared video game
hours). Furthermore, the R2 change (.050; i.e., .063 - .014, within rounding) was found to be
significant statistically, F(1, 731) = 38.90, p < .001. Thus, the inclusion of the squared video
game hours variable increased the predictive capacity of the regression equation. Such a result
implies that there is at least one bend in the line of best fit, i.e., the effect is not linear.
Prior to examining the scatter plot, you can see in the SPSS table entitled ‘Coefficients’
that both the gaming hours (gaming_hours) and squared gaming hours (gaming_hours2)
variables were statistically significant contributors to the model, as each variable was
associated with a statistically significant beta-weight: gaming_hours, b = .182, β = .634, t =
5.05, p < .001; gaming_hours2, b = -.028, β = -.783, t = -6.237, p < .001. Unfortunately, beta-
weights, even standardized beta-weights, are not particularly interpretable in a curvilinear
regression analysis. In the context of a quadratic curvilinear regression, the most you can really
say is the following. If the quadratic effect beta-weight is positive in direction, it suggests a U-
shaped pattern in the scatter plot. By contrast, if the beta-weight is negative in direction, it
suggests an inverted U-shaped pattern in the scatter plot.
Some would recommend transforming the independent variable into z-scores and
then squaring the z-scores to create a standardized variable to represent the quadratic effect in
a standardized way. The argument, here, is that these standardized variables, when analyzed
in the hierarchical multiple regression, will yield more interpretable beta-weights. I’m not
entirely convinced of this approach, as I do not believe the standardized beta-weights become
fully interpretable, in the same way that standardized beta-weights can be interpreted in a
more typical (linear) multiple regression.
Instead of focusing on the beta-weights in a curvilinear regression, one might
consider focusing on the semi-partial correlations, as they appear to be more interpretable. In
this example, the quadratic term was associated with a semi-partial correlation of r = -.223
(see SPSS table entitled ‘Coefficients’). The negative semi-partial correlation implies an
inverted U-shaped effect, which is consistent with the scatter plot depicted in Figure C14.3.
Furthermore, the squared semi-partial correlation of .050 (i.e., -.223²) implies that 5.0% of the
variance in mental well-being was accounted for by the curvilinear effect. The linear effect
accounted for 1.3% of the variance in mental well-being, on the basis of the squared semi-
partial correlation estimated from model 1 (i.e., .116² = .013). I do not believe the linear effect
semi-partial correlation from model 2 is interpretable in this way (so, it should probably be
ignored). Note that the sum of the squared semi-partial correlation associated with the linear
effect (model 1) and the squared semi-partial correlation associated with the quadratic effect
(model 2) essentially equals the model 2 R2 of .063 (.013 + .051 = .064, within rounding). I don't
know for certain if this will always be the case, but it (i.e., summing the model 1 linear squared
semi-partial and the model 2 quadratic squared semi-partial) does seem to apply across several
data files I have.
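As a quick check of the logic above, the quadratic term's semi-partial correlation can be
recovered (in absolute value) from the R2 change, because the squared semi-partial correlation
of a newly added predictor equals the increase in model R2. Using the rounded values reported
above:

import math

# The squared semi-partial correlation of an added predictor equals the R2 change.
r2_model_1 = 0.014                     # linear term only
r2_model_2 = 0.063                     # linear plus quadratic terms
sr_quadratic = math.sqrt(r2_model_2 - r2_model_1)
print(round(sr_quadratic, 3))          # ~0.221, i.e., |-.223| within rounding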
Before moving on to the all-important scatter plot, I'd like to point out that the
collinearity statistics (Tolerance and VIF) suggest that collinearity is a problem in the data, as
the tolerance is less than .10 and the VIF is greater than 10, in this example. There is much
literature on the issue that the linear effect term and the quadratic effect term necessarily
correlate with each other very substantially (in this case, they correlated r = .958). The concern
is that the substantial correlation will cause unacceptably high multi-collinearity. In this example,
you will note that tolerance was reported at .081 and the VIF was reported at 12.29 (see
‘Coefficients’ SPSS table). Such values suggest a problem (recall the guidelines that tolerance
should not fall below .10 and VIF should not exceed 10). However, in a curvilinear regression
analysis, you
should not trust these multicollinearity values. They are misleading and untrustworthy. If
multicollinearity were really as bad as VIF = 12.29, the standard errors associated with the
unstandardized beta weights would be larger than they are (e.g., gaming_hours2 had a tiny
unstandardized standard error of .004). Furthermore, neither the linear nor the quadratic
effects would be significant statistically; instead, they were both significant statistically. Again,
multicollinearity is not a problem. Ignore it as an issue in regressions that include product
terms, such as curvilinear regression and moderated regression (Disatnik & Sivan, 2016).
Consequently, feel free to proceed with the analysis.
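If you wish to see where the tolerance and VIF values come from, they can be computed
directly, which also makes the relation tolerance = 1/VIF explicit. The sketch below uses toy
data for the linear and squared gaming hours terms; it is an illustration of the computation,
not the SPSS output.

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Toy data generated only so the code runs.
rng = np.random.default_rng(1)
hours = rng.uniform(0, 7, 734)
df = pd.DataFrame({'gaming_hours': hours, 'gaming_hours2': hours ** 2})

X = sm.add_constant(df[['gaming_hours', 'gaming_hours2']])
for i, name in enumerate(X.columns):
    if name == 'const':
        continue
    vif = variance_inflation_factor(X.values, i)
    print(name, 'VIF =', round(vif, 2), 'tolerance =', round(1 / vif, 3))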

It is almost always essential to evaluate a curvilinear regression with a scatter plot and
line of best fit (Watch Video 14.10: Curvilinear (Non-Linear) Scatter Plot in SPSS). As can be
seen in Figure C14.3, the scatter plot depicted an inverted U-shaped relationship between
video game playing hours and mental well-being. Specifically, well-being tended to increase
from about 0 hours to 3 hours of video game playing, then started to decrease beyond 3 hours
of video game playing per day.7 Clearly, relying only upon the Pearson correlation of -.116
would mislead people in a serious way, as it failed to reflect the curvilinear nature of the
effect: small to moderate amounts of video game playing appears8 to be a good thing for the
mental well-being of 15-year-olds!

Figure C14.3. Scatter Plot Depicting the Curvilinear (Quadratic) Association Between Video
Game Playing (Hours Per Day) and Mental Well-Being

Finally, whether it is permissible to interpret a statistically significant linear effect in
the presence of a statistically significant quadratic effect appears to be discussed rarely. In my
view, a linear effect should not be interpreted in the presence of a statistically significant
quadratic effect, as it would be misleading. Consider the current study relevant to video-game
playing and mental well-being. If someone asked if video game playing was related positively
or negatively to mental well-being in 15-year-olds, one could not say, categorically, ‘yes,
positively’ or ‘yes, negatively’. The only way to describe the association, justifiably, is to state
that the association is consistent with an inverted U-shaped effect: the association is positive
from 0 to about 3 hours per day and then negative from beyond 3 hours onwards. In other
words, one would have to suggest that the nature and/or strength of the association depends on
where along the X-axis one is considering. Recall from the factorial ANOVA chapter that when
one has to state that the nature and/or strength of the effect depends on the values of
another variable, one is dealing with an interaction. Correspondingly, you may regard a
curvilinear regression as the most basic type of interaction in a regression context: the nature
and/or strength of the association between X and Y depends on the values of X. In the next
section of this chapter, I describe moderated (interaction) multiple regression with two
different independent variables and one dependent variable.

Footnote 7: My data don't quite replicate the results reported by Przybylski and Weinstein
(2017). They reported peaks in well-being at 2 hours per day, for weekdays, and 3 hours per
day, for the weekends.

Footnote 8: Based on these results, one could not contend justifiably that playing a small to
moderate amount of video games per day causes an increase in mental well-being. These data
are non-experimental in nature; consequently, one could only suggest the possibility of a causal
effect. When I was 15, I played about 1 hour per day on my Commodore 128 (in Commodore 64
mode). Those were the days!

Moderated Multiple Regression


We probably all know a few people with a large amount of motivation to succeed in
life, but not much in the way of brains, and, so, they achieve a certain amount in life, but not
really an impressive amount. Conversely, we probably all know a few people with a high level
of intelligence, but little to no motivation to succeed, and they achieve very little over the
years. Research has shown that both intelligence and motivation are ingredients for success
(e.g., Duckworth et al., 2011). However, some have theorized that, when you combine
elevated levels of intelligence and elevated levels of motivation in the same person, something
special happens: on average, you observe a disproportionate amount of success from this type
of person. That is, more so than you would predict from their intelligence and motivation
levels added-up as individual predictors. Instead, the combination of the two variables,
intelligence and motivation, has been theorized to have a multiplicative (“synergistic”) effect
on success. Anyway, that’s theory. It has been difficult to find empirical evidence for this
theory. In the area of academic success, however, Hirschfeld, Lawson and Mossholder (2004)
argued that it was important to measure motivation in a context-specific way, not general
motivation. Consequently, they obtained scores of general intelligence from 364
undergraduate university students. They also had the students complete a short academic
achievement motivation self-report questionnaire (e.g., ‘I always strive to achieve excellence
in my school work.’). Finally, the dependent variable was grade point average (GPA). Thus,
each of the three variables was measured on a continuous scale.
Prior to getting on with the analysis, I’ll note that, conceptually, you should think of
moderated multiple regression as similar to interactions in the context of factorial ANOVA.
Recall that an interaction was described as consistent with the observation of a difference in
the magnitude of the differences between means across levels of an independent variable. The
main difference is that in ANOVA, the independent variables are measured on a nominal scale,
whereas in multiple regression the independent variables are measured on a continuous
(ratio/interval/ordinal) scale. Of course, the dependent variable is measured on a continuous
scale for both types of analyses.
From a procedural perspective, the key element to conducting a moderated multiple
regression is the requirement to multiply the two independent variables together. Such a
process creates a product term (the multiplication of two scores yields a ‘product’). The
product represents the interaction term in the analysis. As per a curvilinear analysis, one can
test a hypothesized interaction effect via a hierarchical multiple regression. At step 1, the two
independent variables are entered into the equation. Then, at step 2, the product variable is
entered into the equation. If a statistically significant R2 change (as tested via F-change) is
observed between step 1 and step 2, a statistically significant interaction may be declared.
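A minimal Python sketch of the product-term procedure just described is provided below. The
data are simulated purely so the code runs, and the column names (iq, motivation, gpa) are
illustrative assumptions; the sketch is not the SPSS analysis reported next.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy data generated only so the code runs (a small interaction effect is built in).
rng = np.random.default_rng(2)
iq, mot = rng.normal(100, 15, 364), rng.normal(3.5, 0.6, 364)
df = pd.DataFrame({'iq': iq, 'motivation': mot,
                   'gpa': 0.01*iq + 0.3*mot + 0.004*(iq-100)*(mot-3.5) + rng.normal(0, 0.5, 364)})

# Create the product (interaction) term.
df['iq_x_mot'] = df['iq'] * df['motivation']

# Step 1: the two independent variables only.
m1 = sm.OLS(df['gpa'], sm.add_constant(df[['iq', 'motivation']])).fit()

# Step 2: the two independent variables plus the product term.
m2 = sm.OLS(df['gpa'], sm.add_constant(df[['iq', 'motivation', 'iq_x_mot']])).fit()

# The R2 change and its F-test evaluate the moderation (interaction) hypothesis.
r2_change = m2.rsquared - m1.rsquared
f_value, p_value, df_diff = m2.compare_f_test(m1)
print(round(r2_change, 3), round(f_value, 2), round(p_value, 4))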
I simulated some data to correspond to the results reported by Hirschfeld et al. (2004)
(Data File: IQ_motivation_gpa). First, I created the product term by multiplying the
intelligence scores by the academic achievement scores. In SPSS this can be done via the
‘Compute variable’ utility (Watch Video 14.11: Create Product Term Variable in SPSS). Next, I
examined the Pearson correlations between the variables, which is always good practice for
virtually any multiple regression analysis.

As can be seen in the SPSS table entitled ‘Correlations’, both intelligence (r = .42, p <
.001) and academic motivation (r = .35, p < .001) correlated positively with academic
performance (GPA). However, intelligence and academic motivation did not correlate with
each other significantly (.06, p = .254). Furthermore, the intelligence by motivation product
term correlated positively with GPA (r = .51, p < .001). Finally, the product term correlated
with intelligence and academic motivation at .56 and .86, respectively. Am I worried about the large
correlation of .86 between the product term and academic motivation? Not one bit. As far as
multicollinearity is concerned, it is not a problem. It’s an illusion.
Next, I conducted the hierarchical multiple regression in SPSS (Watch Video 14.12:
Moderated Multiple Regression in SPSS). There are three key tables to consult to interpret the
results fully: the ‘Model Summary’ table, the ‘ANOVA’ table, and the ‘Coefficients’ table. As
can be seen in the ‘Model Summary’ table below, the Model 1 results are reported, which
includes the model R and R square associated with the two predictors included in the analysis.


Thus, 28.2% of the variance in GPA was accounted for by intelligence and academic
motivation, which was significant statistically, F(2, 361) = 70.99, p < .001. We don't know
which variables contributed uniquely to the equation, yet. Those results are reported in the
‘Coefficients’ table, where it can be seen that both intelligence and academic motivation were
statistically significant contributors to the model. Specifically, intelligence: b = .002, β = .40, t =
8.97, p < .001; academic motivation: b = .24, β = .33, t = 7.30, p < .001.
The most important piece of information in the SPSS ‘Model Summary’ table is the ‘R
square change’ and associated F- and p-value. In this case, the R square change was equal to
.010, which was statistically significant, F(1, 360) = 4.97, p = .026. The statistically significant R
square change implies that the moderator hypothesis was supported: it does appear to be the
case that the association between intelligence and GPA varies as a function of academic
motivation. One could just as appropriately state that the association between academic
motivation and GPA varies as a function of intelligence. They are two sides of the same coin. As
can be seen in the
SPSS table entitled ‘Coefficients’, the intelligence by academic motivation product term was
associated with a statistically significant beta-weight, b = .001, β = 1.06, t = 2.23, p = .026. The
overall model including all three independent variables (i.e., Model 2) was associated with a
model R2 of .292 (model R = .540), which was significant statistically, F = 49.50, p < .001. It's not a
coincidence that the sum of squares ‘Regression’ divided by the sum of squares ‘Total’ is equal
to the model R2 value (i.e., 30.913 / 105.851 = .292). As I describe further below, there is a
close correspondence between ANOVA and multiple regression. SPSS makes use of an ANOVA
framework to test the final model R2 for statistical significance, hence the inclusion of a table
entitled ‘ANOVA’.
Interpreting a statistically significant moderator effect in multiple regression is not
easy, especially if all you rely upon is the beta-weight. As per curvilinear regression, a
positively directed interaction term beta-weight implies that the association between the
independent variable and the dependent variable becomes progressively stronger across the
spectrum of the moderator variable values. In this study, which variable is considered
the independent variable and which the moderator is arbitrary. Theoretically and
empirically, it is not possible to assert whether intelligence is the moderator of the effect
between academic motivation and GPA, or whether academic motivation is the moderator
of the effect between intelligence and GPA. It's best to think about the moderator effect as a
mutual, multiplicative (synergistic) effect.
Prior to demonstrating a graphical method to depict the nature of a moderated effect,
it is useful to point out that neither the unstandardized nor the standardized beta-weights
associated with the interaction term are readily interpretable. For example, in this
investigation, the standardized beta-weight associated with the interaction term exceeded 1.0,
which is possible, mathematically; it's just not easy to interpret in an insightful way. Instead, I
recommend that you focus upon the semi-partial correlations associated with a moderated
multiple regression analysis, as per my recommendation for a curvilinear regression. That is, by
summing the squared semi-partial correlations associated with the main effects included in
model 1 (i.e., .400² + .325²) and the squared semi-partial correlation of the product term
reported in model 2 (i.e., .10²), you get close to the overall model 2 R square of .292 (i.e., .160 +
.106 + .010 = .276). The difference between the two R square values, .292 - .276 = .016,
represents the percentage of variance accounted for in the dependent variable that is due to the
shared variance between the predictor variables. Recall that intelligence and motivation were
inter-correlated at only r = .060, so it makes sense that the predictive influence of their common
variance on the dependent variable was fairly small, in this example. I also note, briefly, that the
presence of a suppressor effect would complicate the interpretation of the difference between
the .292 and .276 R square values. I discuss suppressor effects further in this chapter.


An arguably more sophisticated, but more difficult to understand, approach to
evaluating the relative influence of predictors in a moderated multiple regression is a
commonality analysis (a.k.a., element analysis). Commonality analysis decomposes the unique
and common percentages of variance accounted for by each individual variable, as well as all
possible combinations of the variables, included in the model. When there are three variables
entered into the multiple regression analysis, there are 7 possible variable combinations that
may have accounted for a proportion of the explained variance attributable to the regression
equation. In this example study, 29.2% of the variance in GPA was accounted for by the
multiple regression equation. However, it is not known how that percentage of variance arose.
Unfortunately, it is not as simple as summing the squared semi-partial correlations associated
with each variable included in the analysis, as the inter-predictor variable correlations would
imply that some of the overall R2 arose due to combinations of the predictor variables, hence
the term ‘commonality analysis’. Fortunately, Nimon (2010) developed a script that can be run
within SPSS to conduct a commonality regression analysis, which I applied to the current
intelligence, academic motivation and GPA data. As can be seen in Table C14.2 (left-side),
the 29.2% of the variance shared between the predictors (IQ and motivation) and the
dependent variable was not due, to any substantial degree, to the unique effects of IQ,
motivation, or the moderator term. Instead, the vast majority of the shared variance was due to
the common IQ
and IQ*motivation effect (54.50%), as well as the common motivation and IQ*motivation
effect (34.96%). Only a relatively small percentage (8.68%) of the shared variance was due to
the common IQ and motivation effect.


Table C14.2. Results of Commonality Regression Analysis: IQ, Motivation, GPA
(Left side: Individual Combinations; right side: Individual Variables)
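To give a feel for where commonality coefficients come from, the sketch below computes the
raw ingredients of a commonality analysis: the model R2 for every possible subset of the three
predictors, along with each predictor's unique contribution (its squared semi-partial
correlation). It is a hedged illustration with toy data, not the Nimon (2010) SPSS script.

from itertools import combinations
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy data generated only so the code runs; column names are illustrative.
rng = np.random.default_rng(3)
iq, mot = rng.normal(100, 15, 364), rng.normal(3.5, 0.6, 364)
df = pd.DataFrame({'iq': iq, 'mot': mot,
                   'gpa': 0.01*iq + 0.3*mot + 0.004*(iq-100)*(mot-3.5) + rng.normal(0, 0.5, 364)})
df['iq_x_mot'] = df['iq'] * df['mot']

predictors = ['iq', 'mot', 'iq_x_mot']
r2 = {}
for k in range(1, len(predictors) + 1):
    for subset in combinations(predictors, k):
        model = sm.OLS(df['gpa'], sm.add_constant(df[list(subset)])).fit()
        r2[subset] = model.rsquared

full = r2[tuple(predictors)]
for p in predictors:
    reduced = tuple(v for v in predictors if v != p)
    # The unique contribution of p is the drop in R2 when p is removed from the full
    # model (equivalently, p's squared semi-partial correlation).
    print(p, 'unique R2 =', round(full - r2[reduced], 3))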

Moderation: Spotlight Analysis and Plots


In addition to a commonality regression analysis, it is advisable to create a
visual representation of a moderator effect. The essential strategy associated with depicting a
moderated regression effect is to represent the association between the independent variable
and the dependent variable across two or more levels of the moderator variable. Such a
strategy may be considered consistent with a ‘spotlight test’ (Spiller, Fitzsimons, Lynch, &
McClelland, 2013), which may incorporate the estimation of the correlation between the
independent and dependent variables across the different levels of the moderator. When
there is a statistically significant moderator effect, the magnitude/direction of the correlations
will be unequal.
In practice, given that a moderator is often measured on a continuous scale in a typical
moderated multiple regression, the spotlight analysis levels need to be created artificially. For
example, one may split the sample into two groups: low-moderate versus moderate-high. In
practice, applying such a strategy would involve calculating the median value on the
moderator and giving all cases with a value less than or equal to the median on the
moderator a ‘1’ and all cases with a value greater than the median on the moderator a value of
‘2’. Then, create two scatter plots with the inclusion of a line of best fit, as well as the
correlation for each of the two portions of the sample.
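A minimal sketch of this median-split spotlight strategy is given below (toy data and
illustrative column names; the SPSS point-and-click steps are shown in the video).

import numpy as np
import pandas as pd

# Toy data generated only so the code runs.
rng = np.random.default_rng(4)
iq, mot = rng.normal(100, 15, 364), rng.normal(3.5, 0.6, 364)
df = pd.DataFrame({'iq': iq, 'motivation': mot,
                   'gpa': 0.01*iq + 0.3*mot + 0.004*(iq-100)*(mot-3.5) + rng.normal(0, 0.5, 364)})

median_mot = df['motivation'].median()
low_group = df[df['motivation'] <= median_mot]     # low to moderate motivation
high_group = df[df['motivation'] > median_mot]     # moderate to high motivation

r_low = low_group['iq'].corr(low_group['gpa'])
r_high = high_group['iq'].corr(high_group['gpa'])
print(round(r_low, 2), round(r_high, 2))           # unequal correlations suggest moderation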
I conducted such a spotlight analysis with the IQ, motivation and GPA data (Watch
Video 14.13: Spotlight Analysis in SPSS - Basic). As can be seen in Figure C14.4, the two scatter
plots depicting the association between intelligence and GPA across the two levels of
academic motivation (low to moderate versus moderate to high) were a little different.
Specifically, in the low to moderate portion of the sample, the angle of the slope was slightly
lower than the angle of the slope for the moderate to high academic motivation group.
Correspondingly, the Pearson correlation between intelligence and GPA was larger (r = .46) for
the moderate to high portion of the sample, in comparison to the corresponding Pearson
correlation for the low to moderate motivation portion of the sample (r = .38). The statistically
significant interaction detected by the moderated multiple regression (R2 change = .010, F =
4.97, p = .026), implies, at least in rough terms, that the two correlations are statistically
significantly different from each other. However, this will not necessarily be the case, as there
are many ways to split-up the sample across the moderator continuum. In this example, the
statistically significant interaction implies that the correlation between intelligence and GPA
was not equal across all levels of the academic motivation. In practice, we do not know exactly
where the split-sample correlations are unequal across the spectrum of the moderator.
Splitting the sample into low and high groups is a good start, and my demonstration above will
hopefully give you some insight into the graphical spotlight analysis.
Fortunately, rather than estimate the median and create the two groups manually,
O’Connor (1998) created a nice, little SPSS syntax file that will conduct a ‘split slope’ analysis in
a relatively efficient way (Watch Video 14.14: Spotlight Analysis in SPSS - Extreme). The default
in the O’Connor (1998) syntax file is to estimate the independent variable and dependent
variable slopes across three levels of the moderator: (1) one standard deviation below the
moderator mean, (2) the moderator mean, and (3) one standard deviation above the
moderator mean. Additionally, the X-axis data points are generated by the syntax file on the
basis of one standard deviation below and above the independent variable mean. When I ran
the O’Connor (1998) syntax with the default settings, I obtained the chart depicted in Figure
C14.5 (left-side). It can be seen that the degree of positive slope increased across the three
levels of academic motivation. Thus, the magnitude of the association between intelligence
and academic achievement (GPA) increased across the continuum of academic motivation.
Thus, the combination of intelligence and academic motivation impacts academic performance
in such a way that we would not predict, based only on an examination of the intelligence and
motivation scores, separately. Something a little “special” appears to happen when relatively
elevated levels of intelligence are combined with relatively elevated levels of academic
motivation.
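The arithmetic behind this type of plot is straightforward: from the fitted model GPA = b0 +
b1(IQ) + b2(motivation) + b3(IQ x motivation), the simple slope of GPA on IQ at a chosen
motivation value m equals b1 + b3*m. The sketch below computes the simple slopes at the
moderator mean and at one standard deviation above and below it; it uses toy data and is not
the O'Connor (1998) syntax itself.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy data generated only so the code runs; column names are illustrative.
rng = np.random.default_rng(5)
iq, mot = rng.normal(100, 15, 364), rng.normal(3.5, 0.6, 364)
df = pd.DataFrame({'iq': iq, 'motivation': mot,
                   'gpa': 0.01*iq + 0.3*mot + 0.004*(iq-100)*(mot-3.5) + rng.normal(0, 0.5, 364)})
df['iq_x_mot'] = df['iq'] * df['motivation']

m2 = sm.OLS(df['gpa'], sm.add_constant(df[['iq', 'motivation', 'iq_x_mot']])).fit()
b1 = m2.params['iq']
b3 = m2.params['iq_x_mot']

mot_mean, mot_sd = df['motivation'].mean(), df['motivation'].std()
for label, m in [('-1 SD', mot_mean - mot_sd), ('mean', mot_mean), ('+1 SD', mot_mean + mot_sd)]:
    # Simple slope of GPA on IQ at this level of the moderator: b1 + b3 * m
    print(label, round(b1 + b3 * m, 4))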
As mentioned, the default setting in the O’Connor (1998) syntax is to split the
moderator data across three groups, two of which are one standard deviation above and
below the mean (get syntax here). You can manipulate the O’Connor (1998) syntax such that
the moderator data are split across three groups, two of which are two (rather than one)
standard deviations above and below the mean. Increasing the standard deviation from one to
two standard deviations has the effect of accentuating the magnitude of the interaction effect.
As can be seen in Figure C14.5 (right side), the difference in the degree of slope is more
substantial for the two standard deviation criterion, in comparison to the one standard deviation
criterion. Consequently, in my view, the split slopes analysis commonly used by researchers
tends to exaggerate the magnitude of the interaction effect. By comparison, my original split
slopes analysis based on low-moderate and moderate-high groups (Figure C14.4) is arguably a
more appropriate (i.e., realistic) method to represent a two-way interaction for a moderator
analysis (you can also present a single scatter plot with two lines of best fit, as I demonstrate in
the video), as it is more consistent with the moderated multiple regression results, which are
based on all of the data (not extreme groups data). Finally, you might find it beneficial to split
the sample into a larger number of finer portions for a more comprehensive and refined
‘floodlight analysis’ (see Spiller et al., 2013).

Figure C14.4. Scatter Plots for Split Slope Analysis (Raw Data)
Low to Moderate Motivation (r = .38) Moderate to High Motivation (r = .46)

Figure C14.5. Scatter Plots: Moderator at 1 SD Above/Below the Mean (left panel) and at 2 SD
Above/Below the Mean (right panel)

Multicollinearity in Moderated Regression


At face value, multicollinearity should be considered a problem in moderated multiple
regression, as one or both of the predictors in a moderated multiple regression will necessarily
correlate very highly (.80+) with the product term derived from the two independent variables.

Correspondingly, the tolerance and variance inflation factor indicators will almost always
suggest that multicollinearity is grossly excessive. Consequently, researchers tend to apply
strategies to reduce the degree of shared variance between the predictors and the product
term. One commonly observed strategy involves subtracting each predictor variable’s mean
from the case values for a given variable, prior to calculating the product term to represent the
moderator effect. Such a strategy is known as ‘mean centering’.
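A minimal sketch of mean centering is given below (toy data, illustrative column names). Note
that centering alters the correlations between the predictors and the product term, but it does
not change the test of the interaction term itself.

import numpy as np
import pandas as pd

# Toy data generated only so the code runs.
rng = np.random.default_rng(6)
iq, mot = rng.normal(100, 15, 364), rng.normal(3.5, 0.6, 364)
df = pd.DataFrame({'iq': iq, 'motivation': mot})

# Mean centering: subtract each predictor's mean before forming the product term.
df['iq_c'] = df['iq'] - df['iq'].mean()
df['mot_c'] = df['motivation'] - df['motivation'].mean()
df['iq_x_mot_c'] = df['iq_c'] * df['mot_c']          # centered product term

# The centered product term correlates far less with its components than the raw product does.
df['iq_x_mot_raw'] = df['iq'] * df['motivation']
print(round(df[['iq', 'iq_x_mot_raw']].corr().iloc[0, 1], 3),
      round(df[['iq_c', 'iq_x_mot_c']].corr().iloc[0, 1], 3))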
Although mean centering will often reduce multicollinearity to some degree, there is
much debate on whether multicollinearity is truly a problem for moderated multiple
regression. Personally, I’m in the ‘it’s not a problem camp’. Like others, I believe the perceived
multicollinearity is really just an illusion. As demonstrated by Disatnik and Sivan (2016), for
example, the multicollinearity apparent when a product term is included in a multiple
regression is due to interval scaling. Furthermore, despite the observation of a very large
variance inflation factor in a moderated regression, multicollinearity due to interval scaling
does not impact (increase) standard errors for the beta-weights (Disatnik & Sivan, 2016).
Again, this sort of multicollinearity is an illusion. It should be kept in mind, however, that
multicollinearity due to a substantial amount of shared variance between separately measured
independent variables (i.e., not product terms) should be evaluated and taken into account,
when it is observed.

Mediated Multiple Regression


The association between two continuously scored variables may be estimated via a
Pearson correlation or a bivariate regression (standardized) coefficient. As you may know,
both analyses yield the same results. Although a Pearson correlation and a bivariate regression
(standardized) will yield the same result, they can be considered, theoretically, to represent
two different types of associations. With respect to the Pearson correlation, there is no
suggestion of causality: the two variables are simply suggested to be associated. By contrast, a
bivariate regression result may be interpreted to imply that the independent variable causes,
at least to some degree, the dependent variable. Importantly, there is no way to know
whether the independent variable causes the dependent variable with a bivariate regression.
I’m only saying, here, that a researcher may imply the possibility of a causal effect between X
and Y in the context of a bivariate regression. Given the different approaches to interpreting
the nature of a Pearson correlation and a bivariate regression, they can also be represented
visually in their respective ways. As can be seen in Figure C14.6 (Panel A), a Pearson
correlation is represented by a curved arrow between two squares. One of the squares
represents the independent variable (X) and the other square represents the dependent variable
(Y). By comparison, a bivariate regression is represented by a straight arrow between the two
squares, where the arrow leads from the independent variable to the dependent variable
(Figure C14.6, Panel B). In the context of mediation analyses, a bivariate regression coefficient
(B) between X and Y is often referred to as a total effect. The direction of the arrow may be
considered to imply that the independent variable causes the dependent variable. Again, there
is no way to establish causality with a bivariate regression. However, more complicated
models can be tested to help suggest more convincingly the possibility of causality.

Figure C14.6 Visual Depictions of the Pearson Correlation and Bivariate Regression Models
Panel A Panel B

One such model is a mediation regression model. In its simplest form, the purpose of
mediated regression is to help understand the nature of an association between two variables.
Specifically, a researcher may hypothesize that the total effect between two variables (X and Y)
may be due, partially or fully, to a third variable. A variable hypothesized to account, partially
or fully, for the association between two variables is known as a mediator (M). When a variable is
found to account, partially or fully, for the association between X and Y, researchers will often
state that there is an indirect effect of X on Y via M. In practice, the generation of a
hypothesis that includes a mediator should be supported by theoretical considerations.
For example, Stoeber and Corr (2016) were interested in the association between self-
oriented perfectionism and positive affect. As the name implies, self-oriented perfectionism
pertains to having high expectations for oneself, rather than the sort of perfectionism that is
perceived by a person to be imposed on them by others. Theoretically, one may hypothesize
that self-oriented perfectionism (X) influences individual differences in state positive affect (Y)
(feelings of enthusiasm, pride, etc.). Furthermore, it may be hypothesized that the association
between self-oriented perfectionism and positive affect may be due (mediated), partially or
fully, to individual differences in flourishing (M) (self-perceived success).
The hypothesized mediation models are depicted visually in Figure C14.7 (Watch
Video 14.15: Mediation - Purpose & Key Terms). In the partial mediation model (Panel A),
there is a straight arrow leading from X to Y (c). Additionally, there is an arrow leading from X
to M (a) and an arrow leading from M to Y (b). The coefficient associated with path (a) is
simply a bivariate regression coefficient. Paths (b) and (c) represent multiple regression beta
weights. Prior to testing the model depicted in Panel A, one would expect to see a statistically
significant total effect between self-oriented perfectionism and positive affect (based on a
Pearson correlation or a bivariate regression). When partial mediation is observed, the direct
effect (c) between X and Y will reduce in magnitude, but remain significant statistically (p <
.05). Thus, the observation of partial mediation will involve three direct effects that are
statistically significant (a, b, and c; see Figure C14.7, Panel A). In addition to the three direct
effects, there is also one indirect effect associated with the partial mediation model, but you
can’t see it obviously in the model. The indirect effect corresponds to the product of the X-->M
(a) and M-->Y (b) direct effect coefficients. Stated alternatively, the indirect effect represents
the association between X and Y that is due to the influence of M. In practice, testing the
statistical significance of the indirect effect is the tricky part, as most statistical programs do
not test them in an automatic way. However, in order for partial (or full) mediation to be
supported, the indirect effect must be found to be significant statistically.

Figure C14.7 Visual Depictions of Partial (Panel A) and Full (Panel B) Mediation Models
Panel A Panel B

In contrast to partial mediation, full mediation is observed when there is a significant
total effect between X and Y (c). However, once the mediator is added to the model, the direct
effect between X and Y (c) is not found to be significant statistically. Additionally, the indirect
effect between X and Y (a*b) must be observed to be significant statistically to support full
mediation. The direct effect (c) does not have to be exactly zero to suggest full mediation. The
direct effect coefficient just has to be non-significant statistically (with a corresponding
significant indirect effect). When full mediation is observed, one could represent the nature of
the associations between the variables as per Figure C14.7 (Panel B), because the model
implies that there is no direct effect between X and Y. That is, the entire total effect between X
and Y is presumed to be due to the mediator. Consequently, X cannot be considered a direct
cause of Y. Stated alternatively, one might suggest the possibility that the association between
X and Y is accounted for entirely by M.
To demonstrate a mediation analysis with one hypothesized mediator, I return to the
Stoeber and Corr (2016) study, where the association between self-oriented perfectionism and
positive affect is hypothesized to be mediated, partially or fully, by individual differences in
flourishing (Data File: perfectionism_flourishing). First, I estimated the bivariate regression
coefficient between self-oriented perfectionism and positive affect (Watch Video 14.16:
Testing Mediation in SPSS - Example 1). As can be seen in the SPSS table entitled ‘Coefficients’,
the unstandardized (.102) and standardized (.140) beta-weights were significant statistically, p
= .006. Thus, there was a total effect to potentially mediate.

Next, I regressed the hypothesized mediator (flourishing) onto the self-oriented
perfectionism variable, in order to estimate the coefficient (a) depicted in Panel A of Figure
C14.7. As can be seen in the SPSS table entitled ‘Coefficients’ below, the effect was positive
and significant statistically: unstandardized beta-weight = .197; standardized beta-weight =
.190; both p < .001.

Next, I submitted the three variables to a multiple regression analysis, in order to
estimate the (b) and (c) coefficients depicted in the Panel A model of Figure C14.7. As can be
seen in the SPSS table below entitled ‘Coefficients’, the self-oriented perfectionism variable was
not observed to be associated with a statistically significant direct effect onto positive affect,
unstandardized beta weight = .017, standardized beta-weight = .023, p = .573. However, the
unique effect (b) of flourishing onto positive affect was significant statistically: unstandardized
beta weight = .430, standardized beta-weight = .616, p < .001. The fact that the total effect
between self-oriented perfectionism and positive affect was significant statistically, but the
corresponding direct effect with the inclusion of flourishing as a mediator was not, suggests
that full mediation may have occurred. To ensure that full mediation can be interpreted
convincingly, it is necessary to demonstrate that the indirect effect between self-oriented
perfectionism and positive affect was significant statistically. Prior to testing the indirect effect
for statistical significance, I will note that the unstandardized indirect effect (a*b) point-
estimate corresponded to .085 (.197*.430). Correspondingly, the standardized indirect effect
point-estimate corresponded to .117 (.190*.616).
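The three regressions just described, and the by-hand product of the a and b paths, can be
sketched in Python as follows. The data are simulated only so the code runs, and the column
names are illustrative assumptions; the values will not reproduce the Stoeber and Corr (2016)
results.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy data generated only so the code runs; column names are illustrative.
rng = np.random.default_rng(7)
n = 300
sop = rng.normal(0, 1, n)                             # self-oriented perfectionism (X)
flourishing = 0.2 * sop + rng.normal(0, 1, n)         # hypothesized mediator (M)
pos_affect = 0.4 * flourishing + rng.normal(0, 1, n)  # dependent variable (Y)
df = pd.DataFrame({'sop': sop, 'flourishing': flourishing, 'pos_affect': pos_affect})

# Total effect of X on Y.
total = sm.OLS(df['pos_affect'], sm.add_constant(df[['sop']])).fit()

# Path a: X predicting M.
path_a = sm.OLS(df['flourishing'], sm.add_constant(df[['sop']])).fit()

# Paths b and c: M and X predicting Y simultaneously.
paths_bc = sm.OLS(df['pos_affect'], sm.add_constant(df[['sop', 'flourishing']])).fit()

a = path_a.params['sop']
b = paths_bc.params['flourishing']
indirect = a * b                                      # unstandardized indirect effect
print(round(indirect, 3))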


Unfortunately, SPSS does not provide a test of the indirect effect in an automatic way.
Fortunately, there are SPSS syntax/macro files that can be used to test an indirect effect for
statistical significance. A particularly popular approach was published by Preacher and Hayes
(2004). You can download the syntax file here: https://tinyurl.com/y9h3bqlx. Preacher and
Hayes’ (2004) program can test the significance of an indirect effect in two ways: (1) Sobel
test; and (2) bootstrapping. With respect to the Sobel test, I obtained the following key results
from the Preacher and Hayes (2004) SPSS macro:

It can be seen that the unstandardized indirect effect was estimated at .0849 (same as I calculated
by hand). Furthermore, the standard error was .0231, which yielded a z-value of 3.6702 (i.e.,
.0849/.0231) and a p-value of .0002. Thus, the indirect effect between self-oriented
perfectionism and positive affect via flourishing was significant statistically. Technically, these
results correspond only to the unstandardized effects. You would only be able to imply that
the standardized indirect effect (i.e., .117) was also significant statistically, based on the Sobel
test.
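
For readers who would like to see the arithmetic behind the Sobel test, the minimal Python sketch below (i.e., outside SPSS) reproduces the calculation. The sobel_test() function is a generic helper written for illustration; the standard errors of the a and b paths are not reported above, so the final lines simply reproduce the z and p values using the combined standard error (.0231) reported by the Preacher and Hayes (2004) macro.

# A minimal sketch of the Sobel test for an indirect effect (a*b), outside SPSS.
import math
from scipy.stats import norm

def sobel_test(a, b, se_a, se_b):
    # First-order Sobel standard error; substitute the a/b path estimates and
    # standard errors from your own SPSS 'Coefficients' tables.
    indirect = a * b
    se_ab = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = indirect / se_ab
    p = 2 * norm.sf(abs(z))
    return indirect, se_ab, z, p

# Reproduce the z and p values from the indirect effect and standard error reported above:
z = .0849 / .0231              # about 3.68 here; the macro reports 3.6702 from unrounded values
p = 2 * norm.sf(abs(z))        # about .0002
print(round(z, 2), round(p, 4))
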
Fortunately, Preacher and Hayes’ macro (2004) also provides the option to test the
statistical significance of an indirect effect via bootstrapping. Some arguments have been
made that the Sobel test does not provide accurate results in all circumstances, particularly
when the sample size is small. Additionally, there is reason to believe that the sampling
distribution of the product of two coefficients may not be normal. Consequently, Preacher and
Hayes (2004) recommend testing the statistical significance of an indirect effect via bootstrapping. As you may already know, bootstrapping does not assume normality. When I applied the Preacher and Hayes (2004) macro based on 2,000 bootstrapped
samples, I obtained the following results.

It can be seen that the unstandardized indirect effect standard error is only slightly
different, in this case. Furthermore, the 95% confidence intervals associated with the point-
estimate (.0849) did not intersect with zero, which is consistent with a statistically significant
indirect effect. Mediation models with only one hypothesized mediator are known as ‘simple’
mediation models. Often, a researcher may hypothesize the possibility of two or more
mediators in the same model. With such relatively complicated models, I would encourage researchers to move to a path analytic framework, which can be employed with a program such as Amos, for example. If you do not have access to a path analysis program, you may also consider Hayes’ more comprehensive mediation-testing program known as PROCESS (a Google search should find the most up-to-date version).
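
If you would like to see what the percentile bootstrap of an indirect effect is doing under the hood, the following minimal Python sketch (outside SPSS) illustrates the logic. The file name and column names are hypothetical placeholders for an export of the perfectionism_flourishing data file; note that the SPSS macro additionally offers bias-corrected intervals, which are not implemented here.

# A minimal sketch of a percentile bootstrap for an indirect effect (a*b), outside SPSS.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.read_csv("perfectionism_flourishing.csv")     # hypothetical CSV export of the data file
X = df["self_oriented_perf"].values                   # hypothetical column names
M = df["flourishing"].values
Y = df["positive_affect"].values
n = len(df)

def a_path(x, m):
    # unstandardized slope of M regressed on X (with an intercept)
    design = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(design, m, rcond=None)[0][1]

def b_path(x, m, y):
    # partial (unique) slope of Y on M, controlling for X
    design = np.column_stack([np.ones(len(x)), x, m])
    return np.linalg.lstsq(design, y, rcond=None)[0][2]

indirect = a_path(X, M) * b_path(X, M, Y)              # point-estimate of a*b

boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)                        # resample cases with replacement
    boot.append(a_path(X[idx], M[idx]) * b_path(X[idx], M[idx], Y[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])              # 95% percentile confidence interval
print(round(indirect, 4), round(lo, 4), round(hi, 4))  # a CI excluding zero suggests mediation
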
To provide an example of partial mediation, I have conducted an additional mediation
analysis on the following variables included in the Stoeber and Corr (2016) study: social-
oriented perfectionism (a bad thing; perfectionism perceived to be imposed by others),
flourishing, and negative affect. In this case, one may hypothesize that a positive association (total effect) may be present between socially-oriented perfectionism and negative affect. Furthermore, flourishing may mediate, partially or fully, the total effect between socially-oriented perfectionism and negative affect. First, I estimated the total effect between X and Y,
which was found to be significant statistically, unstandardized beta-weight = .390,
standardized beta-weight = .430, p < .001. Thus, there was the possibility of observing
mediation.

Next, the beta-weight between the independent variable (X) and the mediator (M) was estimated (i.e., coefficient a), which was also significant statistically: unstandardized beta-
weight = -.299; standardized beta-weight = -.240, p < .001.

Finally, a multiple regression was performed by regressing the negative affect dependent
variable (Y) onto the social-perfectionism independent variable (X) and the flourishing

mediator variable (M). As can be seen in the SPSS table entitled ‘Coefficients’, the flourishing
mediator was associated with negative affect significantly: unstandardized beta-weight = -
.167; standardized beta-weight = -.230, p < .001. Importantly, there was also a statistically significant, positive direct effect between social-perfectionism and negative affect, p < .001 (given the total and indirect effects reported here, the unstandardized direct effect corresponds to approximately .390 - .050 = .340).

Thus, at best, partial mediation may be declared. In order to support a partial


mediation interpretation, the indirect effect between social-perfectionism (X) and negative
affect (Y), which, in unstandardized form, corresponded to .050 (-.299 *-.167), needs to be
tested for statistical significance. As per the full mediation result above, I tested the indirect
effect for significance with the Preacher and Hayes (2004) SPSS macro (Watch Video 14.17:
Testing Mediation in SPSS - Example 2). As can be seen below, the indirect effect was
statistically significant via the Sobel test, z = 3.44, p = .0006, as well as the bootstrapped
estimation approach, 95%CI: .02/.08. Thus, as the direct effect between social-perfectionism
and negative affect was significant and the indirect effect between social-perfectionism and
negative affect was statistically significant, it may be suggested that the partial mediation
hypothesis was supported. From a psychological perspective, social-perfectionism appears to
be a pretty bad phenomenon, as it has a direct effect on negative affect, beyond the indirect
effect via a negative influence on flourishing.

Does the total effect between X and Y need to be significant?


If the total effect between X and Y is not significant statistically in the first instance
(Figure C14.6, Panel B), there is no possibility of observing partial or total mediation: there is
nothing to mediate. As I describe in the Advanced Topics section of this chapter devoted to
suppressor effects, however, it is sometimes useful to test the possibility that a total effect will
increase in magnitude by the inclusion of one or more other variables, even if the total effect
is near zero! Such an analysis is not a mediation analysis. It is a suppression analysis.

Mediation versus Moderation


Although mediated multiple regression is distinctly different to moderated multiple
regression, I know of no two other statistical analyses that are confused more commonly, even amongst researchers with decades of research experience. Obviously, there is something about these two concepts that is difficult to distinguish for most people. My tip to you is to always keep in mind that the concept of moderation is identical to the concept of interaction in the context of ANOVA. So, for both moderation and interaction, the magnitude and/or
direction of an effect depends upon the levels of another variable. A moderator is an
interacting variable.
In contrast to moderation, mediation is about testing the possibility that one or more
‘third’ variables may account for the effect between X and Y, either partially or fully. The
notion of an indirect effect is unique to mediation. There are no indirect effects in a
moderation analysis. Mediation within the context of multiple regression is connected to the
ANOVA framework through ANCOVA. That is, the difference between two means may be
reduced (partial mediation) or eliminated entirely (full mediation) in an ANCOVA. In the
Advanced Topics section of this chapter, I show how ANCOVA and partial correlation are
exactly the same analyses.

Advanced Topics

Proving that Multiple R is Really Just a Pearson r


To help understand the nature of multiple R, it is a good exercise to calculate the
predicted values of Y and correlate those values with the actual Y values. In the context of the
education, IQ and earnings example, the unstandardized regression equation can be used to
estimate each case’s predicted Y value:
ŷ = -56261.421 + 3756.289X1 + 501.978X2
where X1 = years of education completed, and X2 = IQ. To calculate each case’s predicted
annual earnings value efficiently in SPSS, you can use the Transform -> Compute Variable utility, or you can use the ‘Save’ utility.
If you run a Pearson correlation between the predicted values and the annual earnings
values, you should get a correlation of r = .361, which corresponds exactly to the multiple R
value of .361 obtained from the multiple regression analysis. Consequently, multiple R may be
regarded as a Pearson correlation. Historically, some people have made the distinction
between “multiple regression” and “multiple correlation”. Multiple regression refers to all of
the values relevant to the intercepts and beta weights. Multiple correlation refers to all of the
values relevant to the model R and model R2 values (Watch Video 14.18: Multiple R and
Pearson's r - The Same?).
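
If you prefer to verify this outside SPSS, the minimal Python sketch below fits the same two-predictor model, computes the predicted earnings values, and correlates them with the actual earnings values. The file name and column names are hypothetical placeholders for an export of the education, IQ and earnings data file.

# A minimal sketch showing that multiple R equals the Pearson correlation between
# the predicted and actual Y values.
import numpy as np
import pandas as pd

df = pd.read_csv("education_earnings_N_250.csv")       # hypothetical CSV export of the data file
X = np.column_stack([np.ones(len(df)), df["education"], df["IQ"]])
y = df["earnings"].values

b = np.linalg.lstsq(X, y, rcond=None)[0]                # intercept and unstandardized beta-weights
y_hat = X @ b                                           # predicted annual earnings for each case

multiple_R = np.corrcoef(y_hat, y)[0, 1]                # should reproduce the multiple R of .361
print(round(multiple_R, 3))
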

Multiple Regression: Bootstrapping


In the Foundations section of the textbook, I noted that the earnings dependent
variable was substantially skewed positively. However, I carried on with the normal theory
multiple regression analysis for the purposes of illustration. When independent or dependent
variable scores, or the residuals, are substantially skewed and/or kurtotic, an attractive and
viable estimation technique is bootstrapping, as it does not assume normally distributed data,
as described in the chapter devoted to Pearson correlation. In light of the non-normally
distributed data, I re-performed the education, IQ and earnings multiple regression analysis in
SPSS with the bootstrap estimation technique. Note that bootstrapping is an additional
module in SPSS, which may have to be purchased separately. Currently, the SPSS bootstrap
utility applied to a multiple regression analysis provides bootstrapped 95% confidence
intervals only for the unstandardized beta-weights. If you desire 95% confidence intervals for
the standardized beta-weights, I recommend you use the SPSS syntax Lorenzo-Seva, Ferrando,
and Chico (2010) published to conduct a relative importance analysis, as it also includes
standardized beta-weight CIs (see relative importance analysis topic later in this chapter).
When I ran the bootstrapped multiple regression, SPSS produced a table entitled
‘Bootstrap for Coefficients’ (Watch Video 14.19: Bootstrapping Multiple Regression in SPSS).

It can be seen in the table entitled ‘Bootstrap for Coefficients’ that education was found to be a
statistically significant contributor to the model (b = 3756.29, p < .001, 95%CI: 1919.96 and
5354.43). However, IQ was not observed to be a statistically significant contributor to the
model (b = 501.98, p = .060, 95%CI: 53.37 and 1057.54). Very curiously, it will be noted that the IQ variable’s 95% lower-bound and upper-bound confidence interval values were both positive in
value, which would suggest a statistically significant effect. Thus, the non-significant p-value
estimated via the bootstrap is not consistent with the confidence intervals estimated via the
bootstrap. It is tempting to think that this happened because SPSS used two different types of bootstrapping; that is, the p-value is based on the percentile method, while the confidence intervals are estimated based on the bias-corrected and accelerated (BCa) method. However, you will get the same (inconsistent) result if you select the percentile 95%
confidence interval approach. I suspect part of the problem might relate to the fact that these
data are slightly heteroscedastic (as shown in the Foundations section of this chapter).
Although conventional bootstrapping can deal with non-normality, it does not deal with violations of the homoscedasticity assumption. Below, I demonstrate how to conduct a wild bootstrap, an
estimation method that can accommodate both non-normality and heteroscedasticity.

Multiple R2 Confidence Intervals


Few (if any) statistical packages appear to provide the option to calculate confidence
intervals for multiple R2, even though it would be useful if they did. I suspect the reason they
do not is because it is relatively more complicated to calculate confidence intervals for
multiple R2 (and even r2 from a bivariate association, for that matter). The reason it is more
complicated is that squared values are one-sided. That is, there is no possibility for both
positive and negative values. Consequently, you need to think about confidence intervals
around point-estimates that can only take positive (or only negative) values a little differently.
When a point-estimate can, theoretically, take on both positive and negative values, a non-significant effect implies that the lower-bound and upper-bound estimates straddle zero (i.e.,
one positive and one negative). By contrast, with values such as multiple R2 and bivariate r2, a
non-significant effect implies that the lower-bound reaches zero (it can’t be negative). To
overcome this distinction, statisticians have developed methods to calculate confidence
intervals for values such as R2 and r2 by using the non-central F-distribution (e.g., Lee, 1971). In
practice, this involves more complicated calculations, in comparison to calculating confidence
intervals for R or r. Fortunately, Zou (2007) developed a SPSS syntax file that can be used to
calculate confidence intervals for either multiple R2 or bivariate r2. You can download the
syntax file here: multiple R2 CIs. You can also learn how to use the syntax file by watching me
use it (Watch Video 14.20: Multiple R2 Confidence Intervals in SPSS). When I used Zou’s (2007)
SPSS syntax file to calculate the confidence intervals for the model R2 I reported for the annual
earnings, education and IQ multiple regression example described in the Foundations section
of this chapter (multiple R2 = .130, N = 250, two predictors), I obtained the following results.

Thus, the lower-bound and upper-bound estimates suggest that the population multiple R2 value is plausibly somewhere between .068 and .197. It is
important to note that I specified 90% confidence intervals in the SPSS syntax file (i.e., conlev =
.9000). That wasn’t a mistake. For point-estimates that can only take on positive values, you
need to specify 90% confidence intervals to correspond to a statistical test with alpha = .05.

Multiple Regression Scatter Plot


Researchers often include a scatter plot in their bivariate regression reports. For
example, the bivariate regression scatter plots associated with the education and IQ predictor
variables with earnings as the dependent variable are depicted in Figure C14.2 (Panels A and
B). In addition to bivariate regression scatter plots, it is possible to create a multiple regression
scatter plot (Watch Video 14.21: Multiple Correlation Scatter Plot in SPSS). As can be seen in
Panel C (Figure C14.8), the multiple regression scatter plot includes the predicted earnings
variable on the X-axis and the actual earnings variable on the Y-axis. Across the three panels, it
can be seen that R2 was reported at .110, .094 and .130, respectively. Rarely does one ever see a multiple regression scatter plot in the literature; however, I think it can be a useful method to demonstrate the utility of the multiple regression equation. Theoretically, the multiple regression scatter plot will be associated with values more tightly centered around the fit line, in comparison to the bivariate regression scatter plots. However, in practice, there may not be much difference between the scatter plot for the most substantial individual predictor and the multiple regression scatter plot. A lot will depend on the number of independent
variables included in the model, as well as the magnitude of the correlations between the
independent variables.

Figure C14.8. Bivariate Regression (Panels A and B) and Multiple Regression (Panel C) Scatter Plots
Panel A Panel B Panel C

Testing the Difference between Two Beta Weights


I think it is plausible to suggest that a very large percentage of researchers conduct a
multiple regression analysis to determine which one of the independent variables is the best
unique predictor of a dependent variable. As mentioned above, it is insufficient to simply interpret a numerical difference between the beta weights as statistically meaningful. Instead,
the numerical difference between two beta weights must be tested statistically.
One method to test the difference between two standardized beta weights is to
estimate the 95% confidence intervals associated with the standardized beta weights, and
then determine if the amount of overlap between the intersecting confidence intervals is less
than 50%. If the amount of overlap is less than 50%, then you would have some justification to
suggest that the difference in the standardized beta weight point-estimates is statistically
significant (p < .05). The principle of this procedure is based on Cumming (2009), who demonstrated that 95% confidence intervals can be used to test the difference between a
variety of different statistical point-estimates. In addition to the typical assumptions
associated with multiple regression analysis, the procedure described here assumes that the
confidence intervals are roughly the same size. Specifically, Cumming (2009) recommended
that the larger confidence interval should be less than twice the size of the smaller confidence
interval.
In order to obtain the appropriate 95% confidence intervals to apply the test described
in this section, it is necessary to transform the independent and dependent variables into z-
scores. Then, run the multiple regression analysis through the bootstrap utility with 95% confidence intervals requested. This is precisely the same procedure described in the
bivariate regression chapter. The distinction in this example is that there are two beta weights
for which 95% confidence intervals will be estimated. The confidence intervals can then be
used to test the difference between the standardized beta-weight point-estimates. When I ran
the analysis, I obtained the following result (Watch Video 14.22: Standardized Beta Weight
Confidence Intervals in SPSS).

It can be seen that the education predictor standardized beta weight was associated
with the 95% confidence intervals of .090 and .375. Similarly, the IQ predictor standardized
beta weight 95% confidence intervals were .031 and .316. Note that, if you perform the
analysis yourself, you may not get exactly the same results as reported above. The reason is
that your 1000 bootstrapped re-samples are very unlikely to be exactly the same 1000 re-
samples I obtained when I performed the analysis in SPSS. However, the results should be
fairly consistent across re-analyses, particularly if a minimum of 2,000 re-samples is selected.
In order to display the results, if you wished to do so, the lower-bound, point-estimate,
and upper-bound estimates can be displayed visually in a HILO chart in SPSS. I used the
following syntax (Data File: dif_between_b) (Watch Video 14.23: Test the Difference Between
two Beta Weights in SPSS).
GRAPH
/HILO(SIMPLE)=VALUE(UB LB Point).

The chart displayed in Figure C14.9 (Panel A) was produced by SPSS. Based on the visual
inspection of the confidence intervals, it is clear that there was much too much overlap to
suggest that there was a statistically significant difference between the standardized beta-
weight point-estimates (.232 vs. .174). In fact, the confidence intervals intersect with the point-estimates, so the p-value associated with the difference in the beta weights would be much
greater than .05, in this case. Consequently, it may not be concluded that years of education
was a more substantial contributor to the regression equation than IQ.

Figure C14.9. Standardized Beta Weights and Confidence Intervals


Panel A Panel B

For the purposes of demonstration, I have created a fictitious set of data that were
consistent with a statistically significant difference in the beta-weights, as the confidence
intervals overlapped less than 50% (Data File: dif_between_b_sig). The chart displayed in
Figure C14.9 (Panel B) was created based on those fictitious data.
It can be seen that the lower-bound of education intersected with the upper-bound of
IQ. However, the amount of overlap was less than 50%; consequently, it may be suggested
that the standardized beta weight point-estimates (.232 vs. .174) were statistically significantly
different from each other (p < .05), in this supplementary fictitious case.
The exact calculations of the procedure involve two steps: (1) calculate the average
length of the two overlapping arms; and (2) calculate the amount of overlap. If the amount of
overlap is less than 50% of the length of the arms, reject the null hypothesis of equal
standardized beta weights, p < .05. Thus, in this case, the average length of the overlapping
arms (step 1) corresponded to (.037 + .031) / 2 = .034. The amount of overlap corresponded to
.205 - .195 = .010. Thus, because .010 is less than 50% of the average arm length (.010 / .034 =
.29 or 29%), the null hypothesis was rejected, p < .05. Finally, I’ll note that the procedure
described here assumes that the sample size is at least 30 (Cumming, 2009).
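
The overlap arithmetic can also be expressed as a small Python function, shown below as an illustrative sketch. The confidence interval bounds supplied in the final line (.195 for education’s lower bound and .205 for IQ’s upper bound) are the values implied by the worked arithmetic above.

# A minimal sketch of Cumming's (2009) overlap rule for two confidence intervals.
def overlap_proportion(point_hi, lb_hi, point_lo, ub_lo):
    # point_hi / lb_hi: the larger point-estimate and its lower bound
    # point_lo / ub_lo: the smaller point-estimate and its upper bound
    arm_hi = point_hi - lb_hi                  # lower arm of the larger estimate
    arm_lo = ub_lo - point_lo                  # upper arm of the smaller estimate
    average_arm = (arm_hi + arm_lo) / 2
    overlap = max(ub_lo - lb_hi, 0)            # length of the intersecting region
    return overlap / average_arm               # below .50 implies p < .05 (approximately)

print(round(overlap_proportion(.232, .195, .174, .205), 2))   # 0.29, i.e., 29% overlap
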

Koenker Test of Homoscedasticity


The Koenker test is described in more detail in the bivariate regression chapter
(Chapter 12). The same SPSS syntax file described in the bivariate regression chapter can also
be applied to the multiple regression context. As can be seen below, the Koenker test was
significant statistically (p = .025), which implies that the assumption of homoscedasticity was
not satisfied (Watch Video 14.24: Breusch-Pagan and Koenker in Multiple Regression via
SPSS). The Koenker test result is very similar to the test of homoscedasticity result reported in
the Foundations section of this chapter, based on the Pearson/Spearman correlation analysis
(i.e., correlation between the standardized predicted values and the absolute standardized
residuals). As the assumption of homoscedasticity has not been satisfied, I cannot trust the standard errors and/or the p-values associated with the multiple regression beta-weights.
Consequently, in this case, I need to try to deal with that somehow (e.g., conduct a Wild
bootstrap multiple regression; adjust the standard errors; see remaining topics in this chapter).
Hypothetically, had the preferred Koenker test not produced a statistically significant result
(i.e., p > .05), I wouldn’t have had to do any extra work. Instead, I could potentially trust the non-
adjusted standard errors and p-values outputted by SPSS (or any other program).
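
As an aside, an analogous test can be run outside SPSS. The minimal Python sketch below uses the het_breuschpagan function from statsmodels; depending on the statsmodels version, this corresponds to the studentized (Koenker-type) variant of the Breusch-Pagan test. The file name and column names are hypothetical placeholders.

# A minimal sketch of a Breusch-Pagan-type test of homoscedasticity, outside SPSS.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

df = pd.read_csv("education_earnings_N_250.csv")        # hypothetical CSV export of the data file
X = sm.add_constant(df[["education", "IQ"]])
fit = sm.OLS(df["earnings"], X).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(fit.resid, X)
print(round(lm_stat, 3), round(lm_pvalue, 3))            # p < .05 suggests heteroscedasticity
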

Dealing with Heteroscedasticity: Wild Bootstrap


An interesting question is whether bootstrapping assumes homoscedasticity. Based on
the current literature, it would appear that bootstrapping does assume homoscedasticity,
unless one uses a special type of bootstrapping known as the wild bootstrap (Wu, 1986;
Flachaire, 2005). The bootstrapping performed by SPSS by default is not based on the wild
bootstrap (neither the p-value nor the confidence intervals). Fortunately, it is possible to
perform a wild bootstrap analysis in SPSS, however, one must use syntax to do so. I used the
education, intelligence, and earnings data file for this example (Data File:
education_earnings_N_250).
In order to conduct a wild bootstrap multiple regression analysis in SPSS, one must
first conduct the multiple regression analysis (via normal theory is fine) and save the
unstandardized residuals. SPSS will create a new variable named ‘RES_1’. Next, perform the
wild bootstrap analysis. Specifically, re-run the multiple regression analysis, however, this
time, click on the Bootstrap button. Select the ‘bias corrected accelerated’ confidence intervals
option. Click Continue. Click the ‘Save’ button and deselect the unstandardized residuals
option. Click Continue. Do not click Ok. Instead, click on Paste, because you’re going to have to
modify the syntax in a minor way. When you click Paste, you should get the following syntax
file:

BOOTSTRAP
/SAMPLING METHOD=SIMPLE
/VARIABLES TARGET=earnings INPUT= education IQ
/CRITERIA CILEVEL=95 CITYPE=BCA NSAMPLES=2000
/MISSING USERMISSING=EXCLUDE.
REGRESSION
/MISSING LISTWISE
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT earnings
/METHOD=ENTER education IQ.

You need to change the second line of syntax (i.e., /SAMPLING METHOD=SIMPLE) to
/SAMPLING METHOD=WILD(RESIDUALS=RES_1). When you run the above syntax, you should
get the key multiple regression results in a SPSS table entitled ‘Bootstrap for Coefficients’:

It can be observed that, based on the wild bootstrap, IQ was not a statistically
significant contributor to the model, p = .086. Furthermore, the lower and upper bound 95%
confidence intervals were consistent with a non-significant p-value, as one value was negative
(-207.72) and one value was positive (862.38). Thus, application of the wild bootstrap, at least
in this case, overcame the awkward situation of a non-significant p-value, in conjunction with
confidence intervals suggestive of a statistically significant effect, which was observed in the
original bootstrap analysis above. (Watch Video 14.25: Wild Bootstrap Multiple Regression in
SPSS)
I’m not aware of any simulation research that has compared the wild bootstrap versus
Hayes and Cai’s (2007) adjusted standard error technique for dealing with heteroscedasticity.
My hunch, however, is that you will be steered in the right direction with the wild bootstrap, in
most cases. Plus, you get to use the term ‘wild bootstrap’ in your report, which is pretty cool.

A substantial amount of time and space was devoted to dealing with the problem of heteroscedasticity in this chapter, because it is a serious problem, and one which I believe occurs fairly frequently with field data. However, researchers rarely test for it.
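
For readers who do not have access to the SPSS bootstrap module, the following minimal Python sketch illustrates the core idea of a wild bootstrap: the fitted values are kept, while the residuals are flipped in sign at random (Rademacher weights) before each re-estimation. It is an illustration of the principle only, not a replication of the SPSS routine (which, for example, also offers BCa confidence intervals); the file name and column names are hypothetical placeholders.

# A minimal sketch of a wild bootstrap for regression coefficients, outside SPSS.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.read_csv("education_earnings_N_250.csv")          # hypothetical CSV export of the data file
X = np.column_stack([np.ones(len(df)), df["education"], df["IQ"]])
y = df["earnings"].values

b = np.linalg.lstsq(X, y, rcond=None)[0]                   # normal theory (OLS) estimates
y_hat = X @ b
resid = y - y_hat

boot = np.empty((2000, X.shape[1]))
for i in range(2000):
    v = rng.choice([-1.0, 1.0], size=len(y))               # Rademacher weights
    y_star = y_hat + resid * v                             # wild-bootstrapped outcome values
    boot[i] = np.linalg.lstsq(X, y_star, rcond=None)[0]

lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)          # percentile 95% CIs per coefficient
for name, l, h in zip(["intercept", "education", "IQ"], lo, hi):
    print(name, round(l, 2), round(h, 2))
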

Dealing with Heteroscedasticity: Adjusted Standard Errors


Given that the assumption of homoscedasticity was not satisfied, one may be inclined
to follow up the analysis with an application of Hayes and Cai’s (2007) heteroscedasticity-corrected standard errors SPSS macro, as described in the bivariate regression chapter. You can download the Hayes and Cai SPSS macro here: https://tinyurl.com/ya5939sy. I obtained the
following heteroscedastic corrected results (Watch Video 14.26: Heteroscedastic Adjusted
Multiple Regression Standard Errors in SPSS):

It can be observed that education remained a statistically significant contributor to the


equation (b = 3756.29, t = 4.34, p < .001), however, IQ was no longer a statistically significant
contributor to the regression equation. Recall that in the normal theory estimation approach,
IQ was associated with a standard error of 208.97 and a p-value of .017. However, based on
the Hayes and Cai (2007) corrected approach, IQ’s standard error was estimated to be 262.31.
Thus, the normal theory estimation approach underestimated the IQ variable’s standard error.
When estimated more accurately with the Hayes/Cai approach, IQ’s unstandardized beta weight
of b = 501.98 was found to be associated with a t = 1.91, p = .0568, which is not significant
statistically. Again, if I had to choose, I would opt for the wild bootstrap results. However,

Hayes and Cai’s (2007) method is a viable option, as well, when the assumption of
homoscedasticity has been violated.
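
A comparable heteroscedasticity-consistent correction is also available outside SPSS. The minimal Python sketch below requests HC3 standard errors from statsmodels; it is offered as an illustration in the same spirit as the Hayes and Cai (2007) macro, not as an exact reproduction of it. The file name and column names are hypothetical placeholders.

# A minimal sketch of heteroscedasticity-consistent (HC3) standard errors, outside SPSS.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("education_earnings_N_250.csv")           # hypothetical CSV export of the data file
X = sm.add_constant(df[["education", "IQ"]])
fit_hc = sm.OLS(df["earnings"], X).fit(cov_type="HC3")      # robust standard errors

print(fit_hc.summary())                                     # t-values and p-values use the HC3 SEs
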

Multiple R2 Increased but p-value Decreased?


Multiple R2 is tested for statistical significance via an ANOVA table in many statistical
programs. However, do not get mixed up here. A multiple R2 is really just a “special” Pearson
correlation coefficient (squared), not an ANOVA. As demonstrated above, multiple R was equal
to the Pearson correlation between the predicted values of Y and the actual values of Y.
One phenomenon of multiple R2 that often surprises people is that the p-value
associated with a multiple R value can actually increase, even when the inclusion of an
additional independent variable is associated with a statistically significant increase in multiple
R. Although counter-intuitive, this phenomenon is entirely possible. Think of it this way.
Suppose you had a really solid predictor of Y in a bivariate regression analysis which yielded a
multiple R of .65. The bivariate regression analysis, in this case, will have a good level of
confidence that the multiple R point estimate is somewhere around .65. However, suppose
you added a really crappy predictor of Y to the regression equation (say, b = .08). Yes, the
additional predictor might be a statistically significant contributor to the regression equation,
but it’s still relatively crappy. And, yes, the multiple R2 value may increase in numerical value
very slightly with the addition of the relatively crappy predictor to the model. However, the
confidence interval around the multiple R will blow out as a consequence. Therefore, the p-
value associated with the multiple R-value will increase, even if the multiple R value also
increases. Essentially, when you add a relatively crappy predictor to a regression equation, you
dilute the quality of the regression equation with respect to confidence in the overall
prediction of the model. Most researchers accept the price of less confidence in the multiple
R-value, if it means they have more statistically significant predictors to report and discuss,
even if one or more of the predictors are relatively weak contributors to the model.
It is also worth noting that you can have a case where the multiple R-value is
statistically significant, but none of the individual beta weights are statistically significant.
Recall that the multiple R-value is based on a Pearson correlation between an optimally
weighted variable (i.e., predicted Y values) and the actual Y values. The analysis used to
determine whether any of the independent variables were statistically significant contributors
to the regression equation on their own is distinct to the analysis relevant to the statistical
significance of a multiple R. Typically, when there is a disconnect between the multiple R-value
and the individual beta weights with respect to statistical significance, one or more of the beta
weights were likely nearly statistically significant. In my experience, the inconsistency is always
where the multiple R-value is statistically significant, but none of the beta weights are. I’ve
never come across a multiple regression where the multiple R-value was not statistically
significant, but one or more of the beta weights were.
Evaluating the Importance of Predictors: Relative Importance Analysis


Thus far, I have emphasized the interpretation of standardized-beta weights and semi-
partial correlations, with respect to evaluating the importance of predictors in a multiple
regression analysis. As far as evaluating the unique contributions of each predictor to a
regression equation, I believe the semi-partial correlation is probably a person’s best option.
However, because the semi-partial correlation is an approximate representation of only
unique variance contributed to a multiple regression equation, and, typically, multiple R2 is not
simply a function of the sum of the squared semi-partial correlations, something is lost by
evaluating only the unique contributions of each predictor. Specifically, what is lost is the
common variance contributed to the regression equation by the predictors. This can be
considered a big deal, as there are cases where a predictor variable is found to be the weakest
unique contributor to a regression equation (i.e., smallest semi-partial r2), but the variable
shares the most amount of variance with the regression equation predicting the dependent
variable. Arguably, such a variable, superficially viewed as the weakest contributor to the regression, should be given further consideration in a full evaluation of the multiple regression results.
Considering the above, one method to help evaluate the overall (not just unique)
importance of predictors included in a multiple regression equation is to conduct a relative
importance analysis, a.k.a., relative weights analysis (Johnson, 2000).9 Unfortunately, most
popular statistical programs do not include the option to include a relative importance analysis
as part of a multiple regression. Fortunately, however, Lorenzo-Seva, Ferrando, and Chico
(2010) published SPSS syntax that can conduct a relative importance analysis (you can
download it here). In practice, it doesn’t really add much to conduct a relative importance
analysis with just two predictors, as the amount of common variance contributed to the
regression equation will be equal across both predictors. Thus, I have not applied a relative
importance analysis to the education, intelligence, and earnings multiple regression example
discussed, thus far. However, things become more interesting, when there are three or more
predictors in the multiple regression.
For example, Zhao, Golde, and McCormick (2007) were interested in predicting
graduate student satisfaction with thesis supervisors. To that effect, they had graduate
students rate their perception of their supervisor’s academic advice giving ability, their
supervisor’s personal touch, their supervisor’s career development advice, and the student’s
perception that they were being treated like slave-labor by their supervisor. I have simulated
some data to correspond closely to the results reported by Zhao et al. (2007), although I generated the data to have a sample size of 135, rather than 3,568, to make the results more interesting (Data File: phd_student_satisfaction). When I ran the relative importance analysis with the Lorenzo-Seva et al. (2010) syntax file (Watch: Relative Weights Analysis), I obtained the following results.

9 There is also Budescu’s (1993) dominance analysis to consider. However, the results of a dominance analysis and a relative importance analysis almost always agree perfectly (Budescu & Azen, 2004). Shapley’s Value Regression also provides results very similar to Johnson’s relative weights analysis, but Shapley’s method is more complicated to compute, and, so, is less popular.
First, the program produced a correlation matrix. As can be seen below, the academic
advice variable (‘aca_advi’) was associated with the largest zero-order Pearson correlation
(i.e., r = .681) with student satisfaction (‘satisfac’). However, the remaining three predictors
were also non-negligibly correlated with the dependent variable, as well as inter-correlated
with each other.

Next, the program reported the multiple R and multiple R2, with corresponding 95%
confidence intervals (a really nice touch). It can be seen below that 53.5% of the variance in
graduate student satisfaction with their supervisor was accounted for by the four predictors.

After the unstandardized beta-weights (not included here), the program reported the
standardized beta-weights and corresponding 95% confidence intervals. As can be seen below,
the academic advice predictor variable was associated with the numerically largest
standardized beta-weight (β = .423). The perceived cheap labor (‘cheap_la’) variable
was the only one of the four that was not found to be a statistically significant contributor to
the multiple regression equation, as the lower and upper bound 95% confidence intervals
overlapped zero. However, as discussed further below, it would be a mistake to think that a
supervisor could simply treat their students as cheap labor and think it wouldn’t have a
negative impact.

Next, the program reported the structure coefficients. In multiple regression, a structure
coefficient represents the linear association between a predictor and the predicted values
derived from a multiple regression equation. It may be regarded as a competitor to the
relative importance analysis approach to evaluating the importance of predictors. As can be
seen below, the academic advice variable correlated .93 with the regression equation function,
which was the numerically largest correlate. Interestingly, despite the fact that the cheap labor
variable was not found to be a statistically significant unique contributor to the regression
equation (i.e., non-significant beta-weight), the cheap labor variable, nonetheless, correlated
with the regression equation function in a statistically significant way (lower and upper-bound
estimates were both negative). This means that treating graduate students as cheap labor
might impact their perception of their supervisors in a negative way. However, the impact
might be via its influence on another variable (e.g., personal touch?).

Finally, after a couple of preliminary relative importance analysis results (not included here),
the program reported the key relative weights. As can be seen below, the academic advice
variable was associated with a relative weight of 42.0%, which implies that, of the 53.5% of variance accounted for in graduate student satisfaction with their supervisor, academic advice contributed 42.0% of that predicted variance. Next, personal touch and career
development produced nearly identical amounts of the predicted variance (≈ 25%). Finally,
cheap labor accounted for 7.0% of the predicted variance, an amount of predicted variance
that was statistically significant, as the lower-bound exceeded zero. The reason that cheap
labor’s standardized beta-weight was not significant was because it failed to contribute unique
variance to the regression equation. However, the relative importance analysis suggests that
cheap labor did contribute predictive variance that was common to one or more of the other
predictors included in the model.
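
The core computation behind Johnson’s (2000) relative weights is compact enough to sketch directly. The minimal Python sketch below takes a predictor intercorrelation matrix and the predictor-criterion correlations and returns raw relative weights that sum to the model R2; the three-predictor correlation values shown are placeholders, not the phd_student_satisfaction results.

# A minimal sketch of Johnson's (2000) relative weights, outside SPSS.
import numpy as np

def relative_weights(Rxx, rxy):
    # The raw relative weights sum to the model R-squared.
    evals, evecs = np.linalg.eigh(Rxx)
    Lam = evecs @ np.diag(np.sqrt(evals)) @ evecs.T        # symmetric square root of Rxx
    beta_z = np.linalg.solve(Lam, rxy)                     # weights for the orthogonalized predictors
    return (Lam**2) @ (beta_z**2)

# Placeholder correlations for three predictors (replace with your own values):
Rxx = np.array([[1.00, 0.50, 0.40],
                [0.50, 1.00, 0.30],
                [0.40, 0.30, 1.00]])
rxy = np.array([0.60, 0.45, 0.35])

raw = relative_weights(Rxx, rxy)
print(np.round(raw, 3), round(raw.sum(), 3))               # raw weights and the model R-squared
print(np.round(100 * raw / raw.sum(), 1))                  # rescaled weights (% of R-squared)
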
A relative importance analysis is similar to a commonality analysis, which I described in the


moderated regression section of this chapter. A commonality analysis is more detailed, as it
partitions the common predictive variance across all of the possible terms (i.e., combinations
of predictor variables). By contrast, a relative importance analysis combines the unique and
common variance explained into a single term, and it ensures that the sum of the relative
weights does not exceed 100%. For further details on the various methods to interpret the
importance of predictors in multiple regression, consult Nimon and Oswald (2013).
Suppressor Effects
One of the more difficult statistical concepts to grasp is suppression, as I suspect it is
considered a counter-intuitive phenomenon for most people. Generally speaking, suppression
is considered observed when the magnitude of the effect between X and Y increases, by
adding one or more variables to the (usually regression) analysis. Thus, in one approach to the
analysis, the beta-weight between X and Y increases, by adding one or more additional
variables to the model. Such an effect is a bit baffling, as adding one or more variables to a
regression-type analysis would be more naturally be expected to reduce the magnitude of the
beta-weight between X and Y, consistent with partial or full mediation. Thus, in some sense,
suppression may be considered the opposite of mediation. However, you should not consider
suppression as a form of moderation, as there are no product terms in a suppression analysis.
Ultimately, moderation, mediation, and suppression are three different types of ‘process
analysis’ (Hayes, 2013).
In the most typical scenario (classical suppression), suppression occurs when there is
a small amount of shared variance between the independent variable (X) and the dependent variable (Y) (which may or may not be significant statistically). Additionally, there is (typically) a
moderate amount of shared variance between another variable (suppressor, S) and X. Finally,
and this is the surprising bit, there is little to no shared variance between S and Y. This pattern
of correlation coefficients tends to produce evidence for a suppressor effect. Essentially, by
controlling statistically for the shared variance between S and X, the nature of X becomes
‘purified’ with respect to its association with Y. Consequently, by including S in the model, the
magnitude of the association between X and Y increases, in comparison to the zero-order
association between X and Y. At this point, it is probably best to consider an example.

It has been well-established that people can, to some degree, report their own level of
intelligence accurately (Freund & Kasten, 2012). Typically, researchers report a correlation of
about .30 between self-reported IQ and objective intelligence measures (task-based IQ).
Gignac (2018) hypothesized that a construct known as self-deceptive enhancement (similar to
narcissism) would correlate positively with self-reported IQ, but not with task-based IQ. Based
on such a pattern of correlations, one would expect that the association between self-reported
IQ and task-based IQ would increase, if the influence of self-deceptive enhancement on the
self-reported IQ scores were taken into consideration. Stated alternatively, if self-reported IQ
scores could be ‘decontaminated’ of self-deceptive enhancement, then the association
between a decontaminated self-reported IQ variable and task-based IQ would be expected to
be larger than the association between the original (contaminated) self-reported IQ scores and
the task-based IQ scores.
To test the hypothesis, Gignac (2018) administered a self-report IQ questionnaire
(SRIQ), a task-based IQ test (TBIQ), and a self-report measure of self-deceptive enhancement
(SDE) to a sample of 253 young adults. Gignac (2018) found the following Pearson correlations:
SRIQ and TBIQ r = .16, p = .011; SRIQ and SDE r = .44, p < .001; SDE and TBIQ r = -.05, p = .428.
Thus, self-reported IQ was only relatively weakly associated with task-based IQ. However,
because self-deceptive enhancement was fairly substantially correlated positively with self-
reported IQ and very weakly with task-based IQ, there was the possibility of a suppressor
situation.
It should be acknowledged that there are several different types of suppressor
situations (Paulhus, Robins, Trzesniewski & Tracy, 2004). The Gignac (2018) suppressor
situation is consistent with ‘classical suppression’, where X and Y are (usually) relatively weakly
inter-related, X and S are inter-related positively, and Y and S are essentially or entirely
unrelated.10 Thus, the Gignac (2018) study may be regarded as consistent with classical
suppression. Classical suppression may be argued to be best tested by comparing the
difference between the squared zero-order correlation between X and Y and the squared
semi-partial correlation between X and Y, controlling for the effects of S on X (McGrath, 2010;
Smith, Ager Jr & Williams, 1992; Velicer, 1978).
Recall that the zero-order Pearson correlation between self-reported IQ and task-
based IQ was r = .16, thus r2 = .026. In order to obtain the semi-partial correlation between
self-reported IQ and task-based IQ in SPSS, I entered SRIQ and SDE as independent variables
and TBIQ as a dependent variable in a multiple regression. Based on the output from SPSS, I
obtained a semi-partial correlation of r = .202 (see the column entitled ‘Part’ in the SPSS table entitled ‘Coefficients’ below). Thus, the squared semi-partial correlation was equal to r2 = .041. To my knowledge, a method to test statistically the difference between a squared zero-order correlation and a squared semi-partial correlation has not yet been established. I suspect the sample sizes required to detect a significant difference in the r-squared values with respectable power would be prohibitively large.

10 A classical suppression situation would also be observed where X and S are unrelated, but Y and S are related. The main issue is that either the independent variable or the dependent variable is decontaminated of a source of “nuisance” variance (nuisance at least with respect to the association between X and Y).
Fortunately, the results of Paunonen and Lebel’s (2012) simulation work can offer
guidance with respect to evaluating possible suppressor effects from the perspective of effect
size. Specifically, based on the results of Paunonen and Lebel (2012), moderate and substantial
suppressor situations may be regarded as consistent with an increase of 1% and 2% in
predictive validity, respectively. Thus, in the Gignac (2018) SRIQ and TBIQ case, the difference
in the semi-partial r2 and the zero-order r2 was Δr2 = .015 (i.e., .041 - .026 = .015)11, which
corresponds to a 1.5% increase, i.e., consistent with somewhere between a moderate and
substantial classical suppressor effect.

11 The results reported in Gignac (2018) are slightly different, as the analyses were conducted with maximum likelihood estimation in Amos with single-indicator latent variables; whereas, here, they have been conducted with ordinary least squares estimation in SPSS, based on simulated data generated to correspond closely to the results reported in Gignac (2018).
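
The classical suppression comparison described above can be reproduced from the zero-order correlations alone. The minimal Python sketch below uses the correlations reported in Gignac (2018); small discrepancies from the SPSS output (e.g., .203 vs. .202) simply reflect the rounding of the input correlations.

# A minimal sketch of the classical suppression comparison from zero-order correlations.
import math

r_xy = .16     # self-reported IQ (X) with task-based IQ (Y)
r_xs = .44     # self-reported IQ (X) with self-deceptive enhancement (S)
r_sy = -.05    # self-deceptive enhancement (S) with task-based IQ (Y)

# semi-partial correlation of X with Y, partialling S out of X only
sr = (r_xy - r_sy * r_xs) / math.sqrt(1 - r_xs**2)

print(round(sr, 3))                    # about .203, close to the .202 reported by SPSS
print(round(sr**2 - r_xy**2, 3))       # increase in explained variance, about .015
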

Cooperative Suppression
It should be noted that many researchers would test a classical suppression hypothesis
via multiple regression, by regressing Y onto both X and S, with the expectation that the X
beta-weight will be observed to be statistically significantly larger than the X and Y zero-order
correlation. Based on both theoretical and statistical grounds, I do not believe this approach is
entirely appropriate for testing classical suppression hypothesis. Instead, the multiple
regression approach is optimal to test a suppression situation known a cooperative
suppression (a.k.a., reciprocal suppression; Paulhus, 2004). The cooperative suppression
situation is consistent with the following pattern of zero-order correlations: X and Y relatively
weakly inter-related; S and Y relatively weakly inter-related; and X and S inter-related
negatively. Thus, with respect to their associations with Y, both the X and S variables are
mutually suppressed. Therefore, when both X and S are included in a multiple regression to
predict Y, their beta-weights are observed to increase in magnitude, in comparison to their
respective zero-order correlations with Y. It should be noted that, in the context of


cooperative suppression, it is not entirely appropriate to label one variable the independent
variable (X) and one variable the suppressor variable (S). Instead, both variables are
independent variables and both variables are suppressor variables.
To provide an example of cooperative suppression, consider an investigation by Krus
and Hoehl (1994) who noted that there were substantial inter-nation differences in
incarceration rates. At the extremes, the United States of America had 455 people per 100,000
incarcerated, whereas India had 34 people incarcerated per 100,000. Krus and Hoehl (1994)
found that unequal distribution of national income (i.e., ratio of national income received by
the top 10% to the bottom 20%) and family disintegration (i.e., divorce rates) were correlated
positively with national incarceration rates. Importantly, the two independent variables were
inter-correlated negatively. Such a pattern of correlations is consistent with the possibility of
cooperative suppression. In my view, cooperative suppression hypotheses are best tested with
multiple regression. In this example, it may be expected that a multiple regression with
incarceration rates regressed onto unequal income distribution and family disintegration will
reveal predictor variable standardized beta weights greater than their corresponding zero-
order correlations. Such a pattern of results would be consistent with cooperative suppression.
To demonstrate the analyses associated with testing for cooperative suppression, I
have simulated data to correspond closely to the results reported by Krus and Hoehl (1994)
(Data File: incarceration_rates). Based on a Pearson correlation analysis, I obtained the results
reported in the SPSS table entitled ‘Correlations’.

It can be seen that incarceration rate correlated positively with both income ratio (income
inequality) and divorce rate at r = .549 and r = .464, respectively. Thus, greater levels of income
inequality and greater levels of divorce were associated with greater levels of incarceration
rates. Furthermore, the income ratio variable correlated negatively with divorce rate at r = -
.214. Consequently, I regressed the income ratio and the divorce rate independent variables
onto incarceration rate and obtained the following key results reported in the SPSS table
entitled ‘Coefficients’:

It can be seen that the income ratio variable was associated with a standardized beta-weight
equal to .680, which was numerically larger than the zero-order Pearson correlation between
income ratio and incarceration rate (i.e., .549). Such a result suggests the presence of a
suppression situation. Furthermore, the divorce rate variable was associated with a
standardized beta-weight equal to .610, which was numerically larger than the zero-order
Pearson correlation between divorce rate and incarceration rate (i.e., .464). Given both beta-
weights were numerically larger than the respective zero-order Pearson correlation, there was
evidence to suggest cooperative suppression.
In practice, researchers do not test the difference between a standardized-beta weight
and a zero-order correlation statistically. I suspect the reason is that such a statistical test has not yet been well established. Instead, researchers simply interpret the
larger standardized beta-weights than the corresponding zero-order correlations as evidence
for cooperative suppression. Based on such a practice, it may be contended that income
inequality and divorce rate suppress, in a mutual way, their respective statistical prediction of
incarceration rate. Ideally, a relatively easy to implement statistical test will be established, in
the near future, to test the numerical difference between a standardized beta-weight and a
zero-order correlation.
In light of the above, it may be best to focus upon the difference between the
independent variable squared semi-partial correlation and the corresponding squared zero-
order correlation. In this example, income inequality was associated with a squared semi-partial correlation of .441 and a squared zero-order correlation of .301, which is equal to a difference of .140. A value of .140 is much greater than the .02 expectation for a large suppressor effect identified by Paunonen and Lebel’s (2012) simulation work. Similarly, divorce rate was associated with a squared semi-partial correlation of .355 and a squared zero-order correlation of .215. Again, the difference of .140 would suggest a large suppressor
effect.
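
Because there are only two predictors, the cooperative suppression pattern above can also be reproduced directly from the zero-order correlations. The minimal Python sketch below uses the correlations from the simulated Krus and Hoehl (1994) data; small discrepancies from the SPSS output reflect rounding of the input correlations.

# A minimal sketch of the two-predictor beta-weights and semi-partial correlations.
import math

r_y1 = .549     # incarceration rate with income ratio (income inequality)
r_y2 = .464     # incarceration rate with divorce rate
r_12 = -.214    # income ratio with divorce rate

# standardized beta-weights for a two-predictor regression
beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12**2)
beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12**2)

# semi-partial correlations (unique contributions)
sr1 = beta1 * math.sqrt(1 - r_12**2)
sr2 = beta2 * math.sqrt(1 - r_12**2)

print(round(beta1, 3), round(beta2, 3))                          # about .679 and .609
print(round(sr1**2 - r_y1**2, 3), round(sr2**2 - r_y2**2, 3))    # increases of about .14 each
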
In contrast to the example above, cooperative suppression can be observed when the
two independent variables have differentially directed zero-order correlations (one positive

and one negative), but are inter-correlated positively (Paulhus, 2004). Such a suppression
situation can also be found to be associated with larger standardized beta-weights than zero-
order correlations. Finally, there is the suppression situation where X1 is relatively weakly, positively inter-related with Y (say, r = .10) and X2 is fairly substantially, positively inter-related with Y (say, r = .45). Additionally, X1 and X2 are also fairly substantially, positively inter-related with each other. Such a pattern of correlations will result in increased validity coefficients for both X1 and X2; however, X1 is found to be negatively associated with Y, controlling for the effects of X2 in a multiple regression. Such a suppression situation is known as negative (or net) suppression (Paulhus et al., 2004). My impression is that instances of negative suppression are far less common than classical and cooperative suppression. Overall, contrary to popular belief, my experience suggests that suppressor situations are observed somewhat more commonly in behavioral science data than is typically assumed. Consequently, it is important to keep
suppression as a possibility in mind, when conducting analyses. Additionally, the possibility of suppression is a reason not to abandon a hypothesized mediation analysis simply because X
and Y are not found to be associated with a statistically significant zero-order correlation. True,
there is no possibility of mediation, when X and Y are not associated with a significant zero-
order correlation. However, there is the possibility of a suppressor situation, which may be
explicable on theoretical grounds.

Practice Questions

1: Does spanking young children lead to more behavioral problems?


Barnes, Boutwell, Beaver, and Gibson (2013) obtained some impressive, longitudinal
data relevant to spanking and behavioral problems in children. At 2 years of age, individual
differences in the participants’ self-regulation were measured. At 4 years of age, externalizing behaviors and number of spankings were measured. Finally, at 5 years of age, externalizing behaviors were measured again. Test the hypothesis that the number of spankings a child received
at age 4 contributed statistically significantly to a regression equation designed to predict
externalizing behaviors at age 5. Include the following variables in the multiple regression equation: self-regulation, spankings at age 4, and externalizing at age 4
(Data File: spanking_externalizing) (Watch: Spanking & Externalizing).

2: How well can overall job satisfaction be predicted?


What predicts job satisfaction? Is it the commonly discussed rate of pay? Gruenberg
(1980) was interested to understand the nature of job satisfaction. He collected data on job
satisfaction among 792 employees. Specifically, 792 employees responded to the following
question: "All things considered, how satisfied would you say that you are with your job?
Would you say that you are completely satisfied, pretty satisfied, not very satisfied, or not at all satisfied?" Additionally, the employees responded to items relevant to their rate of pay, job
security, the quality of the people they worked with, adequacy of workplace, freedom to plan
work, chance to learn new things, and chance to use abilities. Conduct a multiple regression
analysis to determine which of the variables contribute statistically significantly to a regression
equation created to predict job satisfaction (Data File: job_satisfaction_predictors) (Watch:
Predicting Job Satisfaction).

3: The role of optimism and your future health


Chopik, Kim, and Smith (2015) were interested in investigating whether individual differences in optimism can have an impact on a person's future health, controlling for several other variables. The participants were aged 51 years or older at the start of the study. In the first year of the study, the participants provided information relevant to their gender, age, and general health. They also completed an optimism scale (example item: "In uncertain times, I usually expect the best."). Two years later, the participants provided information about their general health and optimism again. Conduct a hierarchical multiple regression with gender, age, general health1 (time 1), and general health2 (time 2) in block 1. Then, add optimism1 (time 1) and optimism2 (time 2) in block 2. Did the second block increase the multiple R2 in a statistically significant manner? Were both optimism1 and optimism2 statistically significant contributors to the regression equation? (Data File: optimism_health) (Watch: Optimism and Health).
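The statistical significance of the change in multiple R2 from block 1 to block 2 can be tested with an F-change test, which SPSS reports when 'R squared change' is requested. The sketch below reproduces that test in Python; the dependent variable name and the predictor names are placeholders (assumptions), so they will need to be matched to the actual optimism_health data file:

```python
import pandas as pd
import statsmodels.api as sm
from scipy import stats

df = pd.read_spss("optimism_health.sav").dropna()   # assumed file name/format

# Hypothetical variable names (gender may need to be numerically coded).
block1 = ["gender", "age", "general_health1", "general_health2"]
block2 = block1 + ["optimism1", "optimism2"]
y = df["health_outcome"]        # placeholder for the dependent variable

m1 = sm.OLS(y, sm.add_constant(df[block1])).fit()
m2 = sm.OLS(y, sm.add_constant(df[block2])).fit()

# F-change test for the increase in R-squared from block 1 to block 2
n = len(df)
k_added = len(block2) - len(block1)
df_resid = n - len(block2) - 1
f_change = ((m2.rsquared - m1.rsquared) / k_added) / ((1 - m2.rsquared) / df_resid)
p_change = stats.f.sf(f_change, k_added, df_resid)

print(f"R2 change = {m2.rsquared - m1.rsquared:.3f}, "
      f"F({k_added}, {df_resid}) = {f_change:.2f}, p = {p_change:.4f}")
print(m2.pvalues[["optimism1", "optimism2"]])       # individual contributions
```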

4: Predicting serum cholesterol with a multiple regression equation


Bønaa, K. H., & Thelle, D. S. (1991). Association between blood pressure and serum lipids in a
population. The Tromsø Study. Circulation, 83(4), 1305-1314.

Advanced Practice Questions

5: Absence frequency versus total number of days absent and manager impression.
What has a bigger impact on managers' impressions of employee absenteeism: the frequency of absence periods, or the total number of days the employee is absent? Breaugh (1981) collected official human resources data on absenteeism from 112 employees. Absenteeism was measured in two ways: (1) total number of days absent; and (2) total number of periods absent. Thus, a person may be absent for only one period within a year, but that period could span several days in a row. By contrast, a person may miss 5 days of work throughout a year, with each of those days occurring at a separate point in the year (5 separate periods). Breaugh (1981) also collected data from each employee's manager with respect to their rating (impression) of the employee's work attendance. Consequently, these data can help answer the question of which type of absenteeism impacts a manager's impression more: total number of absent days or total number of absent periods. Run a multiple regression with manager ratings as the dependent variable. Total number of days and total number of periods should be included as the independent variables. Test the difference between the two standardized beta weights for statistical significance (Data File: absence_frequency_total) (Watch: Absence Frequency).
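One defensible way to test the difference between the two standardized beta weights (not necessarily the approach taken in the accompanying video) is to bootstrap the difference and examine its 95% confidence interval. The file format and variable names below are assumptions:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_spss("absence_frequency_total.sav").dropna()   # assumed file name/format
cols = ["manager_rating", "days_absent", "periods_absent"]  # hypothetical names

def std_betas(data):
    """Standardized betas from regressing manager ratings on the two absence measures."""
    z = (data[cols] - data[cols].mean()) / data[cols].std()
    fit = sm.OLS(z["manager_rating"],
                 sm.add_constant(z[["days_absent", "periods_absent"]])).fit()
    return fit.params["days_absent"], fit.params["periods_absent"]

b_days, b_periods = std_betas(df)
print(f"observed difference in betas = {b_days - b_periods:.3f}")

# Bootstrap the difference between the two standardized beta weights
rng = np.random.default_rng(1)
diffs = []
for _ in range(5000):
    idx = rng.integers(0, len(df), len(df))   # resample rows with replacement
    d_days, d_periods = std_betas(df.iloc[idx])
    diffs.append(d_days - d_periods)

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"95% bootstrap CI for the difference: [{lo:.3f}, {hi:.3f}]")
# If the interval excludes zero, the two beta weights differ statistically significantly.
```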


References
Antonakis, J., & Dietz, J. (2011). Looking for validity or testing it? The perils of stepwise
regression, extreme-scores analysis, heteroscedasticity, and measurement
error. Personality and Individual Differences, 50(3), 409-415.
Barnes, J. C., Boutwell, B. B., Beaver, K. M., & Gibson, C. L. (2013). Analyzing the origins of
childhood externalizing behavioral problems. Developmental Psychology, 49(12), 2272.
Belsley, D. A., Kuh, E., & Welsch, R. E. (2004). Regression diagnostics: Identifying influential data and sources of collinearity. New York: Wiley.
Bønaa, K. H., & Thelle, D. S. (1991). Association between blood pressure and serum lipids in a population: The Tromsø Study. Circulation, 83(4), 1305-1314.
Breaugh, J. A. (1981). Predicting absenteeism from prior absenteeism and work
attitudes. Journal of Applied Psychology, 66(5), 555-560.
Budescu, D. V. (1993). Dominance analysis: A new approach to the problem of relative
importance of predictors in multiple regression. Psychological Bulletin, 114, 542-551.
Budescu, D. V., & Azen, R. (2004). Beyond global measures of relative importance: Some
insights from dominance analysis. Organizational Research Methods, 7, 341-350.
Chopik, W. J., Kim, E. S., & Smith, J. (2015). Changes in optimism are associated with changes in
health over time among older adults. Social Psychological and Personality
Science, 6(7), 814-822.
Cumming, G. (2009). Inference by eye: Reading the overlap of independent confidence
intervals. Statistics in Medicine, 28(2), 205-220.
Davidson, R., & Flachaire, E. (2008). The wild bootstrap, tamed at last. Journal of
Econometrics, 146(1), 162-169.
Disatnik, D., & Sivan, L. (2016). The multicollinearity illusion in moderated regression
analysis. Marketing Letters, 27(2), 403-408.
Duckworth, A. L., Quinn, P. D., Lynam, D. R., Loeber, R., & Stouthamer-Loeber, M. (2011). Role of test motivation in intelligence testing. Proceedings of the National Academy of Sciences, 108(19), 7716-7720.
Dziuban, C. D., & Shirkey, E. C. (1974). When is a correlation matrix appropriate for factor analysis? Some decision rules. Psychological Bulletin, 81(6), 358-361.
Flachaire, E. (2005). Bootstrapping heteroskedastic regression models: wild bootstrap
vs. pairs bootstrap. Computational Statistics & Data Analysis, 49(2), 361-376.
Freund, P.A., & Kasten, N. (2012). How smart do you think you are? A meta-analysis
on the validity of self-estimates of cognitive ability. Psychological Bulletin, 138,
296-321.
Gignac, G. E. (2018). Socially desirable responding suppresses the association between self-assessed intelligence and task-based intelligence. Under review.
Green, S. B. (1991). How many subjects does it take to do a regression
analysis. Multivariate Behavioral Research, 26(3), 499-510.
Gruenberg, B. (1980). The happy worker: An analysis of educational and occupational
differences in determinants of job satisfaction. American Journal of Sociology, 86(2),
247-271.
Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A
regression-based approach. New York, NY: The Guilford Press.
Johnson, J. W. (2000). A heuristic method for estimating the relative weight of predictor
variables in multiple regression. Multivariate Behavioral Research, 35, 1-19.
Lee, Y-S. (1971). Some results on the sampling distribution of the multiple correlation
coefficient. Journal of the Royal Statistical Society, Series B, 33, 117-130.
Lewis-Beck, M. S. (1978). Stepwise regression: A caution. Political Methodology, 213-240.
Lorenzo-Seva, U., Ferrando, P. J., & Chico, E. (2010). Two SPSS programs for interpreting
multiple regression results. Behavior Research Methods, 42(1), 29-35.
Maxwell, S. E. (2000). Sample size and multiple regression analysis. Psychological
Methods, 5(4), 434-458.
McGrath, R.E., Mitchell, M., Kim, B.H., & Hough, L. (2010). Evidence for response bias as a
source of error variance in applied assessment. Psychological Bulletin, 136, 450–470.
Nimon, K. (2010). Regression commonality analysis: Demonstration of an SPSS
solution. Multiple Linear Regression Viewpoints, 36(1), 10-17.
Nimon, K. F., & Oswald, F. L. (2013). Understanding the results of multiple linear regression:
Beyond standardized regression coefficients. Organizational Research Methods, 16(4),
650-674.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
O'Brien, R. M. (2007). A caution regarding rules of thumb for variance inflation factors. Quality & Quantity, 41, 673-690.
O'Connor, B. P. (1998). SIMPLE: All-in-one programs for exploring interactions in moderated multiple regression. Educational and Psychological Measurement, 58(5), 836-840.
Pan, Y., & Jackson, R. T. (2008). Ethnic difference in the relationship between acute inflammation and serum ferritin in US adult males. Epidemiology and Infection, 136, 421-431.
Paulhus, D. L., Robins, R. W., Trzesniewski, K. H., & Tracy, J. L. (2004). Two replicable suppressor situations in personality research. Multivariate Behavioral Research, 39, 303-328.
Paunonen, S. V., & Lebel, E. P. (2012). Socially desirable responding and its elusive effects on the validity of personality assessments. Journal of Personality and Social Psychology, 103, 158-175.
Pope, P. T., & Webster, J. T. (1972). The use of an F-statistic in stepwise regression procedures. Technometrics, 14(2), 327-340.
Preacher, K. J., & Hayes, A. F. (2004). SPSS and SAS procedures for estimating indirect
effects in simple mediation models. Behavior Research Methods, Instruments, &
Computers, 36(4), 717-731.
Smith, R.L., Ager Jr, J. W., & Williams, D.L. (1992). Suppressor variables in multiple
regression/correlation. Educational and Psychological Measurement, 52, 17-29.
Spiller, S. A., Fitzsimons, G. J., Lynch Jr, J. G., & McClelland, G. H. (2013). Spotlights, floodlights,
and the magic number zero: Simple effects tests in moderated regression. Journal of
Marketing Research, 50(2), 277-288.
Tabachnick, B. G., & Fidell, L. S. (1989). Using multivariate statistics (2nd ed.). Cambridge, MA: Harper & Row.
Velicer, W.F. (1978). Suppressor variables and the semipartial correlation coefficient.
Educational and Psychological Measurement, 38, 953-958.
Wampold, B. E., & Freund, R. D. (1987). Use of multiple regression in counseling
psychology research: A flexible data-analytic strategy. Journal of Counseling
Psychology, 34, 372-382.
Wong, J. S., & Penner, A. M. (2016). Gender and the returns to attractiveness. Research in Social Stratification and Mobility, 44, 113-123.
Wu, C. F. J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. The Annals of Statistics, 1261-1295.
Zagorsky, J. L. (2007). Do you have to be smart to be rich? The impact of IQ on
wealth, income and financial distress. Intelligence, 35(5), 489-501.
Zhao, C. M., Golde, C. M., & McCormick, A. C. (2007). More than a signature: How advisor
choice and advisor behaviour affect doctoral student satisfaction. Journal of Further
and Higher Education, 31(3), 263-281.
Zou, G. Y. (2007). Toward using confidence intervals to compare correlations. Psychological
Methods, 12(4), 399-413.
