14
Multiple Regression
Contents

Multiple R versus Unique Effects
Multiple Regression Equation
Standardized Versus Unstandardized Beta Weights
Multiple R (and Multiple R2)
Multiple Regression – Method Enter: SPSS
Common Mistakes in Interpreting Standardized Beta Weights
Multiple Regression and Semi-Partial Correlations
Multiple Regression Assumptions
Tolerance
Variance Inflation Factor
Dealing with Multi-Collinearity
Sample Size Requirements
Hierarchical Multiple Regression
Hierarchical Multiple Regression: SPSS
Hierarchical Multiple Regression: Comments
Stepwise Multiple Regression
Stepwise Multiple Regression Example: SPSS
Does the total effect between X and Y need to be significant?
Mediation versus Moderation
Advanced Topics
Proving that Multiple R is Really Just a Pearson r
Multiple Regression: Bootstrapping
Multiple R2 Confidence Intervals
Multiple Regression Scatter Plot
Testing the Difference between Two Beta Weights
Koenker Test of Homoscedasticity
Dealing with Heteroscedasticity: Wild Bootstrap
Dealing with Heteroscedasticity: Adjusted Standard Errors
Multiple R2 Increased but p-value Decreased?
Evaluating the Importance of Predictors: Relative Importance Analysis
Suppressor Effects
Cooperative Suppression
Practice Questions
Advanced Practice Questions
References
on a dependent variable. In some cases, one might have 5 or more hypothesized predictors of
a dependent variable. Multiple regression will help determine which of the 5 hypothesized
predictors contribute unique predictive capacity in predicting the dependent variable. In the
years of education completed, IQ, and annual earnings example, a multiple regression will help
determine whether years of education completed and IQ can both predict annual earnings
statistically significantly and uniquely. If that sounds like a type A semi-partial correlation to
you, you’re on the right track. The main difference is that multiple regression builds a
regression equation that yields beta weights to maximize the prediction of the dependent
variable (i.e., maximize multiple R2). By contrast, multiple regression is not directly concerned
with semi-partial correlations. So, multiple regression beta weights and semi-partial
correlations (type A) have similarities, but they are also somewhat different, as I describe in
further detail below.
$Y = a + b_1X_1 + b_2X_2 + \ldots + b_nX_n + \varepsilon$ (1)
Where Y = the predicted value of the dependent variable, a = the intercept, b1 = the beta
weight associated with the first independent variable, b2 = the beta weight associated with the
second independent variable, bn = the beta weight associated with the nth independent
variable, and ε = error. The X1, X2, … Xn values correspond to specified values associated with the
independent variables (i.e., particular raw scores). Recall that the purpose of a multiple
regression equation is to predict a particular case’s value on Y. Once the multiple regression
equation has been solved, each case’s predicted value on Y can be estimated based on their
scores on the independent variables and the application of the regression equation.
Consider the example of years of education completed, IQ, and annual earnings.
Suppose someone had 14 years of education completed and an IQ of 112: what would I predict
their annual earnings to be? Suppose someone had 8 years of education completed and an IQ
of 79: what would I predict their annual earnings to be? I can't answer these questions until I conduct a
multiple regression, because I need an intercept, and two beta weights: one for years of
education completed and one for IQ. Before I conduct the multiple regression analysis, I’d like
to point out the two different types of beta weights that can be estimated.
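As a preview of the arithmetic involved, the sketch below applies the multiple regression equation using the intercept and unstandardized beta weights that are reported later in this chapter; the function name predict_earnings is simply illustrative.

```python
# A minimal sketch of applying a solved multiple regression equation.
# The intercept and unstandardized beta weights are the values reported
# later in this chapter (SPSS 'Coefficients' table).
def predict_earnings(years_education, iq):
    a = -56261.42            # intercept (the 'Constant' in SPSS)
    b_education = 3756.289   # unstandardized beta weight for education
    b_iq = 501.978           # unstandardized beta weight for IQ
    return a + b_education * years_education + b_iq * iq

print(predict_earnings(14, 112))  # approx. $52,548
print(predict_earnings(8, 79))    # approx. $13,445
```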
R will necessarily be larger, at least numerically. In the context of this example, recall that the
model R associated with the bivariate regression with education as the independent variable
and earnings as the dependent variable was estimated at .337 (as described in the Bivariate
Regression chapter). One of the questions posed by conducting a multiple regression is
whether adding IQ to the regression equation will increase model R in a statistically significant
manner. Next, I conduct the multiple regression analysis in SPSS.
In order to get additional important descriptive statistics, such as skew and kurtosis
values, you will need to use the Frequencies utility in SPSS, which will yield the SPSS table
entitled 'Statistics' (see Chapter 2).
The main element of the table to point out is that the skew associated with the
earnings variable is both positive and very large (3.45), as was observed when the analysis was
conducted in the bivariate regression example. For the purposes of learning, I will carry on
with the analysis as if the data were sufficiently normally distributed. I will then follow up with
the bootstrapped version of the analysis, which does not assume normally distributed data.
2 Note that the sample size used in the correlation and bivariate regression chapters was N =
40. Thus, the results reported in this chapter differ slightly from the results reported in Chapter
12.
The next table outputted by SPSS is the correlation matrix table entitled ‘Correlations’.
It can be observed in the SPSS table entitled ‘Correlations’ that there were positive
correlations between all three of the variables included in the analysis. In particular, both
education (r = .332) and IQ (r = .307) correlated positively with earnings. Furthermore,
education and IQ were correlated positively with each other (r = .572).
The next table outputted by SPSS is entitled ‘Variables Entered/Removed’. It includes
the names of the variables included in the analysis. Additionally, it specifies the type of
method that was used, which, in this case, was ‘Enter’. The ‘Variables Removed’ column only
comes into play when one uses multiple regression methods such as ‘stepwise’, for example,
rather than method enter.
Next up is the ‘Model Summary’ table which provides information relevant to multiple R.
It can be seen in the SPSS ‘Model Summary’ table that the multiple R was estimated at
.361, which corresponds to a multiple R2 value of .130. Thus, 13.0% of the variance in annual
earnings was accounted for by a regression equation which included both years of education
completed and IQ as predictors. There is also an adjusted R2 value reported which
corresponded to .123. The adjusted R2 represents a more accurate estimate of the effect in the
population (you can take the square root of the adjusted R2 to get the adjusted model R). Thus, from this
more accurate perspective, 12.3% of the variance in earnings would be expected to be
accounted for in the population by the multiple regression equation. Finally, the standard
error of the estimate was estimated at $40,375.37, which corresponds to the standard
deviation associated with the residuals (i.e., difference between observed dependent variable
scores and predicted dependent variable scores). A standard deviation in residuals equal to
$40,375 is massive, considering the mean in annual earnings was only $43,314. Thus, on
average, the prediction equation makes an error in predicting Y (the dependent variable,
annual earnings) equal to $40,375. This is the reality of doing science, in most cases. Recall
that the percentage of variance accounted for in earnings was only about 13%, so a massive 87%
of the variance was not accounted for by education and IQ. Still, it is a numerical improvement
over the inclusion of only education as a predictor, which yielded an adjusted model R2 = .090
(reported in chapter 12).
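If you would like to verify the adjusted R2 value yourself, the standard formula is adjusted R2 = 1 − (1 − R2)(N − 1)/(N − k − 1), where k is the number of predictors; a quick sketch (N = 250 in this example):

```python
# Verify SPSS's adjusted R-squared: 1 - (1 - R2)(N - 1)/(N - k - 1)
R2 = 0.130  # unadjusted multiple R-squared
N = 250     # sample size in this example
k = 2       # number of predictors (education and IQ)

adj_R2 = 1 - (1 - R2) * (N - 1) / (N - k - 1)
print(round(adj_R2, 3))         # 0.123, matching the 'Model Summary' table
print(round(adj_R2 ** 0.5, 3))  # 0.351, the adjusted model R
```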
Next up is the ‘ANOVA‘ table which tests the statistical significance of the multiple R
value:
It can be observed in the ‘ANOVA’ table that the multiple R value of .361 was associated with
an F-value of 18.525. With 2 and 247 degrees of freedom, the F-value was statistically
significant, p < .001. Note that the F-value corresponds to the unadjusted R value (not the
adjusted R value; SPSS does not report an F-value for the adjusted R2 estimate).
Next, SPSS reports a table entitled ‘Coefficients’ which includes the values associated
with the multiple regression equation.
It can be observed in the table entitled ‘Coefficients’ that the intercept (or what SPSS
refers to as the 'Constant') was equal to -56261.42, which is an impossible amount of money to
earn in a year (no matter how badly someone does their job). Also, note that none of the
values in the data file were negative. It is simply a reality that multiple regression will
sometimes estimate an "impossible" intercept value, if that is the best way to come up with a
multiple regression equation that maximally predicts the dependent variable based on the
data.
In the same column as the intercept are the unstandardized beta weights. It can be
observed that education was associated with an unstandardized beta weight of b = 3756.289.
Therefore, a one unit increase in education (i.e., one extra year of education completed) was
associated with an approximate unique (not shared with IQ) increase of $3,756 in annual
earnings. To determine if the unstandardized beta weight was statistically significant, you need
to look at the ‘Sig.’ column. You’ll see that the education beta weight was associated with a p-
value of .002, based on a t-value of 3.211. Furthermore, the unstandardized beta weight of
3756.289 was associated with 95% confidence intervals equal to 1451.938 and 6060.639.
The IQ independent variable was also a statistically significant contributor to the
regression equation: t = 2.402, p =.017. Furthermore, the unstandardized beta weight was
equal to b = 501.978. Thus, a one unit increase in IQ was associated with an approximate
unique (not shared with education) increase of $501.98 in annual earnings. The 95%
confidence intervals for the IQ unstandardized beta weight were equal to 90.383 and 913.573.
I haven’t mentioned anything about the ‘Std. Error’ column. The main interesting bit about
that column is that if you divide the unstandardized beta weight by its corresponding standard
error, you get the t-value. So, for IQ, 501.978 / 208.972 = 2.402.
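Under the usual assumptions, both the t-value and the 95% confidence interval can be recovered from an unstandardized beta weight and its standard error; a minimal sketch using scipy and the IQ values from the ‘Coefficients’ table:

```python
# Recover the t-value and the 95% CI for a beta weight from its standard error.
from scipy import stats

b, se = 501.978, 208.972  # IQ unstandardized beta weight and its Std. Error
df = 247                  # residual degrees of freedom (N - k - 1)

t = b / se                       # 2.402, matching the SPSS output
t_crit = stats.t.ppf(0.975, df)  # two-tailed critical value, approx. 1.970
lower, upper = b - t_crit * se, b + t_crit * se
print(round(t, 3), round(lower, 3), round(upper, 3))  # CI: ~90.4 to ~913.6
```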
In this example, reporting and interpreting the unstandardized beta weights made a
lot of sense, because they were naturally interpretable. People know what an extra year of
education implies, for example. Also, people understand annual earnings expressed as dollars.
However, all too often, the unstandardized beta weights are not naturally interpretable,
because the independent variables were measured on an unnatural metric (e.g., sum of items
scored on a 5-point Likert scale), therefore, the standardized beta weights become very useful
to interpret in such cases. Also, although the education and IQ unstandardized beta-weights
are readily interpretable, they can’t be compared in terms of size (i.e., relative contribution to
the regression equation). One must consult the standardized beta-weights for such purposes.
In this example, education was associated with a standardized beta weight of β = .232.
You cannot square a standardized beta weight and speak of percentage of variance accounted
for. However, you can say that a one standard deviation increase in years of education
completed (SD = 2.67) corresponded to .232 of a standard deviation increase in annual
earnings (i.e., 43123.58 * .232 = $10,004.67). Note that, in the context of this regression
equation, the .232 of a standard deviation increase in annual earnings is unique to years of
education, i.e., not shared with IQ (same fact applies to the unstandardized beta weight). With
respect to the IQ standardized beta weight results, a one standard deviation increase in IQ (SD
= 14.93) was associated with .174 of a standard deviation increase in annual earnings (i.e., 43123.58 * .174 =
$7503.50). In practice, the statistical significance associated with standardized beta weights is
implied to be consistent with the unstandardized beta weights. Therefore, researchers simply
interpret a standardized beta weight as statistically significant, if the unstandardized beta
weight was found to be statistically significant (Watch Video 14.2: How to interpret a
standardized beta weight).
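The connection between the two types of beta weights is simple: β = b × (SDX / SDY). A sketch using the (rounded) descriptive statistics from this example:

```python
# Convert unstandardized beta weights to standardized ones: beta = b * SDx / SDy
sd_y = 43123.58                    # standard deviation of annual earnings

beta_edu = 3756.289 * 2.67 / sd_y  # ~.232 (small discrepancy due to rounded SDs)
beta_iq = 501.978 * 14.93 / sd_y   # ~.174, matching SPSS
print(round(beta_edu, 3), round(beta_iq, 3))
```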
As mentioned, one of the benefits associated with standardized beta weights is that
you can compare their importance in a regression equation. For example, the standardized
beta weight associated with IQ was equal to .174, which was numerically smaller than the
standardized beta weight of .232 associated with education. Thus, at least numerically, years
of education completed was a more substantial, unique contributor to the regression equation
than IQ. Although researchers very frequently interpret a numerical difference between two
standardized beta weights as statistical support for interpreting one predictor as a more
important contributor to the model, the reality is that such numerical differences can easily
arise due to sampling fluctuations. Consequently, it is necessary to test the numerical
difference between two standardized beta weights statistically, before making any
interpretations. Unfortunately, very few, if any, programs test the difference between two
standardized beta weights statistically. In the Advanced Topics section of this chapter, I
demonstrate an arguably appropriate technique to test the difference between two
standardized beta weights in SPSS, using bootstrapping and confidence interval overlap
estimation.
Common Mistakes in Interpreting Standardized Beta Weights
One common mistake is to square the beta weight and speak of the percentage of variance in the dependent variable
accounted for by the independent variable. In this context, a researcher has the option of
consulting and reporting squared semi-partial (type A) correlations, instead. Consequently, I
recommend that you always report both the standardized beta weights and the semi-partial
(type A) correlations in a multiple regression analysis.3
Another common mistake is to state that the largest standardized beta weight is the
most significant predictor of the dependent variable. In order to make such a statement
justifiably, one would need to test the difference between two beta weights statistically. It is
not enough to simply look at the numerical difference between two weights and assert one to
be the most significant predictor of a dependent variable.
Finally, all too often, researchers fail to specify whether they have reported the
unstandardized (b) or standardized (β) beta weights in their results. In the behavioral sciences,
it usually only makes sense to report the standardized beta weights, as the variables are
typically measured on an inherently meaningless scale. In some disciplines, however, many of
the independent and dependent variables are associated with meaningful scales (e.g.,
economics). Consequently, the unstandardized beta weights can be interpreted meaningfully
in such cases. Ultimately, it is up to the researcher to decide whether it is best to report the
unstandardized or standardized beta weights (or both). So long as the beta weights are
demarcated clearly as either unstandardized (b) or standardized (β), the reader should be able
to follow the results coherently.
3 Journal articles commonly include only the standardized beta weights. Fortunately, beta-weights can
be converted by a reader into squared semi-partial correlations, based on a simple re-arrangement of
the formula specified by Tzelgov and Stern (1978, p. 330). Thus, to derive a semi-partial correlation from
a beta-weight, the following formula can be used: $r_{Y(1.2)} = \beta_{Y1.2}\sqrt{1 - r_{12}^2}$
SPSS refers to semi-partial correlations (i.e., Type A) as Part correlations. As you can
see in the SPSS table entitled ‘Coefficients’, education was associated with a type A semi-
partial correlation of r = .190, or r2 = .036. Thus, 3.6% of the variance in annual earnings was
accounted for uniquely by years of education completed. IQ was associated with a type A
semi-partial correlation of r = .143, or r2 = .020. Thus, 2.0% of the variance in annual earnings
was accounted for uniquely by IQ. It only really makes sense to report type A semi-partial
correlations in the context of multiple regression, not the type B semi-partial correlations (see
Chapter 13). The statistical significance associated with the type A semi-partial correlations is
implied to be the same as the unstandardized beta weights. Consequently, there is no
additional ‘Sig.’ column for the additional correlations included in the ‘Coefficients’ table.
I think it is good practice to always report the semi-partial correlations associated with
the independent variables included in a multiple regression model. An additional interesting
attribute associated with the squared semi-partial correlations is that they can be summed. In
this example, the sum of the squared semi-partial correlations amounted to: .036 + .020 =
.056. Thus, 5.6% of the variance in earnings was accounted for by education and IQ, uniquely.
However, you may recall that the multiple R2 was equal to .130, or 13.0%. What can account
for the difference between 13.0% and 5.6%? Well, the difference of 7.4% (13.0 – 5.6) may be
described as the percentage in earnings variance that was accounted for by education and IQ
in a shared manner, i.e., not uniquely. Recall that, in this example, education and IQ shared
32.7% (r2 = .327) of their variance. That shared education and IQ variance evidenced an impact
on the model R2 equal to 7.4%.
Multiple Regression Assumptions
1: Random sampling
The assumption implies that every case in the population has an equal chance of being
selected into the sample. The data were not obtained via random sampling in Zagorsky (2007).
The impact of such a violation on the results is anyone’s guess.
2: Independence of Observations
See bivariate regression chapter.
4: Linearity
See bivariate regression chapter.
6: Independence of errors
See the bivariate regression chapter.
7: Homoscedasticity
See the bivariate regression chapter for a full discussion of this assumption. I tested
the homoscedasticity assumption by estimating the correlation between the standardized
predicted values and the absolute standardized residuals. The Pearson and Spearman
correlations were r = .17 (p = .006) and rS = .14 (p = .028), respectively. The statistically
significant correlations suggest that the homoscedasticity assumption has not been satisfied
(Watch Video 14.4: Testing Homoscedasticity via Correlation). In the Advanced Topics section
of this chapter, I describe a more commonly used procedure to test the assumption of
homoscedasticity known as the Koenker test. Unfortunately, the Koenker test cannot be
applied within the menu options of SPSS. I also describe in the Advanced Topics section of this
chapter a procedure that can be implemented to deal with the presence of heteroscedasticity.
8: Absence of Multi-collinearity
Strictly speaking, the absence of multicollinearity is not really an assumption of
multiple regression, as the results obtained from a multiple regression analysis in the presence
of multicollinearity will be accurate. However, it is still important to pay attention to the
possibility that multicollinearity may be present in one’s data, as the stability of the model will
be affected negatively.
In the context of multiple regression, I like to distinguish between collinearity and
multicollinearity. Collinearity is observed when one independent variable correlates
approximately r = .95 or higher with another independent variable. By contrast,
multicollinearity is observed when one independent variable is associated with a multiple
correlation of R = .95 or higher, based on a prediction equation with the other independent
variables included in the multiple regression analysis. Thus, collinearity can be evaluated with
a simple examination of the Pearson correlation matrix. By contrast, multicollinearity cannot
be evaluated with a simple eye-ball examination of the correlation matrix. Instead, more
sophisticated analyses need to be performed such as Tolerance and/or the Variance Inflation
Factor. I describe these next.
Tolerance
Tolerance represents the percentage of variance in a particular independent variable
that is not shared with the other independent variables included in the multiple regression
analysis. Stated alternatively, tolerance represents the percentage of variance that is unique to
a particular independent variable. In multiple regression, it is desirable for all of the
independent variables to be associated with some unique variance. The percentage of
variance not shared with the other independent variables is estimated by regressing the
independent variable onto the remaining independent variables. Consequently, tolerance may
be formulated as:
$\text{Tolerance} = 1 - R^2$
Tolerance values are calculated for each independent variable included in the analysis.
Therefore, if five independent variables are included in the model, then five tolerance values
will be estimated.
All other things equal, researchers desire higher levels of tolerance. Tolerance values
of .10 or greater are commonly considered acceptable. Such a value would imply that each
independent variable requires a minimum of 10% unique variance. It would also imply that no
independent variable is associated with a multiple R of .95 or greater (i.e., .95² = .90).
It can be observed that education and IQ were both associated with the same
Tolerance and VIF values, as expected, because there were only two independent variables in
the analysis. When there are three or more independent variables included in the analysis, the
tolerance/VIF values will almost always be different across the independent variables. In this
example, education and IQ were associated with tolerance of .672, which implies that they
were associated with 67.2% unique variance, i.e., not predicted by the other independent
variable. Such a value exceeds comfortably the minimum acceptable level of tolerance of .10.
Correspondingly, the VIF was equal to 1.487 (1/.672 ≈ 1.488), which implies that the beta
weight sampling variances were approximately 1.488 times larger (and, correspondingly, the
standard errors about √1.488 ≈ 1.22 times larger) than they would otherwise be, if there were
no correlation between the two independent variables included in the analysis. As
the VIF value of 1.488 is much less than the maximum acceptable VIF value of 10, the issue of
multicollinearity (or collinearity, more precisely, in this case) is not a problem, in this example.
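With only two predictors, the tolerance is simply 1 − r12² and the VIF is its reciprocal; with three or more predictors, you would regress each predictor onto the others to obtain the relevant R2. A sketch of the two-predictor case:

```python
# Tolerance and VIF for the two-predictor case: Tolerance = 1 - r12**2
r12 = 0.572             # correlation between education and IQ

tolerance = 1 - r12**2  # ~.673 (SPSS reports .672 from unrounded input)
vif = 1 / tolerance     # ~1.487
print(round(tolerance, 3), round(vif, 3))
```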
4 Power represents the probability of rejecting the null hypothesis when it is, in fact, false in the
population. A minimum power level of .80 (or 80%) has been recommended (Cohen, 1988).
Thus, based on Green (1991), realistically, a typical multiple regression analysis would require
a minimum of approximately 105 participants. So, for example, if one wished to conduct a
multiple correlation analysis with four predictors, and one was interested in examining the
beta weights, a sample size of 109 would be required, based on Green’s (1991) guidelines.
Green’s (1991) guidelines appear to have become fairly popular over the years, based on the
number of times his paper has been cited.
More recently, Maxwell (2000) extended Green’s (1991) research by pointing out that
the magnitude of the correlations between the predictors will affect the sample size required
to conduct a multiple regression with a specified level of power. Maxwell’s (2000) work is
directly related to the variance inflation factor described in the context of multi-collinearity.
You’ll recall that, as the correlations between the predictors in a multiple regression increase,
so do the standard errors associated with the predictor beta weights. Consequently, Maxwell
(2000) reasoned, larger correlations between the predictors imply the need for greater
sample sizes, all other things equal, if one is interested in both the beta weights and the
multiple R. Maxwell (2000) conducted an extensive simulation study to estimate the sample
sizes required to detect statistically significant beta weights across a variety of conditions,
including the average magnitude of the correlations between the predictors, and a specified
power of .80. I have provided some of the key results reported by Maxwell (2000) in Table
C14.1 below.
Table C14.1. Sample Sizes Required for Multiple Regression (5 predictors; Power = .80)

            pxy = .20   pxy = .30   pxy = .40
pxx = .30       1,070         419         191
pxx = .40       1,731         692         328
pxx = .50       2,752       1,117         544

Note. Based on results reported in Maxwell (2000); pxx = mean inter-correlation between
predictors; pxy = mean inter-correlation between predictors and dependent variable.
Clearly, very large sample sizes are required when the correlations between the
predictors are large. For example, if the average correlation between the predictors and the
dependent variable is r = .30 and the average correlation between the predictors is r = .50, a
sample size of 1,117 is required to achieve power of .80. However, if the number of predictors
is reduced to three, the sample size required to achieve power of .80 for the entire analysis is
reduced to a much more manageable 218 (see Table 5 in Maxwell, 2000). I used the words
‘entire analysis’ because Maxwell (2000) estimated power under the assumption that the
researcher was interested in testing the model R for statistical significance, as well as each of
the predictors (i.e., not just at least one predictor).
Next, in a SPSS table entitled ‘Model Summary’, you will find the model R, model R2,
model adjusted R2 values for both Model 1 and Model 2. It can be seen that with the addition
of education to the model (Model 1), the model R2 was equal to .110. By contrast, with the
addition of IQ to the model (education and IQ together), the model R2 increased to .130. On
the right side of the ‘Model Summary’ table you will find the ‘change statistics’. These statistics
correspond to the changes in model R2 values and their corresponding statistical significance.
The change statistics for Model 1 correspond to the change from no independent variables in
the model to the addition of one independent variable, which is education in this example. The R2
change from .000 to .110 was statistically significant, F(1, 248) = 30.69, p < .001. Thus, the
percentage of variance accounted for in earnings by the inclusion of education into the model
was statistically significant.
The R2 change associated with model 2 (i.e., the addition of intelligence to the model)
corresponded to .020 and it was statistically significant, F(1, 247) = 5.77, p = .017. Thus, the
addition of intelligence to the model increased the percentage of variance accounted for in
earnings in a statistically significant and unique manner. Clearly, it was not a large increase,
but it was a statistically significant one.
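The F-change statistic that SPSS reports can also be computed by hand from the two model R2 values; a sketch using scipy (the small discrepancy from the reported F = 5.77 reflects the rounded R2 values):

```python
# F-change test for the R-squared increase from Model 1 to Model 2:
# F = ((R2_2 - R2_1) / df1) / ((1 - R2_2) / df2)
from scipy import stats

R2_1, R2_2 = 0.110, 0.130  # rounded model R-squared values
df1 = 1                    # number of predictors added at step 2 (IQ)
df2 = 247                  # N - k - 1 for Model 2 (250 - 2 - 1)

F = ((R2_2 - R2_1) / df1) / ((1 - R2_2) / df2)
p = stats.f.sf(F, df1, df2)
print(round(F, 2), round(p, 3))  # ~5.68, p ~.018 (SPSS reports 5.77, .017)
```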
Next, a table entitled ‘ANOVA’ includes the F and p-values associated with the model
R2 values reported in the ‘Model Summary’ table. Model 1’s R2 value of .110 was statistically
significant, F(2, 247) = 18.53, p < .001.
Next, a table entitled ‘Coefficients’ includes all of the statistics relevant to the individual
predictors.
regression equation, because they are the only ones found to be statistically significant and
unique contributors to the regression equation.
5 You might be surprised to learn just how much disagreement there is among respected researchers
with respect to the role of cholesterol in heart disease.
cholesterol was accounted for by the inclusion of sleep. The stepwise analysis terminated after
the inclusion of seven predictor variables, which accounted for 36.0% (model R2 = .360) of the
variance in cholesterol. Also note that the final variable included in the model (‘sex’) increased
the percentage of variance accounted for by only 1% (R2 change = .010).
Like any other hierarchical-type regression, SPSS then reports a table entitled
‘ANOVA’, which tests each of the model R values for statistical significance (from Model 1 R =
.440 to Model 7 R = .600). I have not reported the SPSS table of results here, as the table is
large. Suffice it to say that all of the model R values were found to be statistically significant.
Next, SPSS reports the beta-weights associated with the variables included in the model. As
can be seen in the SPSS table entitled ‘Coefficients’, age was associated with a standardized
beta-weight of .430 and the largest t-value. By comparison, sex (coded 1 = female; 2 = male)
was associated with a standardized beta-weight of .102. Most of the semi-partial correlations
are of a similar magnitude to the standardized beta-weights, as the independent variables
were not particularly inter-correlated. An exception to that statement was BMI.
Next, SPSS reports the independent variables excluded from the regression equation in
a table entitled, appropriately enough, ‘Excluded variables’. Again, because the table is so long,
I reported only the top and bottom of the table. It can be seen that, in the final model (model
7), two variables were left out of the regression equation: PhysAct (Physical Activity) and Alc
(Alcohol consumption). It can be seen that the partial correlations are rather small at -.062 and
.016, respectively, neither of which was statistically significant. Consequently, they were
omitted from the regression equation.
Thus, with respect to the prospect of predicting a person’s cholesterol level without
taking a blood test, more work needs to be done, as only 36% of the variance in cholesterol
was accounted for by the regression equation that included seven predictors. Naturally, many
other variables could be potentially added to the regression equation, which may increase the
model R2. Conceivably, hundreds of variables could be added to the regression equation to
help increase model R2. With a sufficient sample size (in the hundreds of thousands), the
results may prove robust. Who knows? Maybe one day.
Importantly, it would be unjustified to suggest that the variables omitted from the
regression equation were unrelated to cholesterol levels. As can be seen in the Pearson
correlation table below, both PhysAct and Alc were correlated statistically significantly with
cholesterol (all correlations greater than |.06| were significant statistically). The multiple
regression results simply suggest that they were not related to cholesterol in a unique way,
i.e., independently of the effects attributable to the other independent variables. I also note
that PhysAct was very nearly included in the model with a p-value of .051. Of course, another
sample of 1,000 people may yield a slightly different correlation matrix, which may have
resulted in PhysAct being included in the model and, possibly, sex excluded. Thus, strong arguments
about the value of a variable as a predictor should not be made on the basis of a stepwise
multiple regression analysis (or any multiple regression analysis, for that matter). Large
sample sizes and replication are crucial with respect to interpreting the nature and strength of
the predictors. The model R2, on the other hand, would be expected to be more stable from
sample to sample.
predicting the dependent variable (Lewis-Beck, 1978). For example, a researcher may state
that the first variable entered into the model is more important than the second variable
entered into the model. This is a criticism of interpretation, not the stepwise procedure itself.
From an inferential statistical perspective, differences between point-estimates (e.g., beta-
weights) should be tested statistically. Consequently, it is inappropriate to declare the
difference between two beta weights as significant, simply because a stepwise procedure
entered them into a model at different steps. Although entirely true, such a criticism can be
levelled against any approach to multiple regression equation building, from the ‘method:
enter’ approach to the fully specified path analytic approach. Consequently, I believe this
criticism has also been levelled against the stepwise approach, unfairly.
As a general statement, I would encourage users of stepwise multiple regression to,
first, verify that at least two (or more) of the candidate predictors are correlated statistically
significantly with the dependent variable, as data sampled randomly from a population of non-
correlated variables can yield a regression equation with “statistically significant” (p < .05) beta
weights (Antonakis & Dietz, 2011). Again, this recommendation could be applied to virtually
any multiple regression procedure, unless suppressor effects were hypothesized (in which
case, one would probably test suppression directly). My recommendation, here, is consistent
with applying Bartlett’s test of sphericity on a correlation matrix, prior to conducting an
exploratory factor analysis (Dziuban & Shirkey, 1974).
adolescents who reported large amounts of video game playing. To test the hypothesis,
Przybylski and Weinstein (2017) obtained data from a very large sample of 15-year-old
adolescents who reported the amount of time they played video games per day (in hours). 6
The adolescents also completed a 14-item self-report measure of mental well-being
(happiness, life satisfaction, psychological functioning, and social functioning). I have simulated
some data to correspond to the essence of the results reported by Przybylski and Weinstein
(2017) (Data File: video_games_well_being).
Figure C14.2. Idealised Lines of Best Fit: Four Possible Curvilinear Quadratic Effects
(Panel A: U-Shape; Panel B: Inverted U-Shape)
The key to conducting a curvilinear regression is that you need to transform your
predictor variable in such a way as to reflect the nature of the curvilinear effect in which you
are interested. As mentioned, a quadratic effect is by far the most commonly observed and
tested curvilinear regression. A quadratic curvilinear effect implies one bend in the regression
line of best fit (a linear effect implies a straight line of best fit). The bend in the line of best fit
may reflect a U-shaped pattern (Figure C14.2, Panel A) or an inverted U-shaped pattern (Figure
C14.2, Panel B). Often, the quadratic curvilinear pattern will not represent a perfect U-shape.
Instead, for example, the line of best fit may increase across most of the first half of the scores
6 Przybylski and Weinstein (2017) also had data relevant to TV watching, computer use, and smartphone
use, if you’re interested.
on the X-axis and then plateau across the second half of the X-axis (Figure C14.2, Panel C).
Finally, the opposite pattern may also be observed from time to time (Figure C14.2, Panel D).
Prior to conducting the curvilinear regression, I’ll note that the Pearson correlation
between hours of video-game playing and mental well-being was estimated at r = -.116, p =
.002. Such a result suggests that higher levels of video game playing are associated with lower
levels of mental well-being. As I show below, it would be a big mistake to end the analysis
there.
The first step to conducting a quadratic curvilinear analysis is to transform the predictor
variable scores into squared variables (Watch Video 14.8: Square a variable (via transform) in
SPSS). Once you create the new (squared) variable, you can conduct a hierarchical multiple
regression analysis. In the first block, include the original independent variable. In the second
block, include the squared independent variable. If the inclusion of the squared independent
variable into the model is found to be associated with a statistically significant increase in
model R2, then you can suggest that there was a curvilinear effect between the independent
variable and the dependent variable. For the video game playing and mental well-being
example, I obtained the following results in SPSS (Watch Video 14.9: Curvilinear (Non-Linear)
Regression in SPSS).
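If you would like to check the analysis outside of SPSS, the same hierarchical quadratic regression can be sketched with statsmodels; the file and column names below are assumptions based on the data file description above.

```python
# A minimal sketch of a quadratic (curvilinear) hierarchical regression,
# assuming columns named 'gaming_hours' and 'well_being' (hypothetical names)
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("video_games_well_being.csv")  # hypothetical file name
df["gaming_hours2"] = df["gaming_hours"] ** 2   # the squared (quadratic) term

m1 = smf.ols("well_being ~ gaming_hours", data=df).fit()
m2 = smf.ols("well_being ~ gaming_hours + gaming_hours2", data=df).fit()

print(m1.rsquared, m2.rsquared)  # Model 1 vs Model 2 R-squared
print(m2.compare_f_test(m1))     # F-change test for the quadratic term
```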
As can be seen in the SPSS table entitled ‘Model Summary’, the model R and R square
values associated with Model 1 (the inclusion of only video game playing as a predictor) were
estimated at .116 and .014, respectively, which implies that 1.4% of the variance in the
dependent variable (mental well-being) was accounted for by the independent variable (video
game playing hours). Furthermore, the model R2 was significant statistically, F(1, 732) = 10.06,
p = .002. It is not a coincidence that the model R2 p-value was identical to the Pearson
correlation p-value. They correspond to essentially the same analysis. More importantly,
Model 2, which includes both the gaming hours independent variable and the squared gaming
hours independent variable, yielded Model R and R2 values of .252 and .063, respectively,
which implies that 6.3% of the variance in mental well-being was accounted for by the
combination of the linear (video game hours) and quadratic effects (squared video game
hours). Furthermore, the R2 change (.050; i.e., .063 - .014, within rounding) was found to be
significant statistically, F(1, 731) = 38.90, p < .001. Thus, the inclusion of the squared video
game hours variable increased the predictive capacity of the regression equation. Such a result
implies that there is at least one bend in the line of best fit, i.e., the effect is not linear.
Prior to examining the scatter plot, you can see in the SPSS table entitled ‘Coefficients’
that both the gaming hours (gaming_hours) and squared gaming hours (gaming_hours2)
variables were statistically significant contributors to the model, as each variable was
associated with a statistically significant beta-weight: gaming_hours, b = .182, β = .634, t =
5.05, p < .001; gaming_hours2, b = -.028, β = -.783, t = -6.237, p < .001. Unfortunately, beta-
weights, even standardized beta-weights, are not particularly interpretable in a curvilinear
regression analysis. In the context of a quadratic curvilinear regression, the most you can really
say is the following. If the quadratic effect beta-weight is positive in direction, it suggests a U-
shaped pattern in the scatter plot. By contrast, if the beta-weight is negative in direction, it
suggests an inverted U-shaped pattern in the scatter plot.
Some would recommend transforming the independent variable into z-scores and
then squaring the z-scores to create a standardized variable to represent the quadratic effect in
a standardized way. The argument, here, is that these standardized variables, when analyzed
in the hierarchical multiple regression, will yield more interpretable beta-weights. I’m not
entirely convinced of this approach, as I do not believe the standardized beta-weights become
fully interpretable, in the same way that standardized beta-weights can be interpreted in a
more typical (linear) multiple regression.
Instead of focusing on the beta-weights in a curvilinear regression, one might
consider focusing on the semi-partial correlations, as they appear to be more interpretable. In
this example, the quadratic term was associated with a semi-partial correlation of r = -.223
(see SPSS table entitled ‘Coefficients’). The negative semi-partial correlation implies an
inverted U-shaped effect, which is consistent with the scatter plot depicted in Figure C14.3.
Furthermore, the squared semi-partial correlation of .050 (i.e., -.223²) implies that 5.0% of the
variance in mental well-being was accounted for by the curvilinear effect. The linear effect
accounted for 1.3% of the variance in mental well-being, on the basis of the squared semi-
partial correlation estimated from model 1 (i.e., .116² = .013). I do not believe the linear effect
semi-partial correlation from model 2 is interpretable in this way (so, it should probably be
ignored). Note that the sum of the squared semi-partial correlation associated with the linear
effect (model 1) and the squared semi-partial correlation associated with the quadratic effect
(model 2) essentially equals the model 2 R2 of .063 (.013 + .051 = .064). I don’t know for
certain if this will always be the case, but it (i.e., summing model 1 linear squared and model 2
quadratic squared) does seem to apply across several data files I have.
Before moving onto the all-important scatter plot, I’d like to point out that the
collinearity statistics (Tolerance and VIF) suggest that collinearity is a problem in the data, as
the tolerance is less than .10 and the VIF is greater than 10, in this example. There is much
literature on the issue that the linear effect term and the quadratic effect term necessarily
correlate with each other very substantially (in this case, they correlated r = .958). The concern is
that the substantial correlation will cause unacceptably high multi-collinearity. In this example,
you will note that tolerance was reported at .081 and the VIF was reported at 12.29 (see
‘Coefficients’ SPSS table). Such values suggest a problem (recall that tolerance should not fall
below .10 and VIF should not exceed 10). However, in a curvilinear regression analysis, you
should not trust these multicollinearity values. They are misleading and untrustworthy. If
multicollinearity were really as bad as VIF = 12.29, the standard errors associated with the
unstandardized beta weights would be larger than they are (e.g., gaming_hours2 had a tiny
unstandardized standard error of .004). Furthermore, neither the linear nor the quadratic
effects would be significant statistically; instead, they were both significant statistically. Again,
multicollinearity is not a problem. Ignore it as an issue in regressions that include product
terms, such as curvilinear regression and moderated regression (Disatnik & Sivan, 2016).
Consequently, feel free to proceed with the analysis.
It is almost always essential to evaluate a curvilinear regression with a scatter plot and
line of best fit (Watch Video 14.10: Curvilinear (Non-Linear) Scatter Plot in SPSS). As can be
seen in Figure C14.3, the scatter plot depicted an inverted U-shaped relationship between
video game playing hours and mental well-being. Specifically, well-being tended to increase
from about 0 hours to 3 hours of video game playing, then started to decrease beyond 3 hours
of video game playing per day.7 Clearly, relying only upon the Pearson correlation of -.116
would mislead people in a serious way, as it failed to reflect the curvilinear nature of the
effect: small to moderate amounts of video game playing appear8 to be a good thing for the
mental well-being of 15-year-olds!
Figure C14.3. Scatter Plot Depicting the Curvilinear (Quadratic) Association Between Video
Game Playing (Hours Per Day) and Mental Well-Being
7 My data don’t quite replicate the results reported by Przybylski and Weinstein (2017). They reported
peaks in well-being at 2 hours per day, for weekdays, and 3 hours per day, for the weekends.
8 Based on these results, one could not contend justifiably that playing a small to moderate amount of
video games per day causes an increase in mental well-being. These data are non-experimental in
nature; consequently, one could only suggest the possibility of a causal effect. When I was 15, I played
about 1 hour per day on my Commodore 128 (in Commodore 64 mode). Those were the days!
another variable, one is dealing with an interaction. Correspondingly, you may regard a
curvilinear regression as the most basic type of interaction in a regression context: the nature
and/or strength of the association between X and Y depends on the values of X. In the next
section of this chapter, I describe moderated (interaction) multiple regression with two
different independent variables and one dependent variable.
product represents the interaction term in the analysis. As per a curvilinear analysis, one can
test a hypothesized interaction effect via a hierarchical multiple regression. At step 1, the two
independent variables are entered into the equation. Then, at step 2, the product variable is
entered into the equation. If a statistically significant R2 change (as tested via F-change) is
observed between step 1 and step 2, a statistically significant interaction may be declared.
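The same two-step logic can be sketched outside of SPSS with statsmodels; the file and column names are assumptions based on the data file described below.

```python
# A minimal sketch of a moderated (interaction) hierarchical regression,
# assuming columns named 'iq', 'motivation', and 'gpa' (hypothetical names)
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("IQ_motivation_gpa.csv")    # hypothetical file name
df["product"] = df["iq"] * df["motivation"]  # the interaction (product) term

step1 = smf.ols("gpa ~ iq + motivation", data=df).fit()
step2 = smf.ols("gpa ~ iq + motivation + product", data=df).fit()

# A statistically significant F-change implies a moderation (interaction) effect
print(step2.compare_f_test(step1))  # returns (F, p, df difference)
```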
I simulated some data to correspond to the results reported by Hirschfeld et al. (2004)
(Data File: IQ_motivation_gpa). First, I created the product term by multiplying the
intelligence scores by the academic motivation scores. In SPSS this can be done via the
‘Compute variable’ utility (Watch Video 14.11: Create Product Term Variable in SPSS). Next, I
examined the Pearson correlations between the variables, which is always good practice for
virtually any multiple regression analysis.
As can be seen in the SPSS table entitled ‘Correlations’, both intelligence (r = .42, p <
.001) and academic motivation (r = .35, p < .001) correlated positively with academic
performance (GPA). However, intelligence and academic motivation did not correlate with
each other significantly (r = .06, p = .254). Furthermore, the intelligence by motivation product
term correlated positively with GPA (r = .51, p < .001). Finally, the product term correlated
with intelligence and academic motivation at .56 and .86, respectively. Am I worried about the large
correlation of .86 between the product term and academic motivation? Not one bit. As far as
multicollinearity is concerned, it is not a problem. It’s an illusion.
Next, I conducted the hierarchical multiple regression in SPSS (Watch Video 14.12:
Moderated Multiple Regression in SPSS). There are three key tables to consult to interpret the
results fully: the ‘Model Summary’ table, the ‘ANOVA’ table, and the ‘Coefficients’ table. As
can be seen in the ‘Model Summary’ table below, the Model 1 results are reported, which
includes the model R and R square associated with the two predictors included in the analysis.
Thus, 28.2% of the variance in GPA was accounted for by intelligence and academic
motivation, which was statistically significant, F(2, 361) = 70.99, p < .001. We don’t know
which variable contributed to the equation, yet. Those results are reported in the
‘Coefficients’ table, where it can be seen that both intelligence and academic motivation were
statistically significant contributors to the model. Specifically, intelligence: b = .002, β = .40, t =
8.97, p < .001; academic motivation: b = .24, β = .33, t = 7.30, p < .001.
The most important piece of information in the SPSS ‘Model Summary’ table is the ‘R
square change’ and associated F- and p-value. In this case, the R square change was equal to
.010, which was statistically significant, F(1, 360) = 4.97, p = .026. The statistically significant R
square change implies that the moderator hypothesis was supported: it does appear to be the
case that the association between intelligence and GPA interacts with academic motivation.
One could just as appropriately state that the association between academic motivation and
GPA interacts with intelligence. They are two sides of the same coin. As can be seen in the
SPSS table entitled ‘Coefficients’, the intelligence by academic motivation product term was
associated with a statistically significant beta-weight, b = .001, β = 1.06, t = 2.23, p = .026. The
overall model including all three independent variables (i.e., Model 2) was associated with a
model R2 of .292 (model R = .540), which was statistically significant, F = 49.50, p < .001. It’s not a
coincidence that the sum of squares ‘Regression’ divided by the sum of squares ‘Total’ is equal
to the model R2 value (i.e., 30.913 / 105.851 = .292). As I describe further below, there is a
close correspondence between ANOVA and multiple regression. SPSS makes use of an ANOVA
framework to test the final model R2 for statistical significance, hence the inclusion of a table
entitled ‘ANOVA’.
Interpreting a statistically significant moderator effect in multiple regression is not
easy, especially if all you rely upon is the beta-weight. As per curvilinear regression, a
positively directed interaction term beta-weight implies that the association between the
independent variable and the dependent variable becomes progressively stronger across the
spectrum of the moderator variable values. In this study, which variable you would consider
the independent variable and which variable the moderator is arbitrary. Theoretically and
empirically, it is not possible to assert whether intelligence is the moderator of the effect
between academic motivation and GPA, or whether academic motivation is the moderator
of the effect between intelligence and GPA. It’s best to think about the moderator effect as a
mutual, multiplicative (synergistic) effect.
Prior to demonstrating a graphical method to depict the nature of a moderated effect,
it is useful to point out that neither the unstandardized nor the standardized beta-weights
associated with the interaction term are readily interpretable. For example, in this
investigation, the standardized beta-weight associated with the interaction term exceeded 1.0,
which is possible mathematically; it’s just not easy to interpret in an insightful way. Instead, I
recommend that you focus upon the semi-partial correlations associated with a moderated
multiple regression analysis, as per my recommendation for a curvilinear regression. That is, by
summing the squared semi-partial correlations associated with the main effects included in
model 1 (i.e., .400² + .325²) and the product term squared semi-partial correlation reported in
model 2 (i.e., .10²), you get close to the overall model 2 R square of .292 (i.e., .160 + .106 +
.010 = .276). The difference between the two R square values, .292 - .276 = .016, represents
the percentage of variance accounted for in the dependent variable due to the shared variance
between the predictor variables. Recall that intelligence and motivation were inter-correlated
at only r = .060, so, it makes sense that the common variance predictive influence on the
dependent variable was fairly small, in this example. I also note, briefly, that presence of a
suppressor effect would complicate the interpretation of the difference between the .292 and
.276 R square values. I discuss suppressor effects further in this chapter.
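Here is the same decomposition as a minimal Python sketch, based on the semi-partial correlations reported in the SPSS output above:

# Squared semi-partial (part) correlations from the SPSS 'Coefficients' tables
sr2_intelligence = .400 ** 2   # model 1 main effect
sr2_motivation = .325 ** 2     # model 1 main effect
sr2_product = .10 ** 2         # model 2 product term

unique_variance = sr2_intelligence + sr2_motivation + sr2_product
model_r2 = .292

# The remainder is variance predicted jointly (shared) by the predictors.
shared_variance = model_r2 - unique_variance
print(round(unique_variance, 3), round(shared_variance, 3))   # 0.276 0.016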
significant interaction detected by the moderated multiple regression (R2 change = .010, F =
4.97, p = .026), implies, at least in rough terms, that the two correlations are statistically
significantly different from each other. However, this will not necessarily be the case, as there
are many ways to split up the sample across the moderator continuum. In this example, the
statistically significant interaction implies that the correlation between intelligence and GPA
was not equal across all levels of academic motivation. In practice, we do not know exactly
where the split-sample correlations are unequal across the spectrum of the moderator.
Splitting the sample into low and high groups is a good start, and my demonstration above will
hopefully give you some insight into the graphical spotlight analysis.
Fortunately, rather than estimate the median and create the two groups manually,
O’Connor (1998) created a nice, little SPSS syntax file that will conduct a ‘split slope’ analysis in
a relatively efficient way (Watch Video 14.14: Spotlight Analysis in SPSS - Extreme). The default
in the O’Connor (1998) syntax file is to estimate the independent variable and dependent
variable slopes across three levels of the moderator: (1) one standard deviation below the
moderator mean, (2) the moderator mean, and (3) one standard deviation above the
moderator mean. Additionally, the X-axis data points are generated by the syntax file on the
basis of one standard deviation below and above the independent variable mean. When I ran
the O’Connor (1998) syntax with the default settings, I obtained the chart depicted in Figure
C14.5 (left-side). It can be seen that the degree of positive slope increased across the three
levels of academic motivation. Thus, the magnitude of the association between intelligence
and academic achievement (GPA) increased across the continuum of academic motivation.
That is, intelligence and academic motivation combine to impact academic performance in a
way that we would not predict based on an examination of the intelligence and motivation
scores separately. Something a little “special” appears to happen when relatively elevated
levels of intelligence are combined with relatively elevated levels of academic motivation.
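The arithmetic behind these simple slopes is worth seeing. In a moderated regression of the form GPA = b0 + b1(IQ) + b2(motivation) + b3(IQ × motivation), the slope of IQ at a particular motivation value m equals b1 + b3 × m. A minimal Python sketch follows; only the product term value (b3 = .001) comes from the output above, and the remaining values are hypothetical:

b1 = 0.020                   # hypothetical unstandardized beta-weight for intelligence
b3 = 0.001                   # product term beta-weight (as reported above)
m_mean, m_sd = 50.0, 10.0    # hypothetical motivation mean and standard deviation

# Simple slope of intelligence at three levels of the moderator
for label, m in [("-1 SD", m_mean - m_sd), ("mean", m_mean), ("+1 SD", m_mean + m_sd)]:
    print(label, round(b1 + b3 * m, 3))   # the slope steepens as motivation increases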
As mentioned, the default setting in the O’Connor (1998) syntax is to split the
moderator data across three groups, two of which are one standard deviation above and
below the mean (get syntax here). You can manipulate the O’Connor (1998) syntax such that
the moderator data are split across three groups, two of which are two (rather than one)
standard deviations above and below the mean. Increasing the standard deviation from one to
two standard deviations has the effect of accentuating the magnitude of the interaction effect.
As can be seen in Figure C14.5 (right side), the difference in the degree of slope is more
substantial for the two standard deviation criterion, in comparison to the one standard deviation
criterion. Consequently, in my view, the split slopes analysis commonly used by researchers
tends to exaggerate the magnitude of the interaction effect. By comparison, my original split
slopes analysis based on low-moderate and moderate-high groups (Figure C14.4) is arguably a
more appropriate (i.e., realistic) method to represent a two-way interaction for a moderator
analysis (you can also present a single scatter plot with two lines of best fit, as I demonstrate in
the video), as it is more consistent with the moderated multiple regression results, which are
based on all of the data (not extreme groups data). Finally, you might find it beneficial to split
the sample into a larger number of finer portions for a more comprehensive and refined
‘floodlight analysis’ (see Spiller et al., 2013).
Figure C14.4. Scatter Plots for Split Slope Analysis (Raw Data)
Low to Moderate Motivation (r = .38) Moderate to High Motivation (r = .46)
Correspondingly, the tolerance and variance inflation factor indicators will almost always
suggest that multicollinearity is grossly excessive. Consequently, researchers tend to apply
strategies to reduce the degree of shared variance between the predictors and the product
term. One commonly observed strategy involves subtracting each predictor variable’s mean
from the case values for a given variable, prior to calculating the product term to represent the
moderator effect. Such a strategy is known as ‘mean centering’.
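As a minimal sketch of what mean centering involves, in Python (pandas), with hypothetical data and column names:

import pandas as pd

# Hypothetical data frame with predictor columns 'iq' and 'motivation'
df = pd.DataFrame({"iq": [95, 110, 102, 120, 88],
                   "motivation": [3.1, 4.2, 3.8, 4.9, 2.7]})

# Subtract each predictor's mean from its case values...
df["iq_c"] = df["iq"] - df["iq"].mean()
df["motivation_c"] = df["motivation"] - df["motivation"].mean()

# ...prior to calculating the product term that represents the moderator effect
df["product"] = df["iq_c"] * df["motivation_c"]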
Although mean centering will often reduce multicollinearity to some degree, there is
much debate on whether multicollinearity is truly a problem for moderated multiple
regression. Personally, I’m in the ‘it’s not a problem’ camp. Like others, I believe the perceived
multicollinearity is really just an illusion. As demonstrated by Disatnik and Sivan (2016), for
example, the multicollinearity apparent when a product term is included in a multiple
regression is due to interval scaling. Furthermore, despite the observation of a very large
variance inflation factor in a moderated regression, multicollinearity due to interval scaling
does not impact (increase) standard errors for the beta-weights (Disatnik & Sivan, 2016).
Again, this sort of multicollinearity is an illusion. It should be kept in mind, however, that
multicollinearity due to a substantial amount of shared variance between separately measured
independent variables (i.e., not product terms) should be evaluated and taken into account,
when it is observed.
(Figure C14.6, Panel B). In the context of mediation analyses, a bivariate regression coefficient
(B) between X and Y is often referred to as a total effect. The direction of the arrow may be
considered to imply that the independent variable causes the dependent variable. Again, there
is no way to establish causality with a bivariate regression. However, more complicated
models can be tested to suggest the possibility of causality more convincingly.
Figure C14.6 Visual Depictions of the Pearson Correlation and Bivariate Regression Models
Panel A Panel B
One such model is a mediation regression model. In its simplest form, the purpose of
mediated regression is to help understand the nature of an association between two variables.
Specifically, a researcher may hypothesize that the total effect between two variables (X and Y)
may be due, partially or fully, to a third variable. A variable hypothesized to account, partially
or fully, for the association between two variables is known as a mediator (M). When a variable is
found to account, partially or fully, for the association between X and Y, researchers will often
state that there is an indirect effect of X on Y via M. In practice, the generation of a
hypothesis that includes a mediator should be supported by theoretical considerations.
For example, Stoeber and Corr (2016) were interested in the association between self-
oriented perfectionism and positive affect. As the name implies, self-oriented perfectionism
pertains to having high expectations for oneself, rather than the sort of perfectionism that is
perceived by a person to be imposed on them by others. Theoretically, one may hypothesize
that self-oriented perfectionism (X) influences individual differences in state positive affect (Y)
(feelings of enthusiasm, pride, etc.). Furthermore, it may be hypothesized that the association
between self-oriented perfectionism and positive affect may be due (mediated), partially or
fully, to individual differences in flourishing (M) (self-perceived success).
The hypothesized mediation models are depicted visually in Figure C14.7 (Watch
Video 14.15: Mediation - Purpose & Key Terms). In the partial mediation model (Panel A),
there is a straight arrow leading from X to Y (c). Additionally, there is an arrow leading from X
to M (a) and an arrow leading from M to Y (b). The coefficient associated with path (a) is
simply a bivariate regression coefficient. Paths (b) and (c) represent multiple regression beta
weights. Prior to testing the model depicted in Panel A, one would expect to see a statistically
significant total effect between self-oriented perfectionism and positive affect (based on a
Pearson correlation or a bivariate regression). When partial mediation is observed, the direct
effect (c) between X and Y will reduce in magnitude, but remain significant statistically (p <
.05). Thus, the observation of partial mediation will involve three direct effects that are
statistically significant (a, b, and c; see Figure C14.7, Panel A). In addition to the three direct
effects, there is also one indirect effect associated with the partial mediation model, although
it is not directly visible in the diagram. The indirect effect corresponds to the product of the X-->M
(a) and M-->Y (b) direct effect coefficients. Stated alternatively, the indirect effect represents
the association between X and Y that is due to the influence of M. In practice, testing the
statistical significance of the indirect effect is the tricky part, as most statistical programs do
not test them in an automatic way. However, in order for partial (or full) mediation to be
supported, the indirect effect must be found to be significant statistically.
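To make the product-of-paths logic concrete, here is a minimal Python sketch (statsmodels) based on simulated, hypothetical x, m, and y variables; the coefficient values are arbitrary:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=200)                      # hypothetical X
m = 0.5 * x + rng.normal(size=200)            # hypothetical mediator M
y = 0.4 * m + 0.2 * x + rng.normal(size=200)  # hypothetical Y

a = sm.OLS(m, sm.add_constant(x)).fit().params[1]                          # X --> M path (a)
b = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit().params[2]    # M --> Y path (b), controlling X

indirect = a * b   # the indirect effect of X on Y via M
print(round(indirect, 3))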
Figure C14.7 Visual Depictions of Partial (Panel A) and Full (Panel B) Mediation Models
Panel A Panel B
(Watch Video: Testing Mediation in SPSS - Example 1). As can be seen in the SPSS table entitled ‘Coefficients’,
the unstandardized (.102) and standardized (.140) beta-weights were significant statistically, p
= .006. Thus, there was a total effect to potentially mediate.
Unfortunately, SPSS does not provide a test of the indirect effect in an automatic way.
Fortunately, there are SPSS syntax/macro files that can be used to test an indirect effect for
statistical significance. A particularly popular approach was published by Preacher and Hayes
(2004). You can download the syntax file here: https://tinyurl.com/y9h3bqlx. Preacher and
Hayes’ (2004) program can test the significance of an indirect effect in two ways: (1) Sobel
test; and (2) bootstrapping. With respect to the Sobel test, I obtained the following key results
from the Preacher and Hayes (2004) SPSS macro:
It can be seen that the unstandardized indirect effect was estimated at .0849 (same as I calculated
by hand). Furthermore, the standard error was .0231, which yielded a z-value of 3.6702 (i.e.,
.0849/.0231, based on unrounded values) and a p-value of .0002. Thus, the indirect effect between self-oriented
perfectionism and positive affect via flourishing was significant statistically. Technically, these
results correspond only to the unstandardized effects. You would only be able to infer that
the standardized indirect effect (i.e., .117) was also significant statistically, based on the Sobel
test.
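I’ll note that the Sobel test itself is easy to compute by hand. A minimal Python sketch of the standard first-order Sobel formula follows; the a, b, and standard error values are hypothetical, for illustration only:

from math import sqrt
from scipy.stats import norm

def sobel(a, se_a, b, se_b):
    # First-order standard error of the product of two coefficients
    se_ab = sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = (a * b) / se_ab
    p = 2 * norm.sf(abs(z))      # two-tailed p-value
    return z, p

# Hypothetical path coefficients and standard errors
z, p = sobel(a=0.50, se_a=0.10, b=0.17, se_b=0.05)
print(round(z, 2), round(p, 4))   # 2.81 0.0049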
Fortunately, Preacher and Hayes’ (2004) macro also provides the option to test the
statistical significance of an indirect effect via bootstrapping. Some arguments have been
made that the Sobel test does not provide accurate results in all circumstances, particularly
when the sample size is small. Additionally, there is reason to believe that the sampling
distribution of the product of two coefficients may not be normal. Consequently, Preacher and
Hayes (2004) recommend testing the statistical significance of an indirect effect via
bootstrapping. As you may already know, bootstrapping does not assume normality. When I
applied the Preacher and Hayes (2004) macro based on 2,000 bootstrapped samples, I
obtained the following results.
It can be seen that the unstandardized indirect effect standard error is only slightly
different, in this case. Furthermore, the 95% confidence intervals associated with the point-
estimate (.0849) did not intersect with zero, which is consistent with a statistically significant
indirect effect.
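For readers curious about what the macro is doing, here is a minimal percentile-bootstrap sketch in Python (statsmodels); the x, m, and y variables are simulated and hypothetical:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)                        # hypothetical X
m = 0.5 * x + rng.normal(size=n)              # hypothetical mediator M
y = 0.4 * m + 0.2 * x + rng.normal(size=n)    # hypothetical Y

def indirect(x, m, y):
    # a*b: the X-->M coefficient times the M-->Y coefficient (controlling for X)
    a = sm.OLS(m, sm.add_constant(x)).fit().params[1]
    b = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit().params[2]
    return a * b

boot = []
for _ in range(2000):                          # 2,000 bootstrapped samples
    idx = rng.integers(0, n, size=n)           # resample cases with replacement
    boot.append(indirect(x[idx], m[idx], y[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])      # percentile 95% confidence interval
print(round(lo, 3), round(hi, 3))              # significant if the interval excludes zero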
Mediation models with only one hypothesized mediator are known as ‘simple’
mediation models. Often, a researcher may hypothesize the possibility of two or more
mediators in the same model. With such relatively complicated models, I would encourage
researchers to move on to a path analytic framework, which can be employed with a program
such as Amos, for example. If you do not have access to a path analysis program, you may also
consider Hayes’ more comprehensive mediation testing program known as PROCESS (a Google
search should find the most up-to-date version).
To provide an example of partial mediation, I have conducted an additional mediation
analysis on the following variables included in the Stoeber and Corr (2016) study: socially-
oriented perfectionism (a bad thing; perfectionism perceived to be imposed by others),
flourishing, and negative affect. In this case, one may hypothesize that a positive association
(total effect) may be present between socially-oriented perfectionism and negative affect.
Furthermore, flourishing may mediate, partially or fully, the total effect between socially-
oriented perfectionism and negative affect. First, I estimated the total effect between X and Y,
which was found to be significant statistically, unstandardized beta-weight = .390,
standardized beta-weight = .430, p < .001. Thus, there was the possibility of observing
mediation.
Next, the beta-weight between the independent variable (X) and the mediator (M) was
estimated (i.e., coefficient a), which was also significant statistically: unstandardized beta-
weight = -.299; standardized beta-weight = -.240, p < .001.
Finally, a multiple regression was performed by regressing the negative affect dependent
variable (Y) onto the social-perfectionism independent variable (X) and the flourishing
mediator variable (M). As can be seen in the SPSS table entitled ‘Coefficients’, the flourishing
mediator was associated with negative affect significantly: unstandardized beta-weight = -.167;
standardized beta-weight = -.230, p < .001. Importantly, there was also a statistically
significant direct effect between socially-oriented perfectionism and negative affect, p < .001.
nothing to mediate. As I describe in the Advanced Topics section of this chapter devoted to
suppressor effects, however, it is sometimes useful to test the possibility that a total effect will
increase in magnitude by the inclusion of one or more other variables, even if the total effect
is near zero! Such an analysis is not a mediation analysis. It is a suppression analysis.
Advanced Topics
It can be seen in the table entitled ‘Bootstrap for Coefficients’ that education was found to be a
statistically significant contributor to the model (b = 3756.29, p < .001, 95%CI: 1919.96 and
5354.43). However, IQ was not observed to be a statistically significant contributor to the
model (b = 501.98, p = .060, 95%CI: 53.37 and 1057.54). Very curiously, it will be noted that
the IQ variable’s 95% lower-bound and upper-bound confidence intervals were both positive in
value, which would suggest a statistically significant effect. Thus, the non-significant p-value
estimated via the bootstrap is not consistent with the confidence intervals estimated via the
bootstrap. It is tempting to think this happened because SPSS used two different types of
bootstrapping: the p-value is based on the percentile method, while the confidence intervals
are estimated based on the bias-corrected and accelerated (BCa) method. However, you will
get the same (inconsistent) result if you select the percentile 95% confidence interval
approach. I suspect part of the problem might relate to the fact that these data are slightly
heteroscedastic (as shown in the Foundations section of this chapter). Although conventional
bootstrapping can deal with non-normality, it does not deal with violations of the
homoscedasticity assumption. Below, I demonstrate how to conduct a wild bootstrap, an
estimation method that can accommodate both non-normality and heteroscedasticity.
one positive and one negative). By contrast, with values such as multiple R2 and bivariate r2, a
non-significant effect implies that the lower-bound reaches zero (it can’t be negative). To
overcome this distinction, statisticians have developed methods to calculate confidence
intervals for values such as R2 and r2 by using the non-central F-distribution (e.g., Lee, 1971). In
practice, this involves more complicated calculations, in comparison to calculating confidence
intervals for R or r. Fortunately, Zou (2007) developed an SPSS syntax file that can be used to
calculate confidence intervals for either multiple R2 or bivariate r2. You can download the
syntax file here: multiple R2 CIs. You can also learn how to use the syntax file by watching me
use it (Watch Video 14.20: Multiple R2 Confidence Intervals in SPSS). When I used Zou’s (2007)
SPSS syntax file to calculate the confidence intervals for the model R2 I reported for the annual
earnings, education and IQ multiple regression example described in the Foundations section
of this chapter (multiple R2 = .130, N = 205, two predictors), I obtained the following results.
Thus, if this study were repeated many times over with different samples all sized N =
205, we would expect intervals constructed in this way to capture the population multiple R2
90% of the time; in this case, the interval ranged from .068 to .197. It is important to note that
I specified 90% confidence intervals in the SPSS syntax file (i.e., conlev = .9000). That wasn’t a
mistake. For point-estimates that can only take on positive values, you need to specify 90%
confidence intervals to correspond to a statistical test with alpha = .05.
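For readers without access to SPSS, here is a minimal Python sketch of the non-central F logic (my own rough adaptation; the conversion between the noncentrality parameter and R2 uses the standard approximation nc = N·R2/(1 − R2), and the function name is mine):

from scipy.stats import ncf
from scipy.optimize import brentq

def r2_ci(r2, n, k, conlev=0.90):
    # Observed F for a model with k predictors and n cases
    df1, df2 = k, n - k - 1
    f_obs = (r2 / df1) / ((1 - r2) / df2)
    alpha = 1 - conlev

    # Find the noncentrality parameter at which the non-central F CDF,
    # evaluated at f_obs, equals the target tail probability.
    def solve(p):
        g = lambda nc: ncf.cdf(f_obs, df1, df2, nc) - p
        return brentq(g, 0, 1000) if g(0) > 0 else 0.0

    nc_lower, nc_upper = solve(1 - alpha / 2), solve(alpha / 2)
    # Convert noncentrality to R2: nc = n * R2 / (1 - R2)
    to_r2 = lambda nc: nc / (nc + n)
    return to_r2(nc_lower), to_r2(nc_upper)

print(r2_ci(r2=0.130, n=205, k=2))   # roughly (.068, .197)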
variables included in the model, as well as the magnitude of the correlations between the
independent variables.
Figure C14.8. Bivariate Regression (Panels A and B) and Multiple Regression (Panel C) Scatter Plots
Panel A Panel B Panel C
bivariate regression chapter. The distinction in this example is that there are two beta weights
for which 95% confidence intervals will be estimated. The confidence intervals can then be
used to test the difference between the standardized beta-weight point-estimates. When I ran
the analysis, I obtained the following result (Watch Video 14.22: Standardized Beta Weight
Confidence Intervals in SPSS).
It can be seen that the education predictor standardized beta weight was associated
with the 95% confidence intervals of .090 and .375. Similarly, the IQ predictor standardized
beta weight 95% confidence intervals were .031 and .316. Note that, if you perform the
analysis yourself, you may not get exactly the same results as reported above. The reason is
that your 1000 bootstrapped re-samples are very unlikely to be exactly the same 1000 re-
samples I obtained when I performed the analysis in SPSS. However, the results should be
fairly consistent across re-analyses, particularly if a minimum of 2,000 re-samples is selected.
If you wished to display the results, the lower-bound, point-estimate, and upper-bound
estimates can be depicted visually in a HILO chart in SPSS. I used the
following syntax (Data File: dif_between_b) (Watch Video 14.23: Test the Difference Between
two Beta Weights in SPSS).
GRAPH
/HILO(SIMPLE)=VALUE(UB LB Point).
The chart displayed in Figure C14.9 (Panel A) was produced by SPSS. Based on a visual
inspection of the confidence intervals, it is clear that there was much too much overlap to
suggest that there was a statistically significant difference between the standardized beta-
weight point-estimates (.232 vs. .174). In fact, the confidence intervals intersect with the point-
estimates, so, the p-value associated with the difference in the beta weights would be much
greater than .05, in this case. Consequently, it may not be concluded that years of education
was a more substantial contributor to the regression equation than IQ.
For the purposes of demonstration, I have created a fictitious set of data that were
consistent with a statistically significant difference in the beta-weights, as the confidence
intervals overlapped less than 50% (Data File: dif_between_b_sig). The chart displayed in
Figure C14.9 (Panel B) was created based on those fictitious data.
It can be seen that the lower-bound of education intersected with the upper-bound of
IQ. However, the amount of overlap was less than 50%; consequently, it may be suggested
that the standardized beta weight point-estimates (.232 vs. .174) were statistically significantly
different from each other (p < .05), in this supplementary fictitious case.
The exact calculations of the procedure involve two steps: (1) calculate the average
length of the two overlapping arms; and (2) calculate the amount of overlap. If the amount of
overlap is less than 50% of the length of the arms, reject the null hypothesis of equal
standardized beta weights, p < .05. Thus, in this case, the average length of the overlapping
arms (step 1) corresponded to (.037 + .031) / 2 = .034. The amount of overlap corresponded to
.205 - .195 = .010. Thus, because .010 is less than 50% of the average arm length (.010 / .034 =
.29 or 29%), the null hypothesis was rejected, p < .05. Finally, I’ll note that the procedure
described here assumes that the sample size is at least 30 (Cumming, 2009).
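Here is the same two-step procedure as a minimal Python sketch, using the arm lengths and overlap values from this example:

# Cumming's (2009) overlap rule for two independent 95% confidence intervals
# Step 1: average the lengths of the two overlapping arms
arm_1 = .037                       # e.g., education's lower CI arm
arm_2 = .031                       # e.g., IQ's upper CI arm
average_arm = (arm_1 + arm_2) / 2  # .034

# Step 2: compute the overlap between the two intervals
overlap = .205 - .195              # IQ upper bound minus education lower bound = .010

# Reject the null hypothesis of equal standardized beta weights (p < .05)
# if the overlap is less than 50% of the average arm length.
proportion = overlap / average_arm
print(round(proportion, 2), proportion < .50)   # 0.29 True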
(i.e., correlation between the standardized predicted values and the absolute standardized
residuals). As the assumption of homoscedasticity has not been satisfied, I cannot trust the
standard errors and/or the p-values associated with the multiple regression beta-weights.
Consequently, in this case, I need to try to deal with that somehow (e.g., conduct a wild
bootstrap multiple regression; adjust the standard errors; see remaining topics in this chapter).
Hypothetically, had the preferred Koenker test not produced a statistically significant result
(i.e., p > .05), I would not have had to do any extra work. Instead, I could potentially trust the non-
adjusted standard errors and p-values outputted by SPSS (or any other program).
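If you wish to check this outside of SPSS, statsmodels offers a studentized Breusch-Pagan test, which, to my understanding, corresponds to the Koenker variant. A minimal Python sketch with simulated, hypothetical data:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(3)
x = rng.normal(size=300)
y = 2 * x + rng.normal(size=300) * (1 + np.abs(x))   # heteroscedastic errors

exog = sm.add_constant(x)
resid = sm.OLS(y, exog).fit().resid
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(resid, exog)
print(round(lm_pvalue, 4))   # a small p-value suggests heteroscedasticity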
BOOTSTRAP
/SAMPLING METHOD=SIMPLE
/VARIABLES TARGET=earnings INPUT= education IQ
/CRITERIA CILEVEL=95 CITYPE=BCA NSAMPLES=2000
/MISSING USERMISSING=EXCLUDE.
REGRESSION
/MISSING LISTWISE
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT earnings
/METHOD=ENTER education IQ.
You need to change the second line of syntax (i.e., /SAMPLING METHOD=SIMPLE) to
/SAMPLING METHOD=WILD(RESIDUALS=RES_1). When you run the above syntax, you should
get the key multiple regression results in an SPSS table entitled ‘Bootstrap for Coefficients’:
It can be observed that, based on the wild bootstrap, IQ was not a statistically
significant contributor to the model, p = .086. Furthermore, the lower and upper bound 95%
confidence intervals were consistent with a non-significant p-value, as one value was negative
(-207.72) and one value was positive (862.38). Thus, application of the wild bootstrap, at least
in this case, overcame the awkward situation of a non-significant p-value, in conjunction with
confidence intervals suggestive of a statistically significant effect, which was observed in the
original bootstrap analysis above. (Watch Video 14.25: Wild Bootstrap Multiple Regression in
SPSS)
I’m not aware of any simulation research that has compared the wild bootstrap versus
Hayes and Cai’s (2007) adjusted standard error technique for dealing with heteroscedasticity.
My hunch, however, is that you will be steered in the right direction with the wild bootstrap, in
most cases. Plus, you get to use the term ‘wild bootstrap’ in your report, which is pretty cool.
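For readers curious about what the wild bootstrap actually does under the hood, here is a minimal Python sketch of one common variant (Rademacher weights; cf. Davidson & Flachaire, 2008); the data and variable names are simulated and hypothetical:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 205
education = rng.normal(12, 2, n)
iq = rng.normal(100, 15, n)
earnings = 3000 * education + 500 * iq + rng.normal(0, 20000, n)

X = sm.add_constant(np.column_stack([education, iq]))
fit = sm.OLS(earnings, X).fit()

boot_betas = []
for _ in range(2000):
    v = rng.choice([-1.0, 1.0], size=n)          # Rademacher weights
    y_star = fit.fittedvalues + fit.resid * v    # flip residuals; keep X fixed
    boot_betas.append(sm.OLS(y_star, X).fit().params)

lo, hi = np.percentile(np.array(boot_betas)[:, 2], [2.5, 97.5])
print(round(lo, 1), round(hi, 1))   # percentile 95% CI for the IQ beta-weight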
A substantial amount of time and space was devoted to dealing with the problem of
heteroscedasticity in this chapter, because it is a serious problem, and one which I believe
occurs fairly frequently with field data. However, researchers rarely test for it.
Hayes and Cai’s (2007) method is a viable option, as well, when the assumption of
homoscedasticity has been violated.
9
There is also Budescu’s (1993) dominance analysis to consider. However, the results of a dominance
analysis and a relative importance analysis almost always agree perfectly (Budescu & Azen, 2004).
Shapley’s Value Regression also provides results very similar to Johnson’s relative weights analysis, but
Shapley’s method is more complicated to compute, and, so, is less popular.
generated the data to have a sample size of 135, rather than 3,568, to make the results more
interesting (Data File: phd_student_satisfaction). When I ran the relative importance analysis
with the Lorenzo-Seva et al. (2010) syntax file (Watch: Relative Weights Analysis), I obtained
the following results.
First, the program produced a correlation matrix. As can be seen below, the academic
advice variable (‘aca_advi’) was associated with the largest zero-order Pearson correlation
(i.e., r = .681) with student satisfaction (‘satisfac’). However, the remaining three predictors
were also non-negligibly correlated with the dependent variable, as well as inter-correlated
with each other.
Next, the program reported the multiple R and multiple R2, with corresponding 95%
confidence intervals (a really nice touch). It can be seen below that 53.5% of the variance in
graduate student satisfaction with their supervisor was accounted for by the four predictors.
After the unstandardized beta-weights (not included here), the program reported the
standardized beta-weights and corresponding 95% confidence intervals. As can be seen below,
the academic advice predictor variable was associated with the numerically largest
standardized beta-weight (β = .423). The perceived cheap labor variable (‘cheap_la’) was the
only one of the four that was not found to be a statistically significant contributor to the
multiple regression equation, as the 95% confidence interval overlapped zero. However, as
discussed further below, it would be a mistake to conclude that a supervisor could simply treat
their students as cheap labor and that it would not have a negative impact.
Next, the program reported the structure coefficients. In multiple regression, a structure
coefficient represents the linear association between a predictor and the predicted values
derived from a multiple regression equation. It may be regarded as a competitor to the
relative importance analysis approach to evaluating the importance of predictors. As can be
seen below, the academic advice variable correlated .93 with the regression equation function,
which was the numerically largest correlate. Interestingly, despite the fact that the cheap labor
variable was not found to be a statistically significant unique contributor to the regression
equation (i.e., non-significant beta-weight), the cheap labor variable, nonetheless, correlated
with the regression equation function in a statistically significant way (lower and upper-bound
estimates were both negative). This means that treating graduate students as cheap labor
might impact their perception of their supervisors in a negative way. However, the impact
might be via its influence on another variable (e.g., personal touch?).
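For the interested reader, a structure coefficient can be computed directly as the Pearson correlation between a predictor and the predicted values saved from the regression (equivalently, the predictor’s correlation with the dependent variable divided by multiple R). A minimal Python sketch with simulated, hypothetical data:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 135
X = rng.normal(size=(n, 2))                    # two hypothetical predictors
y = X @ np.array([0.5, 0.3]) + rng.normal(size=n)

# Save the predicted values from the multiple regression
yhat = sm.OLS(y, sm.add_constant(X)).fit().fittedvalues

# Structure coefficient: correlation between each predictor and the predicted values
for j in range(X.shape[1]):
    print(round(np.corrcoef(X[:, j], yhat)[0, 1], 3))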
Finally, after a couple of preliminary relative importance analysis results (not included here),
the program reported the key relative weights. As can be seen below, the academic advice
variable was associated with a relative weight of 42.0%, which implies that, of the 53.5% of
variance accounted for in graduate student satisfaction with their supervisor, academic advice
produced 42.0% of that predicted variance. Next, personal touch and career development
produced nearly identical amounts of the predicted variance (≈ 25%). Finally, cheap labor
accounted for 7.0% of the predicted variance, an amount of predicted variance that was
statistically significant, as the lower-bound exceeded zero. The reason that cheap labor’s
standardized beta-weight was not significant was that it failed to contribute unique variance
to the regression equation. However, the relative importance analysis suggests that
cheap labor did contribute predictive variance that was common to one or more of the other
predictors included in the model.
It has been well-established that people can, to some degree, report their own level of
intelligence accurately (Freund & Kasten, 2012). Typically, researchers report a correlation of
about .30 between self-reported IQ and objective intelligence measures (task-based IQ).
Gignac (2018) hypothesized that a construct known as self-deceptive enhancement (similar to
narcissism) would correlate positively with self-reported IQ, but not with task-based IQ. Based
on such a pattern of correlations, one would expect that the association between self-reported
IQ and task-based IQ would increase, if the influence of self-deceptive enhancement on the
self-reported IQ scores were taken into consideration. Stated alternatively, if self-reported IQ
scores could be ‘decontaminated’ of self-deceptive enhancement, then the association
between a decontaminated self-reported IQ variable and task-based IQ would be expected to
be larger than the association between the original (contaminated) self-reported IQ scores and
the task-based IQ scores.
To test the hypothesis, Gignac (2018) administered a self-report IQ questionnaire
(SRIQ), a task-based IQ test (TBIQ), and a self-report measure of self-deceptive enhancement
(SDE) to a sample of 253 young adults. Gignac (2018) found the following Pearson correlations:
SRIQ and TBIQ r = .16, p = .011; SRIQ and SDE r = .44, p < .001; SDE and TBIQ r = -.05, p = .428.
Thus, self-reported IQ was only relatively weakly associated with task-based IQ. However,
because self-deceptive enhancement was fairly substantially correlated positively with self-
reported IQ and very weakly with task-based IQ, there was the possibility of a suppressor
situation.
It should be acknowledged that there are several different types of suppressor
situations (Paulhus, Robins, Trzesniewski & Tracy, 2004). The Gignac (2018) suppressor
situation is consistent with ‘classical suppression’, where X and Y are (usually) relatively weakly
inter-related, X and S are inter-related positively, and Y and S are essentially or entirely
unrelated.10 Classical suppression may be argued to be best tested by comparing the squared
zero-order correlation between X and Y with the squared semi-partial correlation between X
and Y, controlling for the effects of S on X (McGrath, 2010; Smith, Ager Jr & Williams, 1992;
Velicer, 1978).
Recall that the zero-order Pearson correlation between self-reported IQ and task-
based IQ was r = .16, thus r2 = .026. In order to obtain the semi-partial correlation between
self-reported IQ and task-based IQ in SPSS, I entered SRIQ and SDE as independent variables
and TBIQ as a dependent variable in a multiple regression. Based on the output from SPSS, I
obtained a semi-partial correlation of r = .202 (see column ‘Part’ in the SPSS table below
10
A classical suppression situation would also be observed where X and S are unrelated, but Y and S are
related. The main issue is that either the independent variable or the dependent variable is
decontaminated of a source of “nuisance” variance (nuisance at least with respect to the association
between X and Y).
entitled ‘Coefficients’). Thus, the squared semi-partial correlation was equal to r2 = .041. To my
knowledge, a method to test statistically the difference between a squared zero-order
correlation and a squared semi-partial correlation has not yet been established. I suspect the
sample sizes required to detect a significant difference in the r-squared values with
respectable power would be prohibitively large.
Fortunately, the results of Paunonen and Lebel’s (2012) simulation work can offer
guidance with respect to evaluating possible suppressor effects from the perspective of effect
size. Specifically, based on the results of Paunonen and Lebel (2012), moderate and substantial
suppressor situations may be regarded as consistent with an increase of 1% and 2% in
predictive validity, respectively. Thus, in the Gignac (2018) SRIQ and TBIQ case, the difference
in the semi-partial r2 and the zero-order r2 was Δr2 = .015 (i.e., .041 - .026 = .015)11, which
corresponds to a 1.5% increase, i.e., consistent with somewhere between a moderate and
substantial classical suppressor effect.
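A minimal Python sketch of this comparison follows (residualising X on S and correlating the residual with Y yields the semi-partial correlation; the data are simulated only to roughly mimic the pattern of correlations reported above):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2018)
n = 253
sde = rng.normal(size=n)                        # self-deceptive enhancement (S)
sriq = 0.44 * sde + rng.normal(size=n)          # self-reported IQ (X), contaminated by S
tbiq = 0.20 * sriq - 0.10 * sde + rng.normal(size=n)   # task-based IQ (Y)

# 'Decontaminate' X: residualise self-reported IQ on self-deceptive enhancement
sriq_resid = sm.OLS(sriq, sm.add_constant(sde)).fit().resid

r_zero = np.corrcoef(sriq, tbiq)[0, 1]          # zero-order r between X and Y
r_semi = np.corrcoef(sriq_resid, tbiq)[0, 1]    # semi-partial r (S removed from X)

# A positive difference in the squared values is consistent with classical suppression
print(round(r_semi**2 - r_zero**2, 3))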
Cooperative Suppression
It should be noted that many researchers would test a classical suppression hypothesis
via multiple regression, by regressing Y onto both X and S, with the expectation that the X
beta-weight will be observed to be statistically significantly larger than the X and Y zero-order
correlation. Based on both theoretical and statistical grounds, I do not believe this approach is
entirely appropriate for testing a classical suppression hypothesis. Instead, the multiple
regression approach is optimal for testing a suppression situation known as cooperative
suppression (a.k.a., reciprocal suppression; Paulhus et al., 2004). The cooperative suppression
situation is consistent with the following pattern of zero-order correlations: X and Y relatively
weakly inter-related; S and Y relatively weakly inter-related; and X and S inter-related
negatively. Thus, with respect to their associations with Y, both the X and S variables are
mutually suppressed. Therefore, when both X and S are included in a multiple regression to
predict Y, their beta-weights are observed to increase in magnitude, in comparison to their
respective zero-order correlations with Y.
11
The results reported in Gignac (2018) are slightly different, as the analyses were conducted with
maximum likelihood estimation in Amos with single-indicator latent variables; whereas, here, they have
been conducted with ordinary least squares estimation in SPSS, based on simulated data generated to
correspond closely to the results reported in Gignac (2018).
It can be seen that incarceration rate correlated positively with both income ratio (income
inequality) and divorce rate at r = .549 and r = .464, respectively. Thus, greater levels of income
inequality and greater levels of divorce were associated with greater levels of incarceration
rates. Furthermore, the income ratio variable correlated negatively with divorce rate at r = -
.214. Consequently, I regressed incarceration rate onto the income ratio and divorce rate
independent variables and obtained the following key results reported in the SPSS table
entitled ‘Coefficients’:
It can be seen that the income ratio variable was associated with a standardized beta-weight
equal to .680, which was numerically larger than the zero-order Pearson correlation between
income ratio and incarceration rate (i.e., .549). Such a result suggests the presence of a
suppression situation. Furthermore, the divorce rate variable was associated with a
standardized beta-weight equal to .610, which was numerically larger than the zero-order
Pearson correlation between divorce rate and incarceration rate (i.e., .464). Given that both
beta-weights were numerically larger than their respective zero-order Pearson correlations,
there was evidence to suggest cooperative suppression.
In practice, researchers do not test the difference between a standardized beta-weight
and a zero-order correlation statistically. I suspect the reason is that such a statistical test has
not yet been well-established. Instead, researchers simply interpret standardized beta-weights
that are larger than the corresponding zero-order correlations as evidence for cooperative
suppression. Based on such a practice, it may be contended that income inequality and divorce
rate suppress, in a mutual way, their respective statistical prediction of incarceration rate.
Ideally, a relatively easy-to-implement statistical test will be established, in the near future, to
test the numerical difference between a standardized beta-weight and a zero-order
correlation.
In light of the above, it may be best to focus upon the difference between the
independent variable squared semi-partial correlation and the corresponding squared zero-
order correlation. In this example, income inequality was associated with a squared semi-
partial correlation and a squared zero-order correlation of .441 and .301, respectively, which is
equal to a difference of .140. A value of .140 is much greater than the .02 expectation for a
substantial suppressor effect identified by Paunonen and Lebel’s (2012) simulation work.
Similarly, divorce rate was associated with a squared semi-partial correlation and a squared
zero-order correlation of .355 and .215, respectively. Again, the difference of .140 would
suggest a substantial suppressor effect.
In contrast to the example above, cooperative suppression can be observed when the
two independent variables have differentially directed zero-order correlations (one positive
and one negative), but are inter-correlated positively (Paulhus et al., 2004). Such a suppression
situation can also be found to be associated with larger standardized beta-weights than zero-
order correlations. Finally, there is the suppression situation where X1 is relatively weakly,
positively inter-related with Y (say, r = .10) and X2 is fairly substantially, positively inter-related
with Y (say, r = .45). Additionally, X1 and X2 are fairly substantially, positively inter-related with
each other. Such a pattern of correlations will result in beta-weights that are larger in absolute
magnitude than the corresponding validity coefficients for both X1 and X2; however, X1 is found
to be negatively associated with Y, controlling for the effects of X2 in a multiple regression.
Such a suppression situation is known as negative (or net) suppression (Paulhus et al., 2004).
My impression is that instances of negative suppression are far less common than classical and
cooperative suppression. Overall, contrary to popular belief, my experience suggests that
suppressor situations are observed in behavioral science data somewhat more commonly than
is typically appreciated. Consequently, it is important to keep suppression in mind as a
possibility, when conducting analyses. Additionally, the possibility of suppression is a reason
not to abandon a hypothesized mediation analysis simply because X and Y are not found to be
associated with a statistically significant zero-order correlation. True, there is no possibility of
mediation, when X and Y are not associated with a significant zero-order correlation. However,
there is the possibility of a suppressor situation, which may be explicable on theoretical
grounds.
Practice Questions
5: Absence frequency versus total number of days absent and manager impression.
What has a bigger impact on managers’ impression of employee absenteeism, the
frequency of employee absent periods, or the total number of days the employee is absent?
Breaugh (1981) collected official human resources data on absenteeism from 112 employees.
Absenteeism was measured in two ways: (1) total number of days absent; and (2) total
number of periods absent. Thus, a person may be absent for only one period within a year, but
it could be for several days in a row. By contrast, a person may miss 5 days of work throughout
a year, but each of those days were across the year (5 separate periods). Breaugh (1981) also
collected day from each employee’s manager with respect to their rating (impression) of work
attendance. Consequently, these data can help answer the question of which type of
absenteeism impacts a manager’s impression more: total number of absent days versus total
number of absent periods? Run a multiple regression with manager ratings as the dependent
variable. Total number of days and total number of periods should be included as the
independent variables. Test the difference between the two standardized beta-weights for
statistically significance (Data File: absence_frequency_total) (Watch: Absence Frequency).
References
Antonakis, J., & Dietz, J. (2011). Looking for validity or testing it? The perils of stepwise
regression, extreme-scores analysis, heteroscedasticity, and measurement
error. Personality and Individual Differences, 50(3), 409-415.
Barnes, J. C., Boutwell, B. B., Beaver, K. M., & Gibson, C. L. (2013). Analyzing the origins of
childhood externalizing behavioral problems. Developmental Psychology, 49(12), 2272.
Belsley, D. A., Kuh, E., & Welsch, R. E. (2004). Regression diagnostics: Identifying
influential data and sources of collinearity. New York: Wiley.
Bonaa, K. H., & Thelle, D. S. (1991). Association between blood pressure and serum lipids in a
population: the Tromso Study. Circulation, 83, 1305-1314.
Breaugh, J. A. (1981). Predicting absenteeism from prior absenteeism and work
attitudes. Journal of Applied Psychology, 66(5), 555-560.
Budescu, D. V. (1993). Dominance analysis: A new approach to the problem of relative
importance of predictors in multiple regression. Psychological Bulletin, 114, 542-551.
Budescu, D. V., & Azen, R. (2004). Beyond global measures of relative importance: Some
insights from dominance analysis. Organizational Research Methods, 7, 341-350.
Chopik, W. J., Kim, E. S., & Smith, J. (2015). Changes in optimism are associated with changes in
health over time among older adults. Social Psychological and Personality
Science, 6(7), 814-822.
Cumming, G. (2009). Inference by eye: Reading the overlap of independent confidence
intervals. Statistics in Medicine, 28(2), 205-220.
Davidson, R., & Flachaire, E. (2008). The wild bootstrap, tamed at last. Journal of
Econometrics, 146(1), 162-169.
Disatnik, D., & Sivan, L. (2016). The multicollinearity illusion in moderated regression
analysis. Marketing Letters, 27(2), 403-408.
Duckworth, A. L., Quinn, P. D., Lynam, D. R., Loeber, R., & Stouthamer-Loeber, M. (2011). Role
of test motivation in intelligence testing. Proceedings of the National Academy of
Sciences, 108(19), 7716-7720.
Dziuban, C. D., & Shirkey, E. C. (1974). When is a correlation matrix appropriate for factor
analysis? Some decision rules. Psychological Bulletin, 81(6), 358-361.
Flachaire, E. (2005). Bootstrapping heteroskedastic regression models: wild bootstrap
vs. pairs bootstrap. Computational Statistics & Data Analysis, 49(2), 361-376.
Freund, P.A., & Kasten, N. (2012). How smart do you think you are? A meta-analysis
on the validity of self-estimates of cognitive ability. Psychological Bulletin, 138,
296-321.
Gignac, G. E. (2018). Socially desirable responding suppresses the association between self-