Multiple Regression
• Multiple regression is not a single technique but a family of techniques that
can be used to explore the relationship between one continuous dependent
variable and a number of independent variables or predictors.
• You cannot just throw variables into a multiple regression and hope that,
magically, answers will appear. You should have a sound theoretical or
conceptual reason for the analysis and, in particular, for the order in which
variables enter the equation.
Address a variety of research questions
It can tell you how well a set of variables is able to predict a particular outcome
• For example, you may be interested in exploring how well a set of subscales on an intelligence
test is able to predict performance on a specific task
• Multiple regression will provide you with information about the model as a whole (all subscales)
and the relative contribution of each of the variables that make up the model (individual
subscales).
• As an extension of this, multiple regression will allow you to test whether adding a variable (e.g.
motivation) contributes to the predictive ability of the model, over and above those variables
already included in the model.
Multiple Regression Main point
• The literature in this area suggests that if people feel that they are in control of their lives, they
are less likely to experience ‘stress’
In the questionnaire, there were two different measures of control:
• the Mastery Scale, which measures the degree to which people feel
they have control over the events in their lives;
• the Perceived Control of Internal States Scale (PCOISS), which measures the
degree to which people feel they have control over their internal states (their
emotions, thoughts and physical reactions).
• In this example, I am interested in exploring how well the Mastery
Scale and the PCOISS are able to predict scores on a measure of
perceived stress.
Variables:
• Total perceived stress (tpstress): total score on the Perceived Stress Scale. High scores indicate
high levels of stress.
• Total Perceived Control of Internal States (tpcoiss): total score on the Perceived Control of
Internal States Scale. High scores indicate greater control over internal states.
• Total Mastery (tmast): total score on the Mastery Scale. High scores indicate higher levels of
perceived control over events and circumstances.
• Total Social Desirability (tmarlow): total score on the Marlowe-Crowne Social Desirability
Scale, which measures the degree to which people try to present themselves in a positive light.
• Age: age in years.
Example of research questions:
1. How well do the two measures of control (mastery, PCOISS) predict perceived
stress? How much variance in perceived stress scores can be explained by
scores on these two scales?
2. Which is the best predictor of perceived stress: control of external events
(Mastery Scale) or control of internal states (PCOISS)?
3. If we control for the possible effect of age and socially desirable responding, is
this set of variables still able to predict a significant amount of the variance in
perceived stress?
What you need
1. One continuous dependent variable (Total perceived stress)
2. Two or more continuous independent variables (Total Mastery, Total PCOISS)
• Multiple regression tells you how much of the variance in your dependent
variable can be explained by your independent variables.
Question 1: How well do the two measures of control (mastery, PCOISS) predict perceived
stress? How much variance in perceived stress scores can be explained by scores on these two
scales?
Question 2: Which is the best predictor of perceived stress: control of external events (Mastery
Scale) or control of internal states (PCOISS)?
Step 1: Checking the assumptions
You probably don’t want to include two variables with a bivariate correlation of .7 or more in
the same analysis.
If you find yourself in this situation, you may need to consider omitting one of the variables or
forming a composite variable from the scores of the two highly correlated variables. In the
example presented here the correlation between the two independent variables is .52, which is
less than .7; therefore all variables will be retained.
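The .7 screening rule above can be sketched in a few lines of Python. This is only an illustration, not SPSS output: the function names `pearson_r` and `too_collinear` and the short score lists are hypothetical, invented for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def too_collinear(r, cutoff=0.7):
    """True if two predictors correlate at or above the cutoff (in absolute value)."""
    return abs(r) >= cutoff

# HYPOTHETICAL scores for six respondents (not the real data file).
tmast = [18, 22, 25, 19, 27, 21]
tpcoiss = [55, 60, 58, 49, 70, 63]

r = pearson_r(tmast, tpcoiss)
print(f"r = {r:.2f}, omit or combine a variable? {too_collinear(r)}")

# The correlation reported in the text (.52) is below the cutoff.
print(too_collinear(0.52))
```

The absolute value is used because a strong negative correlation between two predictors is just as much of a problem as a strong positive one.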
Collinearity Diagnostics
Step 2: Evaluating the model
R Square. This tells you how much of the variance in the dependent
variable (perceived stress) is explained by the model (which includes the
variables of Total Mastery and Total PCOISS). In this case, the value is
.468, so the model explains 46.8 per cent of the variance in perceived stress.
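For a model with exactly two predictors, R Square can be computed directly from three correlations. The sketch below shows that relationship. The function name is hypothetical, and the two outcome correlations are made-up illustrative inputs, not values taken from the output discussed here. Only the .52 predictor correlation comes from the text.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R Square for a regression with exactly two predictors.

    r_y1, r_y2: correlations of the outcome with predictor 1 and predictor 2.
    r_12: correlation between the two predictors.
    """
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# r_12 = .52 is the predictor correlation given in the text; the two
# outcome correlations are HYPOTHETICAL values chosen for illustration.
r2 = r_squared_two_predictors(r_y1=-0.61, r_y2=-0.58, r_12=0.52)
print(f"R Square = {r2:.3f} ({r2 * 100:.1f} per cent of variance explained)")
```

When the two predictors are uncorrelated (`r_12 = 0`), the formula reduces to the sum of the two squared correlations. The more the predictors overlap, the less each one adds to the total.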
Step 3: Evaluating each of the independent variables
Here we want to know which of the variables included in the model contributed to the prediction of the dependent variable.
In this case the largest beta coefficient (ignoring the negative sign) is –.42, which is for Total Mastery.
This means that this variable makes the strongest unique contribution to explaining the dependent
variable, when the variance explained by all other variables in the model is controlled for. The Beta
value for Total PCOISS was slightly lower (–.36), indicating that it made less of a unique contribution.
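With two predictors, the standardised betas can be written in terms of the same three correlations, which makes it easier to see why each beta reflects a contribution "over and above" the other variable. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the outcome correlations are made-up inputs chosen so that the results land near the betas reported above (–.42 and –.36).

```python
def standardised_betas(r_y1, r_y2, r_12):
    """Standardised regression coefficients for a two-predictor model."""
    denom = 1 - r_12**2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

# HYPOTHETICAL outcome correlations; r_12 = .52 is the value from the text.
beta_mast, beta_pcoiss = standardised_betas(r_y1=-0.61, r_y2=-0.58, r_12=0.52)

betas = {"Total Mastery": beta_mast, "Total PCOISS": beta_pcoiss}
for name, beta in betas.items():
    print(f"{name}: beta = {beta:.2f}")

# Compare the absolute values, ignoring the sign, to find the strongest predictor.
strongest = max(betas, key=lambda name: abs(betas[name]))
print("Strongest unique contribution:", strongest)
```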
Check the value in the column marked Sig. This tells you whether this variable is making a statistically
significant unique contribution to the equation.
If the Sig. value is less than .05 (.01, .0001, etc.), the variable is making a significant unique
contribution to the prediction of the dependent variable.
In this case, both Total Mastery and Total PCOISS made a unique, and statistically significant,
contribution to the prediction of perceived stress scores.
In this example, the Mastery Scale has a part correlation coefficient of –.36. If we square this (multiply it
by itself) we get .13, indicating that Mastery uniquely explains 13 per cent of the variance in Total
perceived stress scores. For the PCOISS the value is –.31, which squared gives us .09, indicating a unique
contribution of 9 per cent to the explanation of variance in perceived stress.
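In a two-predictor model, a part correlation equals the standardised beta multiplied by the square root of (1 minus the squared correlation between the predictors). The sketch below applies this to the betas quoted earlier. It assumes that the .52 reported under Step 1 is the correlation between Total Mastery and Total PCOISS. The function name is hypothetical.

```python
import math

def part_correlation(beta, r_12):
    """Part (semipartial) correlation for one predictor in a two-predictor model."""
    return beta * math.sqrt(1 - r_12**2)

# ASSUMPTION: .52 is the Mastery-PCOISS correlation reported under Step 1.
r_12 = 0.52

for name, beta in [("Total Mastery", -0.42), ("Total PCOISS", -0.36)]:
    part = part_correlation(beta, r_12)
    unique = part**2  # squared part correlation = proportion of unique variance
    print(f"{name}: part r = {part:.2f}, unique variance = {unique:.2f}")
```

With these inputs the rounded results match the text: –.36 and .13 for Mastery, –.31 and .09 for the PCOISS. This also shows why the PCOISS figure is .09: squaring the already rounded –.31 would give .096, which rounds to .10, whereas the unrounded part correlation squares to about .095.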
The next thing we want to know is which of the variables included in the model
contributed to the prediction of the dependent variable. We find this information in
the output box labelled Coefficients. Look in the column labelled Beta under
Standardised Coefficients. To compare the different variables it is important that
you look at the standardised coefficients, not the unstandardised ones.
‘Standardised’ means that these values for each of the different variables have been
converted to the same scale so that you can compare them. If you were interested
in constructing a regression equation, you would use the unstandardised coefficient
values listed as B.
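To illustrate the difference between the two kinds of coefficient, the sketch below converts standardised betas into unstandardised B values and builds a regression equation from them. The betas are the ones quoted in this section. Every mean and standard deviation is hypothetical, as are the function names, so the resulting equation is not the one SPSS would report for the real data.

```python
def unstandardised_b(beta, sd_outcome, sd_predictor):
    """Convert a standardised beta to an unstandardised B (raw-score units)."""
    return beta * sd_outcome / sd_predictor

def predict(intercept, coefficients, scores):
    """Predicted outcome from the regression equation: a + B1*x1 + B2*x2 + ..."""
    return intercept + sum(b * x for b, x in zip(coefficients, scores))

# Betas from the text; all means and SDs below are HYPOTHETICAL.
beta_mast, beta_pcoiss = -0.42, -0.36
mean_stress, sd_stress = 26.0, 6.0
mean_mast, sd_mast = 22.0, 4.0
mean_pcoiss, sd_pcoiss = 60.0, 12.0

b_mast = unstandardised_b(beta_mast, sd_stress, sd_mast)
b_pcoiss = unstandardised_b(beta_pcoiss, sd_stress, sd_pcoiss)

# The intercept makes the equation pass through the point of means.
intercept = mean_stress - b_mast * mean_mast - b_pcoiss * mean_pcoiss

print(f"stress = {intercept:.2f} + ({b_mast:.2f} * tmast) + ({b_pcoiss:.2f} * tpcoiss)")
print(predict(intercept, [b_mast, b_pcoiss], [25, 65]))
```

The B values depend on the units of each scale, which is why they cannot be compared with one another directly. The betas can be compared because they are all expressed in standard deviation units.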