Multiple Linear Regression 1

Multiple Linear Regression I
Image source:http://commons.wikimedia.org/wiki/File:Vidrarias_de_Laboratorio.jpg
Lecture 7
Survey Research & Design in Psychology
James Neill, 2016
Creative Commons Attribution 4.0
Overview
1. Correlation (Review)
2. Simple linear regression
3. Multiple linear regression
– General steps
– Assumptions
– R, coefficients
– Equation
– Types
4. Summary
5. MLR I Quiz - Practice questions
2
Readings
1. Howitt & Cramer (2011/2014):
– Regression: Prediction with precision
[Ch 8/9] [Textbook/eReserve]
– Multiple regression & multiple correlation
[Ch 31/32] [Textbook/eReserve]
2. Tabachnick & Fidell (2013).
Multiple regression
(includes example write-ups) [eReserve]
3. StatSoft (2016). How to find relationship
between variables, multiple regression. StatSoft
Electronic Statistics Handbook. [Online]
3
Correlation (Review)
Linear relation between two variables
Purposes of correlational statistics
Explanatory - Regression
e.g., hours of study → academic grades
Predictive - Regression
e.g., demographics → life expectancy
5
Linear correlation
● Linear relations between continuous
variables
● Line of best fit on a scatterplot
6
Correlation is shared variance
[Venn diagram: two overlapping circles with areas labelled .68, .32 and .68]
Venn diagrams are helpful for depicting relations between variables.
Correlation – Key points
• Covariance = sum of cross-products (unstandardised)
• Correlation = sum of cross-products (standardised), ranging from -1 to 1 (sign indicates direction, value indicates size)
• Coefficient of determination (r2) indicates % of shared variance
• Correlation does not necessarily equal causality
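As a minimal sketch of these key points, the following Python computes a covariance, a correlation and a coefficient of determination. The data values and the function name are made up for illustration (they are not from the lecture), and the sample (N - 1) form of the covariance is assumed.

```python
from math import sqrt

def covariance_and_correlation(x, y):
    """Return (covariance, Pearson r) for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means
    cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    cov = cross_products / (n - 1)          # unstandardised
    sd_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    sd_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    r = cov / (sd_x * sd_y)                 # standardised, between -1 and 1
    return cov, r

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov, r = covariance_and_correlation(x, y)
r_squared = r ** 2                          # proportion of shared variance
print(cov, r, r_squared)
```

Dividing the covariance by both standard deviations is what removes the measurement units, which is why r can be compared across variable pairs while the covariance cannot.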
Simple linear
regression
Explains and predicts a Dependent Variable (DV) based on a linear relation with an Independent Variable (IV)
What is simple linear regression?
• An extension of correlation
• Best-fitting straight line for a scatterplot
between two variables. Involves:
• a predictor (X) variable – also called an
independent variable (IV)
• an outcome (Y) variable - also called a dependent
variable (DV) or criterion variable
• Uses an IV to explain/predict a DV
• Can help to understand possible causal
effects of one variable on another.
10
Least squares criterion
The line of best fit minimises the total sum of squares of the vertical deviations for each case.
• a = point at which line of best fit crosses the Y-axis
• b = slope of the line of best fit
• residuals = vertical (Y) distance between line of best fit and each observation (unexplained variance)
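The least squares criterion can be sketched in a few lines of Python. The data and the helper names (`fit_line`, `sum_squared_residuals`) are hypothetical and chosen only to illustrate the idea. The closed-form slope and intercept are the standard least squares solutions for one predictor.

```python
def fit_line(x, y):
    """Return (a, b): the intercept and slope that minimise squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # Y-intercept
    return a, b

def sum_squared_residuals(x, y, a, b):
    """Total of squared vertical distances between observations and the line."""
    return sum((yi - (b * xi + a)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
best = sum_squared_residuals(x, y, a, b)
# Any other line should give a larger sum of squared residuals
worse = sum_squared_residuals(x, y, a, b + 0.1)
print(a, b, best, worse)
```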
11
Linear Regression - Example:
Cigarettes & coronary heart disease
Example from Landwehr & Watkins (1987),
cited in Howell (2004, pp. 216-218) and accompanying lecture notes.
IV = Cigarette consumption
DV = Coronary Heart Disease
Linear regression - Example:
(Howell, 2004)
Research question:
How fast does CHD mortality rise
with a one unit increase in smoking?
• IV = Av. # of cigs per adult per day
• DV = CHD mortality rate (deaths per
10,000 per year due to CHD)
• Unit of analysis = Country
13
Linear regression - Data:
(Howell, 2004)
14
Scatterplot with Line of Best Fit
[Scatterplot: Cigarette Consumption per Adult per Day (X-axis) against CHD Mortality per 10,000 (Y-axis), with line of best fit]

Linear regression equation
(without error)
Ŷ = bX + a
Ŷ = predicted values of Y
b = slope = rate of increase/decrease of Ŷ for each unit increase in X
a = Y-intercept = level of Y when X is 0
Linear regression equation
(with error)
Y = bX + a + e
X = IV values
Y = DV values
a = Y-axis intercept
b = slope of line of best fit
(regression coefficient)
e = error
17
Linear regression – Example:
Equation
Variables:
• Ŷ (DV) = predicted rate of CHD mortality
• X (IV) = mean # of cigarettes per adult per day per country
Regression coefficients:
• b = rate of ↑/↓ of CHD mortality for each extra cigarette smoked per day
• a = baseline level of CHD (i.e., CHD when no cigarettes are smoked)
18
Explained variance
• r = .71
• R2 = .71^2 = .51 (computed from the unrounded r = .713)
• Approximately 50% of the variability in CHD mortality is associated with variability in smoking rates.
19
Test for overall significance
● R = .71, R2 = .51, p < .05
ANOVA (b)
             Sum of Squares   df   Mean Square   F       Sig.
Regression   454.482           1   454.48        19.59   .00 (a)
Residual     440.757          19    23.198
Total        895.238          20
a. Predictors: (Constant), Cigarette Consumption per Adult per Day
b. Dependent Variable: CHD Mortality per 10,000
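The R2 and F values can be recovered from the sums of squares and degrees of freedom in the ANOVA table. The sketch below uses the table's figures; the variable names are illustrative.

```python
# Values taken from the ANOVA table for the CHD example
ss_regression = 454.482
ss_residual = 440.757
df_regression = 1
df_residual = 19

ss_total = ss_regression + ss_residual

# R2 is the proportion of total variation explained by the regression
r_squared = ss_regression / ss_total

# Mean squares are sums of squares divided by their degrees of freedom
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of explained to unexplained variance per degree of freedom
f_ratio = ms_regression / ms_residual

print(round(r_squared, 2), round(f_ratio, 2))
```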
Regression coefficients - SPSS
Coefficients (a)
                                             Unstandardized        Standardized
                                             B      Std. Error     Beta    t     Sig.
a  (Constant)                                2.37   2.941                  .80   .43
b  Cigarette Consumption per Adult per Day   2.04   .461           .713    4.4   .00
a. Dependent Variable: CHD Mortality per 10,000
21
Making a prediction
● What if we want to predict CHD mortality when cigarette consumption is 6?
Ŷ = bX + a = 2.04X + 2.37
Ŷ = 2.04 * 6 + 2.37 = 14.61
● We predict that 14.61 / 10,000 people in a country with an average cigarette consumption of 6 per person per day will die of coronary heart disease per annum.
Accuracy of prediction - Residual
• In Finland, average consumption is 6 cigarettes/adult/day
• We predict 14.61 deaths / 10,000
• But Finland actually has 23 deaths / 10,000
• Therefore, the error (“residual”) for this case is 23 - 14.61 = 8.39
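The prediction and residual for this case can be reproduced with a short Python sketch. The coefficients come from the SPSS output above; the function and variable names are illustrative.

```python
B_SLOPE = 2.04       # unstandardised slope from the coefficients table
A_INTERCEPT = 2.37   # constant (Y-intercept) from the coefficients table

def predict_chd_mortality(cigarettes_per_day):
    """Predicted CHD deaths per 10,000 for a given mean cigarette consumption."""
    return B_SLOPE * cigarettes_per_day + A_INTERCEPT

predicted = predict_chd_mortality(6)
observed = 23                      # observed CHD deaths per 10,000 for Finland
residual = observed - predicted    # observed minus predicted

print(round(predicted, 2), round(residual, 2))
```

A positive residual means the observed value lies above the line of best fit, i.e., the model under-predicts for this country.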
23
[Scatterplot: Cigarette Consumption per Adult per Day (X-axis) against CHD Mortality per 10,000 (Y-axis), showing the prediction on the line of best fit and the residual for one case]

Hypothesis testing
Null hypotheses (H0):

• a (Y-intercept) = 0
• b (slope of line of best fit) = 0
25
Testing slope and intercept
Coefficients (a)
                                             Unstandardized        Standardized
                                             B      Std. Error     Beta    t     Sig.
a  (Constant)                                2.37   2.941                  .80   .43
b  Cigarette Consumption per Adult per Day   2.04   .461           .713    4.4   .00
a. Dependent Variable: CHD Mortality per 10,000
26
Linear regression - Example
Does a tendency to
‘ignore problems’ (IV)
predict
‘psychological distress’ (DV)?
27
[Scatterplot: Ignore the Problem (X-axis, 0 to 5) against Psychological Distress (Y-axis, 0 to 140), Rsq = 0.1058]
Line of best fit seeks to minimise sum of squared residuals.
PD is measured in the direction of mental health, i.e., high scores mean less distress.
Higher IP scores indicate greater frequency of ignoring problems as a way of coping.
Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .325 (a)   .106       .102                19.4851
a. Predictors: (Constant), IGNO2 ACS Time 2 - 11. Ignore
R = .32, R2 = .11, Adjusted R2 = .10
Ignoring Problems (the predictor) accounts for approximately 10% of the variation in Psychological Distress (the dependent variable).
29
ANOVA (b)
Model           Sum of Squares   df    Mean Square   F        Sig.
1  Regression     9789.888         1   9789.888      25.785   .000 (a)
   Residual      82767.884       218    379.669
   Total         92557.772       219
a. Predictors: (Constant), IGNO2 ACS Time 2 - 11. Ignore
b. Dependent Variable: GWB2NEG
The population relationship between Ignoring Problems and Psychological Distress is unlikely to be zero because p < .001 (SPSS displays Sig. = .000), i.e., reject the null hypothesis that there is no relationship.
Coefficients (a)
                                    Unstandardized Coefficients   Standardized Coefficients
Model                               B          Std. Error         Beta       t        Sig.
1  (Constant)                       118.897    4.351                         27.327   .000
   IGNO2 ACS Time 2 - 11. Ignore    -9.505     1.872              -.325      -5.078   .000
a. Dependent Variable: GWB2NEG
There is a significant a or constant (Y-intercept); this is the baseline level of Psychological Distress. In addition, Ignore Problems (IP) is a significant predictor of Psychological Distress (PD).
PD = 119 - 9.5*IP
[Scatterplot: Ignore the Problem (X-axis, 0 to 5) against Psychological Distress (Y-axis, 0 to 140), annotated with a = 119 (Y-intercept), b = -9.5 (slope) and e = error; PD = 119 - 9.5*IP, Rsq = 0.1058]

32
Linear regression summary
• Linear regression is for explaining or predicting the linear relationship between two variables
• Y = bX + a + e
• Ŷ = bX + a
(b is the slope; a is the Y-intercept)
33
Multiple Linear Regression
Linear relations between two or more IVs and a single DV
What is multiple linear regression (MLR)?
Visual model
Linear Regression: single predictor (X → Y)
Multiple Linear Regression: multiple predictors (X1, X2, X3, X4, X5 → Y)
35
What is MLR?
• Use of several IVs to predict a DV
• Weights each predictor (IV)
according to the strength of its
linear relationship with the DV
• Makes adjustments for inter-
relationships among predictors
• Provides a measure of overall fit (R)
36
What is MLR?
[Diagram: Correlation / Regression (X and Y) compared with Partial correlation / MLR (X1, X2 and Y)]
37
What is MLR?
A 3-way scatterplot can depict the correlational relationship between 3 variables.
However, it is difficult to graph/visualise 4+-way relationships via scatterplot.
General steps
1. Develop a visual model and
express a research question
and/or hypotheses
2. Check assumptions
3. Choose type of MLR
4. Interpret output
5. Develop a regression equation
(if needed)
39
LR → MLR example:
• ~50% of the variance in CHD
mortality could be explained by
cigarette smoking (using LR)
• Strong effect - but what about the
other 50% (‘unexplained’
variance)?
• What about other predictors?
–e.g., exercise and cholesterol?
40
MLR – Example
Research question 1
How well do these three IVs:
• # of cigarettes / day (IV1)
• exercise (IV2) and
• cholesterol (IV3)
predict
• CHD mortality (DV)?
Cigarettes, Exercise, Cholesterol → CHD Mortality
41
MLR – Example
Research question 2
To what extent do personality factors (IVs) predict annual income (DV)?
Extraversion, Neuroticism, Psychoticism → Income
42
MLR - Example
Research question 3
“Does the # of years of formal study
of psychology (IV1) and the no. of
years of experience as a
psychologist (IV2) predict clinical
psychologists’ effectiveness in
treating mental illness (DV)?”
Study, Experience → Effectiveness
43
MLR - Example
Your example
Generate your own MLR research
question (e.g., based on some of the following
variables):
• Gender & Age
• Stress & Coping
• Uni student satisfaction
– Teaching/Education
– Social
– Campus
• Time management
– Planning
– Procrastination
– Effective actions
• Health
– Psychological
– Physical
44
Assumptions
• Levels of measurement
• Sample size
• Normality (univariate, bivariate, and multivariate)
• Linearity: Linear relations between IVs & DVs
• Homoscedasticity
• Multicollinearity
– IVs are not overly correlated with one another
(e.g., not over .7)
• Residuals are normally distributed
45
Levels of measurement
• DV = Continuous
(Interval or Ratio)
• IV = Continuous or Dichotomous
(if neither, may need to recode
into a dichotomous variable
or create dummy variables)
46
Dummy coding
• “Dummy coding” converts a more
complex variable into a series of
dichotomous variables
(i.e., 0 or 1)
• So, dummy variables are
dichotomous variables created
from a variable with a higher level
of measurement.
47
Dummy coding - Example
• Religion
(1 = Christian; 2 = Muslim; 3 = Atheist)
can't be an IV in regression
(a linear correlation with a categorical
variable doesn't make sense).
• However, it can be dummy coded into
dichotomous variables:
– Christian (0 = no; 1 = yes)
– Muslim (0 = no; 1 = yes)
– Atheist (0 = no; 1 = yes) (redundant)
• These variables can then be used as IVs.
• More information (Dummy variable (statistics), Wikiversity)
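Dummy coding can be sketched as follows. The function name `dummy_code` and the sample responses are hypothetical. One category is treated as the reference and gets no column of its own, because a variable with k categories needs only k - 1 dummy variables (the last one is redundant, as noted above).

```python
def dummy_code(values, categories, reference):
    """Return one 0/1 list per non-reference category."""
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
        if category != reference
    }

# Hypothetical responses, for illustration only
religion = ["Christian", "Muslim", "Atheist", "Muslim"]
dummies = dummy_code(
    religion,
    categories=["Christian", "Muslim", "Atheist"],
    reference="Atheist",   # cases scoring 0 on every dummy belong here
)
print(dummies)
```

Each resulting list can be entered as a dichotomous IV; a case that scores 0 on every dummy variable belongs to the reference category.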
Sample size:
Some rules of thumb
• Enough data is needed to provide reliable estimates of the correlations.
• N >= 50 cases and N >= 10 to 20 times as many cases as there are IVs; otherwise the estimates of the regression line are probably unstable and are unlikely to replicate if the study is repeated.
• Green (1991) and Tabachnick & Fidell (2013) suggest:
– 50 + 8(k) for testing an overall regression model and
– 104 + k when testing individual predictors (where k is the number of IVs)
– Based on detecting a medium effect size (β >= .20), with critical α <= .05, with power of 80%.
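The two rules of thumb above are simple enough to express as a small helper. The function name and the choice of k = 3 are illustrative assumptions.

```python
def minimum_sample_sizes(k):
    """Rule-of-thumb minimum N for k IVs (medium effect, alpha .05, power .80)."""
    return {
        "overall_model": 50 + 8 * k,          # testing the overall regression
        "individual_predictors": 104 + k,     # testing individual predictors
    }

# Example: a model with three IVs
sizes = minimum_sample_sizes(3)
print(sizes)
```

If both the overall model and the individual predictors are of interest, the larger of the two values is the safer target.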
Dealing with outliers
Extreme cases should be deleted or modified if they are overly influential.
• Univariate outliers - detect via initial data screening
• Bivariate outliers - detect via scatterplots
• Multivariate outliers - unusual combination of predictors - detect via Mahalanobis' distance
Multivariate outliers
• A case may be within normal range for
each variable individually, but be a
multivariate outlier based on an unusual
combination of responses which unduly
influences multivariate test results.
• e.g., a person who:
– Is 18 years old
– Has 3 children
– Has a post-graduate degree
51
• Identify & check unusual cases
• Use Mahalanobis' distance or Cook’s D as a MV outlier screening procedure
52
• Mahalanobis' distance (MD)
– Distributed as χ2 with df equal to the number of
predictors (with critical α = .001)
– Cases with a MD greater than the critical value
are multivariate outliers.
• Cook’s D
– Cases with CD values > 1 are multivariate
outliers.
• Use either MD or CD
• Examine cases with extreme MD or CD scores - if in doubt, remove & re-run.
Normality & homoscedasticity
Normality
• If variables are non-normal, this can create heteroscedasticity
Homoscedasticity
• Variance around the regression line should be the same throughout the distribution
• Even spread in residual plots
Multicollinearity
• Multicollinearity - IVs shouldn't be overly correlated (e.g., over .7) - if so, consider removing one.
• Singularity - perfect correlations
among IVs.
• Leads to unstable regression
coefficients.
55
Multicollinearity
Detect via:
• Correlation matrix - are there large correlations among IVs?
• Tolerance statistics - if < .3 then exclude that variable.
• Variance Inflation Factor (VIF) - if > 3, then exclude that variable.
• VIF is the reciprocal of Tolerance (so use one or the other, not both)
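The relationship between Tolerance and VIF can be sketched in Python. Tolerance for an IV is 1 minus the R2 obtained when that IV is regressed on the other IVs; the function names and the example R2 value are illustrative assumptions.

```python
def tolerance_from_r_squared(r_squared_with_other_ivs):
    """Tolerance = proportion of an IV's variance NOT shared with the other IVs."""
    return 1 - r_squared_with_other_ivs

def vif_from_tolerance(tolerance):
    """Variance Inflation Factor is the reciprocal of Tolerance."""
    return 1 / tolerance

# A Tolerance at the .3 cut-off corresponds to a VIF of about 3.3
vif_at_cutoff = vif_from_tolerance(0.3)

# Hypothetical IV that shares 49% of its variance with the other IVs
tolerance = tolerance_from_r_squared(0.49)
vif = vif_from_tolerance(tolerance)

print(round(vif_at_cutoff, 2), round(tolerance, 2), round(vif, 2))
```

Because one statistic is the reciprocal of the other, a low Tolerance and a high VIF flag the same problem.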

Causality
• Like correlation, regression does
not tell us about the causal
relationship between variables.
• In many analyses, the IVs and DVs
could be swapped around –
therefore, it is important to:
–Take a theoretical position
–Acknowledge alternative explanations
57
Multiple correlation coefficient
(R)
• “Big R” (capitalised)
• Equivalent of r, but takes into
account that there are multiple
predictors (IVs)
• Always positive, between 0 and 1
• Interpretation is similar to that for r
(correlation coefficient)
58
Coefficient of determination (R2)
• “Big R squared”
• Squared multiple correlation
coefficient
• Usually report R2 instead of R
• Indicates the % of variance in
DV explained by combined
effects of the IVs
• Analogous to r2
59
Rule of thumb for interpretation of R2
• .00 = no linear relationship
• .10 = small (R ~ .3)
• .25 = moderate (R ~ .5)
• .50 = strong (R ~ .7)
• 1.00 = perfect linear relationship
R2 ~ .30 is good for social sciences
60
Adjusted R2
• R2 is explained variance in a sample.
• Adjusted R2 is used for estimating
explained variance in a population.
• Report R2 and adjusted R2
• Particularly for small N and where
results are to be generalised, take
more note of adjusted R2
61
Multiple linear regression –
Test for overall significance
• Shows if there is a linear
relationship between all of the X
variables taken together and Y
• Examine F and p in the ANOVA
table to determine the likelihood
that the explained variance in Y
could have occurred by chance
62
Regression coefficients
• Y-intercept (a)
• Slopes (b):
–Unstandardised
–Standardised
• Slopes are the weighted loading of
each IV on the DV, adjusted for the
other IVs in the model.
63
Unstandardised
regression coefficients
• B = unstandardised regression
coefficient
• Used for regression equations
• Used for predicting Y scores
• But can’t be compared with other Bs
unless all IVs are measured on the
same scale
64
Standardised
regression coefficients
• Beta (β) = standardised regression coefficient
• Useful for comparing the relative
strength of predictors
• β = r in LR but this is only true in
MLR when the IVs are uncorrelated.
65
Test for significance:
Individual variables
Indicates the likelihood of a linear
relationship between each variable
Xi and Y occurring by chance.
Hypotheses:
H0: βi = 0 (No linear relationship)
H1: βi ≠ 0 (Linear relationship
between Xi and Y)
66
Relative importance of IVs
• Which IVs are the most important?
• To answer this, compare the standardised regression coefficients (β’s)
67
Regression equation
Y = b1x1 + b2x2 +.....+ bixi + a + e
• Y = observed DV scores
• bi = unstandardised regression
coefficients (the Bs in SPSS) -
slopes
• x1 to xi = IV scores
• a = Y axis intercept
• e = error (residual)
68
Multiple linear regression -
Example
“Does ‘ignoring problems’ (IV1)
and ‘worrying’ (IV2)
predict ‘psychological distress’
(DV)”
69
70
[Venn diagram: Y overlapping with X1 and X2; values shown are .32, .52 and .35]
71
Example
Together, Ignoring Problems and Worrying explain 30% of the variance in Psychological Distress in the Australian adolescent population (R2 = .30, Adjusted R2 = .29).
Example
The explained variance in the population is unlikely to be 0 (p = .00).
73
Example
Coefficients (a)
                         Unstandardized           Standardized
Model                    B          Std. Error    Beta     t        Sig.
1  (Constant)            138.932    4.680                  29.687   .000
   Worry                 -11.511    1.510         -.464    -7.625   .000
   Ignore the Problem    -4.735     1.780         -.162    -2.660   .008
a. Dependent Variable: Psychological Distress
Worry predicts about three times as much variance in Psychological Distress as Ignoring the Problem, although both are significant, negative predictors of mental health.
Example – Prediction equations
Linear Regression
PD (hat) = 119 - 9.50*Ignore
R2 = .11
Multiple Linear Regression
PD (hat) = 139 - 4.7*Ignore - 11.5*Worry
R2 = .30
75
Confidence interval for the slope
Mental Health (PD) is reduced by between 8.5 and 14.5 units per one-unit increase in Worry.
Mental Health (PD) is reduced by between 1.2 and 8.2 units per one-unit increase in Ignore the Problem.
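These intervals can be approximated from the B and Std. Error columns as B ± (critical t × SE). The sketch below assumes a critical t of about 1.971, which corresponds to roughly 217 residual degrees of freedom; that in turn assumes the same N = 220 as the single-predictor model, which the slides do not state for this analysis. The function name is illustrative.

```python
def slope_confidence_interval(b, std_error, critical_t=1.971):
    """Approximate 95% confidence interval for an unstandardised slope."""
    margin = critical_t * std_error
    return b - margin, b + margin

# B and Std. Error from the coefficients table
worry_ci = slope_confidence_interval(-11.511, 1.510)
ignore_ci = slope_confidence_interval(-4.735, 1.780)

print([round(limit, 1) for limit in worry_ci])
print([round(limit, 1) for limit in ignore_ci])
```

Neither interval contains zero, which agrees with both predictors being statistically significant in the coefficients table.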
76
Multiple linear regression - Example
Effect of violence, stress, social support
on internalising behaviour problems
Kliewer, Lepore, Oskin, & Johnson, (1998)
77
Example - Study
• Participants were children:
– 8 - 12 years
– Lived in high-violence areas, USA
• Hypotheses:
– Violence and stress → ↑ internalising behaviour
– Social support → ↓ internalising behaviour
Example - Variables
• Predictors
–Degree of witnessing violence
–Measure of life stress
–Measure of social support
• Outcome
–Internalising behaviour
(e.g., depression, anxiety, withdrawal
symptoms) – measured using the
Child Behavior Checklist (CBCL)
79
Correlations (Pearson)
                                  Amount violenced   Current   Social    Internalizing
                                  witnessed          stress    support   symptoms on CBCL
Amount violenced witnessed
Current stress                    .050
Social support                    .080               -.080
Internalizing symptoms on CBCL    .200*              .270**    -.170
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
Correlations amongst the IVs: rows for Current stress and Social support.
Correlations between the IVs and the DV: row for Internalizing symptoms on CBCL.
80
R2
Model Summary
R         R Square   Adjusted R Square   Std. Error of the Estimate
.37 (a)   .135       .108                2.2198
a. Predictors: (Constant), Social support, Current stress, Amount violenced witnessed
81
Regression coefficients
Coefficients (a)
                              Unstandardized          Standardized
                              B        Std. Error     Beta     t      Sig.
(Constant)                    .477     1.289                   .37    .712
Amount violenced witnessed    .038     .018           .201     2.1    .039
Current stress                .273     .106           .247     2.6    .012
Social support                -.074    .043           -.166    -2     .087
a. Dependent Variable: Internalizing symptoms on CBCL
82
Regression equation
Ŷ = b1X1 + b2X2 + b3X3 + b0
= 0.038Wit + 0.273Stress - 0.074SocSupp + 0.477
• A separate coefficient or slope for each variable
• An intercept (here it's called b0)
83
Interpretation
Ŷ = b1X1 + b2X2 + b3X3 + b0
= 0.038Wit + 0.273Stress - 0.074SocSupp + 0.477
• Slopes for Witness and Stress are +ve; slope for Social Support is -ve.
• Holding Stress and Social Support constant, a one unit increase in Witness is associated with a .038 unit increase in Internalising symptoms.
84
Predictions
If Witness = 20, Stress = 5, and SocSupp = 35, then we would predict that internalising symptoms would be .012.
Ŷ = .038*Wit + .273*Stress - .074*SocSupp + .477
= .038(20) + .273(5) - .074(35) + .477
= .012
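The same prediction can be reproduced in Python. The coefficients are the unstandardised Bs from the coefficients table; the dictionary keys and function name are illustrative.

```python
# Unstandardised coefficients (B) from the coefficients table
COEFFICIENTS = {"Wit": 0.038, "Stress": 0.273, "SocSupp": -0.074}
INTERCEPT = 0.477

def predict_internalising(scores):
    """Predicted internalising symptoms: intercept plus sum of B * IV score."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * scores[name] for name in COEFFICIENTS
    )

prediction = predict_internalising({"Wit": 20, "Stress": 5, "SocSupp": 35})
print(round(prediction, 3))
```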
85
Multiple linear regression - Example
The role of human, social, built, and natural
capital in explaining life satisfaction at the
country level:
Towards a National Well-Being Index (NWI)
Vemuri & Costanza (2006)
86
Variables
• IVs:
–Human & Built Capital
(Human Development Index)
–Natural Capital
(Ecosystem services per km2)
–Social Capital
(Press Freedom)
• DV = Life satisfaction
• Units of analysis: Countries
(N = 57; mostly developed countries, e.g., in Europe
and America)
87
● There are moderately strong positive and
statistically significant linear relations between
the IVs and the DV
● The IVs have small to moderate positive
inter-correlations.
88
● R2 = .35
● Two sig. IVs (not Social Capital - dropped)
89
90
● R2 = .72 (after dropping 6 outliers)
Types of MLR
• Standard or direct (simultaneous)
• Hierarchical or sequential
• Stepwise (forward & backward)
92
Direct or Standard
• All predictor variables are entered
together (simultaneously)
• Allows assessment of the
relationship between all predictor
variables and the criterion (Y)
variable if there is good theoretical
reason for doing so.
• Manual technique & commonly used
93
Hierarchical (Sequential)
• IVs are entered in blocks or stages.
–Researcher defines order of entry for the
variables, based on theory.
–May enter ‘nuisance’ variables first to
‘control’ for them, then test ‘purer’ effect
of next block of important variables.
• R2 change - additional variance in Y explained at each stage of the regression.
– F test of R2 change.
94
Hierarchical (Sequential)
• Example
– Drug A is a cheap, well-proven drug which reduces
AIDS symptoms
– Drug B is an expensive, experimental drug which
could help to cure AIDS
– Hierarchical linear regression:
• Step 1: Drug A (IV1)
• Step 2: Drug B (IV2)
• DV = AIDS symptoms
• Research question: To what extent does Drug B
reduce AIDS symptoms above and beyond the effect
of Drug A?
• Examine the change in R2 between Step 1 & Step 2
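The F test of R2 change can be sketched with the usual formula: F = (R2 change / number of IVs added) / ((1 - R2 of the larger model) / (N - k - 1)). All numbers below are hypothetical (the slides give no results for the Drug A and Drug B example), and the function name is illustrative.

```python
def f_for_r_squared_change(r2_step1, r2_step2, n, k_step2, k_added):
    """Return (F, df1, df2) for the increase in R2 between two steps."""
    df_residual = n - k_step2 - 1
    explained_per_added_iv = (r2_step2 - r2_step1) / k_added
    unexplained_per_df = (1 - r2_step2) / df_residual
    return explained_per_added_iv / unexplained_per_df, k_added, df_residual

# Hypothetical results: Step 1 (Drug A) R2 = .20; Step 2 (add Drug B) R2 = .26
f_change, df1, df2 = f_for_r_squared_change(
    r2_step1=0.20, r2_step2=0.26, n=100, k_step2=2, k_added=1
)
print(round(f_change, 2), df1, df2)
```

The resulting F would be compared with the critical F for (df1, df2) to judge whether Drug B explains significant additional variance beyond Drug A.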
95
Forward selection
• The strongest predictor variables are entered, one by one, if they reach a criterion (e.g., p < .05)
• Best predictor = IV with the highest r with Y
• Computer-driven - controversial
96
Backward elimination
• All predictor variables are entered, then the weakest predictors are removed, one by one, if they meet a criterion (e.g., p > .05)
• Worst predictor = IV with the lowest r with Y
97
Stepwise
• Combines forward & backward.
• At each step, variables may be
entered or removed if they meet
certain criteria.
• Useful for developing the best
prediction equation from a large
number of variables.
• Redundant predictors removed.
98
Which method?
• Standard: To assess impact of
all IVs simultaneously
• Hierarchical: To test IVs in a
specific order (based on
hypotheses derived from theory)
• Stepwise: If the goal is accurate statistical prediction, e.g., from a large # of variables - computer driven
Summary
100
Summary: General steps
1. Develop model and hypotheses
2. Check assumptions
3. Choose type
4. Interpret output
5. Develop a regression equation
(if needed)
101
Summary: Linear regression
1. Best-fitting straight line for a
scatterplot of two variables
2. Y = bX + a + e
1. Predictor (X; IV)
2. Outcome (Y; DV)
3. Least squares criterion
4. Residuals are the vertical
distance between actual and
predicted values
102
Summary:
MLR assumptions
1. Level of measurement
2. Sample size
3. Normality
4. Linearity
5. Homoscedasticity
6. Collinearity
7. Multivariate outliers
8. Residuals should be normally
distributed
103
Summary:
Level of measurement and
dummy coding
1. Levels of measurement
1. DV = Continuous
2. IV = Continuous or dichotomous
2. Dummy coding
1. Convert complex variable into series of
dichotomous IVs
104
Summary:
MLR types
1. Standard
2. Hierarchical
3. Stepwise / Forward / Backward
105
Summary:
MLR output
1. Overall fit
1. R, R2, Adjusted R2
2. F, p
2. Coefficients
1. Relation between each IV and the DV,
adjusted for the other IVs
2. B, β, t, p, and rp
3. Regression equation (if useful)
Y = b1x1 + b2x2 +.....+ bixi + a + e
106
Practice quiz
107
MLR I Quiz –
Practice question 1
A linear regression analysis produces the
equation Y = 0.4X + 3. This indicates
that:
(a) When Y = 0.4, X = 3
(b) When Y = 0, X = 3
(c) When X = 3, Y = 0.4
(d) When X = 0, Y = 3
(e) None of the above
108
MLR I Quiz –
Practice question 2
Multiple linear regression is a
________ type of statistical analysis.
(a) univariate
(b) bivariate
(c) multivariate
109
MLR I Quiz –
Practice question 3
The following types of data can be used in
MLR (choose all that apply):
(a) Interval or higher DV
(b) Interval or higher IVs
(c) Dichotomous IVs
(d) All of the above
(e) None of the above
110
MLR I Quiz –
Practice question 4
In MLR, the square of the multiple correlation coefficient, R2, is called the:
(a) Coefficient of determination
(b) Variance
(c) Covariance
(d) Cross-product
(e) Big R
111
MLR I Quiz –
Practice question 5
In MLR, a residual is the difference
between the predicted Y and actual Y
values.
(a) True
(b) False
112
Next lecture
• Review of MLR I
• Semi-partial correlations
• Residual analysis
• Interactions
• Analysis of change
113
References
Howell, D. C. (2004). Chapter 9: Regression. In D. C. Howell, Fundamental statistics for the behavioral sciences (5th ed., pp. 203-235). Belmont, CA: Wadsworth.
Howitt, D., & Cramer, D. (2011). Introduction to statistics in psychology (5th ed.). Harlow, UK: Pearson.
Kliewer, W., Lepore, S. J., Oskin, D., & Johnson, P. D. (1998). The role of social and cognitive processes in children’s adjustment to community violence. Journal of Consulting and Clinical Psychology, 66, 199-209.
Landwehr, J. M., & Watkins, A. E. (1987). Exploring data: Teacher’s edition. Palo Alto, CA: Dale Seymour Publications.
Tabachnick, B. G., & Fidell, L. S. (2013). Multiple regression [includes example write-ups]. In Using multivariate statistics (6th ed., International ed., pp. 117-170). Boston, MA: Allyn and Bacon.
Vemuri, A. W., & Costanza, R. (2006). The role of human, social, built, and natural capital in explaining life satisfaction at the country level: Toward a National Well-Being Index (NWI). Ecological Economics, 58(1), 119-133.
114
Open Office Impress
● This presentation was made using
Open Office Impress.
● Free and open source software.
● http://www.openoffice.org/product/impress.html
115
