Correlation & Regression


Correlation

 Researchers often wish to measure the degree of relationship between two variables. For example, there is likely to be a relationship between age and reading ability in children. A test of correlation will provide you with a measure of the strength and direction of such a relationship.
 In a correlation there is no independent variable: you simply measure two variables. So, if someone wished to investigate the effect of smoking on respiratory function, they could measure both how many cigarettes people smoke and their respiratory function, and then test for a correlation between these two variables.
 Correlation does not imply causation. In any correlation, there could be a third variable which explains the association between the two variables that you measured. For example, there may be a correlation between the number of ice creams sold and the number of people who drown; here temperature is the third variable, which could explain the relationship between the measured variables. Even when there seems to be a clear cause-and-effect relationship, a correlation alone is not sufficient evidence for a causal relationship.
 Francis Galton carried out early work on correlation, and one of his colleagues, Pearson, developed a method of calculating correlation coefficients for parametric data: Pearson's Product Moment Correlation Coefficient (Pearson's r). If the data are not parametric, or if the relationship is not linear, then a nonparametric test of correlation, such as Spearman's rho, should be used.
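As an aside not on the original slides: both coefficients are easy to compute with SciPy. This is a minimal sketch; the age/reading-score data are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical data: children's age (years) and reading-test score
age = np.array([5, 6, 7, 8, 9, 10, 11, 12])
reading = np.array([20, 24, 31, 33, 38, 45, 47, 52])

# Pearson's r: assumes parametric data and a linear relationship
r, p_pearson = stats.pearsonr(age, reading)

# Spearman's rho: rank-based, suitable when those assumptions fail
rho, p_spearman = stats.spearmanr(age, reading)

print(f"Pearson r = {r:.3f} (p = {p_pearson:.4f})")
print(f"Spearman rho = {rho:.3f} (p = {p_spearman:.4f})")
```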
 Note that for a correlation to be acceptable, one should normally test at least 100 participants.
 The value of r indicates the strength of the correlation, and it is a measure of effect size. As a rule of thumb, r values of 0 to .2 are generally considered weak, .3 to .6 moderate, and .7 to 1 strong.
 The strength of the correlation alone is not necessarily an indication of whether it is an important correlation: normally the significance value should also be considered.
 With small sample sizes this is crucial, as strong correlations may easily occur by chance. With large to very large sample sizes, however, even a small correlation can be highly statistically significant.
 For example, if you carry out a correlational study with a sample of 100 and obtain an r of .2, it is significant at the .05 level, two-tailed; yet .2 is only a weak correlation. Thus we recommend you report the effect size, the statistical significance, and the proportion of variance explained.
 The proportion of variance explained does not have to be large to be important.
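The slide's example can be checked with the standard t-test for a correlation coefficient, t = r√(n−2)/√(1−r²) with n−2 degrees of freedom. A short sketch (not part of the original slides):

```python
import numpy as np
from scipy import stats

def r_significance(r, n):
    """Two-tailed p-value for a correlation r from a sample of size n,
    using the t-distribution with n - 2 degrees of freedom."""
    t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
    return 2 * stats.t.sf(abs(t), df=n - 2)

# The slide's example: a weak correlation that is still significant
print(r_significance(0.2, 100))   # ~0.046, below the .05 threshold
# The same weak r with a small sample is not significant
print(r_significance(0.2, 30))    # ~0.29
```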
Regression

 Regression is a statistical technique that allows us to predict someone's score on one variable on the basis of their scores on one or more other variables.
 Regression involves one dependent variable, which we term the 'criterion variable', and one or more independent variables, which we refer to as the 'predictor variables'.
 With only one predictor variable, we have bivariate regression; multiple regression involves two or more predictor variables.
 The predictor variables can be measured using a range of scales (although ideally at interval or ratio level), but the criterion variable should be measured using a ratio or interval scale.
 Human behavior is inherently noisy, and therefore it is not possible to produce totally accurate predictions; however, regression allows us to assess how useful the predictor variable(s) are in estimating a participant's likely score on a criterion variable.
 Multiple regression allows us to identify which set of predictor variables together provides the best prediction of that score.
Bivariate Regression

The bivariate regression equation:
 The relationship between any two variables that have a linear relationship is given by the equation for a straight line: Y′ = a + bX
 Y′ (pronounced 'Y prime') is the predicted value of the criterion variable.
 X is the predictor variable.
 a is the intercept: the predicted value of Y′ when X is zero.
 b is the slope of the line, known as the regression coefficient (or regression weight); it is the amount by which Y′ changes when X changes by one unit.
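A minimal sketch of fitting this line by least squares, using SciPy (illustrative only; the X/Y values are invented). It also computes the residuals discussed on the next slide.

```python
import numpy as np
from scipy import stats

# Hypothetical predictor X and criterion Y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

fit = stats.linregress(x, y)          # least-squares line Y' = a + bX
a, b = fit.intercept, fit.slope

y_pred = a + b * x                    # predicted scores Y'
residuals = y - y_pred                # observed minus predicted

print(f"Y' = {a:.2f} + {b:.2f}X")
print("residuals:", np.round(residuals, 2))
```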
Residuals:
 It is very unlikely that two variables measured in psychological research will be perfectly correlated, so participants' observed scores on the criterion variable will differ from the scores the regression line predicts. These differences are the residuals.
Proportion of variance explained:
 In regression, we wish to explain variance in the data: how much of the variance in the criterion variable is explained, or accounted for, by the predictor variable? For bivariate relationships, r² is the proportion of variance explained.
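A short illustration (invented data, not from the slides) that r² is literally the share of total variation in the criterion that the line accounts for:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

fit = stats.linregress(x, y)
residuals = y - (fit.intercept + fit.slope * x)

ss_res = np.sum(residuals**2)            # variation left unexplained
ss_tot = np.sum((y - y.mean()) ** 2)     # total variation in the criterion
print(1 - ss_res / ss_tot)               # proportion of variance explained
print(fit.rvalue**2)                     # the same value, as r squared
```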
Multiple Regression

 From bivariate to multiple: bivariate correlation measures the strength of association between two variables, and bivariate regression allows one variable to be predicted by one other variable.
 Multiple correlation measures the strength of association between one variable and a set of other variables, and multiple regression allows prediction of one variable (the criterion variable) from the set of other variables (the predictor variables).
 In practice, the term multiple regression usually includes multiple correlation.
Use of multiple regression:
 You can use this statistical technique when exploring linear (straight-line) relationships between the predictor and criterion variables. If any relationship is non-linear, transforming one or more variables may be required. If a relationship remains non-linear, other techniques may be needed.
 The criterion variable should be measured on a scale at either interval or ratio level. There are separate statistical procedures for predicting values on a criterion variable that is nominal.
 The predictor variables that you select should be measured on a ratio, interval or ordinal scale.
 A nominal predictor variable is valid, but only if it is dichotomous. For example, sex (male and female) is acceptable, but gender identity (masculine, feminine and androgynous) could not be entered into a multiple regression as a single variable. Instead, you would create three different variables, each with two categories (masculine/not masculine, feminine/not feminine and androgynous/not androgynous).
 The term 'dummy variable' is used to describe this type of dichotomous variable.
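A sketch of dummy coding with pandas (hypothetical data). One detail the slide does not mention: in practice one of the k categories is usually dropped, since k dummies plus a model intercept are perfectly collinear.

```python
import pandas as pd

df = pd.DataFrame({
    "gender_identity": ["masculine", "feminine", "androgynous",
                        "feminine", "masculine"]
})

# The slide's scheme: one 0/1 variable per category
dummies = pd.get_dummies(df["gender_identity"], dtype=int)
print(dummies)

# Common practice: keep k - 1 dummies to avoid redundancy
dummies_k_minus_1 = pd.get_dummies(df["gender_identity"],
                                   drop_first=True, dtype=int)
print(dummies_k_minus_1)
```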
 Multiple regression requires a large number of observations.
 The estimate of R (the multiple correlation) depends both on the number of predictor variables and on the number of participants (N).
 One rule of thumb is to have at least ten times as many participants as predictor variables.
 You should screen your data for outliers and normality.
 You should check for multicollinearity.
 When choosing a predictor variable, you should select one that might be correlated with the criterion variable, but that is not strongly correlated with the other predictor variables.
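One common diagnostic (not covered on the slides) is the variance inflation factor, available in statsmodels. A sketch with simulated data; the rule-of-thumb cutoff of about 10 is a convention, and some texts use 5.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=n)   # strongly related to x1
x3 = rng.normal(size=n)

X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))
for i, name in enumerate(X.columns):
    if name == "const":
        continue
    print(name, variance_inflation_factor(X.values, i))
# VIFs well above ~10 (x1 and x2 here) flag multicollinearity
```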
Regression coefficients: B (unstandardized) and Beta (standardized):
 Regression coefficients (or regression weights) are measures of how strongly each predictor variable influences the criterion variable when all the other predictor variables are held constant.
 As B is unstandardized (measured in the original units of its X), it can be difficult to interpret.
 A useful alternative is Beta, the standardized regression coefficient, which is measured in units of standard deviation, allowing us to compare the influence of several predictors more easily.
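SPSS reports both values; as a rough illustration of the relationship between them, Beta can be recovered from B by rescaling with the standard deviations (Beta = B · s_X / s_Y). The data and variable names below are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "hours_studied": rng.normal(10, 3, 200),   # hypothetical predictors
    "prior_score": rng.normal(60, 15, 200),
})
df["exam_score"] = (2.0 * df["hours_studied"] + 0.3 * df["prior_score"]
                    + rng.normal(0, 5, 200))

X = sm.add_constant(df[["hours_studied", "prior_score"]])
model = sm.OLS(df["exam_score"], X).fit()

# B: unstandardized coefficients, in the original units of each X
B = model.params.drop("const")

# Beta: standardized coefficients, comparable across predictors
beta = B * df[["hours_studied", "prior_score"]].std() / df["exam_score"].std()
print(pd.DataFrame({"B": B, "Beta": beta}))
```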
 A regression coefficient is either negative or positive, indicating whether an increase in the predictor will result in a decrease or an increase in the criterion variable.
R, R Square, and Adjusted R Square:
 R is a measure of the correlation between the observed values of the criterion variable and its predicted values.
 R Square (R²) indicates the proportion of the variance in the criterion variable which is accounted for by the model.
 An Adjusted R² value is also calculated, which takes into account the number of predictor variables in the model and the number of observations (participants) that the model is based on.
 If, for example, we have an Adjusted R² value of .75, we can say that our model has accounted for 75% of the variance in the criterion variable.
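The standard adjustment formula is Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), for p predictors and n observations. A small sketch (the example numbers are invented):

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n observations;
    it penalizes adding predictors that do not improve the model."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# The same raw R^2 is penalized more with many predictors and few participants
print(adjusted_r_squared(0.75, n=100, p=3))   # ~0.742
print(adjusted_r_squared(0.75, n=30, p=10))   # ~0.618
```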
 In Forward selection, SPSS enters the predictors into the model one at a time, in an order determined by the strength of their correlation with the criterion variable (the strongest is entered first).
 As each predictor is added, its effect is assessed, and if it does not significantly add to the success of the model it is excluded.
 In Backward selection, SPSS first enters all the predictors into the model.
 The weakest predictor is then removed and the regression re-calculated. If the model is significantly weakened, the predictor is re-entered; otherwise it is deleted.
 Stepwise is the most sophisticated of the statistical methods. SPSS enters the predictors into the model one at a time, in an order determined by the strength of their correlation with the criterion variable, and the strength of each predictor in adding to the predictive power is assessed.
 If adding the predictor contributes to the model, it is retained.
 So far this is like the Forward option, but then all other predictors in the model are re-tested for their contribution to the success of the model. Any predictor that no longer contributes significantly is removed. Thus, with this method you end up with the smallest possible set of predictors included in your model. (A rough sketch of this logic follows below.)
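The slides describe SPSS's built-in procedure; as a rough sketch of the stepwise idea only (forward entry followed by re-testing for removal), here is an illustrative Python implementation using p-values from statsmodels. The entry/removal thresholds are assumptions, and real packages differ in the details.

```python
import statsmodels.api as sm

def stepwise_select(X, y, p_enter=0.05, p_remove=0.10):
    """Sketch of stepwise selection on a DataFrame of predictors X
    and criterion y, using predictor p-values as the entry/removal test."""
    selected = []
    remaining = list(X.columns)
    for _ in range(2 * len(X.columns)):   # bound iterations to avoid cycling
        # Forward step: try each remaining predictor; pick the strongest
        pvals = {c: sm.OLS(y, sm.add_constant(X[selected + [c]])).fit()
                       .pvalues[c]
                 for c in remaining}
        if not pvals or min(pvals.values()) >= p_enter:
            break
        best = min(pvals, key=pvals.get)
        selected.append(best)
        remaining.remove(best)
        # Backward step: re-test predictors already in the model and
        # drop any that no longer contribute significantly
        refit = sm.OLS(y, sm.add_constant(X[selected])).fit()
        worst = refit.pvalues.drop("const").idxmax()
        if refit.pvalues[worst] > p_remove:
            selected.remove(worst)
            remaining.append(worst)
    return selected
```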
Thank you