Topic - 9 PDF
Correlation
Correlation looks at the relationship between two variables in a linear fashion. A Pearson
product-moment correlation coefficient describes the relationship between two continuous
variables and is available through the Analyze and Correlate menus.
In this chapter, you will address bivariate correlation, which refers to the correlation between two
continuous variables and is the most common measure of linear relationship. This coefficient
has a range of possible values from -1 to +1. The value indicates the strength of the relationship,
while the sign (+ or -) indicates the direction.
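Although SPSS computes the coefficient for you, it can help to see the calculation outside the package. The following Python sketch (using NumPy and made-up scores, not SPSS output) computes r for a pair of variables:

```python
import numpy as np

# Hypothetical paired scores from the same participants (related pairs).
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.0, 3.0, 6.0, 8.0, 10.0])

# np.corrcoef returns the 2 x 2 correlation matrix; the off-diagonal
# entry is the Pearson product-moment coefficient, always in [-1, +1].
r = np.corrcoef(x, y)[0, 1]
print(round(r, 4))
```

Here r is close to +1, a strong positive relationship; a value near -1 would indicate an equally strong negative one.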
1
09/11/2020
Assumption testing
Correlational analysis has a number of underlying assumptions:
1. Related pairs – data must be collected from related pairs: that is, if you obtain a score on an X variable, there
must also be a score on the Y variable from the same participant.
2. Scale of measurement – data should be interval or ratio in nature.
3. Normality – the scores for each variable should be normally distributed.
4. Linearity – the relationship between the two variables must be linear.
5. Homoscedasticity – the variability in scores for one variable is roughly the same at all values of the other
variable. That is, it is concerned with how the scores cluster uniformly about the regression line.
Assumptions 1 and 2 are a matter of research design. Assumption 3 can be tested using the
appropriate procedures. Assumptions 4 and 5 can be tested by examining scatterplots of the
variables.
To obtain a scatterplot
1. Select the Graphs menu.
2. Click on Scatter/Dot… to open the Scatter/Dot dialogue box.
3. Ensure the Simple Scatter option is selected.
4. Click on the Define command pushbutton to open the Simple Scatter sub-dialogue box.
5. Select the first variable and click on the ► button to move the variable into the Y Axis: box.
6. Select the second variable and click on the ► button to move the variable into the X Axis: box.
7. Click on OK.
You can see from the scatterplot whether there is a linear relationship between the two variables. If the scores
cluster uniformly around the regression line, the assumption of homoscedasticity is not violated.
To obtain a bivariate Pearson product-moment correlation
1. Select the Analyze menu.
2. Click on Correlate and then Bivariate… to open the Bivariate Correlations dialogue box.
3. Select the variables you require and click on the ► button to move the variables into the Variables:
box.
4. Ensure that the Pearson and Two-tailed options are selected.
5. Click on OK.
To interpret the correlation coefficient, you examine the coefficient and its associated significance
value (p). The output will confirm the results of the scatterplot, indicating whether a significant
positive relationship exists between the variables. If r is large and p < .05, we can say that higher
scores on one variable are associated with higher scores on the second variable.
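The significance value SPSS reports comes from converting r to a t statistic with n - 2 degrees of freedom. A minimal Python sketch of that conversion, using made-up values for r and n rather than real output:

```python
import math

def pearson_t(r, n):
    """Convert a Pearson r to a t statistic with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical result: r = .60 obtained from n = 30 related pairs.
t = pearson_t(0.60, 30)

# For df = 28, |t| > 2.048 corresponds to p < .05 (two-tailed).
significant = abs(t) > 2.048
print(round(t, 2), significant)
```

With these figures t is close to 3.97, well past the critical value, so the correlation would be reported as significant.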
Linear Regression
Linear Regression estimates the coefficients of the linear equation, involving one or more independent variables that best
predict the value of the dependent variable. For example, you can try to predict a salesperson's total yearly sales (the
dependent variable) from independent variables such as age, education, and years of experience.
Example. Is the number of games won by a basketball team in a season related to the average number of points the team
scores per game? A scatterplot indicates that these variables are linearly related. The number of games won and the
average number of points scored by the opponent are also linearly related. These variables have a negative relationship.
As the number of games won increases, the average number of points scored by the opponent decreases. With linear
regression, you can model the relationship of these variables. A good model can be used to predict how many games
teams will win.
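The basketball example can be sketched numerically. The figures below are invented for illustration, and NumPy's least-squares fit stands in for the SPSS Linear Regression procedure:

```python
import numpy as np

# Hypothetical season data: average points scored per game (X)
# and games won (Y) for eight teams.
points = np.array([98.0, 101.0, 103.0, 105.0, 108.0, 110.0, 112.0, 115.0])
wins   = np.array([30.0, 34.0, 38.0, 41.0, 47.0, 50.0, 54.0, 60.0])

# Least-squares fit of the linear equation: wins = b0 + b1 * points.
# np.polyfit returns coefficients from the highest degree down.
b1, b0 = np.polyfit(points, wins, deg=1)

# Use the model to predict games won for a team averaging 107 points.
predicted = b0 + b1 * 107.0
print(round(b1, 2), round(predicted, 1))
```

The positive slope b1 reflects the positive linear relationship: teams that score more points per game tend to win more games.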
Statistics. For each variable: number of valid cases, mean, and standard deviation. For each model: regression
coefficients, correlation matrix, part and partial correlations, multiple R, R2, adjusted R2, change in R2, standard error of
the estimate, analysis-of-variance table, predicted values, and residuals. Also, 95% confidence intervals for each
regression coefficient, variance-covariance matrix, variance inflation factor, tolerance, Durbin-Watson test, distance
measures (Mahalanobis, Cook, and leverage values), DfBeta, DfFit, prediction intervals, and casewise diagnostics. Plots:
scatterplots, partial plots, histograms, and normal probability plots.
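A few of the model-fit statistics in that list are straightforward to compute by hand. The sketch below uses hypothetical observed and predicted values (not SPSS output) for a model with two predictors:

```python
import numpy as np

# Hypothetical observed (y) and model-predicted (y_hat) values;
# k = 2 predictors, n = 6 cases.
y     = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
y_hat = np.array([10.5, 11.5, 14.5, 15.5, 18.5, 19.5])
n, k = len(y), 2

ss_res = np.sum((y - y_hat) ** 2)              # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)           # total sum of squares
r2 = 1 - ss_res / ss_tot                       # R2
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # adjusted R2
se_est = np.sqrt(ss_res / (n - k - 1))         # standard error of the estimate
print(round(r2, 3), round(adj_r2, 3), round(se_est, 3))
```

Adjusted R2 is always at or below R2; it penalises the fit for the number of predictors relative to the number of cases.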
How to obtain a linear regression analysis
Analyze
Regression
Linear...
In the Linear Regression dialogue box, select a numeric dependent variable and one or more numeric independent variables. Optionally, you can:
• Group independent variables into blocks and specify different entry methods for different subsets of variables.
• Choose a selection variable to limit the analysis to a subset of cases having a particular value(s) for this variable.
Assumptions. For each value of the independent variable, the distribution of the
dependent variable must be normal. The variance of the distribution of the dependent
variable should be constant for all values of the independent variable. The relationship
between the dependent variable and each independent variable should be linear, and all
observations should be independent.
Linear Regression variable selection methods
Method selection allows you to specify how independent variables are entered into the
analysis. Using different methods, you can construct a variety of regression models
from the same set of variables.
To enter the variables in the block in a single step, select Enter. To remove the variables
in the block in a single step, select Remove. Forward variable selection enters the
variables in the block one at a time based on entry criteria. Backward variable
elimination enters all of the variables in the block in a single step and then removes
them one at a time based on removal criteria. Stepwise variable entry and removal
examines the variables in the block at each step for entry or removal. This is a forward
stepwise procedure.
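SPSS applies F-based entry and removal criteria; purely as an illustration of the forward-selection idea, the sketch below uses a simplified R2-gain criterion on made-up data (the function names are illustrative, not part of SPSS):

```python
import numpy as np

def r_squared(X, y):
    """R2 of an ordinary least-squares fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def forward_select(X, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the predictor that most
    improves R2, stopping when the improvement falls below min_gain."""
    selected, remaining, best_r2 = [], list(range(X.shape[1])), 0.0
    while remaining:
        gains = {j: r_squared(X[:, selected + [j]], y) for j in remaining}
        j = max(gains, key=gains.get)
        if gains[j] - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = gains[j]
    return selected

# Hypothetical data: column 0 drives y, column 1 is pure noise.
rng = np.random.default_rng(0)
x0 = rng.normal(size=50)
noise = rng.normal(size=50)
X = np.column_stack([x0, noise])
y = 3.0 * x0 + 0.1 * rng.normal(size=50)
print(forward_select(X, y))
```

As expected, only the informative predictor enters the model; the noise column fails the entry criterion.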
The significance values in your output are based on fitting a single model. Therefore, the
significance values are generally invalid when a stepwise method (Stepwise, Forward, or
Backward) is used.
All variables must pass the tolerance criterion to be entered in the equation, regardless of the
entry method specified. The default tolerance level is 0.0001. Also, a variable is not entered if it
would cause the tolerance of another variable already in the model to drop below the tolerance
criterion.
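Tolerance itself is simply 1 - R2 from regressing a predictor on all the other predictors, so a near-duplicate predictor has a tolerance near zero. A sketch with invented data:

```python
import numpy as np

def tolerance(X, j):
    """Tolerance of predictor j: 1 - R2 of X[:, j] regressed on the others."""
    others = np.delete(X, j, axis=1)
    X1 = np.column_stack([np.ones(len(X)), others])
    beta, *_ = np.linalg.lstsq(X1, X[:, j], rcond=None)
    resid = X[:, j] - X1 @ beta
    ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
    return (resid @ resid) / ss_tot

# Hypothetical predictors: x2 is nearly a copy of x1, x3 is independent.
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + 0.01 * rng.normal(size=100)
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])
print(tolerance(X, 1) < 0.001, tolerance(X, 2) > 0.9)
```

The near-duplicate predictor would fall below SPSS's default tolerance level and be excluded, while the independent predictor passes easily.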
All independent variables selected are added to a single regression model. However, you can
specify different entry methods for different subsets of variables. For example, you can enter
one block of variables into the regression model using stepwise selection and a second block
using forward selection. To add a second block of variables to the regression model, click Next.
Linear Regression Statistics
Model fit. The variables entered and removed from the model are listed, and the following goodness-
of-fit statistics are displayed: multiple R, R2 and adjusted R2, standard error of the estimate, and an
analysis-of-variance table.
R squared change. Displays R2 change, F change, and the significance of F change.
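F change for a block of added predictors can be computed directly from the two R2 values. The figures below are hypothetical:

```python
def f_change(r2_old, r2_new, n, k_new, m):
    """F statistic for the change in R2 when m predictors are added,
    with k_new predictors in the larger model and n cases."""
    df2 = n - k_new - 1
    return ((r2_new - r2_old) / m) / ((1 - r2_new) / df2)

# Hypothetical: adding one predictor lifts R2 from .40 to .45,
# with n = 100 cases and three predictors in the larger model.
print(round(f_change(0.40, 0.45, n=100, k_new=3, m=1), 2))
```

The resulting F is compared against an F distribution with m and n - k_new - 1 degrees of freedom to obtain the significance of F change.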
Descriptives. Provides the number of valid cases, the mean, and the standard deviation
for each variable in the analysis. A correlation matrix with a one-tailed significance
level and the number of cases for each correlation are also displayed.
Part and partial correlations. Displays zero-order, part, and partial correlations.
Residuals. Displays the Durbin-Watson test for serial correlation of the residuals and
casewise diagnostics for the cases meeting the selection criterion (outliers above n
standard deviations).
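The Durbin-Watson statistic itself has a simple form: the sum of squared successive differences of the residuals divided by their sum of squares. A sketch with invented residuals:

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic (range 0 to 4): values near 2 suggest no
    serial correlation; values toward 0 suggest positive correlation."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Hypothetical residuals showing mild positive serial correlation,
# which pulls the statistic below 2.
e = [0.5, 0.3, -0.2, -0.4, 0.1, 0.2, -0.3, -0.1]
print(round(durbin_watson(e), 2))
```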
Specifying options for a linear regression
From the menus choose:
Analyze
Regression
Linear...