Regression Analysis: SW318 Social Work Statistics Slide 1

This document discusses regression analysis and its use in evaluating relationships between interval level variables. It explains that regression analysis examines the strength and direction of relationships between two quantitative variables through the use of scatterplots and regression lines. Scatterplots graph the independent and dependent variables, with the regression line showing the best fitting slope. The angle and tightness of points around this line provide evidence of whether a relationship exists between the variables and how strong it may be. Regression analysis allows testing if relationships found in sample data apply to the underlying population.

Uploaded by

zain
Copyright
© © All Rights Reserved

SW318

Social Work
Statistics
Slide 1
Regression Analysis

• We have previously studied the Pearson’s r
correlation coefficient and the r² coefficient of
determination as measures of association for
evaluating the relationship between an interval level
independent variable and an interval level
dependent variable.

• These statistics are components of a broader set of
statistical techniques for evaluating the relationship
between two interval level variables, called
regression analysis (sometimes referred to in
combination as correlation and regression analysis).
SW318
Social Work
Statistics
Slide 2
Regression Analysis vs. Chi-Square Test of Independence

• Our purpose now is to use a hypothesis test to conclude that
there is a relationship between two interval level variables in
the population represented by our sample data.

• We could use a chi-square test of independence to determine
whether or not a relationship exists between two variables in
the population represented by our data, provided we grouped
the values of both variables to create a bivariate table.

• However, it is preferable to test for the presence of a
relationship while retaining the variables as interval level data
because this strategy is more effective at detecting the
existence of a relationship. We might find a relationship using
interval level statistics that we do not find using nominal level
statistics because the nominal level statistics are less precise.
SW318
Social Work
Statistics
Slide 3
Elements of Regression Analysis

• We will first review previous material on regression
and correlation:
  • The scatterplot or scattergram
  • The regression equation

• Then, we will examine the statistical evidence to
determine whether or not the relationships found in
our sample data are applicable to the population
represented by the sample using a hypothesis test.
SW318
Social Work
Statistics
Slide 4
Purpose of Regression Analysis

• The purpose of regression analysis is to answer the
same three questions that have been identified as
requirements for understanding the relationships
between variables:
  • Is there a relationship between the two variables?
  • How strong is the relationship?
  • What is the direction of the relationship?
SW318
Social Work
Statistics
Slide 5
Scatterplots - 1

• The relationship between two interval variables can be
graphed as a scatterplot or a scatter diagram which shows
the position of all of the cases in an x-y coordinate system.

• The independent variable is plotted on the x-axis, or the
horizontal axis.

• The dependent variable is plotted on the y-axis, or the
vertical axis.

• Each dot in the body of the chart represents one case, placed
at the intersection of its values on the x-axis and the y-axis.
SW318
Social Work
Statistics
Slide 6
Scatterplots - 2
• The trendline or regression line is plotted on the
chart in a contrasting color.
• The overall pattern of the dots, or data points,
succinctly summarizes the nature of the relationship
between the two variables.
• The clarity of the pattern formed by the dots can be
enhanced by drawing a straight line through the
cluster such that the line touches every dot or comes
as close to doing so as possible.
• This summarizing line is called the “regression line.”
• We will see later how this line is obtained, but for
now, we will look at how it helps us understand the
scatterplot.
SW318
Social Work
Statistics
Slide 7
Scatterplots - 3

The pattern of the points on the scatterplot gives us
information about the relationship between the variables.
The regression line, drawn in red, makes it easier for us
to understand the scatterplot.
SW318
Social Work
Statistics
Slide 8
The Uses of Scatterplots

• Scatterplots give us information about our three
questions about the relationship between two
interval variables:
  • Is there a relationship between the two variables?
  • How strong is the relationship?
  • What is the direction of the relationship?

• In addition, the regression line on the scatterplot can
be used to estimate the value of the dependent
variable for any value of the independent variable.
SW318
Social Work
Statistics
Slide 9
Scatterplots: Evidence of a Relationship

The angle between the regression line and the
horizontal x-axis provides evidence of a relationship. If
there is no relationship, the regression line will be
parallel to the axis.

[Figure: two scatterplots. Left: Work orientation scale score (y-axis) by SES scale score (x-axis). Right: Composite aptitude score (y-axis) by Vocabulary aptitude score (x-axis).]

Left plot: When there is no relationship between two
variables, the regression line is parallel to the
horizontal axis.

Right plot: When there is a relationship between two
variables, the regression line lies at an angle to the
horizontal axis, sloping either upward or downward.
SW318
Social Work
Statistics
Slide 10
Scatterplots: Strength of a Relationship

The strength of a relationship is indicated by the
narrowness of the band of points spread around the
regression line: the tighter the band, the stronger the
relationship.

[Figure: two scatterplots. Left: Work orientation scale score (y-axis) by SES scale score (x-axis). Right: Composite aptitude score (y-axis) by Vocabulary aptitude score (x-axis).]

Left plot: In this scatterplot, the points are very spread out
around the regression line. The relationship is weak.

Right plot: The spread of the points around the regression
line is narrow, indicating a stronger relationship. We should
check the scale of the vertical axis to make sure the narrow
band is not the result of an excessively large scale.
SW318
Social Work
Statistics
Slide 11
Scatterplots: Direction of Relationship

When the regression line slopes upward to the right,
there is a positive, or direct, relationship between the
variables. When the regression line slopes downward,
the relationship is negative, or inverse.

[Figure: two scatterplots. Left: Composite aptitude score (y-axis) by Vocabulary aptitude score (x-axis). Right: Mathematics aptitude score (y-axis) by Self concept scale score (x-axis).]

Left plot: In this scatterplot, the regression line slopes upward
to the right, indicating a positive or direct relationship.
The values of both variables increase and decrease at the
same time.

Right plot: In this scatterplot, the regression line slopes
downward to the right, indicating a negative or inverse
relationship. The values of the variables move in opposite
directions.
SW318
Social Work
Statistics
Slide 12
Scatterplots: Predicting Scores

For any value of the independent variable on the
horizontal x-axis, the predicted value for the dependent
variable is the height of the regression line at that
point, read from the vertical y-axis.

[Figure: scatterplot of Composite aptitude score (y-axis) by Vocabulary aptitude score (x-axis), with guide lines drawn to and from the regression line.]

For the value of the independent variable on the horizontal
axis, e.g. 52, we draw a perpendicular line upward from the
x-axis to the regression line.

The estimate for the dependent variable is obtained by drawing
a line parallel to the x-axis from the regression line to the
vertical y-axis and reading the value where this line crosses the
y-axis, e.g. 50.
SW318
Social Work
Statistics
Slide 13
The Effect of Scaling on the Scatterplot

The scale used for the vertical y-axis can change the appearance
of the scatterplot and alter our interpretation of the strength of the
relationship. The three scatterplots on this slide all use the same data.

[Figure: three scatterplots of Composite aptitude score (y-axis) by Vocabulary aptitude score (x-axis), drawn with different y-axis scales.]

In the original plot, the y-axis is scaled from 0 to 80.

In this plot, I have narrowed the range of the y-axis scale
to 25 to 75, spreading the points, and making the
relationship appear weaker.

In this plot, I doubled the range of the y-axis scale to 0
to 160, drawing the points closer together, and making
the relationship appear stronger.
SW318
Social Work
Statistics
Slide 14
The Assumption of Linearity

• An underlying assumption of regression analysis is
that the relationship between the variables is linear,
meaning that the points in the scatterplot must form
a pattern that can be approximated with a straight
line.

• While we could test the assumption of linearity with
a test of statistical significance of the correlation
coefficient, we will make a visual assessment of
scatterplots.

• If the scatterplot indicates that the points do not
follow a linear pattern, the techniques of linear
correlation and regression should not be applied.
SW318
Social Work
Statistics
Slide 15
Examples of Linear Relationships

These two scatterplots are for data on poverty of
nations. The plots below show strong linear
relationships. The points are evenly distributed on
either side of the regression line.

[Figure: two scatterplots. Left: Female Life Expectancy (y-axis) by Male Life Expectancy (x-axis). Right: Infant Mortality Rate (y-axis) by Female Life Expectancy (x-axis).]
SW318
Social Work
Statistics
Slide 16
Examples of Non-linear Relationships

These scatterplots show a non-linear relationship. The points are
not evenly distributed on either side of the regression line. We will
often see a concentration of points on one side of the regression line
and an absence of points on the other side.

[Figure: three scatterplots. Male Life Expectancy (y-axis) by Gross National Product (x-axis); a plot with axes labeled Female Life Expectancy and Male Life Expectancy; Birth Rate (y-axis) by Death Rate (x-axis).]
SW318
Social Work
Statistics
Slide 17
The Regression Equation

• The regression equation is the algebraic formula for
the regression line, which states the mathematical
relationship between the independent and the
dependent variable.

• We can use the regression line to estimate the value
of the dependent variable for any value of the
independent variable.

• The stronger the relationship between the
independent and dependent variables, the closer
these estimates will come to the actual score that
each case had on the dependent variable.
SW318
Social Work
Statistics
Slide 18
Components of the Regression Equation

• The regression equation has two components.
  • The first component is a number called the y-intercept
  that defines where the line crosses the vertical y-axis.
  • The second component is called the slope of the
  line, and is a number that multiplies the value of
  the independent variable.

• These two elements are combined in the general
form for the regression equation:

  the estimated score on the dependent variable
  = the y-intercept + the slope × the score on the independent variable
SW318
Social Work
Statistics
Slide 19
The Standard Form of the Regression Equation

• The standard form for the regression equation or
formula is:

  Y = a + bX

• where
  • Y is the estimated score for the dependent variable
  • X is the score for the independent variable
  • b is the slope of the regression line, or the multiplier of X
  • a is the intercept, or the point where the regression line
  crosses the vertical y-axis
SW318
Social Work
Statistics
Slide 20
Depicting the Regression Equation

The regression equation includes both the y-intercept
and the slope of the line. The y-intercept is 1.0 and
the slope is 0.5.

y = 1.0 + 0.5 x

[Figure: plot of the line y = 1.0 + 0.5 x, with the x-axis and y-axis each running from 0.0 to 5.0.]

The y-intercept is the point on the vertical y-axis where the
regression line crosses the axis, i.e. 1.0.

The slope is the multiplier of x. It is the amount of
change in y for a change of one unit in x.

If x changes one unit from 2.0 to 3.0, depicted by the
blue arrow, y will change by 0.5 units, from 2.0 to
2.5 as depicted by the red arrow.
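
The arithmetic on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the function name `predict` is a hypothetical helper introduced here, and the intercept and slope are the values shown on the slide.

```python
def predict(x, intercept=1.0, slope=0.5):
    """Estimated y for a given x, using y = intercept + slope * x."""
    return intercept + slope * x


# A one-unit change in x (from 2.0 to 3.0) changes y by the slope.
y_at_2 = predict(2.0)
y_at_3 = predict(3.0)
change_in_y = y_at_3 - y_at_2

print(y_at_2, y_at_3, change_in_y)
```

When x is 0, the function returns the intercept itself, which is the point where the line crosses the y-axis.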
SW318
Social Work
Statistics
Slide 21
Deriving the Regression Equation
• In this plot, none of the points fall on the
regression line.
• The difference between the actual value for
the dependent variable and the predicted
value for each point is shown by the red
lines. This difference is called the residual,
and represents the error between the actual
and predicted values.
• The regression equation is computed to
minimize the total amount of error in
predicting values for the dependent variable.
The method for deriving the equation is
called the "method of least squares,"
meaning that the regression line minimizes
the sum of the squared residuals, or errors
between actual and predicted values.

[Figure: scatterplot with the regression line y = 0.8 + 0.6 x; the x-axis and y-axis each run from 0 to 5, and red vertical lines mark the residuals.]
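
The method of least squares can be sketched in Python using the usual closed-form formulas for the slope and intercept. The five data points below are hypothetical: the slide does not list its data, so these values were chosen only because they reproduce the equation y = 0.8 + 0.6 x shown in the plot. The function name `least_squares` is likewise an assumption made for this example.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical points (not from the slides) that yield y = 0.8 + 0.6x.
xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 3, 4]

intercept, slope = least_squares(xs, ys)

# Residual = actual value minus predicted value for each point.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sum_squared_residuals = sum(r ** 2 for r in residuals)

print(round(intercept, 3), round(slope, 3), round(sum_squared_residuals, 3))
```

Any other straight line drawn through these points would give a larger sum of squared residuals than the least squares line.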
SW318
Social Work
Statistics
Slide 22
Interpreting the Regression Equation: the Intercept

• The intercept is the point on the vertical axis where
the regression line crosses the axis. It is the
predicted value for the dependent variable when the
independent variable has a value of zero.

• This may or may not be useful information depending
on the context of the problem.
SW318
Social Work
Statistics
Slide 23
Interpreting the Regression Equation: the Slope

• The slope is interpreted as the amount of change in
the predicted value of the dependent variable
associated with a one unit change in the value of the
independent variable.

• If the slope has a negative sign, the direction of the
relationship is negative or inverse, meaning that the
scores on the two variables move in opposite
directions.

• If the slope has a positive sign, the direction of the
relationship is positive or direct, meaning that the
scores on the two variables move in the same
direction.
SW318
Social Work
Statistics
Slide 24
Interpreting the Regression Equation: when the Slope equals 0

• If there is no relationship between two variables, the
slope of the regression line is zero and the regression
line is parallel to the horizontal axis.

• A slope of zero means that the predicted value of
the dependent variable will not change, no matter
what value of the independent variable is used.

• If there is no relationship, using the regression
equation to predict values of the dependent variable
is no improvement over using the mean of the
dependent variable.
SW318
Social Work
Statistics
Slide 25
Assumptions Required for Utilizing a Regression Equation

• The assumptions required for utilizing a regression
equation are the same as the assumptions for the test
of significance of a correlation coefficient.
  • Both variables are interval level.
  • Both variables are normally distributed.
  • The relationship between the two variables is linear.
  • The variance of the values of the dependent
  variable is uniform for all values of the
  independent variable (equality of variance).
SW318
Social Work
Statistics
Slide 26
Assumption of Normality

• Strictly speaking, the test requires that the two
variables be bivariate normal, meaning that the
combined distribution of the two variables is normal.
It is usually assumed that the variables are bivariate
normal if each variable is normally distributed, so
this assumption is tested by checking the normality
of each variable.

• Each variable will be considered normal if its
skewness and kurtosis statistics fall between –1.0 and
+1.0 or if the sample size is sufficiently large to
apply the Central Limit Theorem.
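
The skewness and kurtosis rule of thumb can be sketched as below. This is an illustration under stated assumptions: it uses the simple moment-based formulas, whereas SPSS reports sample-adjusted versions that differ slightly in small samples, and the function names and both data sets are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (0 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


def within_normal_range(values, limit=1.0):
    """Apply the rule of thumb: both statistics between -limit and +limit."""
    skewness, kurtosis = skewness_and_kurtosis(values)
    return -limit <= skewness <= limit and -limit <= kurtosis <= limit


# Hypothetical scores: one roughly symmetric set, one with an extreme value.
symmetric_scores = [2, 4, 4, 5, 5, 5, 6, 6, 8]
skewed_scores = [1, 1, 1, 1, 1, 1, 1, 1, 1, 20]

print(within_normal_range(symmetric_scores))
print(within_normal_range(skewed_scores))
```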
SW318
Social Work
Statistics
Slide 27
Assumption of Linearity

• Linearity means that the pattern of the points in a
scatterplot forms a band, like the pattern in the chart
on the right:

[Figure: scatterplot of Female Life Expectancy (y-axis) by Infant Mortality Rate (x-axis).]

• When the pattern of the points follows a curve, like
the scatterplot on the right, the correlation coefficient
will not accurately measure the relationship.

[Figure: scatterplot with axes labeled Female Life Expectancy and Male Life Expectancy.]
SW318
Social Work
Statistics
Slide 28
Test of Linearity

• The test of linearity is a diagnostic statistical test of
the null hypothesis that the linear model is an
appropriate fit for the data points. The desired
outcome for this test is to fail to reject the null
hypothesis.
• If the probability for the test of linearity statistic is
less than or equal to the level of significance for the
problem, we reject the null hypothesis, concluding that
the data are not linear and that regression analysis is not
appropriate for the relationship between the two
variables.
• If the probability for the test of linearity statistic is
greater than the level of significance for the problem,
we fail to reject the null hypothesis and conclude
that we satisfy the assumption of linearity.
SW318
Social Work
Statistics
Slide 29
Assumption of Homoscedasticity

• Homoscedasticity (equality of variances) means that
the points are evenly dispersed on either side of the
regression line for the linear relationship.

Left plot: In this scatterplot, the points extend about the same
distance above and below the regression line for most of the
length of the regression line. This scatterplot meets the
assumption of homoscedasticity.

Right plot: In this scatterplot, the spread of the points around
the regression line is narrower at the left end of the regression
line than at the right end of the regression line. This “funnel”
shape is typical of a scatterplot showing violations of the
assumption of homoscedasticity.

[Figure: two scatterplots. Left: Female Life Expectancy (y-axis) by Infant Mortality Rate (x-axis). Right: Infant Mortality Rate (y-axis) by Birth Rate (x-axis).]
SW318
Social Work
Statistics
Slide 30
Test of Homoscedasticity

• When we compared groups, we used the Levene test
of population variances to test for the assumption
that the group variances were equal.

• In order to use this test for the assumption of
homoscedasticity, we will convert the interval level
independent variable into a dichotomous variable
with low scores in one group and high scores in the
other group. We can then compare the variances of
the dependent variable in the two groups derived
from the independent variable.
SW318
Social Work
Statistics
Slide 31
Levene Test of Homogeneity of Variances
• The Levene test of equality of population variances
tests whether or not the variances for the two groups
are equal. It is a test of the research hypothesis that
the variance (dispersion) of the group with low scores
is different from the variance of the group with high
scores. The null hypothesis is that the variances
(dispersion) of both groups are equal.
• If the probability of the test statistic is greater than
0.05, we do not reject the null hypothesis and
conclude that the variances are equal. This is the
desired outcome.
• If the probability of the test statistic is less than or
equal to 0.05, we conclude the variances are
different and that regression analysis is not an
appropriate test for the relationship between the two
variables.
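
The dichotomize-then-compare procedure can be sketched as follows. Several things here are assumptions: the split is made at the median of the independent variable (the slides only say "low scores" and "high scores"), the statistic uses absolute deviations from each group mean (one common form of the Levene statistic), and the data and function names are hypothetical. The sketch stops at the test statistic; the probability would come from an F distribution with 1 and N - 2 degrees of freedom, as reported by a statistics package.

```python
from statistics import mean, median


def split_at_median(xs, ys):
    """Group the dependent scores by low vs. high independent scores."""
    cut = median(xs)
    low = [y for x, y in zip(xs, ys) if x <= cut]
    high = [y for x, y in zip(xs, ys) if x > cut]
    return low, high


def levene_w(group_a, group_b):
    """Levene W statistic for two groups (deviations from group means)."""
    groups = [group_a, group_b]
    # Absolute deviation of each score from its own group mean.
    z = [[abs(v - mean(g)) for v in g] for g in groups]
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    z_group_means = [mean(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand_mean) ** 2
                  for zi, zm in zip(z, z_group_means))
    within = sum((v - zm) ** 2
                 for zi, zm in zip(z, z_group_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical "funnel" data: the spread of y grows as x increases.
x_scores = [1, 2, 3, 4, 5, 6, 7, 8]
y_scores = [5, 6, 5, 6, 2, 9, 1, 10]

low_group, high_group = split_at_median(x_scores, y_scores)
w_statistic = levene_w(low_group, high_group)
print(low_group, high_group, w_statistic)
```

A large W, as in this deliberately extreme example, points toward unequal variances; whether it is significant depends on the probability attached to it.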
SW318
Social Work
Statistics
Slide 32
The hypothesis test of r²

• The purpose of the hypothesis test of r² is to test
the applicability of our findings to the population
represented by the sample.

• When we studied association between two interval
variables, we stated that the Pearson r correlation
coefficient and its square, the coefficient of
determination, measure the strength of the
relationship between two interval variables. When
the correlation coefficient and coefficient of
determination are zero (0), there is no relationship.

• The hypothesis test of r² is a test of whether or not
r² is larger than zero in the population.
SW318
Social Work
Statistics
Slide 33
The hypothesis test of r²
• The research hypothesis states that r² is larger than
zero (a relationship exists).

• The null hypothesis states that r² is equal to zero
(no relationship).

• Recall that we interpreted the coefficient of
determination r² as the reduction in error
attributable to the relationship between the
variables.

• The test statistic is an ANOVA F-test which tests
whether or not the reduction in error associated with
using the regression equation is really greater than
zero.
SW318
Social Work
Statistics
Slide 34
How the regression ANOVA test works
We will use the sample data we used for
correlation and regression to examine how
the hypothesis test for r² works.

We are interested in the relationship between
family size and number of credit cards.
SW318
Social Work
Statistics
Slide 35
The scatter diagram or scatterplot

The dependent variable is plotted on the y or
vertical axis.

The independent variable is plotted on the x or
horizontal axis.
SW318
Social Work
Statistics
Slide 36
The mean as the best guess

Without taking into account the independent variable, our
best guess for the number of credit cards for any subject is
the mean, 7.0.
SW318
Social Work
Statistics
Slide 37
Errors using the mean as estimate

Errors are measured by computing the difference
between the mean and each Y value, squaring
the differences, and then summing them.

When we compute the answer in SPSS, it will tell
us that the total amount of error is 22.0.
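
The computation described on this slide can be sketched in Python. The slides do not list the raw cases, so the eight credit card counts below are illustrative values chosen only because they reproduce the mean of 7.0 and the total error of 22.0 reported here; they should not be read as the actual course data.

```python
# Illustrative values (an assumption, not taken from the slides).
credit_cards = [4, 6, 6, 7, 8, 7, 8, 10]

mean_cards = sum(credit_cards) / len(credit_cards)

# Difference between each Y value and the mean, squared, then summed.
squared_errors = [(y - mean_cards) ** 2 for y in credit_cards]
total_error = sum(squared_errors)

print(mean_cards, total_error)
```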
SW318
Social Work
Statistics
Slide 38
The regression line

The regression line minimizes the error
(the best fitting or least squares line).
SW318
Social Work
Statistics
Slide 39
The equation for the regression line

SPSS will give us the formula for the regression line in the
form Y = a + bX, or for these variables:

Number of Credit Cards = 2.871 + .971 × Family Size
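
The coefficients in this equation can be reproduced with the least squares formulas. As before, the eight cases below are illustrative values selected because they yield the reported intercept (2.871) and slope (.971); the slides themselves do not show the raw data, and the variable names are assumptions.

```python
# Illustrative cases (an assumption, not taken from the slides).
family_size = [2, 2, 4, 4, 5, 5, 6, 6]
credit_cards = [4, 6, 6, 7, 8, 7, 8, 10]

n = len(family_size)
mean_x = sum(family_size) / n
mean_y = sum(credit_cards) / n

# Slope b = sum of cross-products / sum of squares of X.
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(family_size, credit_cards))
sxx = sum((x - mean_x) ** 2 for x in family_size)
slope = sxy / sxx

# Intercept a = mean of Y minus slope times mean of X.
intercept = mean_y - slope * mean_x

# Estimated number of credit cards for a family of four.
estimate_for_four = intercept + slope * 4

print(round(intercept, 3), round(slope, 3), round(estimate_for_four, 3))
```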


SW318
Social Work
Statistics
Slide 40
PRE reduction in error

SPSS also tells us the amount of error using only the mean
and using the regression line.

Error using mean only (total):                        22.000
Error using regression line:                           5.486
Reduction in error associated with the regression:    16.514

PRE measure (r²) = (22.0 - 5.486) / 22.0 = .751
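
The PRE calculation uses only the two error totals reported by SPSS, so it can be checked directly. The variable names below are chosen for this sketch.

```python
error_using_mean = 22.000        # total error, from the slide
error_using_regression = 5.486   # error remaining, from the slide

# Error removed by using the regression line instead of the mean.
reduction_in_error = error_using_mean - error_using_regression

# PRE measure: the reduction as a proportion of the total error.
r_squared = reduction_in_error / error_using_mean

print(round(reduction_in_error, 3), round(r_squared, 3))
```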
SW318
Social Work
Statistics
Slide 41
The ANOVA test for the regression

The F statistic is calculated as the ratio of the error reduced by the
regression to the error remaining, with each amount divided by its
degrees of freedom.

If the ratio were close to 1, the regression would have reduced no more
error than we would expect by chance, there would be no evidence of a
relationship, and the p-value would not let us reject the null hypothesis.

In this problem, the amount of error reduced by the regression is large
relative to the amount remaining, so the F statistic is large, and the
p-value (0.005) is smaller than the alpha level of significance, so we
reject the null hypothesis.
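
The F ratio for this example can be sketched from the error totals on the previous slide. One input is an assumption: the slides do not state the number of cases, and a sample size of 8 is used here because it is consistent with the reported p-value of 0.005. With one independent variable, the regression has 1 degree of freedom and the remaining error has n - 2.

```python
error_reduced = 16.514     # reduction in error, from the slides
error_remaining = 5.486    # error using the regression line, from the slides

n_cases = 8                # ASSUMPTION: sample size is not given in the slides
df_regression = 1          # one independent variable
df_residual = n_cases - 2

# Each amount of error is divided by its degrees of freedom (a mean square).
mean_square_regression = error_reduced / df_regression
mean_square_residual = error_remaining / df_residual

f_statistic = mean_square_regression / mean_square_residual
print(round(f_statistic, 2))
```

The probability for this F value would be read from an F distribution with 1 and 6 degrees of freedom, which SPSS reports in the ANOVA table.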
SW318
Social Work
Statistics
Slide 42
Interpreting Pearson’s r correlation coefficient

The square root of r² is Pearson’s
r, the correlation coefficient.

If we want to characterize the
strength of the relationship, we
compare the size of r to the
interpretive guidelines for
measures of association.
SW318
Social Work
Statistics
Slide 43
Interpreting the direction of the relationship

To interpret the direction of the relationship between the
variables, we look at the coefficient for the independent
variable.

In this example, the coefficient of 0.971 is positive, so we
would interpret this relationship as:

Families with more members had more credit cards.
SW318
Social Work
Statistics
Slide 44
Testing Assumptions in Homework Problems

• The process of testing assumptions can easily
overwhelm the task of testing the significance of the
relationship.

• Since our emphasis here is testing the hypothesis
that the relationship is generalizable to the
population represented by the sample data, we will
assume that our data satisfies the assumptions
without explicitly testing assumptions.
SW318
Social Work
Statistics
Slide 45
Homework Problem Questions

• The question in the homework problems requires us
to look at three things:
  • Does the hypothesis test support the existence of
  a relationship in the population?
  • Is the strength of the relationship characterized
  correctly?
  • Is the direction of the relationship between the
  variables correctly stated?
SW318
Social Work
Statistics
Slide 46
Practice Problem – 1

This question asks you to use linear regression to examine the relationship
between [marital] and [age]. Linear regression requires that the dependent
variable and the independent variables be interval. Ordinal variables may be
included as interval variables if a caution is added to any true findings.

The dependent variable [marital] is nominal level which does not satisfy the
requirement for a dependent variable. The independent variable [age] is
interval level, satisfying the requirement for an independent variable.
SW318
Social Work
Statistics
Slide 47
Practice Problem - 2

This question asks you to use linear regression to examine the relationship
between [fund] and [attend]. The level of measurement requirements for
linear regression are satisfied: [fund] is ordinal level, and [attend] is ordinal
level. A caution is added because ordinal level variables are included in the
analysis.

Given the assumption that the distributional requirements for linear regression
are satisfied, you can conduct a linear regression using SPSS without examining
distributional assumptions for the variables.
SW318
Social Work
Statistics
Slide 48
Linear Regression Hypothesis Test in SPSS (1)

You can conduct a linear regression using:

Analyze > Regression > Linear…
SW318
Social Work
Statistics
Slide 49
Linear Regression Hypothesis Test in SPSS (2)

Move the dependent variable to the “Dependent:” box and the
independent variable to the “Independent(s):” box, and
then click the “OK” button.
SW318
Social Work
Statistics
Slide 50
Linear Regression Hypothesis Test in SPSS (3)

Based on the ANOVA table for the linear regression (F(1, 604) = 70.579, p<0.001),
there was a relationship between the dependent variable "degree of religious
fundamentalism" and the independent variable "frequency of attendance at religious
services". Since the probability of the F statistic (p<0.001) was less than or equal to
the level of significance (0.05), the null hypothesis that the correlation coefficient (R)
was equal to 0 was rejected.

The research hypothesis that there was a relationship between the variables was
supported.
SW318
Social Work
Statistics
Slide 51
Linear Regression Hypothesis Test in SPSS (4)

Given the significant F-test result, the correlation coefficient (R) can be
interpreted.

The correlation coefficient for the relationship between the independent variable
and the dependent variable was 0.323, which would be characterized as a weak
relationship using the rule of thumb that a correlation between 0.0 and 0.20 is
very weak; 0.20 to 0.40 is weak; 0.40 to 0.60 is moderate; 0.60 to 0.80 is
strong; and greater than 0.80 is very strong.

The relationship between the independent variable and the dependent variable
was incorrectly characterized as a moderate relationship. The relationship should
have been characterized as a weak relationship.

The answer to the problem is false.
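
The rule of thumb quoted on this slide can be written as a small lookup. This is a sketch under one assumption: the slide's ranges share their endpoints (0.20, 0.40, and so on), so the function below assigns a boundary value to the lower category. The function name `describe_strength` is hypothetical.

```python
def describe_strength(r):
    """Label a correlation using the rule of thumb from the slide."""
    size = abs(r)  # strength ignores the sign of the correlation
    if size <= 0.20:
        return "very weak"
    if size <= 0.40:
        return "weak"
    if size <= 0.60:
        return "moderate"
    if size <= 0.80:
        return "strong"
    return "very strong"


# The correlation of 0.323 reported in this problem.
label = describe_strength(0.323)
print(label)
```

Because the label for 0.323 is "weak", a problem statement that calls the relationship moderate is marked false.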


SW318
Social Work
Statistics
Slide 52
Practice Problem – 3

This question asks you to use linear regression to examine the relationship
between [educ] and [age]. [educ] and [age] are interval level, satisfying the
level of measurement requirements for regression.

Given the assumption that the distributional requirements for linear
regression are satisfied, you can conduct a linear regression using SPSS
without examining distributional characteristics of variables.
SW318
Social Work
Statistics
Slide 53
Linear Regression Hypothesis Test in SPSS (5)

You can conduct a linear regression using:

Analyze > Regression > Linear…
SW318
Social Work
Statistics
Slide 54
Linear Regression Hypothesis Test in SPSS (6)

Move the dependent variable to the “Dependent:” box and the
independent variable to the “Independent(s):” box, and
then click the “OK” button.
SW318
Social Work
Statistics
Slide 55
Linear Regression Hypothesis Test in SPSS (7)

Based on the ANOVA table for the linear regression (F(1, 659) = 9.983, p=0.002),
there was a relationship between the dependent variable "highest year of school
completed" and the independent variable "age". Since the probability of the F
statistic (p=0.002) was less than or equal to the level of significance (0.05), the null
hypothesis that the correlation coefficient (R) was equal to 0 was rejected.

The research hypothesis that there was a relationship between the variables was
supported.
SW318
Social Work
Statistics
Slide 56
Linear Regression Hypothesis Test in SPSS (8)

Given the significant F-test result, the
correlation coefficient (R) can be
interpreted.

The correlation coefficient for the
relationship between the independent
variable and the dependent variable was
0.122, which can be characterized as a very
weak relationship.
SW318
Social Work
Statistics
Slide 57
Linear Regression Hypothesis Test in SPSS (9)

The b coefficient for the independent variable "age" was -.021,
indicating an inverse relationship with the dependent variable.
Higher numeric values for the independent variable "age" [age]
are associated with lower numeric values for the dependent
variable "highest year of school completed" [educ].

The statement in the problem that "survey respondents who
were older had completed more years of school" is incorrect.
The direction of the relationship is stated incorrectly.
SW318
Social Work
Statistics
Slide 58
Practice Problem – 4

This question asks you to use linear regression to examine the relationship
between [sei] and [age]. [sei] and [age] are interval level, satisfying the
level of measurement requirements for regression.

Given the assumption that the distributional requirements for linear
regression are satisfied, you can conduct a linear regression using SPSS
without examining distributional characteristics of variables.
SW318
Social Work
Statistics
Slide 59
Linear Regression Hypothesis Test in SPSS (10)

You can conduct a linear regression using:

Analyze > Regression > Linear…
SW318
Social Work
Statistics
Slide 60
Linear Regression Hypothesis Test in SPSS (11)

Move the dependent variable to the “Dependent:” box and the
independent variable to the “Independent(s):” box, and
then click the “OK” button.
SW318
Social Work
Statistics
Slide 61
Linear Regression Hypothesis Test in SPSS (12)

Based on the ANOVA table for the linear regression (F(1, 629) = .266, p=0.606),
there was no evidence of a relationship between the dependent variable
"socioeconomic index" and the independent variable "age". Since the probability
of the F statistic (p=0.606) was greater than the level of significance (0.05), the
null hypothesis that the correlation coefficient (R) was equal to 0 was not rejected.

The research hypothesis that there was a relationship between the variables was not
supported.
SW318
Social Work
Statistics
Slide 62
Steps in solving Linear Regression Hypothesis Test Problems - 1

The following is a guide to the decision process for answering
homework problems about Linear Regression Hypothesis Test
problems:

Are the dependent and independent variables ordinal or interval level?
  No: Incorrect application of a statistic
  Yes: continue

Make sure that the assumption that the distributional requirements
for linear regression are satisfied is made. Otherwise, you have to
check the assumption first.

Our regression problems will assume that the assumptions are met.
SW318
Social Work
Statistics
Slide 63
Steps in solving Linear Regression Hypothesis Test Problems - 2

Conduct the linear regression analysis

Is the p-value in the ANOVA table for the F ratio test <= alpha?
  No: False
  Yes: continue

Is the interpretation of the strength of the correlation coefficient correct?
  No: False
  Yes: continue
SW318
Social Work
Statistics
Slide 64
Steps in solving Linear Regression Hypothesis Test Problems - 3

Is the direction of the relationship correctly stated?
  No: False
  Yes: continue

Are either of the variables ordinal level?
  No: True
  Yes: True with caution
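
The decision process on slides 62 to 64 can be summarized as a short function. This is a sketch only: the function name, parameter names, and the example inputs are assumptions introduced for illustration, and the function presumes that the distributional assumptions are taken as met, as the homework problems do.

```python
def answer_problem(levels_are_ordinal_or_interval, p_value, alpha,
                   strength_stated_correctly, direction_stated_correctly,
                   any_variable_ordinal):
    """Walk through the decision steps and return the answer to a problem."""
    if not levels_are_ordinal_or_interval:
        return "Incorrect application of a statistic"
    if p_value > alpha:
        # No relationship supported by the ANOVA F test.
        return "False"
    if not strength_stated_correctly:
        return "False"
    if not direction_stated_correctly:
        return "False"
    if any_variable_ordinal:
        return "True with caution"
    return "True"


# Example modeled on Practice Problem 2: significant F test, but the
# strength was misstated. The p-value 0.001 stands in for "p<0.001".
answer = answer_problem(True, 0.001, 0.05, False, True, True)
print(answer)
```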
