Regression Analysis: SW318 Social Work Statistics Slide 1
Social Work
Statistics
Slide 1
Regression Analysis
[Figures: two scatterplots with fitted regression lines. Recovered axis labels:
Vocabulary aptitude score; Self concept scale score.]
In the first scatterplot, the regression line slopes upward to the right,
indicating a positive or direct relationship. The values of both variables
increase and decrease at the same time.
In the second scatterplot, the regression line slopes downward to the right,
indicating a negative or inverse relationship. The values of the variables move
in opposite directions.
Scatterplots: Predicting Scores
[Figure: scatterplot with regression line; horizontal axis: Vocabulary aptitude
score.]
For the value of the independent variable on the horizontal axis, e.g. 52, we
draw a perpendicular line upward from the x-axis to the regression line.
The estimate for the dependent
variable is obtained by drawing
a line parallel to the x-axis from
the regression line to the
vertical y-axis and reading the
value where this line crosses the
y-axis, e.g. 50.
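Reading a prediction off the graph is the same as evaluating the regression
line at the chosen x value. The sketch below illustrates this. The intercept
and slope are hypothetical values, picked only so that x = 52 gives an estimate
near the slide's reading of 50; they are not the coefficients of the plotted
line, which the slides do not state.

```python
def predict(x, intercept, slope):
    """Return the estimated dependent-variable score on the regression line."""
    return intercept + slope * x


# Hypothetical coefficients (assumption, not taken from the slides).
INTERCEPT = 24.0
SLOPE = 0.5

# Moving up from x = 52 to the line and across to the y-axis
# corresponds to computing intercept + slope * 52.
estimate = predict(52, INTERCEPT, SLOPE)
print(f"Estimated score for x = 52: {estimate}")
```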
The Effect of Scaling on the Scatterplot
[Figures: two scatterplots drawn with different y-axis scales; horizontal axis:
Vocabulary aptitude score.]
In the first plot, I have narrowed the range of the y-axis scale to 25 to 75,
spreading the points and making the relationship appear weaker.
In the second plot, I doubled the range of the y-axis scale to 0 to 160,
drawing the points closer together and making the relationship appear stronger.
The Assumption of Linearity
[Figures: scatterplots illustrating linear and nonlinear relationships.
Recovered axis labels: Female Life Expectancy, Gross National Product, Birth
Rate. The accompanying text is lost except for the fragment "... other side."]
In the regression equation Y = a + bX,
Y is the estimated score for the dependent variable
X is the score for the independent variable
Example: y = 1.0 + 0.5 x
[Figure: graph of the line y = 1.0 + 0.5 x for x from 0.0 to 5.0.]
The slope is the multiplier of x. It is the amount of change in y for a change
of one unit in x.
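To make the meaning of the slope concrete, the short sketch below evaluates
the example line y = 1.0 + 0.5 x at several x values and prints how much y
changes when x increases by one unit. The function name `line` is only an
illustrative choice.

```python
def line(x, intercept=1.0, slope=0.5):
    """Evaluate the example line y = 1.0 + 0.5 x from the slide."""
    return intercept + slope * x


for x in [0.0, 1.0, 2.0, 3.0]:
    # The change in y for a one-unit change in x equals the slope.
    change = line(x + 1) - line(x)
    print(f"x = {x}: y = {line(x)}, change in y for one more unit of x = {change}")
```

Whatever starting value of x is used, the change in y stays equal to the
slope, 0.5.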
[Figure: scatterplot; recovered axis label: Infant Mortality Rate.]
We are interested in
the relationship
between family size
and number of
credit cards.
The scatter diagram or scatterplot
The dependent
variable is plotted
on the Y or
vertical axis.
SPSS will give us the formula for the regression line in the
form Y = a + bX, or for these variables:
SPSS also tells us the amount of error using only the mean
and using the regression line.
If the ratio were 1 and these two numbers were the same, we would not
have reduced any error, there would be no relationship, and the p-value
would not let us reject the null hypothesis.
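The comparison of the two error amounts can be sketched as follows. The data
are made up for illustration (they are not the family size and credit card
values from the slides), and the ratio is taken here as error around the
regression line divided by error around the mean, which is an assumption about
the ratio the slide refers to.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def error_summary(xs, ys):
    """Return (error using only the mean, error using the regression line)."""
    intercept, slope = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    error_mean = sum((y - mean_y) ** 2 for y in ys)
    error_line = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return error_mean, error_line


# Illustrative data only (assumption, not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

error_mean, error_line = error_summary(xs, ys)
ratio = error_line / error_mean
print(f"Error using the mean: {error_mean:.2f}")
print(f"Error using the regression line: {error_line:.2f}")
print(f"Ratio: {ratio:.2f}")
```

A ratio well below 1 means the regression line predicts the scores with less
error than the mean alone does.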
This question asks you to use linear regression to examine the relationship
between [marital] and [age]. Linear regression requires that the dependent
variable and the independent variables be interval. Ordinal variables may be
included as interval variables if a caution is added to any true findings.
The dependent variable [marital] is nominal level, which does not satisfy the
requirement for a dependent variable. The independent variable [age] is
interval level, satisfying the requirement for an independent variable.
Practice Problem - 2
This question asks you to use linear regression to examine the relationship
between [fund] and [attend]. The level of measurement requirements for
linear regression are satisfied: [fund] is ordinal level, and [attend] is ordinal
level. A caution is added because ordinal level variables are included in the
analysis.
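The level of measurement rule used in these practice problems can be written
as a small check. This is a sketch of the rule as the slides state it
(interval is acceptable, ordinal is acceptable with a caution, nominal is
not); the function name and the returned strings are illustrative choices.

```python
def check_levels(dependent_level, independent_level):
    """Apply the slides' level of measurement rule to one pair of variables."""
    levels = (dependent_level, independent_level)
    # Any level other than interval or ordinal fails the requirement.
    if any(level not in {"interval", "ordinal"} for level in levels):
        return "not satisfied"
    # Ordinal variables are allowed, but a caution is attached to findings.
    if "ordinal" in levels:
        return "satisfied with caution"
    return "satisfied"


print(check_levels("nominal", "interval"))   # [marital] and [age]
print(check_levels("ordinal", "ordinal"))    # [fund] and [attend]
```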
Given the assumption that the distributional requirements for linear regression
are satisfied, you can conduct a linear regression using SPSS without examining
distributional assumptions for the variables.
Linear Regression Hypothesis Test in SPSS (1)
Based on the ANOVA table for the linear regression (F(1, 604) = 70.579, p<0.001),
there was a relationship between the dependent variable "degree of religious
fundamentalism" and the independent variable "frequency of attendance at religious
services". Since the probability of the F statistic (p<0.001) was less than or equal to
the level of significance (0.05), the null hypothesis that the correlation coefficient
(R) was equal to 0 was rejected.
The research hypothesis that there was a relationship between the variables was
supported.
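For a regression with one independent variable, the F statistic and the
correlation coefficient are tied together by R squared = F / (F + residual
degrees of freedom). The sketch below applies that identity to the values in
the ANOVA result above. It recovers only the size of R, not its sign, and the
function name is an illustrative choice.

```python
import math


def r_from_f(f_stat, df_residual):
    """Size of R for a one-predictor regression, from F and residual df."""
    r_squared = f_stat / (f_stat + df_residual)
    return math.sqrt(r_squared)


# F(1, 604) = 70.579, as reported in the ANOVA table.
r = r_from_f(70.579, 604)
print(f"R is approximately {r:.3f}")
```

The result can be compared with the correlation coefficient that SPSS reports
in its model summary for the same analysis.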
Linear Regression Hypothesis Test in SPSS (4)
Given the significant F-test result, the correlation coefficient (R) can be
interpreted.
The correlation coefficient for the relationship between the independent variable
and the dependent variable was 0.323, which would be characterized as a weak
relationship using the rule of thumb that a correlation between 0.0 and 0.20 is
very weak; 0.20 to 0.40 is weak; 0.40 to 0.60 is moderate; 0.60 to 0.80 is
strong; and greater than 0.80 is very strong.
The relationship between the independent variable and the dependent variable
was incorrectly characterized as a moderate relationship. The relationship should
have been characterized as a weak relationship.
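The rule of thumb can be expressed as a lookup on the size of R. The slide
does not say which category a value exactly on a boundary (such as 0.40)
belongs to; the sketch below assigns boundary values to the lower category,
which is an assumption.

```python
def describe_strength(r):
    """Characterize a correlation using the rule of thumb from the slide."""
    size = abs(r)  # the rule applies to the size of R, ignoring its sign
    if size <= 0.20:
        return "very weak"
    if size <= 0.40:
        return "weak"
    if size <= 0.60:
        return "moderate"
    if size <= 0.80:
        return "strong"
    return "very strong"


print(describe_strength(0.323))
```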
This question asks you to use linear regression to examine the relationship
between [educ] and [age]. [educ] and [age] are interval level, satisfying the
level of measurement requirements for regression.
Based on the ANOVA table for the linear regression (F(1, 659) = 9.983, p=0.002),
there was a relationship between the dependent variable "highest year of school
completed" and the independent variable "age". Since the probability of the F
statistic (p=0.002) was less than or equal to the level of significance (0.05), the null
hypothesis that the correlation coefficient (R) was equal to 0 was rejected.
The research hypothesis that there was a relationship between the variables was
supported.
Linear Regression Hypothesis Test in SPSS (8)
This question asks you to use linear regression to examine the relationship
between [sei] and [age]. [sei] and [age] are interval level, satisfying the
level of measurement requirements for regression.
Based on the ANOVA table for the linear regression (F(1, 629) = .266, p=0.606),
there was no relationship between the dependent variable "socioeconomic index" and
the independent variable "age". Since the probability of the F statistic (p=0.606) was
greater than the level of significance (0.05), the null hypothesis that the
correlation coefficient (R) was equal to 0 was not rejected.
The research hypothesis that there was a relationship between the variables was not
supported.
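The decision rule applied in these hypothesis test problems compares the
probability of the F statistic with the level of significance. A minimal
sketch, with an illustrative function name and the p-values taken from the
problems above:

```python
def f_test_decision(p_value, alpha=0.05):
    """Decide on the null hypothesis that R equals 0."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"


print(f_test_decision(0.002))  # [educ] and [age]
print(f_test_decision(0.606))  # [sei] and [age]
```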
Steps in solving Linear Regression Hypothesis Test Problems - 1
[Flowchart: decision steps; only the "Yes" branch labels survive.]
Steps in solving Linear Regression Hypothesis Test Problems - 3
[Flowchart: decision steps; only the "Yes" branch labels survive.]