R Tutorial For STAT 350 For Computer Assignment 9a: Example
All of the tutorials for Computer Assignment 9 use the same data set.
a) Make a scatterplot of the data (including the least-squares regression line) with LOC
on the x-axis and Stress on the y-axis. Briefly describe the relationship between
Stress and LOC.
b) Compute the correlation coefficient between Stress and LOC.
c) Find the equation of the least-squares regression line for predicting Stress from
LOC.
d) What are MSE and R^2 for these data?
e) Using the ANOVA table for linear regression, confirm the values of MSE and R^2.
Solution:
STAT 350: Introduction to Statistics
Department of Statistics, Purdue University, West Lafayette, IN 47907
R Tutorial for STAT 350 for Computer Assignment 9a
Author: Leonore Findsen, Chunyan Sun, Sarah H. Sellke
a) Make a scatterplot of the data (including the least-squares regression line) with LOC
on the x-axis and Stress on the y-axis. Briefly describe the relationship between Stress
and LOC.
Solution:
The plot looks linear with a positive direction. I am not sure about the strength because
the scale on the y-axis is so small. I do not see any outliers.
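A plot like the one described here could be produced along the following lines. This is a minimal sketch, not the tutorial's own code: the data frame name `job` and the column names `LOC` and `STRESS` are taken from the regression output later in this tutorial, but the data values below are a synthetic stand-in (100 simulated observations over an arbitrary LOC range), so the picture will not match the real data.

```r
# Synthetic stand-in for the course data set (NOT the real data);
# replace this with the data frame imported for the assignment.
set.seed(350)
job <- data.frame(LOC = runif(100, min = 5, max = 20))
job$STRESS <- 2.2 + 0.04 * job$LOC + rnorm(100, sd = 0.45)

# Scatterplot with the explanatory variable (LOC) on the x-axis
plot(job$LOC, job$STRESS,
     xlab = "LOC", ylab = "Stress",
     main = "Stress vs. LOC")

# Fit the least-squares line and overlay it on the plot
job.lm <- lm(STRESS ~ LOC, data = job)
abline(job.lm, col = "red")
```

Passing the fitted `lm` object to `abline()` draws the line using its intercept and slope, so the line on the plot is the same one reported by the regression output.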
b) Compute the correlation coefficient between Stress and LOC.
Solution:
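With the data loaded, the correlation would typically come from `cor(job$LOC, job$STRESS)` (assuming the data frame is named `job`, as in the regression output for this data set). As a cross-check that needs no data, in simple linear regression r^2 equals R^2 and r has the same sign as the slope, so r can be recovered from the slope and R^2 reported for parts c and d. The sketch below only does that arithmetic.

```r
# Values reported in the regression output for this data set
b1 <- 0.03991   # slope of the least-squares line
R2 <- 0.09752   # multiple R-squared

# In simple linear regression r^2 = R^2 and r has the sign of the slope
r <- sign(b1) * sqrt(R2)
round(r, 4)   # 0.3123
```

A correlation of about 0.31 indicates a weak positive linear association, which is consistent with the description of the scatterplot.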
c) Find the equation of the least-squares regression line for predicting Stress from LOC.
Solution:
Call:
lm(formula = STRESS ~ LOC, data = job)

Residuals:
     Min       1Q   Median       3Q      Max
-1.04704 -0.33806  0.02169  0.30798  1.06715

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.25550    0.14691  15.353  < 2e-16 ***
LOC          0.03991    0.01226   3.254  0.00156 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$\widehat{\text{Stress}} = 2.25550 + 0.03991 \, \text{LOC}$
Be sure to always report the equation, not just the values of b0 and b1.
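To illustrate why the full equation matters, the sketch below turns the reported coefficients into a small prediction function. The helper name `predict_stress` and the input value LOC = 10 are hypothetical choices made for illustration only. With a fitted model object, `coef(job.lm)` and `predict(job.lm, newdata = data.frame(LOC = 10))` would give the same information directly.

```r
# Coefficients copied from the summary output above
b0 <- 2.25550   # intercept
b1 <- 0.03991   # slope for LOC

# Hypothetical helper: predicted Stress for a given LOC value
predict_stress <- function(loc) b0 + b1 * loc

predict_stress(10)   # 2.25550 + 0.03991 * 10 = 2.6546
```

The slope says that predicted Stress rises by about 0.04 for each one-unit increase in LOC.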
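d) What are MSE and R^2 for these data?
Solution:
R^2 = 0.09752
For simple linear regression, do not use the adjusted R-squared value.
This R^2 is low: the regression on LOC explains only about 9.8% of the variation in Stress.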
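Both quantities can be read from the printed summary ("Multiple R-squared" and the square of "Residual standard error"), or pulled out of the summary object as sketched below. The data here are a synthetic stand-in, not the course data, so the numbers will differ from those in this tutorial; only the extraction steps are the point.

```r
# Synthetic stand-in data (NOT the course data); the numbers will differ
set.seed(350)
job <- data.frame(LOC = runif(100, min = 5, max = 20))
job$STRESS <- 2.2 + 0.04 * job$LOC + rnorm(100, sd = 0.45)

job.lm  <- lm(STRESS ~ LOC, data = job)
job.sum <- summary(job.lm)

R2  <- job.sum$r.squared   # "Multiple R-squared" in the printed summary
MSE <- job.sum$sigma^2     # square of the "Residual standard error"

# MSE is also SSE divided by the error degrees of freedom (n - 2)
all.equal(MSE, sum(resid(job.lm)^2) / df.residual(job.lm))
```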
e) Using the ANOVA table for linear regression, confirm the values of MSE and R^2.
Solution:
> anova(job.lm)
Analysis of Variance Table

Response: STRESS
          Df  Sum Sq Mean Sq F value   Pr(>F)
LOC        1  2.1565 2.15651  10.589 0.001562 **
Residuals 98 19.9578 0.20365
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
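MSE = 0.20365 (the Mean Sq entry in the Residuals row, 19.9578 / 98)
R^2 = SSM / SST = 2.1565 / (2.1565 + 19.9578) = 2.1565 / 22.1143 = 0.09752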
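The confirmation can also be done as plain arithmetic in R. The sketch below uses only the entries printed in the ANOVA table; the variable names are illustrative choices, not part of the tutorial.

```r
# Entries copied from the ANOVA table above
SSM <- 2.1565;  dfM <- 1    # LOC (model) row
SSE <- 19.9578; dfE <- 98   # Residuals (error) row

MSE   <- SSE / dfE            # mean square error
R2    <- SSM / (SSM + SSE)    # SSM / SST
Fstat <- (SSM / dfM) / MSE    # F statistic for the regression

round(c(MSE = MSE, R2 = R2, F = Fstat), 5)
```

As a further consistency check, in simple linear regression the F statistic equals the square of the slope's t statistic: 3.254^2 is about 10.589, which matches the F value in the table.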