R Tutorial for STAT 350 for Computer Assignment 9a

Author: Leonore Findsen, Chunyan Sun, Sarah H. Sellke

All of the tutorials for Computer Assignment 9 use the same data set.

Example: (Data Set: loc.txt)


Job Stress and Locus of Control. Many factors, such as the type of job, education
level, and job experience, can affect the stress felt by workers on the job. Locus of
control (LOC) is a term in psychology that describes the extent to which a person
believes he or she is in control of the events that influence his or her life. Is feeling “more
in control” associated with less job stress? A recent study examined the relationship
between LOC and several work-related behavioral measures among certified public
accountants in Taiwan. LOC was assessed using a questionnaire that asked
respondents to select one of two options for each of 23 items. Scores ranged from 0 to
23. Individuals with low LOC believe that their own behavior and attributes determine
their rewards in life. Those with high LOC believe that these rewards are beyond their
control. We will consider a random sample of 100 accountants.

a) Make a scatterplot of the data (including the least-squares regression line) with LOC
on the x-axis and Stress on the y-axis. Briefly describe the relationship between
Stress and LOC.
b) Compute the correlation coefficient between Stress and LOC.
c) Find the equation of the least-squares regression line for predicting Stress from
LOC.
d) What are MSE and R^2 for these data?
e) Using the ANOVA table for linear regression, confirm the values of MSE and R^2.

Solution:

job <- read.table(file = "loc.txt", header = TRUE)


#
# a) Scatterplot of the data
#
library(ggplot2)
ggplot(job, aes(x = LOC, y = STRESS)) +
  geom_point() +
  # method = "lm" draws the least-squares line; se = FALSE hides the confidence band
  geom_smooth(method = "lm", se = FALSE) +
  ggtitle("Relationship between Stress and LOC") +
  xlab("Locus") +
  ylab("Stress")
#
# b) Correlation
#
cor(job$LOC, job$STRESS)
#
# c), d) Calculate linear regression and get results
#
job.lm <- lm(STRESS ~ LOC, data = job)
summary(job.lm)
#
# e) ANOVA table
#
anova(job.lm)
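
The listing above assumes that loc.txt is a whitespace-delimited text file whose first row holds the column names LOC and STRESS. The sketch below is one way to check that kind of import before running the analysis. It writes a temporary file with three made-up rows (hypothetical values, not the study data; the column order is also an assumption) and reads it back the same way.

```r
# Hypothetical stand-in for loc.txt: three made-up rows, NOT the study data.
tmp <- tempfile(fileext = ".txt")
writeLines(c("LOC STRESS",
             "8 2.4",
             "12 2.9",
             "15 2.7"), tmp)

# header = TRUE tells read.table() that the first row holds the column names.
demo <- read.table(file = tmp, header = TRUE)

str(demo)     # both columns should be numeric (int or num), not character
nrow(demo)    # number of observations read
unlink(tmp)   # remove the temporary file
```

For the real data, str(job) and nrow(job) give the same check; the sample described above should have 100 rows.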

STAT 350: Introduction to Statistics
Department of Statistics, Purdue University, West Lafayette, IN 47907

a) Make a scatterplot of the data (including the least-squares regression line) with LOC
on the x-axis and Stress on the y-axis. Briefly describe the relationship between Stress
and LOC.

Solution:

The relationship looks linear with a positive direction. The strength is hard to judge
from the plot because the range of the y-axis is so small. There are no apparent outliers.

b) Compute the correlation coefficient between Stress and LOC.

Solution:

> cor(job$LOC, job$STRESS)
[1] 0.3122765

The correlation coefficient between Stress and LOC is 0.3122765.

This indicates a weak but non-negligible positive association between Stress and LOC.
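
To see what cor() computes, the sketch below reproduces the Pearson correlation from its definition as the sum of products of standardized values divided by n - 1. It runs on simulated stand-in data (the variable names and generating values are hypothetical, not the accountant sample), so its numbers will not match the output above.

```r
# Simulated stand-in data (hypothetical values, not the accountant sample).
set.seed(350)
loc    <- sample(0:23, size = 100, replace = TRUE)
stress <- 2.25 + 0.04 * loc + rnorm(100, sd = 0.45)

# Pearson correlation via the built-in function ...
r_builtin <- cor(loc, stress)

# ... and from the definition: sum of products of standardized values,
# divided by n - 1 (scale() standardizes using the sample standard deviation).
n <- length(loc)
r_manual <- sum(scale(loc) * scale(stress)) / (n - 1)

all.equal(r_builtin, r_manual)   # TRUE

# cor.test() adds a test of H0: rho = 0 and a confidence interval.
cor.test(loc, stress)
```

With the real data, cor.test(job$LOC, job$STRESS) would give the same kind of summary for the correlation reported above.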


c) Find the equation of the least-squares regression line for predicting Stress from LOC.

Solution:

> summary(job.lm)

Call:
lm(formula = STRESS ~ LOC, data = job)

Residuals:
     Min       1Q   Median       3Q      Max
-1.04704 -0.33806  0.02169  0.30798  1.06715

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.25550    0.14691  15.353  < 2e-16 ***
LOC          0.03991    0.01226   3.254  0.00156 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4513 on 98 degrees of freedom
Multiple R-squared:  0.09752,   Adjusted R-squared:  0.08831
F-statistic: 10.59 on 1 and 98 DF,  p-value: 0.001562

The answer to this part comes from the Estimate column of the Coefficients table in the
output above (highlighted in yellow in the original).

$\widehat{\text{Stress}} = 2.25550 + 0.03991 \cdot \text{LOC}$
Be sure to always report the equation, not just the values of b0 and b1.
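
As a minimal sketch of how the equation is used, the block below plugs a few LOC scores into the fitted line. The coefficients are copied from the output above; predict_stress is a hypothetical helper name, and the scores 0, 10, and 23 are chosen only because they lie within the questionnaire's 0 to 23 range.

```r
# Coefficients copied from the summary() output above.
b0 <- 2.25550   # intercept: predicted Stress when LOC = 0
b1 <- 0.03991   # slope: change in predicted Stress per one-point increase in LOC

# Fitted line as a function (predict_stress is a hypothetical helper name).
predict_stress <- function(loc) b0 + b1 * loc

predict_stress(c(0, 10, 23))   # 2.25550 2.65460 3.17343
```

With the fitted model object available, predict(job.lm, newdata = data.frame(LOC = 10)) should return the same prediction up to the rounding of the printed coefficients.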

d) What are MSE and R^2 for these data?

Solution:

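The answer to this part comes from the last lines of the summary() output above
(highlighted in green in the original): the residual standard error and the Multiple R-squared.

R^2 = 0.09752
For simple linear regression, do not use the adjusted R-squared value.
This is low: LOC explains only about 9.8% of the variation in Stress.

MSE = 0.4513^2 = 0.2037

The value is squared because summary() reports the residual standard error, which is the
square root of MSE.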
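
Both values can be cross-checked with a line of arithmetic each. The sketch below uses only numbers reported in the outputs above. It relies on the fact that, in simple linear regression, R^2 equals the square of the correlation coefficient from part b).

```r
r <- 0.3122765   # correlation from part b)
s <- 0.4513      # residual standard error from the summary() output

round(r^2, 5)    # 0.09752, the Multiple R-squared
round(s^2, 4)    # 0.2037, the MSE
```

With the fitted model available, summary(job.lm)$r.squared and sigma(job.lm)^2 extract the same two quantities without retyping rounded values.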


e) Using the ANOVA table for linear regression, confirm the values of MSE and R^2.

Solution:

> anova(job.lm)
Analysis of Variance Table

Response: STRESS
          Df  Sum Sq Mean Sq F value   Pr(>F)
LOC        1  2.1565 2.15651  10.589 0.001562 **
Residuals 98 19.9578 0.20365
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

MSE = 0.20365 (the Mean Sq entry in the Residuals row)

$$R^2 = \frac{SSR}{SST} = \frac{SSR}{SSR + SSE} = \frac{2.1565}{2.1565 + 19.9578} = \frac{2.1565}{22.1143} = 0.09752$$
These numbers match the values from the previous output.
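
The same confirmation can be done in R. This sketch uses the sums of squares and degrees of freedom copied from the ANOVA table above; the variable names are hypothetical.

```r
# Values copied from the anova(job.lm) table above.
ssr    <- 2.1565    # Sum Sq for LOC (regression)
sse    <- 19.9578   # Sum Sq for Residuals (error)
df_err <- 98        # residual degrees of freedom (n - 2)

mse    <- sse / df_err        # mean squared error
r2     <- ssr / (ssr + sse)   # SSR / SST
f_stat <- (ssr / 1) / mse     # MSR / MSE, with 1 regression degree of freedom

round(c(MSE = mse, R2 = r2), 5)   # 0.20365 0.09752
f_stat                            # about 10.59, as in the F value column
```

The F statistic also equals the square of the t value for LOC in the coefficient table (3.254^2 is about 10.59), which holds for any simple linear regression.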
