
Week 13: Simple Linear Regression Analysis in R
written by Junvie Pailden

Load the required packages for this lesson.


# install the necessary package if it doesn't exist
if (!require(mosaic)) install.packages("mosaic")
# load the packages
library(mosaic)

Checking the Basic Regression Assumptions

Recall the data used in Week 12 to model six-year graduation rate (%) using the predictors
student-related expenditure per full-time student, and median SAT score for the 38 primarily
undergraduate public universities and colleges in the United States with enrollments between
10,000 and 20,000.

gradrates <- read.csv("https://www.siue.edu/~jpailde/gradrates.csv")


str(gradrates)
# 'data.frame': 38 obs. of  3 variables:
#  $ Graduation.Rate: num  81.2 66.8 66.4 66.1 64.9 63.7 62.6 62.5 61.2 59.8 ...
#  $ Expenditure    : int  7462 7310 6959 8810 7657 8063 8352 7789 8106 7776 ...
#  $ Median.SAT     : int  1160 1115 1070 1205 1135 1060 1130 1200 1015 1100 ...

Let's consider, for the moment, only the predictor student expenditure (Expenditure) to model the
six-year graduation rates using a simple regression line.

xyplot(Graduation.Rate ~ Expenditure,
       data = gradrates,
       type = c("p", "r"))

model1 <- lm(Graduation.Rate ~ Expenditure,
             data = gradrates)
model1
#
# Call:
# lm(formula = Graduation.Rate ~ Expenditure, data = gradrates)
#
# Coefficients:
# (Intercept) Expenditure
# 12.41357 0.00483

The fitted regression line is: Graduation.Rate = 12.4136 + 0.0048*Expenditure.


According to this model, the average six-year graduation rate increases by 0.0048 percentage points
for each additional dollar of student expenditure. Equivalently, the average six-year graduation
rate increases by about 2.4 (0.0048 * 500) percentage points for each additional 500 dollars of
student expenditure.
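As a quick numerical check of this interpretation (a minimal sketch using the fitted coefficients; the 500-dollar increase is the same one discussed above):

# predicted change in graduation rate for a 500-dollar increase in expenditure
coef(model1)["Expenditure"] * 500 # about 2.4 percentage points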

Notice from the graph above that the simple regression model tends to fit better at low student
expenditure values than at higher student expenditure values. In practice, it is easier to analyze
the regression model errors using a residual plot.

We define the residuals as the vertical distances from the observed graduation rates
(Graduation.Rate) to the fitted regression line. The residual plot is a scatterplot of the residuals
against the fitted response values.
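A minimal sketch of building this plot by hand (assuming the lattice-style xyplot() loaded with mosaic; the dashed reference line at zero is just an illustrative choice):

# residuals vs. fitted values, constructed directly from model1
xyplot(resid(model1) ~ fitted(model1),
       xlab = "Fitted graduation rate",
       ylab = "Residual",
       panel = function(x, y, ...) {
         panel.xyplot(x, y, ...)
         panel.abline(h = 0, lty = 2) # horizontal reference line at zero
       })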

The mplot() function facilitates creating a variety of useful diagnostic plots; specifying the
which = 1 option produces the residuals vs. fitted values scatterplot.

mplot(model1, which = 1)
# [[1]]
We can see from the residual plot that the residuals are closer to zero at low student expenditure
values (or low graduation rates); the variance is not constant across all student expenditure values
(or graduation rates) but increases with student expenditure, as shown by the widening spread around
the horizontal axis in the residual plot.

Inference (tests or confidence intervals) about the model is usually based on the assumption that
the errors are normally distributed with mean zero and constant variance. Checking whether the
error (estimated by the residuals) has zero mean and constant variance is often done visually
using a residual plot.

We can also use the mplot() function to check the normality of the errors visually with a normal
quantile-quantile plot (QQ plot, see Week 9) by specifying the argument which = 2.

mplot(model1, which = 2)
# [[1]]
The QQ plot above indicates that the normality assumption seems reasonable for the errors in the
simple regression model of graduation rates on student expenditure.
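If a numerical check is wanted alongside the QQ plot, one option (not part of the original lesson) is the Shapiro-Wilk test applied to the residuals; a large p-value is consistent with normally distributed errors.

# optional Shapiro-Wilk test of normality on the residuals
shapiro.test(resid(model1))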

Inferences on the regression slope


Confidence interval for the slope

When the assumptions of the simple linear regression model are satisfied, a confidence interval
for the slope of the population regression line has the form

(estimated slope) +/- (t critical value) x (standard error)

We can use the function confint() to construct confidence intervals for the intercept and the
slope. We are mainly interested in the estimated slope interval.

confint(model1) # default 95% confidence interval
#                  2.5 %   97.5 %
# (Intercept) -2.11e+01 45.90326
# Expenditure  3.79e-04  0.00929

confint(model1, level = 0.99) # 99% confidence interval
#                 0.5 %  99.5 %
# (Intercept) -32.49296 57.3201
# Expenditure  -0.00114  0.0108
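To connect the confint() output with the formula above, here is a minimal sketch that rebuilds the 99% interval for the slope from the estimated slope, its standard error, and the t critical value (the names b, se, and tcrit are just illustrative):

# 99% confidence interval for the slope "by hand"
b     <- coef(summary(model1))["Expenditure", "Estimate"]
se    <- coef(summary(model1))["Expenditure", "Std. Error"]
tcrit <- qt(0.995, df = df.residual(model1)) # t critical value for 99% confidence
b + c(-1, 1) * tcrit * se # should match the Expenditure row above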

The 99% confidence interval for the slope is (-0.00114, 0.0108). From this interval, zero is a
plausible value for the population regression slope when student expenditure is used as the
predictor of graduation rates. Hence, it is possible that student expenditure has no influence on
the behavior of graduation rates.

Hypothesis testing for the slope

The model utility test for simple linear regression is the test of Ho: beta = 0 versus Ha: beta
not equal to 0. The null hypothesis specifies that there is no useful linear relationship
between the predictor x and the response y. If Ho is rejected in favor of Ha, we conclude that the
simple linear regression model is useful for predicting the response y.

We can use the summary functions summary() or msummary() to see the test statistic and p-value
for testing the slope.

msummary(model1)
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 12.41357 16.51288 0.75 0.457
# Expenditure 0.00483 0.00220 2.20 0.034 *
#
# Residual standard error: 12.3 on 36 degrees of freedom
# Multiple R-squared: 0.119, Adjusted R-squared: 0.0941
# F-statistic: 4.84 on 1 and 36 DF, p-value: 0.0343
coef(msummary(model1)) # to report just the summary for model coefficients
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 12.41357 16.5129 0.752 0.4571
# Expenditure 0.00483 0.0022 2.200 0.0343
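To see where the t value and p-value for Expenditure come from, here is a minimal sketch recomputing them from the coefficient table (the names b, se, and tstat are illustrative):

# model utility test statistic and two-sided p-value for the slope
b     <- coef(summary(model1))["Expenditure", "Estimate"]
se    <- coef(summary(model1))["Expenditure", "Std. Error"]
tstat <- b / se # t value reported above
2 * pt(abs(tstat), df = df.residual(model1), lower.tail = FALSE) # two-sided p-value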

Using a 1% level of significance (alpha = 0.01), we fail to reject the null hypothesis that the
population slope is zero. Based on this simple regression model, there is insufficient evidence to
conclude that student expenditure is useful in predicting graduation rates.

Next, let us use student median SAT score Median.SAT as predictor of graduation rates
Graduation.Rate in our simple linear regression model. We follow the same analysis as above.

model2 <- lm(Graduation.Rate ~ Median.SAT,
             data = gradrates)
model2 # estimated regression model equation
#
# Call:
# lm(formula = Graduation.Rate ~ Median.SAT, data = gradrates)
#
# Coefficients:
# (Intercept) Median.SAT
# -57.433 0.102
xyplot(Graduation.Rate ~ Median.SAT,
       data = gradrates,
       type = c("p", "r"))
confint(model2, level = 0.99) # confidence interval
# 0.5 % 99.5 %
# (Intercept) -113.845 -1.020
# Median.SAT 0.048 0.157
coef(summary(model2))
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -57.433 20.74 -2.77 8.84e-03
# Median.SAT 0.102 0.02 5.12 1.04e-05

In contrast to student expenditure, we have reason to believe that the student median SAT score is a
useful predictor of graduation rates at the 1% level of significance. The graduation rate increases,
on average, by 0.102 percentage points for every one-point increase in the median SAT score.
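To illustrate how the fitted model can be used, here is a minimal sketch predicting the graduation rate at a hypothetical median SAT score of 1100 (a value chosen only for illustration):

# predicted six-year graduation rate at Median.SAT = 1100 (hypothetical value)
predict(model2, newdata = data.frame(Median.SAT = 1100))
# roughly -57.433 + 0.102 * 1100, i.e. about 55 percent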

Steps in a Simple Linear Regression Analysis


Given a bivariate numerical data set consisting of observations on a dependent variable y and an
independent variable x, the analysis proceeds through the following steps (a compact sketch applying
them appears after the list).

 Step 1. Summarize the data graphically by constructing a scatterplot.


 Step 2. Based on the scatterplot, decide if it looks like the relationship between x and y is
approximately linear. If so, proceed to the next step.
 Step 3. Find the equation of the least-squares regression line.
 Step 4. Construct a residual plot and look for any patterns or unusual features that may
indicate that a line is not the best way to summarize the relationship between x and y.
 Step 5. Perform inference on the slope.
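Putting the steps together, a compact sketch of the full workflow applied to the Median.SAT model from above:

# Steps 1-2: scatterplot with fitted line to judge whether a line is reasonable
xyplot(Graduation.Rate ~ Median.SAT, data = gradrates, type = c("p", "r"))
# Step 3: least-squares regression line
model2 <- lm(Graduation.Rate ~ Median.SAT, data = gradrates)
# Step 4: residual plot to look for patterns or unusual features
mplot(model2, which = 1)
# Step 5: inference on the slope
confint(model2, level = 0.99)
msummary(model2)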
