Chap13 - Simple Linear Regression
Chapter 13
Simple Regression
Correlation Analysis
The population correlation coefficient is
denoted ρ (the Greek letter rho)
The sample correlation coefficient is
$$r = \frac{s_{xy}}{s_x s_y}$$
where
$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
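The sample correlation formula can be sketched in a few lines of Python. The data below are hypothetical values invented only for illustration (they do not come from the text), and the function name `sample_correlation` is likewise an assumption.

```python
import math

# Hypothetical toy data, invented purely for illustration (not from the text).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def sample_correlation(x, y):
    """Return r = s_xy / (s_x * s_y), using n - 1 divisors throughout."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample covariance s_xy
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    # Sample standard deviations s_x and s_y
    s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    return s_xy / (s_x * s_y)


r = sample_correlation(x, y)
print(round(r, 4))
```

Because the same n - 1 divisor appears in the numerator and the denominator, it cancels, so r depends only on the sums of cross products and squared deviations.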
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Hypothesis Test for Correlation
Test statistic for H0: ρ = 0, with n − 2 degrees of freedom:
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$
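As a rough check of the test statistic, the sketch below plugs in r = 0.76211 and n = 10, taken from the "Multiple R" and "Observations" rows of the Excel output shown elsewhere in these slides. Treating Multiple R as the signed correlation is an assumption that holds here because the fitted slope is positive. The critical value 2.3060 is the one quoted in the slides; it is not computed here.

```python
import math


def correlation_t_stat(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), for testing H0: rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Values taken from the Excel output quoted in these slides.
r = 0.76211       # "Multiple R" (assumed equal to r because the slope is positive)
n = 10            # "Observations"
t_crit = 2.3060   # two-tailed critical value quoted in the slides (8 df, alpha = 0.05)

t_stat = correlation_t_stat(r, n)
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 3), reject_h0)
```

In simple regression this statistic coincides, up to rounding, with the t statistic for the slope reported in the Excel output.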
Decision Rules
Hypothesis Test for Correlation
Critical value: $t_{0.025;8} = 2.3060$, with area $\alpha/2$ in each tail
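$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
where $\beta_0 + \beta_1 X_i$ is the linear component and $\varepsilon_i$ is the random error component.
[Figure: the population regression line plotted against X, with intercept $\beta_0$ and slope $\beta_1$. For a given $X_i$, the observed value of Y differs from the predicted value of Y on the line by the random error $\varepsilon_i$ for that $X_i$ value.]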
Simple Linear Regression Equation
The simple linear regression equation provides an
estimate of the population regression line
$$\hat{y}_i = b_0 + b_1 x_i$$
where
$\hat{y}_i$ = estimated (or predicted) y value for observation i
$b_0$ = estimate of the regression intercept
$b_1$ = estimate of the regression slope
$x_i$ = value of x for observation i
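Residual:
$$e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i)$$
Least squares criterion:
$$\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Slope estimate:
$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = r_{xy}\frac{s_Y}{s_X}$$
Intercept estimate:
$$b_0 = \bar{y} - b_1 \bar{x}$$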
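The least squares formulas for $b_1$ and $b_0$ can be sketched as follows. The data are hypothetical values made up for illustration only, and `least_squares` is an illustrative helper name, not something defined in the text.

```python
# Hypothetical toy data, invented for illustration only (not from the text).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def least_squares(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(x, y)

# Residuals e_i = y_i - (b0 + b1 * x_i) and their sum of squares (SSE)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
print(round(b0, 4), round(b1, 4), round(sse, 4))
```

With an intercept in the model, the residuals sum to zero (up to floating-point error), which is a quick sanity check on any implementation.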
[Figure: scatter plot with Square Feet (0 to 3000) on the horizontal axis; vertical axis from 0 to 350]
ANOVA
              df    SS            MS            F         Significance F
Regression    1     18934.9348    18934.9348    11.0848   0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000
[Figure: scatter plot with the fitted regression line; horizontal axis: Square Feet (0 to 3000). Slope = 0.10977, Intercept = 98.248]
Coefficient of Determination, $R^2$
The coefficient of determination is the portion
of the total variation in the dependent variable
that is explained by variation in the
independent variable
The coefficient of determination is also called
R-squared and is denoted as $R^2$
$$R^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$$
note: $0 \le R^2 \le 1$
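As a quick numerical illustration, the sketch below computes $R^2$ from the sums of squares in the ANOVA table quoted in these slides. The variable names are illustrative choices.

```python
# Sums of squares copied from the ANOVA table in these slides.
ssr = 18934.9348   # regression sum of squares
sse = 13665.5652   # residual (error) sum of squares
sst = 32600.5000   # total sum of squares

# SST decomposes into SSR + SSE
assert abs((ssr + sse) - sst) < 1e-6

r_squared = ssr / sst
print(round(r_squared, 5))
```

The result agrees with the "R Square" entry of the Excel output to the precision shown there.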
Examples of Approximate $r^2$ Values
[Figure: two plots of Y against X, each labeled $r^2 = 1$]
Examples of Approximate $r^2$ Values
$r^2 = 0$
No linear relationship between X and Y
[Figure: plot of Y against X with $r^2 = 0$]
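In simple regression, the coefficient of determination equals the square of the sample correlation coefficient:
$$R^2 = r_{xy}^2$$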
$$\hat{\sigma}^2 = s_e^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2} = \frac{SSE}{n - 2}$$
The divisor is $n - 2$ rather than $n - 1$ because the simple regression
model uses two estimated parameters, $b_0$ and $b_1$, instead of one
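A minimal sketch of this estimate, using the residual sum of squares and the sample size (10 observations) reported in the Excel output in these slides:

```python
import math

# Values copied from the Excel output in these slides.
sse = 13665.5652   # residual sum of squares
n = 10             # number of observations

# Two parameters (b0 and b1) are estimated, so divide by n - 2.
s_e_squared = sse / (n - 2)
s_e = math.sqrt(s_e_squared)
print(round(s_e_squared, 4), round(s_e, 5))
```

The first value corresponds to the Residual MS entry of the ANOVA table, and its square root to the "Standard Error" line of the regression statistics.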
ANOVA
              df    SS            MS            F         Significance F
Regression    1     18934.9348    18934.9348    11.0848   0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000
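[Figure: two scatter plots against X, one with small $s_e$ and one with large $s_e$]
$$s_{b_1}^2 = \frac{s_e^2}{\sum (x_i - \bar{x})^2} = \frac{s_e^2}{(n - 1)s_x^2}$$
where:
$s_{b_1}$ = estimate of the standard error of the least squares slope
$s_e = \sqrt{\frac{SSE}{n - 2}}$ = standard error of the estimate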
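The slides do not list the underlying x values, so the sketch below uses hypothetical toy data (invented for illustration) to show how the standard error of the slope follows from $s_e^2$ and the spread of x. It also checks that the two forms of the denominator agree.

```python
import math

# Hypothetical toy data, invented for illustration only (not from the text).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares fit
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate: s_e^2 = SSE / (n - 2)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e_squared = sse / (n - 2)

# Standard error of the slope, first form: s_e^2 / sum (x_i - x_bar)^2
s_b1 = math.sqrt(s_e_squared / s_xx)

# Second form: s_e^2 / ((n - 1) * s_x^2), with s_x^2 the sample variance of x
s_x_squared = s_xx / (n - 1)
s_b1_alt = math.sqrt(s_e_squared / ((n - 1) * s_x_squared))

print(round(s_b1, 5), round(s_b1_alt, 5))
```

A wider spread in x (larger $\sum (x_i - \bar{x})^2$) or a smaller $s_e$ both shrink $s_{b_1}$, which is why the slope is estimated more precisely in those cases.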
Excel Output
Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10
$s_{b_1} = 0.03297$
ANOVA
              df    SS            MS            F         Significance F
Regression    1     18934.9348    18934.9348    11.0848   0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000
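H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$
From Excel output:
              Coefficients   Standard Error   t Stat    P-value
Intercept     98.24833       58.03348         1.69296   0.12892
Square Feet   0.10977        0.03297          3.32938   0.01039
$$t = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938$$
Critical value: $t_{8, .025} = 2.3060$
[Figure: t distribution with rejection regions of area $\alpha/2 = .025$ in each tail. Reject H0 if $t < -t_{n-2,\alpha/2} = -2.3060$ or $t > t_{n-2,\alpha/2} = 2.3060$; otherwise do not reject H0. The test statistic 3.329 falls in the upper rejection region.]
Decision: Reject H0
Conclusion: There is sufficient evidence that square footage affects house price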
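The t test for the slope can be reproduced with a few lines of arithmetic. The sketch below uses the coefficient, standard error, and critical value quoted in the slides; the critical value is taken as given rather than computed.

```python
# Values copied from the Excel output and the slides.
b1 = 0.10977        # estimated slope (Square Feet coefficient)
s_b1 = 0.03297      # standard error of the slope
beta1_h0 = 0.0      # slope under H0
t_crit = 2.3060     # t critical value for 8 df, alpha/2 = .025 (quoted, not computed)

t_stat = (b1 - beta1_h0) / s_b1

# Two-tailed decision rule: reject H0 if |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 3), "Reject H0" if reject_h0 else "Do not reject H0")
```

The small difference from Excel's 3.32938 in the later decimal places comes from using the rounded coefficient and standard error.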
Inferences about the Slope:
t Test Example
(continued)
H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$
From Excel output:
              Coefficients   Standard Error   t Stat    P-value
Intercept     98.24833       58.03348         1.69296   0.12892
Square Feet   0.10977        0.03297          3.32938   0.01039
P-value = 0.01039
$$F = \frac{MSR}{MSE}$$
where
$$MSR = \frac{SSR}{k}, \qquad MSE = \frac{SSE}{n - k - 1}$$
and F follows an F distribution with k numerator and (n - k - 1)
denominator degrees of freedom
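As an illustration, the F statistic can be recomputed from the ANOVA sums of squares quoted in these slides, with k = 1 independent variable and n = 10 observations. The comparison with the squared t statistic relies on the fact that, in simple regression, F equals the square of the slope's t statistic.

```python
# Values copied from the ANOVA table and Excel output in these slides.
ssr = 18934.9348   # regression sum of squares
sse = 13665.5652   # residual sum of squares
n = 10             # observations
k = 1              # number of independent variables

msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse
print(round(f_stat, 4))

# In simple regression (k = 1), F equals the square of the slope's t statistic.
t_stat = 3.32938   # t Stat for Square Feet from the Excel output
print(round(t_stat ** 2, 4))
```

Both printed values should agree with the F entry of the ANOVA table up to rounding.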
$$\hat{y} = 98.25 + 0.1098(2000) = 317.85$$
The predicted price for a house with 2000
square feet is 317.85 ($1,000s) = $317,850
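The prediction step amounts to evaluating the fitted line at a chosen x. A minimal sketch, using the rounded coefficients from the slides; the function name `predict_price` is illustrative:

```python
# Rounded coefficients as used in the slides.
b0 = 98.25    # intercept, in $1,000s
b1 = 0.1098   # slope, in $1,000s per square foot


def predict_price(square_feet):
    """Predicted house price in $1,000s for a given size in square feet."""
    return b0 + b1 * square_feet


price_thousands = predict_price(2000)
print(round(price_thousands, 2))        # price in $1,000s
print(round(price_thousands * 1000))    # price in dollars
```

Predictions like this are most trustworthy for sizes within the range of the observed data; the scatter plot's horizontal axis suggests roughly 0 to 3000 square feet, though the exact observed range is not listed in the slides.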
Chapter Summary
Introduced the linear regression model
Reviewed correlation and the assumptions of
linear regression
Discussed estimating the simple linear
regression coefficients
Described measures of variation
Described inference about the slope
Addressed estimation of mean values and
prediction of individual values
23. Can you use movie critics’ opinions to forecast box office
receipts on the opening weekend? The following data, stored in
Tomatometer, indicate the Tomatometer rating, the percentage
of professional critic reviews that are positive, and the receipts
per theater ($thousands) on the weekend a movie opened for
ten movies:
a) Use the least-squares method to
compute the regression coefficients
$b_0$ and $b_1$. Determine the coefficient of
determination, $r^2$.