Linear Regression Example Data: House Price in $1000s (Y) Square Feet (X)
[Figure: scatter plot of House Price in $1000s (Y) against Square Feet (X)]
Linear Regression Example
Using Excel
Tools > Data Analysis > Regression
Linear Regression Example
Excel Output
The regression equation is:
house price = 98.24833 + 0.10977 (square feet)
Regression Statistics
Multiple R          0.76211
R Square            0.58082
Adjusted R Square   0.52842
Standard Error      41.33032
Observations        10
ANOVA
            df  SS          MS          F        Significance F
Regression  1   18934.9348  18934.9348  11.0848  0.01039
Residual    8   13665.5652  1708.1957
Total       9   32600.5000
[Figure: house price plotted against Square Feet, annotated with Slope = 0.10977 and Intercept = 98.248]
Predicted price for a house with 2000 square feet:
house price = 98.25 + 0.1098(2000)
= 317.85 (in $1000s)
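The prediction step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `predict_price` is hypothetical, and the coefficients are copied from the Excel output shown earlier.

```python
# Coefficients copied from the Excel output (assumed to be in $1000s).
B0 = 98.24833   # intercept
B1 = 0.10977    # slope, $1000s per square foot


def predict_price(square_feet, b0=B0, b1=B1):
    """Return the predicted house price in $1000s for a given size."""
    return b0 + b1 * square_feet


# Prediction with the unrounded coefficients.
full_precision = predict_price(2000)
# Prediction with the rounded coefficients used in the slide's arithmetic.
slide_rounding = predict_price(2000, b0=98.25, b1=0.1098)
print(round(full_precision, 2), round(slide_rounding, 2))
```

The two results differ slightly (about 317.79 versus 317.85) only because the slide rounds the coefficients before multiplying.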
[Figure: House Price ($1000s) versus Square Feet]
Do not try to extrapolate beyond the range of observed X's
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-9
Measures of Variation
Total variation is made up of two parts:
\[ SST = SSR + SSE \]
where SST = Total Sum of Squares, SSR = Regression Sum of Squares, SSE = Error Sum of Squares
The coefficient of determination satisfies
\[ 0 \le r^2 \le 1 \]
Linear Regression Example
Coefficient of Determination, r2
\[ r^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082 \]
58.08% of the variation in house prices is explained by variation in square feet
(From the Excel output: R Square = 0.58082; ANOVA SS is 18934.9348 for Regression and 32600.5000 for Total.)
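As a quick check of the arithmetic, the decomposition of total variation and the ratio SSR/SST can be reproduced from the ANOVA sums of squares. The sketch below is illustrative only; the function name `coefficient_of_determination` is hypothetical, and the inputs are the SS values from the Excel output.

```python
SSR = 18934.9348  # regression sum of squares (from the ANOVA table)
SSE = 13665.5652  # error (residual) sum of squares (from the ANOVA table)


def coefficient_of_determination(ssr, sse):
    """Return r^2 = SSR / SST, where SST = SSR + SSE."""
    sst = ssr + sse
    return ssr / sst


SST = SSR + SSE
r2 = coefficient_of_determination(SSR, SSE)
print(f"SST = {SST:.4f}, r^2 = {r2:.5f}")
```

The computed SST should agree with the Total row of the ANOVA table, and r^2 with the R Square entry.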
\[ S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}} \]
Where
SSE = error sum of squares
n = sample size
Linear Regression Example
Standard Error of Estimate
\[ S_{YX} = 41.33032 \]
(From the Excel output: Standard Error = 41.33032; Residual SS = 13665.5652; Observations = 10.)
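The standard error of the estimate can be recomputed from SSE and the sample size. The following is a small sketch under the assumption that the residual sum of squares and n from the Excel output are used as inputs; the function name `standard_error_of_estimate` is hypothetical.

```python
import math


def standard_error_of_estimate(sse, n):
    """Return S_YX = sqrt(SSE / (n - 2)) for simple linear regression."""
    return math.sqrt(sse / (n - 2))


# SSE and n taken from the Excel output.
s_yx = standard_error_of_estimate(13665.5652, 10)
print(f"S_YX = {s_yx:.5f}")
```

Note that SSE / (n − 2) is the Residual MS entry of the ANOVA table, so S_YX is its square root.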
[Figure: two scatter plots against X, one with small S_YX and one with large S_YX]
\[ t = \frac{b_1 - \beta_1}{S_{b_1}} \]
where
b1 = regression slope coefficient
β1 = hypothesized slope
Sb1 = standard error of the slope
• H0: β1 = 0
• H1: β1 ≠ 0
From Excel output:
             Coefficients  Standard Error  t Stat   P-value
Intercept    98.24833      58.03348        1.69296  0.12892
Square Feet  0.10977       0.03297         3.32938  0.01039
\[ t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938 \]
Inferences About the Slope: t Test Example
• H0: β1 = 0
• H1: β1 ≠ 0
d.f. = 10 − 2 = 8
[Figure: t distribution with rejection regions of α/2 = .025 in each tail]
Decision: Reject H0
Decision (p-value approach): Reject H0, since p-value < α
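The t test for the slope can be sketched as follows. This is an illustration under stated assumptions: the slope, its standard error, and the p-value are copied from the Excel output; α = .05 is the level used in the slides; and the critical value 2.3060 is the standard t-table value for 8 degrees of freedom with .025 in each tail, which does not appear in the extracted slide text.

```python
b1 = 0.10977         # slope from the Excel output
s_b1 = 0.03297       # standard error of the slope from the Excel output
beta1_null = 0.0     # hypothesized slope under H0
n = 10               # number of observations
ALPHA = 0.05         # significance level used in the slides
P_VALUE = 0.01039    # two-tailed p-value reported by Excel
T_CRIT = 2.3060      # assumption: t-table value for d.f. = 8, alpha/2 = .025

t_stat = (b1 - beta1_null) / s_b1
df = n - 2

# Critical-value approach: reject H0 when |t| exceeds the critical value.
reject_by_critical_value = abs(t_stat) > T_CRIT
# p-value approach: reject H0 when the p-value is below alpha.
reject_by_p_value = P_VALUE < ALPHA

print(f"t = {t_stat:.4f}, d.f. = {df}")
print("Reject H0:", reject_by_critical_value and reject_by_p_value)
```

The statistic computed from the rounded coefficients differs from Excel's 3.32938 only in the last digits; both approaches lead to the same decision.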
\[ F = \frac{MSR}{MSE} \]
where
\[ MSR = \frac{SSR}{k} \]
\[ MSE = \frac{SSE}{n - k - 1} \]
and F follows an F distribution with k numerator degrees of freedom and (n − k − 1) denominator degrees of freedom
\[ F = \frac{MSR}{MSE} = \frac{18934.9348}{1708.1957} = 11.0848 \]
With 1 and 8 degrees of freedom
P-value for the F-Test: Significance F = 0.01039
(From the Excel ANOVA output: Regression MS = 18934.9348; Residual MS = 1708.1957.)
F-Test for Significance
• H0: β1 = 0
• H1: β1 ≠ 0
• α = .05
• df1 = 1, df2 = 8
Test Statistic:
\[ F = \frac{MSR}{MSE} = 11.08 \]
Critical Value: F.05 = 5.32
Decision: Reject H0 at α = 0.05
Conclusion: There is sufficient evidence that house size affects selling price
[Figure: F distribution with "do not reject H0" region below F.05 = 5.32 and "reject H0" region (α = .05) above it]
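The F test can be reproduced from the sums of squares in the ANOVA table. The sketch below assumes k = 1 independent variable (square feet) and uses the critical value F.05 = 5.32 quoted in the slides; the function name `f_statistic` is hypothetical.

```python
def f_statistic(ssr, sse, n, k):
    """Return (MSR, MSE, F) for a regression with k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr, mse, msr / mse


SSR = 18934.9348   # regression sum of squares from the ANOVA table
SSE = 13665.5652   # error sum of squares from the ANOVA table
N, K = 10, 1       # observations and number of independent variables
F_CRIT = 5.32      # F.05 with 1 and 8 degrees of freedom, as given in the slides

msr, mse, f_stat = f_statistic(SSR, SSE, N, K)
reject_h0 = f_stat > F_CRIT
print(f"MSR = {msr:.4f}, MSE = {mse:.4f}, F = {f_stat:.4f}, reject H0: {reject_h0}")
```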
Residual Analysis
\[ e_i = Y_i - \hat{Y}_i \]
• The residual for observation i, ei, is the difference between its
observed and predicted value
• Check the L.I.N.E. assumptions of regression by examining the
residuals
– Examine for Linearity assumption
– Evaluate Independence assumption
– Evaluate Normal distribution assumption
– Examine Equal variance for all levels of X
• Graphical Analysis of Residuals
– Can plot residuals vs. X
Residual Analysis for Linearity
[Figure: plots of Y versus x, each with the corresponding plot of residuals versus x]
Checking for Normality
[Figure: plots of Y versus x, each with the corresponding plot of residuals versus x]
Observation  Predicted House Price  Residuals
2            273.87671              38.12329
3            284.85348              -5.853484
4            304.06284              3.937162
5            218.99284              -19.99284
6            268.38832              -49.38832
7            356.20251              48.79749
8            367.17929              -43.17929
9            254.6674               64.33264
10           284.85348              -29.85348
[Figure: Residuals plotted against Square Feet]
Does not appear to violate any regression assumptions
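The residual definition (observed value minus predicted value) can be illustrated with a few rows of the residual output. This is a sketch under a labeled assumption: the observed prices are not listed in this section, so they are back-computed here as predicted + residual for observations 2 to 4 and rounded to whole $1000s. The function name `residuals` is hypothetical.

```python
def residuals(observed, predicted):
    """Return e_i = Y_i - Y_hat_i for each observation."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Predicted prices for observations 2-4, copied from the residual output.
predicted = [273.87671, 284.85348, 304.06284]
# Assumption: observed prices back-computed as predicted + residual, rounded.
observed = [312.0, 279.0, 308.0]

e = residuals(observed, predicted)
for obs_number, e_i in zip((2, 3, 4), e):
    print(f"observation {obs_number}: residual = {e_i:.5f}")
```

Plotting such residuals against X is the graphical check described above; the plotting step itself is omitted here.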
Measuring Autocorrelation:
The Durbin-Watson Statistic