Business Statistics, 4e

by Ken Black

Chapter 14

Multiple Regression Analysis

Business Statistics, 4e, by Ken Black. © 2003 John Wiley & Sons.
14-1
Learning Objectives

• Develop a multiple regression model.
• Understand and apply significance tests of the regression model and its coefficients.
• Compute and interpret residuals, the standard error of the estimate, and the coefficient of determination.
• Interpret multiple regression computer output.

14-2
Regression Models

Probabilistic Multiple Regression Model:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_k X_k + \epsilon$$

where:
Y = the value of the dependent (response) variable
$\beta_0$ = the regression constant
$\beta_1$ = the partial regression coefficient of independent variable 1
$\beta_2$ = the partial regression coefficient of independent variable 2
$\beta_k$ = the partial regression coefficient of independent variable k
k = the number of independent variables
$\epsilon$ = the error of prediction
14-3
Estimated Regression Model

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \cdots + b_k X_k$$

where:
$\hat{Y}$ = predicted value of Y
$b_0$ = estimate of the regression constant
$b_1$ = estimate of regression coefficient 1
$b_2$ = estimate of regression coefficient 2
$b_3$ = estimate of regression coefficient 3
$b_k$ = estimate of regression coefficient k
k = number of independent variables


14-4
Multiple Regression Model with Two
Independent Variables (First-Order)

Population model:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon$$

where:
$\beta_0$ = the regression constant
$\beta_1$ = the partial regression coefficient for independent variable 1
$\beta_2$ = the partial regression coefficient for independent variable 2
$\epsilon$ = the error of prediction

Estimated model:

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$$

where:
$\hat{Y}$ = predicted value of Y
$b_0$ = estimate of the regression constant
$b_1$ = estimate of regression coefficient 1
$b_2$ = estimate of regression coefficient 2
14-5
Response Plane for First-Order Two-Predictor
Multiple Regression Model

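[Figure: response plane of the first-order two-predictor model, drawn over the X1 and X2 axes with Y on the vertical axis; labels mark the vertical intercept, a point Y1, and the response plane.]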

14-6
Least Squares Equations for k = 2
For multiple regression models with two independent
variables, the result is three simultaneous equations with
three unknowns ($b_0$, $b_1$, and $b_2$).

$$b_0 n + b_1 \sum X_1 + b_2 \sum X_2 = \sum Y$$

$$b_0 \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2 = \sum X_1 Y$$

$$b_0 \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2 = \sum X_2 Y$$

14-7
Real Estate Example

• A real estate study was conducted in a small Louisiana city to determine what variables, if any, are related to the market price of a home.
• Several variables were explored, including the number of bedrooms, the number of bathrooms, the age of the house, the number of square feet of living space, the total number of square feet of space, and the number of garages.
• Suppose the researcher wants to develop a regression model to predict the market price of a home by using only two variables, “total number of square feet in the house” and “the age of the house.”

14-8
Real Estate Data
Observation   Market Price ($1,000), Y   Square Feet, X1   Age (Years), X2
1             63.0                       1,605             35
2             65.1                       2,489             45
3             69.9                       1,553             20
4             76.8                       2,404             32
5             73.9                       1,884             25
6             77.9                       1,558             14
7             74.9                       1,748             8
8             78.0                       3,105             10
9             79.0                       1,682             28
10            83.4                       2,470             30
11            79.5                       1,820             2
12            83.9                       2,143             6
13            79.7                       2,121             14
14            84.5                       2,485             9
15            96.0                       2,300             19
16            109.5                      2,714             4
17            102.5                      2,463             5
18            121.0                      3,076             7
19            104.9                      3,048             3
20            128.0                      3,267             6
21            129.0                      3,069             10
22            117.9                      4,765             11
23            140.0                      4,540             8
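
As a rough illustration of the three least squares equations for k = 2, the sketch below builds the required sums from the 23 observations and solves for b0, b1, and b2 by Cramer's rule in plain Python. The names `det3` and `fit_two_predictors` are illustrative and do not come from the text. Observation 10's price is entered as 83.4, an assumption chosen because that value is consistent with the reported coefficients and with that observation's residual of 2.276.

```python
# Real estate data: (market price in $1,000, square feet, age in years).
# Assumption: observation 10's price is entered as 83.4 (see the note above).
DATA = [
    (63.0, 1605, 35), (65.1, 2489, 45), (69.9, 1553, 20), (76.8, 2404, 32),
    (73.9, 1884, 25), (77.9, 1558, 14), (74.9, 1748, 8), (78.0, 3105, 10),
    (79.0, 1682, 28), (83.4, 2470, 30), (79.5, 1820, 2), (83.9, 2143, 6),
    (79.7, 2121, 14), (84.5, 2485, 9), (96.0, 2300, 19), (109.5, 2714, 4),
    (102.5, 2463, 5), (121.0, 3076, 7), (104.9, 3048, 3), (128.0, 3267, 6),
    (129.0, 3069, 10), (117.9, 4765, 11), (140.0, 4540, 8),
]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_two_predictors(data):
    """Solve the three normal equations for (b0, b1, b2) by Cramer's rule."""
    n = len(data)
    sum_y = sum(y for y, _, _ in data)
    sum_x1 = sum(x1 for _, x1, _ in data)
    sum_x2 = sum(x2 for _, _, x2 in data)
    sum_x1_sq = sum(x1 * x1 for _, x1, _ in data)
    sum_x2_sq = sum(x2 * x2 for _, _, x2 in data)
    sum_x1_x2 = sum(x1 * x2 for _, x1, x2 in data)
    sum_x1_y = sum(x1 * y for y, x1, _ in data)
    sum_x2_y = sum(x2 * y for y, _, x2 in data)

    # Coefficients of b0, b1, b2 on the left-hand side of each equation.
    lhs = [
        [n, sum_x1, sum_x2],
        [sum_x1, sum_x1_sq, sum_x1_x2],
        [sum_x2, sum_x1_x2, sum_x2_sq],
    ]
    rhs = [sum_y, sum_x1_y, sum_x2_y]

    denominator = det3(lhs)
    solution = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        replaced = [row[:] for row in lhs]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / denominator)
    return tuple(solution)


b0, b1, b2 = fit_two_predictors(DATA)
print(f"b0 = {b0:.2f}, b1 = {b1:.4f}, b2 = {b2:.3f}")
```

Cramer's rule is used here only because the system is 3 by 3; for more predictors a general linear solver or a statistics package would be the more practical choice.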

14-9
Using Excel

14-10
Excel Output
Regression Statistics
Multiple R          0.86
R Square            0.74
Adjusted R Square   0.72
Standard Error      11.96
Observations        23.00

ANOVA
             df      SS         MS        F       Significance F
Regression   2.00    8189.72    4094.86   28.63   0.00
Residual     20.00   2861.02    143.05
Total        22.00   11050.74

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept     57.35          10.01            5.73     0.00      36.48       78.23       36.48         78.23
Square Feet   0.02           0.00             5.63     0.00      0.01        0.02        0.01          0.02
Age           -0.67          0.23             -2.92    0.01      -1.14       -0.19       -1.14         -0.19

14-11
Estimated Regression Model for
Real Estate Example

$$\hat{Y} = 57.4 + 0.0177 X_1 - 0.666 X_2$$

• The coefficient of X1 (total number of square feet in the house) is 0.0177, which means that, with all other variables held constant, adding 1 square foot of space to the house results in a predicted increase of 0.0177 × $1,000 = $17.70 in the price of the home.
• The coefficient of X2 (age) is -0.666, which means that, with all other variables held constant, a one-unit increase in the age of the house (1 year) results in a predicted change of -0.666 × $1,000 = -$666, that is, a predicted $666 drop in the price.
14-12
Predicting the Price of a Home

$$\hat{Y} = 57.4 + 0.0177 X_1 - 0.666 X_2$$

For $X_1 = 2500$ and $X_2 = 12$:

$$\hat{Y} = 57.4 + 0.0177(2500) - 0.666(12) = 93.658 \text{ thousand dollars}$$

14-13
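
The prediction for a 2,500-square-foot, 12-year-old home can be reproduced with a few lines of Python. This is a minimal sketch that uses the rounded coefficients from the estimated model; the function name `predict_price` is illustrative and not from the text.

```python
# Rounded coefficients of the estimated model (price in $1,000).
B0, B1, B2 = 57.4, 0.0177, -0.666


def predict_price(square_feet, age_years):
    """Predicted market price in $1,000 for a home of the given size and age."""
    return B0 + B1 * square_feet + B2 * age_years


price = predict_price(2500, 12)
print(f"Predicted price: {price:.3f} thousand dollars")
```

Because the coefficients are rounded, predictions from this sketch can differ slightly from those produced with the full-precision software output.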
Evaluating the Multiple Regression Model

Testing the overall model:

$$H_0: \beta_1 = \beta_2 = \beta_3 = \cdots = \beta_k = 0$$
$$H_a: \text{at least one of the regression coefficients is} \neq 0$$

Significance tests for individual regression coefficients:

$$H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0$$
$$H_0: \beta_2 = 0 \qquad H_a: \beta_2 \neq 0$$
$$H_0: \beta_3 = 0 \qquad H_a: \beta_3 \neq 0$$
$$\vdots$$
$$H_0: \beta_k = 0 \qquad H_a: \beta_k \neq 0$$
14-14
Testing the Overall Model for the Real
Estate Example

$$H_0: \beta_1 = \beta_2 = 0$$
$$H_a: \text{at least one of the regression coefficients is} \neq 0$$

$$MSR = \frac{SSR}{k} \qquad MSE = \frac{SSE}{n - k - 1} \qquad F = \frac{MSR}{MSE}$$

$$F_{.01,2,20} = 5.85$$
$$F_{Cal} = 28.63 > 5.85, \text{ reject } H_0.$$

ANOVA
                   df   SS         MS        F       p-value
Regression         2    8189.723   4094.86   28.63   .000
Residual (Error)   20   2861.017   143.1
Total              22   11050.74
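
The F statistic can be recomputed directly from the ANOVA sums of squares. The sketch below is a minimal illustration in Python; the critical value 5.85 is taken as given from the slide rather than computed, and the variable names are illustrative.

```python
SSR = 8189.723     # regression sum of squares (ANOVA table)
SSE = 2861.017     # residual (error) sum of squares (ANOVA table)
N = 23             # number of observations
K = 2              # number of independent variables
F_CRITICAL = 5.85  # F(.01, 2, 20), as quoted on the slide

msr = SSR / K              # mean square regression
mse = SSE / (N - K - 1)    # mean square error
f_stat = msr / mse         # calculated F
reject_h0 = f_stat > F_CRITICAL

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```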
14-15
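Significance Test of the Regression
Coefficients for the Real Estate Example

$$H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0$$
$$H_0: \beta_2 = 0 \qquad H_a: \beta_2 \neq 0$$

$$t_{.025,20} = 2.086$$

For $\beta_1$: $t_{Cal} = 5.63 > 2.086$, reject $H_0$.
For $\beta_2$: $|t_{Cal}| = |-2.92| = 2.92 > 2.086$, reject $H_0$.

               Coefficients   Std Dev    t Stat   p
x1 (Sq.Feet)   0.0177         0.003146   5.63     .000
x2 (Age)       -0.666         0.2280     -2.92    .008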
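
Each t statistic in the table is the coefficient estimate divided by its standard deviation. The sketch below reproduces that arithmetic in Python and compares the absolute value with the two-tailed critical value 2.086 quoted on the slide; the dictionary layout and the function name `t_statistic` are illustrative assumptions.

```python
T_CRITICAL = 2.086  # t(.025, 20), as quoted on the slide

# (coefficient estimate, standard deviation of the estimate)
coefficients = {
    "x1 (Sq.Feet)": (0.0177, 0.003146),
    "x2 (Age)": (-0.666, 0.2280),
}


def t_statistic(estimate, std_dev):
    """Calculated t for H0: beta = 0."""
    return estimate / std_dev


results = {name: t_statistic(b, s) for name, (b, s) in coefficients.items()}

for name, t in results.items():
    decision = "reject H0" if abs(t) > T_CRITICAL else "fail to reject H0"
    print(f"{name}: t = {t:.2f}, {decision}")
```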

14-16
Residuals and Sum of Squares Error for the
Real Estate Example

Observation   Y       $\hat{Y}$   $Y - \hat{Y}$   $(Y - \hat{Y})^2$
1             63.0    62.466      0.534           0.285
2             65.1    71.465      -6.365          40.517
3             69.9    71.540      -1.640          2.689
4             76.8    78.622      -1.822          3.319
5             73.9    74.073      -0.173          0.030
6             77.9    75.627      2.273           5.168
7             74.9    82.991      -8.091          65.466
8             78.0    105.702     -27.702         767.388
9             79.0    68.495      10.505          110.360
10            83.4    81.124      2.276           5.181
11            79.5    88.265      -8.765          76.823
12            83.9    91.322      -7.422          55.092
13            79.7    85.602      -5.902          34.832
14            84.5    95.383      -10.883         118.438
15            96.0    85.442      10.558          111.479
16            109.5   102.772     6.728           45.265
17            102.5   97.659      4.841           23.440
18            121.0   107.187     13.813          190.799
19            104.9   109.356     -4.456          19.858
20            128.0   111.237     16.763          280.982
21            129.0   105.064     23.936          572.936
22            117.9   134.447     -16.547         273.815
23            140.0   132.460     7.540           56.854
SSE                                               2861.017
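
The residual column and SSE can be checked by subtracting each predicted value from the observed value and summing the squares. In the sketch below, the Y and Ŷ values are entered on the same $1,000 scale as the real estate data table, which is an assumption about the intended values; the residuals themselves do not depend on any common shift in Y and Ŷ. Because the predicted values are rounded to three decimals, the computed SSE agrees with the tabulated 2861.017 only approximately.

```python
# Observed prices (Y) and model predictions (Y-hat), both in $1,000.
Y = [63.0, 65.1, 69.9, 76.8, 73.9, 77.9, 74.9, 78.0, 79.0, 83.4, 79.5, 83.9,
     79.7, 84.5, 96.0, 109.5, 102.5, 121.0, 104.9, 128.0, 129.0, 117.9, 140.0]
Y_HAT = [62.466, 71.465, 71.540, 78.622, 74.073, 75.627, 82.991, 105.702,
         68.495, 81.124, 88.265, 91.322, 85.602, 95.383, 85.442, 102.772,
         97.659, 107.187, 109.356, 111.237, 105.064, 134.447, 132.460]

# Residual = observed value minus predicted value.
residuals = [y - y_hat for y, y_hat in zip(Y, Y_HAT)]

# Sum of squares error = sum of the squared residuals.
sse = sum(r ** 2 for r in residuals)

print(f"Sum of residuals: {sum(residuals):.3f}")
print(f"SSE: {sse:.1f}")
```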

14-17
MINITAB Residual Diagnostics for the Real
Estate Problem

[Figure: MINITAB residual model diagnostics, four panels. Normal Plot of Residuals (Residual vs. Normal Score); I Chart of Residuals (Residual vs. Observation Number, with center line X = -7.2E-14, 3.0SL = 31.26, and -3.0SL = -31.26); Histogram of Residuals (Frequency vs. Residual); Residuals vs. Fits (Residual vs. Fit).]

14-18
SSE and Standard Error of the Estimate

ANOVA
                   df   SS        MS       F       P
Regression         2    8189.7    4094.9   28.63   .000
Residual (Error)   20   2861.0    143.1
Total              22   11050.7

$$S_e = \sqrt{\frac{SSE}{n - k - 1}} = \sqrt{\frac{2861}{23 - 2 - 1}} = 11.96$$

where: n = number of observations
k = number of independent variables
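
The standard error of the estimate follows from SSE and the error degrees of freedom. A minimal Python sketch of that calculation is shown here; the function name `standard_error_of_estimate` is illustrative.

```python
import math

SSE = 2861.017  # residual (error) sum of squares (ANOVA table)
N = 23          # number of observations
K = 2           # number of independent variables


def standard_error_of_estimate(sse, n, k):
    """Square root of SSE divided by the error degrees of freedom."""
    return math.sqrt(sse / (n - k - 1))


se = standard_error_of_estimate(SSE, N, K)
print(f"Se = {se:.2f}")
```

The result is in the units of Y, so a value of about 11.96 corresponds to roughly $11,960.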

14-19
Coefficient of Multiple Determination (R²)
Reports the proportion of total variation in y explained by all x
variables taken together

$$SS_{YY} = SSR + SSE$$

ANOVA
                   df   SS        MS       F       p
Regression         2    8189.7    4094.9   28.63   .000
Residual (Error)   20   2861.0    143.1
Total              22   11050.7

$$R^2 = \frac{SSR}{SS_{YY}} = \frac{8189.723}{11050.74} = .741$$

$$R^2 = 1 - \frac{SSE}{SS_{YY}} = 1 - \frac{2861.017}{11050.74} = .741$$
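
Both forms of the R² formula can be evaluated from the ANOVA sums of squares to confirm that they agree. The sketch below is a minimal Python illustration with illustrative variable names.

```python
SSR = 8189.723    # regression sum of squares
SSE = 2861.017    # residual (error) sum of squares
SS_YY = 11050.74  # total sum of squares

r2_from_ssr = SSR / SS_YY      # explained share of total variation
r2_from_sse = 1 - SSE / SS_YY  # one minus the unexplained share

print(f"R^2 = {r2_from_ssr:.3f} (from SSR), {r2_from_sse:.3f} (from SSE)")
```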

14-20
Adjusted R²

ANOVA
                   df   SS        MS       F       p
Regression         2    8189.7    4094.9   28.63   .000
Residual (Error)   20   2861.0    143.1
Total              22   11050.7

$$\text{adj. } R^2 = 1 - \frac{SSE / (n - k - 1)}{SS_{YY} / (n - 1)} = 1 - \frac{2861.017 / (23 - 2 - 1)}{11050.74 / (23 - 1)} = 1 - .285 = .715$$

$$R_A^2 = 1 - (1 - R^2)\left(\frac{n - 1}{n - k - 1}\right)$$

Adjusted R² shows the proportion of variation in y explained by all x variables, adjusted for the number of x variables used.
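
The two expressions for adjusted R² give the same value, which can be checked numerically. The following Python sketch is a minimal illustration using the ANOVA figures; the variable names are illustrative.

```python
SSE = 2861.017    # residual (error) sum of squares
SS_YY = 11050.74  # total sum of squares
N = 23            # number of observations
K = 2             # number of independent variables

# Form 1: ratio of the two sums of squares, each divided by its degrees of freedom.
adj_r2_direct = 1 - (SSE / (N - K - 1)) / (SS_YY / (N - 1))

# Form 2: adjustment applied to the unadjusted R^2.
r2 = 1 - SSE / SS_YY
adj_r2_from_r2 = 1 - (1 - r2) * (N - 1) / (N - K - 1)

print(f"adj. R^2 = {adj_r2_direct:.3f}")
```

Since the factor (n - 1)/(n - k - 1) grows with k, an added predictor raises adjusted R² only if it reduces SSE by enough to offset the lost degree of freedom.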
14-21
