
Chapter 12:

Multiple Regression

12.1 Given the following estimated linear model: ŷ = 12 + 5x1 + 6x2 + 2x3

a. ŷ = 12 + 5(11) + 6(24) + 2(27) = 265
b. ŷ = 12 + 5(31) + 6(20) + 2(17) = 321
c. ŷ = 12 + 5(32) + 6(29) + 2(13) = 372
d. ŷ = 12 + 5(30) + 6(26) + 2(9) = 336
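A minimal Python sketch of the same substitutions (the coefficients and x-values are exactly those given above):

# Predictions from the estimated model y-hat = 12 + 5*x1 + 6*x2 + 2*x3
def predict(x1, x2, x3):
    return 12 + 5 * x1 + 6 * x2 + 2 * x3

# The four cases from parts a-d
for case in [(11, 24, 27), (31, 20, 17), (32, 29, 13), (30, 26, 9)]:
    print(case, predict(*case))   # 265, 321, 372, 336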

12.2 Given the following estimated linear model:


a. = 174
b. = 181
c. = 311
d. = 188

12.3 Given the following estimated linear model:


a. = 262
b. = 488
c. = 478
d. = 378

12.4 Given the following estimated linear model:


a. increases by 8
b. increases by 8
c. increases by 24

12.5 Given the following estimated linear model:


a. decreases by 8
b. decreases by 6
c. increases by 28


12.6 The estimated regression slope coefficients are interpreted as follows:


b1 = .661: All else being equal, an increase in the plane’s top speed by one mph
will increase the expected number of hours in the design effort by an
estimated .661 million or 661 thousand worker-hours.

b2 = .065: All else being equal, an increase in the plane’s weight by one ton will
increase the expected number of hours in the design effort by an estimated .065
million or 65 thousand worker-hours.

b3 = -.018: All else being equal, an increase in the percentage of parts in common
with other models will result in a decrease in the expected number of hours in the
design effort by an estimated .018 million or 18 thousand worker-hours.

12.7 The estimated regression slope coefficients are interpreted as follows:


b1 = 0.046: All else being equal, an increase of one unit in the change over the quarter in bond purchases by financial institutions results in an estimated .046 increase in the change over the quarter in the bond interest rates.

b2 = −0.073: All else being equal, an increase of one unit in the change over the quarter in bond sales by financial institutions results in an estimated .073 decrease in the change over the quarter in the bond interest rates.

12.8 a. b1 = .052: All else being equal, an increase of one hundred dollars in weekly income results in an estimated .052 quart per week increase in milk consumption.
b2 = 1.14: All else being equal, an increase in family size by one person will result in an estimated increase in milk consumption by 1.14 quarts per week.
b. The intercept term b0 of -.025 is the estimated milk consumption of quarts of
milk per week given that the family’s weekly income is 0 dollars and there are
0 members in the family. This is likely extrapolating beyond the observed data
series and is not a useful interpretation.

12.9 a. b1 = .653: All else being equal, a one unit increase in the average number of
meals eaten per week will result in an estimated .653 pound gained during
freshman year.
b2 = -1.345: All else being equal, a one unit increase in the average number of
hours of exercise per week will result in an estimated 1.345 pound weight loss.
b3 = .613: All else being equal, a one unit increase in the average number of
beers consumed per week will result in an estimated .613 pound weight gain.
b. The intercept term b0 of 7.35 is the estimated amount of weight gain during the
freshman year given that the meals eaten is 0, hours exercise is 0 and there are
no beers consumed per week. This is likely extrapolating beyond the observed
data series and is not a useful interpretation.


12.10
Compute the slope coefficients for the model ŷ_i = b0 + b1 x1i + b2 x2i, given that

b1 = s_y (r_x1y − r_x1x2 r_x2y) / [s_x1 (1 − r_x1x2²)],  b2 = s_y (r_x2y − r_x1x2 r_x1y) / [s_x2 (1 − r_x1x2²)]

a. b 1=100 ¿ ¿
b. b 1=100 ¿ ¿
c. b 1=100 ¿ ¿
d. b 1=100 ¿ ¿
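As a check on the formulas above, a short Python sketch; the correlations and standard deviations used below are hypothetical placeholders, not the values from the exercise:

# Slope coefficients b1, b2 from correlations and standard deviations,
# following the two formulas quoted above (equation 12.4).
def slopes(s_y, s_x1, s_x2, r_x1y, r_x2y, r_x1x2):
    denom = 1 - r_x1x2 ** 2
    b1 = s_y * (r_x1y - r_x1x2 * r_x2y) / (s_x1 * denom)
    b2 = s_y * (r_x2y - r_x1x2 * r_x1y) / (s_x2 * denom)
    return b1, b2

# Hypothetical inputs, for illustration only
print(slopes(s_y=100, s_x1=50, s_x2=40, r_x1y=0.6, r_x2y=0.4, r_x1x2=0.3))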

12.11 a. For Y = a0 + a1X1, the slope is a1 = r_x1y (s_y / s_x1).

When the correlation between X1 and X2 is 0, a1 is still defined as a1 = r_x1y (s_y / s_x1). This is because there is no X2 in the equation Y = a0 + a1X1. When the correlation between X1 and X2 is 0, the slope coefficient of the X1 term in Y = b0 + b1X1 + b2X2 simplifies to the slope coefficient of the bivariate regression:

Start with equation 12.4: b1 = s_y (r_x1y − r_x1x2 r_x2y) / [s_x1 (1 − r_x1x2²)].

Note that if the correlation between X1 and X2 is zero, then the second terms in both the numerator and denominator are zero and the formula algebraically reduces to b1 = r_x1y (s_y / s_x1), which is the equivalent of the bivariate slope coefficient.

b. For Y = a0 + a1X1, when the correlation between X1 and X2 is 1, a1 is still defined as a1 = r_x1y (s_y / s_x1).
For Y = b0 + b1X1 + b2X2, when the correlation between X1 and X2 is 1, the denominator goes to 0 and the slope coefficient is undefined.

12.12 a. Electricity sales as a function of number of customers and price


Regression Analysis: SalesMwh versus Pricelec, Numcust
The regression equation is
SalesMwh = - 584616 + 15421 Pricelec + 2.31 Numcust

Predictor Coef SE Coef T P


Constant -584616 267660 -2.18 0.033
Pricelec 15421 21103 0.73 0.468
Numcust 2.3068 0.2007 11.49 0.000

S = 64625.1 R-Sq = 81.6% R-Sq(adj) = 81.0%

Analysis of Variance

Source DF SS MS F P

Regression 2 1.20344E+12 6.01719E+11 144.08 0.000


Residual Error 65 2.71466E+11 4176402085
Total 67 1.47490E+12

All else being equal, for every one unit increase in the price of electricity, we estimate
that sales will increase by 15,421 mwh. Note that this estimated coefficient is not
significantly different from zero (p-value = .468).
All else being equal, for every additional residential customer who uses electricity in the
heating of their home, we estimate that sales will increase by 2.31 mwh.
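A sketch of how this model could be fit in Python with statsmodels; the column names follow the Minitab output above, while the data file name is an assumption:

import pandas as pd
import statsmodels.formula.api as smf

# Assumed file name; columns SalesMwh, Pricelec, Numcust follow the output shown above
data = pd.read_csv("electricity_sales.csv")

# Electricity sales regressed on price and number of customers
model = smf.ols("SalesMwh ~ Pricelec + Numcust", data=data).fit()
print(model.summary())   # coefficients, standard errors, t-statistics, R-squared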

b. Electricity sales as a function of number of customers


Regression Analysis: salesmw2 versus numcust2
The regression equation is
salesmw2 = - 410202 + 2.20 numcust2
Predictor Coef SE Coef T P
Constant -410202 114132 -3.59 0.001
numcust2 2.2027 0.1445 15.25 0.000

S = 66282 R-Sq = 78.9% R-Sq(adj) = 78.6%


Analysis of Variance
Source DF SS MS F P
Regression 1 1.02136E+12 1.02136E+12 232.48 0.000
Residual Error 62 2.72381E+11 4393240914
Total 63 1.29374E+12
Regression Analysis: SalesMwh versus Numcust
The regression equation is
SalesMwh = - 404100 + 2.19 Numcust

Predictor Coef SE Coef T P


Constant -404100 102674 -3.94 0.000
Numcust 2.1947 0.1290 17.02 0.000

S = 64396.5 R-Sq = 81.4% R-Sq(adj) = 81.2%

Analysis of Variance

Source DF SS MS F P
Regression 1 1.20121E+12 1.20121E+12 289.66 0.000
Residual Error 66 2.73696E+11 4146912960
Total 67 1.47490E+12

An additional residential customer will add 2.19 mwh to electricity sales.


The two models have roughly equivalent explanatory power; therefore, adding price as a
variable does not add a significant amount of explanatory power to the model. There
appears to be a problem of high correlation between the independent variables of price
and customers.

c. Electricity sales as a function of price and degree days


Regression Analysis: SalesMwh versus Pricelec, Degreday
The regression equation is
SalesMwh = 2370438 - 173882 Pricelec + 56.7 Degreday

Predictor Coef SE Coef T P


Constant 2370438 142346 16.65 0.000
Pricelec -173882 23870 -7.28 0.000
Degreday 56.69 58.65 0.97 0.337


S = 111735 R-Sq = 45.0% R-Sq(adj) = 43.3%

Analysis of Variance

Source DF SS MS F P
Regression 2 6.63398E+11 3.31699E+11 26.57 0.000
Residual Error 65 8.11506E+11 12484705377
Total 67 1.47490E+12

All else being equal, an increase in the price of electricity will reduce electricity sales by
173,882 mwh.
All else being equal, an increase in the degree days (departure from normal weather) by
one unit will increase electricity sales by 56.7 mwh.
Note that the coefficient on the price variable is now negative, as expected, and it is
significantly different from zero (p-value = .000)

d. Electricity sales as a function of disposable income and degree days


Regression Analysis: SalesMwh versus YD87, Degreday
The regression equation is
SalesMwh = 317513 + 318 YD87 + 57.1 Degreday

Predictor Coef SE Coef T P


Constant 317513 60649 5.24 0.000
YD87 317.91 18.73 16.97 0.000
Degreday 57.06 33.70 1.69 0.095

S = 64613.5 R-Sq = 81.6% R-Sq(adj) = 81.0%

Analysis of Variance

Source DF SS MS F P
Regression 2 1.20353E+12 6.01767E+11 144.14 0.000
Residual Error 65 2.71369E+11 4174903356
Total 67 1.47490E+12

All else being equal, an increase in personal disposable income by one unit will increase
electricity sales by 318 mwh.
All else being equal, an increase in degree days by one unit will increase electricity sales
by 57.1 mwh.

12.13 a. mpg as a function of horsepower and weight


Regression Analysis: milpgal versus horspwr, weight
The regression equation is
milpgal = 55.8 - 0.105 horspwr - 0.00661 weight
150 cases used 5 cases contain missing values
Predictor Coef SE Coef T P
Constant 55.769 1.448 38.51 0.000
horspwr -0.10489 0.02233 -4.70 0.000
weight -0.0066143 0.0009015 -7.34 0.000

S = 3.901 R-Sq = 72.3% R-Sq(adj) = 72.0%


Analysis of Variance
Source DF SS MS F P
Regression 2 5850.0 2925.0 192.23 0.000
Residual Error 147 2236.8 15.2
Total 149 8086.8

All else being equal, a one unit increase in the horsepower of the engine will reduce fuel
mileage by .10489 mpg. All else being equal, an increase in the weight of the car by 100
pounds will reduce fuel mileage by .66143 mpg.

b. Add number of cylinders


Regression Analysis: milpgal versus horspwr, weight, cylinder
The regression equation is
milpgal = 55.9 - 0.117 horspwr - 0.00758 weight + 0.726 cylinder
150 cases used 5 cases contain missing values
Predictor Coef SE Coef T P
Constant 55.925 1.443 38.77 0.000
horspwr -0.11744 0.02344 -5.01 0.000
weight -0.007576 0.001066 -7.10 0.000
cylinder 0.7260 0.4362 1.66 0.098

S = 3.878 R-Sq = 72.9% R-Sq(adj) = 72.3%


Analysis of Variance
Source DF SS MS F P
Regression 3 5891.6 1963.9 130.62 0.000
Residual Error 146 2195.1 15.0
Total 149 8086.8
All else being equal, one additional cylinder in the engine of the auto will increase fuel
mileage by .726 mpg. Note that this is not significant at the .05 level (p-value = .098).
Horsepower and weight still have the expected negative signs.

c. mpg as a function of weight, number of cylinders


Regression Analysis: milpgal versus weight, cylinder
The regression equation is
milpgal = 55.9 - 0.0104 weight + 0.121 cylinder
154 cases used 1 cases contain missing values
Predictor Coef SE Coef T P
Constant 55.914 1.525 36.65 0.000
weight -0.0103680 0.0009779 -10.60 0.000
cylinder 0.1207 0.4311 0.28 0.780

S = 4.151 R-Sq = 68.8% R-Sq(adj) = 68.3%


Analysis of Variance
Source DF SS MS F P
Regression 2 5725.0 2862.5 166.13 0.000
Residual Error 151 2601.8 17.2
Total 153 8326.8
All else being equal, an increase in the weight of the car by 100 pounds will reduce fuel
mileage by 1.0368 mpg. All else being equal, an increase in the number of cylinders in
the engine will increase mpg by .1207 mpg.
The explanatory power of the models has stayed relatively the same with a slight drop in
explanatory power for the latest regression model.
Note that the coefficient on weight has stayed negative and significant (p-values of .000)
for all of the regression models; although the value of the coefficient has changed. The
number of cylinders is not significantly different from zero in either of the models where
it was used as an independent variable. There is likely some correlation between
cylinders and the weight of the car as well as between cylinders and the horsepower of
the car.

d. mpg as a function of horsepower, weight, price


Regression Analysis: milpgal versus horspwr, weight, price


The regression equation is
milpgal = 54.4 - 0.0938 horspwr - 0.00735 weight +0.000137 price
150 cases used 5 cases contain missing values
Predictor Coef SE Coef T P
Constant 54.369 1.454 37.40 0.000
horspwr -0.09381 0.02177 -4.31 0.000
weight -0.0073518 0.0008950 -8.21 0.000
price 0.00013721 0.00003950 3.47 0.001

S = 3.762 R-Sq = 74.5% R-Sq(adj) = 73.9%


Analysis of Variance
Source DF SS MS F P
Regression 3 6020.7 2006.9 141.82 0.000
Residual Error 146 2066.0 14.2
Total 149 8086.8
All else being equal, an increase by one unit in the horsepower of the auto will reduce
fuel mileage by .09381 mpg. All else being equal, an increase by 100 pounds in the
weight of the auto will reduce fuel mileage by .73518 mpg and an increase in the price of
the auto by one dollar will increase fuel mileage by .00013721 mpg.

e.
Horse power and weight remain significant negative independent variables throughout
whereas the number of cylinders has been insignificant. The size of the coefficients change
as the combinations of independent variables changes. This is likely due to strong
correlation that may exist between the independent variables.

12.14 a. Horsepower as a function of weight, cubic inches of displacement


Regression Analysis: horspwr versus weight, displace
The regression equation is
horspwr = 23.5 + 0.0154 weight + 0.157 displace
151 cases used 4 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 23.496 7.341 3.20 0.002
weight 0.015432 0.004538 3.40 0.001 6.0
displace 0.15667 0.03746 4.18 0.000 6.0

S = 13.64 R-Sq = 69.2% R-Sq(adj) = 68.8%


Analysis of Variance
Source DF SS MS F P
Regression 2 61929 30964 166.33 0.000
Residual Error 148 27551 186
Total 150 89480
All else being equal, a 100 pound increase in the weight of the car is associated with a
1.54 increase in horsepower of the auto.
All else being equal, a 10 cubic inch increase in the displacement of the engine is
associated with a 1.57 increase in the horsepower of the auto.

b. Horsepower as a function of weight, displacement, number of cylinders


Regression Analysis: horspwr versus weight, displace, cylinder
The regression equation is
horspwr = 16.7 + 0.0163 weight + 0.105 displace + 2.57 cylinder
151 cases used 4 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 16.703 9.449 1.77 0.079
weight 0.016261 0.004592 3.54 0.001 6.2
displace 0.10527 0.05859 1.80 0.074 14.8
cylinder 2.574 2.258 1.14 0.256 7.8

S = 13.63 R-Sq = 69.5% R-Sq(adj) = 68.9%


Analysis of Variance
Source DF SS MS F P
Regression 3 62170 20723 111.55 0.000
Residual Error 147 27310 186
Total 150 89480
All else being equal, a 100 pound increase in the weight of the car is associated with a
1.63 increase in horsepower of the auto.
All else being equal, a 10 cubic inch increase in the displacement of the engine is
associated with a 1.05 increase in the horsepower of the auto.
All else being equal, one additional cylinder in the engine is associated with a 2.57
increase in the horsepower of the auto.
Note that adding the independent variable number of cylinders has not added to the
explanatory power of the model. R square has increased marginally. Engine
displacement is no longer significant at the .05 level (p-value of .074) and the estimated
regression slope coefficient on the number of cylinders is not significantly different from
zero. This is due to the strong correlation that exists between cubic inches of engine
displacement and the number of cylinders.

c. Horsepower as a function of weight, displacement and fuel mileage


Regression Analysis: horspwr versus weight, displace, milpgal
The regression equation is
horspwr = 93.6 + 0.00203 weight + 0.165 displace - 1.24 milpgal
150 cases used 5 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 93.57 15.33 6.11 0.000
weight 0.002031 0.004879 0.42 0.678 8.3
displace 0.16475 0.03475 4.74 0.000 6.1
milpgal -1.2392 0.2474 -5.01 0.000 3.1

S = 12.55 R-Sq = 74.2% R-Sq(adj) = 73.6%


Analysis of Variance
Source DF SS MS F P
Regression 3 66042 22014 139.77 0.000
Residual Error 146 22994 157
Total 149 89036
All else being equal, a 100 pound increase in the weight of the car is associated with
a .203 increase in horsepower of the auto.
All else being equal, a 10 cubic inch increase in the displacement of the engine is
associated with a 1.6475 increase in the horsepower of the auto.
All else being equal, an increase in the fuel mileage of the vehicle by 1 mile per gallon is
associated with a reduction in horsepower of 1.2392.
Note that the negative coefficient on fuel mileage indicates the trade-off that is expected
between horsepower and fuel mileage. The displacement variable is significantly
positive, as expected, however, the weight variable is no longer significant. Again, one
would expect high correlation among the independent variables.

d. Horsepower as a function of weight, displacement, mpg and price


Regression Analysis: horspwr versus weight, displace, milpgal, price
The regression equation is


horspwr = 98.1 - 0.00032 weight + 0.175 displace - 1.32 milpgal + 0.000138 price
150 cases used 5 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 98.14 16.05 6.11 0.000
weight -0.000324 0.005462 -0.06 0.953 10.3
displace 0.17533 0.03647 4.81 0.000 6.8
milpgal -1.3194 0.2613 -5.05 0.000 3.5
price 0.0001379 0.0001438 0.96 0.339 1.3

S = 12.55 R-Sq = 74.3% R-Sq(adj) = 73.6%


Analysis of Variance
Source DF SS MS F P
Regression 4 66187 16547 105.00 0.000
Residual Error 145 22849 158
Total 149 89036
All else being equal, a 100 pound increase in the weight of the car is associated with a
reduction of .00324 in horsepower of the auto.
All else being equal, a 10 cubic inch increase in the displacement of the engine is
associated with a 1.7533 increase in the horsepower of the auto.
All else being equal, an increase in the fuel mileage of the vehicle by 1 mile per gallon is
associated with a reduction in horsepower of 1.3194.
All else being equal, an increase in the price of the auto by one dollar is associated with an
increase in horsepower of .0001379.
Engine displacement has a significant positive impact on horsepower, fuel mileage is
negatively related to horsepower and price is not significant.

e. Explanatory power has marginally increased from the first model to the last. The
estimated coefficient on price is not significantly different from zero. Displacement
and fuel mileage have the expected signs. The coefficient on weight has the wrong
sign; however, it is not significantly different from zero (p-value of .953).

12.15
From the given Analysis of Variance table,
SST = SSR + SSE;  s_e² = SSE/(n − K − 1);  R² = SSR/SST = 1 − SSE/SST;
adjusted R² = 1 − [SSE/(n − K − 1)]/[SST/(n − 1)];  n − 1 = (n − K − 1) + K

a. SSE = 150; s_e² = 150/18 = 8.33; s_e = 2.8868
b. SST = SSR + SSE = 200 + 150 = 350
c. adjusted R² = 1 − (150/18)/(350/26) = 0.3810
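The same arithmetic as a short Python sketch, using the quantities given in part (a):

# ANOVA identities for part (a): SSR = 200, SSE = 150, n - K - 1 = 18, n - 1 = 26
SSR, SSE, df_error, df_total = 200, 150, 18, 26

SST = SSR + SSE                                     # 350
s_e2 = SSE / df_error                               # 8.33
s_e = s_e2 ** 0.5                                   # 2.8868
R2 = SSR / SST                                      # 0.5714
R2_adj = 1 - (SSE / df_error) / (SST / df_total)    # 0.3810
print(SST, round(s_e, 4), round(R2_adj, 4))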

12.16 From the given Analysis of Variance table,
SST = SSR + SSE;  s_e² = SSE/(n − K − 1);  R² = SSR/SST = 1 − SSE/SST;
adjusted R² = 1 − [SSE/(n − K − 1)]/[SST/(n − 1)];  n − 1 = (n − K − 1) + K

a. SSE = 2,500; s_e² = 86.207; s_e = 9.2848

b. SST = SSR + SSE = 7,000 + 2,500 = 9,500

c. R² = .7368, adjusted R² = .7187

12.17 From the given Analysis of Variance table,
SST = SSR + SSE;  s_e² = SSE/(n − K − 1);  R² = SSR/SST = 1 − SSE/SST;
adjusted R² = 1 − [SSE/(n − K − 1)]/[SST/(n − 1)];  n − 1 = (n − K − 1) + K

a. SSE = 10,000; s_e² = 222.222; s_e = 14.9071

b. SST = SSR + SSE = 40,000 + 10,000 = 50,000

c. R² = .80, adjusted R² = .7822

12.18 From the given Analysis of Variance table,
SST = SSR + SSE;  s_e² = SSE/(n − K − 1);  R² = SSR/SST = 1 − SSE/SST;
adjusted R² = 1 − [SSE/(n − K − 1)]/[SST/(n − 1)];  n − 1 = (n − K − 1) + K

a. SSE = 15,000; s_e² = 75.0; s_e = 8.660

b. SST = SSR + SSE = 80,000 + 15,000 = 95,000

c. R² = .8421, adjusted R² = .8382

12.19
a. R² = SSR/SST = 3.474/3.991 = 0.8705; the coefficient of determination indicates that 87.05% of the variation in the dependent variable can be explained by the variation in the independent variables.
b. SSE = 3.991 − 3.474 = 0.517
c. adjusted R² = 1 − [SSE/(n − K − 1)]/[SST/(n − 1)] = 1 − [0.517/(15 − 3 − 1)]/[3.991/(15 − 1)] = 0.8351
d. R = √0.8705 = 0.9330. The correlation is said to be strong if its absolute value is greater than 0.70; thus, there is a strong correlation between the independent variables and the dependent variable.

12.20 a. R² = SSR/SST = .5441; therefore, 54.41% of the variability in milk consumption can be explained by the variations in weekly income and family size.


b.
c. R = √.5441 = .7376. This is the sample correlation between observed and predicted values of milk consumption.

12.21 a. R² = SSR/SST = .6331; therefore, 63.31% of the variability in weight gain can be explained by the variations in the average number of meals eaten, number of hours exercised, and number of beers consumed weekly.

b.
c. R = √.6331 = .7957. This is the sample correlation between observed and predicted values of weight gained.

12.22 a.
Regression Analysis: Y profit versus X2 offices
The regression equation is
Y profit = 1.55 -0.000120 X2 offices
Predictor Coef SE Coef T P
Constant 1.5460 0.1048 14.75 0.000
X2 offi -0.00012033 0.00001434 -8.39 0.000

S = 0.07049 R-Sq = 75.4% R-Sq(adj) = 74.3%


Analysis of Variance
Source DF SS MS F P
Regression 1 0.34973 0.34973 70.38 0.000
Residual Error 23 0.11429 0.00497
Total 24 0.46402

b.
Regression Analysis: X1 revenue versus X2 offices
The regression equation is
X1 revenue = - 0.078 +0.000543 X2 offices

Predictor Coef SE Coef T P


Constant -0.0781 0.2975 -0.26 0.795
X2 offi 0.00054280 0.00004070 13.34 0.000

S = 0.2000 R-Sq = 88.5% R-Sq(adj) = 88.1%


Analysis of Variance
Source DF SS MS F P
Regression 1 7.1166 7.1166 177.84 0.000
Residual Error 23 0.9204 0.0400
Total 24 8.0370

c.
Regression Analysis: Y profit versus X1 revenue
The regression equation is
Y profit = 1.33 - 0.169 X1 revenue
Predictor Coef SE Coef T P
Constant 1.3262 0.1386 9.57 0.000
X1 reven -0.16913 0.03559 -4.75 0.000

S = 0.1009 R-Sq = 49.5% R-Sq(adj) = 47.4%


Analysis of Variance
Source DF SS MS F P
Regression 1 0.22990 0.22990 22.59 0.000
Residual Error 23 0.23412 0.01018
Total 24 0.46402

d.
Regression Analysis: X2 offices versus X1 revenue
The regression equation is
X2 offices = 957 + 1631 X1 revenue
Predictor Coef SE Coef T P
Constant 956.9 476.5 2.01 0.057
X1 reven 1631.3 122.3 13.34 0.000

S = 346.8 R-Sq = 88.5% R-Sq(adj) = 88.1%


Analysis of Variance
Source DF SS MS F P
Regression 1 21388013 21388013 177.84 0.000
Residual Error 23 2766147 120267
Total 24 24154159

12.23 Given the regression results where the numbers in parentheses are the sample standard errors of the coefficient estimates
a. The two-sided 95% confidence intervals for the three regression slope coefficients are given by b_j ± t_(n−K−1, α/2) s_bj; t_(21, 0.025) = 2.080

95% CI for x1 = 21.8 ± 2.080(2.1) = 17.432 up to 26.168
95% CI for x2 = 23.7 ± 2.080(22.4) = −22.892 up to 70.292
95% CI for x3 = −15.4 ± 2.080(32.4) = −82.792 up to 51.992

b. Testing the hypothesis H0: β_j = 0; H1: β_j > 0

For x1: t = 21.8/2.1 = 10.381; t_(21, 0.05) = 1.721. Therefore, reject H0 at the 5% level.
For x2: t = 23.7/22.4 = 1.058; t_(21, 0.05) = 1.721. Therefore, do not reject H0 at the 5% level.
For x3: t = −15.4/32.4 = −0.475; −t_(21, 0.05) = −1.721. Therefore, do not reject H0 at the 5% level.
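The same confidence intervals and t tests as a Python sketch; scipy is used only for the t critical values, and the coefficients and standard errors are those quoted above:

from scipy import stats

df = 21                                          # n - K - 1 degrees of freedom
coefs = {"x1": (21.8, 2.1), "x2": (23.7, 22.4), "x3": (-15.4, 32.4)}

t_two_sided = stats.t.ppf(0.975, df)             # 2.080 for the 95% CI
t_one_sided = stats.t.ppf(0.95, df)              # 1.721 for H1: beta_j > 0

for name, (b, se) in coefs.items():
    lo, hi = b - t_two_sided * se, b + t_two_sided * se
    t_stat = b / se
    decision = "reject H0" if t_stat > t_one_sided else "do not reject H0"
    print(name, round(lo, 3), round(hi, 3), round(t_stat, 3), decision)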

12.24
Given the regression results where the numbers in parentheses are the sample standard errors of the coefficient estimates
a. The two-sided 95% confidence intervals for the three regression slope coefficients are given by b_j ± t_(n−K−1, α/2) s_bj; t_(37, 0.025) = 2.026

95% CI for x1 = 7.4 ± 2.026(1.8); 3.7532 up to 11.0468
95% CI for x2 = 3.2 ± 2.026(1.4); 0.3636 up to 6.0364
95% CI for x3 = 10.1 ± 2.026(4.3); 1.3882 up to 18.8118


b. Testing the hypothesis H0: β_j = 0; H1: β_j > 0

For x1: t = 7.4/1.8 = 4.111; t_(37, 0.05), t_(37, 0.01) = 1.687, 2.431
Therefore, reject H0 at the 5% level and at the 1% level

For x2: t = 3.2/1.4 = 2.286; t_(37, 0.05), t_(37, 0.01) = 1.687, 2.431
Therefore, reject H0 at the 5% level but not at the 1% level

For x3: t = 10.1/4.3 = 2.349; t_(37, 0.05), t_(37, 0.01) = 1.687, 2.431
Therefore, reject H0 at the 5% level but not at the 1% level
12.25 Given the regression results where the numbers in parentheses are the sample standard errors of the coefficient estimates
a. The two-sided 95% confidence intervals for the three regression slope coefficients are given by b_j ± t_(n−K−1, α/2) s_bj; t_(37, 0.025) = 2.026
95% CI for x1 = 7.4 ± 2.026(4.2); −1.1092 up to 15.9092
95% CI for x2 = 3.2 ± 2.026(0.7); 1.7818 up to 4.6182
95% CI for x3 = 10.1 ± 2.026(13.2); −16.6432 up to 36.8432
b. Testing the hypothesis H0: β_j = 0; H1: β_j > 0

For x1: t = 7.4/4.2 = 1.762; t_(37, 0.05), t_(37, 0.01) = 1.687, 2.431
Therefore, reject H0 at the 5% level but not at the 1% level

For x2: t = 3.2/0.7 = 4.571; t_(37, 0.05), t_(37, 0.01) = 1.687, 2.431
Therefore, reject H0 at the 5% level and at the 1% level

For x3: t = 10.1/13.2 = 0.765; t_(37, 0.05), t_(37, 0.01) = 1.687, 2.431
Therefore, do not reject H0 at either level

12.26 Given the regression results where the numbers in parentheses are the sample standard errors of the coefficient estimates
a. The two-sided 95% confidence intervals for the three regression slope coefficients are given by b_j ± t_(n−K−1, α/2) s_bj; t_(35, 0.025) = 2.030

95% CI for x1 = 17.8 ± 2.030(7.1); 3.387 up to 32.213
95% CI for x2 = 26.9 ± 2.030(13.7); −.911 up to 54.711
95% CI for x3 = −9.2 ± 2.030(3.8); −16.914 up to −1.486
b. Testing the hypothesis H0: β_j = 0; H1: β_j > 0

For x1: t = 17.8/7.1 = 2.507; t_(35, 0.05), t_(35, 0.01) = 1.690, 2.438
Therefore, reject H0 at the 5% level and at the 1% level

For x2: t = 26.9/13.7 = 1.964; t_(35, 0.05), t_(35, 0.01) = 1.690, 2.438
Therefore, reject H0 at the 5% level but not at the 1% level

For x3: t = −9.2/3.8 = −2.421; t_(35, 0.05), t_(35, 0.01) = 1.690, 2.438
Therefore, do not reject H0 at either level

12.27
Given b_j ± t_(n−K−1, α/2) s_bj

a. b1 = 0.631; s_b1 = 0.096; n = 23; t_(19, 0.05), t_(19, 0.025) = 1.729, 2.093

The 90% CI for the coefficient β1 = 0.631 ± 1.729(0.096) = 0.465 up to 0.797
The 95% CI for the coefficient β1 = 0.631 ± 2.093(0.096) = 0.430 up to 0.832

b. b2 = 0.066; s_b2 = 0.037; n = 23; t_(19, 0.025), t_(19, 0.005) = 2.093, 2.861

The 95% CI for the coefficient β2 = 0.066 ± 2.093(0.037) = −0.011 up to 0.143
The 99% CI for the coefficient β2 = 0.066 ± 2.861(0.037) = −0.040 up to 0.172

c. H0: β2 = 0; H1: β2 ≠ 0.
t = 0.066/0.037 = 1.784; t_(19, 0.05), t_(19, 0.025) = 1.729, 2.093. Therefore, reject H0 at the 10% level of significance; do not reject H0 at the 5% level.

d. H0: β1 = β2 = 0; H1: At least one β_i ≠ 0 for i = 1, 2.
F = [(SSE(R) − SSE)/(K − 1)] / [SSE/(n − K − 1)] = [(3.397 − 0.392)/2] / [0.392/(23 − 3 − 1)] = 72.825
Since the value of the test statistic, F, is greater than the critical value, F_(2, 19, 0.01), reject H0.
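The restricted-versus-full F statistic in part (d) as a short Python sketch; scipy supplies the critical value, and the sums of squares are those quoted above:

from scipy import stats

SSE_restricted, SSE_full = 3.397, 0.392     # error sums of squares quoted above
q, df_error = 2, 23 - 3 - 1                 # restrictions tested, n - K - 1

F = ((SSE_restricted - SSE_full) / q) / (SSE_full / df_error)   # about 72.8
F_crit = stats.f.ppf(0.99, q, df_error)                         # F(2, 19) critical value at the 1% level
print(round(F, 3), round(F_crit, 3), F > F_crit)                # reject H0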

12.28 a.

Therefore, reject at the 2.5% level but not at the 1% level


b.
90% CI: 1.14 ± 1.703(.35); .5439 up to 1.7361
95% CI: 1.14 ± 2.052(.35); .4218 up to 1.8582
99% CI: 1.14 ± 2.771(.35); .1701 up to 2.1099


12.29 a.

Therefore, reject at the 2.5% level but not at the 1% level


b.

Therefore, reject at the 1% level but not at the .5% level



c.
90% CI: .653 ± 1.721(.189); .3277 up to .9783
95% CI: .653 ± 2.080(.189); .2599 up to 1.0461
99% CI: .653 ± 2.831(.189); .1179 up to 1.1881

12.30 a.

= -1.337
Therefore, do not reject at the 20% level

b. At least one

, F 3,16,.01 = 5.29
Therefore, reject at the 1% level

12.31 a.
95% CI: 7.878 ± 1.988(1.809); 4.2817 up to 11.4743
99% CI: 7.878 ± 2.635(1.809); 3.1113 up to 12.6447

b. , ,
Therefore, reject at the .5% level

12.32 a. All else being equal, an extra $1 in mean per capita personal income leads to an
expected extra $.4 of net revenue per capita from the lottery

b.
95% CI: .8772 ± 2.064(.3107); .2359 up to 1.5185
Therefore the 95% confidence interval for the expected increase in the dollars of net
revenue per capita per year generated by the lottery resulting from a one-unit
increase in number of hotel, motel, inn, and resort rooms per thousand persons in the
country, if the other variables do not change, runs from .2359 up to 1.5185. Also,
since the 95% confidence interval does not include 0, we conclude that the
coefficient on x2 in the population regression is statistically significant.

c.

Copyright © 2020 Pearson Education Ltd.


.
Chapter 12: Multiple Regression 12-17

= -1.318, -1.711
Therefore, reject at the 10% level but not at the 5% level

12.33
a. b3 = 5.403; s_b3 = 3.833; n = 21; t_(17, 0.025) = 2.110
95% CI: 5.403 ± 2.110(3.833) = −2.685 up to 13.491

b. H0: β2 = 0; H1: β2 > 0; t = 23.799/10.056 = 2.367; t_(17, 0.05) = 1.740.
Therefore, reject H0, as there is sufficient evidence to support the claim that the higher the energy efficiency ratio, the higher the price.

12.34
a.
99% CI: .05146 ± 2.712(.01367); 0.014387 up to 0.088533
Therefore, the 99% confidence interval for the expected increase in the number of full-time firefighters resulting from a one-unit increase in the amount of intergovernmental grants per capita, if the other variables do not change, runs from 0.0144 up to 0.0885.
Also, since the 99% confidence interval for β5 does not include 0, we conclude that this variable is statistically significant.

b.

= 1.304
Therefore, do not reject at the 20% level. And conclude that the population density
is not statistically significant.

c.

= 2.024, 2.429
Therefore, reject at the 5% level but not at the 2% level. And conclude that the percentage of
the population that is male and between 12 and 21 years of age is statistically significant at the
5% level but not at the 2% level.

12.35
Testing the hypothesis that all three of the predictor variables are equal to zero for the given Analysis of Variance tables

a. H0: β1 = β2 = β3 = 0; H1: At least one β_j ≠ 0.
F = (SSR/K) / [SSE/(n − K − 1)] = MSR/s_e² = 1000/20 = 50; F_(3, 25, 0.05) = 2.99
Therefore, reject H0 at the 5% level of significance.

b. H0: β1 = β2 = β3 = 0; H1: At least one β_j ≠ 0.
F = (SSR/K) / [SSE/(n − K − 1)] = MSR/s_e² = 1500/23.8095 = 63; F_(3, 21, 0.05) = 3.07
Therefore, reject H0 at the 5% level of significance.

c. H0: β1 = β2 = β3 = 0; H1: At least one β_j ≠ 0.
F = (SSR/K) / [SSE/(n − K − 1)] = MSR/s_e² = 14000/900 = 15.56; F_(3, 30, 0.05) = 2.92
Therefore, reject H0 at the 5% level of significance.

d. H0: β1 = β2 = β3 = 0; H1: At least one β_j ≠ 0.
F = (SSR/K) / [SSE/(n − K − 1)] = MSR/s_e² = 3202/109.5238 = 29.24; F_(3, 21, 0.01) = 4.87
Therefore, reject H0 at the 1% level of significance.
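The overall F test in part (a) as a Python sketch (the other parts follow the same pattern):

from scipy import stats

MSR, s_e2 = 1000.0, 20.0                    # mean squares from part (a)
K, df_error = 3, 25                         # predictors and error degrees of freedom

F = MSR / s_e2                              # 50
F_crit = stats.f.ppf(0.95, K, df_error)     # F(3, 25) critical value at the 5% level, about 2.99
print(F, round(F_crit, 2), F > F_crit)      # reject H0: beta1 = beta2 = beta3 = 0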

12.36 a. SST = 3.881, SSR = 3.549, SSE = .332

H0: β1 = β2 = β3 = 0; H1: At least one β_j ≠ 0
F = (SSR/K) / [SSE/(n − K − 1)] = (3.549/3) / (.332/23) = 81.955; F_(3, 23, .01) = 4.76
Therefore, reject H0 at the 1% level

b. Analysis of Variance table:

Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-Ratio
Regression            3.549            3                    1.183          81.955
Error                 .332             23                   .014435
Total                 3.881            26

12.37 H0: β1 = β2 = 0; H1: At least one β_j ≠ 0
F > F_(2, 45, .01) = 5.18
Therefore, reject H0 at the 1% level

12.38 a. SST = 162.1, SSR = 88.2, SSE = 73.9

H0: β1 = β2 = 0; H1: At least one β_j ≠ 0
F = (SSR/K) / [SSE/(n − K − 1)] = (88.2/2) / (73.9/27) = 16.113; F_(2, 27, .01) = 5.49
Therefore, reject H0 at the 1% level

b. Analysis of Variance table:

Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-Ratio
Regression            88.2             2                    44.10          16.113
Error                 73.9             27                   2.737
Total                 162.1            29

12.39 a. SST = 125.1, SSR = 79.2, SSE = 45.9

H0: β1 = β2 = β3 = 0; H1: At least one β_j ≠ 0
F = (SSR/K) / [SSE/(n − K − 1)] = (79.2/3) / (45.9/21) = 12.078; F_(3, 21, .01) = 4.87
Therefore, reject H0 at the 1% level

b. Analysis of Variance table:

Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-Ratio
Regression            79.2             3                    26.4           12.078
Error                 45.9             21                   2.185714
Total                 125.1            24

12.40

= =

12.41
Let β3 be the coefficient on the number of teenagers in the household
H0: β3 = 0; H1: β3 ≠ 0
F = [(SSE(R) − SSE)/1] / s_e² = (91.3 − 72.6)/3.025 = 6.18; F_(1, 24, 0.01) = 7.82


Therefore, do not reject H0 at the 1% level of significance, as there is insufficient evidence that the number of teenagers in the household affects the consumption of bread.

12.42 a. =

= =

b. Since , then

c. =

= where

12.43
Given the estimated multiple regression equation,
a. ŷ = 4.2 + 5.3(10) − 4.4(23) + 6.8(9) − 0.8(12) = 7.6
b. ŷ = 4.2 + 5.3(23) − 4.4(18) + 6.8(10) − 0.8(11) = 106.1
c. ŷ = 4.2 + 5.3(15) − 4.4(16) + 6.8(5) − 0.8(0) = 47.3
d. ŷ = 4.2 + 5.3(−10) − 4.4(13) + 6.8(−8) − 0.8(−16) = −147.6

12.44 ŷ = 7.74 + 0.684(17) − 1.417(8) + 0.577(18) = 18.418 pounds

12.45 quarts of milk per week

12.46 million worker hours


12.47 a. All else being equal, a one square foot increase in the lot size is expected to increase
the selling price of the house by $1.468
b. 98.43% of the variation in the selling price of homes can be explained by the
variation in house size, lot size, number of bedrooms, and number of bathrooms

c. , , = 1.341, 1.753
Therefore, reject at the 10% level but not at the 5% level
d.

12.48 a. mpg as a function of horsepower and weight


Regression Analysis: milpgal versus horspwr, weight
The regression equation is
milpgal = 55.8 - 0.105 horspwr - 0.00661 weight
150 cases used 5 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 55.769 1.448 38.51 0.000
horspwr -0.10489 0.02233 -4.70 0.000 2.918
weight -0.0066143 0.0009015 -7.34 0.000 2.918

S = 3.901 R-Sq = 72.3% R-Sq(adj) = 72.0%

Analysis of Variance
Source DF SS MS F P
Regression 2 5850.0 2925.0 192.23 0.000
Residual Error 147 2236.8 15.2
Total 149 8086.8

Predicted Values for New Observations

New
Obs Fit SE Fit 95% CI 95% PI
1 21.242 0.975 (19.316, 23.168) (13.296, 29.188)X

X denotes a point that is an outlier in the predictors.

Values of Predictors for New Observations

New
Obs horspwr weight
1 140 3000


b. Adding the number of cylinders


Regression Analysis: milpgal versus horspwr, weight, cylinder
The regression equation is
milpgal = 55.9 - 0.117 horspwr - 0.00758 weight + 0.726 cylinder
150 cases used 5 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 55.925 1.443 38.77 0.000
horspwr -0.11744 0.02344 -5.01 0.000 3.254
weight -0.007576 0.001066 -7.10 0.000 4.131
cylinder 0.7260 0.4362 1.66 0.098 3.586

S = 3.878 R-Sq = 72.9% R-Sq(adj) = 72.3%

Analysis of Variance
Source DF SS MS F P
Regression 3 5891.6 1963.9 130.62 0.000
Residual Error 146 2195.1 15.0
Total 149 8086.8

Predicted Values for New Observations

New
Obs Fit SE Fit 95% CI 95% PI
1 21.112 0.972 (19.190, 23.033) (13.211, 29.012)

Values of Predictors for New Observations

New
Obs horspwr weight cylinder
1 140 3000 6.00

12.49
Computing values of y_i when x_i = 1, 2, 4, 6, 8, 10

x_i                         1      2       4        6        8        10
y_i = 2x_i^1.4              2      5.278   13.929   24.572   36.758   50.238
y_i = 2 + 6x_i − 1.4x_i²    6.6    8.4     3.6      −12.4    −39.6    −78
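A short Python sketch that reproduces both rows of the table; the quadratic is written with the −1.4x² term that the tabulated values imply:

# Compare the nonlinear model y = 2*x**1.4 with the quadratic y = 2 + 6*x - 1.4*x**2
xs = [1, 2, 4, 6, 8, 10]
power_row = [round(2 * x ** 1.4, 3) for x in xs]                   # 2.0, 5.278, 13.929, 24.572, 36.758, 50.238
quadratic_row = [round(2 + 6 * x - 1.4 * x ** 2, 1) for x in xs]   # 6.6, 8.4, 3.6, -12.4, -39.6, -78.0
print(power_row)
print(quadratic_row)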

12.50 Computing values of y_i when x_i = 1, 2, 4, 6, 8, 10

x_i                        1     2         4         6          8          10
y_i = 4x_i^1.8             4     13.9288   48.5029   100.6311   168.8970   252.3829
y_i = 1 + 2x_i + 2x_i²     5     13        41        85         145        221

12.51
Computing values of y_i when x_i = 1, 2, 4, 6, 8, 10

x_i                         1      2       4        6        8        10
y_i = 3x_i^1.2              3      6.892   15.834   25.757   36.377   47.547
y_i = 3 + 5x_i − 1.9x_i²    6.1    5.4     −7.4     −35.4    −78.6    −137

12.52 Computing values of y_i when x_i = 1, 2, 4, 6, 8, 10

x_i                         1      2        4         6         8         10
y_i = 3x_i^1.2              3      6.8922   15.8341   25.7574   36.3772   47.5468
y_i = 1 + 5x_i − 1.5x_i²    4.5    5        −3        −23       −55       −99

12.53 There are many possible answers. Relationships that can be approximated by a non-
linear quadratic model include many supply functions, production functions, and cost
functions including average cost versus the number of units produced.

12.54 To estimate the function with linear least squares, solve the equation for
. Since , plug into the equation and algebraically manipulate:

Conduct the variable transformations and estimate the model using least squares.

12.55
a. All else being equal, 1% increase in annual consumption expenditures will be
associated with a 1.1545% increase in expenditures on vacation travel.
All else being equal, a 1% increase in the size of the household will be associated with a
0.4468% decrease in expenditures on vacation travel.
b. 16.1% of the variation in vacation travel expenditures can be explained by the
variations in the log of total consumption expenditures and log of the number of
members in the household.
c. 95% CI: 1.1545 ± 1.96(0.0546) = 1.048 up to 1.261
d. H0: β2 = 0; H1: β2 < 0; t = −0.4468/0.0456 = −9.80; −t_(2330, 0.01) = −2.33. Therefore, reject H0 at the 1% level.

12.56 a. A 1% increase in median income leads to an expected .68% increase in store size.

b. , . Therefore, reject at the 1% level

12.57
a. All else being equal, a 1% increase in the price of fish will be associated with a
decrease of 0.536% in the tons of fish consumed annually in France.
b. All else being equal, a 1% increase in the price of potatoes will be associated with an increase of 0.208% in the tons of fish consumed annually in France.


c. , , = 2.485. Therefore, reject at the


1% level
d. At least one

, F 4,25,.01 = 4.18. Therefore, reject at the


1% level
e. If an important independent variable has been omitted, there may be specification
bias. The regression coefficients produced for the misspecified model would be
misleading.

12.58 Estimating a Cobb-Douglas production function with three independent variables:

Y = β0 X1^β1 X2^β2 X3^β3 ε, where X1 = capital, X2 = labor, and X3 = basic research
Taking the log of both sides of the equation yields:
log Y = log β0 + β1 log X1 + β2 log X2 + β3 log X3 + log ε
Using this form, regress the log of Y on the logs of the three independent variables and obtain the estimated regression slope coefficients.

Substituting in the restriction on the coefficients, β1 + β2 + β3 = 1, so that β3 = 1 − β1 − β2:

log Y − log X3 = log β0 + β1(log X1 − log X3) + β2(log X2 − log X3) + log ε

Thus, we see that the coefficients β1 and β2 are obtained by regressing log(Y/X3) on log(X1/X3) and log(X2/X3), and the coefficient β3 can be found by subtracting b1 and b2 from 1.0.
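A sketch of the restricted estimation in Python; the data below are hypothetical placeholders, and the restriction imposed is constant returns to scale (β1 + β2 + β3 = 1), which is what 'subtracting b1 and b2 from 1.0' implies:

import numpy as np

# Hypothetical Cobb-Douglas data: X1 = capital, X2 = labor, X3 = basic research
rng = np.random.default_rng(0)
X1, X2, X3 = rng.uniform(1, 10, (3, 50))
Y = 2.0 * X1 ** 0.4 * X2 ** 0.35 * X3 ** 0.25 * np.exp(rng.normal(0, 0.05, 50))

# Impose beta1 + beta2 + beta3 = 1 by regressing log(Y/X3) on log(X1/X3) and log(X2/X3)
y = np.log(Y / X3)
A = np.column_stack([np.ones(50), np.log(X1 / X3), np.log(X2 / X3)])
b0, b1, b2 = np.linalg.lstsq(A, y, rcond=None)[0]
b3 = 1.0 - b1 - b2
print(round(b1, 3), round(b2, 3), round(b3, 3))   # close to 0.4, 0.35, 0.25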

12.59 a. Coefficients for exponential models can be estimated by taking the logarithm
of both sides of the multiple regression model to obtain an equation that is linear in the
logarithms of the variables.

Substituting in the restrictions on the coefficients: ,



Thus, we see that the coefficient is obtained by regressing


. The coefficient can be found by subtracting
from 1.0. Likewise the coefficient can be estimated and the coefficient can be

found by subtracting from 1.0.

b. Constant elasticity for Y versus X4 is the regression slope coefficient on the X4 term of
the logarithm model.


12.60 Linear model:



Quadratic model:

Cubic model:

All three of the models appear to fit the data well. The cubic model appears to fit the data
the best as the standard error of the estimate is lowest. In addition, explanatory power is
marginally higher for the cubic model than the other models.


12.61
Results for: GermanImports.xls
Regression Analysis: LogYt versus LogX1t, LogX2t
The regression equation is
LogYt = - 4.07 + 1.36 LogX1t + 0.101 LogX2t
Predictor Coef SE Coef T P VIF
Constant -4.0709 0.3100 -13.13 0.000
LogX1t 1.35935 0.03005 45.23 0.000 4.9
LogX2t 0.10094 0.05715 1.77 0.088 4.9

S = 0.04758 R-Sq = 99.7% R-Sq(adj) = 99.7%

Analysis of Variance
Source DF SS MS F P
Regression 2 21.345 10.673 4715.32 0.000
Residual Error 28 0.063 0.002
Total 30 21.409

Source DF Seq SS
LogX1t 1 21.338
LogX2t 1 0.007

German real imports will continue to increase with real private consumption and the real exchange rate.

12.62
The model constants when the dummy variable equals 1 are
a. ŷ = 9 + 6x1; b0 = 18
b. ŷ = 7 + 4x1; b0 = 9
c. ŷ = 4 + 4x1; b0 = 12

12.63
The model constants and the slope coefficients of when the dummy variable equals 1 are
a. , b0 = 3.2, slope coefficient of = 13.5
b. , b0 = -3.3, slope coefficient of = 8.6
c. , b0 = 21.1, slope coefficient of = 7.1

12.64 The dummy variable x2 is equal to 1 for the year 1974 and 0 otherwise, to represent the specific effect of the oil embargo of that year. The interpretation of its coefficient is that, for a given difference between the spot price in the current year and the spot price in the previous year, the difference between the OPEC price in the current year and the OPEC price in the previous year is $5.22 higher in 1974, during the oil embargo, than in other years, holding the other factors constant.

12.65
a. All else being equal, expected selling price is higher by €3,054 if a car has an airbag.
b. All else being equal, expected selling price is higher by €1,969 if a car has a sunroof.
c. 95% CI: 3054 ± 1.984(738) = €1,590 up to €4,518
d. H0: β5 = 0; H1: β5 > 0; t = 1969/738 = 2.668; t_(100, 0.05) = 1.660; therefore, reject H0 at the 5% level of significance as the test statistic is greater than the critical value.

12.66 a. All else being equal, the price-earnings ratio is higher by 1.23 for a regional company
than a national company.

b. , , = 2.462, 2.756
Therefore, reject at the 2% level but not at the 1% level
c. At least one

, F 2,29,.05 = 3.33
Therefore, reject at the 5% level and conclude that at least one coefficient is not
equal to 0. Hence at least one predictor variable is a significant predictor of the price-
earnings ratio.

12.67 35.6% of the variation in overall performance in law school can be explained by the
variation in undergraduate GPA, scores on the LSATs, and whether the student's letters of


recommendation are unusually strong. The overall model is significant since we can
reject the null hypothesis that the model has no explanatory power in favor of the
alternative hypothesis that the model has significant explanatory power. The individual
regression coefficients that are significantly different from zero include the scores on the
LSAT and whether the student’s letters of recommendation were unusually strong. The
coefficient on undergraduate GPA was not found to be significant at the 5% level.

12.68
a. All else being equal, the annual salary of the attorney general is, on average, €5,787 higher if justices of the state supreme court can be removed from office by the governor, a judicial review board, or a majority vote of the supreme court.
b. All else being equal, the annual salary of the attorney general of the state is €2,630 lower if the supreme court justices are elected on partisan ballots.
c. H0: β5 = 0; H1: β5 > 0; t = 5787/2834 = 2.042; t_(27, 0.05) = 1.703. Therefore, reject H0 at the 5% level.
d. H0: β6 = 0; H1: β6 < 0; t = −2630/1630 = −1.613; −t_(27, 0.05) = −1.703. Therefore, do not reject H0 at the 5% level.
e. t_(27, 0.025) = 2.052
95% CI: 570 ± 2.052(134.9) = 293.185 up to 846.815
Since this confidence interval does not include 0, it can be stated with 95% confidence that there is a positive relationship between this independent variable and y.

12.69 a. All else being equal, the average rating of a course is 6.21 units higher if a guest
visiting lecturer is brought in than if otherwise.

b. , , = 1.725
Therefore, reject at the 5% level
c. 56.9% of the variation in the average course rating can be explained by the variation in
the percentage of time spent in group discussions, the dollars spent on preparing the
course materials, the dollars spent on food and drinks, and whether a guest lecturer is
brought in.
At least one

F 4,20,.01 = 4.43
Therefore, reject at the 1% level
d. t_(20, 0.025) = 2.086
95% CI: .52 ± 2.086(.21); .0819 up to .9581
Therefore, the 95% confidence interval for the expected increase in the average
course rating resulting from a one dollar increase of money spent on preparing the
course material, if the other variables do not change, runs from .0819 to .9581. Also,
since the interval does not include zero, the coefficient is statistically significant.

12.70 34.4% of the variation in a test on understanding business statistics can be explained by
which course was taken, the student’s GPA, the teacher that taught the course, the gender
of the student, the pre-test score, the number of credit hours completed, and the age of the
student. The regression model has significant explanatory power:
At least one

Therefore, reject at the 1% level and conclude that at least one coefficient is not
equal to 0. Hence at least one predictor variable is a significant predictor of the business
statistics test.

12.71
Results for: Student Performance.xls
Regression Analysis: Y versus X1, X2, X3, X4, X5
The regression equation is
Y = 2.00 + 0.0099 X1 + 0.0763 X2 - 0.137 X3 + 0.064 X4 + 0.138 X5

Predictor Coef SE Coef T P VIF


Constant 1.997 1.273 1.57 0.132
X1 0.00990 0.01654 0.60 0.556 1.3
X2 0.07629 0.05654 1.35 0.192 1.2
X3 -0.13652 0.06922 -1.97 0.062 1.1
X4 0.0636 0.2606 0.24 0.810 1.4
X5 0.13794 0.07521 1.83 0.081 1.1

S = 0.5416 R-Sq = 26.5% R-Sq(adj) = 9.0%

Analysis of Variance
Source DF SS MS F P
Regression 5 2.2165 0.4433 1.51 0.229
Residual Error 21 6.1598 0.2933
Total 26 8.3763

The model is not significant (p-value of the F-test = .229). The model only explains 26.5%
of the variation in GPA with the hours spent studying, hours spent preparing for tests, hours
spent in bars, whether or not students take notes or mark highlights when reading texts, and
the average number of credit hours taken per semester. The only independent variables that
are marginally significant (10% level but not the 5% level) include number of hours spent
in bars and average number of credit hours. The other independent variables are not
significant at common levels of alpha.

12.72 a. Begin the analysis with the correlation matrix – identify important independent
variables as well as correlations between the independent variables
Correlations: Salary, Experience, yearsenior, Gender_1F
Salary Experien yearseni
Experien 0.883
0.000


yearseni 0.777 0.674


0.000 0.000

Gender_1 -0.429 -0.378 -0.292


0.000 0.000 0.000

Regression Analysis: Salary versus Experience, yearsenior, Gender_1F


The regression equation is
Salary = 22644 + 437 Experience + 415 yearsenior - 1443 Gender_1F
Predictor Coef SE Coef T P VIF
Constant 22644.1 521.8 43.40 0.000
Experien 437.10 31.41 13.92 0.000 2.0
yearseni 414.71 55.31 7.50 0.000 1.8
Gender_1 -1443.2 519.8 -2.78 0.006 1.2

S = 2603 R-Sq = 84.9% R-Sq(adj) = 84.6%

Analysis of Variance
Source DF SS MS F P
Regression 3 5559163505 1853054502 273.54 0.000
Residual Error 146 989063178 6774405
Total 149 6548226683

84.9% of the variation in annual salary (in dollars) can be explained by the variation in the years of experience, the years of seniority, and the gender of the employee. All of the variables are significant at the .01 level of significance (p-values of .000, .000, and .006, respectively). The F-test of the significance of the overall model shows that we reject the null hypothesis that all of the slope coefficients are jointly equal to zero in favor of the alternative that at least one slope coefficient is not equal to zero. The F-test yielded a p-value of .000.

b. Let β3 be the coefficient on Gender_1F. H0: β3 = 0; H1: β3 < 0

From the output, t = −1443.2/519.8 = −2.78; −t_(146, 0.01) = −2.326
Therefore, reject H0 at the 1% level and conclude that annual salaries for females are statistically significantly lower than they are for males.

c. Adding an interaction term and testing for the significance of the slope coefficient on the interaction term.

Adding an interaction term with Gender_1F, the regression equation is obtained as:
Salary = 23336 + 388 Experience + 467 yearsenior − 12468 Gender_1F + 0.425 interaction term

Comparing the t statistic on the interaction term with the critical values −t_0.10, −t_0.05 = −1.282, −1.645:
Therefore, do not reject H0 at either level, and conclude that the rate of salary increase for females is not statistically significantly lower than for males at either level.
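A sketch of how the interaction model in part (c) could be fit with statsmodels; the data file name is an assumption, and the interaction shown here is Experience × Gender_1F, which is one way to let the salary growth rate differ by gender:

import pandas as pd
import statsmodels.formula.api as smf

# Assumed file name; column names follow the output above
data = pd.read_csv("salary_data.csv")

# Gender_1F is a 0/1 dummy; Experience:Gender_1F lets the experience slope differ by gender
model = smf.ols("Salary ~ Experience + yearsenior + Gender_1F + Experience:Gender_1F",
                data=data).fit()
print(model.summary())   # inspect the t statistic on the interaction term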

12.73
Two variables are included as predictor variables. Following is the effect on the estimated
slope coefficients when these two variables have a correlation equal to
a. 0.91. A very strong linear relationship between the two variables will have a major effect
on the estimated slope coefficients.
b. 0.38. The linear relationship between the two variables is moderately weak, which will
have a slight effect on the estimated slope coefficients.
c. -0.64. This indicates a moderately strong linear relationship between the two variables
and will have a moderate effect on the estimated slope coefficients.
d. -0.11. It shows a very weak linear relationship between the two variables, which will
have almost no effect on the estimated slope coefficients.

12.74 n = 34 and four independent variables. r = .23.


Correlation between the independent variable and the dependent variable is not
necessarily evidence of a small Student’s t statistic. A high correlation among the
independent variables could result in a very small Student’s t statistic as the correlation
creates a high variance.

12.75
n = 58 with four independent variables. One of the independent variables has a
correlation of 0.48 with the dependent variable.
Correlation between the independent variable and the dependent variable is not
necessarily evidence of a large Student’s t statistic. A high correlation among the
independent variables could result in a very small Student’s t statistic as the correlation
creates a high variance.

12.76 n = 49 with two independent variables. One of the independent variables has a correlation
of .56 with the dependent variable.
Correlation between the independent variable and the dependent variable is not
necessarily evidence of a small Student’s t statistic. A high correlation among the
independent variables could result in a very small Student’s t statistic as the correlation
creates a high variance.

12.77–12.79 Reports can be written by following the extended Case Study on the data file
Cotton – see Section 12.9
12.77
51.5% of the variation in the ratio of the company's payments for state and local taxes to total state and local tax revenue can be explained by the variation in the insurance company state concentration ratio, per capita income, the ratio of nonfarm income to the sum of farm and nonfarm income, the ratio of the insurance company's net after-tax income to insurance reserves, and the average of insurance reserves. The only independent variables that are significant (at the 5% level and, in fact, at all common levels of alpha) are the ratio of nonfarm income to the sum of farm and nonfarm income (X3) and the average of insurance reserves (X5). The other independent variables are not significant at common levels of alpha.

12.78
31.2% of the variation in the overall opinion of residence hall can be explained by the
variation in the satisfaction with roommates, with floor, with hall, and with resident
advisor. The F-test of the significance of the overall model shows that we reject the null hypothesis that all of the slope coefficients are jointly equal to zero in favor of the alternative that at least one slope coefficient is not equal to zero. The F-test yielded a p-value of .000.
All of the variables are significant at the .1 level of significance but not at the .05 level of
significance except the variable – satisfaction with resident advisor. The variable –
satisfaction with resident advisor is significant at all common levels of alpha.

12.79
73% of the variation in the commercial paper certificate of deposit rate less commercial
paper rate can be explained by the variation in commercial paper rate, and ratio of loans
and investments to capital. The two independent variables are significant at the .1 and .05
levels of significance. Commercial paper rate is significant in explaining commercial
paper certificate of deposit rate less commercial paper rate at all common levels of alpha.

12.80 Begin the analysis by selecting all the variables, then delete, stepwise, the predictor variables that are not significant. The final regression model includes the predictor variables per capita disposable income and percent of population in urban areas.

Regression Analysis: Fat Rate versus Percap Disp, P Urban

The regression equation is


Fat Rate = 2.80 - 0.000033 Percap Disp - 0.00631 P Urban
Predictor Coef SE Coef T P VIF
Constant 2.7998 0.2971 9.42 0.000
Percap Disp -0.00003325 0.00001225 -2.72 0.009 1.456
P Urban -0.006312 0.003588 -1.76 0.085 1.456

S = 0.321204 R-Sq = 32.4% R-Sq(adj) = 29.6%

Analysis of Variance

Source DF SS MS F P
Regression 2 2.3760 1.1880 11.51 0.000
Residual Error 48 4.9523 0.1032
Total 50 7.3282
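A sketch of the backward-elimination idea described above, using statsmodels; the data file name is an assumption, and the column names follow the output shown:

import pandas as pd
import statsmodels.api as sm

data = pd.read_csv("state_data.csv")        # assumed file name
y = data["Fat Rate"]
predictors = ["Percap Disp", "P Urban"]     # in practice, start from the full candidate list

model = sm.OLS(y, sm.add_constant(data[predictors])).fit()

# Backward step: drop the least significant predictor while its p-value exceeds 0.10
while model.pvalues.drop("const").max() > 0.10 and len(predictors) > 1:
    predictors.remove(model.pvalues.drop("const").idxmax())
    model = sm.OLS(y, sm.add_constant(data[predictors])).fit()
print(model.summary())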

12.81
Regression Analysis: Female Emplo versus Percap Disp, Male Unemplo, ...
The regression equation is
Female Employ = 62.3 + 0.000421 Percap Disp_1 - 0.507 Male Unemploy
- 0.000146 Mfg Pcap - 1.34 Female Unemploy

Predictor Coef SE Coef T P VIF


Constant 62.277 5.221 11.93 0.000
Percap Disp_1 0.0004207 0.0001167 3.61 0.001 1.200
Male Unemploy -0.5069 0.7380 -0.69 0.496 4.019
Mfg Pcap -0.0001458 0.0001028 -1.42 0.163 1.161
Female Unemploy -1.3395 0.8016 -1.67 0.101 3.981

S = 3.46364 R-Sq = 43.2% R-Sq(adj) = 38.2%

Analysis of Variance
Source DF SS MS F P
Regression 4 419.30 104.82 8.74 0.000
Residual Error 46 551.85 12.00
Total 50 971.15
43.2% of the variation in the percentage of females in the labor force can be explained by the variation in per capita disposable personal income, the percentage of males unemployed, the manufacturing payroll per worker, and the unemployment rate of women. The only independent variable that is statistically significant (at the 10% level and, in fact, at all common levels of alpha) is per capita disposable personal income (X1). The other independent variables are not significant at common levels of alpha. Also, the F-test of the significance of the overall model shows that we reject the null hypothesis that all of the slope coefficients are jointly equal to zero in favor of the alternative that at least one slope coefficient is not equal to zero. The F-test yielded a p-value of .000.

12.82
Regression Analysis: y_manufgrowt versus x1_aggrowth, x2_exportgro, ...
The regression equation is
y_manufgrowth = 2.15 + 0.493 x1_aggrowth + 0.270 x2_exportgrowth
- 0.117 x3_inflation
Predictor Coef SE Coef T P VIF
Constant 2.1505 0.9695 2.22 0.032
x1_aggro 0.4934 0.2020 2.44 0.019 1.0
x2_expor 0.26991 0.06494 4.16 0.000 1.0
x3_infla -0.11709 0.05204 -2.25 0.030 1.0

S = 3.624 R-Sq = 39.3% R-Sq(adj) = 35.1%


Analysis of Variance
Source DF SS MS F P
Regression 3 373.98 124.66 9.49 0.000
Residual Error 44 577.97 13.14
Total 47 951.95

Source DF Seq SS
x1_aggro 1 80.47
x2_expor 1 227.02
x3_infla 1 66.50

39.3% of the variation in the manufacturing growth can be explained by the variation in
agricultural growth, exports growth, and rate of inflation.
The F-test of the significance of the overall model shows that we reject the null hypothesis that all of the slope coefficients are jointly equal to zero in favor of the alternative that at least one slope coefficient is not equal to zero. The F-test yielded a p-value of .000.


All the independent variables are significant at the .1 and .05 levels of significance.
Exports growth is significant in explaining the manufacturing growth at all common
levels of alpha.
All else being equal, the agricultural growth and exports growth variables have the expected signs: as agricultural growth and exports growth increase, manufacturing growth also increases. The coefficient on the rate of inflation is negative, which indicates that the higher the inflation rate, the lower the manufacturing growth.

12.83 The method of least squares in regression analysis yields estimators that are BLUE –
Best Linear Unbiased Estimators. This result holds when the assumptions regarding the
behavior of the error term are true. BLUE estimators are the most efficient (best)
estimators out of the class of all unbiased estimators. The advent of computing power
incorporating the method of least squares has dramatically increased its use.

12.84 The analysis of variance table identifies how the total variability of the dependent
variable (SST) is split up between the portion of variability that is explained by the
regression model (SSR) and the part that is unexplained (SSE). The Coefficient of
Determination (R2) is derived as the ratio of SSR to SST. The analysis of variance table
also computes the F statistic for the test of the significance of the overall regression –
whether all of the slope coefficients are jointly equal to zero. The associated p-value is also
generally reported in this table.

12.85 a. False – If the regression model does not explain a large enough portion of the
variability of the dependent variable, then the error sum of squares can be larger than
the regression sum of squares.
b. False – The sum of several simple linear regressions will not equal a multiple regression
since the assumption of ‘all else being equal’ will be violated in the simple linear regressions.
The multiple regression ‘holds’ ‘all else being equal’ in calculating the partial effect that a
change in one of the independent variables has on the dependent variable.
c. True
d. False – While the regular coefficient of determination (R2) cannot be negative, the
adjusted coefficient of determination can become negative. If the independent
variables added into a regression equation have very little explanatory power, the loss
of degrees of freedom may more than offset the added explanatory power.
e. True

12.86 If one model contains more explanatory variables, then SST remains the same for both
models but SSR will be higher for the model with more explanatory variables. Since SST
= SSR1 + SSE1 which is equivalent to SSR2 + SSE2 and given that SSR2 > SSR1, then SSE1
> SSE2. Hence, the coefficient of determination will be higher with a greater number of
explanatory variables and the coefficient of determination must be interpreted in
conjunction with whether or not the regression slope coefficients on the explanatory
variables are significantly different from zero.
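
Exercises 12.85(d) and 12.86 both turn on how R2 and the adjusted R2 respond to added
regressors. A minimal sketch (not part of the original solution) of the adjustment
R-bar2 = 1 − (1 − R2)(n − 1)/(n − K − 1), checked against the 12.82 ANOVA figures and then
applied to a weak model to show how the adjusted value can become negative:

# Sketch: adjusted R-squared from R-squared, sample size n, and number of regressors K
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2_1282 = 373.98 / 951.95                     # SSR / SST from the 12.82 ANOVA table
print(round(adjusted_r2(r2_1282, 48, 3), 3))  # about 0.351, matching R-Sq(adj) = 35.1%
print(round(adjusted_r2(0.02, 20, 5), 3))     # a weak model: the adjusted value is negative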

12.87 This is a classic example of what happens when there is a high degree of correlation
between the independent variables. The overall model can be shown to have significant
explanatory power and yet none of the slope coefficients on the independent variables are
significantly different from zero. This is due to the effect that high correlation among the
independent variables has on the variance of the estimated slope coefficients.
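
The situation described here is easy to reproduce. The sketch below is illustrative only, using
simulated data rather than the exercise's data: two nearly collinear predictors give a highly
significant overall F-test, individual t-tests that are typically weak, and very large variance
inflation factors.

# Sketch: high correlation between predictors inflates coefficient variances (simulated data)
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)      # nearly collinear with x1
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()
print(fit.f_pvalue)                            # overall model: highly significant
print(fit.pvalues[1:])                         # individual slopes: often not significant
print([variance_inflation_factor(X, i) for i in (1, 2)])   # VIFs in the hundreds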


12.88

12.89 a. All else being equal, a one-unit increase in population, industry size, measure of economic
quality, measure of political quality, measure of environmental quality, measure of health
and educational quality, and measure of social life results in a change of 4.983, 2.198,
3.816, -.310, -.886, 3.215, and .085, respectively, in the expected new business starts in the industry.
b. R2 = .766: 76.6% of the variability in new business starts in the industry can be
explained by the variability in the independent variables: population, industry size, and
economic, political, environmental, health and educational, and social quality of life.
c. t62,.05 = 1.67; therefore, the 90% CI runs from .3708 up to 7.2612

d. t62,.025 = ±1.999; therefore, do not reject H0 at the 5% level

e. t62,.025 = ±1.999; therefore, reject H0 at the 5% level
f. H0: β1 = β2 = … = β7 = 0; H1: at least one βj ≠ 0.
F7,62,.01 = 2.79; therefore, reject H0 at the 1% level

12.90 a. All else being equal, an increase of one question results in a decrease of 1.834 in
expected percentage of responses received. All else being equal, an increase in one
word in length of the questionnaire results in a decrease of .016 in expected percentage
of responses received.
b. 63.7% of the variability in the percentage of responses received can be explained by the
variability in the number of questions asked and the number of words.
c. H0: β1 = β2 = 0; H1: at least one βj ≠ 0.
F2,27,.01 = 5.49; therefore, reject H0 at the 1% level



d. t27,.005 = 2.771, 99% CI: -1.8345 ± 2.771(.6349); -3.5938 up to -.0752


Therefore, the 99% confidence interval for the expected increase in the percentage of
responses received resulting from an increase of one question, if the number of words
do not change, runs from -3.5938 to -.0752. Also, since the interval does not include
zero, the coefficient of number of questions asked is statistically significant.
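
The interval above can be checked with scipy using only the printed coefficient, its standard
error, and the error degrees of freedom (27); a minimal sketch:

# Sketch: 99% CI for the 'number of questions' coefficient in 12.90(d)
from scipy import stats

b, se, df = -1.8345, 0.6349, 27
t_crit = stats.t.ppf(1 - 0.01 / 2, df)      # about 2.771
print(b - t_crit * se, b + t_crit * se)     # about -3.594 up to -0.075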

e. t = -1.78, t27,.05 = -1.703, t27,.025 = -2.052.


Therefore, reject at the 5% level but not at the 2.5% level. And we conclude that the
length of questionnaire in number of words is statistically significant in explaining the
percentage of responses received at the 5% level but not at the 2.5% level.

12.91 a. All else being equal, a 1% increase in course time spent in group discussions
results in an expected increase of .3817 in the average rating of the course. All else
being equal, a dollar increase in money spent on the preparation of subject matter
materials results in an expected increase of .5172 in the average rating by participants
of the course. All else being equal, a dollar increase in expenditure on non-course-
related materials results in an expected increase of .0753 in the average rating of the
course.
b. 57.9% of the variation in the average rating of the course can be explained by the linear
relationship with % of class time spent on discussions, money spent on the preparation
of subject matter materials, and money spent on non-class-related materials.
c. H0: β1 = β2 = β3 = 0; H1: at least one βj ≠ 0.
F3,21,.05 = 3.07
Therefore, reject H0 at the 5% level
d. t21,.05 = 1.721, 90% CI: .3817 ± 1.721(.2018); .0344 up to .729
Therefore, the 90% confidence interval for the expected increase in the average rating
of the course resulting from a 1% increase of class time spent on discussions, if the
other variables do not change, runs from .0344 up to .729. Also, since the interval
does not include zero, the coefficient is statistically significant.

e. t = 2.64, t21,.01 = 2.518, t21,.005 = 2.831


Therefore, reject at the 1% level but not at the .5% level. And we conclude that the
money spent on the preparation of subject matter materials is statistically significant in
explaining the average rating of the course at the 1% level but not at the .5% level.

f. t = 1.09, t21,.05 = 1.721.


Therefore, do not reject at the 10% level. And we conclude that the expenditure on
non-course-related materials is not statistically significant in explaining the average rating
of the course at the 10% level.

12.92
Regression Analysis: y_rating versus x1_expgrade, x2_Numstudents

The regression equation is


y_rating = - 0.200 + 1.41 x1_expgrade - 0.0158 x2_Numstudents
Predictor Coef SE Coef T P VIF
Constant -0.2001 0.6968 -0.29 0.777
x1_expgr 1.4117 0.1780 7.93 0.000 1.5
x2_Numst -0.015791 0.003783 -4.17 0.001 1.5

S = 0.1866 R-Sq = 91.5% R-Sq(adj) = 90.5%


Analysis of Variance
Source DF SS MS F P
Regression 2 6.3375 3.1687 90.99 0.000
Residual Error 17 0.5920 0.0348
Total 19 6.9295

91.5% of the variation in the rating can be explained by the linear dependence on the
expected grade and the number of students in the class.
The F-test of the significance of the overall model shows that we reject the null hypothesis that
all of the slope coefficients are jointly equal to zero in favor of the alternative that at least one
slope coefficient is not equal to zero. The F-test yielded a p-value of .000.
The coefficients of expected grade and number of students are significant at common
levels of alpha.
All else being equal, a 1 point increase in the expected grade is associated with a 1.41
increase in the rating. The coefficient on the number of students is negative, which indicates
that the higher the student count, the lower the rating.

12.93
H0: β1 = β2 = … = β5 = 0; H1: at least one βj ≠ 0.
F5,55,.01 = 3.37
Therefore, reject H0 at the 1% level
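
The critical value used here can be verified directly (a check, not part of the original solution):

# Sketch: F critical value for 12.93, upper 1% point of F(5, 55)
from scipy import stats
print(round(stats.f.ppf(0.99, 5, 55), 2))   # about 3.37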

12.94 a. All else being equal, each extra point in a student’s expected score leads to an
expected increase of .469 in the actual score.
b. t103,.025 = 1.96; therefore, the 95% CI runs from 2.4752 up to 4.2628
Therefore, the 95% confidence interval for the expected increase in a student’s actual
score resulting from an increase of 1 hour time spent on the course, if the other
variables do not change, runs from 2.4752 up to 4.2628. Also, since the interval does
not include zero, the coefficient is statistically significant.

c. t103,.025 = 1.96, t103,.005 = 2.58
Therefore, reject at the 5% level but not at the 1% level. And we conclude that a
student’s grade point average is statistically significant in explaining a student’s actual
score in the examination at the 5% level but not at the 1% level.

d. 68.6% of the variation in the exam scores is explained by their linear dependence on a
student’s expected score, hours per week spent working on the course, and a student’s
grade point average.
e. H0: β1 = β2 = β3 = 0; H1: at least one βj ≠ 0.
F3,103,.01 = 3.95
Reject H0 at all common levels of alpha
f. R = √.686 = .828. This is the sample correlation between the observed and the
predicted values of a student's actual scores in the examination.
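
The statement in part f rests on the identity that, in a least squares fit with an intercept, the
correlation between the observed and fitted values equals √R2. A small illustration on
simulated data (not the exercise's data):

# Sketch: corr(y, y-hat) equals the square root of R-squared in an OLS fit with intercept
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = sm.add_constant(rng.normal(size=(100, 3)))
y = X @ np.array([1.0, 0.5, -0.8, 0.3]) + rng.normal(size=100)

fit = sm.OLS(y, X).fit()
print(round(np.corrcoef(y, fit.fittedvalues)[0, 1], 4))   # sample correlation of y and y-hat
print(round(np.sqrt(fit.rsquared), 4))                    # identical value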

g.

12.95 a. t22,.005 = 2.819; therefore, the 99% CI runs from .0368 up to .1580
Therefore, the 99% confidence interval for the expected % increase in the real deposit
rate resulting from a 1 unit increase of real per capita income, if the real interest rate do
not change, runs from .0368 up to .1580. Also, since the interval does not include zero,
the coefficient is statistically significant.

b. t22,.05 = 1.717, t22,.025 = 2.074.
Therefore, reject H0 at the 5% level but not at the 2.5% level

c.
d. H0: β1 = β2 = 0; H1: at least one βj ≠ 0.
F2,22,.01 = 5.72
Reject H0 at all common levels of alpha
e. The square root of R2 gives the sample correlation between the observed and the
predicted values of the change in the real deposit rate.


12.96 a. t2669,.05 = 1.645; therefore, the 90% CI runs from 110.0795 up to 850.0005
Therefore, the 90% confidence interval for the expected increase in the minutes played
in the season resulting from a 1 unit increase of steals per minute, if the other variables
do not change, runs from 110.0795 up to 850.0005. Also, since the interval does not
include zero, the coefficient is statistically significant.
b. t2669,.005 = 2.576; therefore, the 99% CI runs from 803.4152 up to 1897.1848
Therefore, the 99% confidence interval for the expected increase in the minutes played
in the season resulting from a 1 unit increase of blocked shots per minute, if the other
variables do not change, runs from 803.4152 up to 1897.1848. Also, since the interval
does not include zero, the coefficient is statistically significant.

c. t2669,.005 = -2.576; therefore, reject H0 at the .5% level. And we conclude that the
turnovers per minute is statistically significant in explaining the minutes played in the
season at all common levels of alpha.

d. t2669,.005 = 2.576; therefore, reject H0 at the .5% level. And we conclude that the
assists per minute is statistically significant in explaining the minutes played in the
season at all common levels of alpha.

e. 52.39% of the variability in minutes played in the season can be explained by the
variability in all 9 variables.
f. R = √.5239 = .724. This is the sample correlation between the observed and the
predicted values of the minutes played in season.

12.97 a. The critical values at the 1% and .5% levels are 2.66 and 2.915, respectively; therefore,
reject H0 at the 1% level but not at the .5% level. And we conclude that the real income per capita is statistically
significant in explaining the growth rate in real GDP at the 1% level but not at the .5%
level.

b. The critical value at the 20% level is 1.296; therefore, do not reject H0 at the 20% level. And we conclude that the
average tax rate, as a proportion of GNP is not statistically significant in explaining the
growth rate in real GDP at any common level of alpha.
c. 17% of the variation in the growth rate in GDP can be explained by the variations in
real income per capita and the average tax rate, as a proportion of GNP.
d. R = √.17 = .412. This is the sample correlation between the observed and the
predicted values of the growth rate in GDP.

12.98 57.3% of the variation in the female amateur golfers' winnings per tournament can be
explained by variations in average length of drive, percentage times drive ends, percentage
times green reached, percentage times par saved after hitting into sand trap, average
number of putts taken on greens reached, average number of putts taken on greens not
reached in regulation, and the number of years the golfer has played.

The independent variables that are significant (at the 10% level and, in fact, at all common levels of
alpha) include average length of drive (X1) and average number of putts taken on greens
reached in regulation (X5). The independent variable average number of putts taken on
greens not reached in regulation (X6) is significant at the 10% level but not at the 5% level.
The other independent variables are not significant at common levels of alpha.

The F-test of the significance of the overall model shows that we reject the null hypothesis that
all slope coefficients are jointly equal to zero in favor of the alternative that at least one slope
coefficient is not equal to zero. The F-test yielded a p-value of .000.
The sample correlation between the observed and the predicted values of the female
amateur golfers' winnings per tournament is .7572.

A report can be written by following the Cotton Case Study and testing the significance of the
model. See section 12.9

12.99 a. Starting with the correlation matrix:


Correlations: EconGPA, SATverb, SATmath, HSPct
EconGPA SATverb SATmath
SATverb 0.478
0.000

SATmath 0.427 0.353


0.000 0.003

HSPct 0.362 0.201 0.497


0.000 0.121 0.000

Regression Analysis: EconGPA versus SATverb, SATmath, HSPct


The regression equation is
EconGPA = 0.612 + 0.0239 SATverb + 0.0117 SATmath + 0.00530 HSPct
61 cases used 51 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 0.6117 0.4713 1.30 0.200
SATverb 0.023929 0.007386 3.24 0.002 1.2
SATmath 0.011722 0.007887 1.49 0.143 1.5
HSPct 0.005303 0.004213 1.26 0.213 1.3


S = 0.4238 R-Sq = 32.9% R-Sq(adj) = 29.4%


Analysis of Variance
Source DF SS MS F P
Regression 3 5.0171 1.6724 9.31 0.000
Residual Error 57 10.2385 0.1796
Total 60 15.2556

Source DF Seq SS
SATverb 1 3.7516
SATmath 1 0.9809
HSPct 1 0.2846

The regression model indicates positive coefficients, as expected, for all three
independent variables. The greater the high school rank, and the higher the SAT verbal
and SAT math scores, the larger the Econ GPA. The high school rank variable has the
smallest t-statistic and is removed from the model:

Regression Analysis: EconGPA versus SATverb, SATmath


The regression equation is
EconGPA = 0.755 + 0.0230 SATverb + 0.0174 SATmath
67 cases used 45 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 0.7547 0.4375 1.72 0.089
SATverb 0.022951 0.006832 3.36 0.001 1.1
SATmath 0.017387 0.006558 2.65 0.010 1.1

S = 0.4196 R-Sq = 30.5% R-Sq(adj) = 28.3%


Analysis of Variance
Source DF SS MS F P
Regression 2 4.9488 2.4744 14.05 0.000
Residual Error 64 11.2693 0.1761
Total 66 16.2181

Source DF Seq SS
SATverb 1 3.7109
SATmath 1 1.2379

Both SAT variables are now statistically significant at the .05 level and appear to pick up
separate influences on the dependent variable. The simple correlation coefficient
between SAT math and SAT verbal is relatively low at .353. Thus, multicollinearity will
not be dominant in this regression model.

The final regression model, with conditional t-statistics in parentheses under the
coefficients, is:

EconGPA = 0.755 + 0.0230 SATverb + 0.0174 SATmath
                  (3.36)           (2.65)
S = .4196   R2 = .305   n = 67
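
For readers working in Python rather than Minitab, a sketch of the same fit is below. It assumes
the student data are available in a file with column names matching the printout (EconGPA,
SATverb, SATmath, HSPct); the file name is only a placeholder. The formula interface drops rows
with missing values by default, mirroring the "cases contain missing values" note above.

# Sketch: refit the EconGPA models with statsmodels (hypothetical file name)
import pandas as pd
import statsmodels.formula.api as smf

students = pd.read_csv("student_scores.csv")   # placeholder; data set not specified here

full = smf.ols("EconGPA ~ SATverb + SATmath + HSPct", data=students).fit()
print(full.tvalues)                            # HSPct has the smallest t-statistic; drop it

reduced = smf.ols("EconGPA ~ SATverb + SATmath", data=students).fit()
print(reduced.params, reduced.rsquared)        # compare with the Minitab output above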

b. Starting with the correlation matrix:


Correlations: EconGPA, Acteng, ACTmath, ACTss, ACTcomp, HSPct
EconGPA Acteng ACTmath ACTss ACTcomp
Acteng 0.387
0.001

ACTmath 0.338 0.368



0.003 0.001

ACTss 0.442 0.448 0.439


0.000 0.000 0.000

ACTcomp 0.474 0.650 0.765 0.812


0.000 0.000 0.000 0.000

HSPct 0.362 0.173 0.290 0.224 0.230


0.000 0.150 0.014 0.060 0.053


Regression Analysis: EconGPA versus Acteng, ACTmath, ...


The regression equation is
EconGPA = - 0.207 + 0.0266 Acteng - 0.0023 ACTmath + 0.0212 ACTss
+ 0.0384 ACTcomp + 0.0128 HSPct
71 cases used 41 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant -0.2069 0.6564 -0.32 0.754
Acteng 0.02663 0.02838 0.94 0.352 2.2
ACTmath -0.00229 0.03031 -0.08 0.940 4.2
ACTss 0.02118 0.02806 0.75 0.453 4.6
ACTcomp 0.03843 0.07287 0.53 0.600 12.7
HSPct 0.012817 0.005271 2.43 0.018 1.2

S = 0.5034 R-Sq = 31.4% R-Sq(adj) = 26.1%


Analysis of Variance
Source DF SS MS F P
Regression 5 7.5253 1.5051 5.94 0.000
Residual Error 65 16.4691 0.2534
Total 70 23.9945

Source DF Seq SS
Acteng 1 3.5362
ACTmath 1 1.0529
ACTss 1 1.4379
ACTcomp 1 0.0001
HSPct 1 1.4983

The regression shows that only high school rank is significant at the .05 level. We may
suspect multicollinearity between the variables, particularly since there is a ‘total’ ACT
score (ACT composite) as well as the components that make up the ACT composite.
Since conditional significance is dependent on which other independent variables are
included in the regression equation, drop one variable at a time. ACTmath has the lowest
t-statistic and is removed:

Regression Analysis: EconGPA versus Acteng, ACTss, ACTcomp, HSPct


The regression equation is
EconGPA = - 0.195 + 0.0276 Acteng + 0.0224 ACTss + 0.0339 ACTcomp
+ 0.0127 HSPct
71 cases used 41 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant -0.1946 0.6313 -0.31 0.759
Acteng 0.02756 0.02534 1.09 0.281 1.8
ACTss 0.02242 0.02255 0.99 0.324 3.0
ACTcomp 0.03391 0.04133 0.82 0.415 4.2
HSPct 0.012702 0.005009 2.54 0.014 1.1

S = 0.4996 R-Sq = 31.4% R-Sq(adj) = 27.2%


Analysis of Variance
Source DF SS MS F P
Regression 4 7.5239 1.8810 7.54 0.000
Residual Error 66 16.4706 0.2496
Total 70 23.9945

Source DF Seq SS
Acteng 1 3.5362
ACTss 1 2.1618
ACTcomp 1 0.2211
HSPct 1 1.6048

Again, high school rank is the only conditionally significant variable. ACTcomp has the
lowest t-statistic and is removed:

Regression Analysis: EconGPA versus Acteng, ACTss, HSPct


The regression equation is
EconGPA = 0.049 + 0.0390 Acteng + 0.0364 ACTss + 0.0129 HSPct
71 cases used 41 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 0.0487 0.5560 0.09 0.930
Acteng 0.03897 0.02114 1.84 0.070 1.3
ACTss 0.03643 0.01470 2.48 0.016 1.3
HSPct 0.012896 0.004991 2.58 0.012 1.1

S = 0.4983 R-Sq = 30.7% R-Sq(adj) = 27.6%


Analysis of Variance
Source DF SS MS F P
Regression 3 7.3558 2.4519 9.87 0.000
Residual Error 67 16.6386 0.2483
Total 70 23.9945

Source DF Seq SS
Acteng 1 3.5362
ACTss 1 2.1618
HSPct 1 1.6579

Now ACTss and high school rank are conditionally significant. ACTenglish has a t-
statistic less than 2 and is removed:

Regression Analysis: EconGPA versus ACTss, HSPct


The regression equation is
EconGPA = 0.566 + 0.0479 ACTss + 0.0137 HSPct
71 cases used 41 cases contain missing values
Predictor Coef SE Coef T P VIF
Constant 0.5665 0.4882 1.16 0.250
ACTss 0.04790 0.01355 3.53 0.001 1.1
HSPct 0.013665 0.005061 2.70 0.009 1.1

S = 0.5070 R-Sq = 27.1% R-Sq(adj) = 25.0%


Analysis of Variance
Source DF SS MS F P
Regression 2 6.5123 3.2562 12.67 0.000
Residual Error 68 17.4821 0.2571
Total 70 23.9945

Source DF Seq SS
ACTss 1 4.6377
HSPct 1 1.8746

Both of the independent variables are statistically significant at the .05 level and hence,
the final regression model, with conditional t-statistics in parentheses under the
coefficients, is:

EconGPA = 0.566 + 0.0479 ACTss + 0.0137 HSPct
                  (3.53)         (2.70)
S = .5070   R2 = .271   n = 71


c. The regression model with the SAT variables is the better predictor because the
standard error of the estimate is smaller than for the ACT model (.4196 vs. .5070). The
R2 measure cannot be directly compared due to the sample size differences.
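
The two standard errors of the estimate compared here are S = √(SSE/(n − K − 1)); a quick check
from the ANOVA tables above:

# Sketch: standard error of the estimate for the SAT and ACT models in 12.99(c)
import math
print(round(math.sqrt(11.2693 / 64), 4))   # SAT model: about 0.4196
print(round(math.sqrt(17.4821 / 68), 4))   # ACT model: about 0.5070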

12.100
Correlations: hseval, Comper, Homper, Indper, sizehse, incom72
hseval Comper Homper Indper sizehse
Comper -0.335
0.001

Homper 0.145 -0.499


0.171 0.000

Indper -0.086 -0.140 -0.564


0.419 0.188 0.000

sizehse 0.542 -0.278 0.274 -0.245


0.000 0.008 0.009 0.020

incom72 0.426 -0.198 -0.083 0.244 0.393


0.000 0.062 0.438 0.020 0.000
The correlation matrix indicates that the size of the house, income, and percent homeowners
have a positive relationship with house value. There is a negative relationship between the
percent industrial and percent commercial and the house value.

Regression Analysis: hseval versus Comper, Homper, ...


The regression equation is
hseval = - 19.0 - 26.4 Comper - 12.1 Homper - 15.5 Indper + 7.22 sizehse
+ 0.00408 incom72
Predictor Coef SE Coef T P VIF
Constant -19.02 13.20 -1.44 0.153
Comper -26.393 9.890 -2.67 0.009 2.2
Homper -12.123 7.508 -1.61 0.110 3.0
Indper -15.531 8.630 -1.80 0.075 2.6
sizehse 7.219 2.138 3.38 0.001 1.5
incom72 0.004081 0.001555 2.62 0.010 1.4

S = 3.949 R-Sq = 40.1% R-Sq(adj) = 36.5%


Analysis of Variance
Source DF SS MS F P
Regression 5 876.80 175.36 11.25 0.000
Residual Error 84 1309.83 15.59
Total 89 2186.63

All variables are conditionally significant with the exception of Indper and Homper.
Since Homper has the smaller t-statistic, it is removed:

Regression Analysis: hseval versus Comper, Indper, sizehse, incom72


The regression equation is
hseval = - 30.9 - 15.2 Comper - 5.73 Indper + 7.44 sizehse + 0.00418
incom72
Predictor Coef SE Coef T P VIF
Constant -30.88 11.07 -2.79 0.007
Comper -15.211 7.126 -2.13 0.036 1.1
Indper -5.735 6.194 -0.93 0.357 1.3
sizehse 7.439 2.154 3.45 0.001 1.5
incom72 0.004175 0.001569 2.66 0.009 1.4

S = 3.986 R-Sq = 38.2% R-Sq(adj) = 35.3%


Analysis of Variance
Source DF SS MS F P
Regression 4 836.15 209.04 13.16 0.000
Residual Error 85 1350.48 15.89
Total 89 2186.63

Indper is not significant and is removed:

Regression Analysis: hseval versus Comper, sizehse, incom72


The regression equation is
hseval = - 34.2 - 13.9 Comper + 8.27 sizehse + 0.00364 incom72
Predictor Coef SE Coef T P VIF
Constant -34.24 10.44 -3.28 0.002
Comper -13.881 6.974 -1.99 0.050 1.1
sizehse 8.270 1.957 4.23 0.000 1.2
incom72 0.003636 0.001456 2.50 0.014 1.2

S = 3.983 R-Sq = 37.6% R-Sq(adj) = 35.4%

Analysis of Variance
Source DF SS MS F P
Regression 3 822.53 274.18 17.29 0.000
Residual Error 86 1364.10 15.86
Total 89 2186.63

This becomes the final regression model. The three independent variables are
conditionally significant in explaining house value at the .05 level and hence, the final
regression model, with coefficient standard errors in parentheses under the coefficients, is:

hseval = -34.2 - 13.9 Comper + 8.27 sizehse + 0.00364 incom72
                (6.97)        (1.96)         (0.00146)

S = 3.983   R2 = .376   n = 90

The selection of a community with the objective of having larger house values would
include communities where the percent of commercial property is low, the median rooms
per residence is high and the per capita income is high.


12.101 a. Correlation matrix:


Correlations: deaths, vehwt, impcars, lghttrks, carage
deaths vehwt impcars lghttrks
vehwt 0.244
0.091

impcars -0.284 -0.943


0.048 0.000

lghttrks 0.726 0.157 -0.175


0.000 0.282 0.228

carage -0.422 0.123 0.011 -0.329


0.003 0.400 0.943 0.021
Crash deaths are positively related to vehicle weight and percentage of light trucks and
negatively related to percent imported cars and car age. Light trucks will have the
strongest linear association of any independent variable followed by car age.
Multicollinearity is likely to exist due to the strong correlation between impcars and
vehicle weight.

b.
Regression Analysis: deaths versus vehwt, impcars, lghttrks, carage
The regression equation is
deaths = 2.60 +0.000064 vehwt - 0.00121 impcars + 0.00833 lghttrks
- 0.0395 carage
Predictor Coef SE Coef T P VIF
Constant 2.597 1.247 2.08 0.043
vehwt 0.0000643 0.0001908 0.34 0.738 10.9
impcars -0.001213 0.005249 -0.23 0.818 10.6
lghttrks 0.008332 0.001397 5.96 0.000 1.2
carage -0.03946 0.01916 -2.06 0.045 1.4

S = 0.05334 R-Sq = 59.5% R-Sq(adj) = 55.8%


Analysis of Variance
Source DF SS MS F P
Regression 4 0.183634 0.045909 16.14 0.000
Residual Error 44 0.125174 0.002845
Total 48 0.308809
Light trucks is a significant positive variable. Since impcars has the smallest t-statistic, it
is removed from the model:

Regression Analysis: deaths versus vehwt, lghttrks, carage


The regression equation is
deaths = 2.55 +0.000106 vehwt + 0.00831 lghttrks - 0.0411 carage
Predictor Coef SE Coef T P VIF
Constant 2.555 1.220 2.09 0.042
vehwt 0.00010622 0.00005901 1.80 0.079 1.1
lghttrks 0.008312 0.001380 6.02 0.000 1.2
carage -0.04114 0.01754 -2.34 0.024 1.2

S = 0.05277 R-Sq = 59.4% R-Sq(adj) = 56.7%

Analysis of Variance
Source DF SS MS F P
Regression 3 0.183482 0.061161 21.96 0.000
Residual Error 45 0.125326 0.002785
Total 48 0.308809

Also, remove vehicle weight using the same argument:

Regression Analysis: deaths versus lghttrks, carage


The regression equation is
deaths = 2.51 + 0.00883 lghttrks - 0.0352 carage
Predictor Coef SE Coef T P VIF
Constant 2.506 1.249 2.01 0.051
lghttrks 0.008835 0.001382 6.39 0.000 1.1
carage -0.03522 0.01765 -2.00 0.052 1.1

S = 0.05404 R-Sq = 56.5% R-Sq(adj) = 54.6%

Analysis of Variance
Source DF SS MS F P
Regression 2 0.174458 0.087229 29.87 0.000
Residual Error 46 0.134351 0.002921
Total 48 0.308809

The model has light trucks and car age as the significant variables. Note that car age is
marginally significant (p-value of .052) and hence could also be dropped from the
model.

c. The regression modeling indicates that the percentage of light trucks is conditionally
significant in all of the models and hence is an important predictor in the model. Car
age and imported cars are marginally significant predictors when only light trucks is
included in the model.

12.102 a. Correlation matrix:


Correlations: deaths, Prurpop, Ruspeed, Prsurf
deaths Prurpop Ruspeed
Prurpop 0.594
0.000

Ruspeed 0.305 0.224


0.033 0.121

Prsurf -0.556 -0.207 -0.232


0.000 0.153 0.109

Descriptive Statistics: deaths, Prurpop, Prsurf, Ruspeed


Variable N Mean Median TrMean StDev SE Mean
deaths 49 0.1746 0.1780 0.1675 0.0802 0.0115
Prurpop 49 0.4110 0.3689 0.5992 0.2591 0.0370
Prsurf 49 0.7980 0.8630 0.8117 0.1928 0.0275
Ruspeed 49 58.186 58.400 58.222 1.683 0.240

Variable Minimum Maximum Q1 Q3


deaths 0.0569 0.5505 0.1240 0.2050
Prurpop 0.0311 1.0000 0.1887 0.5915
Prsurf 0.2721 1.0000 0.6563 0.9485
Ruspeed 53.500 62.200 57.050 59.150

The proportion of urban population is positively related to crash deaths, while the proportion of
rural roads that are surfaced is negatively related. Average rural speed is positively related, but the
relationship is not as strong as those of urban population and surfaced roads. The simple correlation
coefficients among the independent variables are relatively low and hence
multicollinearity should not be dominant in this model. Note the relatively narrow range
for average rural speed. This would indicate that there is not much variability in this
independent variable.

b. Multiple regression
Regression Analysis: deaths versus Prurpop, Prsurf, Ruspeed
The regression equation is
deaths = -0.0086 - 0.149 Prurpop - 0.181 Prsurf + 0.00457 Ruspeed

Predictor Coef SE Coef T P VIF


Constant -0.0086 .2943 -0.029 0.977
Purbanpo -0.14946 0.03192 -4.68 0.000 1.1
Prsurf -0.18058 0.04299 -4.20 0.000 1.1
Ruspeed 0.004569 0.004942 0.92 0.360 1.1

S = 0.05510 R-Sq = 55.8% R-Sq(adj) = 52.8%

Analysis of Variance
Source DF SS MS F P
Regression 3 0.172207 0.057402 18.91 0.000
Residual Error 45 0.136602 0.003036
Total 48 0.308809

The model has conditionally significant variables for percent urban population and
percent surfaced roads. Since average rural speed is not conditionally significant, it is
dropped from the model:

Regression Analysis: deaths versus Prurpop, Prsurf


The regression equation is
deaths = 0.26116 + 0.155 Prurpop - 0.188 Prsurf
Predictor Coef SE Coef T P VIF
Constant 0.26116 0.03919 6.66 0.000
Prurpop 0.15493 0.03132 4.95 0.000 1.0
Prsurf -0.18831 0.04210 -4.47 0.000 1.0

S = 0.05501 R-Sq = 54.9% R-Sq(adj) = 53.0%

Analysis of Variance
Source DF SS MS F P
Regression 2 0.169612 0.084806 28.03 0.000
Residual Error 46 0.139197 0.003026
Total 48 0.308809

This becomes the final model since both variables are conditionally significant.

c. Conclude that the proportion of urban population and the percent of rural roads that
are surfaced are important independent variables in explaining crash deaths. All else
being equal, the higher the proportion of urban population, the higher the crash
deaths. All else being equal, increases in the proportion of rural roads that are
surfaced will result in lower crash deaths. The average rural speed is not conditionally
significant.

12.103 a. Correlation matrix and descriptive statistics


Correlations: hseval, sizehse, Taxhse, Comper, incom72, totexp
hseval sizehse Taxhse Comper incom72
sizehse 0.542
0.000
Taxhse 0.248 0.289
0.019 0.006
Comper -0.335 -0.278 -0.114
0.001 0.008 0.285
incom72 0.426 0.393 0.261 -0.198
0.000 0.000 0.013 0.062
totexp 0.261 -0.022 0.228 0.269 0.376
0.013 0.834 0.030 0.010 0.000

The correlation matrix shows that multicollinearity is not likely to be a problem in this
model since all of the correlations among the independent variables are relatively low.

Descriptive Statistics: hseval, sizehse, Taxhse, Comper, incom72, totexp


Variable N Mean Median TrMean StDev SE Mean
hseval 90 21.031 20.301 20.687 4.957 0.522
sizehse 90 5.4778 5.4000 5.4638 0.2407 0.0254
Taxhse 90 130.13 131.67 128.31 48.89 5.15
Comper 90 0.16211 0.15930 0.16206 0.06333 0.00668
incom72 90 3360.9 3283.0 3353.2 317.0 33.4
totexp 90 1488848 1089110 1295444 1265564 133402

Variable Minimum Maximum Q1 Q3


hseval 13.300 35.976 17.665 24.046
sizehse 5.0000 6.2000 5.3000 5.6000
Taxhse 35.04 399.60 98.85 155.19
Comper 0.02805 0.28427 0.11388 0.20826
incom72 2739.0 4193.0 3114.3 3585.3
totexp 361290 7062330 808771 1570275

The range for applying the regression model (variable means +/- 2 standard deviations; checked in the sketch below):
Hseval 21.03 +/- 2(4.957) = 11.12 to 30.94
Sizehse 5.48 +/- 2(.24) = 5.0 to 5.96
Taxhse 130.13 +/- 2(48.89) = 32.35 to 227.91
Comper .16 +/- 2(.063) = .034 to .286
Incom72 3361 +/- 2(317) = 2727 to 3995
Totexp 1488848 +/- 2(1265564) = not a good approximation
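
A minimal sketch of the range calculation above, assuming the data are in a DataFrame with the
column names shown in the descriptive statistics (the file name is only a placeholder):

# Sketch: variable means +/- 2 standard deviations (hypothetical file name)
import pandas as pd

city = pd.read_csv("citydata.csv")   # placeholder; data set not specified here
cols = ["hseval", "sizehse", "Taxhse", "Comper", "incom72", "totexp"]
summary = city[cols].agg(["mean", "std"]).T
summary["low"] = summary["mean"] - 2 * summary["std"]
summary["high"] = summary["mean"] + 2 * summary["std"]
print(summary[["low", "high"]])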


b. Regression models:
Regression Analysis: hseval versus sizehse, Taxhse, ...
The regression equation is
hseval = - 31.1 + 9.10 sizehse - 0.00058 Taxhse - 22.2 Comper + 0.00120 incom72
+ 0.000001 totexp

Predictor Coef SE Coef T P VIF


Constant -31.07 10.09 -3.08 0.003
sizehse 9.105 1.927 4.72 0.000 1.3
Taxhse -0.000584 0.008910 -0.07 0.948 1.2
Comper -22.197 7.108 -3.12 0.002 1.3
incom72 0.001200 0.001566 0.77 0.445 1.5
totexp 0.00000125 0.00000038 3.28 0.002 1.5

S = 3.785 R-Sq = 45.0% R-Sq(adj) = 41.7%

Analysis of Variance
Source DF SS MS F P
Regression 5 982.98 196.60 13.72 0.000
Residual Error 84 1203.65 14.33
Total 89 2186.63

Taxhse is not conditionally significant, nor is income; however, dropping one variable at a time,
eliminate Taxhse first, then eliminate income:

Regression Analysis: hseval versus sizehse, Comper, totexp


The regression equation is
hseval = - 29.9 + 9.61 sizehse - 23.5 Comper + 0.000001 totexp
Predictor Coef SE Coef T P VIF
Constant -29.875 9.791 -3.05 0.003
sizehse 9.613 1.724 5.58 0.000 1.1
Comper -23.482 6.801 -3.45 0.001 1.2
totexp 0.00000138 0.00000033 4.22 0.000 1.1

S = 3.754 R-Sq = 44.6% R-Sq(adj) = 42.6%

Analysis of Variance
Source DF SS MS F P
Regression 3 974.55 324.85 23.05 0.000
Residual Error 86 1212.08 14.09
Total 89 2186.63

This is the final regression model. All of the independent variables are conditionally
significant.
Both the size of the house and total government expenditures enhance the market value of
homes, while the percent of commercial property tends to reduce the market value of homes.

c. In the final regression model, the tax variable was not found to be conditionally
significant and hence it is difficult to support the developer’s claim.

12.104 a. Correlation matrix


Correlations: Retsales, Unemploy, PerInc
Retsales Unemploy
Unemploy -0.454
0.001

PerInc -0.098 -0.031


0.495 0.828

There is a negative association between the dependent and the independent variables.
High correlation among the independent variables does not appear to be a problem since
the correlation between the independent variables is low.

Descriptive Statistics: Retsales, Unemploy, PerInc


Variable N Mean SE Mean TrMean StDev Minimum Q1 Median
Retsales 51 14.099 0.292 14.071 2.087 6.817 12.883 13.817
Unemploy 51 5.341 0.172 5.333 1.230 2.900 4.400 5.400
PerInc 51 32161 800 31645 5712 24317 28172 31029

Variable Q3 Maximum
Retsales 15.015 20.307
Unemploy 6.400 8.300
PerInc 34867 53448

Regression Analysis: Retsales versus Unemploy, PerInc


The regression equation is
Retsales = 19.6 - 0.777 Unemploy - 0.000041 PerInc

Predictor Coef SE Coef T P VIF


Constant 19.565 1.941 10.08 0.000
Unemploy -0.7768 0.2166 -3.59 0.001 1.001
PerInc -0.00004097 0.00004664 -0.88 0.384 1.001

S = 1.88261 R-Sq = 21.9% R-Sq(adj) = 18.6%

Analysis of Variance

Source DF SS MS F P
Regression 2 47.674 23.837 6.73 0.003
Residual Error 48 170.122 3.544
Total 50 217.796

The 95% confidence intervals for the regression slope coefficients:

b1 (Unemploy): -.7768 +/- 2.011(.2166) = -.7768 +/- .436

b2 (PerInc): -.000041 +/- 2.011(.000047) = -.000041 +/- .000095
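
These intervals use t48,.025 ≈ 2.011 with the printed coefficients and standard errors; a minimal check:

# Sketch: 95% confidence intervals for the two slope coefficients in 12.104
from scipy import stats

t_crit = stats.t.ppf(0.975, 48)   # about 2.011
for b, se in [(-0.7768, 0.2166), (-0.00004097, 0.00004664)]:
    print(b - t_crit * se, b + t_crit * se)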

b. All things equal, the conditional effect of a $1,000 decrease in per capita income on retail
sales would be to improve retail sales by $.041.


c. Adding state population as a predictor yields the following regression results:

Regression Analysis: Retsales versus Unemploy, PerInc, Population


The regression equation is
Retsales = 19.4 - 0.732 Unemploy - 0.000038 PerInc - 0.000025 Population

Predictor Coef SE Coef T P VIF


Constant 19.384 1.979 9.80 0.000
Unemploy -0.7318 0.2313 -3.16 0.003 1.126
PerInc -0.00003822 0.00004720 -0.81 0.422 1.011
Population -0.00002476 0.00004244 -0.58 0.562 1.133

S = 1.89568 R-Sq = 22.5% R-Sq(adj) = 17.5%

Analysis of Variance

Source DF SS MS F P
Regression 3 48.897 16.299 4.54 0.007
Residual Error 47 168.899 3.594
Total 50 217.796

The population variable is not conditionally significant and adds little explanatory power,
therefore, it will not improve the multiple regression model.
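
The same conclusion can be reached with a nested-model F test. A sketch, assuming a DataFrame
with the columns Retsales, Unemploy, PerInc, and Population (the file name is only a placeholder):

# Sketch: does adding Population improve the retail sales model? (hypothetical file name)
import pandas as pd
import statsmodels.formula.api as smf

sales = pd.read_csv("state_retail.csv")   # placeholder; data set not specified here
reduced = smf.ols("Retsales ~ Unemploy + PerInc", data=sales).fit()
full = smf.ols("Retsales ~ Unemploy + PerInc + Population", data=sales).fit()

f_stat, p_value, df_diff = full.compare_f_test(reduced)
print(f_stat, p_value)   # a large p-value indicates Population adds little explanatory power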

12.105

a. Final regression model 1 to predict residential investment using prime interest rate, GDP,
Money supply, and Price index for finished goods:

Regression Analysis: Residential Inve versus Prime Interest R, GDP, ...


The regression equation is
Residential Investment = - 165 - 7.17 Prime Interest Rate + 0.0981 GDP
- 0.149 Money Supply + 0.258 Price index_Finished goods

208 cases used, 48 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant -164.82 19.66 -8.38 0.000
Prime Interest Rate -7.173 1.314 -5.46 0.000 1.484
GDP 0.09813 0.01044 9.40 0.000 99.559
Money Supply -0.148772 0.007976 -18.65 0.000 43.043
Price index_Finished goods 0.25766 0.04401 5.85 0.000 110.172

S = 49.6967 R-Sq = 88.6% R-Sq(adj) = 88.4%

Analysis of Variance

Source DF SS MS F P
Regression 4 3903137 975784 395.09 0.000
Residual Error 203 501362 2470
Total 207 4404499

This will be the final model with prime rate as the interest rate variable since all of the
independent variables except government spending are conditionally significant. Note the
significant multicollinearity that exists between the independent variables.

The variables, prime interest rate and money supply, are negatively related whereas the variables,
GDP and price index, have the expected sign as the dependent variable, residential investment.
The standard error of estimate of 49.7 indicates a significant variation between observed and
predicted values.

Final regression model 2 to predict residential investment using Federal funds interest rate, GDP,
Money supply, and Price index for finished goods:

Regression Analysis: Residential Inve versus Fed Funds Rate, GDP, ...
The regression equation is
Residential Investment = - 170 - 6.81 Fed Funds Rate + 0.0922 GDP
- 0.151 Money Supply + 0.283 Price index_Finished goods

208 cases used, 48 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant -170.41 19.64 -8.68 0.000
Fed Funds Rate -6.809 1.268 -5.37 0.000 1.548
GDP 0.092190 0.009950 9.27 0.000 90.022
Money Supply -0.150987 0.008130 -18.57 0.000 44.541
Price index_Finished goods 0.28285 0.04276 6.62 0.000 103.532

S = 49.7996 R-Sq = 88.6% R-Sq(adj) = 88.3%

Analysis of Variance

Source DF SS MS F P
Regression 4 3901058 975265 393.25 0.000
Residual Error 203 503441 2480
Total 207 4404499

The model with the federal funds rate as the interest rate variable is also a final model: all of the
independent variables except government spending are conditionally significant. Again,
high correlation among the independent variables will be a problem with this regression model.

As expected, the variables, federal funds interest rate and money supply, are negatively related
whereas the variables, GDP and price index, have the expected sign as the dependent variable,
residential investment. The standard error of estimate of 49.8 indicates a significant variation
between observed and predicted values.

In both the regression models, 88.6% of the variation in the residential investment can be
explained by variations in the independent variables and the standard error of the estimate are
almost equal. Hence, both the equations provide the best predictions.


b.
Prime interest rate as the interest rate variable:
b1 ± t(n−K−1, α/2) sb1: -7.173 +/- 1.97(1.314) = -7.173 +/- 2.589 or (-9.762, -4.584)

Federal funds rate as the interest rate variable:


b1 ± t(n−K−1, α/2) sb1: -6.809 +/- 1.97(1.268) = -6.809 +/- 2.498 or (-9.307, -4.311)

12.106
Regression analysis to predict Breast Cancer Death Rate:

Correlations: BCncrDthRte, Nurses, Female Smoke, Alcohol B


BCncrDthRte Nurses Female Smoke
Nurses 0.313
0.025
Female Smoke 0.454 0.167
0.001 0.243
Alcohol B -0.096 0.528 -0.019
0.503 0.000 0.893
Regression Analysis: BCncrDthRte versus Nurses, Female Smoke, Alcohol B

The regression equation is


BCncrDthRte = 0.00983 + 0.000004 Nurses + 0.000194 Female Smoke
- 0.000169 Alcohol B

Predictor Coef SE Coef T P VIF


Constant 0.009828 0.001950 5.04 0.000
Nurses 0.00000426 0.00000149 2.86 0.006 1.449
Female Smoke 0.00019419 0.00006243 3.11 0.003 1.046
Alcohol B -0.00016866 0.00007826 -2.16 0.036 1.409

S = 0.00147421 R-Sq = 33.0% R-Sq(adj) = 28.7%

Analysis of Variance

Source DF SS MS F P
Regression 3 0.000050366 0.000016789 7.72 0.000
Residual Error 47 0.000102145 0.000002173
Total 50 0.000152511

This is the final regression model. All else being equal, an increase of 1 nurse per 100,000
population increases the breast cancer death rate by .000004 per 1000 population. All else being
equal, a 1% increase in the percentage of female smokers increases the death rate by .000194 per
1000 population. All else being equal, a 1% increase in the percentage of binge drinkers
decreases the death rate by .000169 per 1000 population. The t statistics indicate that all three
independent variables are significant at the 5% level and hence a significant relationship exists. The
small standard error of estimate of .0015 indicates a small variation between observed and
predicted values. 33% of the variation in the breast cancer death rate is explained by the variation in
the number of nurses, percent of female smokers, and percent of binge drinkers.

Regression analysis to predict Lung Cancer Death Rate:

Correlations: LCncrDthRte, Nurses, Smoker Per, Alcohol B, Median Incom, ...

LCncrDthRte Nurses Smoker Per Alcohol B


Nurses 0.353
0.011
Smoker Per 0.687 0.023
0.000 0.870
Alcohol B -0.064 0.528 -0.175
0.654 0.000 0.220
Median Income -0.561 0.012 -0.670 0.183
0.000 0.931 0.000 0.198
Per Fam Pov 0.338 -0.096 0.548 -0.376
0.015 0.502 0.000 0.006

Median Income
Per Fam Pov -0.748
0.000

Regression Analysis: LCncrDthRte versus Nurses, Smoker Per, ...

The regression equation is


LCncrDthRte = 0.0473 + 0.000034 Nurses + 0.00221 Smoker Per - 0.00101 Alcohol B
- 0.000001 Median Income - 0.00137 Per Fam Pov

Predictor Coef SE Coef T P VIF


Constant 0.04732 0.02542 1.86 0.069
Nurses 0.00003449 0.00000803 4.30 0.000 1.422
Smoker Per 0.0022084 0.0004840 4.56 0.000 1.849
Alcohol B -0.0010087 0.0004628 -2.18 0.035 1.668
Median Income -0.00000060 0.00000024 -2.48 0.017 2.985
Per Fam Pov -0.0013658 0.0006742 -2.03 0.049 2.652

S = 0.00801372 R-Sq = 65.8% R-Sq(adj) = 62.0%

Analysis of Variance

Source DF SS MS F P
Regression 5 0.0055526 0.0011105 17.29 0.000
Residual Error 45 0.0028899 0.0000642
Total 50 0.0084425

This is the final regression model. All else being equal, an increase of 1 nurse per 100,000
population increases the lung cancer death rate by .000034 per 1000 population. All else being
equal, a 1% increase in the percentage of smokers increases the death rate by .00221. All else
being equal, a 1% increase in the percentage of binge drinkers decreases the death rate
by .00101. All else being equal, a $1 increase in the median household income decreases the
death rate by .000001. All else being equal, a 1% increase in the percent of families below
poverty decreases the death rate by .00137. The t statistics indicate that all the independent
variables are significant at the 5% level and hence a significant relationship exists. The small
standard error of estimate of .008 indicates a small variation between observed and predicted values.

65.8% of the variation in the lung cancer death rate is explained by the variation in the
number of nurses, percent of smokers, percent of binge drinkers, median household income, and
percent of families below poverty.

12.107
Correlations: Salary, age, Experience, Years Jr, Years Senior, Gender, Market
Salary age Experience Years Jr Years Senior Gender
age 0.749
0.000
Experience 0.883 0.877
0.000 0.000
Years Jr 0.698 0.712 0.803
0.000 0.000 0.000
Years Senior 0.777 0.583 0.674 0.312
0.000 0.000 0.000 0.000
Gender -0.429 -0.234 -0.378 -0.367 -0.292
0.000 0.004 0.000 0.000 0.000
Market 0.026 -0.134 -0.150 -0.113 -0.017 0.062
0.750 0.103 0.067 0.169 0.833 0.453

The correlation matrix indicates several independent variables that should provide good
explanatory power in the regression model. We would expect that Experience, years at
junior level analyst, and years at senior level analyst are likely to be conditionally
significant:

Regression Analysis: Salary versus age, Experience, ...


The regression equation is
Salary = 36813 - 82.0 age + 530 Experience + 409 Years Jr + 754 Years Senior -
1574 Gender + 4823 Market

Predictor Coef SE Coef T P VIF


Constant 36813 2224 16.55 0.000
age -82.01 67.28 -1.22 0.225 4.561
Experience 530.31 96.17 5.51 0.000 9.968
Years Jr 409.2 113.6 3.60 0.000 4.005
Years Senior 754.09 89.51 8.42 0.000 2.609
Gender -1574.1 734.9 -2.14 0.034 1.269
Market 4823 1131 4.26 0.000 1.045

S = 3533.83 R-Sq = 87.9% R-Sq(adj) = 87.4%

Analysis of Variance

Source DF SS MS F P
Regression 6 12947735493 2157955916 172.80 0.000
Residual Error 143 1785774544 12487934
Total 149 14733510038

Dropping the age variable yields:

Regression Analysis: Salary versus Experience, Years Jr, ...


The regression equation is
Salary = 34283 + 462 Experience + 401 Years Jr + 752 Years Senior - 1775
Gender + 4837 Market

Predictor Coef SE Coef T P VIF


Constant 34282.8 799.3 42.89 0.000
Experience 461.73 78.12 5.91 0.000 6.557
Years Jr 401.0 113.6 3.53 0.001 3.991
Years Senior 751.52 89.64 8.38 0.000 2.607
Gender -1775.3 717.3 -2.47 0.014 1.205
Market 4837 1133 4.27 0.000 1.045

S = 3539.79 R-Sq = 87.8% R-Sq(adj) = 87.3%

Analysis of Variance

Source DF SS MS F P
Regression 5 12929178260 2585835652 206.37 0.000
Residual Error 144 1804331777 12530082
Total 149 14733510038

This is the final model. All of the independent variables are conditionally significant
and the model explains a sizeable portion of the variability in salary.

To test the hypothesis that the rate of change in female salaries as a function of
Experience is less than the rate of change in male salaries as a function of Experience,
the dummy variable Gender is used to see if the slope coefficient for Experience (X1)
differs between males and females. The model adds an interaction variable,
X4X1 = Gender × Experience (labeled Fem(exp) in the output below), to the regression:

Create the variable X4X1 and then test for conditional significance in the regression
model. If it proves to be a significant predictor of salaries then there is strong evidence to
conclude that the rate of change in female salaries as a function of Experience is different
than for males:

Regression Analysis: Salary versus Experience, Fem(exp), ...


The regression equation is
Salary = 34905 + 405 Experience + 166 Fem(exp) + 434 Years Jr + 805 Years Senior
- 3521 Gender + 4921 Market

Predictor Coef SE Coef T P VIF


Constant 34905.4 876.6 39.82 0.000
Experience 404.89 84.70 4.78 0.000 7.803
Fem(exp) 166.22 99.01 1.68 0.095 3.496
Years Jr 433.8 114.5 3.79 0.000 4.111
Years Senior 805.05 94.61 8.51 0.000 2.941
Gender -3521 1261 -2.79 0.006 3.767
Market 4921 1127 4.37 0.000 1.047

S = 3517.65 R-Sq = 88.0% R-Sq(adj) = 87.5%

Analysis of Variance

Source DF SS MS F P
Regression 6 12964050859 2160675143 174.62 0.000
Residual Error 143 1769459178 12373840
Total 149 14733510038

The regression shows that the newly created variable of Fem(exp) is conditionally
significant at the 10% level but not at the 5% level. And we conclude that the rate of
change in female salaries as a function of experience is less than that of male salaries at
the 10% level. We cannot conclude that the rate of change in female salaries as a function
of Experience differs from that of male salaries at the 5% level.
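
A sketch of the interaction test described above, assuming the data are in a DataFrame with
columns Salary, Experience, YearsJr, YearsSenior, Gender (a 0/1 dummy), and Market (names
adapted from the printout; the file name is only a placeholder). The term Gender:Experience
plays the role of the Fem(exp) variable:

# Sketch: salary model with a Gender x Experience interaction (hypothetical file name)
import pandas as pd
import statsmodels.formula.api as smf

analysts = pd.read_csv("analyst_salaries.csv")   # placeholder; data set not specified here
model = smf.ols(
    "Salary ~ Experience + Gender:Experience + YearsJr + YearsSenior + Gender + Market",
    data=analysts,
).fit()
print(model.summary())   # check the t-statistic and p-value on the Gender:Experience term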

12.108 a. Correlation matrix:

Correlations: EconGPA, sex, Acteng, ACTmath, ACTss, ACTcomp, HSPct


EconGPA sex Acteng ACTmath ACTss ACTcomp
sex 0.187
0.049
Acteng 0.387 0.270
0.001 0.021
ACTmath 0.338 -0.170 0.368
0.003 0.151 0.001
ACTss 0.442 -0.105 0.448 0.439
0.000 0.375 0.000 0.000
ACTcomp 0.474 -0.084 0.650 0.765 0.812
0.000 0.478 0.000 0.000 0.000
HSPct 0.362 0.216 0.173 0.290 0.224 0.230
0.000 0.026 0.150 0.014 0.060 0.053

There exists a positive relationship between EconGPA and all of the independent variables,
which is expected. Note that there is a high correlation between the composite ACT score and
the individual components, which is again, as expected. Thus, high correlation among the
independent variables is likely to be a serious concern in this regression model.

Regression Analysis: EconGPA versus sex, Acteng, ...


The regression equation is
EconGPA = - 0.050 + 0.261 sex + 0.0099 Acteng + 0.0064 ACTmath + 0.0270 ACTss +
0.0419 ACTcomp + 0.00898 HSPct

71 cases used 41 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant -0.0504 0.6554 -0.08 0.939
sex 0.2611 0.1607 1.62 0.109 1.5
Acteng 0.00991 0.02986 0.33 0.741 2.5
ACTmath 0.00643 0.03041 0.21 0.833 4.3
ACTss 0.02696 0.02794 0.96 0.338 4.7
ACTcomp 0.04188 0.07200 0.58 0.563 12.8
HSPct 0.008978 0.005716 1.57 0.121 1.4

S = 0.4971 R-Sq = 34.1% R-Sq(adj) = 27.9%

Analysis of Variance
Source DF SS MS F P
Regression 6 8.1778 1.3630 5.52 0.000
Residual Error 64 15.8166 0.2471
Total 70 23.9945

As expected, high correlation among the independent variables is affecting the results. A
strategy of dropping the variable with the lowest t-statistic with each successive model
causes the dropping of the following variables (in order): 1) ACTmath, 2) ACTeng, 3)
ACTss, 4) HSPct. The two variables that remain are the final model of gender and
ACTcomp:

Regression Analysis: EconGPA versus sex, ACTcomp


The regression equation is
EconGPA = 0.322 + 0.335 sex + 0.0978 ACTcomp

73 cases used 39 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 0.3216 0.5201 0.62 0.538
sex 0.3350 0.1279 2.62 0.011 1.0
ACTcomp 0.09782 0.01989 4.92 0.000 1.0

S = 0.4931 R-Sq = 29.4% R-Sq(adj) = 27.3%

Analysis of Variance
Source DF SS MS F P
Regression 2 7.0705 3.5352 14.54 0.000
Residual Error 70 17.0192 0.2431
Total 72 24.0897

Both independent variables are conditionally significant.


b. The model could be used in college admission decisions by creating a predicted GPA
in economics based on sex and ACT comp scores. This predicted GPA could then be
used with other factors in deciding admission. Note that this model predicts that
females will outperform males with equal test scores. Using this model as the only
source of information may lead to charges of unequal treatment.

12.109
Correlations: Real Home Pr, Year, Real Buildin, U. S. Popula, Long Interes, ...

Real Home Price Year Real Building Cost


Year 0.633
0.000
Real Building Cost 0.668 0.805
0.000 0.000
U. S. Population 0.688 0.990 0.814
0.000 0.000 0.000
Long Interest Rate 0.252 0.508 0.649
0.005 0.000 0.000
Consumer Price I 0.732 0.851 0.636
0.000 0.000 0.000

U. S. Population Long Interest Rate


Long Interest Rate 0.524
0.000
Consumer Price I 0.906 0.422
0.000 0.000

Regression Analysis: Real Home Price versus Year, Real Building Co, ...

The regression equation is


Real Home Price Index = 1932 - 1.00 Year + 0.889 Real Building Cost Index
+ 0.425 U. S. Population Millions - 3.52 Long Interest Rate + 0.205 Consumer
Price Index

Predictor Coef SE Coef T P VIF


Constant 1931.8 923.8 2.09 0.039
Year -1.0002 0.4965 -2.01 0.046 211.493
Real Building Cost Index 0.8887 0.1551 5.73 0.000 6.050
U. S. Population Millions 0.4247 0.3310 1.28 0.202 398.636
Long Interest Rate -3.5189 0.6625 -5.31 0.000 1.760
Consumer Price Index 0.2051 0.1031 1.99 0.049 27.496

S = 13.1162 R-Sq = 71.5% R-Sq(adj) = 70.3%

Analysis of Variance

Source DF SS MS F P
Regression 5 49651.0 9930.2 57.72 0.000
Residual Error 115 19783.9 172.0
Total 120 69434.9

Regression Analysis: Real Home Price versus Year, Real Building Co, ...
The regression equation is
Real Home Price Index = 765 - 0.373 Year + 1.01 Real Building Cost Index - 3.42
Long Interest Rate + 0.328 Consumer Price Index

Predictor Coef SE Coef T P VIF


Constant 764.6 161.7 4.73 0.000
Year -0.37276 0.08598 -4.34 0.000 6.308
Real Building Cost Index 1.0114 0.1224 8.26 0.000 3.750
Long Interest Rate -3.4211 0.6599 -5.18 0.000 1.737
Consumer Price Index 0.32812 0.03807 8.62 0.000 3.727

S = 13.1527 R-Sq = 71.1% R-Sq(adj) = 70.1%

Analysis of Variance

Source DF SS MS F P
Regression 4 49368 12342 71.34 0.000
Residual Error 116 20067 173
Total 120 69435

a. The model exhibits a tendency to predict lower home prices over the long time period.
This is evident from the negative coefficient (-0.373) on the independent variable Year.
b. The housing price bubble can be identified by predicting the real home price index
using the obtained model for the years in the first part of the 21st century.

12.110
Correlations: Sale 2 Price, Time Interva, Sale 1 Price, Atlanta, Chicago, ...

Sale 2 Price Time Interval Sale 1 Price Atlanta


Time Interval -0.029
0.07
Sale 1 Price 0.976 -0.151
0.000 0.000
Atlanta -0.105 0.008 -0.076
0.000 0.596 0.000
Chicago -0.047 -0.018 -0.016 -0.333
0.003 0.265 0.322 0.000
Dallas -0.033 0.035 -0.047 -0.333
0.038 0.027 0.003 0.000
Oakland 0.185 -0.026 0.140 -0.333
0.000 0.105 0.000 0.000

Chicago Dallas
Dallas -0.333
0.000
Oakland -0.333 -0.333
0.000 0.000

There exists a negative relationship between Sale 2 price and all of the independent variables
except Sale 1 price and Oakland. Note that there is a high correlation between the Sale 1 price
and Sale 2 price, as expected. High correlation among the independent variables is likely to be a
serious concern in this regression model.

Regression Analysis: Sale 2 Price versus Time Interva, Sale 1 Price, ...

* Oakland is highly correlated with other X variables


* Oakland has been removed from the equation.

The regression equation is


Sale 2 Price = 14726 + 1225 Time Interval + 0.978 Sale 1 Price - 16559 Atlanta
- 16339 Chicago - 8287 Dallas

Predictor Coef SE Coef T P VIF


Constant 14725.7 834.7 17.64 0.000
Time Interval 1224.91 28.00 43.75 0.000 1.024
Sale 1 Price 0.977572 0.002770 352.89 0.000 1.044
Atlanta -16558.5 927.3 -17.86 0.000 1.527
Chicago -16339.4 923.3 -17.70 0.000 1.514
Dallas -8286.8 925.4 -8.95 0.000 1.521

S = 20551.2 R-Sq = 97.0% R-Sq(adj) = 97.0%

Analysis of Variance

Source DF SS MS F P
Regression 5 5.46776E+13 1.09355E+13 25892.07 0.000
Residual Error 3994 1.68687E+12 422350080
Total 3999 5.63645E+13

97% of the variation in the second or final sales price can be explained by variations in the
interval between house sales, and the initial house sales price with adjustments for the four major
U.S. market areas.

All the variables are highly significant in explaining the final sales price at all levels of alpha.
The time interval and initial sales price have the expected signs in the regression model. The
F-test of the significance of the overall model shows that we reject the null hypothesis that all
slope coefficients are jointly equal to zero in favor of the alternative that at least one slope
coefficient is not equal to zero. The F-test yielded a p-value of .000.
The sample correlation between the observed and the predicted values of the final sales price
is .985.

12.111
House value models:

Regression Analysis: hseval versus sizehse, taxrate, incom72, Homper


The regression equation is
hseval = - 32.7 + 6.74 sizehse - 223 taxrate + 0.00464 incom72 + 11.2 Homper

Predictor Coef SE Coef T P VIF


Constant -32.694 8.972 -3.64 0.000
sizehse 6.740 1.880 3.58 0.001 1.4
taxrate -222.96 45.39 -4.91 0.000 1.2
incom72 0.004642 0.001349 3.44 0.001 1.2
Homper 11.215 4.592 2.44 0.017 1.3

S = 3.610 R-Sq = 49.3% R-Sq(adj) = 47.0%

Analysis of Variance
Source DF SS MS F P
Regression 4 1079.08 269.77 20.70 0.000
Residual Error 85 1107.55 13.03
Total 89 2186.63

All of the independent variables are conditionally significant. Now add the percent of
commercial property to the model to see if it is significant:

Regression Analysis: hseval versus sizehse, taxrate, ...


The regression equation is
hseval = - 31.6 + 6.76 sizehse - 218 taxrate + 0.00453 incom72 + 10.3 Homper -
2.18 Comper

Predictor Coef SE Coef T P VIF


Constant -31.615 9.839 -3.21 0.002
sizehse 6.757 1.892 3.57 0.001 1.4
taxrate -217.63 49.58 -4.39 0.000 1.4
incom72 0.004534 0.001412 3.21 0.002 1.4
Homper 10.287 5.721 1.80 0.076 2.0
Comper -2.182 7.940 -0.27 0.784 1.7

S = 3.630 R-Sq = 49.4% R-Sq(adj) = 46.4%

Analysis of Variance
Source DF SS MS F P
Regression 5 1080.07 216.01 16.40 0.000
Residual Error 84 1106.56 13.17
Total 89 2186.63

With a t-statistic of -.27 we have not found strong enough evidence to conclude that the slope
coefficient on percent commercial property is different from zero. The conditional F test:
F = (1107.55 − 1106.56) / 13.17 = .075.
With 1 degree of freedom in the numerator and (90-5-1) = 84 degrees of freedom in the
denominator, the critical value of F at the .05 level is 3.95. Thus, at any common level of alpha,
do not reject that the percent commercial property has no effect on house values.
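
The conditional F statistic above can be computed directly from the two error sums of squares;
a minimal check using the printed figures:

# Sketch: conditional (partial) F test for adding Comper to the house value model
from scipy import stats

sse_reduced, sse_full, mse_full = 1107.55, 1106.56, 13.17
F = (sse_reduced - sse_full) / mse_full       # one restriction in the numerator
print(round(F, 3))                             # about 0.075, approximately t^2 = (-0.27)^2
print(round(stats.f.ppf(0.95, 1, 84), 2))      # critical value, about 3.95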


Add percent industrial property to the base model:

Regression Analysis: hseval versus sizehse, taxrate, ...


The regression equation is
hseval = - 28.6 + 6.10 sizehse - 232 taxrate + 0.00521 incom72 + 8.68 Homper -
7.50 Indper

Predictor Coef SE Coef T P VIF


Constant -28.643 9.602 -2.98 0.004
sizehse 6.096 1.956 3.12 0.003 1.5
taxrate -232.34 46.00 -5.05 0.000 1.2
incom72 0.005208 0.001431 3.64 0.000 1.4
Homper 8.681 5.070 1.71 0.091 1.6
Indper -7.505 6.427 -1.17 0.246 1.7

S = 3.602 R-Sq = 50.2% R-Sq(adj) = 47.2%

Analysis of Variance
Source DF SS MS F P
Regression 5 1096.77 219.35 16.91 0.000
Residual Error 84 1089.86 12.97
Total 89 2186.63

Likewise, the slope coefficient on percent industrial property is not significantly different from zero. The
conditional F-test:

F = [SSE(reduced) - SSE(full)] / MSE(full) = (1107.55 - 1089.86) / 12.97 ≈ 1.36

With 1 degree of freedom in the numerator and (90-5-1) = 84 degrees of freedom in the
denominator, the critical value of F at the .05 level is 3.95. Again the computed F statistic is lower than the
critical value at common levels of alpha; therefore, do not reject the hypothesis that the percent
industrial property has no effect on house values.

Tax rate models:

Regression Analysis: taxrate versus taxbase, expercap, Homper


The regression equation is
taxrate = - 0.0174 -0.000000 taxbase +0.000162 expercap + 0.0424 Homper

Predictor Coef SE Coef T P VIF


Constant -0.017399 0.007852 -2.22 0.029
taxbase -0.00000000 0.00000000 -0.80 0.426 1.2
expercap 0.00016204 0.00003160 5.13 0.000 1.1
Homper 0.042361 0.009378 4.52 0.000 1.2

S = 0.007692 R-Sq = 31.9% R-Sq(adj) = 29.5%

Analysis of Variance
Source DF SS MS F P
Regression 3 0.00237926 0.00079309 13.41 0.000
Residual Error 86 0.00508785 0.00005916
Total 89 0.00746711


Since taxbase is not significant, it is dropped from the model:

Regression Analysis: taxrate versus expercap, Homper


The regression equation is
taxrate = - 0.0192 +0.000158 expercap + 0.0448 Homper

Predictor Coef SE Coef T P VIF


Constant -0.019188 0.007511 -2.55 0.012
expercap 0.00015767 0.00003106 5.08 0.000 1.1
Homper 0.044777 0.008860 5.05 0.000 1.1

S = 0.007676 R-Sq = 31.4% R-Sq(adj) = 29.8%

Analysis of Variance
Source DF SS MS F P
Regression 2 0.0023414 0.0011707 19.87 0.000
Residual Error 87 0.0051257 0.0000589
Total 89 0.0074671

Both of the independent variables are significant. This becomes the base model to which we now add
percent commercial property and percent industrial property sequentially:

Regression Analysis: taxrate versus expercap, Homper, Comper


The regression equation is
taxrate = - 0.0413 +0.000157 expercap + 0.0643 Homper + 0.0596 Comper

Predictor Coef SE Coef T P VIF


Constant -0.041343 0.008455 -4.89 0.000
expercap 0.00015660 0.00002819 5.55 0.000 1.1
Homper 0.064320 0.009172 7.01 0.000 1.4
Comper 0.05960 0.01346 4.43 0.000 1.3

S = 0.006966 R-Sq = 44.1% R-Sq(adj) = 42.2%

Analysis of Variance
Source DF SS MS F P
Regression 3 0.0032936 0.0010979 22.62 0.000
Residual Error 86 0.0041735 0.0000485
Total 89 0.0074671

Percent commercial property is conditionally significant and an important independent variable,
as shown by the conditional F-test:

F = [SSE(reduced) - SSE(full)] / MSE(full) = (0.0051257 - 0.0041735) / 0.0000485 ≈ 19.63

With 1 degree of freedom in the numerator and (90-3-1) = 86 degrees of freedom in the
denominator, the critical value of F at the .05 level is 3.95. Since 19.63 exceeds this critical value, we
conclude that the percentage of commercial property has a statistically significant positive impact on the tax rate.

We now add industrial property to test the effect on tax rate:

Regression Analysis: taxrate versus expercap, Homper, Indper


The regression equation is
taxrate = - 0.0150 +0.000156 expercap + 0.0398 Homper - 0.0105 Indper

Predictor Coef SE Coef T P VIF


Constant -0.015038 0.009047 -1.66 0.100
expercap 0.00015586 0.00003120 5.00 0.000 1.1
Homper 0.03982 0.01071 3.72 0.000 1.6
Indper -0.01052 0.01273 -0.83 0.411 1.5

S = 0.007690 R-Sq = 31.9% R-Sq(adj) = 29.5%

Analysis of Variance
Source DF SS MS F P
Regression 3 0.00238178 0.00079393 13.43 0.000
Residual Error 86 0.00508533 0.00005913
Total 89 0.00746711

The percent industrial property is insignificant, with a t-statistic of only -.83. The conditional F-test confirms
that the variable does not have a significant impact on the tax rate:

F = [SSE(reduced) - SSE(full)] / MSE(full) = (0.0051257 - 0.0050853) / 0.0000591 ≈ 0.68

With 1 degree of freedom in the numerator and (90-3-1) = 86 degrees of freedom in the
denominator, the critical value of F at the .05 level is 3.95. Since 0.68 is below this critical value, we
conclude that the percentage of industrial property has no statistically significant impact on the tax rate.

In conclusion, we found no evidence to back three of the activists' claims and strong evidence to
reject one of them. We concluded that commercial development will have no effect on house
value, while it will actually increase the tax rate. In addition, we concluded that industrial
development will have no effect on either house value or the tax rate.
It was important to include all of the other independent variables in the regression models
because the conditional significance of any one variable depends on which other
independent variables are included in the regression model. Therefore, it is important to test whether apparent direct
relationships can be 'explained' by relationships with other predictor variables.

12.112

a.
To predict the percentage of students who graduate in 4 years from highly ranked private
colleges, the following list of potential predictor variables is selected:

1. Undergrad. Enrollment – the undergraduate enrollment of the college
2. Admission Rate – the rate at which students are admitted to the college
3. Student/faculty Ratio – the ratio of students to faculty
4. Quality Rank – the quality ranking of the private college


b.
Multiple regression using the listed predictor variables:

Regression Analysis: 4-year Grad. versus Undergrad. E, Admission Ra, ...

The regression equation is


4-year Grad. Rate = 0.814 - 0.000009 Undergrad. Enrollment
- 0.0064 Admission Rate + 0.0154 Student/faculty Ratio
- 0.00387 Quality Rank

Predictor Coef SE Coef T P VIF


Constant 0.81448 0.02972 27.40 0.000
Undergrad. Enrollment -0.00000919 0.00000175 -5.26 0.000 1.186
Admission Rate -0.00638 0.06459 -0.10 0.922 3.522
Student/faculty Ratio 0.015371 0.003920 3.92 0.000 2.008
Quality Rank -0.0038718 0.0004267 -9.07 0.000 3.480

S = 0.0650174 R-Sq = 69.9% R-Sq(adj) = 68.6%

Analysis of Variance
Source DF SS MS F P
Regression 4 0.91267 0.22817 53.98 0.000
Residual Error 93 0.39313 0.00423
Total 97 1.30580
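For readers working outside Minitab, a regression of this form could be fit with statsmodels. The sketch below is illustrative only; the file name and column labels are hypothetical stand-ins for the actual colleges data set.

# Illustrative fit of the part (b) model with statsmodels.
# The CSV name and column labels below are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("private_colleges.csv")

X = sm.add_constant(df[["Undergrad Enrollment", "Admission Rate",
                        "Student Faculty Ratio", "Quality Rank"]])
y = df["4yr Grad Rate"]

model = sm.OLS(y, X, missing="drop").fit()
print(model.summary())   # coefficients, t-statistics, R-squared, overall F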

c.
The final regression after eliminating the insignificant predictor variable, Admission Rate:

Regression Analysis: 4-year Grad. versus Undergrad. E, Student/facu, ...

The regression equation is


4-year Grad. Rate = 0.814 - 0.000009 Undergrad. Enrollment
+ 0.0153 Student/faculty Ratio - 0.00390 Quality Rank

Predictor Coef SE Coef T P VIF


Constant 0.81407 0.02927 27.81 0.000
Undergrad. Enrollment -0.00000914 0.00000167 -5.46 0.000 1.100
Student/faculty Ratio 0.015268 0.003760 4.06 0.000 1.867
Quality Rank -0.0039013 0.0003034 -12.86 0.000 1.778

S = 0.0646740 R-Sq = 69.9% R-Sq(adj) = 68.9%

Analysis of Variance

Source DF SS MS F P
Regression 3 0.91262 0.30421 72.73 0.000
Residual Error 94 0.39318 0.00418
Total 97 1.30580

d.
Undergrad. Enrollment and Quality Rank are negatively related to the 4-year Grad. Rate.
Student/faculty Ratio has the expected sign for the dependent variable.
All else being equal, a 1-unit increase in Undergrad. Enrollment will decrease the 4-year Grad. Rate
by .000009. All else being equal, a 1-unit increase in Student/faculty Ratio will increase the 4-year
Grad. Rate by .0153. All else being equal, a 1-unit increase in Quality Rank will decrease the 4-year
Grad. Rate by .0039.
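To show how the fitted equation in part (c) is applied, the sketch below evaluates it at a purely hypothetical set of predictor values; the inputs are illustrative and not drawn from the data set.

# Prediction from the reduced model reported in part (c); inputs are hypothetical.
def predict_grad_rate(enrollment, student_faculty_ratio, quality_rank):
    """4-year Grad. Rate = 0.814 - 0.000009*Enrollment
       + 0.0153*Student/faculty Ratio - 0.00390*Quality Rank"""
    return (0.81407
            - 0.00000914 * enrollment
            + 0.015268 * student_faculty_ratio
            - 0.0039013 * quality_rank)

# e.g., 5,000 undergraduates, a 10-to-1 student/faculty ratio, quality rank 20
print(round(predict_grad_rate(5000, 10, 20), 3))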

12.113

a.
To predict the cost with financial aid for students at highly ranked private colleges, the following
list of potential predictor variables is selected:
Undergrad. Enrollment, Admission Rate, Cost After Need-based Aid, Need Met, Cost After
Non-Need-Based Aid, Average Debt, and Cost Rank

b.
Multiple regression using the listed predictor variables:
Regression Analysis: FinaidCost versus Undergrad. E, Admission Ra, ...

The regression equation is

FinaidCost = 6824 - 0.143 Undergrad. Enrollment - 9901 Admission Rate
- 0.136 Cost After Need-based Aid + 21511 Need Met
+ 0.243 Cost After Non-Need-Based Aid - 0.0340 Average Debt
+ 24.1 Cost Rank

97 cases used, 1 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 6824 4367 1.56 0.122
Undergrad. Enrollment -0.1426 0.1067 -1.34 0.185 1.330
Admission Rate -9901 2604 -3.80 0.000 1.682
Cost After Need-based Aid -0.1363 0.1744 -0.78 0.436 2.586
Need Met 21511 3985 5.40 0.000 1.601
Cost After Non-Need-Based Aid 0.24341 0.09085 2.68 0.009 3.130
Average Debt -0.03403 0.06465 -0.53 0.600 1.104
Cost Rank 24.14 28.19 0.86 0.394 4.627

S = 3743.51 R-Sq = 68.7% R-Sq(adj) = 66.2%

Analysis of Variance

Source DF SS MS F P
Regression 7 2736003281 390857612 27.89 0.000
Residual Error 89 1247231043 14013832
Total 96 3983234323


c.
The final regression after sequentially eliminating the insignificant predictor variables (Undergrad.
Enrollment, Cost After Need-based Aid, Average Debt, and Cost Rank); a sketch of automating this
backward elimination appears after the output below:

Regression Analysis: FinaidCost versus Admission Rate, Need Met, ...

The regression equation is


FinaidCost = 2911 - 10314 Admission Rate + 22484 Need Met
+ 0.270 Cost After Non-Need-Based Aid

97 cases used, 1 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 2911 3667 0.79 0.429
Admission Rate -10314 2348 -4.39 0.000 1.382
Need Met 22484 3580 6.28 0.000 1.306
Cost After Non-Need-Based Aid 0.26978 0.06473 4.17 0.000 1.606

S = 3723.75 R-Sq = 67.6% R-Sq(adj) = 66.6%

Analysis of Variance

Source DF SS MS F P
Regression 3 2693664779 897888260 64.75 0.000
Residual Error 93 1289569544 13866339
Total 96 3983234323
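The sequential elimination used above can be automated. The sketch below is one possible backward-elimination loop, assuming the predictor matrix X and response y are already loaded as pandas objects (the data-loading step is not shown in the original solution).

# One possible backward-elimination loop with statsmodels (illustrative sketch).
# Assumes X is a pandas DataFrame of candidate predictors and y is the response.
import statsmodels.api as sm

def backward_eliminate(X, y, alpha=0.05):
    predictors = list(X.columns)
    while predictors:
        fit = sm.OLS(y, sm.add_constant(X[predictors]), missing="drop").fit()
        pvals = fit.pvalues.drop("const")     # ignore the intercept
        worst = pvals.idxmax()                # least significant predictor
        if pvals[worst] <= alpha:
            return fit                        # every remaining term is significant
        predictors.remove(worst)              # drop it and refit
    return None                               # no predictor survived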

d.
Admission Rate is negatively related to Cost with financial aid. Need Met and Cost After Non-
Need-Based Aid are positively related to Cost with financial aid.
All else being equal, a 1% increase in Admission Rate will decrease Cost with financial aid by
$10,314. All else being equal, a 1% increase in Need Met will increase Cost with financial aid by
$22,484. All else being equal, a $1 increase in Cost After Non-Need-Based Aid will increase
Cost with financial aid by $.27.

12.114
a. daycode2 is a dummy variable where the first interview is coded 0 and the second
interview is coded 1.
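A 0/1 recode of this kind can be built directly with pandas. The sketch below is illustrative; the raw column name interview_day is hypothetical, since the original coding of the source variable is not shown.

# Illustrative construction of a 0/1 dummy such as daycode2 with pandas.
import pandas as pd

df = pd.DataFrame({"interview_day": [1, 2, 1, 2, 2]})    # toy data

# second interview coded 1, first interview coded 0
df["daycode2"] = (df["interview_day"] == 2).astype(int)
print(df)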

Regression Analysis: HEI2005 versus doc_bp, waistper, ...

The regression equation is


HEI2005 = 47.2 - 0.734 doc_bp - 7.15 waistper + 0.0447 BMI
+ 0.522 sr_overweight + 3.64 female + 0.182 age + 2.33 daycode2

8217 cases used, 373 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 47.177 1.139 41.42 0.000
doc_bp -0.7339 0.3638 -2.02 0.044 1.269
waistper -7.149 2.377 -3.01 0.003 6.896
BMI 0.04466 0.06199 0.72 0.471 6.332
sr_overweight 0.5219 0.3799 1.37 0.170 1.519
female 3.6403 0.3780 9.63 0.000 1.513
age 0.181604 0.009224 19.69 0.000 1.338
daycode2 2.3331 0.3073 7.59 0.000 1.000

S = 13.9149 R-Sq = 6.9% R-Sq(adj) = 6.8%

Analysis of Variance

Source DF SS MS F P
Regression 7 117606 16801 86.77 0.000
Residual Error 8209 1589454 194
Total 8216 1707060

b. Adding the dummy variable immigrant to the regression model:

Regression Analysis: HEI2005 versus doc_bp, waistper, ...

The regression equation is


HEI2005 = 44.0 - 0.352 doc_bp - 4.29 waistper - 0.0134 BMI + 0.823 sr_overweight
+ 3.45 female + 0.184 age + 2.36 daycode2 + 7.38 immigrant

8217 cases used, 373 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 44.048 1.126 39.13 0.000
doc_bp -0.3517 0.3563 -0.99 0.324 1.273
waistper -4.291 2.329 -1.84 0.065 6.923
BMI -0.01344 0.06069 -0.22 0.825 6.347
sr_overweight 0.8234 0.3718 2.21 0.027 1.521
female 3.4510 0.3698 9.33 0.000 1.514
age 0.184361 0.009022 20.43 0.000 1.338
daycode2 2.3583 0.3005 7.85 0.000 1.000
immigrant 7.3772 0.3809 19.37 0.000 1.018


S = 13.6082 R-Sq = 11.0% R-Sq(adj) = 10.9%

Analysis of Variance

Source DF SS MS F P
Regression 8 187086 23386 126.29 0.000
Residual Error 8208 1519974 185
Total 8216 1707060

c. Adding the dummy variable single to the initial regression model:

Regression Analysis: HEI2005 versus doc_bp, waistper, ...

The regression equation is


HEI2005 = 48.0 - 0.654 doc_bp - 7.69 waistper + 0.0691 BMI + 0.146 sr_overweight
+ 4.02 female + 0.180 age + 2.31 daycode2 - 2.40 single

8215 cases used, 375 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 48.039 1.141 42.11 0.000
doc_bp -0.6538 0.3628 -1.80 0.072 1.270
waistper -7.692 2.370 -3.25 0.001 6.904
BMI 0.06910 0.06187 1.12 0.264 6.352
sr_overweight 0.1463 0.3817 0.38 0.702 1.544
female 4.0179 0.3803 10.57 0.000 1.542
age 0.180269 0.009194 19.61 0.000 1.339
daycode2 2.3059 0.3063 7.53 0.000 1.000
single -2.4030 0.3197 -7.52 0.000 1.035

S = 13.8664 R-Sq = 7.5% R-Sq(adj) = 7.4%

Analysis of Variance

Source DF SS MS F P
Regression 8 128336 16042 83.43 0.000
Residual Error 8206 1577814 192
Total 8214 1706151

d. Adding the dummy variable fsp to the initial regression model:

Regression Analysis: HEI2005 versus doc_bp, waistper, ...

The regression equation is


HEI2005 = 47.3 - 0.706 doc_bp - 6.29 waistper + 0.0470 BMI + 0.260 sr_overweight
+ 3.70 female + 0.171 age + 2.30 daycode2 - 3.51 fsp

8114 cases used, 476 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 47.281 1.141 41.44 0.000
doc_bp -0.7064 0.3642 -1.94 0.052 1.266
waistper -6.288 2.382 -2.64 0.008 6.897
BMI 0.04699 0.06200 0.76 0.449 6.322
sr_overweight 0.2598 0.3816 0.68 0.496 1.526
female 3.6984 0.3789 9.76 0.000 1.514
age 0.171463 0.009334 18.37 0.000 1.360
daycode2 2.2953 0.3079 7.46 0.000 1.000
fsp -3.5064 0.4716 -7.43 0.000 1.037

S = 13.8547 R-Sq = 7.4% R-Sq(adj) = 7.4%

Analysis of Variance

Source DF SS MS F P
Regression 8 125197 15650 81.53 0.000
Residual Error 8105 1555766 192
Total 8113 1680963

12.115
a.
daycode2 is a dummy variable where the first interview is coded 0 and the second interview
is coded 1. activity_level1 is a dummy variable where activity level 1 is coded 0 and activity level 2 is coded 1.
Regression Analysis: HEI2005 versus sr_did_lm_wt, smoker, ...

The regression equation is


HEI2005 = 51.6 + 1.25 sr_did_lm_wt - 6.87 smoker - 0.659 screen_hours
- 0.805 activity_level1 - 0.110 pff + 0.0601 p_ate_at_home
+ 3.39 col_grad + 0.000007 hh_income_est + 1.78 daycode2

5328 cases used, 3262 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 51.5951 0.7762 66.47 0.000
sr_did_lm_wt 1.2459 0.3768 3.31 0.001 1.005
smoker -6.8654 0.4320 -15.89 0.000 1.034
screen_hours -0.65883 0.09947 -6.62 0.000 1.023
activity_level1 -0.8050 0.4149 -1.94 0.052 1.026
pff -0.110314 0.008474 -13.02 0.000 1.096
p_ate_at_home 0.060128 0.005887 10.21 0.000 1.132
col_grad 3.3889 0.5298 6.40 0.000 1.175
hh_income_est 0.00000696 0.00000764 0.91 0.362 1.177
daycode2 1.7764 0.3601 4.93 0.000 1.002

S = 13.1196 R-Sq = 13.9% R-Sq(adj) = 13.8%

Analysis of Variance

Source DF SS MS F P
Regression 9 148319 16480 95.74 0.000
Residual Error 5318 915359 172
Total 5327 1063678


b. Adding the dummy variable immigrant to the regression model:

Regression Analysis: HEI2005 versus sr_did_lm_wt, smoker, ...

The regression equation is


HEI2005 = 49.0 + 1.29 sr_did_lm_wt - 6.54 smoker - 0.379 screen_hours
- 1.15 activity_level1 - 0.105 pff + 0.0604 p_ate_at_home
+ 3.59 col_grad + 0.000018 hh_income_est + 1.81 daycode2
+ 5.81 immigrant

5328 cases used, 3262 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 49.0207 0.7883 62.18 0.000
sr_did_lm_wt 1.2864 0.3708 3.47 0.001 1.005
smoker -6.5391 0.4258 -15.36 0.000 1.037
screen_hours -0.3786 0.1002 -3.78 0.000 1.071
activity_level1 -1.1531 0.4092 -2.82 0.005 1.030
pff -0.104564 0.008350 -12.52 0.000 1.099
p_ate_at_home 0.060428 0.005793 10.43 0.000 1.132
col_grad 3.5950 0.5216 6.89 0.000 1.176
hh_income_est 0.00001795 0.00000756 2.37 0.018 1.192
daycode2 1.8111 0.3544 5.11 0.000 1.002
immigrant 5.8134 0.4401 13.21 0.000 1.077

S = 12.9108 R-Sq = 16.7% R-Sq(adj) = 16.5%

Analysis of Variance

Source DF SS MS F P
Regression 10 177400 17740 106.43 0.000
Residual Error 5317 886279 167
Total 5327 1063678

c. Adding the dummy variable single to the initial regression model:

Regression Analysis: HEI2005 versus sr_did_lm_wt, smoker, ...

The regression equation is


HEI2005 = 52.2 + 1.22 sr_did_lm_wt - 6.81 smoker - 0.628 screen_hours
- 0.800 activity_level1 - 0.110 pff + 0.0596 p_ate_at_home
+ 3.45 col_grad + 0.000001 hh_income_est + 1.78 daycode2 - 1.06 single

5326 cases used, 3264 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 52.1564 0.8052 64.77 0.000
sr_did_lm_wt 1.2203 0.3767 3.24 0.001 1.006
smoker -6.8065 0.4321 -15.75 0.000 1.035
screen_hours -0.62777 0.09987 -6.29 0.000 1.030
activity_level1 -0.7995 0.4147 -1.93 0.054 1.026
pff -0.110174 0.008469 -13.01 0.000 1.096
p_ate_at_home 0.059636 0.005885 10.13 0.000 1.132
col_grad 3.4489 0.5300 6.51 0.000 1.175
hh_income_est 0.00000113 0.00000793 0.14 0.887 1.268
daycode2 1.7755 0.3600 4.93 0.000 1.002
single -1.0646 0.3851 -2.76 0.006 1.100

S = 13.1100 R-Sq = 14.0% R-Sq(adj) = 13.9%

Analysis of Variance

Source DF SS MS F P
Regression 10 149300 14930 86.87 0.000
Residual Error 5315 913505 172
Total 5325 1062805

d. Adding the dummy variable fsp to the initial regression model:

Regression Analysis: HEI2005 versus sr_did_lm_wt, smoker, ...

The regression equation is


HEI2005 = 52.2 + 1.18 sr_did_lm_wt - 6.77 smoker - 0.659 screen_hours
- 0.725 activity_level1 - 0.108 pff + 0.0610 p_ate_at_home
+ 3.31 col_grad - 0.000002 hh_income_est + 1.74 daycode2 - 2.31 fsp

5298 cases used, 3292 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 52.1612 0.7855 66.41 0.000
sr_did_lm_wt 1.1773 0.3778 3.12 0.002 1.008
smoker -6.7662 0.4350 -15.55 0.000 1.045
screen_hours -0.65904 0.09950 -6.62 0.000 1.023
activity_level1 -0.7255 0.4149 -1.75 0.080 1.025
pff -0.108072 0.008482 -12.74 0.000 1.100
p_ate_at_home 0.061035 0.005896 10.35 0.000 1.134
col_grad 3.3141 0.5300 6.25 0.000 1.175
hh_income_est -0.00000249 0.00000793 -0.31 0.754 1.267
daycode2 1.7365 0.3605 4.82 0.000 1.002
fsp -2.3126 0.5467 -4.23 0.000 1.127

S = 13.0953 R-Sq = 14.3% R-Sq(adj) = 14.1%

Analysis of Variance

Source DF SS MS F P
Regression 10 151259 15126 88.20 0.000
Residual Error 5287 906658 171
Total 5297 1057916


12.116
a. daycode2 is a dummy variable where the first interview is coded 0 and the second
interview is coded 1.

Regression Analysis: daily_cost versus doc_bp, waistper, ...

The regression equation is


daily_cost = 6.83 - 0.189 doc_bp + 1.42 waistper - 0.0192 BMI
+ 0.107 sr_overweight - 1.35 female - 0.0359 age - 0.205 daycode2

8217 cases used, 373 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 6.8293 0.2347 29.10 0.000
doc_bp -0.18874 0.07497 -2.52 0.012 1.269
waistper 1.4179 0.4898 2.90 0.004 6.896
BMI -0.01916 0.01277 -1.50 0.134 6.332
sr_overweight 0.10659 0.07828 1.36 0.173 1.519
female -1.35316 0.07789 -17.37 0.000 1.513
age -0.035866 0.001901 -18.87 0.000 1.338
daycode2 -0.20460 0.06331 -3.23 0.001 1.000

S = 2.86735 R-Sq = 9.4% R-Sq(adj) = 9.3%

Analysis of Variance

Source DF SS MS F P
Regression 7 7009.6 1001.4 121.80 0.000
Residual Error 8209 67491.9 8.2
Total 8216 74501.5

b. Adding the dummy variable immigrant to the regression model:

Regression Analysis: daily_cost versus doc_bp, waistper, ...

The regression equation is


daily_cost = 6.98 - 0.207 doc_bp + 1.28 waistper - 0.0164 BMI
+ 0.0923 sr_overweight - 1.34 female - 0.0360 age - 0.206 daycode2
- 0.351 immigrant

8217 cases used, 373 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 6.9780 0.2369 29.45 0.000
doc_bp -0.20691 0.07500 -2.76 0.006 1.273
waistper 1.2821 0.4902 2.62 0.009 6.923
BMI -0.01639 0.01277 -1.28 0.199 6.347
sr_overweight 0.09226 0.07826 1.18 0.238 1.521
female -1.34416 0.07783 -17.27 0.000 1.514
age -0.035997 0.001899 -18.96 0.000 1.338
daycode2 -0.20580 0.06325 -3.25 0.001 1.000
immigrant -0.35068 0.08016 -4.37 0.000 1.018

S = 2.86419 R-Sq = 9.6% R-Sq(adj) = 9.5%

Analysis of Variance

Source DF SS MS F P
Regression 8 7166.63 895.83 109.20 0.000
Residual Error 8208 67334.92 8.20
Total 8216 74501.55

c. Adding the dummy variable single to the initial regression model:

Regression Analysis: daily_cost versus doc_bp, waistper, ...

The regression equation is


daily_cost = 6.89 - 0.187 doc_bp + 1.36 waistper - 0.0169 BMI
+ 0.0805 sr_overweight - 1.32 female - 0.0359 age - 0.208 daycode2
- 0.182 single

8215 cases used, 375 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 6.8942 0.2358 29.24 0.000
doc_bp -0.18678 0.07498 -2.49 0.013 1.270
waistper 1.3629 0.4899 2.78 0.005 6.904
BMI -0.01695 0.01279 -1.33 0.185 6.352
sr_overweight 0.08051 0.07889 1.02 0.308 1.544
female -1.32153 0.07860 -16.81 0.000 1.542
age -0.035904 0.001900 -18.89 0.000 1.339
daycode2 -0.20834 0.06330 -3.29 0.001 1.000
single -0.18199 0.06607 -2.75 0.006 1.035

S = 2.86594 R-Sq = 9.5% R-Sq(adj) = 9.4%

Analysis of Variance

Source DF SS MS F P
Regression 8 7066.32 883.29 107.54 0.000
Residual Error 8206 67400.82 8.21
Total 8214 74467.14

d. Adding the dummy variable fsp to the initial regression model:

Regression Analysis: daily_cost versus doc_bp, waistper, ...

The regression equation is


daily_cost = 6.84 - 0.165 doc_bp + 1.55 waistper - 0.0171 BMI
+ 0.0684 sr_overweight - 1.35 female - 0.0378 age - 0.206 daycode2
- 0.783 fsp

8114 cases used, 476 cases contain missing values


Predictor Coef SE Coef T P VIF


Constant 6.8409 0.2354 29.06 0.000
doc_bp -0.16534 0.07513 -2.20 0.028 1.266
waistper 1.5483 0.4913 3.15 0.002 6.897
BMI -0.01707 0.01279 -1.33 0.182 6.322
sr_overweight 0.06845 0.07872 0.87 0.385 1.526
female -1.34695 0.07817 -17.23 0.000 1.514
age -0.037831 0.001926 -19.65 0.000 1.360
daycode2 -0.20646 0.06351 -3.25 0.001 1.000
fsp -0.78343 0.09729 -8.05 0.000 1.037

S = 2.85811 R-Sq = 10.1% R-Sq(adj) = 10.0%

Analysis of Variance

Source DF SS MS F P
Regression 8 7438.61 929.83 113.83 0.000
Residual Error 8105 66208.22 8.17
Total 8113 73646.83

12.117
a.
daycode2 is a dummy variable where the first interview is coded 0 and the second interview
is coded 1. activity_level1 is a dummy variable where activity level 1 is coded 0 and activity level 2 is coded 1.
Regression Analysis: daily_cost versus sr_did_lm_wt, smoker, ...

The regression equation is


daily_cost = 5.51 - 0.0698 sr_did_lm_wt + 0.309 smoker
+ 0.0847 screen_hours
- 0.379 activity_level1 + 0.0225 pff - 0.0191 p_ate_at_home
+ 0.594 col_grad + 0.000013 hh_income_est - 0.193 daycode2

5328 cases used, 3262 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 5.5126 0.1626 33.90 0.000
sr_did_lm_wt -0.06981 0.07894 -0.88 0.377 1.005
smoker 0.30875 0.09051 3.41 0.001 1.034
screen_hours 0.08470 0.02084 4.06 0.000 1.023
activity_level1 -0.37938 0.08694 -4.36 0.000 1.026
pff 0.022463 0.001775 12.65 0.000 1.096
p_ate_at_home -0.019147 0.001233 -15.52 0.000 1.132
col_grad 0.5944 0.1110 5.35 0.000 1.175
hh_income_est 0.00001301 0.00000160 8.13 0.000 1.177
daycode2 -0.19320 0.07546 -2.56 0.010 1.002

S = 2.74876 R-Sq = 13.8% R-Sq(adj) = 13.7%

Analysis of Variance

Source DF SS MS F P
Regression 9 6451.80 716.87 94.88 0.000
Residual Error 5318 40181.04 7.56
Total 5327 46632.84

b. Adding the dummy variable immigrant to the regression model:

Regression Analysis: daily_cost versus sr_did_lm_wt, smoker, ...

The regression equation is


daily_cost = 5.44 - 0.0686 sr_did_lm_wt + 0.318 smoker + 0.0928 screen_hours
- 0.390 activity_level1 + 0.0226 pff - 0.0191 p_ate_at_home
+ 0.600 col_grad + 0.000013 hh_income_est - 0.192 daycode2
+ 0.169 immigrant

5328 cases used, 3262 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 5.4377 0.1678 32.41 0.000
sr_did_lm_wt -0.06863 0.07893 -0.87 0.385 1.005
smoker 0.31824 0.09064 3.51 0.000 1.037
screen_hours 0.09285 0.02132 4.35 0.000 1.071
activity_level1 -0.38951 0.08710 -4.47 0.000 1.030
pff 0.022630 0.001777 12.73 0.000 1.099
p_ate_at_home -0.019138 0.001233 -15.52 0.000 1.132
col_grad 0.6004 0.1110 5.41 0.000 1.176
hh_income_est 0.00001333 0.00000161 8.28 0.000 1.192
daycode2 -0.19219 0.07544 -2.55 0.011 1.002
immigrant 0.16916 0.09369 1.81 0.071 1.077

S = 2.74817 R-Sq = 13.9% R-Sq(adj) = 13.7%

Analysis of Variance

Source DF SS MS F P
Regression 10 6476.42 647.64 85.75 0.000
Residual Error 5317 40156.42 7.55
Total 5327 46632.84

c. Adding the dummy variable single to the initial regression model:

Regression Analysis: daily_cost versus sr_did_lm_wt, smoker, ...

The regression equation is


daily_cost = 5.61 - 0.0761 sr_did_lm_wt + 0.310 smoker + 0.0869 screen_hours
- 0.382 activity_level1 + 0.0226 pff - 0.0192 p_ate_at_home
+ 0.591 col_grad + 0.000012 hh_income_est - 0.196 daycode2
- 0.162 single

5326 cases used, 3264 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 5.6078 0.1688 33.23 0.000
sr_did_lm_wt -0.07608 0.07897 -0.96 0.335 1.006
smoker 0.30975 0.09058 3.42 0.001 1.035
screen_hours 0.08687 0.02093 4.15 0.000 1.030
activity_level1 -0.38200 0.08692 -4.39 0.000 1.026
pff 0.022558 0.001775 12.71 0.000 1.096
p_ate_at_home -0.019152 0.001234 -15.53 0.000 1.132
col_grad 0.5915 0.1111 5.32 0.000 1.175
hh_income_est 0.00001211 0.00000166 7.29 0.000 1.268
daycode2 -0.19568 0.07545 -2.59 0.010 1.002
single -0.16168 0.08072 -2.00 0.045 1.100

S = 2.74788 R-Sq = 13.9% R-Sq(adj) = 13.7%

Analysis of Variance

Source DF SS MS F P
Regression 10 6462.21 646.22 85.58 0.000
Residual Error 5315 40132.81 7.55
Total 5325 46595.02

d. Adding the dummy variable fsp to the initial regression model:

Regression Analysis: daily_cost versus sr_did_lm_wt, smoker, ...

The regression equation is


daily_cost = 5.53 - 0.0760 sr_did_lm_wt + 0.312 smoker + 0.0858 screen_hours
- 0.371 activity_level1 + 0.0226 pff - 0.0191 p_ate_at_home
+ 0.600 col_grad + 0.000013 hh_income_est - 0.194 daycode2
- 0.090 fsp

5298 cases used, 3292 cases contain missing values

Predictor Coef SE Coef T P VIF


Constant 5.5285 0.1652 33.47 0.000
sr_did_lm_wt -0.07597 0.07945 -0.96 0.339 1.008
smoker 0.31205 0.09148 3.41 0.001 1.045
screen_hours 0.08584 0.02092 4.10 0.000 1.023
activity_level1 -0.37082 0.08724 -4.25 0.000 1.025
pff 0.022578 0.001784 12.66 0.000 1.100
p_ate_at_home -0.019077 0.001240 -15.39 0.000 1.134
col_grad 0.5999 0.1115 5.38 0.000 1.175
hh_income_est 0.00001261 0.00000167 7.56 0.000 1.267
daycode2 -0.19352 0.07581 -2.55 0.011 1.002
fsp -0.0899 0.1150 -0.78 0.434 1.127

S = 2.75375 R-Sq = 13.8% R-Sq(adj) = 13.6%

Analysis of Variance

Source DF SS MS F P
Regression 10 6405.63 640.56 84.47 0.000
Residual Error 5287 40092.04 7.58
Total 5297 46497.67
