BBS11 ISM Ch13

Basic business statistics notes ch13

780 Chapter 13: Simple Linear Regression

CHAPTER 13

13.1 (a) When X = 0, the estimated expected value of Y is 2.
(b) For each 1-unit increase in X, the value of Y is expected to increase by an
estimated 5 units.
(c) Ŷ = 2 + 5X = 2 + 5(3) = 17

13.2 (a) yes, (b) no, (c) no, (d) yes

13.3 (a) When X = 0, the estimated expected value of Y is 16.
(b) For each 1-unit increase in X, the value of Y is expected to decrease by an
estimated 0.5 units.
(c) Ŷ = 16 − 0.5X = 16 − 0.5(6) = 13

13.4 (a) [Scatter diagram: Weekly Sales, Y versus Shelf Space, X]

(b) For each increase in shelf space of an additional foot, there is an expected increase in
weekly sales of an estimated $7.4.
(c) Ŷ = 145 + 7.4X = 145 + 7.4(8) = $204.2

Copyright © 2009 Pearson Education, Inc. Publishing as Prentice Hall.



13.5 (a) [Scatter diagram: Y, Audited (thousands) versus X, Reported (thousands)]

(b) For each additional thousand units increase in reported newsstand sales, the mean
audited sales will increase by an estimated 0.5719 thousand units.
(c) Ŷ = 26.7240 + 0.5719(400) = 255.4788 thousand units.

13.6 (a) [Scatter diagram: Y versus X]

(b) Partial Excel output:

            Coefficients   Standard Error   t Stat    P-value
Intercept   -2.3697        2.0733           -1.1430   0.2610
Feet        0.0501         0.0030           16.5223   0.0000
(c) The estimated mean amount of labor will increase by 0.05 hour for each additional
cubic foot moved.
(d) Ŷ = −2.3697 + 0.0501(500) = 22.6705


13.7 (a) [Scatter diagram: Waiting Time (Minutes) versus Number of Customers]

(b) Ŷ = −0.4480 + 0.1285X
(c) For each additional customer, the estimated mean waiting time will
increase by 0.1285 minutes.
(d) Ŷ = −0.4480 + 0.1285(20) = 2.1215 minutes

13.8 (a) [Scatter diagram: Y versus X]

(b) b0 = −368.2846, b1 = 4.7306
(c) For each additional million dollars of revenue, the mean annual value will
increase by an estimated 4.7306 million dollars. A literal interpretation of b0 is not
meaningful because an operating franchise cannot have zero revenue.
(d) Ŷ = −368.2846 + 4.7306X = −368.2846 + 4.7306(150) = 341.3027 million
dollars


13.9 (a) [Scatter plot: Monthly Rent ($) versus Size (square feet)]

(b) Ŷ = 177.1 + 1.065X
(c) For each increase of 1 square foot in space, the expected monthly rental is estimated
to increase by $1.065. Since X cannot be zero, 177.1 has no practical interpretation.
(d) Ŷ = 177.1 + 1.065X = 177.1 + 1.065(1000) = $1242.10
(e) An apartment with 500 square feet is outside the relevant range for the
independent variable.
(f) The apartment with 1200 square feet has the more favorable rent relative to size.
Based on the regression equation, a 1200 square foot apartment would have an
expected monthly rent of $1455.10, while a 1000 square foot apartment would have
an expected monthly rent of $1242.10.

13.10 (a) [Scatter diagram: Y versus X]
(b)
             Coefficients    Standard Error   t Stat         P-value
Intercept    -140.1202629    34.15524465      -4.102452326   0.000319082
Box Office   4.333108105     0.500843491      8.651621088    2.1259E-09
Ŷ = b0 + b1X = −140.1203 + 4.3331X


(c) For each additional million dollars of box office gross, the
estimated mean number of DVDs sold will increase by 4.3331 thousand.
(d) Ŷ = b0 + b1X = −140.1203 + 4.3331(75) = 184.86285 thousand

13.11 80% of the variation in the dependent variable can be explained by the variation in
the independent variable.

13.12 SST = 40 and r² = 0.90. So, 90% of the variation in the dependent variable can be
explained by the variation in the independent variable.

13.13 r² = 0.75. So, 75% of the variation in the dependent variable can be explained by the
variation in the independent variable.

13.14 r² = 0.75. So, 75% of the variation in the dependent variable can be explained by the
variation in the independent variable.

13.15 Since SST = SSR + SSE and since SSE cannot be a negative number, SST must be at
least as large as SSR.
13.16 (a) \( r^2 = \dfrac{SSR}{SST} = \dfrac{20{,}535}{30{,}025} = 0.684 \). So, 68.4% of the variation in the dependent variable
can be explained by the variation in the independent variable.
(b) \( S_{YX} = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}} = \sqrt{\dfrac{9490}{10}} = 30.8058 \)
(c) Based on (a) and (b), the model should be very useful for predicting sales.
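
The arithmetic in 13.16 can be checked with a short Python sketch. The function names are illustrative and do not come from the textbook; n = 12 is inferred from the n − 2 = 10 that appears in the solution.

```python
import math

def r_squared(ssr, sst):
    """Coefficient of determination: share of total variation explained."""
    return ssr / sst

def std_error_of_estimate(sse, n):
    """Standard error of the estimate for simple linear regression (n - 2 d.f.)."""
    return math.sqrt(sse / (n - 2))

ssr, sst = 20535, 30025   # sums of squares given in Problem 13.16
sse = sst - ssr           # SST = SSR + SSE, so SSE = 9490
n = 12                    # assumption: inferred from n - 2 = 10

print(round(r_squared(ssr, sst), 3))
print(round(std_error_of_estimate(sse, n), 4))
```

The same two functions apply to Problems 13.17 through 13.22 once SSR, SST, and n are read from the regression output.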

13.17 (a) r² = 0.9015. So, 90.15% of the variation in audited newsstand sales can be explained by
the variation in reported newsstand sales.
(b) S_YX = 42.1859
(c) Based on (a) and (b), the model should be very useful for predicting audited sales.

13.18 (a) r² = 0.8892. So, 88.92% of the variation in the dependent variable can be explained
by the variation in the independent variable.
(b) S_YX = 5.0314
(c) Based on (a) and (b), the model should be very useful for predicting labor hours.

13.19 (a) r² = 0.7873. So, 78.73% of the variation in waiting time can be explained by the
variation in the number of customers.
(b) S_YX = 0.5952
(c) Based on (a) and (b), the model should be moderately useful for predicting the
waiting time.


13.20 (a) r² = 0.9334. So, 93.34% of the variation in value of a baseball franchise can be
explained by the variation in its annual revenue.
(b) S_YX = 42.4335
(c) Based on (a) and (b), the model should be very useful for predicting the value of a
baseball franchise.

13.21 (a) r² = 0.723. So, 72.3% of the variation in monthly rent can be explained by the
variation in square footage.
(b) S_YX = 194.6
(c) Based on (a) and (b), the model should be very useful for predicting monthly rent.

13.22 (a) r² = 0.7278. So, 72.78% of the variation in the number of DVDs sold can be
explained by the variation in box office gross.
(b) S_YX = 47.8668. The standard error of the estimate, which measures the variation of
the number of DVDs sold around the prediction line, is 47.8668 thousand. The typical
difference between the actual number of DVDs sold and the number predicted by the
regression equation is approximately 47.8668 thousand.
(c) Based on (a) and (b), the model is somewhat useful for predicting DVDs sold.
(d) Other variables that might explain the variation in DVD sales could be the amount
spent on advertising, the timing of the release of the DVDs, and the distribution
channels of the DVDs.

13.23 A residual analysis of the data indicates no apparent pattern. The assumptions of
regression appear to be met.

13.24 A residual analysis of the data indicates a pattern, with sizeable clusters of
consecutive residuals that are either all positive or all negative. If the data is cross-
sectional, this pattern indicates a violation of the assumption of linearity and a
quadratic model should be investigated. If the data is time-series, the pattern
indicates a violation of the assumption of independence of errors.


13.25 (a) [Reported Residual Plot: Residuals versus Reported]

The residual plot does not appear to show any pattern.

(b) [Normal Probability Plot: Residuals versus Z Value]

Based on the residual plot, there appears to be some heteroscedasticity effect. The
normal probability plot of the residuals indicates a departure from the normality
assumption. The error distribution appears to be left-skewed.


13.26 (a) [Space Residual Plot: Residuals versus Space]

The residual plot does not appear to show any pattern.

(b) [Normal Probability Plot: Residuals versus Z Value]

Based on the residual plot, there is no apparent heteroscedasticity effect. The normal
probability plot of the residuals indicates a departure from the normality assumption.


13.27 [Customers Residual Plot: Residuals versus Customers]

The residual plot does not reveal any obvious pattern. So a linear fit appears to be
adequate.

[Normal Probability Plot: Residuals versus Z Value]

The residual plot does not reveal any possible violation of the homoscedasticity
assumption. These are not time-series data, so you do not need to evaluate the
independence assumption. The normal probability plot does not reveal any apparent
deviation from the normal distribution.


13.28 [Feet Residual Plot: Residuals versus Feet]

Based on the residual plot, there appears to be a nonlinear pattern in the residuals. A
quadratic model should be investigated.

[Normal Probability Plot: Residuals versus Z Value]

The assumptions of normality and equal variance do not appear to be seriously
violated.


13.29 (a) [Size Residual Plot: Residuals versus Size]

Based on a residual analysis of the residuals versus size, the model appears to be
adequate.
(b) [Normal Probability Plot: Residuals versus Z Value]

The normal probability plot shows that the distribution has a thicker left tail than a
normal distribution but there is no sign of severe skewness. The assumptions of
regression do not appear to be seriously violated.


13.30 (a) [Revenue Residual Plot: Residuals versus Revenue]

Based on the residual plot, there appears to be a nonlinear pattern in the residuals. A
quadratic model should be investigated.
(b) [Normal Probability Plot: Residuals versus Z Value]

The normal probability plot of the residuals reveals that the distribution of the errors
might be slightly skewed to the right. The residual plot suggests that the assumption
of equal variance might be violated. The variance at the lower and upper ends of
revenue appears to be smaller than that in the other region.


13.31 [Box Office Residual Plot: Residuals versus Box Office]

The residual plot does not reveal any obvious pattern. So a linear fit appears to be
adequate.

[Normal Probability Plot: Residuals versus Z Value]

The residual plot does not reveal any possible violation of the homoscedasticity
assumption. These are not time-series data, so you do not need to evaluate the
independence assumption. The normal probability plot does not show any apparent
departure from the normal distribution.


13.32 (a) [Residual Plot: Residuals versus Time Period]

An increasing linear relationship exists.


(b) There appears to be strong positive autocorrelation among the residuals.

13.33 (a) [Scatter Plot: Residual versus Time Period]

There is no apparent pattern in the residuals over time.


(b) D = 1.661 > 1.36. There is no evidence of positive autocorrelation among the
residuals.
(c) The data are not positively autocorrelated.

13.34 (a) No, it is not necessary to compute the Durbin-Watson statistic since the data have
been collected for a single period for a set of stores.
(b) If a single store was studied over a period of time and the amount of shelf space
varied over time, computation of the Durbin-Watson statistic would be necessary.

13.35 (a) b0 = 169.455, b1 = –1.8579


(b) Ŷ = 169.455 − 1.8579X = 169.455 − 1.8579(50) = 76.56


(c) [Residual plot: Residuals versus Time Period]

(d) D = 1.18 < 1.27. There is evidence of positive autocorrelation among the residuals.


(e) The plot of the residuals versus temperature indicates that positive residuals tend to
occur for the lowest and highest temperatures in the data set. A nonlinear model
might be more appropriate. The evidence of positive autocorrelation is another reason
to question the validity of the model.

13.36 (a) \( b_1 = \dfrac{SSXY}{SSX} = \dfrac{201399.05}{12495626} = 0.0161 \)
\( b_0 = \bar{Y} - b_1 \bar{X} = 71.2621 - 0.0161(4393) = 0.458 \) (computed with the unrounded slope, 0.016118)
(b) Ŷ = 0.458 + 0.0161X = 0.458 + 0.0161(4500) = 72.908, or $72,908
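
As a check on part (a), the least-squares coefficients can be recomputed from the summary quantities. This is a minimal sketch; the function name is illustrative, and the inputs are the values quoted in the solution.

```python
def least_squares_coefficients(ssxy, ssx, x_bar, y_bar):
    """Return (b0, b1) for simple linear regression from summary statistics."""
    b1 = ssxy / ssx              # slope = SSXY / SSX
    b0 = y_bar - b1 * x_bar      # intercept = Y-bar minus slope times X-bar
    return b0, b1

# SSXY, SSX, X-bar and Y-bar as quoted in Problem 13.36
b0, b1 = least_squares_coefficients(201399.05, 12495626, 4393, 71.2621)
print(round(b1, 4), round(b0, 3))
```

Carrying the unrounded slope into the intercept is what reproduces 0.458; substituting the rounded 0.0161 gives a slightly different intercept.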
(c) [Residual plot: Residuals versus Time Period]

(d) \( D = \dfrac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = \dfrac{1243.2244}{599.0683} = 2.08 > 1.45 \). There is no evidence of positive
autocorrelation among the residuals.
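
The Durbin-Watson statistic in part (d) is the ratio of the sum of squared successive differences to the sum of squared residuals. A minimal Python sketch follows; the function name and the short residual list are hypothetical, since the raw residuals for this problem are not reproduced in the solution.

```python
def durbin_watson(residuals):
    """Durbin-Watson D: squared successive differences over squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that alternate in sign (negative autocorrelation)
alternating = [1.0, -1.0, 1.0, -1.0]
print(durbin_watson(alternating))

# The two sums reported in Problem 13.36 give the statistic directly
d_13_36 = 1243.2244 / 599.0683
print(round(d_13_36, 2))
```

Values of D well below 2 point toward positive autocorrelation, which is why the solutions compare D against the tabulated lower and upper critical values.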


(e) Based on a residual analysis, the model appears to be adequate.


13.37 (a) Ŷ = 17.0833 − 5X
(b) Ŷ = 17.0833 − 5(0.5) = 14.5833 seconds
(c) [Residuals Plot: Residuals versus Time Order]

There is no noticeable pattern in the plot.


(d) H0: There is no autocorrelation.
H1: There is positive autocorrelation.
PHStat output:
Durbin-Watson Calculations
Sum of Squared Difference of Residuals 238.4375
Sum of Squared Residuals 138.3333333
Durbin-Watson Statistic 1.723644578
dL = 1.27, dU = 1.45. Since D = 1.7236 > 1.45, do not reject H0. There is no evidence
of autocorrelation.
(e) Based on the results of (c) and (d), there is no reason to question the validity of the
model.

13.38 (a) b0 = –2.535, b1 = 0.060728


(b) Ŷ = −2.535 + 0.060728X = −2.535 + 0.060728(83) = 2.5054 or $2505.40
(c) [Residual plot: Residuals versus Time Period]

(d) D = 1.64 > 1.42. There is no evidence of positive autocorrelation among the residuals.


(e) The plot of the residuals versus time period shows some clustering of positive and
negative residuals for intervals in the domain, suggesting a nonlinear model might be
better. Otherwise, the model appears to be adequate.


13.39 (a) H0: ρ = 0    H1: ρ ≠ 0
\( t = \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \dfrac{0.8 - 0}{\sqrt{\dfrac{1 - 0.8^2}{10 - 2}}} = 3.7712 \)
(b) d.f. = 8, lower critical value = -2.3060, upper critical value = 2.3060.
(c) Since t =3.7712 is greater than the upper critical value of 2.3060, reject H0.
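
The test statistic in 13.39 can be reproduced with a few lines of Python. This is a sketch: the function name is illustrative, and the critical value is taken from the solution rather than computed.

```python
import math

def correlation_t_statistic(r, n, rho=0.0):
    """t statistic for H0: rho = rho0, with n - 2 degrees of freedom."""
    return (r - rho) / math.sqrt((1 - r ** 2) / (n - 2))

t_stat = correlation_t_statistic(r=0.8, n=10)
t_critical = 2.3060  # upper critical value quoted in part (b), d.f. = 8

print(round(t_stat, 4))
print("reject H0" if abs(t_stat) > t_critical else "do not reject H0")
```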

13.40 (a) H0: β1 = 0    H1: β1 ≠ 0
Test statistic: t = (b1 − 0) / S_b1 = 4.5 / 1.5 = 3.00
(b) With n = 18, d.f. = 18 − 2 = 16, and the critical values are t16 = ±2.1199.
(c) Since t = 3.00 > 2.1199, reject H0. There is evidence that the fitted linear regression model is useful.
(d) b1 − t16 S_b1 ≤ β1 ≤ b1 + t16 S_b1
4.5 − 2.1199(1.5) ≤ β1 ≤ 4.5 + 2.1199(1.5)
1.32 ≤ β1 ≤ 7.68
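
Parts (a) and (d) of 13.40 use the same three inputs: the slope, its standard error, and the critical t value. The sketch below assumes those are supplied by hand; the function name is illustrative, and the critical value 2.1199 is the one quoted in the solution.

```python
def slope_test_and_interval(b1, s_b1, t_critical, beta1_null=0.0):
    """Return the t statistic for the slope and its confidence interval."""
    t_stat = (b1 - beta1_null) / s_b1
    half_width = t_critical * s_b1
    return t_stat, (b1 - half_width, b1 + half_width)

t_stat, (lower, upper) = slope_test_and_interval(b1=4.5, s_b1=1.5, t_critical=2.1199)
print(t_stat)
print(round(lower, 2), round(upper, 2))
```

Because the interval excludes 0, it leads to the same decision as the t test at the same significance level.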

13.41 (a) MSR = SSR / k = 60 / 1 = 60
MSE = SSE / (n − k − 1) = 40 / 18 = 2.222
F = MSR / MSE = 60 / 2.222 = 27
(b) F1,18 = 4.41
(c) Since F = 27 > 4.41, reject H0. There is evidence that the fitted linear regression model is useful.
(d) \( r^2 = \dfrac{SSR}{SST} = \dfrac{60}{100} = 0.6 \), \( r = -\sqrt{0.60} = -0.7746 \)
(e) H0: ρ = 0 There is no correlation between X and Y.
H1: ρ ≠ 0 There is correlation between X and Y.
d.f. = 18. Decision rule: Reject H0 if |t_cal| > 2.1009.
Test statistic: \( t = \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \dfrac{-0.7746}{\sqrt{\dfrac{1 - 0.6}{18}}} = -5.196 \).
Since t_cal = −5.196 is less than the lower critical bound of −2.1009, reject H0.
There is enough evidence to conclude that the correlation between X and Y is
significant.
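
The F test in parts (a) through (d) can be sketched as follows. The function name is illustrative; n = 20 is an assumption inferred from the 18 error degrees of freedom, and the critical value 4.41 is the one quoted in part (b).

```python
def anova_f_test(ssr, sse, n, k=1):
    """Return (MSR, MSE, F) for a regression with k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr, mse, msr / mse

# n = 20 is inferred from n - k - 1 = 18 with k = 1
msr, mse, f_stat = anova_f_test(ssr=60, sse=40, n=20)
r_sq = 60 / (60 + 40)   # r^2 = SSR / SST, with SST = SSR + SSE
f_critical = 4.41       # F(1, 18) value quoted in part (b)

print(msr, round(mse, 3), round(f_stat, 2), r_sq)
print("reject H0" if f_stat > f_critical else "do not reject H0")
```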


13.42 (a) H0: β1 = 0    H1: β1 ≠ 0
\( t = \dfrac{b_1 - \beta_1}{S_{b_1}} = \dfrac{7.4}{1.59} = 4.65 > t_{10} = 2.2281 \) with 10 degrees of freedom for α = 0.05.
Reject H0. There is enough evidence to conclude that the fitted linear regression
model is useful.
(b) \( b_1 \pm t_{n-2} S_{b_1} = 7.4 \pm 2.2281(1.59) \), so 3.86 ≤ β1 ≤ 10.94

13.43 (a) H 0 : β1 = 0 H1 : β1 ≠ 0
PHStat output:
            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   26.7240        26.5425          1.0068   0.3435    -34.4832    87.9311
Reported    0.5719         0.0668           8.5567   0.0000    0.4178      0.7260
Since the p-value is essentially zero, reject H0 at 5% level of significance. There is
evidence of a linear relationship between reported sales and audited sales.
(b) 0.4178 ≤ β1 ≤ 0.7260
13.44 (a) t = 16.5223 > t34 = 2.0322 for α = 0.05 . Reject H0. There is evidence that the
fitted linear regression model is useful.
(b) 0.0439 ≤ β1 ≤ 0.0562

13.45 (a)
            Coefficients   Standard Error   t Stat    P-value      Lower 95%   Upper 95%
Intercept   -0.4480        0.2783           -1.6097   0.1187       -1.0180     0.1221
Customers   0.1285         0.0126           10.1796   6.4878E-11   0.1026      0.1543
p-value is virtually 0 < 0.05. Reject H0. There is evidence that the fitted linear
regression model is useful.
(b) 0.1026 ≤ β1 ≤ 0.1543

13.46 (a) H 0 : β1 = 0 H1 : β1 ≠ 0
            Coefficients   Standard Error   t Stat    P-value       Lower 95%   Upper 95%
Intercept   -368.2846      38.3722          -9.5977   2.36391E-10   -446.8865   -289.6827
Revenue     4.7306         0.2387           19.8167   5.16396E-18   4.2416      5.2196
Since the p-value is essentially zero, reject H0 at 5% level of significance. There is
evidence of a linear relationship between annual revenue sales and franchise value.
(b) 4.2416 ≤ β 1 ≤ 5.2196

13.47 (a) t = 7.74 > t 23 = 2.0687 with 23 degrees of freedom for α = 0.05 . Reject H0. There
is evidence that the fitted linear regression model is useful.
(b) 0.7803 ≤ β1 ≤ 1.3497


13.48 (a) p-value = 2.1259E-09 < 0.05. Reject H0. There is evidence that the fitted linear
regression model is useful.
(b) 3.3072 ≤ β1 ≤ 5.3590

13.49 (a) AT&T’s stock moves only 43% as much as the overall market and is much less
volatile. The stock of Disney Company moves only 81% as much as the overall
market and is considered less volatile than the market. Alcoa’s stock moves 21%
more than the overall market and is considered as a little more volatile than the
overall market. LSI Logic’s stock moves 176% more than the overall market and is
considered as very volatile. Wave Systems’ stock moves 422% more than the overall
market and is considered as extremely volatile.
(b) Investors can use the beta value as a measure of the volatility of a stock to assess its
risk.

13.50 (a) (% daily change in UOPIX) = b0 + 2.00 (% daily change in S&P 500 Index)
(b) If the S&P gains 30% in a year, the UOPIX is expected to gain an estimated 60%.
(c) If the S&P loses 35% in a year, the UOPIX is expected to lose an estimated 70%.
(d) Since leveraged funds have higher volatility and, hence, higher risk than the market,
risk-averse investors should stay away from these funds. Risk takers, on the other hand,
can benefit from the higher potential gain from these funds.

13.51 (a) r = 0.7196. There appears to be a moderate positive linear relationship between
calories and fat (in grams) of 16-ounce iced coffee drinks at Dunkin’ Donuts and
Starbucks.
(b) t = 2.3170, p-value = 0.0683 > 0.05. Do not reject H0. At the 0.05 level of
significance, there is not enough evidence of a significant linear relationship between
calories and fat (in grams) of 16-ounce iced coffee drinks at Dunkin’ Donuts and
Starbucks.

13.52 (a) r = 0.8935. There appears to be a rather high positive linear relationship between the
mileage as calculated by owners and by current government standards.
(b) t = 5.2639, p-value = 0.0012 < 0.05. Reject H0. At the 0.05 level of significance,
there is a significant linear relationship between the mileage as calculated by owners
and by current government standards.

13.53 (a) r = 0.6092. There appears to be a moderate positive linear relationship between the
coaches’ salary and revenue.
(b) t = 5.0370, p-value is virtually zero. Reject H0. At the 0.05 level of significance,
there is a significant linear relationship between the coaches’ salary and revenue.

13.54 (a) r = 0.5497. There appears to be a moderate positive linear relationship between the
average Wonderlic score of football players trying out for the NFL and the
graduation rate for football players at selected schools.
(b) t = 3.9485, p-value = 0.0004 < 0.05. Reject H0. At the 0.05 level of significance,
there is a significant linear relationship between the average Wonderlic score of
football players trying out for the NFL and the graduation rate for football players at
selected schools.
(c) There is a significant linear relationship between the average Wonderlic score of
football players trying out for the NFL and the graduation rate for football players at
selected schools but the positive linear relationship is considered as only moderate.


13.55 (a) When X = 2, Ŷ = 5 + 3X = 5 + 3(2) = 11
\( h = \dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \dfrac{1}{20} + \dfrac{(2 - 2)^2}{20} = 0.05 \)
95% confidence interval: \( \hat{Y} \pm t_{18} S_{YX} \sqrt{h} = 11 \pm 2.1009 \cdot 1 \cdot \sqrt{0.05} \)
10.53 ≤ μ_{Y|X=2} ≤ 11.47
(b) 95% prediction interval: \( \hat{Y} \pm t_{18} S_{YX} \sqrt{1 + h} = 11 \pm 2.1009 \cdot 1 \cdot \sqrt{1.05} \)
8.847 ≤ Y_{X=2} ≤ 13.153

13.56 (a) When X = 4, Ŷ = 5 + 3X = 5 + 3(4) = 17
\( h = \dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \dfrac{1}{20} + \dfrac{(4 - 2)^2}{20} = 0.25 \)
95% confidence interval: \( \hat{Y} \pm t_{18} S_{YX} \sqrt{h} = 17 \pm 2.1009 \cdot 1 \cdot \sqrt{0.25} \)
15.95 ≤ μ_{Y|X=4} ≤ 18.05
(b) 95% prediction interval: \( \hat{Y} \pm t_{18} S_{YX} \sqrt{1 + h} = 17 \pm 2.1009 \cdot 1 \cdot \sqrt{1.25} \)
14.651 ≤ Y_{X=4} ≤ 19.349
(c) The intervals in this problem are wider because the value of X is farther from X̄.
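
The interval calculations in 13.55 and 13.56 differ only in the value of h. The following Python sketch makes that explicit; the function names are illustrative, and the critical value 2.1009 is the one quoted in the solutions.

```python
import math

def leverage(x, x_bar, ssx, n):
    """h = 1/n + (x - x_bar)^2 / SSX for simple linear regression."""
    return 1 / n + (x - x_bar) ** 2 / ssx

def mean_and_prediction_intervals(y_hat, t_critical, s_yx, h):
    """Return (confidence interval for the mean, prediction interval)."""
    ci_half = t_critical * s_yx * math.sqrt(h)
    pi_half = t_critical * s_yx * math.sqrt(1 + h)
    return (y_hat - ci_half, y_hat + ci_half), (y_hat - pi_half, y_hat + pi_half)

# Problem 13.56: X = 4, X-bar = 2, SSX = 20, n = 20, S_YX = 1
h = leverage(x=4, x_bar=2, ssx=20, n=20)
ci, pi = mean_and_prediction_intervals(y_hat=17, t_critical=2.1009, s_yx=1, h=h)
print(round(h, 2))
print([round(v, 2) for v in ci])
print([round(v, 3) for v in pi])
```

Calling `leverage` with x=2 instead gives h = 0.05, the value in 13.55, which is why the intervals at X = 4 are wider.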

13.57 (a) 223.5000 ≤ μY | X = 400 ≤ 287.4577


(b) 153.0765 ≤ YX = 400 ≤ 357.8812
(c) Part (b) provides an interval prediction for the individual response given a specific
value of the independent variable, and part (a) provides an interval estimate for the
mean value given a specific value of the independent variable. Since there is much
more variation in predicting an individual value than in estimating a mean value, a
prediction interval is wider than a confidence interval estimate holding everything
else fixed.

13.58 (a) \( \hat{Y}_i \pm t_{\alpha/2} S_{YX} \sqrt{h_i} = 204.2 \pm 2.2281(30.81)\sqrt{0.1373} \)
178.76 ≤ μ_{Y|X=8} ≤ 229.64
(b) \( \hat{Y}_i \pm t_{\alpha/2} S_{YX} \sqrt{1 + h_i} = 204.2 \pm 2.2281(30.81)\sqrt{1 + 0.1373} \)
131.00 ≤ Y_{X=8} ≤ 277.40
(c) Part (b) provides an interval prediction for the individual response given a specific
value of the independent variable, and part (a) provides an interval estimate for the
mean value given a specific value of the independent variable. Since there is much
more variation in predicting an individual value than in estimating a mean value, a
prediction interval is wider than a confidence interval estimate holding everything
else fixed.


13.59 (a) 1.8987 ≤ μ Y | X = 20 ≤ 2.3442


(b) 0.8820 ≤ Y X = 20 ≤ 3.3609
(c) Part (b) provides an interval prediction for the individual response given a specific
value of the independent variable, and part (a) provides an interval estimate for the
mean value given a specific value of the independent variable. Since there is much
more variation in predicting an individual value than in estimating a mean value, a
prediction interval is wider than a confidence interval estimate holding everything
else fixed.

13.60 (a) 20.7990 ≤ μY | X =500 ≤ 24.5419


(b) 12.2755 ≤ YX =500 ≤ 33.0654
(c) Part (b) provides an interval prediction for the individual response given a specific
value of the independent variable, and part (a) provides an interval estimate for the
mean value given a specific value of the independent variable. Since there is much
more variation in predicting an individual value than in estimating a mean value, a
prediction interval is wider than a confidence interval estimate holding everything
else fixed.

13.61 (a) 1153.0 ≤ μY | X =1000 ≤ 1331.5


(b) 829.9 ≤ YX =1000 ≤ 1654.6
(c) Part (b) provides an interval prediction for the individual response given a specific
value of the independent variable, and part (a) provides an interval estimate for the
mean value given a specific value of the independent variable. Since there is much
more variation in predicting an individual value than in estimating a mean value, a
prediction interval is wider than a confidence interval estimate holding everything
else fixed.

13.62 (a) 325.0222 ≤ μY|X=150 ≤ 357.5832


(b) 252.8701 ≤ Y X =150 ≤ 429.7352
(c) Part (b) provides an interval prediction for the individual response given a specific
value of the independent variable, and part (a) provides an interval estimate for the
mean value given a specific value of the independent variable. Since there is much
more variation in predicting an individual value than in estimating a mean value, a
prediction interval is wider than a confidence interval estimate holding everything
else fixed.

13.63 (a) Ŷ = b0 + b1X = −140.1203 + 4.3331(30) = −10.1273 thousand
(b) The prediction interval is more useful here because you want to predict the actual
number of DVDs sold, not the mean number of DVDs sold.
(c) −116.3948 thousand ≤ Y_{X=30} ≤ 96.1407 thousand

13.64 The slope of the line, b1, represents the estimated expected change in Y per unit change in X.
It represents the estimated mean amount that Y changes (either positively or negatively) for a
particular unit change in X. The Y intercept b0 represents the estimated mean value of Y when
X equals 0.


13.65 The coefficient of determination measures the proportion of variation in Y that is explained
by the independent variable X in the regression model.

13.66 The unexplained variation or error sum of squares (SSE) will be equal to zero only when the
regression line fits the data perfectly and the coefficient of determination equals 1.

13.67 The explained variation or regression sum of squares (SSR) will be equal to zero only when
there is no relationship between the Y and X variables, and the coefficient of determination
equals 0.

13.68 Unless a residual analysis is undertaken, you will not know whether the model fit is
appropriate for the data. In addition, residual analysis can be used to check whether the
assumptions of regression have been seriously violated.

13.69 The assumptions of regression are normality of error, homoscedasticity, and independence of
errors.

13.70 The normality of error assumption can be evaluated by obtaining a histogram, box plot,
and/or normal probability plot of the residuals. The homoscedasticity assumption can be
evaluated by plotting the residuals on the vertical axis and the X variable on the horizontal
axis. The independence of errors assumption can be evaluated by plotting the residuals on the
vertical axis and the time order variable on the horizontal axis. This assumption can also be
evaluated by computing the Durbin-Watson statistic.

13.71 If the data in a regression analysis has been collected over time, then the assumption of
independence of errors needs to be evaluated using the Durbin-Watson statistic.

13.72 The confidence interval for the mean response estimates the mean response for a given X
value. The prediction interval estimates the value for a single item or individual.

13.73 (a) There is a strong positive correlation between course mean and cumulative GPA, and
between total hits and hit consistency. There is a moderate positive correlation
between course mean and hit consistency, and between cumulative GPA and hit
consistency.
(b) It is not surprising that cumulative GPA is strongly and positively related to course
mean because course mean in a course contributes to the cumulative GPA and both
measure the performance of a student. It is also not surprising that total hits and hit
consistency are highly positively related because hit consistency contributes to total
hits. Hit consistency is positively related to both cumulative GPA and course mean
because the more frequently a student visits the Internet site supporting a course, the
more current the student is in the course and presumably the better the student
performs in the course.


13.74 (a) b0 = 24.84, b1 = 0.14


(b) 24.84 is the portion of estimated mean delivery time that is not affected by the
number of cases delivered. For each additional case, the estimated mean delivery
time increases by 0.14 minutes.
(c) Ŷ = 24.84 + 0.14X = 24.84 + 0.14(150) = 45.84
(d) No, 500 cases is outside the relevant range of the data used to fit the regression
equation.
(e) r² = 0.972. So, 97.2% of the variation in delivery time can be explained by the
variation in the number of cases.
(f) Based on a visual inspection of the graphs of the distribution of residuals and the
residuals versus the number of cases, there is no pattern. The model appears to be
adequate.
(g) t = 24.88 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05 . Reject H0.
There is evidence that the fitted linear regression model is useful.
(h) 44.88 ≤ μY | X =150 ≤ 46.80
(i) 41.56 ≤ YX =150 ≤ 50.12
(j) 0.1282 ≤ β 1 ≤ 0.1518
(k) One of the possible uses is that you could use the model to predict the delivery costs
and charge customers based on the number of cases that are delivered.

13.75 (a) b0 = –63.02, b1 = 0.189


(b) For each additional incoming call, the estimated mean number of trade executions
increases by 0.189. – 63.02 is the portion of the estimated mean number of trade
executions that is not affected by the number of incoming calls.
(c) Ŷ = −63.02 + 0.189X = −63.02 + 0.189(2000) = 314.99
(d) No, 5000 incoming calls is outside the relevant range of the data used to fit the
regression equation.
(e) r² = 0.630. So, 63.0% of the variation in trade executions can be explained by the
variation in the number of incoming calls.
(f) Based on a visual inspection of the graphs of the distribution of residuals and the
residuals versus the number of incoming calls, there is no pattern. The model appears to be
adequate.
(g) D = 1.96
(h) D = 1.96 > 1.52. There is no evidence of positive autocorrelation. The model appears
to be adequate.
(i) t = 7.50 > t 33 = 2.0345 with 33 degrees of freedom for α = 0.05 . Reject H0. There
is evidence that the fitted linear regression model is useful.
(j) 302.07 ≤ μY | X = 2000 ≤ 327.91
(k) 253.76 ≤ YX = 2000 ≤ 376.22
(l) 0.1377 ≤ β 1 ≤ 0.2403


13.76 (a)
[Figure: Scatter Diagram, Y versus X]

b0 = -122.3439 b1 = 1.7817
(b) For each additional thousand dollars in assessed value, the estimated mean selling price
of a house increases by 1.7817 thousand dollars. The estimated mean selling price of a
house with a 0 assessed value is –122.3439 thousand dollars. However, this
interpretation is not meaningful in the current setting since the assessed value is very
unlikely to be 0 for a house.
(c) Yˆ = -122.3439 + 1.78171X = -122.3439 + 1.78171(170 ) = 180.5475 thousand
dollars
(d) r2 = 0.9256. So, 92.56% of the variation in selling price can be explained by the
variation in assessed value.
(e)
[Figure: Assessed Value Residual Plot, residuals versus assessed value]

[Figure: Normal Probability Plot, residuals versus Z value]

Both the residual plot and the normal probability plot do not reveal any potential
violation of the linearity, equal variance and normality assumptions.
(f) t = 18.6648 with 28 degrees of freedom, p-value is virtually zero. Since p-value <
0.05, reject H0. There is evidence of a linear relationship between selling price and
assessed value.
(g) 178.7066 thousand dollars ≤ μ Y |X =170 ≤ 182.3884 thousand dollars
(h) 173.1953 thousand dollars ≤ Y X =170 ≤ 187.8998 thousand dollars
(i) 1.5862 ≤ β1 ≤ 1.9773
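The intervals in (g) and (h) can be sanity-checked against the point prediction in (c). The sketch below assumes the standard simple-regression interval formulas, which this section does not restate: the confidence interval half-width is t·S·sqrt(h) and the prediction interval half-width is t·S·sqrt(1 + h), where h is the leverage term at X = 170. Under that assumption both intervals share the same center, and h can be recovered from the ratio of the two half-widths. The function names are illustrative.

```python
def half_width(lower, upper):
    """Half the length of an interval."""
    return (upper - lower) / 2.0


def leverage_from_intervals(ci, pi):
    """Recover h, assuming CI half-width = t*S*sqrt(h) and PI half-width = t*S*sqrt(1 + h).

    The squared ratio of half-widths equals h / (1 + h), so h = ratio^2 / (1 - ratio^2).
    """
    ratio_sq = (half_width(*ci) / half_width(*pi)) ** 2
    return ratio_sq / (1.0 - ratio_sq)


# Intervals reported in 13.76 (g) and (h), in thousands of dollars
ci = (178.7066, 182.3884)
pi = (173.1953, 187.8998)

ci_center = sum(ci) / 2.0
pi_center = sum(pi) / 2.0
h = leverage_from_intervals(ci, pi)

print(f"CI center: {ci_center:.4f}, PI center: {pi_center:.4f}")
print(f"Implied leverage term h at X = 170: {h:.4f}")
```

Both centers should land on the prediction from (c), and the prediction interval is wider because it adds the variability of an individual house to the uncertainty in the estimated mean.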

13.77 (a)
[Figure: Scatter Diagram, Y versus X]

b0 = 151.9153 b1 = 16.6334


(b) For each additional thousand square feet increase in the size of a house, the estimated
mean assessed value increases by 16.6334 thousand dollars. The estimated mean
assessed value of a house with a size of 0 square feet is 151.9153 thousand dollars.
However, this interpretation is not meaningful in the current setting since the size of a
house is very unlikely to be 0 for a house with a positive assessed value.
(c) Yˆ = 151.9153399 + 16.6334 X = 151.9153399 + 16.6334(1.75) = 181.0237
thousand dollars
(d) r2 = 0.6593. So, 65.93% of the variation in assessed value can be explained by the
variation in the size.
(e)

[Figure: Heating Area Residual Plot, residuals versus heating area]

[Figure: Normal Probability Plot, residuals versus Z value]

Both the residual plot and the normal probability plot do not reveal any potential
violation of the linearity, equal variance and normality assumptions.


(f) t = 5.0161 with 13 degrees of freedom, p-value = 0.0002. Since p-value < 0.05,
reject H0. There is evidence of a linear relationship between assessed value and
heating area.
(g) 179.2778 thousand dollars ≤ μ Y |X =1.75 ≤ 182.7697 thousand dollars
(h) 174.4805 thousand dollars ≤ Y X =1.75 ≤ 187.5669 thousand dollars
(i) 9.4695 ≤ β 1 ≤ 23.7972

13.78 (a)
[Figure: Scatter Diagram, GPA versus GMAT]
b0 = 0.30, b1 = 0.00487
(b) 0.30 is the portion of estimated mean GPI that is not affected by the GMAT
score. The mean GPI of a student with a zero GMAT score is estimated to be
0.30. For each additional point on the GMAT score, the estimated GPI increases by
an average of 0.00487.
(c) Yˆ = 0.30 + 0.00487 X = 0.30 + 0.00487(600) = 3.222
(d) r2 = 0.7978. 79.78% of the variation in the GPI can be explained by the
variation in the GMAT score.
(e) Based on a visual inspection of the graphs of the distribution of residuals and the
residuals versus the GMAT score, there is no pattern. The model appears to be
adequate.
(f) t = 8.428 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05 . Reject H0.
There is evidence that the fitted linear regression model is useful.
(g) 3.144 ≤ μY | X =600 ≤ 3.301
(h) 2.886 ≤ YX =600 ≤ 3.559
(i) 0.00366 ≤ β 1 ≤ 0.00608


13.79 (a)
[Figure: Scatter Diagram, Completion Time (hours) versus Invoices Processed]
b0 = 0.4872, b1 = 0.0123
(b) 0.4872 is the portion of estimated mean completion time that is not affected by the
number of invoices processed. When there is no invoice to process, the mean
completion time is estimated to be 0.4872 hours. Of course, this is not a very
meaningful interpretation in the context of the problem. For each additional invoice
processed, the estimated mean completion time increases by 0.0123 hours.
(c) Yˆ = 0.4872 + 0.0123 X = 0.4872 + 0.0123(150) = 2.3304
(d) r2 = 0.8623. 86.23% of the variation in completion time can be explained by the
variation in the number of invoices processed.
(e)
[Figure: Invoices Residual Plot, residuals versus invoices, horizontal axis 0 to 400]

[Figure: Invoices Residual Plot, residuals with horizontal axis 0 to 40]
(f) Based on a visual inspection of the graphs of the distribution of residuals and the
residuals versus the number of invoices and time, there appears to be autocorrelation
in the residuals.
(g) D = 0.69 < 1.37 = dL. There is evidence of positive autocorrelation. The
model does not appear to be adequate. The number of invoices and, hence, the time
needed to process them, tend to be high for a few days in a row during historically
heavier shopping days or during advertised sales days. This could be a possible
cause of the positive autocorrelation.
(h) Due to the violation of the independence of errors assumption, the prediction made in
(c) is very likely to be erroneous.
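The Durbin-Watson comparisons in 13.75 (h) and 13.79 (g) follow a simple decision rule. The sketch below is illustrative only: it uses the standard definition D = Σ(e_i − e_{i−1})² / Σe_i², and the residual list `made_up_residuals` is invented for demonstration because the manual does not list the residuals. Only the bounds quoted in the answers (dL = 1.37 and the upper bound 1.52) are taken from the text; a bound that is not supplied is simply not tested.

```python
from typing import Optional, Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic for residuals ordered in time."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def positive_autocorrelation_verdict(
    d: float, d_lower: Optional[float] = None, d_upper: Optional[float] = None
) -> str:
    """Apply the lower/upper bound rule for positive autocorrelation."""
    if d_lower is not None and d < d_lower:
        return "evidence of positive autocorrelation"
    if d_upper is not None and d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive with the bounds supplied"


# Statistics and bounds quoted in 13.79 (g) and 13.75 (h)
print(positive_autocorrelation_verdict(0.69, d_lower=1.37))
print(positive_autocorrelation_verdict(1.96, d_upper=1.52))

# Invented residuals with long runs of the same sign (not data from the manual)
made_up_residuals = [0.5, 0.4, 0.45, 0.3, -0.2, -0.3, -0.35, -0.4]
print(f"D for the invented residuals: {durbin_watson(made_up_residuals):.3f}")
```

Residuals that stay on one side of zero for several periods in a row push D well below 2, which is the pattern described in 13.79 (f).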

13.80 (a)
[Figure: Scatter Plot, O-ring Damage Index versus Temperature (degrees F)]

There is not any clear relationship between atmospheric temperature and O-ring
damage from the scatter plot.

(b),(f)
[Figure: O-ring Damage Index versus Temperature (degrees F)]

(c) In (b), there are 16 observations with an O-ring damage index of 0 for a variety of
temperatures. If one concentrates on these observations with no O-ring damage,
there is obviously no relationship between O-ring damage index and temperature. If
all observations are used, the observations with no O-ring damage will bias the
estimated relationship. If the intention is to investigate the relationship between the
degrees of O-ring damage and atmospheric temperature, it makes sense to focus only
on the flights in which there was O-ring damage.
(d) A prediction should not be made for an atmospheric temperature of 31 °F because it is
outside the range of the temperature variable in the data. Such a prediction would
involve extrapolation, which assumes that any relationship between the two variables
will continue to hold outside the domain of the temperature variable.
(e) Yˆ = 18.036 − 0.240X
(g) A nonlinear model is more appropriate for these data.
(h)

[Figure: Temperature Residual Plot, residuals versus temperature]

The string of negative residuals and positive residuals that lie on a straight line with a
positive slope in the lower-right corner of the plot is a strong indication that a
nonlinear model should be used if all 23 observations are to be used in the fit.
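The warning in (d) about extrapolation can be built into the prediction step itself. The following is a minimal sketch: the coefficients come from (e), but the bounds `x_min` and `x_max` in the usage lines are placeholder values chosen for illustration and are not taken from the data set, which this manual does not list.

```python
def predict_within_range(b0, b1, x, x_min, x_max):
    """Return b0 + b1*x, refusing to extrapolate beyond the observed X range."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"X = {x} is outside the relevant range [{x_min}, {x_max}]; "
            "a prediction here would be an extrapolation."
        )
    return b0 + b1 * x


# Coefficients from 13.80 (e); the range bounds are hypothetical placeholders.
B0, B1 = 18.036, -0.240
X_MIN, X_MAX = 50.0, 85.0

print(predict_within_range(B0, B1, 70, X_MIN, X_MAX))

try:
    predict_within_range(B0, B1, 31, X_MIN, X_MAX)
except ValueError as err:
    print(err)
```

The same guard applies to 13.74 (d) and 13.75 (d), where 500 cases and 5000 incoming calls fall outside the relevant range.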


13.81 (a) b0 = 138.1026, b1 = -13.0054


(b) For a team that has an E.R.A. of 0, the estimated mean number of wins is 138.1026.
For each additional unit increase in team E.R.A., the estimated mean number of wins
decreases by 13.0054.
(c) Yˆ = 138.1026 − 13.0054(4.5) = 79.5781
(d) r2 = 0.5933. So, 59.33% of the variation in number of wins can be explained by the
variation in the team E.R.A.
(e)
[Figure: E.R.A. Residual Plot, residuals versus E.R.A.]

Based on a visual inspection of the graphs of the distribution of the residuals versus
E.R.A., there is no pattern. The model appears to be adequate.
(f) H 0 : β1 = 0 H1 : β1 ≠ 0
p-value is virtually zero. Reject H0 at the 5% level of significance. There is evidence
that the fitted linear regression model is useful.
(g) 76.9092 ≤ μ Y | X = X i ≤ 82.2470
(h) 64.9295 ≤ Y X = Xi ≤ 94.2267
(i) -17.1737 ≤ β1 ≤ -8.8372
(j) The “population” might be considered to be all the teams in recent years in which
baseball has been played.
(k) Other independent variables that might be considered for inclusion in the models are
(i) runs scored, (ii) hits allowed, (iii) walks allowed, (iv) number of errors, etc.


13.82 (a) b0 = 14.6816 b1 = 0.1135


(b) For each additional percentage increase in graduation rate, the estimated mean average
Wonderlic score increases by 0.1135. The estimated mean average Wonderlic score is
14.6816 for a school that has 0% graduation rate. However, this interpretation is not
meaningful in the current setting since graduation rate is very unlikely to be 0% for any
school.
(c) Yˆ = 14.6816 + 0.11347 X = 14.6816 + 0.11347(50 ) = 20.4
(d) r2 = 0.3022. So, 30.22% of the variation in average Wonderlic score can be explained
by the variation in graduation rate.
(e)
[Figure: Graduation% Residual Plot, residuals versus graduation rate (%)]

[Figure: Normal Probability Plot, residuals versus Z value]

Both the residual plot and the normal probability plot do not reveal any potential
violation of the linearity, equal variance and normality assumptions.


(f) t = 3.9485 with 36 degrees of freedom, p-value = 0.0004. Since p-value < 0.05,
reject H0. There is evidence of a linear relationship between the Wonderlic score for
football players trying out for the NFL from a school and the graduation rate.
(g) 19.6 ≤ μ Y |X =50 ≤ 21.1
(h) 15.9 ≤ Y X =50 ≤ 24.8
(i) 0.0552 ≤ β1 ≤ 0.1718

13.83 (a) b0 = 0.3665, b1 = 0.0595


(b) Literal interpretation of the intercept is meaningless since no team can have 0
revenues. For each additional million dollars increase in team revenues, the
estimated mean salary of the coach increases by 0.0595 million dollars.
(c) Yˆ = 0.3665 + 0.0595(7 ) = $0.7827 millions
(d) r2 = 0.3711. So, 37.11% of the variation in the coach’s salary can be explained by the
variation in the team revenues.
(e)
[Figure: Revenues Residual Plot, residuals versus revenues]

Based on a visual inspection of the graphs of the distribution of the residuals versus
revenues, there is no pattern. The model appears to be adequate.
(f) H 0 : β1 = 0 H1 : β1 ≠ 0
p-value is virtually zero. Reject H0 at the 5% level of significance. There is evidence
that the fitted linear regression model is useful.
(g) $0.6762 millions ≤ μ Y | X = X i ≤ $0.8892 millions
(h) $0.0615 millions ≤ Y X = Xi ≤ $1.5039 millions
(i) 0.0357 ≤ β 1 ≤ 0.0833


13.84 (a)
[Figure: Scatter Diagram, Weight (grams) versus Circumference (cms.)]

Yˆ = −2629.222+82.4717X
(b) For each increase of one centimeter in circumference, the estimated mean weight of a
pumpkin will increase by 82.4717 grams.
(c) Yˆ = −2629.222+82.4717 ( 60 ) = 2319.080 grams.
(d) There appears to be a positive relationship between weight and circumference of a
pumpkin. It is a good idea for the farmer to sell pumpkins by circumference instead
of weight because circumference is a good predictor of weight, and it is much easier to
measure the circumference of a pumpkin than its weight.
(e) r2 = 0.9373. 93.73% of the variation in pumpkin weight can be explained by the
variation in circumference.
(f)

[Figure: Circumference Residual Plot, residuals versus circumference]

There appears to be a nonlinear relationship between circumference and weight.


(g) p-value is virtually 0. Reject H0. There is sufficient evidence to conclude that there
is a linear relationship between the circumference and the weight of a pumpkin.
(h) 72.7875 < β1 < 92.1559
(i) 2186.9589 < μY | X =60 < 2451.2020
(j) 1726.5508 < YX =60 < 2911.6101


13.85 (a)
[Figure: Scatter Plot, Sales versus Income]

(b) b0 = 299876.8059 b1 = 39.1698


Yˆ = 299876.8059 + 39.1698X
(c) Since median family income of customer base cannot be 0, b0 just captures the
portion of the latest one-month sales total that varies with factors other than income.
b1 = 39.1698 means that as the median family income of customer base increases by
one dollar, the estimated mean latest one-month sales total will increase by $39.17.
(d) r 2 = 0.1472 . 14.72% of the total variation in the franchise's latest one-month sales
total can be explained by using the median family income of customer base.
(e)

[Figure: Income Residual Plot, residuals versus income]

There is a slight increase in the variance of the residuals at the higher end of the
median family income. In general, however, the assumption of homoscedasticity
seems to be intact.

[Figure: histogram of the residuals with cumulative percentage; horizontal axis Residual, vertical axis Frequency]

The histogram does not suggest severe asymmetry nor abnormal extreme
observations. So the normality assumption also seems to be intact.
(f) H0 : ρ = 0 H1 : ρ ≠ 0
Test statistic: t = r / √[(1 − r²)/(n − 2)] = 2.4926
Decision rule: Reject H 0 when |t|>2.0281.
Decision: Since t = 2.4926 is greater than the upper critical bound 2.0281, reject H 0 .
There is enough evidence to conclude that there is a linear relationship between one-
month sales total and median income of customer base.
(g) b1 ± tn − 2 Sb1 = 39.1697 ± 2.0281(15.7143)
7.2995 ≤ β1 ≤ 71.0400
You are 95% confident that the slope is somewhere between 7.2995 and 71.04.
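The test statistic in (f) depends only on r and the sample size, so it can be reproduced from the r² reported in (d). This is a sketch under one stated assumption: n = 38 is not given in the answer and is inferred here from the critical value 2.0281, which corresponds to 36 degrees of freedom at α = 0.05. The function name is illustrative.

```python
import math


def t_from_r_squared(r_squared, n, positive_slope=True):
    """t statistic for H0: rho = 0, computed as r / sqrt((1 - r^2) / (n - 2))."""
    r = math.sqrt(r_squared) if positive_slope else -math.sqrt(r_squared)
    return r / math.sqrt((1.0 - r_squared) / (n - 2))


# 13.85: r^2 = 0.1472 from (d); n = 38 is an inferred assumption (n - 2 = 36)
t_income = t_from_r_squared(0.1472, n=38)

# 13.76: r^2 = 0.9256 with 28 degrees of freedom, so n = 30
t_assessed = t_from_r_squared(0.9256, n=30)

print(f"13.85: t is approximately {t_income:.3f}")
print(f"13.76: t is approximately {t_assessed:.3f}")
```

Because r² is rounded to four decimals, the results match the reported 2.4926 and 18.6648 only to about three significant figures.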


13.86 (a)

[Figure: Scatter Plot, Sales versus Age]

(b) Yˆ = 931626.16+21782.76X
(c) Since median age of customer base cannot be 0, b0 just captures the portion of the
latest one-month mean sales total that varies with factors other than median age.
b1 = 21782.76 means that as the median age of customer base increases by one year,
the estimated mean latest one-month sales total will increase by $21782.76.
(d) r 2 = 0.0017 . Only 0.17% of the total variation in the franchise's latest one-month
sales total can be explained by using the median age of customer base.
(e)

[Figure: Age Residual Plot, residuals versus age]

The residuals are very evenly spread out across the different ranges of median age.
(f) H0 : ρ = 0 H1 : ρ ≠ 0
Test statistic: t = r / √[(1 − r²)/(n − 2)] = 0.2482
Decision rule: Reject H 0 when |t| > 2.0281.
Decision: Since t = 0.2482 is less than the upper critical bound 2.0281, do not reject
H 0 . There is not enough evidence to conclude that there is a linear relationship
between one-month sales total and median age of customer base.
(g) b1 ± tn − 2 Sb1 = 21782.76354 ± 2.0281( 87749.63)
-156181.50 ≤ β1 ≤ 199747.02


13.87 (a)
[Figure: Scatter Diagram, Sales versus HS]

There appears to be some positive linear relationship between total sales and
percentage of customer base with high school diploma.
(b) Yˆ = -2969741.23 + 59660.09X
(c) b1 = 59660.09 indicates that as the percent of customer base with a high school
diploma increases by one, the estimated mean latest one-month sales total will
increase by $59660.09.
(d) r 2 = 0.2405 . 24.05% of the total variation in the franchise's latest one-month sales
total can be explained by the percentage of customer base with a high school
diploma.
(e)
[Figure: HS Residual Plot, residuals versus HS]

The residual plot suggests there might be a violation of the homoscedasticity
assumption since the variance of the residuals increases as the percentage of
customer base with a high school diploma increases.
(f) H0 : ρ = 0 H1 : ρ ≠ 0
Test statistic: t = r / √[(1 − r²)/(n − 2)] = 3.3766
Decision rule: Reject H 0 when |t| > 2.0281.
Decision: Since t = 3.3766 is greater than the upper critical bound 2.0281, reject H 0 .
There is enough evidence to conclude that there is a linear relationship between one-
month sales total and percentage of customer base with a high school diploma.
(g) b1 ± tn − 2 Sb1 = 59660.09 ± 2.0281(17668.885 )
23825.98 ≤ β1 ≤ 95494.21


13.88 (a)
[Figure: Scatter Diagram, Sales versus College]

There is a positive linear relationship between total sales and percentage of
customer base with a college diploma.
(b) Yˆ = 789847.38 + 35854.15 X
(c) b1 = 35854.15 means that as the percent of customer base with a college diploma
increases by one, the estimated mean latest one-month sales total will increase by
$35854.15.
(d) r 2 = 0.1036 . 10.36% of the total variation in the franchise's latest one-month sales
total can be explained by the percentage of customer base with a college diploma.
(e)
[Figure: College Residual Plot, residuals versus College]

The residuals are quite evenly spread out around zero even though there might be a
slight tendency for the variance to increase as the percentage of customer base with a
college diploma increases.
(f) H0 : ρ = 0 H1 : ρ ≠ 0
Test statistic: t = r / √[(1 − r²)/(n − 2)] = 2.0392
Decision rule: Reject H 0 when |t|>2.0281.
Decision: Since t = 2.0392 is greater than the upper critical bound 2.0281, reject H 0 .
There is enough evidence to conclude that there is a linear relationship between one-
month sales total and percentage of customer base with a college diploma.
(g) b1 ± tn − 2 Sb1 = 35854.15 ± 2.0281(17582.269 )
195.75 ≤ β1 ≤ 71512.60


13.89 (a)
[Figure: Scatter Diagram, Sales versus Growth]

It is not obvious that there is any linear relationship between total sales and annual
population growth rate of customer base over the past 10 years.
(b) Yˆ = 1595571.48 + 26833.54X
(c) b0 =1595571 means the estimated mean latest one-month sales total is $1595571
when the annual population growth rate of customer base over the past 10 years is
zero. b1 = 26833.54 means that as the annual population growth rate increases by
1%, the estimated mean latest one-month sales total will increase by $26833.54.
(d) r 2 = 0.0126 . Only 1.26% of the total variation in the franchise's latest one-month
sales total can be explained by the annual population growth rate of customer base
over the past 10 years.
(e)
[Figure: Growth Residual Plot, residuals versus Growth]

There seems to be a diamond shape pattern of the residual distribution and, hence, a
violation of the homoscedasticity assumption. The variance is larger when the
growth rate is closer to zero.
(f) H0 : ρ = 0 H1 : ρ ≠ 0
Test statistic: t = r / √[(1 − r²)/(n − 2)] = 0.6776
Decision rule: Reject H 0 when |t| > 2.0281.
Decision: Since t = 0.6776 is less than the upper critical bound 2.0281, do not reject
H 0 . There is not enough evidence to conclude that there is a linear relationship
between one-month sales total and the annual population growth rate of customer
base over the past 10 years.

(g) b1 ± tn − 2 Sb1 = 26833.54 ± 2.0281( 39601.427 )
-53481.77 ≤ β1 ≤ 107148.86
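Problems 13.85 through 13.89 all reach their decisions in two equivalent ways: the t test in (f) and the slope interval in (g). The sketch below recomputes both from the b1 and Sb1 values shown in the (g) parts, using the critical value 2.0281 that appears in those interval calculations. The dictionary labels and function name are illustrative, not from the manual.

```python
T_CRIT = 2.0281  # critical value used in the (g) interval calculations

# (b1, S_b1) pairs as printed in parts (g) of 13.85 through 13.89
fits = {
    "13.85 income": (39.1697, 15.7143),
    "13.86 age": (21782.76354, 87749.63),
    "13.87 high school": (59660.09, 17668.885),
    "13.88 college": (35854.15, 17582.269),
    "13.89 growth": (26833.54, 39601.427),
}


def slope_test(b1, se_b1, t_crit=T_CRIT):
    """Return (t statistic, interval lower bound, interval upper bound)."""
    t_stat = b1 / se_b1
    margin = t_crit * se_b1
    return t_stat, b1 - margin, b1 + margin


results = {name: slope_test(b1, se) for name, (b1, se) in fits.items()}

for name, (t_stat, lower, upper) in results.items():
    decision = "reject H0" if abs(t_stat) > T_CRIT else "do not reject H0"
    print(f"{name}: t = {t_stat:.4f}, {lower:.2f} <= beta1 <= {upper:.2f}, {decision}")
```

H0 is rejected exactly when the interval excludes zero. The college fit in 13.88 is the borderline case: its t statistic barely exceeds the critical value, and the lower interval bound is only slightly above zero.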

13.90 (a) Minitab output:


Regression Analysis: Price versus Summated Rating

The regression equation is


Price = - 39.0 + 1.39 Summated Rating

Predictor Coef SE Coef T P


Constant -39.014 5.930 -6.58 0.000
Summated Rating 1.3878 0.1009 13.75 0.000

S = 6.97627 R-Sq = 65.9% R-Sq(adj) = 65.5%

Analysis of Variance

Source DF SS MS F P
Regression 1 9199.2 9199.2 189.02 0.000
Residual Error 98 4769.5 48.7
Total 99 13968.8

Excel output:
Regression Statistics
Multiple R           0.811516638
R Square             0.658559254
Adjusted R Square    0.655075165
Standard Error       6.97627204
Observations         100

ANOVA
             df   SS             MS            F             Significance F
Regression    1   9199.249585    9199.249585   189.0190546   1.33706E-24
Residual     98   4769.500415    48.66837158
Total        99   13968.75

                  Coefficients    Standard Error   t Stat         P-value       Lower 95%       Upper 95%
Intercept         -39.01372152    5.930125143      -6.578903577   2.33666E-09   -50.78186156    -27.24558147
Summated Rating   1.387790907     0.100941846      13.74842008    1.33706E-24   1.187475103     1.588106711
Yˆ = −39.0137 + 1.3878X
(b) For each increase of one additional point on the summated rating, the estimated mean
price per person will increase by $1.39.
(c) Yˆ = −39.0137 + 1.3878(50) = $30.38
(d) r2 = 0.6586. So, 65.86% of the variation in price per person can be explained by the
variation in summated rating.

(e) Excel output:
[Figure: Summated Rating Residual Plot, residuals versus summated rating]

Minitab output:
[Figure: Residuals Versus Summated Rating (response is Price)]

Based on a visual inspection of the residual plot of summated rating, the residuals are
quite evenly spread out across the range of summated ratings, so the model
appears to be adequate.
(f) H0 : ρ = 0 H1 : ρ ≠ 0
Test statistic: t = 13.7484
Decision rule: Reject H 0 when p-value < 0.05.
Decision: Since the p-value is virtually 0, reject H 0 . There is enough evidence to
conclude that there is a linear relationship between price per person and summated
rating.
(g) $28.21 ≤ μ Y | X =50 ≤ $32.55
(h) $16.36 ≤ Y X =50 ≤ $44.39
(i) $1.19 ≤ β1 ≤ $1.59
(j) The linear regression model appears to have provided an adequate fit and shown a
significant linear relationship between price per person and summated rating. Since
65.86% of the variation in price per person can be explained by the variation in
summated rating, summated rating is moderately useful in predicting the price per
person.
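The summary figures quoted in 13.90 can all be derived from the ANOVA table and the coefficient row in the Excel output. The following sketch recomputes them; the variable names are illustrative, and the only inputs are numbers printed in that output.

```python
import math

# Values copied from the Excel output in 13.90 (a)
ssr = 9199.249585      # regression sum of squares
sse = 4769.500415      # residual sum of squares
sst = 13968.75         # total sum of squares
df_residual = 98
b1 = 1.387790907       # slope for summated rating
se_b1 = 0.100941846    # standard error of the slope

r_squared = ssr / sst             # coefficient of determination
mse = sse / df_residual           # residual mean square
f_stat = ssr / mse                # MSR equals SSR because the regression df is 1
t_stat = b1 / se_b1               # t statistic for the slope
s_yx = math.sqrt(mse)             # standard error of the estimate

print(f"r^2  = {r_squared:.4f}")
print(f"F    = {f_stat:.2f}")
print(f"t    = {t_stat:.4f} (t^2 = {t_stat ** 2:.2f})")
print(f"S_YX = {s_yx:.5f}")
```

In simple linear regression the F statistic equals the square of the slope's t statistic, which is why both carry the same p-value in the output.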


13.91 (a) GE:


Coefficients Standard Error t Stat P-value
Intercept -0.0007 0.0023 -0.2916 0.7719
S&P % 0.8821 0.1670 5.2820 2.92119E-06
(b) GE’s stock moves only 88.21% as much as the overall market and is less volatile.
(c) (a) Target:
Coefficients Standard Error t Stat P-value
Intercept -0.0006 0.0033 -0.1876 0.8519
S&P % 0.9410 0.2419 3.8893 0.0003
(b) Target’s stock moves only 94.10% as much as the overall market and is
less volatile.
(d) (a) Sara Lee:
Coefficients Standard Error t Stat P-value
Intercept -0.0034 0.0037 -0.9200 0.3621
S&P % 0.8485 0.2702 3.1401 0.0029
(b) Sara Lee’s stock moves 84.85% as much as the overall market and is less
volatile.

13.92 (a)
Target Sara Lee GE
Target 1
Sara Lee 0.036384 1
GE 0.727482 -0.07796 1
(b) There is a fairly strong positive linear relationship between the stock price of Target
and GE, almost no linear relationship between the stock price of Sara Lee and Target,
and between the stock price of GE and Sara Lee.
(c) It is not a good idea to have all the stocks in an individual’s portfolio be strongly,
positively correlated among each other because the portfolio risk can be reduced
when a pair of stock prices is negatively related in a two-stock portfolio.
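The reasoning in (c) can be illustrated with the standard variance formula for a two-stock portfolio, σp² = w1²σ1² + w2²σ2² + 2·w1·w2·ρ·σ1·σ2. In the sketch below only the correlations come from the matrix in (a); the equal weights and the common standard deviation of 0.02 are hypothetical values chosen so that the effect of ρ is isolated.

```python
def two_stock_variance(w1, sd1, sd2, rho):
    """Variance of a portfolio holding weight w1 in stock 1 and 1 - w1 in stock 2."""
    w2 = 1.0 - w1
    return (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2.0 * w1 * w2 * rho * sd1 * sd2


# Correlations from 13.92 (a)
RHO_TARGET_GE = 0.727482
RHO_GE_SARA_LEE = -0.07796

# Hypothetical inputs: equal weights and identical volatility for every stock
WEIGHT = 0.5
SD = 0.02

var_target_ge = two_stock_variance(WEIGHT, SD, SD, RHO_TARGET_GE)
var_ge_sara_lee = two_stock_variance(WEIGHT, SD, SD, RHO_GE_SARA_LEE)

print(f"Target + GE variance:   {var_target_ge:.6f}")
print(f"GE + Sara Lee variance: {var_ge_sara_lee:.6f}")
```

With everything else held equal, the pair with the lower correlation has the smaller portfolio variance, which is the diversification argument made in (c).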

13.93 (a) r = -0.4623. There appears to be a moderate negative relationship between the daily
performance of stocks and bonds.
(b) p-value = 0.0002 < 0.05. At the 5% level of significance, there is evidence of a linear
relationship between the daily performance of stocks and bonds.
