BBS11 ISM Ch13
CHAPTER 13
13.4 (a)
[Scatter diagram: Weekly Sales (Y) versus Shelf Space (X)]
(b) For each additional foot of shelf space, weekly sales are estimated to increase by
$7.40 on average.
(c) Yˆ = 145 + 7.4 X = 145 + 7.4(8) = $204.2
13.5 (a)
[Scatter diagram: Audited (thousands), Y, versus Reported (thousands), X]
(b) For each additional thousand units increase in reported newsstand sales, the mean
audited sales will increase by an estimated 0.5719 thousand units.
(c) Yˆ = 26.7240 + 0.5719 ( 400 ) = 255.4788 thousands.
13.6 (a)
[Scatter diagram: Y on the vertical axis]
13.7 (a)
[Scatter diagram: Waiting Times (Minutes) versus Number of Customers]
13.8 (a)
[Scatter diagram: Y versus X]
13.9 (a)
[Scatter plot: horizontal axis Size (square feet)]
13.10 (a)
[Scatter diagram: Y (Videos) versus X]
(b)
Coefficients Standard Error t Stat P-value
Intercept -140.1202629 34.15524465 -4.102452326 0.000319082
Box Office 4.333108105 0.500843491 8.651621088 2.1259E-09
Yˆ = b0 + b1 X = − 140.1203 + 4.3331X
(c) For each increase of one additional million dollars of box office gross, the
estimated mean number of DVDs sold will increase by 4.3331 thousands.
(d) Yˆ = b0 + b1 X = − 140.1203 + 4.3331(75) = 184.86285 thousands
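The substitution in (d) can be checked with a few lines of Python. This is only a sketch: the helper name `predict` is an illustrative choice, and the coefficients are copied from the regression output in (b). Evaluating once with the full-precision coefficients and once with the four-decimal coefficients shows that the two results agree to about three decimal places.

```python
def predict(b0, b1, x):
    """Return the predicted mean response Y-hat = b0 + b1 * x."""
    return b0 + b1 * x

# Coefficients from the regression output in 13.10 (b).
b0_full, b1_full = -140.1202629, 4.333108105
b0_rounded, b1_rounded = -140.1203, 4.3331

# Predicted DVDs sold (thousands) for a box office gross of 75 (millions).
y_hat_full = predict(b0_full, b1_full, 75)
y_hat_rounded = predict(b0_rounded, b1_rounded, 75)

print(round(y_hat_full, 4), round(y_hat_rounded, 4))
```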
13.11 80% of the variation in the dependent variable can be explained by the variation in
the independent variable.
13.12 SST = 40 and r2 = 0.90. So, 90% of the variation in the dependent variable can be
explained by the variation in the independent variable.
13.13 r2 = 0.75. So, 75% of the variation in the dependent variable can be explained by the
variation in the independent variable.
13.14 r2 = 0.75. So, 75% of the variation in the dependent variable can be explained by the
variation in the independent variable.
13.15 Since SST = SSR + SSE and since SSE cannot be a negative number, SST must be at
least as large as SSR.
13.16 (a) \( r^2 = \dfrac{SSR}{SST} = \dfrac{20{,}535}{30{,}025} = 0.684 \). So, 68.4% of the variation in the dependent variable
can be explained by the variation in the independent variable.
(b) \( S_{YX} = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}} = \sqrt{\dfrac{9490}{10}} = 30.8058 \)
(c) Based on (a) and (b), the model should be very useful for predicting sales.
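The two measures in 13.16 follow directly from the sums of squares, which the sketch below makes explicit. The function name `fit_summary` is hypothetical, and the sample size n = 12 is an assumption inferred from n − 2 = 10 in part (b).

```python
import math

def fit_summary(ssr, sst, n):
    """Return (r2, sse, s_yx) for a simple linear regression."""
    sse = sst - ssr                      # because SST = SSR + SSE
    r2 = ssr / sst                       # coefficient of determination
    s_yx = math.sqrt(sse / (n - 2))      # standard error of the estimate
    return r2, sse, s_yx

# Values from 13.16; n = 12 is inferred from n - 2 = 10.
r2, sse, s_yx = fit_summary(ssr=20535, sst=30025, n=12)
print(round(r2, 3), sse, round(s_yx, 4))
```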
13.17 (a) r2 = 0.9015. So, 90.15% of the variation in audit newsstand sales can be explained by
the variation in reported newsstand sales.
(b) S YX = 42.1859
(c) Based on (a) and (b), the model should be very useful for predicting audited sales.
13.18 (a) r2 = 0.8892. So, 88.92% of the variation in the dependent variable can be explained
by the variation in the independent variable.
(b) S YX = 5.0314
(c) Based on (a) and (b), the model should be very useful for predicting labor hours.
13.19 (a) r2 = 0.7873. So, 78.73% of the variation in waiting time can be explained by the
variation in the number of customers.
(b) S YX = 0.5952
(c) Based on (a) and (b), the model should be moderately useful for predicting the
waiting time.
13.20 (a) r2 = 0.9334. So, 93.34% of the variation in value of a baseball franchise can be
explained by the variation in its annual revenue.
(b) S YX = 42.4335
(c) Based on (a) and (b), the model should be very useful for predicting the value of a
baseball franchise.
13.21 (a) r2 = 0.723. So, 72.3% of the variation in monthly rent can be explained by the
variation in square footage.
(b) S YX = 194.6
(c) Based on (a) and (b), the model should be very useful for predicting monthly rent.
13.22 (a) r2 = 0.7278. So, 72.78% of the variation in the number of DVDs sold can be
explained by the variation in box office gross.
(b) S YX = 47.8668. The variation of the number of DVDs sold around the prediction line
is 47.8668 thousands. The typical difference between the actual number of DVDs
sold and the predicted number of DVDs sold using the regression equation is
approximately 47.8668 thousands.
(c) Based on (a) and (b), the model is somewhat useful for predicting DVDs sold.
(d) Other variables that might explain the variation in DVDs sales could be the amount
spent on advertising, the timing of the release of the DVDs and the distribution
channels of the DVDs.
13.23 A residual analysis of the data indicates no apparent pattern. The assumptions of
regression appear to be met.
13.24 A residual analysis of the data indicates a pattern, with sizeable clusters of
consecutive residuals that are either all positive or all negative. If the data is cross-
sectional, this pattern indicates a violation of the assumption of linearity and a
quadratic model should be investigated. If the data is time-series, the pattern
indicates a violation of the assumption of independence of errors.
13.25 (a)
[Residual plot: Residuals versus Reported]
The residual plot does not show any apparent pattern.
(b)
[Normal probability plot: Residuals versus Z Value]
Based on the residual plot, there appears to be some heteroscedasticity effect. The
normal probability plot of the residuals indicates a departure from the normality
assumption. The error distribution appears to be left-skewed.
13.26 (a)
[Residual plot: Residuals versus Space]
The residual plot does not show any apparent pattern.
(b)
[Normal Probability Plot: Residuals versus Z Value]
Based on the residual plot, there is no apparent heteroscedasticity effect. The normal
probability plot of the residuals indicates a departure from the normality assumption.
13.27
[Customers Residual Plot: Residuals versus Customers]
The residual plot does not reveal any obvious pattern. So a linear fit appears to be
adequate.
13.28
[Residual plot: Residuals versus Feet]
Based on the residual plot, there appears to be a nonlinear pattern in the residuals. A
quadratic model should be investigated.
[Normal probability plot: Residuals versus Z Value]
The assumptions of normality and equal variance do not appear to be seriously
violated.
13.29 (a)
[Size Residual Plot: Residuals versus Size]
Based on a residual analysis of the residuals versus size, the model appears to be
adequate.
(b)
[Normal probability plot: Residuals versus Z Value]
The normal probability plot shows that the distribution has a thicker left tail than a
normal distribution but there is no sign of severe skewness. The assumptions of
regression do not appear to be seriously violated.
13.30 (a)
[Residual plot: Residuals versus Revenue]
Based on the residual plot, there appears to be a nonlinear pattern in the residuals. A
quadratic model should be investigated.
(b)
[Normal probability plot: Residuals versus Z Value]
The normal probability plot of the residuals reveals that the distribution of the errors
might be slightly skewed to the right. The residual plot suggests that the assumption
of equal variance might be violated. The variance at the lower and upper ends of
revenue appears to be smaller than that in the other region.
13.31
[Box Office Residual Plot: Residuals versus Box Office]
The residual plot does not reveal any obvious pattern. So a linear fit appears to be
adequate.
13.32 (a)
[Residual Plot: Residuals versus Time Period]
13.33 (a)
[Scatter Plot: Residual versus Time Period]
13.34 (a) No, it is not necessary to compute the Durbin-Watson statistic since the data have
been collected for a single period for a set of stores.
(b) If a single store was studied over a period of time and the amount of shelf space
varied over time, computation of the Durbin-Watson statistic would be necessary.
13.35 (c)
[Residuals vs Time Period: Residuals versus Time Period]
13.36 (a) \( b_1 = \dfrac{SSXY}{SSX} = \dfrac{201399.05}{12495626} = 0.0161 \)
\( b_0 = \bar{Y} - b_1 \bar{X} = 71.2621 - 0.0161(4393) = 0.458 \)
(b) Yˆ = 0.458 + 0.0161X = 0.458 + 0.0161(4500) = 72.908 or $72,908
(c)
[Residual plot: Residuals versus Time Period]
(d) \( D = \dfrac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = \dfrac{1243.2244}{599.0683} = 2.08 > 1.45 \). There is no evidence of positive
autocorrelation.
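The arithmetic in 13.36 (a) and (d) can be reproduced from the quoted sums alone. The sketch below assumes the sums are exactly as printed; the variable names are illustrative. It carries the unrounded slope into the intercept, which is why the intercept comes out as 0.458 rather than the value obtained from the rounded slope 0.0161.

```python
# Sums quoted in 13.36 (a) and (d).
ssxy, ssx = 201399.05, 12495626.0
x_bar, y_bar = 4393.0, 71.2621

b1 = ssxy / ssx              # slope
b0 = y_bar - b1 * x_bar      # intercept, using the unrounded slope

# Durbin-Watson statistic as the ratio of the two sums of squares.
sum_sq_diff = 1243.2244      # sum of (e_i - e_{i-1})^2 for i = 2..n
sum_sq_resid = 599.0683      # sum of e_i^2 for i = 1..n
d = sum_sq_diff / sum_sq_resid

print(round(b1, 4), round(b0, 3), round(d, 2))
```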
[Residual plot: Residuals versus Time Order]
[Residual plot: Residuals versus Time Period]
13.39 (a) H0: ρ = 0 H1: ρ ≠ 0
\( t = \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \dfrac{0.8 - 0}{\sqrt{\dfrac{1 - 0.8^2}{10 - 2}}} = 3.7712 \)
(b) d.f. = 8, lower critical value = -2.3060, upper critical value = 2.3060.
(c) Since t =3.7712 is greater than the upper critical value of 2.3060, reject H0.
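As a check on 13.39, the test statistic for H0: ρ = 0 can be computed directly. This is a minimal sketch; the function name `corr_t_statistic` is hypothetical, and the critical value is taken from part (b) rather than computed.

```python
import math

def corr_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

t_stat = corr_t_statistic(r=0.8, n=10)
t_crit = 2.3060  # two-tail critical value for 8 d.f. at alpha = 0.05, from (b)
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 4), reject_h0)
```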
13.40 (a) H 0 : β1 = 0 H1 : β1 ≠ 0
Test statistic: t = ( b1 − 0 ) / sb1 = 4.5 /1.5 = 3.00
(b) With n = 18, df = 18 – 2 =16, t16 = ±2.1199 .
(c) Reject H0. There is evidence that the fitted linear regression model is useful.
(d) b1 − t16 sb1 ≤ β1 ≤ b1 + t16 sb1
4.5 − 2.1199(1.5) ≤ β 1 ≤ 4.5 + 2.1199(1.5)
1.32 ≤ β 1 ≤ 7.68
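The test statistic and confidence interval in 13.40 use the same three inputs: the slope, its standard error, and the critical value. A small helper (the name `slope_t_and_ci` is an illustrative assumption, not from the text) makes the calculation explicit; the critical value is supplied by the caller from a t table.

```python
def slope_t_and_ci(b1, s_b1, t_crit):
    """Return the t statistic for H0: beta1 = 0 and the CI for beta1."""
    t_stat = b1 / s_b1
    half_width = t_crit * s_b1
    return t_stat, (b1 - half_width, b1 + half_width)

# 13.40: b1 = 4.5, S_b1 = 1.5, critical value for 16 degrees of freedom.
t_stat, (lower, upper) = slope_t_and_ci(4.5, 1.5, 2.1199)
print(round(t_stat, 2), round(lower, 2), round(upper, 2))
```

Calling the same helper with the 13.42 inputs (7.4, 1.59, 2.2281) gives the interval reported there.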
13.42 (a) H0: β1 = 0 H1: β1 ≠ 0
\( t = \dfrac{b_1 - \beta_1}{S_{b_1}} = \dfrac{7.4}{1.59} = 4.65 > t_{10} = 2.2281 \) with 10 degrees of freedom for α = 0.05.
Reject H0. There is enough evidence to conclude that the fitted linear regression
model is useful.
(b) \( b_1 \pm t_{n-2} S_{b_1} = 7.4 \pm 2.2281(1.59) \), so 3.86 ≤ β1 ≤ 10.94
13.43 (a) H 0 : β1 = 0 H1 : β1 ≠ 0
PHStat output:
            Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept        26.7240         26.5425  1.0068    0.3435   -34.4832    87.9311
Reported          0.5719          0.0668  8.5567    0.0000     0.4178     0.7260
Since the p-value is essentially zero, reject H0 at 5% level of significance. There is
evidence of a linear relationship between reported sales and audited sales.
(b) 0.4178 ≤ β1 ≤ 0.7260
13.44 (a) t = 16.5223 > t34 = 2.0322 for α = 0.05 . Reject H0. There is evidence that the
fitted linear regression model is useful.
(b) 0.0439 ≤ β1 ≤ 0.0562
13.45 (a)
            Coefficients  Standard Error   t Stat     P-value  Lower 95%  Upper 95%
Intercept        -0.4480          0.2783  -1.6097      0.1187    -1.0180     0.1221
Customers         0.1285          0.0126  10.1796  6.4878E-11     0.1026     0.1543
p-value is virtually 0 < 0.05. Reject H0. There is evidence that the fitted linear
regression model is useful.
(b) 0.1026 ≤ β1 ≤ 0.1543
13.46 (a) H 0 : β1 = 0 H1 : β1 ≠ 0
            Coefficients  Standard Error   t Stat      P-value  Lower 95%  Upper 95%
Intercept      -368.2846         38.3722  -9.5977  2.36391E-10  -446.8865  -289.6827
Revenue           4.7306          0.2387  19.8167  5.16396E-18     4.2416     5.2196
Since the p-value is essentially zero, reject H0 at 5% level of significance. There is
evidence of a linear relationship between annual revenue sales and franchise value.
(b) 4.2416 ≤ β 1 ≤ 5.2196
13.47 (a) t = 7.74 > t 23 = 2.0687 with 23 degrees of freedom for α = 0.05 . Reject H0. There
is evidence that the fitted linear regression model is useful.
(b) 0.7803 ≤ β1 ≤ 1.3497
13.48 (a) p-value = 2.1259E-09 < 0.05. Reject H0. There is evidence that the fitted linear
regression model is useful.
(b) 3.3072 ≤ β1 ≤ 5.3590
13.49 (a) AT&T’s stock moves only 43% as much as the overall market and is much less
volatile. The stock of Disney Company moves only 81% as much as the overall
market and is considered less volatile than the market. Alcoa’s stock moves 21%
more than the overall market and is considered a little more volatile than the
overall market. LSI Logic’s stock moves 176% more than the overall market and is
considered very volatile. Wave Systems’ stock moves 422% more than the overall
market and is considered extremely volatile.
(b) Investors can use the beta value as a measure of the volatility of a stock to assess its
risk.
13.50 (a) (% daily change in UOPIX) = b0 + 2.00 (% daily change in S&P 500 Index)
(b) If the S&P gains 30% in a year, the UOPIX is expected to gain an estimated 60%.
(c) If the S&P loses 35% in a year, the UOPIX is expected to lose an estimated 70%.
(d) Since leveraged funds have higher volatility and, hence, higher risk than the market,
risk-averse investors should stay away from these funds. Risk takers, on the other hand,
can benefit from the higher potential gain from these funds.
13.51 (a) r = 0.7196. There appears to be a moderate positive linear relationship between
calories and fat (in grams) of 16-ounce iced coffee drinks at Dunkin’ Donuts and
Starbucks.
(b) t = 2.3170, p-value = 0.0683 > 0.05. Do not reject H0. At the 0.05 level of
significance, there is not enough evidence of a significant linear relationship between
calories and fat (in grams) of 16-ounce iced coffee drinks at Dunkin’ Donuts and
Starbucks.
13.52 (a) r = 0.8935. There appears to be a rather high positive linear relationship between the
mileage as calculated by owners and by current government standards.
(b) t = 5.2639, p-value = 0.0012 < 0.05. Reject H0. At the 0.05 level of significance,
there is a significant linear relationship between the mileage as calculated by owners
and by current government standards.
13.53 (a) r = 0.6092. There appears to be a moderate positive linear relationship between the
coaches’ salary and revenue.
(b) t = 5.0370, p-value is virtually zero. Reject H0. At the 0.05 level of significance,
there is a significant linear relationship between the coaches’ salary and revenue.
13.54 (a) r = 0.5497. There appears to be a moderate positive linear relationship between the
average Wonderlic score of football players trying out for the NFL and the
graduation rate for football players at selected schools.
(b) t = 3.9485, p-value = 0.0004 < 0.05. Reject H0. At the 0.05 level of significance,
there is a significant linear relationship between the average Wonderlic score of
football players trying out for the NFL and the graduation rate for football players at
selected schools.
(c) There is a significant linear relationship between the average Wonderlic score of
football players trying out for the NFL and the graduation rate for football players at
selected schools but the positive linear relationship is considered as only moderate.
13.64 The slope of the line, b1, represents the estimated expected change in Y per unit change in X.
It represents the estimated mean amount that Y changes (either positively or negatively) for a
particular unit change in X. The Y intercept b0 represents the estimated mean value of Y when
X equals 0.
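To make the roles of b1 and b0 concrete, the least-squares coefficients can be computed from raw data using b1 = SSXY / SSX and b0 = Ȳ − b1 X̄. The sketch below uses made-up data lying exactly on Y = 1 + 2X; the function name `least_squares` and the data are illustrative assumptions.

```python
def least_squares(x, y):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx              # estimated change in mean Y per unit of X
    b0 = y_bar - b1 * x_bar      # estimated mean Y when X equals 0
    return b0, b1

# Hypothetical data lying exactly on Y = 1 + 2X.
b0, b1 = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```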
13.65 The coefficient of determination measures the proportion of variation in Y that is explained
by the independent variable X in the regression model.
13.66 The unexplained variation or error sum of squares (SSE) will be equal to zero only when the
regression line fits the data perfectly and the coefficient of determination equals 1.
13.67 The explained variation or regression sum of squares (SSR) will be equal to zero only when
there is no linear relationship between the Y and X variables, and the coefficient of
determination equals 0.
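The two boundary cases in 13.66 and 13.67 can be illustrated numerically. The sketch below fits a least-squares line and returns SST, SSR, and SSE; the function name `sums_of_squares` and both data sets are hypothetical, chosen so that one fits perfectly and the other has a zero slope.

```python
def sums_of_squares(x, y):
    """Return (sst, ssr, sse) for the least-squares line fitted to x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # explained variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    return sst, ssr, sse

# Hypothetical data: a perfect linear fit, then a fitted slope of zero.
perfect = sums_of_squares([1, 2, 3, 4], [3, 5, 7, 9])
flat = sums_of_squares([1, 2, 3, 4], [1, 2, 2, 1])
print(perfect, flat)
```

In the first case SSE is zero and SSR equals SST, so r2 = 1; in the second SSR is zero and SSE equals SST, so r2 = 0.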
13.68 Unless a residual analysis is undertaken, you will not know whether the model fit is
appropriate for the data. In addition, residual analysis can be used to check whether the
assumptions of regression have been seriously violated.
13.69 The assumptions of regression are linearity, independence of errors, normality of error,
and equal variance (homoscedasticity).
13.70 The normality of error assumption can be evaluated by obtaining a histogram, box plot,
and/or normal probability plot of the residuals. The homoscedasticity assumption can be
evaluated by plotting the residuals on the vertical axis and the X variable on the horizontal
axis. The independence of errors assumption can be evaluated by plotting the residuals on the
vertical axis and the time order variable on the horizontal axis. This assumption can also be
evaluated by computing the Durbin-Watson statistic.
13.71 If the data in a regression analysis has been collected over time, then the assumption of
independence of errors needs to be evaluated using the Durbin-Watson statistic.
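For residuals listed in time order, the Durbin-Watson statistic is the sum of squared successive differences divided by the sum of squared residuals. The sketch below is a plain-Python illustration; the function name `durbin_watson` and the two residual series are hypothetical.

```python
def durbin_watson(residuals):
    """Return D = sum (e_i - e_{i-1})^2 / sum e_i^2 for time-ordered residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual series.
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every period
persistent = [1.0, 1.0, -1.0, -1.0]    # runs of same-signed residuals
print(durbin_watson(alternating), durbin_watson(persistent))
```

Runs of same-signed residuals pull D below 2, which is the direction associated with positive autocorrelation; the computed D is then compared with the tabulated critical values.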
13.72 The confidence interval for the mean response estimates the mean response for a given X
value. The prediction interval estimates the value for a single item or individual.
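The difference between the two intervals shows up in a single term of the standard error. With h = 1/n + (X − X̄)² / SSX, the confidence interval for the mean response uses √h and the prediction interval uses √(1 + h), so the prediction interval is always wider. The sketch below assumes the caller supplies the critical t value; the function name and all inputs are hypothetical.

```python
import math

def interval_estimates(y_hat, s_yx, t_crit, n, x, x_bar, ssx):
    """Return (confidence interval for the mean response, prediction interval)."""
    h = 1 / n + (x - x_bar) ** 2 / ssx
    ci_half = t_crit * s_yx * math.sqrt(h)        # mean response
    pi_half = t_crit * s_yx * math.sqrt(1 + h)    # individual response
    return (y_hat - ci_half, y_hat + ci_half), (y_hat - pi_half, y_hat + pi_half)

# Hypothetical inputs, chosen only to illustrate the calculation.
ci, pi = interval_estimates(y_hat=50.0, s_yx=4.0, t_crit=2.0, n=25,
                            x=10.0, x_bar=10.0, ssx=200.0)
print(ci, pi)
```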
13.73 (a) There is a strong positive correlation between course mean and cumulative GPA, and
between total hits and hit consistency. There is a moderate positive correlation
between course mean and hit consistency, and between cumulative GPA and hit
consistency.
(b) It is not surprising that cumulative GPA is strongly and positively related to course
mean because course mean in a course contributes to the cumulative GPA and both
measure the performance of a student. It is also not surprising that total hits and hit
consistency are highly positively related because hit consistency contributes to total
hits. Hit consistency is positively related to both cumulative GPA and course mean
because the more frequently a student visits the Internet site supporting a course, the
more current the student is in the course and presumably the better the student
performs in the course.
13.76 (a)
[Scatter diagram: Y versus X]
b0 = -122.3439 b1 = 1.7817
(b) For each additional thousand dollars in assessed value, the estimated mean selling price
of a house increases by 1.7817 thousand dollars. The estimated mean selling price of a
house with a 0 assessed value is –122.3439 thousand dollars. However, this
interpretation is not meaningful in the current setting since the assessed value is very
unlikely to be 0 for a house.
(c) Yˆ = -122.3439 + 1.78171X = -122.3439 + 1.78171(170 ) = 180.5475 thousand
dollars
(d) r2 = 0.9256. So, 92.56% of the variation in selling price can be explained by the
variation in assessed value.
(e)
[Assessed Value Residual Plot: Residuals versus Assessed Value]
[Normal Probability Plot: Residuals versus Z Value]
Neither the residual plot nor the normal probability plot reveals any potential
violation of the linearity, equal variance, and normality assumptions.
(f) t = 18.6648 with 28 degrees of freedom, p-value is virtually zero. Since p-value <
0.05, reject H0. There is evidence of a linear relationship between selling price and
assessed value.
(g) 178.7066 thousand dollars ≤ \( \mu_{Y|X=170} \) ≤ 182.3884 thousand dollars
(h) 173.1953 thousand dollars ≤ \( Y_{X=170} \) ≤ 187.8998 thousand dollars
(i) 1.5862 ≤ β1 ≤ 1.9773
13.77 (a)
[Scatter diagram: Y versus X]
b0 = 151.9153 b1 = 16.6334
(b) For each additional thousand square feet increase in the size of a house, the estimated
mean assessed value increases by 16.6334 thousand dollars. The estimated mean
assessed value of a house with a size of 0 square feet is 151.9153 thousand dollars.
However, this interpretation is not meaningful in the current setting since a house with
a positive assessed value is very unlikely to have a size of 0.
(c) Yˆ = 151.9153399 + 16.6334 X = 151.9153399 + 16.6334(1.75) = 181.0237
thousand dollars
(d) r2 = 0.6593. So, 65.93% of the variation in assessed value can be explained by the
variation in the size.
(e)
[Residual plot: Residuals versus Heating Area]
[Normal probability plot: Residuals versus Z Value]
Neither the residual plot nor the normal probability plot reveals any potential
violation of the linearity, equal variance, and normality assumptions.
(f) t = 5.0161 with 13 degrees of freedom, p-value = 0.0002. Since p-value < 0.05,
reject H0. There is evidence of a linear relationship between assessed value and
heating area.
(g) 179.2778 thousand dollars ≤ \( \mu_{Y|X=1.75} \) ≤ 182.7697 thousand dollars
(h) 174.4805 thousand dollars ≤ \( Y_{X=1.75} \) ≤ 187.5669 thousand dollars
(i) 9.4695 ≤ β 1 ≤ 23.7972
13.78 (a)
[Scatter diagram: GPA versus GMAT]
b0 = 0.30, b1 = 0.00487
(b) 0.30 is the portion of estimated mean GPI index that is not affected by the GMAT
score. The mean GPI index of a student with a zero GMAT score is estimated to be
0.30. For each additional point on the GMAT score, the estimated GPI increases by
an average of 0.00487.
(c) Yˆ = 0.30 + 0.00487 X = 0.30 + 0.00487(600) = 3.222
(d) r2 = 0.7978. 79.78% of the variation in the GPI can be explained by the
variation in the GMAT score.
(e) Based on a visual inspection of the graphs of the distribution of residuals and the
residuals versus the GMAT score, there is no pattern. The model appears to be
adequate.
(f) t = 8.428 > t18 = 2.1009 with 18 degrees of freedom for α = 0.05 . Reject H0.
There is evidence that the fitted linear regression model is useful.
(g) 3.144 ≤ \( \mu_{Y|X=600} \) ≤ 3.301
(h) 2.886 ≤ \( Y_{X=600} \) ≤ 3.559
(i) 0.00366 ≤ β 1 ≤ 0.00608
13.79 (a)
[Scatter diagram: Completion Time (hours) versus Invoices Processed]
b0 = 0.4872, b1 = 0.0123
(b) 0.4872 is the portion of estimated mean completion time that is not affected by the
number of invoices processed. When there is no invoice to process, the mean
completion time is estimated to be 0.4872 hours. Of course, this is not a very
meaningful interpretation in the context of the problem. For each additional invoice
processed, the estimated mean completion time increases by 0.0123 hours.
(c) Yˆ = 0.4872 + 0.0123 X = 0.4872 + 0.0123(150) = 2.3304
(d) r2 = 0.8623. 86.23% of the variation in completion time can be explained by the
variation in the number of invoices processed.
(e)
[Invoices Residual Plot: Residuals versus Invoices, horizontal axis 0 to 400]
[Invoices Residual Plot: Residuals versus Invoices, horizontal axis 0 to 40]
(f) Based on a visual inspection of the graphs of the distribution of residuals and the
residuals versus the number of invoices and time, there appears to be autocorrelation
in the residuals.
(g) D = 0.69 < 1.37 = dL. There is evidence of positive autocorrelation. The
model does not appear to be adequate. The number of invoices and, hence, the time
needed to process them tend to be high for a few days in a row during historically
heavier shopping days or during advertised sales days. This could be a possible
cause of the positive autocorrelation.
(h) Due to the violation of the independence of errors assumption, the prediction made in
(c) is very likely to be erroneous.
13.80 (a)
[Scatter plot: O-ring Damage Index versus Temperature (degrees F)]
The scatter plot does not show any clear relationship between atmospheric temperature
and O-ring damage.
(b), (f)
[Scatter plot: horizontal axis Temperature (degrees F)]
(c) In (b), there are 16 observations with an O-ring damage index of 0 for a variety of
temperatures. If one concentrates on these observations with no O-ring damage,
there is obviously no relationship between O-ring damage index and temperature. If
all observations are used, the observations with no O-ring damage will bias the
estimated relationship. If the intention is to investigate the relationship between the
degrees of O-ring damage and atmospheric temperature, it makes sense to focus only
on the flights in which there was O-ring damage.
(d) A prediction should not be made for an atmospheric temperature of 31°F because it is
outside the range of the temperature variable in the data. Such a prediction would
involve extrapolation, which assumes that any relationship between two variables
will continue to hold outside the domain of the temperature variable.
(e) Yˆ = 18.036 − 0.240X
(g) A nonlinear model is more appropriate for these data.
(h)
[Residual plot: Residuals versus Temperature]
The string of negative residuals and positive residuals that lie on a straight line with a
positive slope in the lower-right corner of the plot is a strong indication that a
nonlinear model should be used if all 23 observations are to be used in the fit.
[Residual plot: Residuals versus E.R.A.]
Based on a visual inspection of the graphs of the distribution of the residuals versus
E.R.A., there is no pattern. The model appears to be adequate.
(f) H 0 : β1 = 0 H1 : β1 ≠ 0
p-value is virtually zero. Reject H0 at the 5% level of significance. There is evidence
that the fitted linear regression model is useful.
(g) 76.9092 ≤ \( \mu_{Y|X=X_i} \) ≤ 82.2470
(h) 64.9295 ≤ \( Y_{X=X_i} \) ≤ 94.2267
(i) -17.1737 ≤ β1 ≤ -8.8372
(j) The “population” might be considered to be all the teams in recent years in which
baseball has been played.
(k) Other independent variables that might be considered for inclusion in the models are
(i) runs scored, (ii) hits allowed, (iii) walks allowed, (iv) number of errors, etc.
[Residual plot: Residuals versus Graduation%]
[Normal probability plot: Residuals versus Z Value]
Neither the residual plot nor the normal probability plot reveals any potential
violation of the linearity, equal variance, and normality assumptions.
13.82 (f) t = 3.9485 with 36 degrees of freedom, p-value = 0.0004. Since p-value < 0.05,
reject H0. There is evidence of a linear relationship between the Wonderlic score for
football players trying out for the NFL from a school and the graduation rate.
(g) 19.6 ≤ \( \mu_{Y|X=50} \) ≤ 21.1
(h) 15.9 ≤ \( Y_{X=50} \) ≤ 24.8
(i) 0.0552 ≤ β1 ≤ 0.1718
[Residual plot: Residuals versus Revenues]
Based on a visual inspection of the graphs of the distribution of the residuals versus
revenues, there is no pattern. The model appears to be adequate.
(f) H 0 : β1 = 0 H1 : β1 ≠ 0
p-value is virtually zero. Reject H0 at the 5% level of significance. There is evidence
that the fitted linear regression model is useful.
(g) $0.6762 millions ≤ \( \mu_{Y|X=X_i} \) ≤ $0.8892 millions
(h) $0.0615 millions ≤ \( Y_{X=X_i} \) ≤ $1.5039 millions
(i) 0.0357 ≤ β 1 ≤ 0.0833
13.84 (a)
[Scatter diagram: Weight (grams) versus Circumference (cms.)]
Yˆ = −2629.222+82.4717X
(b) For each increase of one centimeter in circumference, the estimated mean weight of a
pumpkin will increase by 82.4717 grams.
(c) Yˆ = −2629.222+82.4717 ( 60 ) = 2319.080 grams.
(d) There appears to be a positive relationship between weight and circumference of a
pumpkin. It is a good idea for the farmer to sell pumpkins by circumference instead
of weight because circumference is a good predictor of weight, and it is much easier
to measure the circumference of a pumpkin than its weight.
(e) r2 = 0.9373. 93.73% of the variation in pumpkin weight can be explained by the
variation in circumference.
(f)
[Residual plot: Residuals versus Circumference]
13.85 (a)
[Scatter plot: Sales versus Income]
[Residual plot: Residuals versus Income]
There is a slight increase in the variance of the residuals at the higher end of the
median family income. In general, however, the assumption of homoscedasticity
seems to be intact.
(e) [Histogram of residuals: Frequency versus Residual]
The histogram does not suggest severe asymmetry nor abnormal extreme
observations. So the normality assumption also seems to be intact.
(f) H0: ρ = 0 H1: ρ ≠ 0
Test statistic: \( t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = 2.4926 \)
Decision rule: Reject H 0 when |t|>2.0281.
Decision: Since t = 2.4926 is greater than the upper critical bound 2.0281, reject H 0 .
There is enough evidence to conclude that there is a linear relationship between one-
month sales total and median income of customer base.
(g) \( b_1 \pm t_{n-2} S_{b_1} = 39.1697 \pm 2.0281(15.7143) \), so 7.2995 ≤ β1 ≤ 71.0400
You are 95% confident that the slope is somewhere between 7.2995 and 71.0400.
13.86 (a)
[Scatter plot: Sales versus Age]
(b) Yˆ = 931626.16+21782.76X
(c) Since median age of customer base cannot be 0, b0 just captures the portion of the
latest one-month mean sales total that varies with factors other than median age.
b1 = 21782.76 means that as the median age of customer base increases by one year,
the estimated mean latest one-month sales total will increase by $21782.76.
(d) r 2 = 0.0017 . Only 0.17% of the total variation in the franchise's latest one-month
sales total can be explained by using the median age of customer base.
(e)
[Residual plot: Residuals versus Age]
The residuals are very evenly spread out across the range of median age.
(f) H0: ρ = 0 H1: ρ ≠ 0
Test statistic: \( t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = 0.2482 \)
Decision rule: Reject H0 when |t| > 2.0281.
Decision: Since t = 0.2482 is less than the upper critical bound 2.0281, do not reject
H0. There is not enough evidence to conclude that there is a linear relationship
between one-month sales total and median age of customer base.
(g) \( b_1 \pm t_{n-2} S_{b_1} = 21782.76354 \pm 2.0281(87749.63) \)
-156181.50 ≤ β1 ≤ 199747.02
13.87 (a)
[Scatter diagram: Sales versus HS]
There appears to be some positive linear relationship between total sales and
percentage of customer base with high school diploma.
(b) Yˆ = -2969741.23 + 59660.09X
(c) b1 = 59660.09 indicates that as the percent of customer base with a high school
diploma increases by one, the estimated mean latest one-month sales total will
increase by $59660.09.
(d) r 2 = 0.2405 . 24.05% of the total variation in the franchise's latest one-month sales
total can be explained by the percentage of customer base with a high school
diploma.
(e)
[HS Residual Plot: Residuals versus HS]
13.88 (a)
[Scatter diagram: Sales versus College]
[Residual plot: Residuals versus College]
The residuals are quite evenly spread out around zero even though there might be a
slight tendency for the variance to increase as the percentage of customer base with a
college diploma increases.
(f) H0: ρ = 0 H1: ρ ≠ 0
Test statistic: \( t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = 2.0392 \)
Decision rule: Reject H0 when |t| > 2.0281.
Decision: Since t = 2.0392 is greater than the upper critical bound 2.0281, reject H0.
There is enough evidence to conclude that there is a linear relationship between one-
month sales total and percentage of customer base with a college diploma.
(g) \( b_1 \pm t_{n-2} S_{b_1} = 35854.15 \pm 2.0281(17582.269) \)
195.75 ≤ β1 ≤ 71512.60
13.89 (a)
[Scatter diagram: Sales versus Growth]
It is not obvious that there is any linear relationship between total sales and annual
population growth rate of customer base over the past 10 years.
(b) Yˆ = 1595571.48 + 26833.54X
(c) b0 =1595571 means the estimated mean latest one-month sales total is $1595571
when the annual population growth rate of customer base over the past 10 years is
zero. b1 = 26833.54 means that as the annual population growth rate increases by
1%, the estimated mean latest one-month sales total will increase by $26833.54.
(d) r 2 = 0.0126 . Only 1.26% of the total variation in the franchise's latest one-month
sales total can be explained by the annual population growth rate of customer base
over the past 10 years.
(e)
[Growth Residual Plot: Residuals versus Growth]
There seems to be a diamond shape pattern of the residual distribution and, hence, a
violation of the homoscedasticity assumption. The variance is larger when the
growth rate is closer to zero.
(f) H0: ρ = 0 H1: ρ ≠ 0
Test statistic: \( t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = 0.6776 \)
Decision rule: Reject H0 when |t| > 2.0281.
Decision: Since t = 0.6776 is less than the upper critical bound 2.0281, do not reject
H0. There is not enough evidence to conclude that there is a linear relationship
between one-month sales total and the annual population growth rate of customer
base over the past 10 years.
Analysis of Variance
Source          DF       SS      MS       F      P
Regression       1   9199.2  9199.2  189.02  0.000
Residual Error  98   4769.5    48.7
Total           99  13968.8
Excel output:
Regression Statistics
Multiple R         0.811516638
R Square           0.658559254
Adjusted R Square  0.655075165
Standard Error     6.97627204
Observations       100
ANOVA
            df           SS           MS            F  Significance F
Regression   1  9199.249585  9199.249585  189.0190546     1.33706E-24
Residual    98  4769.500415  48.66837158
Total       99     13968.75
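The summary statistics in the Excel output can be recovered from the two sums of squares and their degrees of freedom. The sketch below is an illustrative check with assumed variable names; it also uses the fact that, with a single predictor, the magnitude of the slope t statistic equals the square root of F.

```python
import math

# Sums of squares and degrees of freedom from the Excel ANOVA table.
ssr, sse = 9199.249585, 4769.500415
df_regression, df_residual = 1, 98

msr = ssr / df_regression
mse = sse / df_residual
f_stat = msr / mse
r_square = ssr / (ssr + sse)
standard_error = math.sqrt(mse)   # standard error of the estimate
t_stat = math.sqrt(f_stat)        # magnitude of the slope t statistic

print(round(f_stat, 2), round(r_square, 4),
      round(standard_error, 4), round(t_stat, 4))
```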
13.90 (e) Excel output:
[Summated Rating Residual Plot: Residuals versus Summated Rating]
Minitab output:
[Residuals Versus Summated Rating (response is Price)]
Based on a visual inspection of the residual plot of summated rating, the residuals are
quite evenly spread out across the range of the summated rating, so the model
appears to be adequate.
(f) H0 : ρ = 0 H1 : ρ ≠ 0
Test statistic: t = 13.7484
13.92 (a)
Target Sara Lee GE
Target 1
Sara Lee 0.036384 1
GE 0.727482 -0.07796 1
(b) There is a fairly strong positive linear relationship between the stock price of Target
and GE, almost no linear relationship between the stock price of Sara Lee and Target,
and between the stock price of GE and Sara Lee.
(c) It is not a good idea to have all the stocks in an individual’s portfolio be strongly,
positively correlated with each other because the portfolio risk can be reduced
when a pair of stock prices is negatively related in a two-stock portfolio.
13.93 (a) r = -0.4623. There appears to be a moderate negative relationship between the daily
performance of stocks and bonds.
(b) p-value = 0.0002 < 0.05. At the 5% level of significance, there is evidence of a linear
relationship between the daily performance of stocks and bonds.