
CHAPTER 7

MULTIPLE REGRESSION

ANSWERS TO ODD NUMBERED PROBLEMS

1. A good predictor variable is related to the dependent variable but not too highly
related to other predictor variables.

3. The net regression coefficient measures the average change in the dependent variable per
unit change in the relevant independent variable, holding the other independent variables
constant.

5. Y = 7.52 + 3(20) - 12.2(7) = -17.88

7. a. Each variable on the primary diagonal is perfectly related to itself.

b. The bottom half of a correlation matrix is the same as the top half.

c. Variables 5 and 6 with correlation coefficients of .79 and .70, respectively.

d. The r14 = -.51 shows a negative or inverse relationship.

e. Yes. Variables 5 and 6 are to some extent collinear, r56 = .69.

f. Models that include variables 4 and 6 or variables 2 and 5 should be best. The
predictor variables in these models are related to the dependent variable and not
too highly related to each other.

g. Variable 5.

9.

a. Both predictor variables are significantly related to the dependent variable. The
predictor variables are highly correlated with each other, indicating potential for
multicollinearity.

b. When income is increased by one thousand dollars, holding family size constant,
the average increase in annual food expenditures is 228 dollars. When family size is
increased by one person, holding income constant, the average decrease in annual
food expenditures is 41 dollars. Since family size is positively related to food
expenditures (.737), a decrease does not make sense.

c. Multicollinearity is a problem, as indicated by VIFs of 4.0. Use only one of the
predictor variables.
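
The VIF claim can be checked numerically. Below is a minimal Python sketch using
statsmodels, with simulated stand-ins for the income and family-size predictors
rather than the textbook data; since VIF = 1/(1 - R2), a VIF of 4.0 corresponds to
R2 = .75 between the predictors.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Simulated stand-ins for the problem's predictors (not the textbook data):
# family size is built to be highly correlated with income.
rng = np.random.default_rng(0)
income = rng.normal(50, 10, 30)
family_size = 0.08 * income + rng.normal(0, 0.5, 30)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j
# on the remaining predictors; include a constant column first.
X = sm.add_constant(np.column_stack([income, family_size]))
for j in (1, 2):  # column 0 is the constant
    print(f"VIF for predictor {j}: {variance_inflation_factor(X, j):.1f}")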

11. a. Scatter diagram below. Female drivers indicated by solid circles, male drivers
by diamonds.

b. The regression equation is: Y = 25.5 - 1.04 X1 + 1.21 X2
For a given age of car, female drivers expect to get about 1.2 more miles
per gallon than male drivers.

c. Fitted line for female drivers: Ŷ = 26.71 - 1.04X1

Fitted line for male drivers: Ŷ = 25.5 - 1.04X1
(Parallel lines with different intercepts)

d. The fitted line falls "between" the points representing female drivers and the
points representing male drivers. A single straight-line equation over-predicts
mileage for male drivers and under-predicts mileage for female drivers, so it is
important to include the gender variable in the regression function.
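
A minimal sketch of the dummy-variable fit in Python (statsmodels), with simulated
mileage data standing in for the textbook values, shows how the single equation
yields the two parallel lines of part c:

import numpy as np
import statsmodels.api as sm

# Simulated stand-ins: X1 = age of car, X2 = 1 for female driver, 0 for male.
rng = np.random.default_rng(1)
age = rng.uniform(1, 10, 40)
female = rng.integers(0, 2, 40)
mpg = 25.5 - 1.04 * age + 1.21 * female + rng.normal(0, 1, 40)

# One regression with a dummy gives parallel lines: the dummy coefficient
# shifts the intercept while the slope on age is shared.
fit = sm.OLS(mpg, sm.add_constant(np.column_stack([age, female]))).fit()
b0, b1, b2 = fit.params
print(f"male line:   Yhat = {b0:.2f} + ({b1:.2f})X1")
print(f"female line: Yhat = {b0 + b2:.2f} + ({b1:.2f})X1")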

13.
          Sales  Outlets    Auto
Outlets   0.739
Auto      0.548    0.670
Income    0.936    0.556   0.281

The regression equation is


Sales = - 3.92 + 0.00238 Outlets + 0.457 Auto + 0.401 Income

Predictor Coef StDev T P


Constant -3.918 2.290 -1.71 0.131
Outlets 0.0024 0.0016 1.52 0.173
Auto 0.4574 0.1675 2.73 0.029
Income 0.40058 0.0378 10.60 0.000

S = 2.668 R-Sq = 97.4% R-Sq(adj) = 96.2%

Analysis of Variance

Source DF SS MS F P
Regression 3 1843.40 614.47 86.32 0.000
Error 7 49.83 7.12
Total 10 1893.23

Source DF Seq SS
Outlets 1 1033.84
Auto 1 9.83
Income 1 799.74

Fit StDev Fit 95.0% CI 95.0% PI


27.306 1.878 ( 22.864, 31.747) ( 19.588, 35.023)

a. Yes! The R2 increases to 97.4%.

b. Y = -3.91771 + .00238(2500) + .45743(20.2) + .40058(40) = 27.296 (million)

c. The standard error of estimate has been reduced to 2.67 and the R2 has increased
to 97.4%. The estimate made in part (b) should be quite accurate.

d. The number of retail outlets (X3) and personal income (X4) would be included in
the best model. The equation is Y = -4.02690 + .62092X3 + .43017X4, which
explains 96.5% of the variance in sales. The number of automobiles registered
would be dropped to eliminate collinearity.
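
The part (b) prediction is just the fitted equation evaluated at the new values; a
short sketch in Python, with the coefficients taken from the output above:

# Part (b) as a direct calculation with the reported coefficients.
coefs = {"const": -3.91771, "Outlets": 0.00238, "Auto": 0.45743, "Income": 0.40058}
new = {"Outlets": 2500, "Auto": 20.2, "Income": 40}

y_hat = coefs["const"] + sum(coefs[k] * new[k] for k in new)
print(f"predicted sales: {y_hat:.3f} (million)")  # about 27.296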

15. a. Scatter plot for cash purchases versus number of items (rectangles) and credit
card purchases versus number of items (solid circles).

b. Minitab regression output follows

Notice that for a given number of items, sales from cash purchases are estimated to be
about $18.60 less than gross sales from credit card purchases.

c. The regression in part b is significant. The number of items sold and
whether the purchases were cash or credit card explains approximately
83% of the variation in gross sales. The predictor variable Items is clearly
significant. The coefficient of the dummy variable X2 is significantly
different from 0 at the 10% level but not at the 5% level. From the
residual plots below we see that there are a few large residuals (see, in
particular, cash sales for day 25 and credit card sales for day 1); but
overall, plots do not indicate any serious departures from the usual
regression assumptions.

Y = 13.61 + 5.99(25) – 18.6(1) = $145

d. sy·x's = 30.98   df = 47   t.025 = z.025 = 1.96

95% (large sample) prediction interval: 145 ± 1.96(30.98) = ($84, $206)
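
A quick check of the interval arithmetic in Python (scipy for the normal quantile):

from scipy import stats

# Large-sample 95% prediction interval: point forecast +/- z * s.
y_hat, s = 145, 30.98
z = stats.norm.ppf(0.975)  # 1.96
print(f"(${y_hat - z * s:.0f}, ${y_hat + z * s:.0f})")  # roughly ($84, $206)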

e. Fitted function in part b is effectively two parallel straight lines given by the
equations:
Cash purchases: Y = 13.61 + 5.99Items – 18.6(1) = -4.98 + 5.99Items
Credit card purchases: Y = 13.61 + 5.99Items

If we fit separate straight lines to the two types of purchases we get:


Cash purchases: Y = -.60 + 5.78Items R2 = 90.5%

Credit card purchases: Y = 10.02 + 6.46Items R2 = 66.0%
Predictions for cash sales and credit card sales will not be too much different
for the two procedures (one prediction equation or two individual equations).
In terms of R2, the single equation model falls between the fits of the separate
models for cash purchases and credit card purchases but closer to the higher
number for cash purchases. For convenience and overall good fit, prefer the
single equation with the dummy variable.

17. a. View, Area, Elevation, and Slope

b. View and Area

19. Stepwise regression results, with significance level .05 to enter and leave the
regression function, follow.

Alpha-to-Enter: 0.05 Alpha-to-Remove: 0.05

Response is Y on 3 predictors, with N = 20

Step 1
Constant -26.24

X3 31.4
T-Value 3.30
P-Value 0.004

S 14.6
R-Sq 37.71
R-Sq(adj) 34.25

The “best” regression model relates final exam score to the single predictor
variable grade point average.

All possible regression results are summarized in the following table.

Predictor Variables    R2
X1                    .295
X2                    .301
X3                    .377
X1, X2                .404
X1, X3                .452
X2, X3                .460
X1, X2, X3            .498

The R2 criterion would suggest using all three predictor variables. However, the
results in problem 7.17 suggest there is a multicollinearity problem with three
predictors. The best two-predictor model uses X2 and X3. When this model is fit,
X2 is not required, leaving a model with the single predictor X3, the model
selected by the stepwise procedure.

CHAPTER 8

REGRESSION WITH TIME SERIES DATA

ANSWERS TO ODD NUMBERED PROBLEMS

1. The residuals are not independent from one observation to the next. So knowledge of
the error in one time period helps an analyst anticipate the error in the next time period.

3. The error terms are supposed to be independent of each other.

5. Reject H0 if DW < 1.10. Since 1.0 < 1.1, reject and conclude that the residuals are
positively autocorrelated.

7. Serial correlation can be eliminated by correct specification of the equation (using the
best predictor variables). Using the regression of percentage changes, autoregressive
models, first differencing, and the iterative approach can also eliminate serial
correlation.

9. The regression equation is


FUEL = 129 - 9.59 PRICE - 0.202 POP

Predictor Coef Stdev t-ratio p


Constant 129.29 16.58 7.80 0.000
PRICE -9.588 2.780 -3.45 0.004
POP -0.20157 0.08007 -2.52 0.026

s = 2.275 R-sq = 83.5% R-sq(adj) = 81.0%

Analysis of Variance

SOURCE DF SS MS F p
Regression 2 340.29 170.15 32.87 0.000
Error 13 67.29 5.18
Total 15 407.58

SOURCE DF SEQ SS
PRICE 1 307.49
POP 1 32.81

Durbin-Watson statistic = 2.37

The null and alternative hypotheses are:

H0: ρ = 0    H1: ρ > 0

Using the .05 significance level for a sample of 16 with 2 predictor variables, the
decision rule is:

Reject the null hypothesis if the calculated Durbin-Watson statistic is less
than .98. Fail to reject the null hypothesis if the calculated Durbin-Watson
statistic is greater than 1.54. If the calculated Durbin-Watson statistic lies
between .98 and 1.54, the test is inconclusive.

Since the test statistic computed from the sample data is above the upper critical
value from the table (2.37 > 1.54), fail to reject. Serial correlation is not a problem.
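
The Durbin-Watson statistic is computed from the regression residuals; a minimal
Python sketch (statsmodels), using simulated stand-ins for FUEL, PRICE, and POP
rather than the textbook data:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Simulated stand-ins for the problem's series (n = 16, as in the problem).
rng = np.random.default_rng(2)
price = rng.uniform(1, 3, 16)
pop = rng.uniform(100, 200, 16)
fuel = 129 - 9.6 * price - 0.2 * pop + rng.normal(0, 2, 16)

# A DW value near 2 indicates no first-order serial correlation in the residuals.
fit = sm.OLS(fuel, sm.add_constant(np.column_stack([price, pop]))).fit()
print(f"Durbin-Watson: {durbin_watson(fit.resid):.2f}")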

11. Serial correlation is not a problem. However, it is interesting to see whether the students
realize that collinearity is a problem.

           Revenue     Use  Charge
Use          0.187
Charge       0.989   0.109
Customer     0.918   0.426   0.891

The regression equation is


Revenue = - 159 - 0.0323 Use + 15.7 Charge + 0.00349 Customer

Predictor Coef StDev T P


Constant -159.5 575.9 -0.28 0.784
Use -0.03226 0.05762 -0.56 0.581
Charge 15.74 93.48 0.17 0.868
Customer 0.003486 0.005311 0.66 0.518

S = 268.1 R-Sq = 19.0% R-Sq(adj) = 8.9%

Analysis of Variance

Source DF SS MS F P
Regression 3 404256 134752 1.88 0.161
Error 24 1724481 71853
Total 27 2128736

Source DF Seq SS
Use 1 1195
Charge 1 372102
Customer 1 30959

Durbin-Watson statistic = 2.21

The regression equation is


Revenue = - 57.6 + 0.00328 Use + 32.7 Charge

Predictor Coef StDev T P


Constant -57.60 14.03 -4.11 0.000

Use 0.003284 0.001039 3.16 0.004
Charge 32.7488 0.8472 38.66 0.000

S = 7.047 R-Sq = 98.4% R-Sq(adj) = 98.3%

Analysis of Variance

Source DF SS MS F P
Regression 2 76938 38469 774.66 0.000
Error 25 1241 50
Total 27 78180

Source DF Seq SS
Use 1 2734
Charge 1 74204

Durbin-Watson statistic = 1.82

13. a.

b. No

c. Using the exponential model

d.

e. No

f. Autocorrelation

g. Yt = 22.6459(1.08655)^t

Y26 = 22.6459(1.08655)^26 = 196
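
The part (g) forecast is a direct evaluation of the exponential trend; a one-line
check in Python:

# Exponential trend forecast for period 26 with the fitted coefficients.
b0, b1 = 22.6459, 1.08655
print(f"Y26 = {b0 * b1 ** 26:.0f}")  # about 196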

15.
ROW C1 C2 C3 C4

1 16.3 0 0 0
2 17.7 1 0 0
3 28.1 0 1 0
4 34.3 0 0 1

1996(3rd Qt) Y = 19.3 - 1.43(0) + 11.2(1) + 33.3(0) = 30.5

1996(4th Qt) Y = 19.3 - 1.43(0) + 11.2(0) + 33.3(1) = 52.6

The model explains 80.1% of the sales variable variance. Autocorrelation does not
appear to be a problem.
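
The quarterly forecasts follow mechanically from the dummy-variable equation; a
small Python sketch (quarter 1 is the base level):

# Fitted equation: Y = 19.3 - 1.43*D2 + 11.2*D3 + 33.3*D4, D's flag quarters 2-4.
def forecast(d2, d3, d4):
    return 19.3 - 1.43 * d2 + 11.2 * d3 + 33.3 * d4

print(forecast(0, 1, 0))  # 1996 3rd quarter: 30.5
print(forecast(0, 0, 1))  # 1996 4th quarter: 52.6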

17. The regression equation is


DiffSales = 149 + 9.16 DiffIncome

20 cases used 1 cases contain missing values

Predictor Coef StDev T P


Constant 148.92 97.70 1.52 0.145
DiffInco 9.155 2.034 4.50 0.000

S = 239.7 R-Sq = 53.0% R-Sq(adj) = 50.3%

Analysis of Variance
Source DF SS MS F P
Regression 1 1164598 1164598 20.27 0.000
Residual Error 18 1034389 57466
Total 19 2198987

Here DiffSales = Y't = Yt - Yt-1 and DiffIncome = X't = Xt - Xt-1. The results
involving simple differences are close to the results obtained by the method of
generalized differences in Example 8.5. The estimated slope coefficient is 9.16
versus an estimated slope coefficient of 9.26 obtained with generalized
differences. The intercept coefficient 149 is also somewhat consistent with the
intercept coefficient 54483(1 - .997) = 163 for the generalized differences
procedure. We would expect the two methods to produce similar results.
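
A minimal sketch of the simple-differences regression in Python (statsmodels), with
simulated stand-ins for the sales and income series; one observation is lost to
differencing, matching the "20 cases used" note above:

import numpy as np
import statsmodels.api as sm

# Simulated stand-ins for the textbook sales and income series (21 periods).
rng = np.random.default_rng(3)
income = np.cumsum(rng.normal(50, 10, 21))
sales = 150 + 9 * income + np.cumsum(rng.normal(0, 200, 21))

diff_sales = np.diff(sales)    # Y't = Yt - Yt-1 (first observation is lost)
diff_income = np.diff(income)  # X't = Xt - Xt-1

fit = sm.OLS(diff_sales, sm.add_constant(diff_income)).fit()
print(fit.params)  # intercept and slope of the differenced regression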

19. a. Classical decomposition, Winters' exponential smoothing, time series multiple
regression, autoregressive models, and ARIMA models.

b. The data are seasonal, so yes, an autoregressive model might work.

c. Look at the Durbin-Watson statistic.

d.

Durbin-Watson statistic = 0.818021

e.
May 31 (2004) Y = 421.4 + .85273(2118) = 2227.5 compared to 2150
Aug 31 (2004) Y = 421.4 + .85273(2221) = 2315.3 compared to 2350
Nov 30 (2004) Y = 421.4 + .85273(2422) = 2486.7 compared to 2600
Feb 28 (2005) Y = 421.4 + .85273(3239) = 3183.3 compared to 3400

Estimates are pretty good.

f. The Durbin-Watson statistic of .82 shows that serial correlation is a problem.

CHAPTER 9

BOX-JENKINS (ARIMA) METHODOLOGY

ANSWERS TO ODD NUMBERED PROBLEMS

1. a. 0 ± .196

b. series is random

c. series is not stationary

d. seasonal series with period of 4

3. a. Ŷ51 = 75.65   Ŷ52 = 84.04   Ŷ53 = 87.82

b. Ŷ52 = 76.55   Ŷ53 = 84.45

c. 75.65 ± 2(3.2)

5. a. MA(2)

b. AR(1)

c. ARIMA(1,0,1)

7. a. Autocorrelations of the original series fail to die out, suggesting that demand is
non-stationary. Autocorrelations for first differences of demand do die out
(cut off relative to standard error limits), suggesting the series of first
differences is stationary. Low lag autocorrelations of the series of second
differences increase in magnitude, suggesting second differencing is too much.
A plot of the demand series shows the series increasing linearly in time with
almost a perfect (deterministic) straight-line pattern. In fact, a straight-line
time trend fit to the demand data represents the data well, as shown in the
plot below.

If an ARIMA model is fit to the demand data, the autocorrelations and plots of
the original series and the series of first differences suggest an ARIMA(0,1,1)
model with a constant term might be a good starting point. The first order
moving average term is suggested by the significant autocorrelation at lag 1 for
the first differenced series.

b. The Minitab output from fitting an ARIMA(0,1,1) model with a constant is
shown below.

The least squares estimate of the constant term, .7127, is virtually the same as the
least squares slope coefficient in the straight line fit shown in part a. Also, the
first order moving average coefficient is essentially 1. These two results
are consistent with a straight line time trend regression model for the original
data.

Suppose Yt is demand in time period t. The straight-line time trend regression
model is Yt = β0 + β1t + εt. Thus Yt-1 = β0 + β1(t - 1) + εt-1 and
Yt - Yt-1 = β1 + εt - εt-1. The latter is an ARIMA(0,1,1) model with a constant
term (the slope coefficient in the straight-line model) and a first order moving
average coefficient of 1.

There is some residual autocorrelation (particularly at lag 2) for both the straight
line fit and the ARIMA(0,1,1) fit, but the usual residual plots indicate no other
problems.

c. Prediction equations for period 53.

Straight-line model: Ŷ53 = 19.97 + .71(53)
ARIMA model: Ŷ53 = Y52 + .71 - 1.00ε̂52

d. The forecasts for the next four periods from forecast origin t = 52 for the
ARIMA model follow. These forecasts are essentially the same as the forecasts
obtained by extrapolating the fitted straight line in part a.

9. Since the autocorrelation coefficients trail off and the partial autocorrelation
coefficients drop off to zero after one time lag, an AR(1) model should be
adequate. The best model is

Ŷt = 109.628 - 0.9377Yt-1

The forecast for period 81 is

Ŷ81 = 109.628 - 0.9377Y80 = 109.628 - 0.9377(85) = 29.92
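
A minimal Python sketch of the AR(1) fit and one-step forecast (statsmodels
AutoReg), with a simulated series standing in for the 80 observations:

import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulate an AR(1) stand-in with a strong negative coefficient, as in the problem.
rng = np.random.default_rng(4)
y = np.empty(80)
y[0] = 55.0
for t in range(1, 80):
    y[t] = 109.6 - 0.94 * y[t - 1] + rng.normal(0, 5)

fit = AutoReg(y, lags=1).fit()
c, phi = fit.params  # intercept and AR(1) coefficient
print(f"Yhat_t = {c:.3f} + ({phi:.4f})Y_t-1")
print("one-step forecast:", c + phi * y[-1])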

Autocorrelation Function for Yt


Lag   Corr      T      LBQ    Lag   Corr      T      LBQ    Lag   Corr      T      LBQ
  1  -0.88  -7.86    64.18      8   0.17   0.60   229.81     15   0.21   0.74   238.24
  2   0.80   4.50   118.31      9  -0.09  -0.31   230.51     16  -0.24  -0.84   244.21
  3  -0.66  -3.03   155.90     10   0.05   0.19   230.78     17   0.26   0.90   251.34
  4   0.59   2.45   186.36     11  -0.01  -0.04   230.79     18  -0.33  -1.13   262.96
  5  -0.48  -1.83   206.31     12  -0.02  -0.08   230.84     19   0.36   1.22   277.15
  6   0.40   1.47   220.35     13   0.08   0.28   231.46     20  -0.39  -1.28   293.47
  7  -0.28  -1.00   227.24     14  -0.15  -0.53   233.73

Partial Autocorrelation Function for Yt


Lag    PAC      T    Lag    PAC      T    Lag    PAC      T
  1  -0.88  -7.86      8  -0.16  -1.44     15  -0.00  -0.04
  2   0.13   1.16      9  -0.16  -1.44     16   0.03   0.28
  3   0.28   2.54     10   0.13   1.19     17   0.01   0.08
  4   0.16   1.43     11   0.11   1.01     18  -0.26  -2.32
  5   0.15   1.30     12  -0.02  -0.19     19  -0.10  -0.87
  6  -0.04  -0.40     13   0.11   0.98     20   0.16   1.42
  7   0.12   1.10     14  -0.22  -1.95

ARIMA model for Yt


Final Estimates of Parameters
Type Coef StDev T
AR 1 -0.9377 0.0489 -19.17
Constant 109.628 0.611 179.57
Mean 56.5763 0.3151

Number of observations: 80
Residuals: SS = 2325.19 (backforecasts excluded)
MS = 29.81 DF = 78
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 24.8(DF=11) 39.4(DF=23) 74.0(DF=35) 83.9(DF=47)

95 Percent Limits
Period Forecast Lower Upper Actual
81 29.9234 19.2199 40.6269
82 81.5688 66.8957 96.2419
83 33.1408 15.7088 50.5728

The critical chi-square for 11 df is 19.68. Since the calculated chi-square for
the residual autocorrelations equals 24.8, the model is deemed inadequate.
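
The same adequacy check can be run in Python with statsmodels' Ljung-Box
function; the sketch below uses simulated residuals as a stand-in for the fitted
model's residual series:

import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

# Stand-in residuals; in practice use the ARIMA fit's residuals.
rng = np.random.default_rng(5)
resid = rng.normal(0, 1, 80)

# Compare lb_stat at lag 12 with the chi-square critical value, or just read
# lb_pvalue: a small p-value flags an inadequate model.
print(acorr_ljungbox(resid, lags=[12], return_df=True))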

11. The slow decline in the early, nonseasonal lags indicates the need for regular
differencing.

Autocorrelation Function for Yt


Lag  Corr     T     LBQ    Lag  Corr     T     LBQ    Lag  Corr     T     LBQ    Lag  Corr     T     LBQ
  1  0.71  6.92   49.43      8  0.54  2.07  309.89     15  0.40  1.22  506.38     22  0.23  0.64  602.29
  2  0.63  4.34   88.66      9  0.50  1.85  337.18     16  0.40  1.20  525.09     23  0.26  0.73  610.88
  3  0.63  3.69  128.66     10  0.45  1.61  359.38     17  0.42  1.26  546.38     24  0.42  1.18  634.13
  4  0.62  3.23  168.41     11  0.50  1.74  387.14     18  0.33  0.97  559.60
  5  0.63  2.98  210.04     12  0.70  2.34  441.38     19  0.35  1.03  574.97
  6  0.56  2.41  242.61     13  0.49  1.56  468.47     20  0.30  0.87  586.38
  7  0.59  2.39  279.06     14  0.41  1.28  487.93     21  0.27  0.78  595.80

Autocorrelation Function for Regular


Lag   Corr      T     LBQ    Lag   Corr      T     LBQ    Lag   Corr      T     LBQ    Lag   Corr      T     LBQ
  1  -0.35  -3.44   12.19      8  -0.01  -0.11   25.78     15  -0.03  -0.19   90.84     22  -0.13  -0.72  109.87
  2  -0.17  -1.49   15.08      9   0.05   0.40   26.06     16  -0.05  -0.28   91.10     23  -0.25  -1.43  118.13
  3   0.01   0.07   15.09     10  -0.17  -1.35   29.20     17   0.25   1.47   98.33     24   0.54   2.98  156.06
  4  -0.03  -0.23   15.16     11  -0.29  -2.22   38.14     18  -0.24  -1.38  105.13     25  -0.14  -0.71  158.67
  5   0.18   1.57   18.65     12   0.65   4.80   85.04     19   0.14   0.79  107.44
  6  -0.21  -1.78   23.41     13  -0.19  -1.14   89.05     20  -0.02  -0.14  107.52
  7   0.15   1.21   25.76     14  -0.12  -0.73   90.72     21   0.05   0.26  107.79

The nonseasonal part of the series now seems to be stationary, and the peaks at
lags 12 and 24 are apparent. The seasonal autocorrelation coefficients seem to
be decaying slowly. Seasonal differencing is necessary. The autocorrelation
and partial autocorrelation plots for the seasonally differenced data are shown
on the next page.

Autocorrelation Function for Seasonal


Lag   Corr      T    LBQ    Lag   Corr      T    LBQ    Lag   Corr      T    LBQ    Lag   Corr      T    LBQ
  1  -0.49  -4.44  20.42      8   0.07   0.48  23.42     15  -0.03  -0.15  61.85     22   0.02   0.12  70.43
  2  -0.03  -0.19  20.47      9   0.01   0.08  23.43     16  -0.11  -0.66  63.13     23  -0.02  -0.12  70.48
  3   0.04   0.30  20.61     10  -0.07  -0.50  23.89     17   0.21   1.22  67.63     24   0.02   0.10  70.51
  4   0.03   0.23  20.70     11   0.27   2.00  31.19     18  -0.13  -0.78  69.58     25   0.03   0.20  70.65
  5  -0.10  -0.76  21.64     12  -0.50  -3.48  55.77     19   0.07   0.40  70.11
  6   0.09   0.67  22.38     13   0.24   1.45  61.38     20  -0.05  -0.28  70.36
  7  -0.08  -0.62  23.02     14   0.06   0.37  61.78     21  -0.01  -0.07  70.38

Partial Autocorrelation Function for Seasonal


Lag    PAC      T    Lag    PAC      T    Lag    PAC      T    Lag    PAC      T
  1  -0.49  -4.44      8  -0.07  -0.66     15  -0.04  -0.37     22  -0.04  -0.38
  2  -0.34  -3.14      9  -0.01  -0.07     16  -0.09  -0.81     23   0.15   1.36
  3  -0.21  -1.95     10  -0.09  -0.79     17   0.01   0.09     24  -0.20  -1.79
  4  -0.09  -0.81     11   0.34   3.10     18   0.04   0.36     25  -0.05  -0.47
  5  -0.17  -1.55     12  -0.32  -2.92     19   0.01   0.09
  6  -0.07  -0.64     13  -0.20  -1.86     20   0.02   0.17
  7  -0.15  -1.36     14  -0.12  -1.06     21   0.02   0.19

Concentrating on the nonseasonal lags, the autocorrelation coefficients drop off
after one time lag and the partial autocorrelation coefficients trail off, so an
MA(1) term should be adequate. Concentrating on the seasonal lags (12 and 24),
the autocorrelation coefficients drop off after lag 12 and the partial
autocorrelation coefficients trail off, so a seasonal MA(1) term should be
adequate. The best model should be an ARIMA(0,1,1)(0,1,1)12.

ARIMA model for Yt

Final Estimates of Parameters


Type Coef StDev T
MA 1 0.7486 0.0742 10.09
SMA 12 0.8800 0.0893 9.85

Differencing: 1 regular, 1 seasonal of order 12


Number of observations: Original series 96, after differencing 83
Residuals: SS = 5744406210 (backforecasts excluded)
MS = 70918595 DF = 81

Modified Box-Pierce (Ljung-Box) Chi-Square statistic


Lag 12 24 36 48
Chi-Square 3.0(DF=10) 19.3(DF=22) 23.0(DF=34) 25.1(DF=46)

95 Percent Limits
Period Forecast Lower Upper Actual
97 163500 146991 180009
98 158300 141277 175322
99 177084 159562 194606
100 178792 160785 196798
101 188706 170227 207185
102 184846 165907 203785
103 191921 172532 211310
104 188746 168918 208574
105 185194 164936 205451
106 187669 166991 208348
107 188084 166993 209175
108 221521 200025 243016

The critical chi-square for 10 df is 18.3. Since the calculated chi-square for the
residual autocorrelations equals 3.0, the model is deemed adequate.
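
A minimal Python sketch of fitting the selected ARIMA(0,1,1)(0,1,1)12 model with
statsmodels, using a simulated monthly series as a stand-in for the 96 textbook
observations:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulated stand-in for the monthly series (96 observations).
rng = np.random.default_rng(6)
y = 150000 + np.cumsum(rng.normal(500, 3000, 96))

# One regular and one seasonal difference, MA(1) and seasonal MA(1) terms.
fit = ARIMA(y, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12)).fit()
print(fit.params)              # MA(1) and seasonal MA(1) estimates
print(fit.forecast(steps=12))  # forecasts for periods 97-108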

13. One question that might arise is whether the student should use the first 145
observations or all 150 observations. In this case it will not make much difference.
The autocorrelation plot indicates that the data are nonstationary. Therefore, the
data should be first differenced.

Autocorrelation Function for DEF


Lag Corr T LBQ Lag Corr T LBQ Lag Corr T LBQ Lag Corr T LBQ

1 0.53 6.43 42.20 13 0.19 1.13 269.28 25 0.17 0.95 322.51 37 -0.09 -0.49 346.51
2 0.51 4.97 81.55 14 0.25 1.44 279.58 26 0.16 0.89 327.38
3 0.42 3.59 109.07 15 0.13 0.76 282.53 27 0.15 0.84 331.77
4 0.41 3.21 134.96 16 0.20 1.13 289.20 28 0.02 0.13 331.88
5 0.37 2.75 156.75 17 0.21 1.16 296.43 29 0.09 0.48 333.36
6 0.38 2.65 179.11 18 0.20 1.15 303.67 30 0.01 0.05 333.37
7 0.43 2.88 208.20 19 0.10 0.53 305.26 31 0.10 0.56 335.40
8 0.36 2.27 228.45 20 0.14 0.75 308.48 32 0.16 0.87 340.46
9 0.32 1.95 244.61 21 0.04 0.22 308.75 33 0.11 0.60 342.88
10 0.24 1.43 253.73 22 0.11 0.60 310.84 34 0.05 0.26 343.35
11 0.18 1.05 258.87 23 0.10 0.56 312.67 35 -0.08 -0.45 344.76
12 0.16 0.95 263.15 24 0.16 0.86 317.10 36 -0.02 -0.10 344.84

The autocorrelation coefficient and partial autocorrelation coefficient plots for the first
differenced data are also shown on the next page.

Autocorrelation Function for Diff.


Lag   Corr      T     LBQ    Lag   Corr      T     LBQ    Lag   Corr      T     LBQ    Lag   Corr      T     LBQ
  1  -0.48  -5.84   34.76     13  -0.03  -0.27   41.25     25   0.03   0.23   70.52     37  -0.09  -0.74  109.44
  2   0.07   0.70   35.49     14   0.18   1.76   46.60     26  -0.00  -0.02   70.52
  3  -0.07  -0.72   36.27     15  -0.19  -1.80   52.47     27   0.13   1.12   73.43
  4   0.02   0.17   36.32     16   0.06   0.57   53.09     28  -0.20  -1.78   80.98
  5  -0.04  -0.42   36.60     17   0.01   0.08   53.10     29   0.15   1.29   85.15
  6  -0.05  -0.51   37.00     18   0.11   1.08   55.36     30  -0.18  -1.57   91.51
  7   0.13   1.33   39.78     19  -0.15  -1.42   59.36     31   0.03   0.29   91.74
  8  -0.04  -0.39   40.02     20   0.14   1.30   62.81     32   0.11   0.95   94.17
  9   0.04   0.43   40.32     21  -0.17  -1.58   68.12     33   0.02   0.14   94.23
 10  -0.02  -0.20   40.39     22   0.08   0.72   69.28     34   0.07   0.62   95.30
 11  -0.05  -0.46   40.74     23  -0.07  -0.60   70.07     35  -0.21  -1.71  103.64
 12  -0.05  -0.48   41.13     24   0.04   0.38   70.40     36   0.14   1.18  107.75

It appears that the autocorrelations drop off after lag one and that the partial
autocorrelations trail off to zero. This suggests an MA(1) model. If 145
observations are used, the model is

Ŷt = Yt-1 - 0.7179ε̂t-1

The critical chi-square for 23 df is 36.4 at the 5% significance level. Since the
calculated chi-square for the residual autocorrelations equals 29.5, the model is
deemed adequate.

The forecast for period 146 is

Ŷ146 = Y145 - 0.7179ε̂145

Ŷ146 = 133.6 - 0.7179(-0.30119) = 133.816

MAPE for observations 146-150 is 1.82%.
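
MAPE over the holdout periods is the mean of |actual - forecast| / actual; a small
Python sketch with placeholder values (the actual holdout data for periods 146-150
are not reproduced here):

import numpy as np

# Placeholder holdout actuals and forecasts for periods 146-150.
actual = np.array([133.9, 134.5, 135.0, 133.2, 134.1])
forecast = np.array([133.8, 134.1, 134.6, 133.9, 134.3])

mape = np.mean(np.abs((actual - forecast) / actual)) * 100
print(f"MAPE: {mape:.2f}%")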

15.

Autocorrelation Function for Price


Lag  Corr      T      LBQ    Lag  Corr     T      LBQ    Lag   Corr     T       LBQ    Lag   Corr      T       LBQ
  1  0.96  10.56   114.31     10  0.63  1.93   800.80     19   0.29  0.76   1048.79     28  -0.06  -0.16   1068.56
  2  0.92   5.98   220.03     11  0.59  1.76   847.89     20   0.24  0.64   1057.31     29  -0.08  -0.22   1069.71
  3  0.88   4.50   316.31     12  0.56  1.61   889.73     21   0.19  0.50   1062.65     30  -0.11  -0.29   1071.65
  4  0.83   3.69   403.83     13  0.51  1.45   925.67     22   0.14  0.37   1065.69
  5  0.80   3.19   484.45     14  0.47  1.32   956.68     23   0.10  0.26   1067.15
  6  0.76   2.81   558.58     15  0.43  1.18   982.17     24   0.06  0.15   1067.66
  7  0.72   2.52   626.60     16  0.39  1.06  1003.42     25   0.02  0.05   1067.72
  8  0.69   2.30   689.53     17  0.36  0.96  1021.51     26  -0.01  -0.03  1067.74
  9  0.67   2.11   748.02     18  0.33  0.87  1036.82     27  -0.04  -0.10  1067.96

The autocorrelation coefficient plot indicates that the data are nonstationary.
Therefore, the data should be first differenced.

Autocorrelation Function for Diff.


Lag   Corr      T    LBQ    Lag   Corr      T    LBQ    Lag   Corr      T    LBQ    Lag   Corr      T    LBQ
  1   0.07   0.76   0.59     10   0.03   0.33   6.87     19   0.04   0.40  20.74     28  -0.01  -0.13  27.68
  2   0.10   1.08   1.81     11   0.01   0.15   6.90     20   0.05   0.46  21.08     29   0.02   0.14  27.72
  3  -0.04  -0.47   2.04     12   0.15   1.50   9.73     21  -0.10  -0.94  22.50
  4  -0.13  -1.39   4.15     13  -0.08  -0.80  10.57     22  -0.06  -0.52  22.96
  5  -0.02  -0.20   4.20     14   0.12   1.25  12.67     23  -0.01  -0.06  22.96
  6  -0.03  -0.32   4.31     15  -0.14  -1.39  15.34     24  -0.05  -0.45  23.31
  7  -0.06  -0.64   4.79     16  -0.14  -1.42  18.25     25  -0.16  -1.46  27.01
  8   0.01   0.10   4.80     17  -0.01  -0.11  18.27     26  -0.03  -0.31  27.18
  9   0.12   1.28   6.74     18   0.12   1.20  20.49     27  -0.05  -0.50  27.65

The autocorrelations for the first differenced series are random. Box-Jenkins is
not the appropriate technique for forecasting this series.

17. The variation in Disney sales increases with the level, so a log transformation
seems appropriate. Let Yt be the natural log of sales and Wt = Yt - Yt-4 be the
seasonally differenced series. Two ARIMA models that represent the data
reasonably well are ARIMA(1,0,0)(0,1,1)4 and ARIMA(0,1,1)(0,1,1)4. The former
model contains a constant. The results for the ARIMA(1,0,0)(0,1,1)4 process are
displayed below.

Fitted model: Wt = .50Wt-1 + .089 + εt - .49εt-4
Final Estimates of Parameters
Type Coef SE Coef T P
AR 1 0.4991 0.1164 4.29 0.000
SMA 4 0.4863 0.1196 4.07 0.000
Constant 0.0886 0.0063 14.07 0.000

Differencing: 0 regular, 1 seasonal of order 4


Number of observations: Original series 63, after differencing 59

Forecasts:

Date      Forecast (ln sales)   Forecast (sales)
Q4 1995   8.25008               3828
Q1 1996   8.12423               3375
Q2 1996   8.11642               3349
Q3 1996   8.24372               3804
Q4 1996   8.43698               4615
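
A sketch of the same approach in Python (statsmodels), with a simulated quarterly
series standing in for Disney sales; the seasonal difference is taken by hand so
the ARMA part with a constant can be fit directly, and forecasts are
back-transformed from logs:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulated stand-in for 63 quarters of sales with level-dependent variation.
rng = np.random.default_rng(7)
sales = np.exp(np.linspace(6.0, 8.3, 63) + rng.normal(0, 0.05, 63))

y = np.log(sales)      # log transform stabilizes the growing variance
w = y[4:] - y[:-4]     # seasonal difference: Wt = Yt - Yt-4

# AR(1) plus seasonal MA(1) with a constant, i.e. ARIMA(1,0,0)(0,1,1)4 on Y.
fit = ARIMA(w, order=(1, 0, 0), seasonal_order=(0, 0, 1, 4), trend="c").fit()

w_fc = fit.forecast(steps=4)
y_fc = y[-4:] + w_fc   # undo the seasonal difference one year ahead
print(np.exp(y_fc))    # forecasts back on the sales scale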

CHAPTER 10

JUDGMENTAL FORECASTING AND FORECAST ADJUSTMENTS

ANSWERS TO ODD NUMBERED PROBLEMS

1. The Delphi method can be used in any forecasting situation where there is little or no
historical data and there is expert opinion (experience) available. Two examples might
be:

• First year sales for a new product

• Full capacity employment at a new plant

CHAPTER 11

MANAGING THE FORECASTING PROCESS

ANSWERS TO ODD NUMBERED PROBLEMS

1.

a. One response: Forecasts may not be right, but they improve the odds of being close to
right. More importantly, if there is no agreed-upon set of forecasts to drive planning,
different groups may develop their own procedures to guide planning, with potential
chaos as the result.

b. One response: Analogy: if you think education is expensive, try ignorance. Having a
good set of forecasts is like walking while looking ahead instead of at your shoes.
Planning without forecasts will lead to inefficient operations, suboptimal returns on
investment, poor customer service, and so forth.

c. One response: Good forecasts require not only good quantitative skills but also an
in-depth understanding of the business and, ultimately, good communication skills to
sell the forecasts to management.
