TCMG - MEEG 573 - SP - 20 - Lecture - 7
http://www.ct.gov/dcp/cwp/view.asp?a=1629&q=428698
Metropolitan Museum of Art, NY
Admission
Recommended
Adults $20
Seniors (65 and older) $15
Students $10*
Members Free
Children under 12 (accompanied by an adult) Free
Forecasting Models
• Subjective Models
Delphi Methods
• Causal Models
Regression Models
• Time Series Models
Moving Averages
Exponential Smoothing
Forecasting Methods
• Qualitative: primarily subjective; rely on
judgment and opinion
• Time Series: use historical demand only
– Static
– Adaptive
• Causal: use the relationship between demand
and some other factor to develop a forecast
• Simulation
– Imitate real life
– Can combine time series and causal methods
Characteristics of Forecasts
• Y = f(x) + e
• Y = b0 + b1x + e
LR assumes:
1. Linear relationship
2. Multivariate normality
3. No or little multicollinearity
4. No auto-correlation
yi = b0 + b1xi + ei
Simple Linear Regression
http://www.stat.ncsu.edu/people/reiland/courses/st302/simple_lin_regress_inference.ppt
Introduction
The Model
The model has a deterministic and a probabilistic component.
Example: similar houses sell for about $25,000, and building costs about $75 per square foot, so
House cost = 25,000 + 75(Size)
[Figure: house cost plotted against house size, with the line House cost = 25,000 + 75(Size)]
The Model
y = b0 + b1x + e
– e = error variable, b0 = intercept, b1 = slope (rise over run)
[Figure: the line y = b0 + b1x against x]
Estimating the Coefficients
• The estimates are determined by
– drawing a sample from the population of interest,
– calculating sample statistics.
– producing a straight line that cuts into the data.
The Least Squares (Regression) Line
Let us compare two lines through the points (1,2), (2,4), (3,1.5), (4,3.2). The second line is horizontal at ŷ = 2.5.
Sum of squared differences for the first line = (2 − 1)² + (4 − 2)² + (1.5 − 3)² + (3.2 − 4)² = 7.89
Sum of squared differences for the horizontal line = (2 − 2.5)² + (4 − 2.5)² + (1.5 − 2.5)² + (3.2 − 2.5)² = 3.99
The smaller the sum of squared differences, the better the fit of the line to the data.
[Figure: the four data points with the two candidate lines]
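The two candidate fits can be checked numerically. A minimal sketch (assuming, from the differences shown, that the first line is ŷ = x and the second is the horizontal line ŷ = 2.5):

```python
# Four sample points from the slide: (x, y)
points = [(1, 2), (2, 4), (3, 1.5), (4, 3.2)]

def sum_sq_diff(points, predict):
    """Sum of squared differences between observed y and predicted y."""
    return sum((y - predict(x)) ** 2 for x, y in points)

ssd_line = sum_sq_diff(points, lambda x: x)    # the line y-hat = x
ssd_flat = sum_sq_diff(points, lambda x: 2.5)  # the horizontal line y-hat = 2.5

# The horizontal line fits better here: 3.99 < 7.89
print(ssd_line, ssd_flat)
```

Least squares simply searches over all possible (b0, b1) for the line that minimizes this sum.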
The Estimated Coefficients
b1 = cov(X,Y) / s_x²
b0 = ȳ − b1·x̄
The Simple Linear Regression Line
The Simple Linear Regression Line
• Solution
– Solving by hand: calculate a number of statistics
x̄ = 36,009.45;  s_x² = Sum((x_i − x̄)²) / (n − 1) = 43,528,690
ȳ = 14,822.823;  cov(X,Y) = Sum((x_i − x̄)(y_i − ȳ)) / (n − 1) = −2,712,511
where n = 100.
b1 = cov(X,Y) / s_x² = −2,712,511 / 43,528,690 = −0.06232
b0 = ȳ − b1·x̄ = 14,822.82 − (−0.06232)(36,009.45) = 17,067
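The hand calculation above can be reproduced directly from the summary statistics (values taken from this example):

```python
# Summary statistics from the used-car example (n = 100)
x_bar = 36_009.45    # mean odometer reading
y_bar = 14_822.823   # mean auction price
s2_x = 43_528_690    # sample variance of x
cov_xy = -2_712_511  # sample covariance of x and y

b1 = cov_xy / s2_x        # slope estimate
b0 = y_bar - b1 * x_bar   # intercept estimate

print(round(b1, 5))  # -0.06232
print(round(b0))     # 17067
```

The same numbers appear in the regression printout that follows.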
• Solution – continued
– Using the computer
Regression Statistics
Multiple R         0.8063
R Square           0.6501
Adjusted R Square  0.6466
Standard Error     303.1
Observations       100

ŷ = 17,067 − 0.0623x

ANOVA
            df   SS        MS        F       Significance F
Regression   1   16734111  16734111  182.11  0.0000
Residual    98    9005450     91892
Total       99   25739561
[Figure: scatter of Price against Odometer with the fitted line; no data near x = 0]
ŷ = 17,067 − 0.0623x
The intercept is b0 = $17,067. The slope of the line: for each additional mile on the odometer,
the price decreases by an average of $0.0623.
SSE = Sum(i=1 to n)[(y_i − ŷ_i)²]
– A shortcut formula: SSE = (n − 1)(s_y² − cov(X,Y)² / s_x²)
[Figure: scatter plot of the data points around the fitted line]
– The rejection region is t > t.025 or t < −t.025 with d.f. = n − 2 = 98.
Approximately, t.025 = 1.984
Testing the Slope - Example
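The worked numbers for this example are garbled in this copy, but the slope test can be reconstructed from values already shown, using the standard formula s_b1 = s_ε / sqrt((n − 1)·s_x²). A sketch (note that t² should match the printout's F = 182.11, since t² = F in simple regression):

```python
import math

n = 100
b1 = -0.06232      # estimated slope
s_eps = 303.1      # standard error of estimate, from the printout
s2_x = 43_528_690  # sample variance of x

# Standard error of the slope estimate
s_b1 = s_eps / math.sqrt((n - 1) * s2_x)
t = b1 / s_b1  # about -13.5

# |t| far exceeds the critical value 1.984, so reject H0: beta1 = 0
print(abs(t) > 1.984)  # True
```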
R² = [Sum((x_i − x̄)(y_i − ȳ))]² / [Sum((x_i − x̄)²) · Sum((y_i − ȳ)²)] = cov(X,Y)² / (s_x² · s_y²)
or R² = 1 − SSE / Sum((y_i − ȳ)²)
Two data points (x1,y1) and (x2,y2) of a certain sample are shown.
[Figure: the two points, their fitted values on the regression line, and the mean ȳ, at x1 and x2]
Total variation in y = Variation explained by the regression line + Unexplained variation (error)
(y1 − ȳ)² + (y2 − ȳ)² = (ŷ1 − ȳ)² + (ŷ2 − ȳ)² + (y1 − ŷ1)² + (y2 − ŷ2)²
Coefficient of determination
R² = 1 − SSE / Sum((y_i − ȳ)²) = [Sum((y_i − ȳ)²) − SSE] / Sum((y_i − ȳ)²) = SSR / Sum((y_i − ȳ)²)
• Example
– Find the coefficient of determination for the used car price –
odometer example. What does this statistic tell you about the
model?
• Solution
– Solving by hand:
R² = cov(X,Y)² / (s_x² · s_y²) = [−2,712,511]² / [(43,528,690)(259,996)] = 0.6501
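Plugging the example's statistics in confirms the by-hand value:

```python
cov_xy = -2_712_511  # sample covariance of odometer and price
s2_x = 43_528_690    # sample variance of odometer reading
s2_y = 259_996       # sample variance of price

r2 = cov_xy ** 2 / (s2_x * s2_y)
print(round(r2, 4))  # 0.6501
```

This matches the "R Square" entry of the regression printout.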
Coefficient of Determination
Regression Statistics
Multiple R         0.8063
R Square           0.6501
Adjusted R Square  0.6466
Standard Error     303.1
Observations       100

65% of the variation in the auction selling price is explained by the
variation in odometer reading. The rest (35%) remains unexplained by
this model.

ANOVA
            df   SS        MS        F       Significance F
Regression   1   16734111  16734111  182.11  0.0000
Residual    98    9005450     91892
Total       99   25739561
• Averaging methods
– If a time series is generated by a constant process
subject to random error, then the mean is a useful
statistic and can be used as a forecast for the next
period.
– Averaging methods are suitable for stationary time
series data, where the series is in equilibrium around
a constant value (the underlying mean) with a
constant variance over time.
N-Period Moving Average
Let: MA_T = the N-period moving average at the end of period T
A_T = actual observation for period T
Then MA_T = (A_T + A_{T−1} + … + A_{T−N+1}) / N, and the forecast for period T+1 is F_{T+1} = MA_T.
Characteristics:
Need N observations to make a forecast
Very inexpensive and easy to understand
Gives equal weight to all observations
Does not consider observations older than N periods
Moving Average Example

Saturday  Period  Occupancy  Three-Period Moving Average  Forecast
Aug. 1    1       79
8         2       84
15        3       83         82
22        4       81         83                           82
29        5       98         87                           83
Sept. 5   6       100        93                           87
12        7       93
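The forecast column above follows directly from the definition: the forecast for a period is the moving average at the end of the previous period. A minimal sketch (function name is illustrative, not from the slides):

```python
def moving_average_forecasts(data, n):
    """Forecast for period t+1 = mean of the n most recent observations."""
    return [sum(data[t - n:t]) / n for t in range(n, len(data) + 1)]

occupancy = [79, 84, 83, 81, 98, 100, 93]  # periods 1..7 from the example
# Forecasts for periods 4..8 from the three-period moving average
forecasts = moving_average_forecasts(occupancy, 3)
print([round(f) for f in forecasts])  # [82, 83, 87, 93, 97]
```

Rounded, the first four values reproduce the slide's forecasts for periods 4 through 7; the last value is the forecast for the next (eighth) Saturday.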
– When new data become available, the forecast for time t+2 is the new mean,
including the previously observed data plus this new observation:
F_{t+2} = (1 / (t+1)) · Sum(i=1 to t+1)[y_i]
– With a k-period moving average, the forecast is the mean of the k most recent observations:
F_{t+1} = ŷ_{t+1} = (y_t + y_{t−1} + y_{t−2} + … + y_{t−k+1}) / k
i.e., F_{t+1} = (1 / k) · Sum(i=t−k+1 to t)[y_i]
Example: Weekly Department Store Sales
Weekly sales (y, in dollars) over 25 weeks; e.g., week 7: 5.6, week 8: 5.6, week 9: 5.4, …,
week 22: 6.7, week 23: 5.2, week 24: 6, week 25: 5.8.
[Figure: weekly sales (y) plotted against weeks 0–30]
Example: Weekly Department Store Sales (three-week moving-average forecasts)

Week  Sales  Forecast
5     5.6    5.2
6     4.8    5.6
7     5.6    5.4
8     5.6    5.333333
9     5.4    5.333333
10    6.5    5.533333
11    5.1    5.833333
12    5.8    5.666667
13    5      5.8
14    6.2    5.3
15    5.6    5.666667
16    6.7    5.6
17    5.2    6.166667
18    5.5    5.833333
19    5.8    5.8
20    5.1    5.5
21    5.8    5.466667
22    6.7    5.566667
23    5.2    5.866667
24    6      5.9
25    5.8    5.966667
26           5.666667

[Figure: sales and three-week moving-average forecasts plotted against weeks 0–30]
Example: Forecast the University of Michigan Index of Consumer Sentiment
using the Simple Exponential Smoothing method.
Monthly observations include Jan-96: 89.3, Feb-96: 88.5, Mar-96: 93.7; the series
continues through Jan-97 (see the table below).
Example: University of Michigan Index
of Consumer Sentiment
• Since no forecast is available for the first period, set the initial
estimate equal to the first observation.
[Figure: consumer sentiment index and exponentially smoothed series
(smoothing constants 0.3 and 0.6) plotted by date]
Example: University of Michigan Index
of Consumer Sentiment
The initial forecast is set to the first observed value. The forecasts from
Feb. 95 (t = 2) and Mar. 95 (t = 3) onward are evaluated as follows:
ŷ_{t+1} = ŷ_t + α(y_t − ŷ_t)

Date    Consumer Sentiment  Alpha = 0.3  Alpha = 0.6
Jul-95  94.4                92.81        91.98
Aug-95  96.2                93.29        93.43
Sep-95  88.9                94.16        95.09
Oct-95  90.2                92.58        91.38
Feb-96  88.5                90.38        89.69
Mar-96  93.7                89.81        88.98
Apr-96  92.7                90.98        91.81
May-96  89.4                91.50        92.34
Jun-96  92.4                90.87        90.58
Jul-96  94.7                91.33        91.67
Aug-96  95.3                92.34        93.49
Sep-96  94.7                93.23        94.58
Oct-96  96.5                93.67        94.65
Nov-96  99.2                94.52        95.76
Dec-96  96.9                95.92        97.82
Jan-97  97.4                96.22        97.27
Feb-97  99.7                96.57        97.35
Mar-97  100                 97.51        98.76
Apr-97  101.4               98.26        99.50
May-97  103.2               99.20        100.64
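The smoothing recursion ŷ_{t+1} = ŷ_t + α(y_t − ŷ_t) is a one-line update. A minimal sketch with the initial forecast set to the first observation (note: the table's columns were smoothed from data starting in early 1995, so this short run, started at Jul-95, will not reproduce those exact numbers):

```python
def simple_exp_smoothing(y, alpha):
    """Return one-step-ahead forecasts; the forecast for period 1 equals y[0]."""
    forecasts = [y[0]]  # initial estimate = first observation
    for obs in y:
        prev = forecasts[-1]
        forecasts.append(prev + alpha * (obs - prev))
    return forecasts  # length len(y) + 1; last entry is the next-period forecast

series = [94.4, 96.2, 88.9, 90.2]  # Jul-95 .. Oct-95 sentiment values
print([round(f, 2) for f in simple_exp_smoothing(series, 0.3)])
# [94.4, 94.4, 94.94, 93.13, 92.25]
```

A larger α reacts faster to recent observations; compare the Alpha = 0.3 and Alpha = 0.6 columns in the table.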
[Figure: Consumer Sentiment Index plotted monthly, Jun-94 through Apr-01]
Measures of Forecast Error
• Forecast error = Et = Ft - Dt
• Mean squared error (MSE)
MSEn = (Sum(t=1 to n)[Et2])/n
• Absolute deviation = At = |Et|
• Mean absolute deviation (MAD)
MADn = (Sum(t=1 to n)[At])/n
• Standard deviation of forecast error: s ≈ 1.25 MAD
Measures of Forecast Error
• Mean absolute percentage error (MAPE)
MAPEn = (Sum(t=1 to n)[|Et / Dt| × 100])/n
• Bias
• Shows whether the forecast consistently under- or
overestimates demand; should fluctuate around 0
biasn = Sum(t=1 to n)[Et]
• Tracking signal
– Tracking Signal = Accumulated Forecast Errors / Mean Absolute Deviation
TSt = biast / MADt
– Should stay within the range of ±6
– Otherwise, possibly use a new forecasting method
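The error measures defined above can be gathered into one routine. A minimal sketch (function and variable names are illustrative; F = forecasts, D = actual demand):

```python
def forecast_errors(F, D):
    """Compute MSE, MAD, MAPE, bias, and tracking signal for forecasts F vs. demand D."""
    E = [f - d for f, d in zip(F, D)]  # forecast errors E_t = F_t - D_t
    n = len(E)
    mse = sum(e ** 2 for e in E) / n                       # mean squared error
    mad = sum(abs(e) for e in E) / n                       # mean absolute deviation
    mape = sum(abs(e / d) * 100 for e, d in zip(E, D)) / n # mean absolute % error
    bias = sum(E)                                          # accumulated forecast error
    ts = bias / mad if mad else 0.0                        # tracking signal
    return {"MSE": mse, "MAD": mad, "MAPE": mape, "bias": bias, "TS": ts}

m = forecast_errors([100, 110, 120], [90, 115, 120])
print(m["MAD"], m["bias"], m["TS"])  # 5.0 5 1.0
```

A tracking signal drifting outside ±6 suggests the forecasting method is systematically biased and should be revisited.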