Lecture 11: Time Series Analysis and Forecasting
Recommended Text:
Albright and Winston, Business Analytics, 6th Edition, Cengage Learning, 2017.
Components of Time Series Data
If observations increase or decrease regularly through time,
the time series has a trend.
• Linear trends occur when the observations increase by
the same amount from period to period.
• Exponential trends occur when the observations increase by
the same percentage from period to period, so the absolute
increases grow larger over time.
• S-shaped trends occur when it takes a while for
observations to start increasing, but then a rapid
increase occurs, before finally tapering off to a fairly
constant level.
Components of Time Series Data
If a time series has a seasonal component, it exhibits
seasonality—that is, the same seasonal pattern tends to
repeat itself every year.
Components of Time Series Data
A time series has a cyclic component when business
cycles affect the variables in similar ways.
• The cyclic component is more difficult to predict than the
seasonal component, because seasonal variation is much
more regular.
• The length of the business cycle varies, sometimes
substantially.
• The length of a seasonal cycle is generally one year, while
the length of a business cycle is generally longer than one
year and its actual length is difficult to predict.
Components of Time Series Data
Random variation (or noise) is the unpredictable
component that gives most time series graphs their
irregular, zigzag appearance.
• A time series can be determined only to a certain extent by
its trend, seasonal, and cyclic components; other factors
determine the rest.
• These other factors combine to create a certain amount of
unpredictability in almost all time series.
Measures of Accuracy
The forecast error is the difference between the actual value
and the forecast. It is denoted by E with appropriate subscripts.
Forecasting software packages typically report several summary
measures of forecast errors:
• MAE (Mean Absolute Error):
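In standard notation, with E_t the forecast error in period t, Y_t the actual value, and n the number of forecast periods, MAE and the two related measures that appear later in the StatTools output (RMSE and MAPE) are:

\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} |E_t|

\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} E_t^2}

\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n} \left| \frac{E_t}{Y_t} \right|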
Testing for Randomness
All forecasting models have the general form shown in
the equation below:
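In this general form, each observation splits into the part the model explains (the fitted value) and the part it does not (the residual):

Y_t = \text{Fitted value}_t + \text{Residual}_t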
The Runs Test
The runs test is a quantitative method of testing for
randomness. It is a formal test of the null hypothesis of
randomness.
• First, choose a base value, which could be the
average value of the series, the median value, or
even some other value.
• Then a run is defined as a consecutive series of
observations that remain on one side of this base
level.
• If there are too many or too few runs in the series, the
null hypothesis of randomness can be rejected.
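A minimal Python sketch of such a test is shown below. It uses the mean of the series as the default base value and the usual normal approximation for the number of runs; the function name and interface are illustrative, not those of StatTools.

import math

def runs_test(series, base=None):
    # Test the null hypothesis that the series is random noise around `base`.
    if base is None:
        base = sum(series) / len(series)       # default cutoff: mean of the series
    signs = [1 if x >= base else 0 for x in series]
    n1 = sum(signs)                            # observations on or above the base
    n2 = len(signs) - n1                       # observations below the base
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    # Normal approximation: expected number of runs and its variance under randomness.
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    z = (runs - mu) / math.sqrt(var)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return runs, z, p_value

Too many or too few runs produce a z value far from 0 and a small p-value, leading to rejection of the null hypothesis of randomness.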
Example 1: Forecasting Monthly Stereo Sales
Objective: To use the runs test to check whether the
residuals from this simple forecasting model represent
random noise.
Solution: The data file contains monthly sales for a chain of
stereo retailers from the beginning of 2013 to the end of
2016, during which there was no upward or downward
trend in sales and no clear seasonality.
Example 1: Forecasting Monthly Stereo Sales
A simple forecast model of sales is to use the average
of the series, 182.67, as a forecast of sales for each
month.
The residuals for this forecasting model are found by
subtracting the average from each observation.
Use the runs test to see whether there are too many or
too few runs around the base of 0.
Select Runs Test for Randomness from the StatTools
Time Series and Forecasting dropdown, choose
Residual as the variable to analyze, and choose Mean
of Series as the cutoff value.
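For comparison, the same calculation can be sketched with the runs_test function from the previous section; the sales list below is a hypothetical stand-in for the 48 monthly observations in the data file.

sales = [185, 190, 176, 201, 168, 179, 188, 172, 195, 181]   # placeholder values only
mean_sales = sum(sales) / len(sales)             # plays the role of the 182.67 average
residuals = [y - mean_sales for y in sales]      # residuals around the mean forecast
print(runs_test(residuals))                      # (number of runs, z statistic, p-value)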
Example 1: Forecasting Monthly Stereo Sales
The resulting output is shown below:
Autocorrelation
Another way to check for randomness of a time series of
residuals is to examine the autocorrelations of the
residuals – a type of correlation used to measure whether
values of a time series are related to their own past values.
• In positive autocorrelation, large observations tend to follow
large observations, and small observations tend to follow
small observations.
• The autocorrelation of lag k is essentially the correlation
between the original series and the lag k version of the
series.
• Lags are previous observations, removed by a certain number
of periods from the present time.
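In the usual notation, the lag k autocorrelation of a series Y_1, ..., Y_n with mean \bar{Y} is typically computed as:

r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}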
Example 2: Forecasting Monthly Stereo Sales
Objective: To examine the autocorrelations of the residuals
from the forecasting model for evidence of non-randomness.
Solution: Use the StatTools Autocorrelation procedure, found
on the StatTools Time Series and Forecasting dropdown list.
• Specify the time series variable (Residual), the number of
lags you want, and whether you want a chart of the
autocorrelations, called a correlogram.
• It is common practice to ask for no more lags than 25% of the
number of observations.
• Any autocorrelation that is larger than two standard errors in
magnitude is worth your attention.
• One measure of the lag 1 autocorrelation is provided by the
Durbin-Watson (DW) statistic.
• A DW value of 2 indicates no lag 1 autocorrelation.
• A DW value less than 2 indicates positive autocorrelation.
• A DW value greater than 2 indicates negative autocorrelation.
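A rough Python sketch of these two quantities (not the StatTools procedure itself) is shown below, assuming the residuals are available as a plain list.

def autocorrelation(series, k):
    # Lag k sample autocorrelation: the series correlated with itself shifted k periods back.
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
    den = sum((x - mean) ** 2 for x in series)
    return num / den

def durbin_watson(residuals):
    # Near 2: no lag 1 autocorrelation; below 2: positive; above 2: negative.
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

As a rule of thumb, the standard error of each autocorrelation is roughly 1/\sqrt{n}, so the "two standard errors" cutoff mentioned above is about 2/\sqrt{n}.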
Example 2: Forecasting Monthly Stereo Sales
The autocorrelations and correlogram of the residuals are
shown below.
Regression-Based Trend Models
Many time series follow a long-term trend except for
random variation.
• This trend can be upward or downward.
• A straightforward way to model this trend is to
estimate a regression equation for Y_t, using time t as
the single explanatory variable.
• The two most frequently used trend models are the
linear trend and the exponential trend.
Linear Trend
A linear trend means that the time series variable changes
by a constant amount each time period.
The equation for the linear trend model is:
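In the notation of this lecture, with e_t a random error term, the model is:

Y_t = a + b t + e_t

where the intercept a and slope b are estimated by regression, and b measures the constant change in Y per time period.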
Example 3: Monthly U.S. Population
Run a simple regression of Population versus Time.
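Outside of StatTools, the same regression can be sketched in a few lines of Python; the population values below are placeholders, not the actual data file.

import numpy as np

y = np.array([320.0, 320.2, 320.4, 320.7, 320.9, 321.1, 321.3, 321.6])  # hypothetical, in millions
t = np.arange(1, len(y) + 1)          # time index 1, 2, ..., n

b, a = np.polyfit(t, y, 1)            # slope b and intercept a of Y_t = a + b*t
fitted = a + b * t                    # fitted linear trend
residuals = y - fitted                # variation around the trend
print(f"intercept a = {a:.3f}, slope b = {b:.4f} per month")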
Exponential Trend
An exponential trend is appropriate when the time series
changes by a constant percentage (as opposed to a constant
dollar amount) each period.
The appropriate regression equation contains a multiplicative
error term u_t:
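In the usual form of this model, with constants c and b, the equation is:

Y_t = c e^{b t} u_t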
This equation is not useful for estimation; for that, a linear
equation is required.
• You can achieve linearity by taking natural logarithms of both
sides of the equation, as shown below, where a = ln(c) and
e_t = ln(u_t).
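Taking logarithms of both sides gives the linear estimating equation:

\ln(Y_t) = a + b t + e_t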
The Random Walk Model
The random walk model is an example of using random
series as building blocks for other time series models.
• In this model, the series itself is not random.
• However, its differences—that is, changes from one period
to the next—are random.
• This type of behavior is typical of stock price data and
various other time series data.
• The equation for the random walk model is shown below,
where m (the mean difference) is a constant and e_t is a
random series (noise) with mean 0 and a standard
deviation that remains constant through time.
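Written out, the model says that each value equals the previous value plus the constant m plus noise:

Y_t = Y_{t-1} + m + e_t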
The Random Walk Model
A series that behaves according to this random walk
model has random differences, and the series tends to
trend upward (if m > 0), or downward (if m < 0) by an
amount m each period.
If you are standing in period t and want to forecast Y_{t+1},
then a reasonable forecast is given by the equation
below:
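Using the notation above, with m̂ the estimated mean difference (the average of the observed differences Y_t − Y_{t−1}), the forecast is:

Forecast of Y_{t+1} = Y_t + m̂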
The Random Walk Model
To implement the random walk model on a time series
of stock prices (shown left), create a time series of
differences (shown right).
This model is called the naïve forecasting model when
the mean difference is practically zero.
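A small sketch of these steps, assuming prices is a plain list of closing prices (the values shown are illustrative):

prices = [100.0, 101.2, 100.8, 102.5, 103.1, 104.0]      # placeholder closing prices
diffs = [b - a for a, b in zip(prices, prices[1:])]       # changes from one period to the next
m_hat = sum(diffs) / len(diffs)                           # estimated mean difference
forecast_next = prices[-1] + m_hat                        # random walk forecast of the next value
print(f"mean difference = {m_hat:.3f}, next forecast = {forecast_next:.2f}")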
Moving Averages Forecasts
One of the simplest and the most frequently used
extrapolation models is the moving averages model.
• A moving average is the average of the observations
in the past few periods, where the number of terms in
the average is the span.
• If the span is large, extreme values have relatively
little effect on the forecasts, and the resulting series of
forecasts will be much smoother than the original
series.
• For this reason, this method is called a smoothing
method.
Moving Averages Forecasts
If the span is small, extreme observations have a larger
effect on the forecasts, and the forecast series will be
much less smooth.
Choosing the span requires some judgment:
• If you believe the ups and downs in the series are
random noise, use a relatively large span.
• If you believe each up and down is predictable,
use a smaller span.
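A minimal Python sketch of the moving averages forecast and its mean absolute error follows; the function names are illustrative, and StatTools performs the equivalent calculations internally.

def moving_average_forecasts(series, span):
    # The forecast for period t is the average of the `span` observations just before t.
    return [sum(series[t - span:t]) / span for t in range(span, len(series))]

def mean_absolute_error(series, span):
    forecasts = moving_average_forecasts(series, span)
    actuals = series[span:]                    # the periods for which a forecast exists
    return sum(abs(y - f) for y, f in zip(actuals, forecasts)) / len(forecasts)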
Example 5: Houses Sold in the United States
Objective: To see whether a moving averages model
with an appropriate span fits the housing sales data
and to see how StatTools implements this method.
Solution: The data file contains monthly data on the
number of new one-family houses sold in the U.S. from
January 1991 through June 2015.
Select Forecast from the StatTools Time Series and
Forecasting dropdown list.
Then select the time period on the Time Scale tab, and
Moving Average on the Forecast Settings tab.
Example 5: Houses Sold in the United States
The output consists of several parts, with the summary
measures MAE, RMSE, and MAPE of the forecast
errors included.
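For illustration, the MAE comparison for two spans can be reproduced with the sketch above; houses_sold is a hypothetical stand-in for the monthly series in the data file.

houses_sold = [401, 482, 507, 508, 517, 516, 511, 526, 487, 529, 512, 555,
               541, 562, 591, 598, 587, 575, 558, 597, 604, 581, 620, 589]   # placeholder values
for span in (3, 12):
    print(f"span {span}: MAE = {mean_absolute_error(houses_sold, span):.1f}")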
Example 5: Houses Sold in the United States
The graphs below show the behavior of the forecasts.
The left graph has span 3; the right has span 12.