
MDA511

Lecture 11:
Time Series Analysis and Forecasting

Recommended Text:
Albright and Winston, “Business Analytics”
6th Edition. 2017 Copyright © Cengage Learning

Compiled by Prof. Paul Kwan


Introduction
Forecasting is a very difficult task, both in the short run
and in the long run.
Analysts search for patterns or relationships in
historical data and then make forecasts.
• There are two problems with this approach:
• It is not always easy to uncover historical patterns or
relationships.
• It is often difficult to separate the noise, or random
behavior, from the underlying patterns.
• Some forecasts may attribute importance to patterns that
are in fact random variations and are unlikely to repeat
themselves.
• There are no guarantees that past patterns will
continue in the future.
Forecasting Methods: An Overview
There are many forecasting methods available, and
there is little agreement as to the best forecasting
method.
The methods can be divided into three groups:
• Judgmental methods
• Extrapolation (or time series) methods
• Econometric (or causal) methods
The first group is basically non-quantitative; the last
two are quantitative.
In this lecture, our focus is quantitative methods.
Extrapolation Models
Extrapolation models are quantitative models that use
past data of a time series variable to forecast future values
of the variable.
Many extrapolation models are available:
• Trend-based regression
• Autoregression
• Moving averages
• Exponential smoothing
All of these methods look for patterns in the historical
series and then extrapolate these patterns into the future.
Complex models are not always better than simpler models.
• Simpler models track only the most basic underlying
patterns and can be more flexible and accurate in
forecasting the future.
Econometric Models
Econometric models, also called causal or regression-
based models, use regression to forecast a time series
variable by using other explanatory time series variables.
Prediction from regression equation:

  Ŷ_t = a + b_1·X_1t + b_2·X_2t + … + b_k·X_kt

Causal regression models present mathematical
challenges, including:
• Determining the appropriate “lags” for the regression
equation
• Deciding whether to include lags of the dependent variable
as explanatory variables
• Autocorrelation (correlation of a variable with lagged
versions of itself) and cross-correlation (correlation of a
variable with a lagged version of another variable)
Combining Forecasts
This method combines two or more forecasts to obtain
the final forecast.
The reasoning is simple: The forecast errors from
different forecasting methods might cancel one another.
Forecasts that are combined can be of the same
general type, or of different types.
The number of forecasts to combine and the weights to
use in combining them have been the subject of
several research studies.
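As a rough illustration of why combining can help, the sketch below (plain Python; the series, the 0.6/0.4 weights, and the function names are hypothetical, not from the lecture) averages two forecasts and compares the resulting error:

```python
# Hypothetical illustration of combining forecasts; the data, weights,
# and function names are made up, not taken from the lecture.

def combine_forecasts(f1, f2, w1=0.5):
    """Weighted average of two forecast series (weights w1 and 1 - w1)."""
    return [w1 * a + (1 - w1) * b for a, b in zip(f1, f2)]

def mae(actual, forecast):
    """Mean absolute error, used here to compare the forecasts."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

actual     = [100, 110, 105, 120]
forecast_a = [ 98, 112, 108, 118]   # e.g., from one forecasting method
forecast_b = [104, 106, 101, 124]   # e.g., from a different method

combined = combine_forecasts(forecast_a, forecast_b, w1=0.6)
```

In this contrived example the combined forecast's MAE (0.35) is smaller than either individual forecast's (2.25 and 4.0), because the two methods' errors partly cancel.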

Components of Time Series Data
If observations increase or decrease regularly through time,
the time series has a trend.
• Linear trends occur when the observations increase by
the same amount from period to period.
• Exponential trends occur when the series increases (or
decreases) by the same percentage from period to period.
• S-shaped trends occur when it takes a while for
observations to start increasing, but then a rapid
increase occurs, before finally tapering off to a fairly
constant level.

Components of Time Series Data
If a time series has a seasonal component, it exhibits
seasonality—that is, the same seasonal pattern tends to
repeat itself every year.

Components of Time Series Data
A time series has a cyclic component when business
cycles affect the variables in similar ways.
• The cyclic component is more difficult to predict than the
seasonal component, because seasonal variation is much
more regular.
• The length of the business cycle varies, sometimes
substantially.
• The length of a seasonal cycle is generally one year, while
the length of a business cycle is generally longer than one
year and its actual length is difficult to predict.

Components of Time Series Data
Random variation (or noise) is the unpredictable
component that gives most time series graphs their
irregular, zigzag appearance.
• A time series can be determined only to a certain extent by
its trend, seasonal, and cyclic components; other factors
determine the rest.
• These other factors combine to create a certain amount of
unpredictability in almost all time series.

Measures of Accuracy
The forecast error is the difference between the actual value
and the forecast. It is denoted by E with appropriate subscripts.
Forecasting software packages typically report several summary
measures of forecast errors:
• MAE (Mean Absolute Error):

  MAE = (1/n) Σ |E_t|

• RMSE (Root Mean Square Error):

  RMSE = √[(1/n) Σ E_t²]

• MAPE (Mean Absolute Percentage Error):

  MAPE = (100%/n) Σ |E_t / Y_t|

Another measure of forecast errors is the average of the errors.
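The three summary measures can be sketched directly (a minimal Python illustration, with E_t = Y_t − F_t; the example values are hypothetical, not from the lecture data):

```python
import math

# Minimal sketch of the three error summaries, where E_t = Y_t - F_t.
# The example numbers below are hypothetical.

def mae(y, f):
    return sum(abs(a - b) for a, b in zip(y, f)) / len(y)

def rmse(y, f):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, f)) / len(y))

def mape(y, f):
    return 100 * sum(abs((a - b) / a) for a, b in zip(y, f)) / len(y)

y = [200, 180, 190, 210]   # actual values
f = [190, 185, 200, 205]   # forecasts
```

Note that MAPE requires all actual values Y_t to be nonzero.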


Measures of Accuracy
Some forecasting software packages choose the best
model from a given class by minimizing MAE, RMSE, or
MAPE.
• However, small values of these measures guarantee only
that the model tracks the historical observations well.
• There is still no guarantee that the model will forecast
future values accurately.
Unlike residuals from the regression equation, forecast
errors are not guaranteed to always average to zero.
• If the average of the forecast errors is negative, the
forecasts are biased: they tend to be too high.
• If the average is positive, the forecasts tend to be too low.

Testing for Randomness
All forecasting models have the general form shown in
the equation below:

  Y_t = Fitted Value_t + Residual_t

• The fitted value is the part calculated from past data
and any other available information.
• The residual is the forecast error.
• The fitted value should include all components of the
original series that can possibly be forecast, and the
leftover residuals should be unpredictable noise.
The simplest way to determine whether a time series of
residuals is random noise is to examine time series
graphs of residuals visually—although this is not
always reliable.
Testing for Randomness
Some common nonrandom patterns are shown below.

The Runs Test
The runs test is a quantitative method of testing for
randomness. It is a formal test of the null hypothesis of
randomness.
• First, choose a base value, which could be the
average value of the series, the median value, or
even some other value.
• Then a run is defined as a consecutive series of
observations that remain on one side of this base
level.
• If there are too many or too few runs in the series, the
null hypothesis of randomness can be rejected.
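The steps above can be sketched with the usual normal approximation (an illustration only, not StatTools' implementation; the function name is made up, and the sketch assumes observations fall on both sides of the base):

```python
import math

# Minimal sketch of the runs test via the normal approximation.
# Illustration only; assumes observations on both sides of the base.

def runs_test(series, base=None):
    if base is None:
        base = sum(series) / len(series)   # default base: mean of series
    signs = [x >= base for x in series if x != base]   # drop exact ties
    n1 = sum(signs)                # observations above the base
    n2 = len(signs) - n1           # observations below the base
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z
```

A z-value far from 0 (e.g., beyond ±1.96 at the 5% level) rejects the null hypothesis of randomness: too many runs give a large positive z, too few a large negative z.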

Example 1: Forecasting Monthly
Stereo Sales
Objective: To use the runs test to check whether the
residuals from this simple forecasting model represent
random noise.
Solution: Data file contains monthly sales for a chain of
stereo retailers from the beginning of 2013 to the end of
2016, during which there was no upward or downward
trend in sales and no clear seasonality.

Example 1: Forecasting Monthly
Stereo Sales
A simple forecast model of sales is to use the average
of the series, 182.67, as a forecast of sales for each
month.
The residuals for this forecasting model are found by
subtracting the average from each observation.
Use the runs test to see whether there are too many or
too few runs around the base of 0.
Select Runs Test for Randomness from the StatTools
Time Series and Forecasting dropdown, choose
Residual as the variable to analyze, and choose Mean
of Series as the cutoff value.
Example 1: Forecasting Monthly
Stereo Sales
The resulting output is shown below:

There is not convincing evidence of nonrandomness in
the residuals. It is reasonable to conclude that the
residuals represent noise.
End of Lecture

Autocorrelation
Another way to check for randomness of a time series of
residuals is to examine the autocorrelations of the
residuals – a type of correlation used to measure whether
values of a time series are related to their own past values.
• In positive autocorrelation, large observations tend to follow
large observations, and small observations tend to follow
small observations.
• The autocorrelation of lag k is essentially the correlation
between the original series and the lag k version of the
series.
• Lags are previous observations, removed by a certain number
of periods from the present time.
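The lag-k autocorrelation described above can be sketched as follows (plain Python, computed the usual way from deviations about the overall mean; the function name is hypothetical):

```python
# Sketch of the lag-k autocorrelation: the correlation between the series
# and its lag-k version, computed from deviations about the overall mean.

def autocorrelation(series, k):
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    num = sum(dev[t] * dev[t - k] for t in range(k, n))   # lagged products
    den = sum(d * d for d in dev)                         # total variation
    return num / den
```

For a steadily rising series such as [1, 2, 3, 4, 5], the lag-1 autocorrelation is positive, as expected for a trending series.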

Example 2: Forecasting Monthly
Stereo Sales
Objective: To examine the autocorrelations of the residuals
from the forecasting model for evidence of non-randomness.
Solution: Use StatTools’s Autocorrelation procedure, found
on the StatTools Time Series and Forecasting dropdown list.
• Specify the times series variable (Residual), the number of
lags you want, and whether you want a chart of the
autocorrelations, called a correlogram.
• It is common practice to ask for no more lags than 25% of the
number of observations.
• Any autocorrelation that is larger than two standard errors in
magnitude is worth your attention.
• One measure of the lag 1 autocorrelation is provided by the
Durbin-Watson (DW) statistic.
• A DW value of 2 indicates no lag 1 autocorrelation.
• A DW value less than 2 indicates positive autocorrelation.
• A DW value greater than 2 indicates negative autocorrelation.
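The DW statistic itself is simple to compute on a residual series (a sketch; the function name is made up):

```python
# Sketch of the Durbin-Watson statistic on a series of residuals:
# the sum of squared successive differences divided by the sum of
# squared residuals. Values near 2 suggest no lag-1 autocorrelation.

def durbin_watson(residuals):
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den
```

For an alternating series such as [1, -1, 1, -1] the statistic is 3 (negative lag-1 autocorrelation); for a clumped series such as [1, 1, -1, -1] it is 1 (positive lag-1 autocorrelation).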
Example 2: Forecasting Monthly
Stereo Sales
The autocorrelations and correlogram of the residuals are
shown below.

Regression-Based Trend Models
Many time series follow a long-term trend except for
random variation.
• This trend can be upward or downward.
• A straightforward way to model this trend is to
estimate a regression equation for Yt, using time t as
the single explanatory variable.
• The two most frequently used trend models are the
linear trend and the exponential trend.

Linear Trend
A linear trend means that the time series variable changes
by a constant amount each time period.
The equation for the linear trend model is:

  Y_t = a + b·t + e_t

• The interpretation of b is that it represents the expected
change in the series from one period to the next.
• If b is positive, the trend is upward.
• If b is negative, the trend is downward.
• The intercept term a is less important: It literally represents
the expected value of the series at time t = 0.
A graph of the time series indicates whether a linear trend is
likely to provide a good fit.
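Fitting such a trend is ordinary least-squares regression of Y_t on t; a self-contained sketch using the closed-form formulas (an illustration with hypothetical data; the lecture itself uses Excel/StatTools regression):

```python
# Sketch of estimating the linear trend Y_t = a + b*t by least squares,
# using the closed-form formulas. Data below are hypothetical.

def fit_linear_trend(y):
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    b = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
         / sum((ti - t_bar) ** 2 for ti in t))
    a = y_bar - b * t_bar
    return a, b

# A perfectly linear hypothetical series Y_t = 10 + 2t:
a, b = fit_linear_trend([12, 14, 16, 18, 20])
```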
Example 3: Monthly U.S. Population
Objective: To fit a linear trend line to monthly population
and examine its residuals for randomness.
Solution: Data file contains monthly population data for the
United States from January 1952 to December 2014.
During this period, the population has increased steadily
from about 156 million to about 320 million.
To estimate the trend with regression, use a numeric time
variable representing consecutive months 1 through 756.

Example 3: Monthly U.S. Population
Run a simple regression of Population versus Time.

Exponential Trend
An exponential trend is appropriate when the time series
changes by a constant percentage (as opposed to a constant
dollar amount) each period.
The appropriate regression equation contains a multiplicative
error term u_t:

  Y_t = c·e^(bt)·u_t

This equation is not useful for estimation; for that, a linear
equation is required.
• You can achieve linearity by taking natural logarithms of both
sides of the equation, as shown below, where a = ln(c) and
e_t = ln(u_t):

  ln(Y_t) = a + b·t + e_t

• The coefficient b (expressed as a percentage) is approximately
the percentage change per period. For example, if b = 0.05, then
the series is increasing by approximately 5% per period.
If a time series exhibits an exponential trend, then a plot of its
logarithm should be approximately linear.
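The same least-squares idea, applied to ln(Y_t), estimates an exponential trend (a sketch with a hypothetical exactly exponential series; the lecture itself uses Excel's Trendline tool or StatTools):

```python
import math

# Sketch: estimate the exponential trend by regressing ln(Y_t) on t,
# then converting back with c = exp(a). Data below are hypothetical.

def fit_exponential_trend(y):
    n = len(y)
    t = list(range(1, n + 1))
    log_y = [math.log(v) for v in y]
    t_bar = sum(t) / n
    l_bar = sum(log_y) / n
    b = (sum((ti - t_bar) * (li - l_bar) for ti, li in zip(t, log_y))
         / sum((ti - t_bar) ** 2 for ti in t))
    a = l_bar - b * t_bar
    return math.exp(a), b   # Y_t ≈ c * e^(b t) with c = exp(a)

# A hypothetical series with exact exponential growth, b = 0.05
# (roughly 5% growth per period):
c, b = fit_exponential_trend([100 * math.exp(0.05 * t) for t in range(1, 9)])
```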
Example 4: Quarterly PC Device Sales
• Objective: To estimate the company’s exponential
growth and to see whether it has been maintained
during the entire period from 2002 until the end of
2016.
• Solution: Data file contains quarterly sales data for a
large PC device manufacturer from the first quarter of
2002 through the fourth quarter of 2016.
• First, estimate and interpret an exponential trend for
the years 2002 through 2011.
• In Excel®, use the Trendline
tool, with the Exponential
option, to superimpose
an exponential trend line
and the corresponding
equation.
Example 4: Quarterly PC Device Sales
You can also use regression to estimate this exponential trend
by adding a time variable in column C and a log transformation
of sales in column D.
Regress Log(Sales) on Time using the data through 2011 only.
The estimated equation is: Forecast Sales = 62.188·e^(0.0654t)

The Random Walk Model
The random walk model is an example of using random
series as building blocks for other time series models.
• In this model, the series itself is not random.
• However, its differences—that is, changes from one period
to the next—are random.
• This type of behavior is typical of stock price data and
various other time series data.
• The equation for the random walk model is shown below,
where m (the mean difference) is a constant, and e_t is a
random series (noise) with mean 0 and a standard
deviation that remains constant through time:

  Y_t = Y_{t-1} + m + e_t
The Random Walk Model
A series that behaves according to this random walk
model has random differences, and the series tends to
trend upward (if m > 0), or downward (if m < 0) by an
amount m each period.
If you are standing in period t and want to forecast Y_{t+1},
then a reasonable forecast is given by the equation below,
where m is estimated by the average of the historical
differences:

  Forecast of Y_{t+1} = Y_t + m
The Random Walk Model
To implement the random walk model on a time series
of stock prices (shown left), create a time series of
differences (shown right).
This model is called the naïve forecasting model when
the mean difference is practically zero.
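A one-step-ahead random walk forecast can be sketched as below (hypothetical prices: estimate m as the average of the historical differences, then add it to the last observation):

```python
# Sketch of a one-step-ahead random walk forecast: estimate the mean
# difference m from the historical series, then forecast Y_{t+1} = Y_t + m.
# The price series below is hypothetical.

def random_walk_forecast(series):
    diffs = [b - a for a, b in zip(series, series[1:])]
    m = sum(diffs) / len(diffs)     # estimated mean difference
    return series[-1] + m           # forecast for the next period

prices = [100, 102, 101, 105, 104, 108]
forecast = random_walk_forecast(prices)
```

When m is practically zero, this reduces to the naïve forecast: next period's forecast is simply the current value.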

Moving Averages Forecasts
One of the simplest and most frequently used
extrapolation models is the moving averages model.
• A moving average is the average of the observations
in the past few periods, where the number of terms in
the average is the span.
• If the span is large, extreme values have relatively
little effect on the forecasts, and the resulting series of
forecasts will be much smoother than the original
series.
• For this reason, this method is called a smoothing
method.

Moving Averages Forecasts
If the span is small, extreme observations have a larger
effect on the forecasts, and the forecast series will be
much less smooth.
Choosing the span requires some judgment:
• If you believe the ups and downs in the series are
random noise, use a relatively large span.
• If you believe each up and down is predictable,
use a smaller span.
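The moving-averages forecast can be sketched as below (plain Python with hypothetical data; the forecast for each period is the average of the previous span observations):

```python
# Sketch of moving-averages forecasting: the forecast for each period
# is the average of the previous `span` observations. Hypothetical data.

def moving_average_forecasts(series, span):
    return [sum(series[t - span:t]) / span
            for t in range(span, len(series) + 1)]

# With span 3, the first forecast is for period 4, and the final entry
# is the forecast for the period after the series ends:
forecasts = moving_average_forecasts([10, 12, 11, 13, 15, 14], span=3)
```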

Example 5: Houses Sold in the United
States
Objective: To see whether a moving averages model
with an appropriate span fits the housing sales data
and to see how StatTools implements this method.
Solution: Data file contains monthly data on the
number of new one-family houses sold in the U.S. from
January 1991 through June 2015.
Select Forecast from the StatTools Time Series and
Forecasting dropdown list.
Then select the time period on the Time Scale tab, and
Moving Average on the Forecast Settings tab.

Example 5: Houses Sold in the United
States
The output consists of several parts, with the summary
measures MAE, RMSE, and MAPE of the forecast
errors included.

Example 5: Houses Sold in the United
States
The graphs below show the behavior of the forecasts.
The left graph has span 3; the right has span 12.
