ARIMA (Box-Jenkins) models provide a technique for forecasting a time series by analyzing the past pattern of the series without relying on explanatory variables. The methodology involves identifying the "black box" that best transforms white noise inputs into the observed time series outputs. There are three main types of black box models: autoregressive (AR) models, moving average (MA) models, and autoregressive-moving average (ARMA) models. The goal is to select the model that produces white noise residuals, indicating a good fit to the data generating process.


Chapter 7
Explanatory Models 3: ARIMA (Box-Jenkins) Forecasting Models
Box-Jenkins or ARIMA
•Box-Jenkins = ARIMA

•AutoRegressive Integrated Moving Average


Review

 Simple, Holt's, and Winters' exponential smoothing
 Regression trend
 Multiple regression
 Time series decomposition
 ARIMA

We have already examined some time-series data by using regression analysis to relate sequences of data to explanatory variables. Sales (as the dependent variable), for instance, might be forecast by using the explanatory (or independent) variables of product price, personal income of potential purchasers, and advertising expenditures by the firm. Such a model is a structural or causal forecasting model that requires the forecaster to know in advance at least some of the determinants of sales. But in many real-world situations, we do not know the determinants of the variable to be forecast, or data on these causal variables are not readily available. It is in just these situations that the ARIMA technique has a decided advantage over standard regression models.
ARIMA (Box-Jenkins)

• We will use an intuitive approach.

•A time series of data is a sequence of numerical observations naturally ordered in time. The
order of the data is an important part of the data. Some examples would be:

• Hourly temperatures at the entrance to Grand Central Station


• Daily closing price of IBM stock
• Weekly automobile production by the Chevrolet Division of General Motors
• Data from an individual firm: sales, profits, inventory, back orders
• An electrocardiogram

• When a forecaster examines time-series data, two questions are of paramount importance:
  • Do the data exhibit a discernible pattern?
  • Can this pattern be exploited to make meaningful forecasts?
Box-Jenkins methodology

The Box-Jenkins methodology of using ARIMA models is a technically sophisticated way of forecasting a variable by looking only at the past pattern of the time series. Box-Jenkins thus ignores information that might be contained in a structural regression model; instead, it uses the most recent observation as a starting value and proceeds to analyze recent forecasting errors to select the most appropriate adjustment for future time periods.

Since the adjustment usually compensates for only part of the forecast error, the Box-Jenkins process is best suited to longer-range rather than shorter-range forecasting (although it is used for short-, medium-, and long-range forecasts in actual practice).

The Box-Jenkins methodology

• The Box-Jenkins methodology of using ARIMA models has some advantages over other time-series methods, such as exponential smoothing, time-series decomposition, and simple trend analysis. Box-Jenkins methodology determines a great deal of information from the time series (more so than any other time-series method), and it does so while using a minimum number of parameters.

• The Box-Jenkins method allows for greater flexibility in the choice of the "correct" model (this, we will see, is called "identification" in Box-Jenkins terminology). Instead of a priori choosing a simple time trend or a specific exponential smoothing method, for example, as the correct model, Box-Jenkins methodology includes a process that allows us to examine a large variety of models in our search for the correct one. This "open-ended" characteristic alone accounts for its appeal to many forecasters.
ARIMA (Box-Jenkins)
A time series is an array of data recorded at equidistant time intervals.

Example:
 consider a cylinder as it rolls across a flat surface
 locate a point, X, on the surface of the cylinder and observe how discrete observations trace a path through time:
   average value
   turning points in the time series
   conditional forecasts

ARIMA is able to match almost any pattern, no matter how it is generated.


ARIMA (Box-Jenkins): Who Uses It?

ARIMA is able to match almost any pattern, no matter how it is generated.

• Crime rates by day of month (Criminology)
• Pulsation of light from stars (Physics)
• Gas concentration in arctic ice cores (Geology)
• Occurrence of diseases by day of year (Epidemiology)
• Daily seasonal variation in telephone use (Business)
• Daily fluctuations in stock market (Business)
• Growth ring size in trees (Dendrochronology)
ARIMA compared to Regression
 Regression Analysis:
  example: forecasting model of firm sales
   sequence of observations through time
   sequence of observations on explanatory variables
    product price, personal income, advertising expenditure, etc.
 Box-Jenkins Analysis:
  example: forecasting model of firm sales
   sequence of observations through time
   forecast variable of interest by looking at the past pattern of the series
   ignores information that might be contained in the regression model
Philosophy of
ARIMA
• Consider a time series generated by a "black box"
• black box --> observed time series
• objective: find the black box that most closely fits the data

• What are the inputs to the black box?


• In ARIMA analysis the inputs are "white noise"

• white noise --> black box --> observed series


THE PHILOSOPHY OF BOX-JENKINS
• Pretend for a moment that a certain time series is generated by a “black box”:

Black Box  Observed Time Series


• In standard regression analysis we attempt to find the causal variables that explain the observed time series;
what we take as a given is that the black box process is actually approximated by a linear regression
technique:

Explanatory variables  Black box (approximated by linear regression)  Observed Time Series

• In the Box-Jenkins methodology, on the other hand, we do not start with any explanatory variables, but rather
with the observed time series itself; what we attempt to discern is the “correct” black box that could have
produced such a series from white noise:

White Noise  Black Box  Observed Time Series


THE PHILOSOPHY OF
BOX-JENKINS
• Since we are to use no explanatory variables in the ARIMA
process, we assume instead that the series we are observing
started as white noise and was transformed by the black box
process into the series we are trying to forecast.
• White noise is a purely random series of numbers. The
numbers are normally and independently distributed.
• White noise, then, has two characteristics:
1. There is no relationship between consecutively observed values.
2. Previous values do not help in predicting future values.
The Box-Jenkins methodology (flowchart figure).
White Noise

 white noise: purely random data
 no relation between consecutively observed values (zero "serial correlation")
 previous values do not help in predicting future values

example: the toss of a fair coin
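
A minimal sketch in R (the language used later in this chapter) of what white noise looks like in practice; the seed and series length are illustrative assumptions:

  # Simulate white noise and confirm its two characteristics:
  # consecutive values are unrelated, and past values do not help predict future ones.
  set.seed(123)
  e <- rnorm(100)            # 100 normally and independently distributed values

  acf(e, lag.max = 12)       # sample autocorrelations should all be close to zero
  cor(e[-length(e)], e[-1])  # correlation between consecutive values is near zero
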


Black Box

When choosing the correct black box, there are really only three basic types of models for us to examine; there are, however, many variations within each of these three types.

The three types of models are:
(1) moving-average (MA) models,
(2) autoregressive (AR) models, and
(3) mixed autoregressive–moving-average models (called ARMA models).
When To Use
ARIMA

• 1) When the only feasible and practical approach

• 2) Limitation of explanatory variables

• 3) Benchmark for other forecasting models

• 4) Presence of large residual errors in regression


How is the Black Box Estimated?

 start with the observed time series
 pass the black box through the observed series
 examine the time series that remains after it has been transformed by the black box
 if the black box is correctly specified, what should remain is white noise
 diagnostics: if the remaining series is not white noise, try another black box

• ARIMA Analysis:
• observed series --> black box --> purely random white noise

How is the Black Box Estimated?

• Even though the direction of causality is from white noise TO the observed series, we operate as if the process is reversed!

• The objective of ARIMA analysis is to design the correct "black box"
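
A rough sketch in R of this estimate-then-diagnose loop, assuming a time series y and a candidate order; arima() and Box.test() are base R functions, and the AR(1) choice is only illustrative:

  # Fit a candidate "black box", then examine what remains.
  # If the residuals behave like white noise, the specification is acceptable;
  # otherwise, try another black box.
  fit <- arima(y, order = c(1, 0, 0))   # candidate model: AR(1)

  res <- residuals(fit)
  acf(res, lag.max = 12)                                   # residual autocorrelations should be negligible
  Box.test(res, lag = 12, type = "Ljung-Box", fitdf = 1)   # large p-value suggests white noise
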
How To Determine The Black Box

• There are three types of models considered in ARIMA analysis:
• 1) Autoregressive Models
• 2) Moving Average Models
• 3) Autoregressive - Moving Average Models
Moving-Average (MA)
• A moving-average (MA) model is simply one that predicts Yt as a function of the past forecast errors in predicting Yt.
• Consider et to be a white noise series; a moving-average model would then take the following form:

  Yt = et + W1*et-1 + W2*et-2 + ... + Wq*et-q

where:
  et: the value at time t of the white noise series
  Yt: the generated moving-average time series
  W1, W2, ..., Wq: the coefficients (or "weights")
  et-1, et-2, ..., et-q: previous values of the white noise series

Moving Average Models

The name "moving average model" is not very descriptive of this type of model.
We would do better to call it a weighted average model.
Consider Table 7-2 (next slide).
The first column is white noise.
The second column, labeled "MA1", was constructed as:

  Yt = et + 0.7*et-1

The "e" above refers to past errors in forecasting.


Moving Average Example

  Yt = et + W1*et-1

where:
  Yt = the series generated as MA1
  et = white noise series
  W1 = a constant (equal here to 0.7)
  et-1 = white noise lagged one period
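
A small sketch in R of how an MA1 series of this kind could be generated; the seed and length are assumptions, not the actual Table 7-2 values:

  set.seed(42)
  n <- 100
  e <- rnorm(n)                  # white noise series (the first column)
  y <- e + 0.7 * c(0, e[-n])     # MA1 column: Yt = et + 0.7 * et-1

  # The same kind of series can be produced with the built-in simulator:
  y2 <- arima.sim(model = list(ma = 0.7), n = n)
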
The process of identifying the correct model is a bit like identifying an animal from the tracks it makes in the soil.

[Images: animal tracks of a moose, mountain lion, and skunk.]

But What Are The “Tracks in the Soil?”
Correlation!

Table 7.3 A Simple Example of Autocorrelation


Partial Autocorrelation

The partial autocorrelation coefficient is the second tool we will use to help identify the relationship between the current values and past values of the original time series.

Partial autocorrelations measure the degree of association between the variable and that same variable in another time period after partialing out (i.e., controlling for) the effects of the other lags.
Correlogram
It is most common to view both the autocorrelation coefficients and the partial autocorrelation coefficients in
graphic form by constructing a correlogram (also called an autocorrelation plot) of the autocorrelation coefficients
and a partial correlogram for the partial autocorrelation coefficients.

Figure 7.1 Examples of theoretical autocorrelation and partial autocorrelation plots for MA(1) and MA(2) models.
Correlogram for an MA(1) Model as shown in ForecastX (Figure 7.2)
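
In R, a correlogram and partial correlogram can be drawn with the base functions acf() and pacf(); the series y below is an assumed placeholder:

  acf(y, lag.max = 24)    # correlogram: autocorrelation coefficients by lag
  pacf(y, lag.max = 24)   # partial correlogram: partial autocorrelation coefficients by lag
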
Autoregressive Model

Similar to the moving average model, except that the dependent variable Yt depends on its own previous values rather than the white noise series or residuals:

  Yt = A1*Yt-1 + A2*Yt-2 + ... + Ap*Yt-p + et

Where:
  Yt = time series generated
  A1, A2, ..., Ap = coefficients
  Yt-1, Yt-2, ..., Yt-p = lagged values of the time series
  et = white noise
Autoregressive Model Example

Yt = time series generated


A1, A2,…, Ap = coefficients
Yt-1, Yt-2,…, Yt-p = lagged values of the time series
et = white noise
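
A brief sketch in R of an AR(1) series; the coefficient 0.7 is an assumed illustrative value:

  set.seed(7)
  y_ar1 <- arima.sim(model = list(ar = 0.7), n = 100)   # Yt = 0.7 * Yt-1 + et

  acf(y_ar1)    # autocorrelations decline gradually toward zero
  pacf(y_ar1)   # partial autocorrelations cut off abruptly after lag 1
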
Correlogram

Figure 7.3 Examples of theoretical autocorrelation and partial autocorrelation plots of AR(1) and AR(2) models.
Correlogram for an AR(1) Model as shown in ForecastX (Figure 7.4)
Mixed Autoregressive and Moving-Average Models

Figure 7.5 Examples of theoretical autocorrelation and partial autocorrelation plots of ARMA(1, 1) models.

Many real-world processes, once they have been adjusted for seasonality, can be adequately modeled with the low-
order models.
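
A short sketch in R of a low-order mixed model, ARMA(1, 1), with assumed coefficients:

  set.seed(11)
  y_arma <- arima.sim(model = list(ar = 0.6, ma = 0.4), n = 200)

  acf(y_arma)    # neither plot cuts off abruptly;
  pacf(y_arma)   # both decline toward zero, the ARMA signature
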
Stationarity
 ARIMA models incorporate elements from both the autoregressive and moving average models
 All data in ARIMA analysis are assumed to be "stationary"
 A stationary time series is one in which two consecutive values in the series depend only on the time interval between them and not on time itself
 If the data are not stationary, they should be adjusted to correct for the nonstationarity
 Differencing is usually used to make this correction
 The resulting model is said to be an "integrated" (differenced) model
 This is the source of the "I" in an ARIMA model
Nonstationary Series

Figure 7.6  Autocorrelation and partial autocorrelation plots for the ARIMA(1, 1, 1) series in Table 7.2.

Same Series, After First Differencing

Figure 7.7  Autocorrelation and partial autocorrelation plots for the ARIMA(1, 1, 1) series in Table 7.2 after first differences have been taken.
Figure 7.8  The Box-Jenkins methodology.
The Diagnostic Step (Visually first)
The Diagnostic Step a Second Way
The Ljung-Box statistic
Second Diagnostic Method (Use the Ljung-Box statistic)

The second test for correctness of the model (but again, not a definitive test) is the Ljung-Box Q statistic (a refinement of the Box-Pierce statistic).

The statistic is used to perform a chi-square test on the autocorrelations of the residuals (or error terms).
Second Diagnostic Method (Use the Ljung-Box statistic)

The Ljung-Box statistic is:

  Q = n(n + 2) * Σ (k = 1 to m) [ rk^2 / (n - k) ]

with m - p - q degrees of freedom.

Where:
  n = the number of observations in the time series
  k = the particular time lag to be checked
  m = the number of time lags to be tested
  rk = sample autocorrelation function of the kth residual term
Second Diagnostic Method (Use the Ljung-Box statistic)

The Ljung-Box, or Q, statistic tests whether the residual autocorrelations as a set are
significantly different from zero. If the residual autocorrelations ARE significantly
different from zero, the model specification should be reformulated (i.e., the model has
failed the test).

If the calculated Ljung-Box statistic is less than the table value, the autocorrelations are
not significantly different from zero (that’s good!).

Note:

ForecastX is set to check automatically for a lag length of 12 if a nonseasonal model has
been selected; if a seasonal model has been selected the lag length is equal to four times
the seasonal length.
Second Diagnostic Method (Use the Ljung-Box statistic)

ARIMA (p d q)
  AR   I   MA

p is the number of AR terms,
d is the number of differences, and
q is the number of MA terms.

(Table 7-2: force an MA1 model on the MA1 column data.)
Second Diagnostic Method (Use the Ljung-Box statistic)
Use a Chi-Square table to check the Ljung-Box.
For example:
If the calculated Ljung-Box reported by ForecastX is 7.33 for the first 12
autocorrelations (as in a nonseasonal model), the resulting degrees of freedom are 11.
(m-p-q degrees of freedom)
Check the textbook Chi-Square table for 11 degrees of freedom at the 0.10 column to
find 17.275. This is the critical value.
In this case, the model passes the Q-test because 7.33 is less than 17.275.

Note that it is standard practice with the Ljung-Box to check “four times the number of seasons” in terms of lags
examined. For instance, if the data is quarterly check (4 X 4) or 16 lags at a minimum.
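
A sketch of the same Q-test in R, assuming fit is an MA(1) model estimated on the Table 7-2 data; the base R function Box.test() computes the Ljung-Box statistic, and qchisq() gives the chi-square critical value used above:

  res <- residuals(fit)
  Box.test(res, lag = 12, type = "Ljung-Box", fitdf = 1)   # fitdf = p + q = 1, so df = 11

  qchisq(0.90, df = 11)   # chi-square critical value at the 0.10 level, about 17.275
  # A calculated Q (7.33 in the example) below this critical value passes the test.
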
FORECASTING SEASONAL TIME SERIES
• In many actual business situations the time series to be forecast are quite seasonal.
• This seasonality can cause some problems in the ARIMA process, since a model fitted to such a series would
likely have a very high order. If monthly data were used and the seasonality occurred in every 12th month,
the order of the model might be 12 or more.
Seasonal ARIMA Model
ARIMA (p d q) (P D Q)

AR I MA SAR SI SMA

p is the number of AR terms,
d is the number of differences,
q is the number of MA terms,
P is the number of seasonal AR terms,
D is the number of seasonal differences,
Q is the number of seasonal MA terms.
Figure 7.17  Autocorrelation and partial autocorrelation plots for the total houses sold series. Note that the series appears to be nonstationary.
ARIMA Results for Total Houses Sold Series

Figure 7.18 contains the diagnostic statistics for the estimation for an ARIMA (0, 1, 0) (2, 0, 2) model. The second set of P, D,
Q values (usually shown as uppercase letters) represents two seasonal AR and two seasonal MA terms. The Ljung-Box
statistic for the first 48 lags is 45.43 and confirms the acceptability of the model.
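
The ForecastX results described above could be approximated in R with the forecast package; houses is an assumed monthly ts object for the total houses sold series:

  library(forecast)

  fit_houses <- Arima(houses, order = c(0, 1, 0),
                      seasonal = list(order = c(2, 0, 2), period = 12))
  summary(fit_houses)                   # coefficient estimates and information criteria
  checkresiduals(fit_houses, lag = 48)  # residual ACF plot and Ljung-Box test over 48 lags
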
Estimation and order selection

• Maximum likelihood estimation


• Once the model order has been identified, we need to estimate the
parameters W and A. When R estimates the ARIMA model, it
uses maximum likelihood estimation (MLE). This technique finds the
values of the parameters which maximise the probability of obtaining
the data that we have observed.
ARIMA Modelling in R
• The auto.arima function in R uses a variation of the Hyndman-
Khandakar algorithm (Hyndman & Khandakar, 2008), which
combines unit root tests, minimisation of the AICc and MLE to obtain
an ARIMA model. The arguments to auto.arima provide for many
variations on the algorithm.
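
A minimal usage sketch, assuming y is a ts object; the argument values shown are illustrative, not required:

  library(forecast)

  fit <- auto.arima(y, seasonal = TRUE, stepwise = TRUE, approximation = FALSE)
  summary(fit)            # selected (p, d, q)(P, D, Q) orders, AICc, and MLE coefficient estimates
  forecast(fit, h = 12)   # forecasts for the next 12 periods
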
Identifying ARIMA model
• The general rules to be followed in this identification stage of the process can be
summed up as follows:
• 1. If the autocorrelation function abruptly stops at some point—say, after q spikes
—then the appropriate model is an MA(q) type.
• 2. If the partial autocorrelation function abruptly stops at some point—say, after p
spikes—then the appropriate model is an AR(p) type.
• 3. If neither function falls off abruptly, but both decline toward zero in some
fashion, the appropriate model is an ARMA(p, q) type.
Stationarity and differencing
A stationary time series is one whose properties do not depend on the time at which the series is observed. Thus, time
series with trends, or with seasonality, are not stationary — the trend and seasonality will affect the value of the time
series at different times. On the other hand, a white noise series is stationary — it does not matter when you observe it,
it should look much the same at any point in time.

In general, a stationary time series will have no predictable patterns in the long-term. Time plots will show the series to
be roughly horizontal (although some cyclic behaviour is possible), with constant variance.
DIFFERENCING
In the previous figure, the Google stock price was non-stationary in panel (a), but the daily changes were stationary in panel (b). This shows one way to make a non-stationary time series stationary: compute the differences between consecutive observations. This is known as differencing.

Transformations such as logarithms can help to stabilise the variance of a time series. Differencing can help
stabilise the mean of a time series by removing changes in the level of a time series, and therefore eliminating
(or reducing) trend and seasonality.

The ACF of the differenced Google stock price looks just like that of a white noise series. There are no autocorrelations lying outside the 95% limits, and the Ljung-Box Q* statistic has a p-value of 0.355 (for h = 10). This suggests that the daily change in the Google stock price is essentially a random amount which is uncorrelated with that of previous days.
First order Differencing
The differenced series is the change between consecutive observations in the
original series, and can be written as

y′t = yt− yt−1

Second order Differencing

Occasionally the differenced data will not appear to be stationary and it may be necessary to difference the data a second time to obtain a stationary series:

y′′t = y′t − y′t−1
     = (yt − yt−1) − (yt−1 − yt−2)
     = yt − 2yt−1 + yt−2

Seasonal Differencing

A seasonal difference is the difference between an observation and the previous observation from the same season:

y′t = yt − yt−m

where m = the number of seasons. These are also called "lag-m differences," as we subtract the observation after a lag of m periods.
To distinguish seasonal differences from ordinary
differences, we sometimes refer to ordinary
differences as “first differences,” meaning
differences at lag 1.

Sometimes it is necessary to take both a seasonal


difference and a first difference to obtain
stationary data, as is shown. Here, the data are
first transformed using logarithms (second panel),
then seasonal differences are calculated (third
panel). The data still seem somewhat non-
stationary, and so a further lot of first differences
are computed (bottom panel).
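
These differencing operations are one-liners in base R; y is an assumed monthly series (m = 12):

  d1  <- diff(y)                        # first differences: y't = yt - yt-1
  d2  <- diff(y, differences = 2)       # second-order differences
  ds  <- diff(y, lag = 12)              # seasonal (lag-m) differences
  dds <- diff(diff(log(y), lag = 12))   # log, then seasonal difference, then first difference
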
Unit Root Tests
One way to determine more objectively whether differencing is required is to use a unit root test. These
are statistical hypothesis tests of stationarity that are designed for determining whether differencing is
required.

This process of using a sequence of KPSS tests to determine the appropriate number of first differences is carried out by the function ndiffs().

A similar function for determining whether seasonal differencing is required is nsdiffs().
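
Both functions live in the forecast package; a short sketch assuming a seasonal ts object y:

  library(forecast)

  ndiffs(y)    # number of first differences suggested (a sequence of KPSS tests by default)
  nsdiffs(y)   # number of seasonal differences suggested
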
