
Models for Non-Stationary Time Series

Any time series without a constant mean over time is non-stationary. Models of the form

Yt = μt + Xt

where μt is a non-constant mean function and Xt is a zero-mean, stationary series, are non-stationary.

Non-stationary time series can often be converted to stationary time series through differencing. This leads us to introduce a time series model known as the integrated autoregressive moving average (ARIMA) model.

As an example, consider the Dow-Jones index over 251 trading days ending 26 August 1994. The time plot of the series displays considerable variation, and a stationary model does not seem reasonable.

Taking the first difference transforms the series into a stationary series that resembles white noise, showing that the daily change in the index is essentially a random amount uncorrelated with previous days.
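A minimal R sketch of this kind of check (the file name and the vector dj are hypothetical placeholders for the Dow-Jones values):

dj <- scan("dowjones.txt")   # hypothetical plain-text file, one index value per line
plot(dj, type = "l")         # nonstationary level
dj.diff <- diff(dj)          # daily changes
plot(dj.diff, type = "l")    # resembles white noise
acf(dj.diff)                 # little autocorrelation expected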

ARIMA Models

A time series {Yt} is said to follow an integrated autoregressive moving average model if the dth difference, Wt = ∇^d Yt, is a stationary ARMA process. If {Wt} follows an ARMA(p,q) model, we say that {Yt} is an ARIMA(p,d,q) process. Fortunately, for practical purposes, we can usually take d = 1 or at most 2.

Consider then an ARIMA(p,1,q) process. With Wt = Yt - Yt-1, we have

Wt = φ1Wt-1 + ... + φpWt-p + et - θ1et-1 - ... - θqet-q

or, in terms of the observed series,

Yt - Yt-1 = φ1(Yt-1 - Yt-2) + ... + φp(Yt-p - Yt-p-1) + et - θ1et-1 - ... - θqet-q

If the process contains no autoregressive terms, we call it an integrated moving average and abbreviate the name to IMA(d,q). If no moving average terms are present, we denote the model as ARI(p,d).

The IMA(1,1) Model

The simple IMA(1,1) model satisfactorily represents numerous time series, especially those arising in economics and business. In difference equation form, the model is

Yt = Yt-1 + et - θet-1

For this model, the variance of Yt increases rapidly with t and could be quite large. The correlation between Yt and Yt-k will be strongly positive for many lags k = 1, 2, ...; indeed, Corr(Yt, Yt-k) is nearly 1 for all moderate k.

The IMA(2,2) Model

In difference equation form, the IMA(2,2) model is expressed as

Yt = 2Yt-1 - Yt-2 + et - θ1et-1 - θ2et-2

A simulation of an IMA(2,2) process with θ1 = 1 and θ2 = 0.6 shows how the increasing variance and the strong, positive neighboring correlations dominate the appearance of the time series plot.

The following figure shows the time series plot of the first difference of the simulated series. This series is also nonstationary, as it is governed by an IMA(1,2) model.

Finally, the second differences of the simulated IMA(2,2) series are plotted in the following figure. These values arise from a stationary MA(2) model with θ1 = 1 and θ2 = 0.6.

The ARI(1,1) Model

The ARI(1,1) process will satisfy

Yt - Yt-1 = φ(Yt-1 - Yt-2) + et

where |φ| < 1. Theoretical autocorrelations for this model are ρ1 = 0.678 and ρ2 = 0.254. These correlation values seem to be reflected in the appearance of the time series plot.

Nonstationary ARIMA series can be simulated by first simulating the corresponding stationary ARMA series and then integrating it (really partially summing it).
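One way to carry this out in R (the parameter values and seed are illustrative):

set.seed(123)
w <- arima.sim(model = list(ar = 0.8), n = 200)   # stationary ARMA part
y <- cumsum(w)                                    # partial sums give an ARI(1,1) path
plot(y, type = "l")
# arima.sim() can also do the integration directly:
y2 <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.8), n = 200)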

Constant Terms in ARIMA Models

For an ARIMA(p,d,q) model, Wt = ∇^d Yt is a stationary ARMA(p,q) process. The standard assumption is that stationary models have a zero mean; that is, we are actually working with deviations from the constant mean.

A nonzero constant mean, μ, in a stationary ARMA model {Wt} can be accommodated by writing the model in terms of deviations from μ,

Wt - μ = φ1(Wt-1 - μ) + ... + φp(Wt-p - μ) + et - θ1et-1 - ... - θqet-q

or, equivalently, by including a constant term θ0 = μ(1 - φ1 - φ2 - ... - φp) in the model.
Other Transformations

We know differencing is a useful transformation for achieving stationarity. However, the logarithm transformation is also a useful method in certain circumstances.

We frequently encounter series where increased dispersion seems to be associated with higher levels of the series: the higher the level of the series, the more variation there is around that level, and conversely.

If the standard deviation of the series is proportional to the level of the series, then transforming to logarithms will produce a series with approximately constant variance over time.

Also, if the level of the series is changing roughly exponentially, the log-transformed series will exhibit a linear time trend. Thus, we might then want to take first differences.

As an example, consider the time series shown below:

This series gives the total monthly electricity generated in the United States in millions of kilowatt-hours. The higher values display considerably more variation than the lower values.

The following figure displays the time series plot of the logarithms of the electricity values. Now the amount of variation around the upward trend is much more uniform across high and low values of the series.

The differences of the logarithms of the electricity values are displayed below. On the basis of this plot, we might well consider a stationary model to be appropriate.
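In R, the log transformation and differencing can be combined directly (electricity is a placeholder name for the monthly series discussed above):

plot(log(electricity))           # variation is more uniform on the log scale
plot(diff(log(electricity)))     # differences of the logarithms
acf(diff(log(electricity)))      # check whether a stationary model now looks reasonable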

The Backshift Operator

Many other books and much of the time series literature use what is called the backshift operator to express and manipulate ARIMA models.

The backshift operator, denoted B, operates on the time index of a series and shifts time back one unit to form a new series. In particular,

BYt = Yt-1

Two applications of B to Yt shift the data back two periods:

B(BYt) = B^2 Yt = Yt-2

For monthly data, if we wish to shift attention to the same month last year, then B^12 is used, and the notation is B^12 Yt = Yt-12.

The backward shift operator is convenient for describing the process of differencing. A first difference can be written as

∇Yt = Yt - Yt-1 = (1 - B)Yt

so a first difference is represented by (1 - B). Similarly, if second-order differences (i.e., first differences of first differences) have to be computed, then

∇^2 Yt = (1 - B)^2 Yt = Yt - 2Yt-1 + Yt-2

and the second-order difference is denoted (1 - B)^2.

It is important to recognize that a second-order difference is not the same as a second difference, which would be denoted (1 - B^2); similarly, a twelfth difference would be (1 - B^12), but a twelfth-order difference would be (1 - B)^12.

In general, a dth-order difference can be written as (1 - B)^d Yt.
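The diff() function in R computes these differences directly (y is a hypothetical series):

diff(y)                          # (1 - B) Yt : first difference
diff(y, differences = 2)         # (1 - B)^2 Yt : second-order difference
diff(y, lag = 12)                # (1 - B^12) Yt : twelfth (seasonal) difference
diff(diff(y), lag = 12)          # (1 - B)(1 - B^12) Yt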

Exercise 5.1

Identify the following as specific ARIMA models. That is, what are p, d, and q, and what are the values of the parameters (the φ's and θ's)?

Model Specification

For a time series model, we first need to decide on tentative values of p, d, and q and then estimate the φ's, θ's, and σe² for that model in the most efficient way. Then we check the adequacy of the model. If the model appears inadequate, we consider the nature of the inadequacy to help us select another model. We proceed to estimate that new model and check it for adequacy.

With a few iterations of this model-building strategy, we hope to arrive at the best possible model for a given series.

General Behavior of the ACF and PACF for ARMA Models

The ACF and PACF provide effective tools for identifying AR(p) or MA(q) models.

[Figures: simulated AR(1) series with φ = 0.9 and φ = -0.9; simulated AR(2) series with (φ1, φ2) = (0.6, 0.3), (-0.4, 0.5), (1.2, -0.7), and (-1, -0.6); simulated MA(1) series with θ = 0.9 and θ = -0.9; simulated MA(2) series with (θ1, θ2) = (0.5, 0.4), (1.2, -0.7), and (-1, -0.6); simulated ARMA(1,1) series with (φ, θ) = (0.8, -0.7), (0.8, 0.4), (-0.8, -0.4), and (-0.8, 0.4).]

AR(1)
ACF: Exponential decay: on the positive side if φ > 0, alternating in sign starting on the negative side if φ < 0.
PACF: Spike at lag 1, then cuts off to zero: spike positive if φ > 0, negative if φ < 0.

AR(2)
ACF: Exponential decay or damped sine wave. The exact pattern depends on the signs and sizes of φ1 and φ2.
PACF: Spikes at lags 1 and 2, then cuts off to zero.

AR(p)
ACF: Exponential decay or damped sine wave. The exact pattern depends on the signs and sizes of φ1, ..., φp.
PACF: Spikes at lags 1 to p, then cuts off to zero.

MA(1)
ACF: Spike at lag 1, then cuts off to zero: spike positive if θ1 < 0, negative if θ1 > 0.
PACF: Exponential decay: on the negative side if θ1 > 0, alternating in sign starting on the positive side if θ1 < 0.

MA(2)
ACF: Spikes at lags 1 and 2, then cuts off to zero.
PACF: Exponential decay or damped sine wave. The exact pattern depends on the signs and sizes of θ1 and θ2.

MA(q)
ACF: Spikes at lags 1 to q, then cuts off to zero.
PACF: Exponential decay or damped sine wave. The exact pattern depends on the signs and sizes of θ1, ..., θq.

ARMA(p,q)
ACF: Exponential decay.
PACF: Exponential decay.
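These patterns can be checked quickly in R (the parameter values are illustrative; note that arima.sim() uses the sign convention Yt = et + θet-1, opposite to the θ convention used here):

set.seed(1)
y.ar1 <- arima.sim(model = list(ar = 0.9), n = 200)     # AR(1) with phi = 0.9
acf(y.ar1)    # expect exponential decay
pacf(y.ar1)   # expect a single spike at lag 1
y.ma1 <- arima.sim(model = list(ma = -0.9), n = 200)    # MA(1) with theta = 0.9 in this text's convention
acf(y.ma1)    # expect a single spike at lag 1
pacf(y.ma1)   # expect exponential decay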

Nonstationarity

The sample ACF computed for a nonstationary series typically fails to die out rapidly as the lags increase. The sample PACF is also indeterminate.

If the first difference of a series and its sample ACF do not appear to support a stationary ARMA model, then we take another difference and again compute the sample ACF and PACF to look for the characteristics of a stationary ARMA process. Usually one or at most two differences, perhaps combined with a logarithm or other transformation, will accomplish this reduction to stationarity.

Overdifferencing

The difference of any stationary time series is also stationary. Overdifferencing introduces unnecessary correlations into a series and will complicate the modeling process.

For example, suppose our observed series {Yt} is a random walk, so that one difference would lead to the very simple white noise model

Yt - Yt-1 = et

If we difference once more (that is, overdifference), we have

Yt - 2Yt-1 + Yt-2 = et - et-1

which is an MA(1) model but with θ = 1. If we take two differences in this situation, we unnecessarily have to estimate the unknown value of θ. Specifying an IMA(2,1) model would not be appropriate here. The random walk model, which can be thought of as IMA(1,1) with θ = 0, is the correct model.

The random walk model can also be thought of as an ARI(1,1) with φ = 0 or as a nonstationary AR(1) with φ = 1. Overdifferencing also creates a noninvertible model, and noninvertible models create serious problems when we attempt to estimate their parameters.

To avoid overdifferencing, we look carefully at each difference in succession and choose models that are as simple as possible.
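A short simulation illustrates the effect of overdifferencing (values are illustrative):

set.seed(42)
e <- rnorm(200)
y <- cumsum(e)                   # random walk: one difference is white noise
acf(diff(y))                     # essentially no correlation
acf(diff(y, differences = 2))    # overdifferenced: strong negative lag-1 correlation (about -0.5)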
The Dickey-Fuller Unit-Root Test

While the approximate linear decay of the sample ACF is often taken as a symptom that the underlying time series is nonstationary and requires differencing, it is also useful to quantify the evidence of nonstationarity in the data-generating mechanism. This can be done via hypothesis testing.

Consider the model

Yt = ρYt-1 + et

where et is a white noise error term. We know that if ρ = 1, that is, in the case of a unit root, this becomes a random walk model, which we know is a nonstationary stochastic process. Subtract Yt-1 from both sides to obtain

Yt - Yt-1 = (ρ - 1)Yt-1 + et

which can alternatively be written as

∇Yt = aYt-1 + et

where a = (ρ - 1). Since et is a white noise error term, it is stationary, which means that the first differences of a random walk time series are stationary.

If a = 0, then ρ = 1; that is, we have a unit root, meaning the time series under consideration is nonstationary. In that case ∇Yt = et. In practice, we test the null hypothesis that a = 0 against the alternative hypothesis that a < 0.

To estimate a, take the first differences of Yt, regress them on Yt-1, and see whether the estimated slope coefficient (â) in this regression is zero or not. If it is zero, we conclude that Yt is nonstationary; if it is negative, we conclude that Yt is stationary.

Dickey and Fuller have shown that, under the null hypothesis that a = 0, the estimated t value of the coefficient of Yt-1 follows the τ (tau) distribution. If the computed absolute value of the tau statistic exceeds the absolute Dickey-Fuller critical value, we reject the hypothesis that a = 0, in which case the time series is stationary.

In cases where the et are correlated, Dickey and Fuller have developed another test, known as the augmented Dickey-Fuller (ADF) test.
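In R, the augmented Dickey-Fuller test is available as adf.test() in the tseries package (y is a placeholder series):

library(tseries)      # provides adf.test()
adf.test(y)           # null hypothesis: unit root (nonstationarity)
adf.test(diff(y))     # the differenced series should reject the null if one difference suffices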

Other Specification Methods

A number of other approaches to model specification have been proposed. One of the most studied is Akaike's (1973) Information Criterion (AIC). This criterion says to select the model that minimizes

AIC = -2 log(maximum likelihood) + 2k

where k = p + q + 1 if the model contains an intercept or constant term and k = p + q otherwise.

The corrected AIC, denoted AICc, can also be used; it is defined by the formula

AICc = AIC + 2(k + 1)(k + 2)/(n - k - 2)

Another approach to determining the ARMA orders is to select a model that minimizes the Schwarz Bayesian Information Criterion (BIC), defined as

BIC = -2 log(maximum likelihood) + k log(n)

Here n is the sample size and again k is the total number of parameters. If the true process follows an ARMA(p,q) model, then it is known that the orders specified by minimizing the BIC are consistent; that is, they approach the true orders as the sample size increases.
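Fitted arima() objects in R report the log-likelihood, and the generic AIC() and BIC() functions can be used to compare candidate models (y and the orders are placeholders):

fit1 <- arima(y, order = c(2, 1, 0))
fit2 <- arima(y, order = c(1, 1, 1))
AIC(fit1); AIC(fit2)   # smaller is better
BIC(fit1); BIC(fit2)   # BIC penalizes additional parameters more heavily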

We may wish to consider several models with AIC (BIC) values close to the minimum. A difference in AIC (BIC) values of 2 or less is not regarded as substantial, and in that case we may prefer the simpler model.

Specification of the Actual Time Series (Dividend)

Consider the U.S. dividend series for the quarterly periods of 1970-1991, for a total of 88 quarterly observations. The raw data are given below:
The dividend time series shows that over the period of study the dividend has been increasing, that is, showing an upward trend, suggesting that the mean of the series has been changing. This suggests that the dividend series is not stationary.

The autocorrelation coefficient starts at a very high value at lag 1 and declines very slowly. Thus it seems that the dividend time series is not stationary.

As with the ACF, the partial autocorrelations should all be close to zero for a white noise series. Since the PACF drops dramatically after the first lag, and all partial autocorrelations after the first lag are statistically insignificant, the dividend time series is not stationary.

So we take first differences of the data and reanalyze. Now the autocorrelations show a mixture of exponential decay and a sine-wave pattern. There are two significant partial autocorrelations, which suggests that an AR(2) model is operating. So, for the original series, we have identified an ARIMA(2,1,0) model.

Next we consider several models with AIC values close to the minimum:

Model (p,d,q)    AIC
(2,1,0)          283
(1,1,0)          297
(1,1,1)          286
(2,1,1)          284.9

The ARIMA(2,1,0) selected initially is still the best model since it has the smallest AIC value.

R code

library(tseries)                 # for adf.test()
data <- read.csv('dividend.csv')
attach(data)
plot(dividend)                   # time series plot
acf(dividend)                    # sample ACF
pacf(dividend)                   # sample PACF
adf.test(dividend)               # unit-root test on the original series
plot(diff(dividend))
acf(diff(dividend))
adf.test(diff(dividend))         # unit-root test on the first differences
x.fit = arima(dividend, order = c(2, 1, 0))   # fit the ARIMA(2,1,0) model

Specification of the Actual Time Series (Internet)

The number of users logged on to an Internet server each minute over 100 minutes is given below:

The autocorrelation plot gives indications of non-stationarity, and the data plot makes this clear too.

The first partial autocorrelation is very dominant and close to 1, also showing the non-stationarity.

So we take first differences of the data and reanalyze. Now the autocorrelations show a mixture of exponential decay and a sine-wave pattern.

There are three significant partial autocorrelations, which suggests that an AR(3) model is operating. So, for the original series, we have identified an ARIMA(3,1,0) model.

We again consider several models with AIC values close to the minimum. The ARIMA(3,1,0) selected initially is still the best model since it has the smallest AIC value.
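R ships with a built-in series, WWWusage, giving the number of users connected to an Internet server each minute over 100 minutes; if the data above are the same series, the analysis can be reproduced along these lines:

plot(WWWusage)
acf(WWWusage)                                # slow decay: nonstationary
plot(diff(WWWusage))
acf(diff(WWWusage)); pacf(diff(WWWusage))    # inspect the differenced series
fit <- arima(WWWusage, order = c(3, 1, 0))   # tentative ARIMA(3,1,0)
AIC(fit)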

Exercise 6.12

From a time series of 100 observations, we calculate r1 = 0.49, r2 = 0.31, r3 = 0.21, r4 = 0.11, and |rk| < 0.09 for k > 4. On this basis alone, what ARIMA model would we tentatively specify for the series?

Exercise 6.13

A stationary time series of length 121 produced the sample partial autocorrelations shown. Based on this information alone, what model would we tentatively specify for the series?

Parameter Estimation

After the specification of the values of p, d, and q of an ARIMA model, we estimate the parameters of that model. There are three methods of estimating the parameters, namely:

(i) The method of moments
(ii) The method of least squares
(iii) The method of maximum likelihood
The method of moments

The method of moments is frequently one of the easiest methods for obtaining parameter estimates, but it is not the most efficient. The method consists of equating sample moments to the corresponding theoretical moments and solving the resulting equations to obtain estimates of any unknown parameters. The simplest example of the method is estimating a stationary process mean by the sample mean.

Autoregressive Models

Consider first the AR(1) case:

Yt = φYt-1 + et

For this process we have the simple relationship ρ1 = φ. In the method of moments, ρ1 is equated to r1, the lag 1 sample autocorrelation. Thus we can estimate φ by

φ̂ = r1

Now consider the AR(2) case:

Yt = φ1Yt-1 + φ2Yt-2 + et

For this process, the relationships between the parameters φ1 and φ2 and the autocorrelations are given by the Yule-Walker equations

ρ1 = φ1 + ρ1φ2 and ρ2 = ρ1φ1 + φ2

The method of moments replaces ρ1 by r1 and ρ2 by r2 to obtain

r1 = φ̂1 + r1φ̂2 and r2 = r1φ̂1 + φ̂2

Solving the two equations, we obtain

φ̂1 = r1(1 - r2)/(1 - r1²) and φ̂2 = (r2 - r1²)/(1 - r1²)

The general AR(p) case proceeds similarly: replace ρk by rk in the Yule-Walker equations and solve to obtain the estimates φ̂1, φ̂2, ..., φ̂p.
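A small R sketch of these moment (Yule-Walker) calculations for an AR(2), assuming the series is stored in y:

r <- acf(y, plot = FALSE)$acf[2:3]            # r1 and r2
phi1 <- r[1] * (1 - r[2]) / (1 - r[1]^2)      # method-of-moments estimate of phi1
phi2 <- (r[2] - r[1]^2) / (1 - r[1]^2)        # method-of-moments estimate of phi2
ar(y, order.max = 2, aic = FALSE, method = "yule-walker")   # built-in Yule-Walker fit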

Moving Average Models

For the MA(1) process

Yt = et - θet-1

we know that

ρ1 = -θ/(1 + θ²)

Equating ρ1 to r1, we are led to solve a quadratic equation in θ. The invertible solution can be written as

θ̂ = (-1 + √(1 - 4r1²)) / (2r1)

If r1 = ±0.5, unique real solutions exist, namely ∓1, but neither is invertible. If |r1| > 0.5, no real solutions exist, and so the method of moments fails to yield an estimator of θ.

For higher-order MA models, the method of moments quickly gets complicated.

Mixed Models

For the ARMA(1,1) process

Yt = φYt-1 + et - θet-1

we know that

ρk = [(1 - θφ)(φ - θ)/(1 - 2θφ + θ²)] φ^(k-1)   for k ≥ 1

Noting that ρ2/ρ1 = φ, we can first estimate φ as

φ̂ = r2/r1

Having done so, we can then use the expression for ρ1, with r1 in place of ρ1 and φ̂ in place of φ, to solve for θ. Note that a quadratic equation must be solved and only the invertible solution, if any, retained.

Estimates of the Noise Variance

The final parameter to be estimated is the noise variance, σe². In all cases, we can first estimate the process variance, γ0 = Var(Yt), by the sample variance

s² = Σ(Yt - Ȳ)²/(n - 1)

and then use the known relationships among γ0, σe², and the φ's and θ's to estimate σe².

For an AR(1) process, σ̂e² = (1 - φ̂²)s². For the general AR(p) model, σ̂e² = (1 - φ̂1r1 - φ̂2r2 - ... - φ̂prp)s².

For the MA(1) model, σ̂e² = s²/(1 + θ̂²), and for the MA(q) model, σ̂e² = s²/(1 + θ̂1² + θ̂2² + ... + θ̂q²).

For the ARMA(1,1) process, σ̂e² = [(1 - φ̂²)/(1 - 2φ̂θ̂ + θ̂²)]s².

Method-of-Moments Parameter Estimates for Simulated Series

Generally speaking, the estimates for the autoregressive models are fairly good, but the estimates for the moving average models are not acceptable. It can be shown that method-of-moments estimators are very inefficient for models containing moving average terms.
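For completeness, the MA(1) method-of-moments estimate described above can be computed directly in R (assuming y holds a mean-corrected series with |r1| < 0.5):

r1 <- acf(y, plot = FALSE)$acf[2]
theta <- (-1 + sqrt(1 - 4 * r1^2)) / (2 * r1)   # invertible root of the quadratic
sigma2e <- var(y) / (1 + theta^2)               # method-of-moments noise variance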

Least Squares Estimation

Because the method of moments is unsatisfactory for many models, we must consider other methods of estimation. We begin with least squares. At this point, we introduce a possibly nonzero mean, μ, into our stationary models and treat it as another parameter to be estimated by least squares.

Autoregressive Models

Consider the first-order case, where

Yt - μ = φ(Yt-1 - μ) + et

We can view this as a regression model with predictor variable Yt-1 and response variable Yt. Least squares estimation then proceeds by minimizing the sum of squares of the differences (Yt - μ) - φ(Yt-1 - μ).

Since only Y1, Y2, ..., Yn are observed, we can only sum from t = 2 to t = n. The conditional sum-of-squares function can therefore be written as

Sc(φ, μ) = Σ(t=2 to n) [(Yt - μ) - φ(Yt-1 - μ)]²

According to the principle of least squares, we estimate φ and μ by the respective values that minimize Sc(φ, μ), given the observed values of Y1, Y2, ..., Yn. Setting ∂Sc/∂μ = 0, simplifying, and solving for μ shows that, for large n and regardless of the value of φ,

μ̂ ≈ Ȳ

Consider now the minimization of Sc(φ, μ) with respect to φ. Setting ∂Sc/∂φ equal to zero and solving for φ yields

φ̂ = Σ(t=2 to n) (Yt - Ȳ)(Yt-1 - Ȳ) / Σ(t=2 to n) (Yt-1 - Ȳ)²

Except for one term missing in the denominator, this is the same as r1. Especially for large samples, the least squares and method-of-moments estimators are nearly identical.

For the general AR(p) process, the same approach leads to analogous estimating equations.

To generalize the estimation of the φ's, consider the second-order model

Yt - μ = φ1(Yt-1 - μ) + φ2(Yt-2 - μ) + et

Replacing μ by Ȳ in the conditional sum-of-squares function, we obtain

Sc(φ1, φ2) = Σ(t=3 to n) [(Yt - Ȳ) - φ1(Yt-1 - Ȳ) - φ2(Yt-2 - Ȳ)]²

Setting ∂Sc/∂φ1 = 0 gives an equation in sums of lagged products. The sum of lagged products Σ(Yt - Ȳ)(Yt-1 - Ȳ) is very nearly the numerator of r1 (we are missing one product), and a similar situation exists for the sum Σ(Yt-1 - Ȳ)(Yt-2 - Ȳ), where we are again missing one product. Dividing both sides by Σ(Yt - Ȳ)², we obtain, approximately,

r1 = φ1 + r1φ2

In a similar way, the equation ∂Sc/∂φ2 = 0 leads to, approximately,

r2 = r1φ1 + φ2

so the conditional least squares estimates essentially solve the sample Yule-Walker equations. Entirely analogous results follow for the general stationary AR(p) process.

Moving Average Models

Consider the MA(1) model

Yt = et - θet-1

Rewriting this as et = Yt + θet-1 and then replacing t by t - 1 and substituting for et-1, we get et = Yt + θYt-1 + θ²et-2. If |θ| < 1, we may continue this substitution into the past and obtain the expression

et = Yt + θYt-1 + θ²Yt-2 + θ³Yt-3 + ...
This is an autoregressive model of infinite order. Least squares can be carried out by choosing a value of θ that minimizes

Sc(θ) = Σ et² = Σ (Yt + θYt-1 + θ²Yt-2 + ...)²

where et = et(θ) is a function of the observed series and the unknown parameter θ.

It is clear that the least squares problem is nonlinear in the parameters. We will not be able to minimize Sc(θ) by taking a derivative with respect to θ, setting it to zero, and solving. Thus, even for the simple MA(1) model, we must resort to techniques of numerical optimization.

Now consider evaluating Sc(θ) for a single given value of θ. The only data we have available are the observed series Y1, Y2, ..., Yn. We know the MA(1) process can be written as

et = Yt + θet-1

Using this equation, e1, e2, ..., en can be calculated recursively if we have the initial value e0. A common approximation is to set e0 = 0, its expected value. Then, conditional on e0 = 0, we can obtain

e1 = Y1, e2 = Y2 + θe1, e3 = Y3 + θe2, ..., en = Yn + θen-1

and thus calculate Sc(θ) = Σ et², conditional on e0 = 0, for that single given value of θ.

For the simple case of one parameter, we carry out a grid search over the invertible range (-1, +1) for θ to find the minimum sum of squares. For more general MA(q) models, a numerical optimization algorithm, such as Gauss-Newton or Nelder-Mead, will be needed.
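A compact R version of this grid search (assuming the observations are mean-corrected and stored in y):

Sc <- function(theta, y) {                    # conditional sum of squares for MA(1)
  e <- numeric(length(y))
  e[1] <- y[1]                                # conditional on e0 = 0
  for (t in 2:length(y)) e[t] <- y[t] + theta * e[t - 1]
  sum(e^2)
}
thetas <- seq(-0.99, 0.99, by = 0.01)         # grid over the invertible range
ss <- sapply(thetas, Sc, y = y)
thetas[which.min(ss)]                         # conditional least squares estimate of theta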

Mixed Models

Consider the ARMA(1,1) case,

Yt = φYt-1 + et - θet-1

Here we consider et = et(φ, θ) and wish to minimize Sc(φ, θ) = Σ et². To obtain e1, we now have an additional startup problem, namely Y0. One approach is to set Y0 = 0, or to Ȳ if our model contains a nonzero mean. However, a better approach is to begin the recursion at t = 2, thus avoiding Y0 altogether, and simply minimize

Sc(φ, θ) = Σ(t=2 to n) et²

For the general ARMA(p,q) model, we compute

et = Yt - φ1Yt-1 - ... - φpYt-p + θ1et-1 + ... + θqet-q

with ep = ep-1 = ... = ep+1-q = 0, and then minimize Sc(φ1, ..., φp, θ1, ..., θq) numerically to obtain the conditional least squares estimates of all the parameters.

Maximum Likelihood Estimation

AR(1) Model
Consider an AR(1) process

Yt = φYt-1 + et

Let yt, t = 1, 2, ..., n, be the observed sample time series. The MLE of the AR parameter is the value of the parameter for which this sample is most likely to have been observed. This approach requires specifying a particular distribution for the white noise term et; typically we assume et ~ N(0, σe²).

The most important step in MLE is to evaluate the sample joint distribution, also called the likelihood function,

L = f(y1, y2, ..., yn)

In the case of an IID sample, the joint distribution is just the product of the marginal distributions. But in the case of a time series, the observations are dependent, so we need to use conditional densities. Each observation depends on its previous observation, and for y1 there is no previous observation, so its marginal density stands alone.

First we find f(yt | yt-1). Treating yt-1 as a constant, for the AR(1) process given above (and assuming, for simplicity, that μ = 0) we obtain

yt | yt-1 ~ N(φyt-1, σe²)

Next we find f(y1). Expressing the AR(1) process in terms of the et, we obtain

y1 ~ N(0, σe²/(1 - φ²))

Thus the likelihood function is

L(φ, σe²) = f(y1) Π(t=2 to n) f(yt | yt-1)

and the log-likelihood function is

log L = -(n/2) log(2πσe²) + (1/2) log(1 - φ²) - S(φ)/(2σe²)

where S(φ) = (1 - φ²)y1² + Σ(t=2 to n) (yt - φyt-1)².

Partially differentiating the log-likelihood function with respect to φ and equating to zero gives an equation for the MLE of φ. Ignoring the terms arising from the first observation, we obtain the approximate MLE

φ̂ ≈ Σ(t=2 to n) yt yt-1 / Σ(t=2 to n) yt-1²

Partially differentiating the log-likelihood function with respect to σe² and equating to zero gives

σ̂e² = S(φ̂)/n

For stationary autoregressive models, the method of moments yields estimators equivalent to least squares and maximum likelihood, at least for large samples. For models containing moving average terms, such is not the case.

The exact closed form of the likelihood function of a general ARMA model is complicated, and the estimation is usually done by numerical methods.
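In R, arima() carries out this numerical estimation; the method argument selects conditional sum of squares, full maximum likelihood, or the default combination (y and the orders are placeholders):

fit.css <- arima(y, order = c(1, 0, 1), method = "CSS")   # conditional sum of squares
fit.ml  <- arima(y, order = c(1, 0, 1), method = "ML")    # full maximum likelihood
fit.css$coef; fit.ml$coef                                 # compare the two sets of estimates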

Parameter Estimation for Some Simulated Models

[Tables of parameter estimates for simulated AR(1), AR(2), and ARMA(1,1) series appear here.]

Exercise 7.1

From a series of length 100, we have computed r1 = 0.8, r2 = 0.5, r3 = 0.4, Ȳ = 2, and a sample variance of 5. If we assume that an AR(2) model with a constant term is appropriate, how can we get (simple) estimates of φ1, φ2, and σe²?

Forecasting
One of the primary objectives of building a model for a time series is to be able to forecast the values of that series at future times and to assess the precision of those forecasts.

ARIMA Forecasting

Suppose the series is available up to time t, namely Y1, Y2, ..., Yt-1, Yt. We would like to forecast the value of Yt+ℓ that will occur ℓ time units into the future. We call time t the forecast origin and ℓ the lead time, and we denote the forecast as Ŷt(ℓ). This forecast is developed by minimizing the mean square forecasting error, and it turns out that

Ŷt(ℓ) = E(Yt+ℓ | Y1, Y2, ..., Yt)

AR(1)

Consider the AR(1) process with a nonzero mean,

Yt - μ = φ(Yt-1 - μ) + et

Consider the problem of forecasting one time unit into the future. Replacing t by t + 1, we have

Yt+1 - μ = φ(Yt - μ) + et+1

Given Y1, Y2, ..., Yt-1, Yt, we take the conditional expectation of both sides. From the properties of conditional expectation, E(Yt | Y1, ..., Yt) = Yt, and since et+1 is independent of Y1, Y2, ..., Yt-1, Yt, we have E(et+1 | Y1, ..., Yt) = 0. Thus the one-step-ahead forecast is

Ŷt(1) = μ + φ(Yt - μ)

For a general lead time ℓ, replacing t by t + ℓ and taking the conditional expectation, we get

Ŷt(ℓ) = μ + φ[Ŷt(ℓ-1) - μ]   for ℓ ≥ 1

since, for ℓ ≥ 1, et+ℓ is independent of Y1, Y2, ..., Yt-1, Yt. For example, Ŷt(2) is obtained from Ŷt(1). Thus this equation can be used to forecast for any lead time ℓ by starting with the initial forecast Ŷt(1); it is sometimes called the difference equation form of the forecast. Iterating backward on ℓ in the difference equation form, we obtain the explicit expression

Ŷt(ℓ) = μ + φ^ℓ (Yt - μ)

In general, since |φ| < 1, we have simply Ŷt(ℓ) ≈ μ for large ℓ.

As a numerical example, consider the AR(1) model with the results shown below. Suppose the last observed value is 67, so we would forecast one time period ahead as

Ŷt(1) = μ + φ(67 - μ)

For lead time 2, using the difference equation, we have Ŷt(2) = μ + φ[Ŷt(1) - μ]. Alternatively, we can use the explicit expression Ŷt(2) = μ + φ^2 (67 - μ). At lead 5 we have Ŷt(5) = μ + φ^5 (67 - μ), and at lead 10 we have Ŷt(10) = μ + φ^10 (67 - μ), which is very nearly μ (= 74.3293).

Forecast error for AR(1)

Consider the one-step-ahead forecast error, et(1):

et(1) = Yt+1 - Ŷt(1) = et+1

so the one-step-ahead forecast error variance is

Var(et(1)) = σe²

Expressing the AR(1) process in terms of the et (for simplicity, assuming μ = 0), we obtain Yt = et + φet-1 + φ²et-2 + ..., and the ℓ-step-ahead forecast error is

et(ℓ) = et+ℓ + φet+ℓ-1 + ... + φ^(ℓ-1) et+1

Thus the ℓ-step-ahead forecast error variance is

Var(et(ℓ)) = σe² (1 - φ^(2ℓ))/(1 - φ²)

or Var(et(ℓ)) ≈ σe²/(1 - φ²) = γ0 for large ℓ.

MA(1)

Consider the MA(1) case with a nonzero mean:

Yt = μ + et - θet-1

Replacing t by t + 1 and taking conditional expectations of both sides, we have

Ŷt(1) = μ - θ E(et | Y1, ..., Yt)

Now, rewriting the MA(1) model as et = Yt - μ + θet-1, then replacing t by t - 1 and substituting for et-1, and, if |θ| < 1, continuing this substitution into the past, we obtain an expression for et in terms of Yt, Yt-1, .... That is, et is a function of Y1, Y2, ..., Yt-1, Yt, and so E(et | Y1, ..., Yt) = et. Thus the one-step-ahead forecast can be expressed as

Ŷt(1) = μ - θet

For larger lead times, ℓ > 1, both et+ℓ and et+ℓ-1 are independent of Y1, Y2, ..., Yt. Consequently, these conditional expected values are the unconditional expected values, namely zero, and we have

Ŷt(ℓ) = μ   for ℓ > 1

Forecast error for MA(1)

The one-step-ahead forecast error is

et(1) = Yt+1 - Ŷt(1) = et+1

and the ℓ-step-ahead forecast error is

et(ℓ) = Yt+ℓ - Ŷt(ℓ) = et+ℓ - θet+ℓ-1   for ℓ > 1

ARMA(p,q)

For the general stationary ARMA(p,q) model, the difference equation form for computing forecasts is

Ŷt(ℓ) = φ1Ŷt(ℓ-1) + φ2Ŷt(ℓ-2) + ... + φpŶt(ℓ-p) + θ0 - θ1E(et+ℓ-1 | Y1, ..., Yt) - ... - θqE(et+ℓ-q | Y1, ..., Yt)

where Ŷt(j) = Yt+j for j ≤ 0, and E(et+j | Y1, ..., Yt) equals 0 for j > 0 and et+j for j ≤ 0.

For ARMA(p,q) models, the noise terms et-q+1, ..., et-1, et appear directly (as residuals) in the computation of the forecasts for leads ℓ = 1, 2, ..., q. However, for ℓ > q, the autoregressive portion of the difference equation takes over, and we have

Ŷt(ℓ) = φ1Ŷt(ℓ-1) + φ2Ŷt(ℓ-2) + ... + φpŶt(ℓ-p) + θ0   for ℓ > q

Thus the general nature of the forecast for long lead times will be determined by the autoregressive parameters φ1, φ2, ..., φp. The variance of the forecast error is

Var(et(ℓ)) = σe² Σ(j=0 to ℓ-1) ψj²

where the ψj are the weights in the general linear process representation of the model.

Example

The following model was fitted to annual bituminous coal production in the United States from 1920 to 1968:

(a) What sort of ARIMA model is this (i.e., what are p, d, and q)?
(b) The last five values of the series are given below:

t (year):  1964  1965  1966  1967  1968
Yt (mt):    467   512   534   552   545

The estimated parameters are c = 146.1, φ1 = 0.891, φ2 = -0.257, φ3 = 0.392, φ4 = -0.333. Give forecasts for the next three years (1969-1971).

ARMA(1,1)

Consider the ARMA(1,1) model with a constant term θ0,

Yt = φYt-1 + θ0 + et - θet-1

The one-step-ahead forecast can be expressed as

Ŷt(1) = φYt + θ0 - θet

and the two-step-ahead forecast as

Ŷt(2) = φŶt(1) + θ0

More generally,

Ŷt(ℓ) = φŶt(ℓ-1) + θ0   for ℓ ≥ 2

Prediction Limits

In addition to forecasting or predicting the unknown Yt+ℓ, we would like to assess the precision of our predictions. We may be (1 - α)100% confident that the future observation Yt+ℓ will be contained within the prediction limits

Ŷt(ℓ) ± z(1-α/2) √Var(et(ℓ))

As a numerical example, consider the AR(1) model with φ = 0.5705, μ = 74.3293, and σe² = 24.8. For a one-step-ahead prediction, the 95% limits are Ŷt(1) ± 1.96√σe². The two-step-ahead prediction interval is

71.86072 ± 11.88343, or 60.71 to 83.18

Notice that this prediction interval is wider than the previous interval. Forecasting ten steps ahead leads to

74.173934 ± 11.88451, or 62.42 to 86.19

By lead 10, both the forecast and the forecast limits have settled down to their long-lead values.

The following figure displays the AR(1) series together with forecasts out to lead time 12 and the upper and lower 95% prediction limits for those forecasts. In addition, a horizontal line at the estimate of the process mean is shown. Notice that the forecasts approach the mean exponentially as the lead time increases. Also, the prediction limits increase in width.
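In R, predict() applied to an arima() fit returns the forecasts and their standard errors, from which such limits can be formed (y and the order are placeholders):

fit <- arima(y, order = c(1, 0, 0))      # AR(1) with a mean
fc <- predict(fit, n.ahead = 12)         # forecasts out to lead 12
upper <- fc$pred + 1.96 * fc$se          # upper 95% prediction limits
lower <- fc$pred - 1.96 * fc$se          # lower 95% prediction limits
ts.plot(y, fc$pred, upper, lower, lty = c(1, 2, 3, 3))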

The following series was fitted by working with the square root and then fitting an AR(3) model. Notice that the forecasts mimic the approximate cycle in the actual series even when we forecast with a lead time out to 25 years.

Exercise 9.1

Seasonal Models

Many business and economic time series contain a phenomenon that repeats itself after a regular period of time. The time period for this repetitive phenomenon is called the seasonal period.

For example, the quarterly sales of ice cream are high each summer, and the series repeats this phenomenon each year, giving a seasonal period of 4. Similarly, monthly auto sales and earnings tend to decrease during August and September every year because of the changeover to new models, and the monthly sales of toys rise every year in December. The seasonal period in these cases is 12.

As an illustration, the following figure displays the monthly CO2 levels from January 1994 through December 2004. There is a strong upward trend but also a seasonality that can be seen better in the more detailed view in the following figure, where only the last few years are graphed using monthly plotting symbols.
symbols.
Seasonal ARIMA Models

We begin by studying stationary models and then


consider nonstationary generalizations. We let s
denote the known seasonal period; for monthly
series s = 12 and for quarterly series s = 4.

As we see in the displays, carbon dioxide levels are higher during the winter months and much lower in the summer.

Consider the time series generated according to

Yt = et - Θet-12

It is observed that such a series is stationary and has nonzero autocorrelation only at lag 12; in particular, ρ12 = -Θ/(1 + Θ²).

Generalizing these ideas, we define a seasonal MA(Q) model of order Q with seasonal period s by

Yt = et - Θ1et-s - Θ2et-2s - ... - ΘQet-Qs

It is evident that such a series is always stationary and that the autocorrelation function will be nonzero only at the seasonal lags s, 2s, 3s, ..., Qs.

Seasonal autoregressive models can also be defined. Consider

Yt = ΦYt-12 + et

Multiplying this equation by Yt-k, taking expectations, and dividing by γ0, one can show that ρk = 0 except at the seasonal lags 12, 24, 36, ...; at those lags, the autocorrelation function decays exponentially like an AR(1) model.

With this example in mind, we define a seasonal AR(P) model of order P and seasonal period s by

Yt = Φ1Yt-s + Φ2Yt-2s + ... + ΦPYt-Ps + et

It can be shown that its autocorrelation function is nonzero only at lags s, 2s, 3s, ..., where it behaves like a combination of decaying exponentials and damped sine functions, with zero correlation at all other lags.

By combining the ideas of seasonal and nonseasonal ARMA models, we can develop models that contain autocorrelation at the seasonal lags but also at low lags of neighboring series values.

The ARIMA notation can be extended readily to handle seasonal aspects, and the general shorthand notation is

ARIMA(p, d, q)(P, D, Q)s

where s is the number of periods per season. Clearly, such models represent a broad, flexible class from which to select an appropriate model for a particular time series. It has been found empirically that many series can be adequately fit by these models, usually with a small number of parameters, say three or four.

The algebra is simple but can get lengthy, so for illustrative purposes consider the following general ARIMA(1, 1, 1)(1, 1, 1)4 model:

(1 - φB)(1 - ΦB^4)(1 - B)(1 - B^4)Yt = (1 - θB)(1 - ΘB^4)et

All the factors can be multiplied out and the general model written as a single equation in lagged values of Yt and et.

Model Specification, Fitting, and Checking

Model specification, fitting, and diagnostic checking for seasonal models follow the same general techniques used for nonseasonal models. Here we shall simply highlight the application of these ideas specifically to seasonal models and pay special attention to the seasonal lags.

Model Specification

As always, a careful inspection of the time series plot is the first step. The following figure displays monthly carbon dioxide levels in northern Canada. The upward trend alone would lead us to specify a nonstationary model.

The following figure shows the sample autocorrelation function for that series. The seasonal autocorrelation relationships are shown quite prominently in this display. Notice the strong correlation at lags 12, 24, 36, and so on. In addition, there is substantial other correlation that needs to be modeled.

The following figure shows the time series plot of the CO2 levels after we take a first difference. The general upward trend has now disappeared, but the strong seasonality is still present, as evidenced by the behavior shown in the following figure.

Perhaps seasonal differencing will bring us to a series that may be modeled. After taking both a first difference and a seasonal difference, it appears that most, if not all, of the seasonality is gone. The following figure confirms that very little autocorrelation remains in the series after these two differences have been taken. This plot also suggests that a simple model incorporating the lag 1 and lag 12 autocorrelations might be adequate.

We will therefore consider specifying the multiplicative seasonal ARIMA(0,1,1)(0,1,1)12 model

(1 - B)(1 - B^12)Yt = (1 - θB)(1 - ΘB^12)et

As usual, all models are tentative and subject to revision at the diagnostics stage of model building.
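The two rounds of differencing and the ACF check can be done directly in R (co2 here is a placeholder name for the monthly CO2 series discussed above, not the built-in Mauna Loa dataset):

plot(diff(co2))                                 # first difference: trend gone, seasonality remains
plot(diff(diff(co2), lag = 12))                 # first difference plus seasonal difference
acf(diff(diff(co2), lag = 12), lag.max = 36)    # little correlation beyond lags 1 and 12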

Model Fitting

Having specified a tentative seasonal model for a


particular time series, we proceed to estimate the
parameters of that model as efficiently as possible.

The following table gives the maximum likelihood estimates and their standard errors for the ARIMA(0,1,1)(0,1,1)12 model for the CO2 levels. The coefficient estimates are all highly significant, and we proceed to check further on this model.
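Such a fit can be obtained with arima(), using the seasonal argument (co2 again stands for the series above):

fit <- arima(co2, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
fit                                   # coefficient estimates and standard errors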

Diagnostic Checking

To check the estimated ARIMA(0,1,1)(0,1,1)12 model, we first look at the time series plot of the residuals. Other than some strange behavior in the middle of the series, this plot does not suggest any major irregularities with the model.

To look further, we graph the sample ACF of the residuals. The only statistically significant correlation is at lag 22, and this correlation has a value of only 0.17, a very small correlation. We should not be surprised that one autocorrelation out of the 36 displayed is statistically significant; this could easily happen by chance alone.

Next we investigate the question of normality of the error terms via the residuals. The quantile-quantile plot of the residuals is shown in the following figure.
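These residual checks can be reproduced in R (fit is the seasonal model fitted above):

plot(residuals(fit), type = "l")                  # time series plot of residuals
acf(residuals(fit), lag.max = 36)                 # residual ACF
qqnorm(residuals(fit)); qqline(residuals(fit))    # normality check
tsdiag(fit)                                       # standard diagnostic display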
As one further check on the model, we consider overfitting with an ARIMA(0,1,2)(0,1,1)12 model, with the results shown below. We see that the estimates of θ1 and Θ have changed very little. In addition, the estimate of the new parameter, θ2, is not statistically different from zero.

Forecasting Seasonal Models

Computing forecasts with seasonal ARIMA models is, as expected, most easily carried out recursively using the difference equation form of the model. For example, consider the model ARIMA(0,1,1)(1,0,1)12, which in difference equation form is

Yt = Yt-1 + ΦYt-12 - ΦYt-13 + et - θet-1 - Θet-12 + θΘet-13

The one-step-ahead forecast from origin t is then

Ŷt(1) = Yt + ΦYt-11 - ΦYt-12 - θet - Θet-11 + θΘet-12

and the next one is

Ŷt(2) = Ŷt(1) + ΦYt-10 - ΦYt-11 - Θet-10 + θΘet-11

and so forth. The noise terms et-12, et-11, ..., et (as residuals) will enter into the forecasts for lead times ℓ = 1, 2, ..., 13, but for ℓ > 13 the autoregressive part of the model takes over and we have

Ŷt(ℓ) = Ŷt(ℓ-1) + ΦŶt(ℓ-12) - ΦŶt(ℓ-13)   for ℓ > 13

For the ARIMA(0,1,1)(0,1,1)12 model, the forecasts satisfy the corresponding relations with Φ = 1; in particular, for ℓ > 13,

Ŷt(ℓ) = Ŷt(ℓ-1) + Ŷt(ℓ-12) - Ŷt(ℓ-13)

Prediction Limits

Prediction limits are obtained precisely as in the


nonseasonal case. We illustrate this with the carbon
dioxide time series.

The following figure shows the forecasts and 95% forecast limits for a lead time of two years for the ARIMA(0,1,1)(0,1,1)12 model that we fit. The forecasts mimic the stochastic periodicity in the data quite well, and the forecast limits give a good feeling for the precision of the forecasts.
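In R these forecasts and limits follow from predict() applied to the seasonal fit (fit and co2 as above):

fc <- predict(fit, n.ahead = 24)            # forecasts for the next two years
upper <- fc$pred + 1.96 * fc$se
lower <- fc$pred - 1.96 * fc$se
ts.plot(co2, fc$pred, upper, lower, lty = c(1, 2, 3, 3))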

The following figure displays the last year of observed data and forecasts out four years. At this lead time, it is easy to see that the forecast limits are getting wider, as there is more uncertainty in the forecasts.

Equivalences with Exponential Smoothing Models

It is possible to show that the forecasts obtained from some exponential smoothing models are identical to forecasts from particular ARIMA models.

The simple exponential smoothing forecasts are equivalent to those from an ARIMA(0,1,1) model. The moving average parameter, θ, is equivalent to 1 - α, where α is the smoothing parameter.

Holt's linear method is equivalent to an ARIMA(0,2,2) model. The moving average parameters are θ1 = 2 - α - αβ and θ2 = α - 1, where α and β are the two smoothing parameters.

Holt-Winters' additive method gives forecasts equivalent to an ARIMA(0,1,s+1)(0,1,0)s model. There are several parameter restrictions because the ARIMA model has s + 1 parameters whereas the Holt-Winters method uses only three parameters.

Holt-Winters' multiplicative method has no equivalent ARIMA model.

Many computer packages use these equivalences to produce prediction intervals for exponential smoothing.

Exercise 10.1 (Example)

Based on quarterly data, a seasonal model of the form shown has been fit to a certain time series. Suppose that θ1 = 0.5, θ2 = 0.25, and σe = 1. Find forecasts for the next four quarters if the data for the last four quarters are as given.
