
HE4020 Econometric Time Series Analysis
Semester 1, 2015-16

LINEAR TIME SERIES MODELS
1 Model Specification
We now look at statistical inference for ARIMA models. Specifically, we look at
- how to estimate the parameters of a specific ARIMA(p, d, q) model;
- how to check the appropriateness of the fitted model and improve it if needed.
Broadly, the strategy will first be to decide on reasonable, but tentative, values of $p$, $d$ and $q$. Next we estimate the $\phi$'s, $\theta$'s and $\sigma_e^2$ for that model. Finally, we look critically at the fitted model to check its adequacy. If the model appears inadequate in some way, we consider the nature of the inadequacy to help us select another model. We then proceed to estimate the new model and check it for adequacy again.
With a few iterations of this model-building strategy, we
hope to arrive at the best possible model for a given se-
ries.
This strategy was popularized by George E. P. Box and G. M. Jenkins and is commonly referred to as the Box-Jenkins method.

1.1 Sample Autocorrelation Function
We have seen that ARMA processes have autocorrelations that exhibit certain patterns. Our aim is to try to recognise, to the extent possible, similar patterns in the sample autocorrelations $r_k$. For an observed series $Y_1, Y_2, \ldots, Y_n$, the sample autocorrelations can be computed by

$$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}, \qquad k = 1, 2, \ldots \tag{1}$$
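As a quick illustration, equation (1) can be computed directly; the following is a minimal Python sketch (the function name is our own):

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag as in equation (1)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    dev = y - y.mean()                      # Y_t - Ybar
    denom = np.sum(dev ** 2)                # sum over t of (Y_t - Ybar)^2
    return np.array([np.sum(dev[k:] * dev[:n - k]) / denom
                     for k in range(1, max_lag + 1)])
```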
However, as the $r_k$ are only estimates of the $\rho_k$, we need to investigate their sampling properties to facilitate the comparison of the estimated correlations with their theoretical counterparts. The sampling properties of $r_k$ are not easy to derive exactly; we have to be content with a general large-sample result.
Suppose
$$Y_t = \mu + \sum_{j=0}^{\infty} \psi_j e_{t-j} \tag{2}$$
where the $e_t$ are iid$(0, \sigma_e^2)$. Further, assume that
$$\sum_{j=0}^{\infty} |\psi_j| < \infty \quad \text{and} \quad \sum_{j=0}^{\infty} \psi_j^2 < \infty \tag{3}$$
These conditions will be satisfied by any stationary ARMA model.
Then, for any fixed $m$, the joint distribution of
$$\sqrt{n}(r_1 - \rho_1),\; \sqrt{n}(r_2 - \rho_2),\; \ldots,\; \sqrt{n}(r_m - \rho_m)$$
approaches, as $n \to \infty$, a joint normal distribution with zero means, variances $c_{jj}$ and covariances $c_{ij}$, where

$$c_{ij} = \sum_{k=-\infty}^{\infty} \left( \rho_{k+i}\,\rho_{k+j} + \rho_{k-i}\,\rho_{k+j} - 2\rho_i\,\rho_k\,\rho_{k+j} - 2\rho_j\,\rho_k\,\rho_{k+i} + 2\rho_i\,\rho_j\,\rho_k^2 \right) \tag{4}$$
For large $n$, we would say that $r_k$ is approximately normally distributed with $E(r_i) \approx \rho_i$, $\operatorname{Var}(r_i) \approx c_{ii}/n$ and $\operatorname{Corr}(r_i, r_j) \approx c_{ij}/\sqrt{c_{ii}\,c_{jj}}$. Note that the approximate variance is inversely proportional to the sample size, but $\operatorname{Corr}(r_i, r_j)$ is approximately constant.
Consider equation (4) for some special cases. Suppose $\{Y_t\}$ is white noise. Then equation (4) reduces considerably and we obtain
$$\operatorname{Var}(r_i) \approx \frac{1}{n} \quad \text{and} \quad \operatorname{Corr}(r_i, r_j) \approx 0, \quad i \neq j \tag{5}$$
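A quick simulation check of equation (5) (a sketch; the sample size and seed are arbitrary, and sample_acf is the helper above): for white noise, roughly 95% of the $r_i$ should fall within $\pm 2/\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
e = rng.standard_normal(n)            # simulated white noise

r = sample_acf(e, max_lag=20)
bound = 2 / np.sqrt(n)                # two approximate standard deviations
print(np.sum(np.abs(r) > bound))      # expect only about 1 of 20 lags outside
```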
Next, suppose $\{Y_t\}$ is generated by an AR(1) process with $\rho_s = \phi^s$ for $s > 0$. Then it can be shown that equation (4) yields
$$\operatorname{Var}(r_i) \approx \frac{1}{n}\left[\frac{(1+\phi^2)(1-\phi^{2i})}{1-\phi^2} - 2i\,\phi^{2i}\right] \tag{6}$$
In particular,
$$\operatorname{Var}(r_1) \approx \frac{1-\phi^2}{n} \tag{7}$$
Notice that the closer $\phi$ is to $\pm 1$, the more precise our estimate of $\rho_1\,(=\phi)$ becomes.
For large lags, the term involving $\phi^{2i}$ in equation (6) may be ignored, and we have
$$\operatorname{Var}(r_i) \approx \frac{1}{n}\,\frac{1+\phi^2}{1-\phi^2} \quad \text{for large } i \tag{8}$$
Notice here, in contrast to equation (7), that values of $\phi$ close to $\pm 1$ imply large variances for $r_i$. Thus we should not expect nearly as precise an estimate of $\rho_i = \phi^i$ for large $i$ as we do for small $i$.
For AR(1), equation (4) can also be simplified for $0 < i < j$ as
$$c_{ij} = \frac{(\phi^{j-i} - \phi^{j+i})(1+\phi^2)}{1-\phi^2} + (j-i)\,\phi^{j-i} - (j+i)\,\phi^{j+i} \tag{9}$$
In particular, we find
$$\operatorname{Corr}(r_1, r_2) \approx 2\phi\sqrt{\frac{1-\phi^2}{1+2\phi^2-3\phi^4}} \tag{10}$$
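Equations (6)-(8) and (10) are straightforward to evaluate numerically. The sketch below (helper name ours) tabulates the approximate standard deviation of $r_i$ for an AR(1) with $\phi = 0.9$, showing the loss of precision at large lags:

```python
import numpy as np

def ar1_var_r(i, phi, n):
    """Approximate Var(r_i) for an AR(1) process, equation (6)."""
    return ((1 + phi**2) * (1 - phi**(2 * i)) / (1 - phi**2)
            - 2 * i * phi**(2 * i)) / n

phi, n = 0.9, 400
for i in (1, 5, 20):
    print(i, np.sqrt(ar1_var_r(i, phi, n)))
# i = 1 recovers equation (7): Var(r_1) ~ (1 - phi^2)/n, small for |phi| near 1;
# large i approaches equation (8): (1 + phi^2)/((1 - phi^2) n), large for |phi| near 1.

# Equation (10): correlation between neighbouring sample autocorrelations
corr_r1_r2 = 2 * phi * np.sqrt((1 - phi**2) / (1 + 2 * phi**2 - 3 * phi**4))
```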
For the MA(1) process, equation (4) simplifies as follows:
$$c_{11} = 1 - 3\rho_1^2 + 4\rho_1^4 \quad \text{and} \quad c_{ii} = 1 + 2\rho_1^2 \ \text{ for } i > 1 \tag{11}$$
$$c_{12} = 2\rho_1(1 - \rho_1^2) \tag{12}$$
For a general MA(q) process, equation (4) reduces to
$$c_{ii} = 1 + 2\sum_{j=1}^{q} \rho_j^2, \quad \text{for } i > q \tag{13}$$
so that
$$\operatorname{Var}(r_i) = \frac{1}{n}\left[1 + 2\sum_{j=1}^{q} \rho_j^2\right], \quad \text{for } i > q \tag{14}$$
For an observed time series, we can replace the $\rho$'s by $r$'s, take the square root, and obtain an estimated standard deviation of $r_i$ for large lags. A test of the hypothesis that the series is MA(q) could be carried out by comparing $r_i$ to plus and minus two such standard deviations. We would reject the null hypothesis if and only if $r_i$ lies outside these bounds.
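A sketch of this test in Python (function name ours, reusing sample_acf from above): estimate the standard deviation from equation (14) with the $\rho$'s replaced by $r$'s, and flag the lags $i > q$ at which $r_i$ falls outside $\pm 2$ standard deviations.

```python
import numpy as np

def ma_q_check(y, q, max_lag=20):
    """Return the lags i > q at which r_i lies outside +/- 2 standard
    deviations under H0 that the series is MA(q); see equation (14)."""
    n = len(y)
    r = sample_acf(y, max_lag)
    se = np.sqrt((1 + 2 * np.sum(r[:q] ** 2)) / n)   # rho's replaced by r's
    lags = np.arange(q + 1, max_lag + 1)
    return lags[np.abs(r[q:]) > 2 * se]              # reject H0 at these lags
```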
In general, we should not expect the sample correlations to mimic the true autocorrelations in great detail. We should not be surprised to see ripples or "trends" in the $r_i$ that have no counterparts in the $\rho_i$.

1.2 Sample Partial Autocorrelation


The partial autocorrelation function can help to determine the order of an AR process. The partial autocorrelation between $Y_t$ and $Y_{t-k}$ is defined as the correlation between $Y_t$ and $Y_{t-k}$ after removing the effect of the intervening variables $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-k+1}$. This is known as the partial autocorrelation at lag $k$ and will be denoted by $\phi_{kk}$.
A general method for finding the pacf of any stationary process with acf $\rho_k$ is to recognise that the $\phi_{kk}$ satisfy the Yule-Walker equations:
$$\rho_j = \phi_{k1}\,\rho_{j-1} + \phi_{k2}\,\rho_{j-2} + \cdots + \phi_{kk}\,\rho_{j-k}, \quad \text{for } j = 1, 2, \ldots, k \tag{15}$$
We can write these $k$ equations more explicitly as
$$\begin{aligned}
\phi_{k1} + \rho_1\,\phi_{k2} + \rho_2\,\phi_{k3} + \cdots + \rho_{k-1}\,\phi_{kk} &= \rho_1 \\
\rho_1\,\phi_{k1} + \phi_{k2} + \rho_1\,\phi_{k3} + \cdots + \rho_{k-2}\,\phi_{kk} &= \rho_2 \\
&\;\;\vdots \\
\rho_{k-1}\,\phi_{k1} + \rho_{k-2}\,\phi_{k2} + \rho_{k-3}\,\phi_{k3} + \cdots + \phi_{kk} &= \rho_k
\end{aligned} \tag{16}$$
Expressing equation (16) in matrix form, we can use Cramer's Rule to solve for $\phi_{kk}$:

$$\phi_{kk} = \frac{\begin{vmatrix} 1 & \rho_1 & \rho_2 & \cdots & \rho_{k-2} & \rho_1 \\ \rho_1 & 1 & \rho_1 & \cdots & \rho_{k-3} & \rho_2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ \rho_{k-1} & \rho_{k-2} & \rho_{k-3} & \cdots & \rho_1 & \rho_k \end{vmatrix}}{\begin{vmatrix} 1 & \rho_1 & \rho_2 & \cdots & \rho_{k-2} & \rho_{k-1} \\ \rho_1 & 1 & \rho_1 & \cdots & \rho_{k-3} & \rho_{k-2} \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ \rho_{k-1} & \rho_{k-2} & \rho_{k-3} & \cdots & \rho_1 & 1 \end{vmatrix}} \tag{17}$$

Replacing the $\rho$'s with $r$'s in equation (17) leads to estimates of the partial autocorrelations.
It has been shown that the partial autocorrelations can be
computed more efficiently as follows:

$$\phi_{kk} = \frac{\rho_k - \sum_{j=1}^{k-1} \phi_{k-1,j}\,\rho_{k-j}}{1 - \sum_{j=1}^{k-1} \phi_{k-1,j}\,\rho_j} \tag{18}$$

where

$$\phi_{k,j} = \phi_{k-1,j} - \phi_{kk}\,\phi_{k-1,k-j} \quad \text{for } j = 1, 2, \ldots, k-1 \tag{19}$$


For example, using $\phi_{11} = \rho_1$ to get started, we have
$$\phi_{22} = \frac{\rho_2 - \phi_{11}\,\rho_1}{1 - \phi_{11}\,\rho_1} = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} \tag{20}$$
with $\phi_{21} = \phi_{11} - \phi_{22}\,\phi_{11}$, which is needed for the next step:
$$\phi_{33} = \frac{\rho_3 - \phi_{21}\,\rho_2 - \phi_{22}\,\rho_1}{1 - \phi_{21}\,\rho_1 - \phi_{22}\,\rho_2} \tag{21}$$
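This recursion is the Durbin-Levinson algorithm; a direct Python transcription of equations (18)-(19) (function name ours), which yields the sample pacf when applied to $r_1, \ldots, r_m$:

```python
import numpy as np

def pacf_from_acf(rho):
    """Durbin-Levinson recursion, equations (18)-(19).
    rho: autocorrelations [rho_1, ..., rho_m] (lag 0 omitted).
    Returns [phi_11, phi_22, ..., phi_mm]."""
    m = len(rho)
    pacf = np.empty(m)
    pacf[0] = rho[0]                                      # phi_11 = rho_1
    phi = np.array([rho[0]])                              # phi_{k-1,j}, j = 1..k-1
    for k in range(2, m + 1):
        num = rho[k - 1] - np.sum(phi * rho[k - 2::-1])   # eq. (18), numerator
        den = 1.0 - np.sum(phi * rho[:k - 1])             # eq. (18), denominator
        pacf[k - 1] = num / den
        phi = np.concatenate([phi - pacf[k - 1] * phi[::-1],  # eq. (19)
                              [pacf[k - 1]]])
    return pacf
```

At $k = 2$ this reproduces equation (20) exactly, and at $k = 3$ equation (21).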

1.3 The Extended ACF


The acf and pacf are not informative in determining the orders of a mixed ARMA model. Tsay and Tiao (JASA, 1984) proposed an approach that uses the extended acf (eacf) to specify the orders of the ARMA process. The basic idea of the eacf is that if the AR part of a mixed ARMA model is known, "filtering out" the autoregression from the observed time series results in a pure MA process that has the cutoff property in its acf.
Consider an ARMA(p, q) model. Since the AR and MA orders are unknown, the eacf is computed for a range of values of $p$ and $q$.
Let
$$W_{t,k,j} = Y_t - \tilde{\phi}_1 Y_{t-1} - \cdots - \tilde{\phi}_k Y_{t-k} \tag{22}$$
be the autoregressive residuals defined with the AR coefficients $\tilde{\phi}_1, \ldots, \tilde{\phi}_k$ estimated iteratively assuming the AR order is $k$ and the MA order is $j$. The sample autocorrelations of $W_{t,k,j}$ are referred to as the extended sample autocorrelations.
For $k = p$ and $j \geq q$, $\{W_{t,k,j}\}$ is approximately an MA(q) model, so that its theoretical autocorrelations at lag $q+1$ or higher are zero.
For $k > p$, an overfitting problem occurs, and this increases the MA order of the $W$ process.
Tsay and Tiao suggested summarizing the information in the sample eacf by a table with the element in the $k$th row and $j$th column equal to the symbol X if the lag $j+1$ sample correlation of $W_{t,k,j}$ is significantly different from 0, and O otherwise.
In such a table, an ARMA(p, q) process will have a theoretical pattern of a triangle of zeroes, with the upper left-hand vertex corresponding to the ARMA orders.
For example, an ARMA(1, 1) process will have a table that looks like this:

        MA
AR   0  1  2  3  4  5  6  7
 0   X  X  X  X  X  X  X  X
 1   X  O  O  O  O  O  O  O
 2   X  X  O  O  O  O  O  O
 3   X  X  X  O  O  O  O  O
 4   X  X  X  X  O  O  O  O
 5   X  X  X  X  X  O  O  O
                                  (23)

In practice, the sample eacf may not show such a clear-cut pattern.
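The eacf is available in R as eacf() in the TSA package. As a rough Python sketch (a simplification: we take the AR coefficients from a full ARMA(k, j) fit rather than Tsay and Tiao's iterative estimates, and use $\pm 2/\sqrt{n}$ as the significance bound):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def eacf_table(y, kmax=5, jmax=7):
    """Simplified eacf table: 'X' if the lag j+1 sample correlation of the
    AR-filtered series W_{t,k,j} (equation (22)) is significant, else 'O'."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    table = []
    for k in range(kmax + 1):
        row = []
        for j in range(jmax + 1):
            phi = ARIMA(y, order=(k, 0, j)).fit().arparams if k > 0 else []
            # W_t = Y_t - phi_1 Y_{t-1} - ... - phi_k Y_{t-k}, per equation (22)
            w = y[k:] - sum(phi[i] * y[k - 1 - i:n - 1 - i] for i in range(k))
            w = w - w.mean()
            r = np.sum(w[j + 1:] * w[:len(w) - j - 1]) / np.sum(w ** 2)
            row.append('X' if abs(r) > 2 / np.sqrt(len(w)) else 'O')
        table.append(''.join(row))
    return table
```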

1.4 Other Specification Methods


Akaike's Information Criterion (AIC) suggests selecting the model that minimizes
$$\mathrm{AIC} = -2\log(\text{maximum likelihood}) + 2k \tag{24}$$
where $k = p + q + 1$ if the model contains an intercept and $k = p + q$ otherwise.
The 2k term in AIC acts as a "penalty function" that
helps to ensure selection of parsimonious models.
An alternative information criterion is the Schwarz Bayesian Information Criterion (BIC), defined as
$$\mathrm{BIC} = -2\log(\text{maximum likelihood}) + k\log(n) \tag{25}$$
The AIC criterion is said to be asymptotically efficient,
while BIC is said to be consistent.
If the true process follows an ARMA(p, q) model, then it is known that the orders specified by minimizing BIC converge to the true orders as the sample size increases.
However, if the true process is not a finite-order ARMA process, then minimizing AIC among an increasingly large class of ARMA models enjoys the appealing property that it will lead to an optimal ARMA model that is closest to the true process among the class of models under study.
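In practice, one computes the criteria over a grid of orders and picks the minimizer. A sketch using statsmodels, whose fitted ARIMA results expose aic and bic attributes:

```python
from statsmodels.tsa.arima.model import ARIMA

def select_arma_order(y, pmax=3, qmax=3, criterion="bic"):
    """Return the (p, q) minimizing AIC or BIC over a grid of ARMA fits."""
    best = None
    for p in range(pmax + 1):
        for q in range(qmax + 1):
            res = ARIMA(y, order=(p, 0, q)).fit()
            value = res.bic if criterion == "bic" else res.aic
            if best is None or value < best[0]:
                best = (value, (p, q))
    return best[1]
```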
Maximum likelihood estimation of ARMA models is prone to numerical problems, due to multimodality of the likelihood function and the problem of overfitting when the AR and MA orders exceed the true orders.
Hannan and Rissanen (Biometrika, 1982) proposed an interesting practical solution to this problem. Their procedure consists of first fitting a high-order AR process, with the order determined by minimizing the AIC.
The second step uses the residuals from the first step as proxies for the unobservable error terms. Thus, an ARMA(k, j) model can be approximately estimated by regressing the time series on its own lags 1 to $k$ together with lags 1 to $j$ of the residuals from the high-order autoregression.
The BIC of this autoregressive model is an estimate of
the BIC obtained with maximum likelihood estimation.
Hannan and Rissanen demonstrated that minimizing the
approximate BIC still leads to consistent estimation of
the ARMA orders.
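A sketch of the two-step Hannan-Rissanen regression (helper names ours; for brevity the long-AR order is fixed rather than AIC-selected, and the BIC is computed only up to an additive constant):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.ar_model import AutoReg

def hannan_rissanen_bic(y, k, j, ar_order=20):
    """Approximate BIC of an ARMA(k, j) model via the Hannan-Rissanen
    two-step regression (assumes k + j >= 1)."""
    y = np.asarray(y, dtype=float)
    ehat = AutoReg(y, lags=ar_order).fit().resid   # step 1: residual proxies
    yy = y[ar_order:]                              # align series with residuals
    m, T = max(k, j), len(yy)
    X = np.column_stack(
        [yy[m - i:T - i] for i in range(1, k + 1)] +    # lags 1..k of the series
        [ehat[m - i:T - i] for i in range(1, j + 1)])   # lags 1..j of residuals
    res = sm.OLS(yy[m:], sm.add_constant(X)).fit()      # step 2
    nobs = T - m
    sigma2 = np.sum(res.resid ** 2) / nobs
    return nobs * np.log(sigma2) + (k + j + 1) * np.log(nobs)
```

Minimizing this quantity over a grid of $(k, j)$ then gives the Hannan-Rissanen order estimates.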
Order determination is related to the problem of finding the subset of nonzero coefficients of an ARMA model with sufficiently high ARMA orders.
A subset ARMA(p, q) model is an ARMA(p, q) model with a subset of its coefficients known to be zero. For example, the model
$$Y_t = 0.8\,Y_{t-12} + e_t + 0.7\,e_{t-12} \tag{26}$$
is a subset ARMA(12, 12) model.
The method of Hannan and Rissanen can be extended to finding an optimal subset ARMA model.

Example
- Simulate a series from $Y_t = 0.8\,Y_{t-12} + e_t + 0.7\,e_{t-12}$
- Search for best subsets of ARMA(14, 14)

[Figure: BIC plot of the best subset ARMA(14, 14) models fitted to the simulated series. Columns are labeled (Intercept), test-lag1 to test-lag14 (lags of the series) and error-lag1 to error-lag14 (lags of the error proxies); each row is one candidate subset model, ordered by BIC (roughly -130 to -140), with shaded cells marking the terms included in that model.]
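The example can be reproduced by simulating from model (26); a minimal simulation sketch (seed, burn-in and sample size are arbitrary choices of ours), whose output could then be fed to a subset search, e.g. the Hannan-Rissanen regression above with candidate lags 1 to 14, or R's armasubsets() in the TSA package:

```python
import numpy as np

rng = np.random.default_rng(1)
n, burn = 120, 240
e = rng.standard_normal(n + burn)
y = np.zeros(n + burn)
for t in range(12, n + burn):
    y[t] = 0.8 * y[t - 12] + e[t] + 0.7 * e[t - 12]   # equation (26)
y = y[burn:]                                          # discard burn-in
```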