PSTAT 174/274 Lecture Notes 6: Model Identification and Estimation


PSTAT 174/274

LECTURE NOTES 6
MODEL IDENTIFICATION AND ESTIMATION

MODEL IDENTIFICATION
• We have learned a large class of linear parametric models for stationary time series processes.
• Now the question is how to find the most suitable model for a given observed series, i.e., how to choose the appropriate model orders p and q.

MODEL IDENTIFICATION
• The ACF and PACF show specific properties for specific models, so we can use them as criteria to identify a suitable model.
• Using the patterns of the sample ACF and sample PACF, we can identify the model, as in the sketch below.
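For instance, here is a minimal sketch in Python (assuming the statsmodels and matplotlib packages are available; the AR(2) series is simulated purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

np.random.seed(0)
# Simulate Y_t = 0.6*Y_{t-1} - 0.3*Y_{t-2} + a_t; the lag-polynomial
# convention requires a leading 1 followed by the negated AR coefficients.
y = arma_generate_sample(ar=[1, -0.6, 0.3], ma=[1], nsample=500)

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y, lags=20, ax=axes[0])    # AR model: ACF tails off gradually
plot_pacf(y, lags=20, ax=axes[1])   # AR(2): PACF cuts off after lag 2
plt.show()
```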

MODEL SELECTION THROUGH CRITERIA
• Besides the sACF and sPACF plots, we also have other tools for model identification.
• With messy real data, sACF and sPACF plots become complicated and harder to interpret.
• Remember to choose the best model with as few parameters as possible.
• We will see that many different models can fit the same data, so we should choose the most appropriate (most parsimonious) one; information criteria will help us decide.

MODEL SELECTION THROUGH CRITERIA
• The three well-known information criteria are
– Akaike's Information Criterion (AIC) (Akaike, 1974)
– Schwarz's Bayesian Criterion (SBC) (Schwarz, 1978), also known as the Bayesian Information Criterion (BIC)
– The Hannan-Quinn Criterion (HQIC) (Hannan & Quinn, 1979)

AIC
• Assume that a statistical model with M parameters is fitted to the data:

$$AIC = -2\ln(\text{maximum likelihood}) + 2M.$$

• For an ARMA model fitted to n observations, the log-likelihood function is

$$\ln L = -\frac{n}{2}\ln\!\left(2\pi\sigma_a^2\right) - \frac{1}{2\sigma_a^2}\,\underbrace{S(\phi,\mu,\theta)}_{\text{residual SS}},$$

assuming $a_t \overset{i.i.d.}{\sim} N(0, \sigma_a^2)$.
AIC
• Then the maximized log-likelihood is

$$\ln\hat{L} = -\frac{n}{2}\ln\hat{\sigma}_a^2 - \underbrace{\frac{n}{2}\left(1 + \ln 2\pi\right)}_{\text{constant}},$$

so that, dropping the constant,

$$AIC = n\ln\hat{\sigma}_a^2 + 2M.$$

• Choose the model (i.e., the value of M) with the minimum AIC, as in the sketch below.
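A hedged sketch of AIC-based order selection, reusing the simulated series y from the earlier sketch. Note that statsmodels reports AIC on the $-2\ln L + 2M$ scale, which differs from $n\ln\hat{\sigma}_a^2 + 2M$ only by a constant, so the model rankings agree:

```python
# Fit all candidate ARMA(p, q) models with p, q <= 2 on the same data
# and keep the minimum-AIC fit. The order grid is an illustrative choice.
import itertools
from statsmodels.tsa.arima.model import ARIMA

best = None
for p, q in itertools.product(range(3), range(3)):
    res = ARIMA(y, order=(p, 0, q)).fit()  # d = 0: stationary ARMA
    if best is None or res.aic < best[0]:
        best = (res.aic, p, q)
print("minimum AIC at ARMA(%d, %d): AIC = %.2f" % (best[1], best[2], best[0]))
```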
SBC
• The Bayesian information criterion (BIC) or
Schwarz Criterion (also SBC, SBIC) is a criterion
for model selection among a class of
parametric models with different numbers of
parameters.
• When estimating model parameters using
maximum likelihood estimation, it is possible to
increase the likelihood by adding additional
parameters, which may result in overfitting.
The BIC resolves this problem by introducing a
penalty term for the number of parameters in
the model.
SBC
• In SBC, the penalty for additional parameters is stronger than that of the AIC:

$$SBC = n\ln\hat{\sigma}_a^2 + M\ln n.$$

• It has superior large-sample properties.
• It is consistent, unbiased, and sufficient.
HQIC
• The Hannan-Quinn information criterion (HQIC) is an alternative to the AIC and SBC:

$$HQIC = n\ln\hat{\sigma}_a^2 + 2M\ln(\ln n).$$

• It can be shown [see Hannan (1980)] that even in the case of common roots in the AR and MA polynomials, the Hannan-Quinn and Schwarz criteria still select the correct orders p and q consistently; a numerical illustration of all three criteria follows.
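As a purely illustrative check, the three criteria can be computed directly from the slide formulas given n, an estimated innovation variance, and M; the numbers below are made up, and packaged implementations may differ by additive constants:

```python
# Compute AIC, SBC/BIC, and HQIC from the formulas on these slides.
import numpy as np

def info_criteria(n, sigma2_hat, M):
    aic = n * np.log(sigma2_hat) + 2 * M
    sbc = n * np.log(sigma2_hat) + M * np.log(n)
    hqic = n * np.log(sigma2_hat) + 2 * M * np.log(np.log(n))
    return aic, sbc, hqic

# Hypothetical values: n = 500 observations, sigma_hat^2 = 1.04, M = 3.
print(info_criteria(500, 1.04, 3))
```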
ESTIMATION
• After specifying the order of a stationary ARMA
process, we need to estimate the parameters.
• We will assume (for now) that:
1. The model order (p and q) is known, and
2. The data has zero mean.
• If (2) is not a reasonable assumption, we can subtract the sample mean $\bar{Y}$ and fit a zero-mean ARMA model $\phi(B)X_t = \theta(B)a_t$, where $X_t = Y_t - \bar{Y}$; then use $X_t + \bar{Y}$ as the model for $Y_t$ (see the sketch below).
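A minimal sketch of this mean-adjustment step (again reusing the illustrative series y and an assumed AR(2) specification):

```python
# Fit a zero-mean ARMA model to the demeaned series, then add the
# sample mean back when reconstructing fitted values for Y_t.
from statsmodels.tsa.arima.model import ARIMA

y_bar = y.mean()                                  # sample mean Y_bar
x = y - y_bar                                     # X_t = Y_t - Y_bar
res = ARIMA(x, order=(2, 0, 0), trend='n').fit()  # trend='n': no constant
fitted_y = res.fittedvalues + y_bar               # back on the Y_t scale
```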
ESTIMATION
– Method of Moment Estimation (MME)
– Ordinary Least Squares (OLS) Estimation
– Maximum Likelihood Estimation (MLE)
– Least Squares Estimation
• Conditional
• Unconditional

THE METHOD OF MOMENT ESTIMATION
• It is also known as Yule-Walker estimation. It is an easy but not efficient estimation method, and it works only for AR models with large n.
• BASIC IDEA: equate sample moments to population moments, and solve these equations to obtain estimators of the unknown parameters:

$$E(Y_t) = \mu \quad\Rightarrow\quad \hat{\mu} = \frac{1}{n}\sum_{t=1}^{n} Y_t = \bar{Y}$$

$$E(Y_t Y_{t-k}) = \gamma_k \quad\Rightarrow\quad \hat{\gamma}_k = \frac{1}{n}\sum_{t=1}^{n} Y_t Y_{t-k}, \quad\text{or, equivalently,}\quad \rho_k \mapsto \hat{\rho}_k$$

A minimal sketch follows.
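A minimal Yule-Walker sketch using statsmodels (the series y and the AR(2) order are assumptions carried over from the earlier sketches):

```python
# Method-of-moments (Yule-Walker) estimates of the AR coefficients,
# computed from the sample autocovariances of the series.
from statsmodels.regression.linear_model import yule_walker

# demean=True (the default) subtracts the sample mean before estimating.
phi_hat, sigma_hat = yule_walker(y, order=2, method='mle')
print("phi_hat:", phi_hat, " sigma_a_hat:", sigma_hat)
```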
n t 1
 k  ˆ k 13
THE MAXIMUM LIKELIHOOD ESTIMATION
• Assume that $a_t \overset{i.i.d.}{\sim} N(0, \sigma_a^2)$.
• By this assumption we can use the joint pdf

$$f(a_1,\ldots,a_n) = f(a_1)\cdots f(a_n)$$

instead of $f(y_1,\ldots,y_n)$, which cannot be written as a product of marginal pdfs because of the dependence between time series observations.
MLE METHOD

• For the general stationary ARMA(p,q) model,

$$\dot{Y}_t = \phi_1\dot{Y}_{t-1} + \cdots + \phi_p\dot{Y}_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$

or

$$a_t = \dot{Y}_t - \phi_1\dot{Y}_{t-1} - \cdots - \phi_p\dot{Y}_{t-p} + \theta_1 a_{t-1} + \cdots + \theta_q a_{t-q},$$

where $\dot{Y}_t = Y_t - \mu$.
MLE
• The joint pdf of $(a_1, a_2, \ldots, a_n)$ is given by

$$f(a_1,\ldots,a_n \mid \phi, \mu, \theta, \sigma_a^2) = \left(2\pi\sigma_a^2\right)^{-n/2}\exp\!\left(-\frac{1}{2\sigma_a^2}\sum_{t=1}^{n} a_t^2\right),$$

where $\phi = (\phi_1,\ldots,\phi_p)$ and $\theta = (\theta_1,\ldots,\theta_q)$.
• Let $Y = (Y_1,\ldots,Y_n)$ and assume that some initial conditions are known.
MLE
• The conditional log-likelihood function is given by

$$\ln L_*(\phi,\mu,\theta,\sigma_a^2) = -\frac{n}{2}\ln\!\left(2\pi\sigma_a^2\right) - \frac{S_*(\phi,\mu,\theta)}{2\sigma_a^2},$$

where

$$S_*(\phi,\mu,\theta) = \sum_{t=1}^{n} a_t^2\!\left(\phi,\mu,\theta \mid Y, Y_*, a_*\right)$$

is the conditional sum of squares, with initial conditions $Y_*$ and $a_*$.
MLE
• Then we can find the estimators $\hat{\phi} = (\hat{\phi}_1,\ldots,\hat{\phi}_p)$, $\hat{\theta} = (\hat{\theta}_1,\ldots,\hat{\theta}_q)$, and $\hat{\mu}$ that maximize the conditional likelihood function. Usually, numerical nonlinear optimization techniques are required. After obtaining all the estimators,

$$\hat{\sigma}_a^2 = \frac{S_*(\hat{\phi},\hat{\mu},\hat{\theta})}{\text{d.f.}},$$

where d.f. = (# of terms used in the SS) − (# of parameters) = (n − p) − (p + q + 1) = n − (2p + q + 1). A fitting sketch follows.
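As a rough illustration, statsmodels fits a stationary ARMA model by numerically maximizing the exact Gaussian likelihood (rather than the conditional one); the series y and the order are again illustrative assumptions:

```python
# Numerical maximum likelihood fit of a stationary ARMA model.
from statsmodels.tsa.arima.model import ARIMA

res = ARIMA(y, order=(2, 0, 0)).fit()
print(res.params)  # estimated phi's, constant, and sigma_a^2
print(res.llf)     # maximized log-likelihood
```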
MLE
• The MLE is asymptotically unbiased, efficient, consistent, and sufficient for large sample sizes, but the joint pdf is hard to work with directly.
