
Chapter 5 - Introduction to Time Series Models

Data science and advanced programming
MSc in Finance - UNIL

Christophe Hurlin

November 17, 2024

Outline of the Chapter

1 Introduction

2 Stationarity

3 Wold decomposition and prediction


Wold decomposition
Optimal forecast

4 Univariate time series models


MA process
AR process
ARMA process

5 The Box-Jenkins modeling approach


Identification tools
Identifying an ARMA Process
Validation tests

6 Bibliography

Section 1

Introduction
1. Introduction

Definition (time series)

A time series is a set of observations {x_1, ..., x_n} generated by a temporal stochastic process {X_t}_{t ∈ Z}.

Notes:
The time elapsed between two observations is assumed to be constant (e.g., daily
data, weekly data, annual data).

Sampling frequency matters.

1. Introduction

Figure: Example of time series: annual GDP growth rate, France (1949-2022)

1. Introduction

Figure: Example of time series: Quarterly inflation, France (1990-2022)

1. Introduction

Figure: Example of time series: Births per month, New York city (Jan 1946 to Dec 1959)

1. Introduction

Definition (data generating process)


The data generating process (DGP) underlying the realizations {xt } is a (real-valued)
discrete time stochastic process, denoted {Xt , t ∈ Z} or simply Xt .

Notes:
1 The DGP is the "true" model that has generated the dataset {x_1, ..., x_n}.

2 In reality we can only observe the time series at a finite number of times, and the sequence of random variables {X_1, ..., X_n} is an n-dimensional random vector.

3 However, it is convenient to allow the number of observations to be infinite. In that case {X_t, t ∈ Z} is called a discrete time stochastic process.

1. Introduction

The term time series refers either to the stochastic process {X_t, t ∈ Ω} or to the set of its realizations {x_1, ..., x_n}:

Time series = the set of observations x_1, ..., x_n

Time series = the stochastic process {X_t, t ∈ Ω}

1. Introduction

Definition (Index Set)


Consider a stochastic process {Xt , t ∈ Ω}. The set Ω is called the index set.

Example
Some examples of index sets are Z = {0, ±1, ±2, . . . }, N = {0, 1, 2, . . . }, etc. The
index set can also be the set of real numbers, R.

1. Introduction

Definition (Time Series Model)


A time series model for observed data {xt } is a mathematical framework that describes
the underlying process generating the data.

Remarks:
Specifically, it defines the joint distribution—or, in some cases, only the means and
covariances—of the sequence of random variables {Xt , t ∈ Z}.
The general goal of time series econometrics is to specify a time series model that
approximates the Data Generating Process (DGP) as closely as possible.
The time series model may differ from the actual DGP, introducing what is known
as model risk.

1. Introduction

Example (time series model)

An example of a time series model is the AutoRegressive (AR) process of order p, or AR(p) for short, defined as:

X_t = α_0 + α_1 X_{t-1} + ... + α_p X_{t-p} + ε_t

where ε_t is an innovation process. This model specifies the conditional mean of {X_t} as:

E(X_t | X̄_{t-1}) = α_0 + α_1 X_{t-1} + ... + α_p X_{t-p}

where X̄_{t-1} = {X_{t-1}, X_{t-2}, ...} denotes the past values of the process {X_t}.

1. Introduction

How to specify a "good" time series model?

1 Study some statistical properties of the observed data {x_t}, for instance stationarity, the patterns of the autocorrelation function (ACF) or the partial autocorrelation function (PACF), etc.

2 Compare these properties to the "theoretical" properties of some typical time series models, e.g. AR, MA, ARIMA, SARIMA, ARFIMA, GARCH, etc.

3 Choose the most appropriate model and estimate its parameters (generally by ML).

4 Use this model for forecasting.

1. Introduction

In addition to the general references provided in the introduction, the following textbooks (Davidson, 2000; Greene, 2007; Hamilton, 1994; Lütkepohl, 2005; Cryer and Chan, 2008; Enders, 2003; Shumway and Stoffer, 2006), specifically dedicated to time series models, may also be useful:

References (theoretical):
Davidson, J. (2000), Econometric Theory, Blackwell Publishers.
Greene, W. (2007), Econometric Analysis, sixth edition, Pearson - Prentice Hall.
Hamilton, J.D. (1994), Time Series Analysis, Princeton University Press.
Lütkepohl, H. (2005), New Introduction to Multiple Time Series Analysis, Springer.

References (applied):
Cryer, J.D. and Chan, K.-S. (2008), Time Series Analysis with Applications in R, Springer.
Enders, W. (2003), Applied Econometric Time Series, Wiley.
Shumway, R.H. and Stoffer, D.S. (2006), Time Series Analysis and Its Applications: With R Examples, Springer.

Section 2

Stationarity

2. Stationarity

Objectives:

1 To define strict stationarity
2 To define weak (second-order) stationarity
3 To define the concept of strict white noise, or IID noise
4 To define the concept of (uncorrelated) white noise
5 To define the concept of martingale difference

2. Stationarity

Fact (Stationarity)
Loosely speaking, a stochastic process is stationary if its statistical properties do not change with time.

2. Stationarity

There exist two definitions of stationarity:

1 Strict stationarity

2 Weak or second-order stationarity
2. Stationarity

Let {X_t, t ∈ Z} be a stochastic process and let F_X(x_{t_1+τ}, ..., x_{t_k+τ}) represent the cdf of the unconditional joint distribution of {X_t} at times t_1+τ, ..., t_k+τ.

Definition (strict stationarity)

The process {X_t, t ∈ Z} is said to be strictly stationary if, for all k and τ, and for all t_1, ..., t_k:

F_X(x_{t_1+τ}, ..., x_{t_k+τ}) = F_X(x_{t_1}, ..., x_{t_k})

Interpretation: the unconditional joint probability distribution remains invariant when shifted in time.
2. Stationarity

Definition (weak or second-order stationarity)

The time series {X_t, t ∈ Z} is said to be (weakly) stationary if:
• ∀t ∈ Z, E(X_t²) < ∞
• ∀t ∈ Z, E(X_t) = µ
• ∀(t, h) ∈ Z², Cov(X_t, X_{t-h}) = γ(h) does not depend on t.
2. Stationarity

Remarks:

1 By default, we consider second-order or weak stationarity, i.e. we assume that the first two moments of {X_t, t ∈ Z} are constant over time:

E(X_t) = µ    Cov(X_t, X_{t-h}) = γ(h)    ∀t ∈ Z

2 The condition Cov(X_t, X_{t-h}) = γ(h) implies that the variance of {X_t, t ∈ Z} is constant over time:

V(X_t) = Cov(X_t, X_t) = γ(0)    ∀t ∈ Z

3 The condition on Cov(X_t, X_{t-h}) can be interpreted as "the covariance does not change when shifted in time":

Cov(X_r, X_s) = Cov(X_{r+t}, X_{s+t})    ∀(t, r, s) ∈ Z³
2. Stationarity

Figure: Simulation of stationary and non-stationary AR(1) processes

2. Stationarity

Fact (Stationarity)
In finance, asset prices are generally non-stationary, whereas returns are stationary.

The prices of an asset recorded over time are often non-stationary, due for instance to productivity growth, financial crises, etc.

However, returns typically fluctuate around a constant level, suggesting a constant mean over time.
2. Stationarity

Figure: Daily closing prices for the S&P500 index are non stationary

2. Stationarity

Figure: Daily returns for the S&P500 index are stationary

2. Stationarity

Two particular stationary processes are:

1 The white-noise processes

2 The martingale difference
2. Stationarity

Definition (Strict White Noise)

A process {ε_t, t ∈ Z} is said to be a strict white noise, denoted:

ε_t ∼ i.i.d.(0, σ²)

if the random variables ε_t are independent and identically distributed with E(ε_t) = 0 and V(ε_t) = σ², for all t ∈ Z.

Note: In signal processing, white noise refers to a random signal with equal intensity across different frequencies, analogous to white light, which contains all visible wavelengths.
2. Stationarity

Remarks:

1 A strict white noise contains no trend or seasonal components, and there is no dependence (linear or nonlinear) between observations:

ε_t ∼ i.i.d.(0, σ²) =⇒ ε_t is independent from ε_{t-s}    ∀s ∈ Z

2 The sequence {ε_t} is called a purely random process, IID noise, or simply strict white noise.
2. Stationarity

Definition (white noise)

A process {ε_t, t ∈ Z} is said to be an (uncorrelated) white noise, written:

ε_t ∼ WN(0, σ²)

if the random variables ε_t and ε_s are uncorrelated for t ≠ s, with E(ε_t) = 0 and V(ε_t) = σ², ∀t ∈ Z.

Note. By definition, we have:

ε_t ∼ IID(0, σ²) =⇒ ε_t ∼ WN(0, σ²)

but the reverse is not true.
2. Stationarity

Definition (Gaussian white noise)

A process {ε_t, t ∈ Z} is said to be a Gaussian white noise, written:

ε_t ∼ i.i.d. N(0, σ²)    or    ε_t ∼ IID N(0, σ²)

if the random variables ε_t are independent and have a normal distribution with E(ε_t) = 0 and V(ε_t) = σ², ∀t ∈ Z.

Note: For a normal distribution, zero correlation implies independence, so that a Gaussian white noise is also a strict white noise.
2. Stationarity

White noise and stationarity:

By definition, a white noise (strict or weak) is a stationary process since:

E(ε_t²) = σ² < ∞

E(ε_t) = 0, ∀t ∈ Z

Cov(ε_t, ε_{t-h}) = σ² if h = 0, and 0 otherwise: it does not depend on t.
2. Stationarity

Figure: Simulation of a Gaussian white noise with σ2 = 1.
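A minimal MATLAB sketch of how such a simulation can be produced (the seed, sample size, and plot styling are illustrative choices, not taken from the slides):

% Sketch: simulate and plot a Gaussian white noise with sigma^2 = 1
rng(1234);                       % fix the seed for reproducibility
n = 500;                         % sample size
eps = randn(n, 1);               % i.i.d. N(0,1) draws
figure;
plot(1:n, eps, 'b', 'LineWidth', 1);
xlabel('Time'); ylabel('\epsilon_t');
title('Simulated Gaussian white noise (\sigma^2 = 1)');
grid on;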

2. Stationarity

Definition (Martingale)
A process {X_t, t ∈ Z} is called a martingale if:

E(X_{t+1} | X̄_t) = X_t

where X̄_t = {X_t, X_{t-1}, ...} is the information set available at time t, including X_t.

Note: If X_t represents an asset's price at time t, then the martingale property implies that the expected price tomorrow is equal to today's price, given the information set containing the asset's price history.
2. Stationarity

Definition (Martingale Difference)

A process {Y_t, t ∈ Z}, defined as the first difference of a martingale X_t, is called a martingale difference and is given by:

Y_t = X_t − X_{t-1}

The main property of a martingale difference process is that:

E(Y_{t+1} | Ȳ_t) = 0

where Ȳ_t denotes the information set available at time t.
Remarks:

We have:

E(Y_{t+1} | X̄_t) = E(X_{t+1} − X_t | X̄_t) = E(X_{t+1} | X̄_t) − E(X_t | X̄_t) = X_t − X_t = 0

The martingale difference process implies that, conditional on the asset's price history, the expected changes in the asset's price are zero.

In this sense, the information X̄_t contained in past prices is fully reflected in the asset's current price, making it ineffective for predicting rates of return.
2. Stationarity

Remarks:

A martingale difference is similar to an (uncorrelated) white noise, except that it need not have a constant conditional variance and its conditional mean is zero.

(Uncorrelated) white noise and martingale differences have a constant mean and zero autocorrelations. Note that the definitions do not specify the nonlinear properties of such sequences.

A martingale difference with a conditional mean equal to zero and a constant variance:

E(Y_{t+1} | Ȳ_t) = 0    V(Y_{t+1}) = σ²

is called a homoscedastic martingale difference.

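To make the distinction concrete, here is an illustrative MATLAB sketch (not from the slides) of an ARCH(1)-type process: it is a martingale difference and a weak white noise, but its conditional variance changes over time, so it is not an IID noise. The parameter values are arbitrary.

% Illustrative sketch: ARCH(1)-type martingale difference
rng(1);
n = 1000;
omega = 0.2; alpha = 0.7;            % arbitrary ARCH(1) parameters
z = randn(n, 1);                     % i.i.d. N(0,1) innovations
eps = zeros(n, 1);
sig2 = omega / (1 - alpha);          % start at the unconditional variance
for t = 1:n
    eps(t) = sqrt(sig2) * z(t);          % E(eps_t | past) = 0
    sig2 = omega + alpha * eps(t)^2;     % conditional variance for date t+1
end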
2. Stationarity

Summary

| Name | Notation | Properties |
| IID noise | ε_t ∼ IID(0, σ²) | No dependencies (linear or nonlinear) with past/future values. Constant mean and variance. |
| White noise | ε_t ∼ WN(0, σ²) | No correlation with past/future values. Constant mean and variance. |
| Gaussian WN | ε_t ∼ i.i.d. N(0, σ²) | No dependencies (linear or nonlinear) with past/future values. ε_t has a normal distribution. |
| Martingale diff. | E(ε_t | ε̄_{t-1}) = 0 | No correlation with past values. Conditional mean equal to 0, no constraint on the conditional variance. |
2. Stationarity

Key Concepts

1 Strict stationarity
2 (Weak) stationarity
3 IID noise or strict white noise
4 Uncorrelated white noise or white noise
5 Gaussian white noise
6 Martingale and martingale difference

Section 3

Wold decomposition and prediction

3. Wold decomposition and prediction

Objectives:

1 To define the Wold decomposition
2 To introduce the notion of optimal forecast
3 To define the innovation process
4 To introduce the lag operator
5 To define lag polynomials
3. Wold decomposition and prediction

Wold’s Theorem (Wold, 1938, 1954) is a foundational result in time series analysis.
It establishes that any weakly stationary stochastic process can be decomposed into a
deterministic and a purely stochastic component.

Wold, H. (1938), A Study in the Analysis of Stationary Time Series. Almqvist and Wiksell.
Wold, H. (1954) A Study in the Analysis of Stationary Time Series, Second revised edition.

3. Wold decomposition and prediction

Theorem (Wold decomposition)

Any weakly stationary time series {X_t, t ∈ Z} can be represented as a Wold decomposition, given by:

X_t = µ + Σ_{j=0}^∞ ψ_j ε_{t-j} = µ + ε_t + ψ_1 ε_{t-1} + ψ_2 ε_{t-2} + ...

where the parameters ψ_j satisfy ψ_0 = 1, ψ_j ∈ R ∀j ∈ N*, Σ_{j=0}^∞ ψ_j² < ∞, ε_t ∼ WN(0, σ²) is a white noise process, and µ = E(X_t) denotes the mean of X_t.
3. Wold decomposition and prediction

Remarks:

1 This representation only exploits the covariance stationarity property: neither a distributional assumption nor the independence of the error terms is required.

2 The Wold representation can also be written as:

X_t = Σ_{j=0}^∞ ψ_j ε_{t-j} + µ_t

where µ_t denotes the deterministic linear component, such that Cov(µ_t, ε_s) = 0, ∀(t, s) ∈ Z².

3 The condition ψ_0 = 1 is a normalization of the variance of the white noise process.
3. Wold decomposition and prediction

Example (normalization of the variance of the white noise)

Consider the following process:

X_t = µ + Σ_{j=0}^∞ ψ̃_j v_{t-j} = µ + (1/2) v_t + (1/2)² v_{t-1} + (1/2)³ v_{t-2} + ...

with v_t ∼ WN(0, σ_v²) and σ_v² = 1. It is possible to normalize the variance of the white noise process such that the first parameter ψ_0 is equal to one. Define ε_t such that:

ε_t = (1/2) v_t ∼ WN(0, σ_ε²)

with σ_ε² = 1/4. The process {X_t, t ∈ Z} can be rewritten as:

X_t = µ + Σ_{j=0}^∞ ψ_j ε_{t-j} = µ + ε_t + (1/2) ε_{t-1} + (1/2)² ε_{t-2} + (1/2)³ ε_{t-3} + ...
Exercise: MATLAB Code

Exercise: Approximation of the Wold decomposition

Consider the annual French GDP growth rate Y_t for the period 1961-2017 (source: World Bank national accounts data), and generate a Gaussian white noise ε_t ∼ i.i.d. N(0, σ²) with σ² = 1. Questions: (1) estimate the parameters of the following model (without normalization on ψ_0):

Y_t = µ + ψ_0 ε_t + ψ_1 ε_{t-1} + ... + ψ_20 ε_{t-20} + v_t

where v_t is an error term, and (2) evaluate the goodness of fit. The data are available in the file GDP growth-rate.xlsx.
Solution: MATLAB code - Part 1

clear; clc; close all; warning off

% Data importation
data = readtable('Inflation_France.xlsx');
Y = data.Inflation;            % Inflation series
time_Y = data.Date;            % Define time vector
n = length(Y);                 % Sample size

%==========================
%=== Wold approximation ===
%==========================
rng(1234);                     % Fix the random seed for reproducibility
eps = normrnd(0, 1, n+50, 1);  % Gaussian white noise with mean 0, std 1

% Create matrix of lagged values of eps for the first 50 lags
lagged_eps = zeros(n, 50);
for j = 1:50
    lagged_eps(:, j) = eps(51 - j:end - j); % Shifting to create lagged series
end

% Convert to table format for fitlm
T = array2table(lagged_eps, 'VariableNames', strcat('Lag', string(1:50)));
Solution: MATLAB code - Part 2

% Regression of Y on the lagged values of eps using fitlm
lm = fitlm(T, Y);   % all variables in T are used as regressors

% Display summary of the linear model, including coefficients and R-squared
disp(lm);

% Plot original series and approximation
figure;
plot(time_Y, Y, 'b', 'LineWidth', 1); hold on;
Y_hat = predict(lm, T);   % Fitted values from the regression
plot(time_Y, Y_hat, 'r', 'LineWidth', 1);
xlabel('Time');
ylabel('Inflation');
legend('Actual Inflation', 'Wold Approximation', 'Location', 'Best');
title('Wold Theorem Approximation of Inflation Series');
grid on;
Solution: MATLAB Code - Part 3

Figure: Wold Approximation

Solution: MATLAB Code - Part 4

Figure: Wold Approximation

3. Wold decomposition and prediction

It is possible to rewrite the Wold decomposition by introducing a lag operator.

Definition (lag operator)

The lag operator (or backshift operator), denoted L or B, is defined by:

L X_t = X_{t-1}    ∀t ∈ Z

where {X_t, t ∈ Z} is a time series process.
3. Wold decomposition and prediction

Properties of the lag operator

Property 1. L^j X_t = X_{t-j}, ∀j ∈ Z, and in particular L^0 X_t = X_t.

Property 2. If X_t = c, ∀t ∈ Z with c ∈ R, then L^j X_t = L^j c = c, ∀j ∈ Z.

Property 3. L^i (L^j X_t) = L^{i+j} X_t = X_{t-i-j}, ∀(i, j) ∈ Z².

Property 4. L^{-i} X_t = X_{t+i}, ∀i ∈ Z.

Property 5. (L^i + L^j) X_t = L^i X_t + L^j X_t = X_{t-i} + X_{t-j}, ∀(i, j) ∈ Z².
3. Wold decomposition and prediction

Example (Properties of the lag operator)

Consider a time series {X_t, t ∈ Z}; then we have:

5L² X_t = 5X_{t-2}

3L³(L² X_t) = 3L⁵ X_t = 3X_{t-5}

L^{-1} X_t = X_{t+1}

(L² + L³) X_t = X_{t-2} + X_{t-3}
3. Wold decomposition and prediction

Definition (lag polynomial)

A lag polynomial is a polynomial in the lag operator.

Example (lag polynomial)

Consider the lag polynomial given by:

Θ(L) = 1 − 2L + 3L²

and a time series {X_t, t ∈ Z}. Then, we have:

Θ(L) X_t = (1 − 2L + 3L²) X_t = X_t − 2X_{t-1} + 3X_{t-2}
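As a quick illustration, the lag polynomial above can be applied to a series in MATLAB with lagmatrix (a sketch; the series X is arbitrary):

% Sketch: apply Theta(L) = 1 - 2L + 3L^2 to a series X
X = randn(100, 1);                     % any time series
Xlag = lagmatrix(X, [1 2]);            % columns: X_{t-1}, X_{t-2}
Y = X - 2*Xlag(:,1) + 3*Xlag(:,2);     % Theta(L) X_t (NaN for t = 1, 2)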
3. Wold decomposition and prediction

Example (lag polynomial)

Consider the (infinite) lag polynomial given by:

Ψ(L) = Σ_{j=0}^∞ ψ_j L^j = ψ_0 + ψ_1 L + ψ_2 L² + ψ_3 L³ + ...

and a time series {X_t, t ∈ Z}. Then, we have:

Ψ(L) X_t = Σ_{j=0}^∞ ψ_j L^j X_t = Σ_{j=0}^∞ ψ_j X_{t-j} = ψ_0 X_t + ψ_1 X_{t-1} + ψ_2 X_{t-2} + ...
3. Wold decomposition and prediction

Theorem (Wold decomposition)

Any stationary time series {X_t, t ∈ Z} can be represented as a Wold decomposition:

X_t = Ψ(L) ε_t + µ

with Ψ(L) = Σ_{j=0}^∞ ψ_j L^j, where ψ_0 = 1, ψ_j ∈ R ∀j ∈ N*, Σ_{j=0}^∞ ψ_j² < ∞, ε_t ∼ WN(0, σ²) is a white noise process, and µ = E(X_t) denotes the mean of X_t.
3. Wold decomposition and prediction

Fact (Wold decomposition in practice)

While the Wold decomposition provides a theoretical foundation for modeling stationary time series, its practical application is limited:
Using a large number of lagged terms introduces too many parameters.
It often results in non-significant coefficients and poor out-of-sample prediction accuracy.
Consequently, simpler models (with fewer parameters), like ARMA, are preferred for effective forecasting.
Solution: MATLAB Code - Part 5

Figure: Estimates of the First Parameters of the Wold Approximation

3. Wold decomposition and prediction

Forecasting:

Suppose we are interested in forecasting the value of Y_{t+1} based on a set of variables X_t observed at date t.

For instance, we might want to forecast Y_{t+1} based on its m most recent values. In this case, X_t would consist of a constant plus Y_t, Y_{t-1}, ..., Y_{t-m+1}.

Let Ŷ_{t+1|t} denote a forecast of Y_{t+1} based on X_t.

To evaluate the usefulness of this forecast, we need to specify a loss function.
3. Wold decomposition and prediction

Definition (mean squared error and optimal forecast)

The mean squared error (MSE) associated with the forecast Ŷ_{t+1|t} is defined as:

MSE(Ŷ_{t+1|t}) = E[(Y_{t+1} − Ŷ_{t+1|t})²]

The optimal forecast, i.e. the one with the smallest MSE, is the expectation of Y_{t+1} conditional on the past values X̄_t = {X_t, X_{t-1}, ...}:

Ŷ*_{t+1|t} = E(Y_{t+1} | X̄_t)
3. Wold decomposition and prediction

Forecasts and Wold decomposition:

Any (weakly) stationary time series {X_t, t ∈ Z} can be represented in the form:

X_{t+1} = µ + Σ_{j=0}^∞ ψ_j ε_{t+1-j} = µ + ε_{t+1} + ψ_1 ε_t + ψ_2 ε_{t-1} + ψ_3 ε_{t-2} + ...

Let X̂_{t+1|t} denote a forecast of X_{t+1} based on the past values X̄_t = {X_t, X_{t-1}, ...}:

X̂_{t+1|t} = E(X_{t+1} | X̄_t) = E(X_{t+1} | ε̄_t)

since X_t depends only on the current and past values of ε_t, with ε̄_t = {ε_t, ε_{t-1}, ...}. Then:

X̂_{t+1|t} = E(X_{t+1} | ε̄_t)
          = µ + E(ε_{t+1} | ε̄_t) + ψ_1 E(ε_t | ε̄_t) + ψ_2 E(ε_{t-1} | ε̄_t) + ψ_3 E(ε_{t-2} | ε̄_t) + ...
          = µ + 0 + ψ_1 ε_t + ψ_2 ε_{t-1} + ψ_3 ε_{t-2} + ...
          = µ + Σ_{j=1}^∞ ψ_j ε_{t+1-j}
3. Wold decomposition and prediction

Definition (Wold decomposition and optimal forecast)

The optimal forecast X̂_{t+1|t} of X_{t+1} based on the Wold decomposition is given by:

X̂_{t+1|t} = E(X_{t+1} | X̄_t) = µ + Σ_{j=1}^∞ ψ_j ε_{t+1-j}

The corresponding forecast error is defined by:

X_{t+1} − X̂_{t+1|t} = ε_{t+1}

Notes:
1 ε_{t+1} is a (weak) white noise process. Said differently, ε_{t+1} is the new information that appears at time t+1 and that was not predictable at time t.

2 Note that E(ε_{t+1}) = 0 and E(ε_{t+1} X_{t-k}) = Cov(ε_{t+1}, X_{t-k}) = 0 for k ≥ 0. This is like an "exogeneity assumption".
3. Wold decomposition and prediction

Definition (innovation process)

The innovation process of {X_t, t ∈ Z} is defined as:

ε_t = X_t − E(X_t | X̄_{t-1})

where the optimal forecast of X_t given the information available at time t−1, denoted X̄_{t-1} = {X_{t-1}, X_{t-2}, ...}, is:

X̂_{t|t-1} = E(X_t | X̄_{t-1})
3. Wold decomposition and prediction

Key Concepts

1 Wold decomposition
2 Optimal forecast
3 Loss function
4 Innovation process
5 Lag operator
6 Lag polynomial

Section 4

Univariate time series models

4. Univariate time series models

Objectives

1 To define the moving average (MA) process
2 To define the autoregressive (AR) process
3 To define the autoregressive moving average (ARMA) process
4 To introduce the invertibility and stationarity conditions
4. Univariate time series models

Some time series models are particularly useful for empirical applications:

1 The moving average (MA) model

2 The autoregressive (AR) model

3 The mixed autoregressive moving average (ARMA) model

Subsection 4.1

MA process

4. Univariate time series models

Definition (moving average - MA process)

The process {X_t, t ∈ Z} is said to be a moving average of order q, or an MA(q) process, if:

X_t = c + ε_t + θ_1 ε_{t-1} + ... + θ_q ε_{t-q}

where θ_1, ..., θ_q are constants and ε_t ∼ WN(0, σ²) is a white noise process.

Note: by definition, c = E(X_t).
4. Univariate Time Series Models

Example (MA Processes)

Let ε_t be a white noise process with E(ε_t) = 0 and V(ε_t) = σ².
The process {X_t, t ∈ Z} is an MA(1) process with zero mean if:

X_t = ε_t + θ_1 ε_{t-1}

The process {Z_t, t ∈ Z} is an MA(3) process if:

Z_t = c + ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2} + θ_3 ε_{t-3}

Similarly, the process {Y_t, t ∈ Z} is also an MA(3) process, defined as:

Y_t = c + ε_t + θ_3 ε_{t-3}

even if the parameters θ_1 and θ_2 are zero.
4. Univariate Time Series Models

Figure: Example of MA estimation output: Eviews

4. Univariate Time Series Models

Figure: Example of MA estimation output: Eviews

4. Univariate time series models

The MA(q) process can be rewritten using a lag polynomial.

Definition (moving average - MA process)

The process {X_t, t ∈ Z} is said to be an MA(q) process if:

X_t = c + Θ(L) ε_t

where the lag polynomial Θ(L) is defined by:

Θ(L) = Σ_{j=0}^q θ_j L^j

with θ_j ∈ R ∀j < q, θ_0 = 1, θ_q ∈ R*, and ε_t ∼ WN(0, σ²) is a white noise process.
4. Univariate time series models

Example (MA processes)

Consider the following MA processes:

X_t = ε_t + 0.5 ε_{t-1}

Z_t = 1 − 0.8 ε_{t-3} + 1.2 ε_{t-2} + ε_t

Y_t = −0.5 + ε_t − 0.6 ε_{t-3}

Question: write the lag polynomials associated with these MA processes.
4. Univariate time series models

Solution
First MA process:
X_t = ε_t + 0.5 ε_{t-1}
X_t = Θ(L) ε_t with Θ(L) = 1 + 0.5L

Second MA process:
Z_t = 1 − 0.8 ε_{t-3} + 1.2 ε_{t-2} + ε_t
Z_t = c + Θ(L) ε_t with Θ(L) = 1 + 1.2L² − 0.8L³ and c = 1

Third MA process:
Y_t = −0.5 + ε_t − 0.6 ε_{t-3}
Y_t = c + Θ(L) ε_t with Θ(L) = 1 − 0.6L³ and c = −0.5
Exercise: MATLAB Code

Exercise: Estimation of an MA(3) process on the French GDP growth rate

We consider the annual GDP growth rate X_t for France over the period 1950-2022 and we want to estimate an MA(3) model:

X_t = µ + ε_t + ψ_1 ε_{t-1} + ψ_2 ε_{t-2} + ψ_3 ε_{t-3}

where ε_t is a (weak) white noise with E(ε_t) = 0 and V(ε_t) = σ².

Import the data from the file GDP France.xlsx and compute the GDP growth rate.
Estimate the parameters µ, ψ_1, ψ_2, ψ_3, and σ by MLE.
Compute the in-sample forecasts of X_t based on the MA(3) model.
Solution: MATLAB code - Part 1

Matlab Code: estimation of an MA(3)

clear; clc; close all; warning off;

% Data importation
data = readtable('GDP_France.xlsx');
GDP = data.GDP;                                     % GDP series
X = (GDP - lagmatrix(GDP,1)) ./ lagmatrix(GDP,1);   % GDP growth rate
time = (1949:1:2022)';                              % Define time vector

% Remove missing values resulting from the lag
valid = ~isnan(X);   % build the mask before filtering, so X and time stay aligned
X = X(valid);
time = time(valid);

% Moving Average (MA) model
disp('MA(3) Model:');
model_ma3 = arima('Constant', NaN, 'MALags', [1, 2, 3]);   % Specify MA(3) model with constant
est_ma3 = estimate(model_ma3, X);
disp(est_ma3);
Solution: MATLAB code - Part 2

% Extract parameters
c = est_ma3.Constant;        % Constant term
theta1 = est_ma3.MA{1};      % MA(1) coefficient
theta2 = est_ma3.MA{2};      % MA(2) coefficient
theta3 = est_ma3.MA{3};      % MA(3) coefficient
sigma2 = est_ma3.Variance;   % Variance of residuals

% Compute residuals
[residuals, ~] = infer(est_ma3, X);

% Compute fitted values
fitted_values = nan(size(X));   % Initialize fitted values
for t = 4:length(X)             % Start from t = 4 since lags up to t-3 are required
    fitted_values(t) = c + theta1 * residuals(t-1) + ...
                       theta2 * residuals(t-2) + ...
                       theta3 * residuals(t-3);   % Compute fitted value
end

% Plot the original series and fitted values
figure;
plot(time, X, 'b-', 'LineWidth', 1.2);                % Original series
hold on;
plot(time, fitted_values, 'r--', 'LineWidth', 1.5);   % Fitted values
xlabel('Year');
ylabel('Growth Rate');
title('GDP Growth Rate and Fitted Values (MA(3))');
legend('Observed', 'Fitted', 'Location', 'Best');
grid on;
hold off;
Solution: MATLAB Code - Part 3

Figure: Estimation of MA(3) process on the French GDP growth rate

Solution: MATLAB Code - Part 4

Figure: Estimation of MA(3) process on the French GDP growth rate

4. Univariate time series models

Remark: Wold decomposition

The Wold decomposition is an MA process of infinite order, denoted MA(∞):

X_t = Ψ(L) ε_t + µ

with:

Ψ(L) = Σ_{j=0}^∞ ψ_j L^j

ψ_0 = 1    Σ_{j=0}^∞ ψ_j² < ∞

ε_t ∼ WN(0, σ²)
4. Univariate time series models

Stationarity of an MA(q) process

1 An MA(q) process is a weighted sum of q lagged values of a white noise, which is a stationary process.

2 By definition, an MA(q) process is always stationary, whatever the values of the parameters θ_1, ..., θ_q.
4. Univariate time series models

Example (simulation of MA processes)

Consider a Gaussian white noise ε_t ∼ i.i.d. N(0, 1) and:

X_t = ε_t + 2 ε_{t-3}

Y_t = 4 + ε_t − 0.8 ε_{t-1}

where X_t is an MA(3) process with E(X_t) = 0, and Y_t is an MA(1) process with E(Y_t) = 4. Question: simulate a sample of size n = 500 of the two MA processes.
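One possible MATLAB sketch for this simulation (drawing a few extra noise values to initialize the lags; seed and plot styling are arbitrary):

% Sketch: simulate the MA(3) and MA(1) processes of the example
rng(1234);
n = 500;
eps = randn(n + 3, 1);                     % Gaussian white noise, 3 extra draws for the lags
X = eps(4:end) + 2 * eps(1:end-3);         % MA(3): X_t = eps_t + 2*eps_{t-3}
Y = 4 + eps(4:end) - 0.8 * eps(3:end-1);   % MA(1): Y_t = 4 + eps_t - 0.8*eps_{t-1}
figure;
subplot(2,1,1); plot(X); title('MA(3): X_t = \epsilon_t + 2\epsilon_{t-3}');
subplot(2,1,2); plot(Y); title('MA(1): Y_t = 4 + \epsilon_t - 0.8\epsilon_{t-1}');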
4. Univariate time series models

Figure: Simulation of MA(1) and MA(3) processes

Subsection 4.2

AR process

4. Univariate time series models

Definition (AutoRegressive - AR process)

The process {X_t, t ∈ Z} is said to be an autoregressive process of order p, or briefly an AR(p) process, if:

X_t = c + ϕ_1 X_{t-1} + ... + ϕ_p X_{t-p} + ε_t

where ϕ_1, ..., ϕ_p are constants and ε_t ∼ WN(0, σ²) is a white noise process.

Property: The mean of a stationary AR(p) process X_t is given by:

E(X_t) = c / (1 − ϕ_1 − ... − ϕ_p)
4. Univariate Time Series Models

Example (AR Processes)

Let ε_t be a white noise process with E(ε_t) = 0 and V(ε_t) = σ².
The process {X_t, t ∈ Z} is an AR(1) process with zero mean if:

X_t = ϕ_1 X_{t-1} + ε_t

The process {Z_t, t ∈ Z} is an AR(3) process if:

Z_t = c + ϕ_1 Z_{t-1} + ϕ_2 Z_{t-2} + ϕ_3 Z_{t-3} + ε_t

The process {Y_t, t ∈ Z} is also an AR(3) process, defined as:

Y_t = c + ϕ_3 Y_{t-3} + ε_t

even when the parameters ϕ_1 and ϕ_2 are zero.
4. Univariate Time Series Models

Figure: Example of AR estimation output: Eviews

4. Univariate Time Series Models

Figure: Example of AR estimation output: Eviews

4. Univariate time series models

The AR(p) process can be rewritten using a lag polynomial.

Definition (AutoRegressive - AR process)

The process {X_t, t ∈ Z} is said to be an AR(p) process if:

Φ(L) X_t = c + ε_t

where the lag polynomial Φ(L) is defined by:

Φ(L) = 1 − Σ_{j=1}^p ϕ_j L^j

with ϕ_j ∈ R ∀j < p, ϕ_p ∈ R*, and ε_t ∼ WN(0, σ²) is a white noise process.
4. Univariate Time Series Models

Definition (Mean of an AR Process)

The mean of the process X_t is given by:

E(X_t) = c Φ(1)^{-1}

Note: The notation Φ(1), Ψ(1), or Θ(1) corresponds to the respective polynomial evaluated at L = 1, i.e. the sum of its coefficients. For instance:

Φ(1) = 1 − Σ_{j=1}^p ϕ_j = 1 − ϕ_1 − ... − ϕ_p

Thus, we have:

E(X_t) = c Φ(1)^{-1} = c / (1 − ϕ_1 − ... − ϕ_p)
4. Univariate time series models

Example (AR processes)

Consider the following AR processes:

X_t = 0.5 X_{t-1} + ε_t

Z_t = 1 − 0.8 Z_{t-1} + 1.2 Z_{t-2} + ε_t

Y_t = −0.5 − 0.6 Y_{t-2} + ε_t

Question: write the lag polynomials associated with these AR processes.
4. Univariate time series models

Solution:
First AR process:
X_t = 0.5 X_{t-1} + ε_t
Φ(L) X_t = ε_t with Φ(L) = 1 − 0.5L

Second AR process:
Z_t = 1 − 0.8 Z_{t-1} + 1.2 Z_{t-2} + ε_t
Φ(L) Z_t = c + ε_t with Φ(L) = 1 + 0.8L − 1.2L² and c = 1

Third AR process:
Y_t = −0.5 − 0.6 Y_{t-2} + ε_t
Φ(L) Y_t = c + ε_t with Φ(L) = 1 + 0.6L² and c = −0.5
Exercise: MATLAB Code

Exercise: Estimation of an AR(2) process on the French GDP growth rate

Consider the annual GDP growth rate X_t for France over the period 1950-2022 and estimate an AR(2) model:

X_t = µ + θ_1 X_{t-1} + θ_2 X_{t-2} + ε_t

where ε_t is a (weak) white noise with E(ε_t) = 0 and V(ε_t) = σ².

Import the data from the file GDP France.xlsx and compute the GDP growth rate.
Estimate the parameters µ, θ_1, θ_2, and σ by MLE.
Compute the in-sample forecasts of X_t based on the AR(2) model.
Solution: MATLAB code - Part 1

Matlab Code: estimation of an AR(2)

clear; clc; close all; warning off;

% Data importation
data = readtable('GDP_France.xlsx');
GDP = data.GDP;                                     % GDP series
X = (GDP - lagmatrix(GDP,1)) ./ lagmatrix(GDP,1);   % GDP growth rate
time = (1949:1:2022)';                              % Define time vector

% Remove missing values resulting from the lag
valid = ~isnan(X);   % build the mask before filtering, so X and time stay aligned
X = X(valid);
time = time(valid);

% Estimate AR(2) model
disp('Estimating AR(2) model...');
model_ar2 = arima('ARLags', [1, 2]);   % Specify AR(2) model
est_ar2 = estimate(model_ar2, X);      % Estimate model parameters
disp(est_ar2);
Solution: MATLAB code - Part 2

% Compute fitted values
fitted_values = nan(size(X));         % Initialize fitted values
[residuals, ~] = infer(est_ar2, X);   % Compute residuals
phi1 = est_ar2.AR{1};                 % AR(1) coefficient
phi2 = est_ar2.AR{2};                 % AR(2) coefficient
c = est_ar2.Constant;                 % Constant term

for t = 3:length(X)   % Start from t = 3 since lags up to t-2 are required
    fitted_values(t) = c + phi1 * X(t-1) + phi2 * X(t-2);   % AR(2) equation
end

% Plot the original series and fitted values
figure;
plot(time, X, 'b-', 'LineWidth', 1.2);                % Original series
hold on;
plot(time, fitted_values, 'r--', 'LineWidth', 1.5);   % Fitted values
xlabel('Year');
ylabel('Growth Rate');
title('GDP Growth Rate and Fitted Values (AR(2))');
legend('Observed', 'Fitted', 'Location', 'Best');
xlim([1950, 2022]);   % Set x-axis limits
grid on;
hold off;
Solution: MATLAB Code - Part 3

Figure: Estimation of an AR(2) process on the French GDP growth rate

Solution: MATLAB Code - Part 4

Figure: Estimation of an AR(2) process on the French GDP growth rate

4. Univariate time series models

Stationarity of an AR(p) process

Theorem (Stationarity of an AR process)

An AR(p) process may be stationary or not, depending on the values of the parameters ϕ_1, ..., ϕ_p.
4. Univariate time series models

Example (simulation of AR processes)

Consider a Gaussian white noise ε_t ∼ i.i.d. N(0, 0.25) and:

X_t = X_{t-1} + ε_t

Y_t = 2 + 0.5 Y_{t-1} + ε_t

where X_t is a non-stationary AR(1) process with E(X_t) = 0, and Y_t is a stationary AR(1) process with E(Y_t) = 2 / (1 − 0.5) = 4.
Question: simulate a sample of size n = 500 of the two AR processes.
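One possible MATLAB sketch (the starting values are a modeling choice; here Y_1 is set at its unconditional mean):

% Sketch: simulate the non-stationary and stationary AR(1) processes
rng(1234);
n = 500;
eps = 0.5 * randn(n, 1);   % Gaussian white noise with variance 0.25
X = zeros(n, 1);           % X_t = X_{t-1} + eps_t (random walk)
Y = zeros(n, 1);           % Y_t = 2 + 0.5*Y_{t-1} + eps_t
X(1) = eps(1);
Y(1) = 4;                  % unconditional mean E(Y_t) = 4
for t = 2:n
    X(t) = X(t-1) + eps(t);
    Y(t) = 2 + 0.5 * Y(t-1) + eps(t);
end
figure; plot(1:n, [X Y]);
legend('X_t (non-stationary)', 'Y_t (stationary)');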
4. Univariate time series models

Figure: Simulation of two AR(1) processes

Subsection 4.3

ARMA process

4. Univariate time series models

Definition (ARMA process)

The process {X_t, t ∈ Z} is said to be an ARMA(p, q) process if:

Φ(L) X_t = c + Θ(L) ε_t

where the lag polynomials Φ(L) and Θ(L) are defined by:

Φ(L) = 1 − Σ_{j=1}^p ϕ_j L^j        Θ(L) = Σ_{j=0}^q θ_j L^j

with ϕ_j ∈ R ∀j < p, ϕ_p ∈ R*, θ_j ∈ R ∀j < q, θ_0 = 1, θ_q ∈ R*, and ε_t ∼ WN(0, σ²) is a white noise process.
4. Univariate time series models

Definition (Mean of an ARMA process)

The mean of the process X_t is given by:

E(X_t) = c Φ(1)^{-1} = c / (1 − ϕ_1 − ... − ϕ_p)
4. Univariate time series models

Example (ARMA processes)

Denote by ε_t a white noise process with E(ε_t) = 0 and V(ε_t) = σ².
The process {X_t, t ∈ Z} is an ARMA(1,3) process with zero mean if:

X_t = ϕ X_{t-1} + θ_3 ε_{t-3} + θ_2 ε_{t-2} + θ_1 ε_{t-1} + ε_t

The process {Z_t, t ∈ Z} is an ARMA(2,1) process if:

Z_t = c + ϕ_1 Z_{t-1} + ϕ_2 Z_{t-2} + θ_1 ε_{t-1} + ε_t

with:

E(Z_t) = c / (1 − ϕ_1 − ϕ_2)
4. Univariate Time Series Models

Figure: Example of ARMA estimation output: Eviews

4. Univariate Time Series Models

Figure: Example of ARMA estimation output: Eviews

Exercise: MATLAB Code

Exercise: Estimation of an ARMA process on the French GDP growth rate

Consider the annual GDP growth rate X_t for France over the period 1950-2022 and estimate an ARMA(1,2) model:

X_t = µ + θ_1 X_{t-1} + ψ_1 ε_{t-1} + ψ_2 ε_{t-2} + ε_t

where ε_t is a (weak) white noise with E(ε_t) = 0 and V(ε_t) = σ².

Import the data from the file GDP France.xlsx and compute the GDP growth rate.
Estimate the parameters µ, θ_1, ψ_1, ψ_2, and σ by MLE.
Compute the in-sample forecasts of X_t based on the ARMA(1,2) model.
Solution: MATLAB code - Part 1

Matlab Code: estimation of an ARMA(1,2)

clear; clc; close all; warning off;

% Data importation
data = readtable('GDP_France.xlsx');
GDP = data.GDP;                                     % GDP series
X = (GDP - lagmatrix(GDP,1)) ./ lagmatrix(GDP,1);   % GDP growth rate
time = (1949:1:2022)';                              % Define time vector

% Remove missing values resulting from the lag
valid = ~isnan(X);   % build the mask before filtering, so X and time stay aligned
X = X(valid);
time = time(valid);

% Estimate ARMA(1,2) model
disp('Estimating ARMA(1,2) model...');
model_arma12 = arima('ARLags', 1, 'MALags', [1, 2]);   % Specify ARMA(1,2) model
est_arma12 = estimate(model_arma12, X);                % Estimate model parameters
disp(est_arma12);
Solution: MATLAB code - Part 2

% Compute fitted values
[residuals, ~] = infer(est_arma12, X);   % Compute residuals
phi1 = est_arma12.AR{1};                 % AR(1) coefficient
theta1 = est_arma12.MA{1};               % MA(1) coefficient
theta2 = est_arma12.MA{2};               % MA(2) coefficient
c = est_arma12.Constant;                 % Constant term

% Initialize fitted values
fitted_values = nan(size(X));
for t = 3:length(X)   % Start from t = 3 since two lags are required
    fitted_values(t) = c + phi1 * X(t-1) + ...
                       theta1 * residuals(t-1) + ...
                       theta2 * residuals(t-2);   % ARMA(1,2) equation
end

% Plot the original series and fitted values
figure;
plot(time, X, 'b-', 'LineWidth', 1.2);                % Original series
hold on;
plot(time, fitted_values, 'r--', 'LineWidth', 1.5);   % Fitted values
xlabel('Year');
ylabel('Growth Rate');
legend('Observed', 'Fitted', 'Location', 'Best');
xlim([1950, 2022]);   % Set x-axis limits
grid on;
hold off;
Solution: MATLAB Code - Part 3

Figure: Estimation of an ARMA(1,2) process on the French GDP growth rate

Solution: MATLAB Code - Part 4

Figure: Estimation of an ARMA(1,2) process on the French GDP growth rate

4. Univariate time series models

Two properties of the AR / MA / ARMA processes are generally considered:

1 The invertibility condition means that the process can be inverted. For instance, an AR(p) model can alternatively be represented as an MA(∞), an MA(q) process can be represented as an AR(∞), etc.

2 The stationarity condition guarantees that the process is (weakly) stationary.
4. Univariate time series models

Theorem (invertibility and stationarity of MA processes)

A process {X_t, t ∈ Z} with an MA(q) representation is always stationary. It is also invertible if the roots λ_j of the polynomial Θ(L) are all outside the unit circle, i.e. their modulus is larger than 1:

Θ(λ_j) = Σ_{i=0}^q θ_i λ_j^i = 0    with Θ(L) = Π_{j=1}^q (1 − L/λ_j)    and |λ_j| > 1, ∀j
4. Univariate time series models

Remarks:

The invertibility of an MA process refers to the possibility of expressing it as an AR(∞) process.

For instance, if |a| < 1, then:

(1 − aL)^{-1} X_t = Σ_{j=0}^∞ a^j L^j X_t = Σ_{j=0}^∞ a^j X_{t-j} = X_t + a X_{t-1} + a² X_{t-2} + ...
4. Univariate time series models

Definition (Modulus of a Complex Number)

Let z ∈ C be a complex number, written in the form z = a + bi, where a ∈ R and b ∈ R. The modulus of z, denoted |z|, is defined as:

|z| = √(a² + b²)

Example (Modulus)
For z = 3 + 4i:

|z| = √(3² + 4²) = √(9 + 16) = 5
4. Univariate time series models

Example (invertibility and stationarity of MA processes)

Consider the MA(1) process given by:

X_t = ε_t − 0.5 ε_{t-1} = Θ(L) ε_t

with Θ(L) = 1 − 0.5L. The root of the polynomial is equal to 2:

Θ(λ) = 1 − 0.5λ = 0 ⇔ λ = 2

The root is outside the unit circle: the MA process is stationary (by definition) and invertible. This last property means that X_t can be expressed as an AR(∞) with:

Θ(L)^{-1} X_t = ε_t

Θ(L)^{-1} = (1 − 0.5L)^{-1} = Σ_{j=0}^∞ 0.5^j L^j = 1 + 0.5L + 0.5²L² + 0.5³L³ + ...

or equivalently:

X_t + 0.5 X_{t-1} + 0.5² X_{t-2} + ... = ε_t
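The root condition can also be checked numerically in MATLAB with roots(), which takes the polynomial coefficients in descending powers (a minimal sketch):

% Sketch: root of Theta(lambda) = 1 - 0.5*lambda
lambda = roots([-0.5 1]);   % coefficients of -0.5*lambda + 1
abs(lambda)                 % modulus 2 > 1: the MA(1) process is invertible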
4. Univariate Time Series Models

Figure: Example: Inverted Roots for an MA Process in EViews

4. Univariate time series models

Theorem (invertibility and stationarity of AR processes)

A process {X_t, t ∈ Z} with an AR(p) representation is always invertible. It is also stationary if the roots λ_j, ∀j ≤ p, of the polynomial Φ(L) are all outside the unit circle, i.e. their modulus is larger than 1:

Φ(λ_j) = Σ_{i=0}^p ϕ_i λ_j^i = 0    with Φ(L) = Π_{j=1}^p (1 − L/λ_j)    and |λ_j| > 1, ∀j
4. Univariate time series models

Example (invertibility and stationarity of AR processes)

Consider the two AR processes given by:

X_t = 2 − 0.5 X_{t-1} + 0.2 X_{t-3} + ε_t

Y_t = 0.2 Y_{t-1} + 1.5 Y_{t-2} + ε_t

where ε_t is a white noise. Question: check whether the processes X_t and Y_t are stationary.
4. Univariate time series models

Solution:
First AR process:
X_t = 2 − 0.5 X_{t-1} + 0.2 X_{t-3} + ε_t
The lag polynomial is given by:

Φ(L) X_t = 2 + ε_t with Φ(λ) = 1 + 0.5λ − 0.2λ³ = 0

⇐⇒ λ_1 = 2.18 and λ_j = −1.09 ± 1.04i, j = 2, 3

|λ_1| = 2.18, |λ_2| = |λ_3| = 1.50. All the roots are outside the unit circle: X_t is stationary.

Second AR process:
Y_t = 0.2 Y_{t-1} + 1.5 Y_{t-2} + ε_t
Φ(L) Y_t = ε_t with Φ(λ) = 1 − 0.2λ − 1.5λ² = 0 ⇐⇒ λ_1 = −0.88 and λ_2 = 0.75

|λ_1| = 0.88 and |λ_2| = 0.75: the roots are inside the unit circle, so Y_t is non-stationary.
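These roots can be verified numerically in MATLAB (a sketch; roots() expects coefficients in descending powers of λ):

% Sketch: verify the roots of the two AR lag polynomials
lam_X = roots([-0.2 0 0.5 1]);   % Phi(lambda) = 1 + 0.5*lambda - 0.2*lambda^3
abs(lam_X)                       % moduli approx. 2.18, 1.50, 1.50: all > 1, X_t stationary
lam_Y = roots([-1.5 -0.2 1]);    % Phi(lambda) = 1 - 0.2*lambda - 1.5*lambda^2
abs(lam_Y)                       % moduli approx. 0.88, 0.75: inside the unit circle, Y_t non-stationary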
4. Univariate Time Series Models

Figure: Example: Inverted Roots for an AR Process in EViews

4. Univariate time series models

Theorem (invertibility of an ARMA process)


A process {Xt , t ∈ Z} with an ARMA(p, q ) representation is invertible if the roots of
its MA polynomial Θ (L) are all outside the unit circle.

Theorem (stationarity of an ARMA process)


A process {Xt , t ∈ Z} with an ARMA(p, q ) representation is stationary if the roots of
its AR polynomial Φ (L) are all outside the unit circle.

4. Univariate Time Series Models

Figure: Example: Inverted Roots for an ARMA Process in EViews

4. Univariate time series models

Figure: Example: Inverted Roots for an ARMA Process in EViews

4. Univariate time series models

Key Concepts

1 Moving average (MA) process
2 Autoregressive (AR) process
3 Autoregressive moving average (ARMA) process
4 Invertibility and stationarity conditions
Section 5

The Box-Jenkins modeling approach

5. The Box-Jenkins modeling approach

Objectives

1 To introduce the Box-Jenkins modeling approach
2 To introduce the principle of parsimony
3 To define the autocorrelation function
4 To define the partial autocorrelation function
5 To identify the AR, MA and ARMA processes
5. The Box-Jenkins modeling approach

The Box-Jenkins modeling approach:

1 The Wold decomposition is useful for theoretical reasons. However, in practice, models with an infinite number of parameters are hardly useful.

2 Many forecasters are persuaded of the benefits of parsimony, i.e. using as few parameters as possible.

3 Although complicated models can track the data very well over the historical period for which the parameters are estimated, they often perform poorly when used for out-of-sample forecasts.

4 Box and Jenkins (1976) recommend the use of univariate time series models with a "small" number of parameters.

Box, G.E.P. and Jenkins, G.M. (1976), Time Series Analysis: Forecasting and Control, Wiley.
5. The Box-Jenkins modeling approach

The approach of Box and Jenkins (1976) can be broken into five steps:
Step 1: Transform the data, if necessary, so that the assumption of (weak) stationarity is a reasonable one.
Step 2: Use some identification tools (autocorrelation function, partial autocorrelation function, etc.) to compare some properties of the data to the "theoretical" properties of some time series models (AR, MA, ARMA, etc.), and choose a model.
Step 3: Estimate the parameters of the model.
Step 4: Perform diagnostic analysis to confirm that the model is indeed consistent with the observed features of the data.
Step 5: Use the estimated model to produce the forecasts.
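As a rough illustration of Steps 2-3, the sketch below (assuming the Econometrics Toolbox and a stationary series X already in memory) compares small ARMA(p, q) specifications with the AIC; such an automated search is only a complement to the ACF/PACF-based identification discussed next.

% Sketch: compare small ARMA(p,q) models on a stationary series X by AIC
bestAIC = Inf; best_pq = [0 0];
for p = 0:2
    for q = 0:2
        mdl = arima(p, 0, q);                             % ARMA(p,q) with constant
        [~, ~, logL] = estimate(mdl, X, 'Display', 'off');
        aic = aicbic(logL, p + q + 2);                    % AR + MA + constant + variance
        if aic < bestAIC
            bestAIC = aic; best_pq = [p q];
        end
    end
end
fprintf('Selected ARMA(%d,%d) by AIC\n', best_pq(1), best_pq(2));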
5. The Box-Jenkins modeling approach

There are two main identification tools for time series models:

1 The autocorrelation function (ACF)

2 The partial autocorrelation function (PACF)
5. The Box-Jenkins Modeling Approach

Definition (Autocorrelation Function)

Let {X_t, t ∈ Z} be a stationary time series; its autocorrelation function is defined as:

ρ(k) ≡ Corr(X_t, X_{t-k}) = Cov(X_t, X_{t-k}) / V(X_t) = γ(k) / γ(0)    ∀k ∈ Z

where γ(k) ≡ Cov(X_t, X_{t-k}) is the autocovariance function.

Properties:
ρ(k) = ρ(−k), ∀k ∈ Z

ρ(0) = 1

The range of ρ(k) is [−1, 1].
5. The Box-Jenkins Modeling Approach

Definition (Sample Autocorrelation)


The (sample) autocorrelation function (ACF), denoted ρ̂(k), of a stationary process
{Xt , t ∈ Z} is a consistent estimator of ρ(k) defined as:

ρ̂(k) = corr(Xt , Xt−k ) = ∑_{t=k+1}^{n} (xt − µ̂)(xt−k − µ̂) / ∑_{t=1}^{n} (xt − µ̂)² ,

where µ̂ = n⁻¹ ∑_{t=1}^{n} xt is the sample mean of {x1 , . . . , xn }.

Note:
The sample ACF is a consistent estimator of the ACF: ρ̂(k) → ρ(k) in probability as n → ∞.

In general, the ACF refers to the (sample) autocorrelation function.
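As a sanity check, the sample ACF can be computed directly from this definition and compared with a built-in estimator. A minimal sketch, assuming x holds the observed series as a column vector and k the lag of interest (autocorr requires the Econometrics Toolbox):

Matlab Code: sample ACF from the definition

% Sample autocorrelation at lag k, computed from the definition above
n  = length(x);
mu = mean(x);                                          % sample mean
rhoHat = sum((x(k+1:n) - mu) .* (x(1:n-k) - mu)) / sum((x - mu).^2);

% Cross-check against the built-in estimator (autocorr returns lag 0 first)
acf = autocorr(x, 'NumLags', k);
fprintf('manual: %.4f  autocorr: %.4f\n', rhoHat, acf(k+1));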

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 138 / 195
5. The Box-Jenkins modeling approach

Figure: Example of ACF of the French GDP annual growth rate (1950-2022): EViews output

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 139 / 195
5. The Box-Jenkins modeling approach

Testing the nullity of the autocorrelations:

It is possible to test the nullity of the autocorrelation ρ(k) at any lag order k using
a z-statistic:
H0 : ρk = 0 against H1 : ρk ≠ 0
It is possible to test the joint nullity of the first K autocorrelations through a
Q-test (Box-Pierce test or Ljung-Box test):

H0 : ρ1 = · · · = ρK = 0 against H1 : ∃k ∈ {1, . . . , K } such that ρk ≠ 0

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 140 / 195
5. The Box-Jenkins modeling approach

Definition (Test for the Nullity of Autocorrelations)


The z-statistic associated with the test H0 : ρk = 0 is defined as:

tρ̂k = ρ̂k / V(ρ̂k )^{1/2} ,

where the variance is defined as:

V(ρ̂k ) = (1/n) ( 1 + 2 ∑_{j=1}^{K} ρ̂j² ) .

Under H0 , this z-statistic converges to a standard normal distribution:

tρ̂k →d N(0, 1) as n → ∞, ∀k ∈ Z.

Decision Rule:
At the α = 5% significance level, if |tρ̂k | > 1.96, we reject the null hypothesis H0 ,
i.e., the autocorrelation ρk is significantly different from zero.
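This decision rule translates directly into a few lines of code. A minimal sketch, assuming rhoHat holds the sample autocorrelations ρ̂1 , . . . , ρ̂K of a series of length n (variable names are illustrative):

Matlab Code: z-statistic for H0: ρk = 0

% z-statistic using the variance formula above (sum over the K autocorrelations)
k = 3;                                   % lag under test (illustrative)
vHat = (1 + 2 * sum(rhoHat.^2)) / n;     % estimated variance of rhoHat(k)
tStat = rhoHat(k) / sqrt(vHat);
reject = abs(tStat) > 1.96;              % true: reject H0 at the 5% level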

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 141 / 195
5. The Box-Jenkins Modeling Approach

Definition (Box-Pierce Test Statistic)


For any stationary process Xt , the Box-Pierce test statistic associated with the null
hypothesis H0 : ρ1 = · · · = ρK = 0 is given by:

QBP = n ∑_{k=1}^{K} ρ̂k² ,

where ρ̂(k) = corr(Xt , Xt−k ) is the empirical autocorrelation of order k. Under H0 , the
statistic converges in distribution to a chi-squared distribution:

QBP →d χ²(K ) as n → ∞.

Decision Rule:
Reject H0 at the 5% significance level if QBP exceeds the 0.95 quantile of the
corresponding χ2 distribution.
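A minimal sketch of the statistic and decision rule, again assuming rhoHat holds the sample autocorrelations ρ̂1 , . . . , ρ̂K of a series of length n (chi2inv requires the Statistics and Machine Learning Toolbox):

Matlab Code: Box-Pierce Q-statistic

% Box-Pierce statistic and 5% decision rule
K = length(rhoHat);
QBP = n * sum(rhoHat.^2);                % Q-statistic defined above
cValue = chi2inv(0.95, K);               % 0.95 quantile of chi2(K)
reject = QBP > cValue;                   % true: reject H0 at the 5% level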

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 142 / 195
5. The Box-Jenkins Modeling Approach

Definition (Ljung-Box Test Statistic)


For any stationary process Xt , the Ljung-Box test statistic associated with the null
hypothesis H0 : ρ1 = · · · = ρK = 0 is given by:

QK = n (n + 2) ∑_{k=1}^{K} ρ̂k² / (n − k ) ,

where ρ̂(k) = corr(Xt , Xt−k ) is the empirical autocorrelation of order k. Under H0 , the
statistic converges in distribution to a chi-squared distribution:

QK →d χ²(K ) as n → ∞.

Decision Rule:
Reject H0 at the 5% significance level if QK exceeds the 0.95 quantile of the
corresponding χ2 distribution.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 143 / 195
5. The Box-Jenkins modeling approach

Figure: Ljung-Box test on the French GDP annual growth rate (1950-2022): EViews output

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 144 / 195
Exercise: MATLAB Code

Exercise: ACF of the French GDP Growth Rate

Compute the AutoCorrelation Function (ACF) of the annual GDP growth rate Xt for
France over the period 1950-2022 for lags 1 to 15, and perform a Ljung-Box test.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 145 / 195
Solution: MATLAB code - Part 1

Matlab Code: ACF and Ljung-Box test


% Compute ACF values and lags
numLags = 15;
[acfValues, lags] = autocorr(X, 'NumLags', numLags);

% Manually calculate 95% confidence bounds
n = length(X);
confBound = 1.96 / sqrt(n);
lowerBounds = -confBound * ones(numLags, 1);
upperBounds = confBound * ones(numLags, 1);

% Ljung-Box Q-test for significance (lags 1 to numLags)
[h, pValue, Qstat] = lbqtest(X, 'Lags', 1:numLags);

% Ensure all vectors have consistent sizes
acfValues = acfValues(2:numLags+1);      % remove lag 0 from ACF values
lowerBounds = lowerBounds(1:numLags);
upperBounds = upperBounds(1:numLags);
lags = lags(2:numLags+1);                % remove lag 0

% Create table of ACF and Q-test values
acfQTestTable = table(lags, acfValues, lowerBounds, upperBounds, Qstat', pValue', ...
    'VariableNames', {'Lag', 'ACF', 'LowerBound', 'UpperBound', 'QTestStat', 'PValue'});

% Display the table
disp(acfQTestTable);
Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 146 / 195
Solution: MATLAB Code - Part 2

Figure: ACF of the French GDP annual growth rate (1950-2022)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 147 / 195
Solution: MATLAB Code - Part 3

Figure: ACF of the French GDP annual growth rate (1950-2022)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 148 / 195
5. The Box-Jenkins modeling approach

There are two main identification tools for time series models:

1 The autocorrelation function (ACF)

2 The partial autocorrelation function (PACF)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 149 / 195
5. The Box-Jenkins Modeling Approach

Definition (Partial Autocorrelation Function)


The partial autocorrelation function (PACF), denoted α(k), of a stationary time series
{Xt , t ∈ Z} is defined as the correlation between Xt and Xt−k , conditional on the
intervening observations Xt−1 , . . . , Xt−k+1 :

α(k) ≡ Corr( Xt , Xt−k | Xt−1 , . . . , Xt−k+1 ) ∀k ∈ Z.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 150 / 195
5. The Box-Jenkins Modeling Approach

Definition (Partial Autocorrelation Function)


The partial autocorrelation function α (k ) corresponds to the sequence of the k th
autoregressive coefficients akk obtained in the multiple linear regression model:

Xt = c + ak1 Xt −1 + ak2 Xt −2 + . . . + akk Xt −k + vt .

Then, we have:
α (k ) = akk .

Properties:
α (0) = 1.

α (1) = ρ (1) .
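This regression characterization can be checked numerically: regress Xt on its first k lags by OLS and read off the last coefficient. A minimal sketch, assuming x holds the observed series as a column vector and k the order (both illustrative):

Matlab Code: PACF of order k by OLS

% Sample PACF of order k as the last OLS coefficient a_kk
n = length(x);
Z = ones(n - k, 1);                      % constant
for j = 1:k
    Z = [Z, x(k+1-j : n-j)];             % j-th lag of x as a regressor
end
y = x(k+1:n);
aHat = Z \ y;                            % OLS estimates (c, a_k1, ..., a_kk)
alphaHat = aHat(end);                    % sample PACF: alpha(k) = a_kk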

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 151 / 195
5. The Box-Jenkins Modeling Approach

Definition (Sample Partial Autocorrelation Function)


The (sample) partial autocorrelation function (PACF), denoted α̂(k), corresponds to
the sequence of the estimated k-th autoregressive coefficients âkk obtained by OLS in the
regression:

Xt = c + ak1 Xt−1 + ak2 Xt−2 + . . . + akk Xt−k + vt .

Then, we have:

α̂(k) = âkk .

Note: In general, the PACF refers to the (sample) partial autocorrelation function.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 152 / 195
5. The Box-Jenkins modeling approach

Figure: PACF of the French GDP annual growth rate (1961-2017)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 153 / 195
5. The Box-Jenkins modeling approach

Figure: PACF of the French GDP annual growth rate (1961-2017)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 154 / 195
Exercise: MATLAB Code

Exercise: PACF of the French GDP Growth Rate

Compute the Partial AutoCorrelation Function (PACF) of the annual GDP growth
rate Xt for France over the period 1950-2022 for lags 1 to 15.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 155 / 195
Solution: MATLAB code - Part 1

Matlab Code: PACF


% Compute PACF values and lags
numLags = 15;
[pacfValues, lags] = parcorr(X, 'NumLags', numLags, 'Method', 'yule-walker');

% Manually calculate 95% confidence bounds
n = length(X);
confBound = 1.96 / sqrt(n);
lowerBounds = -confBound * ones(size(lags));
upperBounds = confBound * ones(size(lags));

% Create table of PACF values
pacfTable = table(lags, pacfValues, lowerBounds, upperBounds, ...
    'VariableNames', {'Lag', 'PACF', 'LowerBound', 'UpperBound'});

% Display the table
disp('Table of PACF Values:');
disp(pacfTable);

% Plot the PACF with confidence bounds
figure;
parcorr(X, 'NumLags', numLags);
xlabel('Lags');
ylabel('Partial Autocorrelation');
grid on;

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 156 / 195
Solution: MATLAB Code - Part 2

Figure: PACF of the French GDP annual growth rate (1950-2022)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 157 / 195
Solution: MATLAB Code - Part 3

Figure: PACF of the French GDP annual growth rate (1950-2022)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 158 / 195
5. The Box-Jenkins modeling approach

How to choose an ARMA specification?

⇒ steps 2 and 3 of the Box-Jenkins approach

Step 2: Use some identification tools (autocorrelation function, partial autocorrelation
function, etc.) in order to compare some properties of the data to the theoretical
properties of some time series models (AR, MA, ARMA, etc.), and choose a model.
Step 3: Estimate the parameters of the model by Maximum Likelihood (ML).

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 159 / 195
5. The Box-Jenkins modeling approach

Lemma (identification for MA process)


The autocorrelation function (ACF) of an MA(q) process is zero at lag q + 1 and greater:

ρ(k) = 0 for k > q.

Note: Therefore, we determine the appropriate maximum lag order by examining the
(sample) autocorrelation function to see where it becomes insignificantly different from
zero for all lags beyond a certain lag, which is designated as the maximum lag q.
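This cutoff property is easy to verify by simulation. A minimal sketch, assuming the Econometrics Toolbox and purely illustrative MA coefficients:

Matlab Code: ACF cutoff of a simulated MA(q)

% Simulate an MA(2) process and check that the sample ACF cuts off after q = 2
rng(1);                                              % reproducibility
mdl = arima('Constant', 0, 'MA', {0.6, 0.3}, 'Variance', 1);
Y = simulate(mdl, 1000);                             % n = 1000 observations
figure;
autocorr(Y, 'NumLags', 15);                          % ACF insignificant for k > 2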

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 160 / 195
5. The Box-Jenkins modeling approach

Figure: ACF of a simulated MA(1) process

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 161 / 195
5. The Box-Jenkins modeling approach

Figure: ACF of a simulated MA(3) process

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 162 / 195
5. The Box-Jenkins modeling approach

Lemma (identification for AR process)


The partial autocorrelation function (PACF) of an AR(p) process is zero at lag p + 1
and greater:

α(p) = ϕp and α(k) = 0 for k > p.
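The same simulation check applies on the PACF side. A minimal sketch with an illustrative AR(1) coefficient (Econometrics Toolbox):

Matlab Code: PACF cutoff of a simulated AR(p)

% Simulate an AR(1) process and check that the sample PACF cuts off after p = 1
rng(2);
mdl = arima('Constant', 0, 'AR', {0.7}, 'Variance', 1);
Y = simulate(mdl, 1000);
figure;
parcorr(Y, 'NumLags', 15);                           % PACF insignificant for k > 1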

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 163 / 195
5. The Box-Jenkins modeling approach

Figure: PACF of a simulated AR(1) process

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 164 / 195
5. The Box-Jenkins modeling approach

Figure: PACF of a simulated AR(3) process

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 165 / 195
5. The Box-Jenkins modeling approach

Lemma
In the case of a mixed ARMA(p, q) with p ≠ 0 and q ≠ 0, neither the theoretical
autocorrelation function nor the theoretical partial autocorrelation function has any
abrupt cutoff.

Note: Thus, there is little that can be inferred from the ACF and PACF beyond the fact
that neither a pure MA model nor a pure AR model would be appropriate.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 166 / 195
5. The Box-Jenkins modeling approach

Figure: ACF and PACF of a simulated ARMA(3,2) process

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 167 / 195
5. The Box-Jenkins Modeling Approach

Identification of the Lag Orders for an ARMA Process

1 Brockwell and Davis (2009) recommend using an information criterion (AIC, BIC, or
others) to determine the optimal values of p and q.

2 We consider two maximum lag orders pmax and qmax , and search for the optimal
values of p and q such that:

(p̂, q̂) = arg min_{p ∈ {1,...,pmax}, q ∈ {1,...,qmax}} Information Criterion(p, q).

3 All software offers an automatic procedure to determine the orders of an ARMA
model.

4 After selecting p and q, the parameters of the ARMA model are estimated using
least squares regression. Various validation tests (e.g., tests for residual
autocorrelation) are then applied to verify the adequacy of the chosen (p, q) values.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 168 / 195
5. The Box-Jenkins Modeling Approach

Definition (AIC (Akaike Information Criterion))


The Akaike Information Criterion (AIC) is a function that decreases with the
log-likelihood of the sample, denoted ℓn (θ; y | x), and increases with the number of
parameters K = p + q + 1 of the ARMA(p, q ) model:

AIC = −2ℓn (θ; y | x) + 2K .

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 169 / 195
5. The Box-Jenkins Modeling Approach

Definition (BIC (Bayesian Information Criterion))


The Bayesian Information Criterion (BIC), also known as the Schwarz Information
Criterion, is defined as:
BIC = −2ℓn (θ; y | x) + K ln(n ),
where n denotes the sample size, ℓn (θ; y | x) is the log-likelihood of the sample, and
K = p + q + 1 in the case of an ARMA(p, q ) model.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 170 / 195
5. The Box-Jenkins modeling approach

Figure: Example of AIC and BIC: EViews output

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 171 / 195
5. The Box-Jenkins modeling approach

Comparison Between AIC and BIC:

AIC:
Penalizes additional parameters less heavily.
Favors models with good fit quality, even if they are more complex.

BIC:
Penalizes complex models more heavily.
Favors simpler models, especially for large sample sizes.

Criterion Choice:
For predictive models, AIC is often preferred.
For explanatory models, or if parsimony is desired, BIC is generally preferred.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 172 / 195
Exercise: MATLAB Code

Exercise: Identification of an ARMA Model for the French GDP Growth Rate

Consider the annual GDP growth rate Xt for France over the period 1950–2022, and
identify the optimal ARMA model. Propose an optimal model based on the analysis
of:
The autocorrelation function (ACF) for lags 1 to 15.
The partial autocorrelation function (PACF) for lags 1 to 15.
An optimal selection procedure using the AIC and BIC criteria.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 173 / 195
Solution: MATLAB code - Part 1

Matlab Code: Identification of ARMA model


clear; clc; close all;

% Data importation
data = readtable('GDP_France.xlsx');
GDP = data.GDP;                                     % GDP series
X = (GDP - lagmatrix(GDP,1)) ./ lagmatrix(GDP,1);   % GDP growth rate
time = (1949:1:2022)';                              % define time vector

% Remove missing values resulting from the lag
% (compute the mask before filtering, so X and time stay aligned)
mask = ~isnan(X);
X = X(mask);
time = time(mask);

% Parameters for model selection
p_max = 3;          % maximum AR order
q_max = 3;          % maximum MA order
n = length(X);      % number of observations

% Initialize table to store results
AIC_BIC_Table = [];
RowIndex = 1;

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 174 / 195
Solution: MATLAB code - Part 2

% Loop through all combinations of p and q
for p = 0:p_max
    for q = 0:q_max

        % Specify ARMA(p, q) model
        model = arima('ARLags', 1:p, 'MALags', 1:q);

        % Estimate the model
        [~,~,logL] = estimate(model, X, 'Display', 'off');

        % Compute AIC and BIC from the log-likelihood
        numParams = p + q + 1;           % AR + MA + constant
        aic = -2*logL + 2*numParams;
        bic = -2*logL + log(n)*numParams;

        % Store results in the table
        AIC_BIC_Table(RowIndex, :) = [p, q, aic, bic, logL];
        RowIndex = RowIndex + 1;
    end
end

% Convert results into a table
AIC_BIC_Table = array2table(AIC_BIC_Table, 'VariableNames', ...
    {'AR_Order_p', 'MA_Order_q', 'AIC', 'BIC', 'LogLik'});

% Sort the table by AIC, placing the lowest AIC at the top
AIC_BIC_Table = sortrows(AIC_BIC_Table, 'AIC', 'ascend');

% Display the full table
disp('Table of AIC and BIC Values:');
disp(AIC_BIC_Table);

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 175 / 195
Solution: MATLAB code - Part 3

% Find the optimal lags based on AIC and BIC
[~, minAICIndex] = min(AIC_BIC_Table.AIC);
[~, minBICIndex] = min(AIC_BIC_Table.BIC);

optimalAIC = AIC_BIC_Table(minAICIndex, :);
optimalBIC = AIC_BIC_Table(minBICIndex, :);

% Display the optimal results
disp('Optimal Model Based on AIC:');
disp(optimalAIC);

disp('Optimal Model Based on BIC:');
disp(optimalBIC);

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 176 / 195
Solution: MATLAB Code - Part 4

Figure: Identification of an ARMA model for the French GDP growth rate (PACF Function)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 177 / 195
Solution: MATLAB Code - Part 5

Figure: Identification of an ARMA model for the French GDP growth rate (ACF Function)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 178 / 195
Solution: MATLAB Code - Part 6

Figure: Identification of an ARMA model for the French GDP growth rate: AIC and BIC

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 179 / 195
Solution: MATLAB Code - Part 7

Figure: Identification of an ARMA model for the French GDP growth rate: AIC and BIC

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 180 / 195
5. The Box-Jenkins Modeling Approach

Validation Tests:
Once the ARMA model has been specified and its parameters estimated, we should
assess the validity of the specification through validation tests. Among others, the
following tests need to be applied:

Test the absence of autocorrelation in the residuals ε̂t : if autocorrelation is detected,
the assumption that εt is a (weak) white noise is rejected.

Test the homoscedasticity of the residuals: if heteroscedasticity is detected, the
assumption that εt is a (weak) white noise is rejected.

Test the stability of the model (e.g., CUSUM, CUSUM-Q tests).

Test the normality of the residuals ε̂t : if the normality assumption is rejected, the
MLE estimates may be invalid.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 181 / 195
5. The Box-Jenkins modeling approach

In the sequel, we consider that the optimal model for the French GDP growth rate is an
AR(1) model such that:

Xt = c + θ1 Xt−1 + εt ,

where εt is a weak white noise with E(εt ) = 0 and V(εt ) = σ².
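A minimal estimation sketch for this AR(1), assuming X holds the growth-rate series (Econometrics Toolbox; the residuals res are reused in the validation tests below):

Matlab Code: estimating the AR(1) model

% Fit the AR(1) model above and recover the in-sample residuals
mdl = arima('ARLags', 1);                            % AR(1) with constant
[estMdl, ~, logL] = estimate(mdl, X, 'Display', 'off');
res = infer(estMdl, X);                              % residuals eps_t
fprintf('c = %.4f, AR(1) = %.4f, sigma2 = %.4f\n', ...
    estMdl.Constant, estMdl.AR{1}, estMdl.Variance);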

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 182 / 195
5. The Box-Jenkins modeling approach

Figure: Optimal AR(1) model for the French GDP growth rate

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 183 / 195
5. The Box-Jenkins modeling approach

Definition (Box-Pierce Test Statistic)


For an ARMA(p, q) process, the Box-Pierce test statistic associated with the null of
absence of autocorrelation of the residuals H0 : ρ1 = · · · = ρK = 0 is given by:

QBP = n ∑_{k=1}^{K} ρ̂k² ,

where ρ̂k = corr(ε̂t , ε̂t−k ) is the residual autocorrelation of order k. Under H0 , the
statistic converges in distribution to a chi-squared distribution:

QBP →d χ²(K − p − q ) as n → ∞.

Note: The distribution of the Box-Pierce test statistic is only defined for K ≥ p + q.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 184 / 195
5. The Box-Jenkins modeling approach

Definition (Ljung-Box Test Statistic)


For an ARMA(p, q) process, the Ljung-Box test statistic associated with the null of
absence of autocorrelation of the residuals H0 : ρ1 = · · · = ρK = 0 is given by:

QK = n (n + 2) ∑_{k=1}^{K} ρ̂k² / (n − k ) ,

where ρ̂k = corr(ε̂t , ε̂t−k ) is the residual autocorrelation of order k. Under H0 , the
statistic converges in distribution to a chi-squared distribution:

QK →d χ²(K − p − q ) as n → ∞.

Note: The distribution of the Ljung-Box test statistic is only defined for K ≥ p + q.
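In MATLAB, this degrees-of-freedom correction can be passed to lbqtest. A sketch, assuming res holds the residuals of the estimated AR(1) model from the sketch above (p = 1, q = 0; the lag K = 10 is illustrative):

Matlab Code: Ljung-Box test on the residuals

% Ljung-Box test with the chi2(K - p - q) degrees-of-freedom correction
K = 10; p = 1; q = 0;
[h, pValue, Qstat] = lbqtest(res, 'Lags', K, 'DoF', K - p - q);
% h = 1 means rejection of the white-noise null at the 5% level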

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 185 / 195
5. The Box-Jenkins modeling approach

Figure: ACF of the residuals of the AR(1) model

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 186 / 195
5. The Box-Jenkins modeling approach

Definition (CUSUM Test)


The CUSUM test is based on the cumulative sum of recursive residuals:

St = ∑_{j=k+1}^{t} ej   for t = k + 1, . . . , T .

If the coefficients are stable over time, the cumulative sums St must remain within the
interval:

[ −β (2t + T − 3k) / √(T − k) , β (2t + T − 3k) / √(T − k) ],

where β = 1.143, 0.948, 0.850 for significance levels of 1%, 5%, and 10%, respectively.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 187 / 195
5. The Box-Jenkins modeling approach

Definition (CUSUM of Squares Test)


The CUSUM of squares test is based on the cumulative sum of squared recursive
residuals:

S′t = ∑_{j=k+1}^{t} ej² / ∑_{j=k+1}^{T} ej²   for t = k + 1, . . . , T .

If the coefficients are stable over time, the statistics S′t must remain within the
interval:

[ (t − k) / (T − k) − C , (t − k) / (T − k) + C ],

where C is the critical value of a Kolmogorov-Smirnov-type statistic.
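Both stability tests are implemented in the Econometrics Toolbox through the cusumtest function (Brown-Durbin-Evans tests); the call below is a sketch for the AR(1) regression of Xt on Xt−1, assuming X holds the growth-rate series and that cusumtest is available with these options:

Matlab Code: CUSUM and CUSUM-of-squares tests

% Stability tests for X_t = c + phi*X_{t-1} + eps_t
y = X(2:end);
Z = X(1:end-1);                          % lagged regressor (intercept added by default)
h = cusumtest(Z, y, 'Test', {'cusum', 'cusumsq'}, 'Plot', 'on');
% h(i) = 1 flags parameter instability at the 5% level for the i-th test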

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 188 / 195
5. The Box-Jenkins modeling approach

Figure: CUSUM test of the AR(1) model

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 189 / 195
5. The Box-Jenkins modeling approach

Figure: CUSUM Squared test of the AR(1) model

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 190 / 195
5. The Box-Jenkins modeling approach

Figure: In sample fit for the French GDP annual growth rate (1950-2022)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 191 / 195
5. The Box-Jenkins modeling approach

Figure: In sample fit for the French GDP annual growth rate

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 192 / 195
5. The Box-Jenkins modeling approach

Key Concepts

1 The Box-Jenkins modeling approach
2 The parsimony principle
3 The autocorrelation function
4 The partial autocorrelation function
5 ACF and PACF of the AR and MA processes
6 Identification of the ARMA processes

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 193 / 195
References

Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and Control.
Wiley.
Brockwell, P. J. and Davis, R. A. (2009). Time Series: Theory and Methods. Springer,
New York, 2nd edition.
Cryer, J. D. and Chan, K.-S. (2008). Time Series Analysis with Applications in R.
Springer.
Davidson, J. (2000). Econometric Theory. Blackwell Publishers.
Enders, W. (2003). Applied Econometric Time Series. Wiley.
Greene, W. H. (2007). Econometric Analysis. Pearson Prentice Hall, 6th edition.
Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press, New Jersey.
Lütkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer.
Shumway, R. H. and Stoffer, D. S. (2006). Time Series Analysis and Its Applications
with R Examples. Springer.
Wold, H. (1938). A Study in the Analysis of Stationary Time Series. Almqvist and
Wiksell.
Wold, H. (1954). A Study in the Analysis of Stationary Time Series. Almqvist and
Wiksell, 2nd revised edition.

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 194 / 195
End of Chapter 5

Christophe Hurlin (University of Orléans)

Christophe Hurlin Chapter 5 - Introduction to Time Series Models November 17, 2024 195 / 195
