
4 Time Series Concepts

Updated: February 3, 2022

Copyright © Eric Zivot 2015, 2016, 2017

Financial variables such as asset prices and returns are naturally


ordered by time. That is, these
variables are time series variables.
When we construct returns, the time index or data frequency becomes
the investment horizon associated with the return. Typical data frequencies
are daily, weekly, monthly and
annual. In building probability models
for time series variables, the time ordering of the data matters
because
we may think there are important temporal dependencies among the variables.
For example, we
might have reason to believe that the return on an
asset this month is correlated with the return on the
same asset in
the previous month. This autocorrelation can only be defined if the
time ordering of the data
is preserved. A major complication in analyzing
time series data is that the usual assumption of random
sampling from
a common population is not appropriate because it does not allow for
any kind of time
dependence in the time series variables. We would
like to retain the notion that the observed data come
from some population
model, perhaps with time-invariant parameters, but we would like to
allow the
variables to have time dependencies. Fortunately, we can
do this if the time series data come from a
stationary time series
process.

This chapter reviews some basic times series concepts that are essential
for describing and modeling
financial time series. Section 4.1
defines univariate time series processes and introduces the important
concepts of stationarity and ergodicity. Covariance stationary time
series processes are defined, which
gives meaning to measuring linear
time dependence using autocorrelation. The benchmark Gaussian
White
Noise process and related processes are introduced and illustrated
using R. Some common non-
stationary time series processes are also
discussed including the famous random walk model. Section 4.2
introduces covariance stationary multivariate time series process.
Such processes allow for dynamic
interactions among groups of time
series variables. Section 4.3 discusses
time series model building and
introduces the class of univariate
autoregressive-moving average time series models and multivariate
vector autoregression models. The properties of some simple models
are derived and it is shown how to
simulate observations from these
models using R. The chapter concludes with a brief discussion of
forecasting
from time series models.

The R packages used in this chapter are mvtnorm and vars.


Make sure these packages are installed
and loaded before running the
R examples in this chapter.

4.1 Stochastic Processes

A discrete-time stochastic process or time series process


{…, Y_1, Y_2, …, Y_t, Y_{t+1}, …} = {Y_t}_{t=−∞}^{∞},

is a sequence of random variables indexed by time t.17 In most applications, the time index is a regularly
spaced index
representing calendar time (e.g., days, months, years, etc.) but it
can also be irregularly
spaced representing event time (e.g., intra-day
transaction times). In modeling time series data, the
ordering imposed
by the time index is important because we often would like to capture
the temporal
relationships, if any, between the random variables in
the stochastic process. In random sampling from a
population, the
ordering of the random variables representing the sample does not
matter because they
are independent.

A realization of a stochastic process with T observations is the


sequence of observed data

{Y_1 = y_1, Y_2 = y_2, …, Y_T = y_T} = {y_t}_{t=1}^{T}.

The goal of time series modeling is to describe the probabilistic


behavior of the underlying stochastic
process that is believed to
have generated the observed data in a concise way. In addition, we
want to be
able to use the observed sample to estimate important characteristics
of a time series model such as
measures of time dependence. In order
to do this, we need to make a number of assumptions regarding
the
joint behavior of the random variables in the stochastic process such
that we may treat the stochastic
process in much the same way as we
treat a random sample from a given population.

4.1.1 Stationary stochastic processes

We often describe random sampling from a population as a sequence


of independent, and identically
distributed (iid) random variables
X 1, X 2… such that each X i is described by the same
probability
distribution F X, and write X i ∼ F X. With
a time series process, we would like to preserve the identical
distribution
assumption but we do not want to impose the restriction that each
random variable in the
sequence is independent of all of the other
variables. In many contexts, we would expect some
dependence between
random variables close together in time (e.g, X 1, and X 2)
but little or no
dependence between random variables far apart in
time (e.g., X 1 and X 100). We can allow for this type of
behavior using the concepts of stationarity and ergodicity.

We start with the definition of strict stationarity.

Definition 4.1 (Strict stationarity)


A stochastic process {Y_t} is strictly stationary if, for any given finite integer r and for any set of subscripts t_1, t_2, …, t_r, the joint distribution of (Y_{t_1}, Y_{t_2}, …, Y_{t_r}) depends only on t_1 − t, t_2 − t, …, t_r − t but not on t. In other words, the joint distribution of (Y_{t_1}, Y_{t_2}, …, Y_{t_r}) is the same as the joint distribution of (Y_{t_1−t}, Y_{t_2−t}, …, Y_{t_r−t}) for any value of t.

In simple terms, the joint distribution of random variables in a strictly


stationary stochastic process is time
invariant. For example, the
joint distribution of (Y 1, Y 5, Y 7) is the same as the distribution
of (Y 12, Y 16, Y 18).
Just like in an iid sample, in a strictly
stationary process all of the individual random variables Y_t
(t = −∞, …, ∞) have the same marginal distribution F_Y.
This means they all have the same mean, variance
etc., assuming these
quantities exist. However, assuming strict stationarity does not make
any
assumption about the correlations between Y_t, Y_{t_1}, …, Y_{t_r} other than that the correlation between
Y_t and Y_{t_r} only depends on t − t_r (the time between Y_t and Y_{t_r}) and not on t. That is, strict
stationarity allows for general temporal
dependence between the random variables in the stochastic process.
A useful property of strict stationarity is that it is preserved under
general transformations, as summarized
in the following proposition.

Proposition 4.1 Let {Y t} be strictly stationary and let


g( ⋅ ) be any function of the elements in {Y t}. Then

{g(Y t)}, is also strictly stationary.

For example, if {Y_t} is strictly stationary then {Y_t^2}
and {Y_t Y_{t−1}} are also strictly stationary.

The following are some simple examples of strictly stationary processes.

Example 4.1 (iid sequence)


If {Y t} is an iid sequence, then it is strictly stationary.

Example 4.2 (Non iid sequence)


Let {Y t} be an iid sequence and let X ∼ N(0, 1) independent
of {Y t}. Define Z t = Y t + X. The sequence {Z t}
is not an independent sequence (because of the common X) but is
an identically distributed sequence
and is strictly stationary.

Strict stationarity places very strong restrictions on the behavior


of a time series. A related concept that
imposes weaker restrictions
and is convenient for time series model building is
covariance stationarity
(sometimes called weak stationarity).

Definition 4.2 (Covariance stationarity)


A stochastic process {Y t} is covariance stationary
if

1. E[Y t] = μ < ∞ does not depend on t


2. var(Y t) = σ 2 < ∞ does not depend on t
3. cov(Y t, Y t − j) = γ j < ∞, and depends only on j but not on t for j = 0, 1, 2, …

The term γ j is called the j th order autocovariance.


The j th order autocorrelation is defined as:

ρ_j = cov(Y_t, Y_{t−j}) / √(var(Y_t) var(Y_{t−j})) = γ_j / σ^2.

The autocovariances, γ j, measure the direction of linear


dependence between Y t and Y t − j. The
autocorrelations,
ρ j, measure both the direction and strength of linear dependence
between Y t and Y t − j.
With covariance stationarity, instead
of assuming the entire joint distribution of a collection of random
variables is time invariant we make a weaker assumption that only
the mean, variance and
autocovariances of the random variables are
time invariant. A strictly stationary stochastic process {Y_t}
such that μ, σ^2, and γ_j
exist is a covariance stationary stochastic process. However,
a covariance
stationary process need not be strictly stationary.

The autocovariances and autocorrelations are measures of the linear


temporal dependence in a
covariance stationary stochastic process.
A graphical summary of this temporal dependence is given by
the plot
of ρ j against j, and is called the autocorrelation function (ACF). Figure 4.1 illustrates
an ACF for a
hypothetical covariance stationary time series with ρ j = (0.9) j
for j = 1, 2, …, 10 created with
rho = 0.9

rhoVec = (rho)^(1:10)

ts.plot(rhoVec, type="h", lwd=2, col="blue", xlab="Lag j",

ylab=expression(rho[j]))

Figure 4.1: ACF for time series with ρ j = (0.9) j

For this process the strength of linear time dependence decays toward
zero geometrically fast as j
increases.

The definition of covariance stationarity requires that E[Y t] < ∞


and var(Y t) < ∞. That is, E[Y t] and var(Y t)
must exist and be finite numbers. This is true if Y t is normally
distributed. However, it is not true if, for
example, Y t has a Student’s t
distribution with one degree of freedom.18
Hence, a strictly stationary
stochastic process {Y t} where
the (marginal) pdf of Y t (for all t) has very fat tails may
not be covariance
stationary.

Example 4.3 (Gaussian White Noise)


Let Y t ∼ iid N(0, σ 2). Then {Y t} is called
a Gaussian white noise (GWN) process and is denoted
Y t ∼ GWN(0, σ 2).
Notice that:

E[Y t] = 0 independent of t,


var(Y t) = σ 2 independent of t,
cov(Y t, Y t − j) = 0 (for j > 0) independent of t for all j,

so that {Y t} satisfies the properties of a covariance stationary


process. The defining characteristic of a
GWN process is the lack
of any predictable pattern over time in the realized values of the
process. The
term white noise comes from the electrical engineering
literature and represents the absence of any
signal.19
Simulating observations from a GWN process in R is easy: just simulate
iid observations from a normal
distribution. For example, to simulate
T = 250 observations from the GWN(0,1) process use:

set.seed(123)

y = rnorm(250)

The simulated iid N(0,1) values are generated using the  rnorm() 
function. The command
 set.seed(123)  initializes R’s internal
random number generator using the seed value 123. Every time
the random
number generator seed is set to a particular value, the random number
generator produces
the same set of random numbers. This allows different
people to create the same set of random numbers
so that results are
reproducible. The simulated data is illustrated in Figure 4.2
created using:

ts.plot(y,main="Gaussian White Noise Process",xlab="time",

ylab="y(t)", col="blue", lwd=2)

abline(h=0)

Figure 4.2: Realization of a GWN(0,1) process.

The function  ts.plot()  creates a time series line plot with


a dummy time index. An equivalent plot can
be created using the generic
 plot()  function with optional argument  type="l" .
The data in Figure 4.2
fluctuate
randomly about the mean value zero and exhibit a constant volatility
of one (typical magnitude of
a fluctuation about zero). There is no
visual evidence of any predictable pattern in the data.

Example 4.4 (GWN model for continuously compounded returns)


Let r t denote the continuously compounded monthly return on
Microsoft stock and assume that
r t ∼ iid N(0.01, (0.05) 2).
We can represent this distribution in terms of a GWN process as follows
r t = 0.01 + ε t, ε t ∼ GWN(0, (0.05) 2).

Here, r t is constructed as a GWN(0, σ 2) process plus a constant.


Hence, {r t} is a GWN process with a
non-zero mean: r t ∼ GWN(0.01, (0.05) 2).
T = 60 simulated values of {r t} are computed using:

set.seed(123)

y = rnorm(60, mean=0.01, sd=0.05)

ts.plot(y,main="GWN Process for Monthly Continuously Compounded Returns",

xlab="time",ylab="r(t)", col="blue", lwd=2)

abline(h=c(0,0.01,-0.04,0.06), lwd=2,

lty=c("solid","solid","dotted","dotted"),

col=c("black", "red", "red", "red"))

Figure 4.3: Simulated returns from GWN(0.01,(0.05) 2).

and are illustrated in Figure 4.3.


Notice that the returns fluctuate around the mean value of 0.01 (solid red
line)
and the size of a typical deviation from the mean is about 0.05. The red dotted lines
show the values
0.01 ± 0.05, i.e., −0.04 and 0.06.

An implication of the GWN assumption for monthly returns is that non-overlapping


multi-period returns are
also GWN. For example, consider the two-month
return r t(2) = r t + r t − 1. The non-overlapping process
{r t(2)} = {. . . , r t − 2(2), r t(2), r t + 2(2), . . . }
is GWN with mean E[r t(2)] = 2 ⋅ μ = 0.02, variance
var(r t(2)) = 2 ⋅ σ 2 = 0.005,
and standard deviation sd(r t(2)) = √2σ = 0.071.
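
These moment calculations can be checked with a quick simulation (a sketch; forming the non-overlapping two-month returns by reshaping the return vector with matrix() and rowSums() is an illustrative choice, not code from the text):

set.seed(123)
r = rnorm(100000, mean=0.01, sd=0.05)         # monthly GWN(0.01, (0.05)^2) returns
r2 = rowSums(matrix(r, ncol=2, byrow=TRUE))   # non-overlapping sums (r1+r2), (r3+r4), ...
mean(r2)    # close to 2*mu = 0.02
sd(r2)      # close to sqrt(2)*sigma = 0.071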

Example 4.5 (Independent white noise)


Let Y t ∼ iid (0, σ 2). Then {Y t}
is called an independent white noise (IWN) process and is denoted

Y t ∼ IWN(0, σ 2). The difference between GWN


and IWN is that with IWN we don’t specify that all random
variables
are normally distributed. The random variables can have any distribution
with mean zero and variance σ^2. To illustrate, suppose Y_t = (1/√3) × t_3, where t_3 denotes a Student's
t distribution with 3 degrees of freedom.
and var(Y t) = 1. Figure 4.4 shows simulated observations
from this
process created using the R commands

set.seed(123)

y = (1/sqrt(3))*rt(250, df=3)

ts.plot(y, main="Independent White Noise Process", xlab="time", ylab="y(t)",

col="blue", lwd=2)

abline(h=0)

Figure 4.4: Simulation of IWN(0,1) process: Y_t = (1/√3) × t_3.

The simulated IWN process resembles the GWN process in Figure 4.2


but has more extreme observations.

Example 4.6 (Weak white noise)


Let {Y t} be a sequence of uncorrelated random variables each
with mean zero and variance σ 2. Then {Y t}
is called
a weak white noise (WWN) process and is denoted Y t ∼ WWN(0, σ 2).
With a WWN process, the
random variables are not independent, only
uncorrelated. This allows for potential non-linear temporal
dependence
between the random variables in the process.
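
To make the distinction concrete, one simple construction (not from the text) of a WWN process that is uncorrelated but not independent is Y_t = Z_t Z_{t−1} with Z_t ∼ iid N(0, 1): each Y_t has mean zero and variance one and is uncorrelated with Y_{t−j} for j > 0, yet adjacent values share the common factor Z_{t−1} and so are not independent. A short simulation sketch:

set.seed(123)
z = rnorm(251)
y = z[2:251]*z[1:250]    # Y[t] = Z[t]*Z[t-1]
acf(y, lag.max=10)       # sample autocorrelations of y are near zero
acf(y^2, lag.max=10)     # but y^2 is autocorrelated at lag 1, so y is not independent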


4.1.2 Non-Stationary processes

In a covariance stationary stochastic process it is assumed that the


means, variances and
autocovariances are independent of time. In a
non-stationary process, one or more of these assumptions
is not true.
The following examples illustrate some typical non-stationary time
series processes.

Example 4.7 (Deterministically trending process)


Suppose {Y t} is generated according to the deterministically trending process:

Y_t = β_0 + β_1 t + ε_t,  ε_t ∼ GWN(0, σ_ε^2),  t = 0, 1, 2, …

Then {Y t} is nonstationary because the mean of Y t depends


on t:

E[Y t] = β 0 + β 1t.

Figure 4.5 shows a realization of this process with β_0 = 0, β_1 = 0.1 and σ_ε^2 = 1 created using the R
commands:

set.seed(123)

e = rnorm(250)
y.dt = 0.1*seq(1,250) + e

ts.plot(y.dt, lwd=2, col="blue", main="Deterministic Trend + Noise")

abline(a=0, b=0.1)

Figure 4.5: Deterministically trending nonstationary process Y t = 0.1 × t + ε t, ε t ∼ GWN(0, 1)

Here the non-stationarity is created by the deterministic trend β 0 + β 1t


in the data. The non-stationary
process {Y t} can be transformed
into a stationary process by simply subtracting off the trend:
X_t = Y_t − β_0 − β_1 t = ε_t ∼ GWN(0, σ_ε^2).

The detrended process X t ∼ GWN(0, σ 2ε )


is obviously covariance stationary.

Example 4.8 (Random walk)


A random walk (RW) process {Y t} is defined as:

Y_t = Y_{t−1} + ε_t,  ε_t ∼ GWN(0, σ_ε^2),
Y_0 is fixed (non-random).

By recursive substitution starting at t = 1, we have:

Y_1 = Y_0 + ε_1,
Y_2 = Y_1 + ε_2 = Y_0 + ε_1 + ε_2,
⋮
Y_t = Y_{t−1} + ε_t = Y_0 + ε_1 + ⋯ + ε_t = Y_0 + ∑_{j=1}^{t} ε_j.

Now, E[Y_t] = Y_0 which is independent of t. However,

var(Y_t) = var(∑_{j=1}^{t} ε_j) = ∑_{j=1}^{t} σ_ε^2 = σ_ε^2 × t,

which depends on t, and so {Y t} is not stationary.

Figure 4.6 shows a realization of the RW process with Y_0 = 0 and σ_ε^2 = 1 created
using the R commands:

set.seed(321)

e = rnorm(250)
y.rw = cumsum(e)

ts.plot(y.rw, lwd=2, col="blue", main="Random Walk")

abline(h=0)
Figure 4.6: Random walk process: Y t = Y t − 1 + ε t, ε t ∼ GWN(0, 1).

The RW process looks much different from the GWN process in Figure
4.2. As the variance of the
process
increases linearly with time, the uncertainty about where the process
will be at a given point in
time increases with time.

Although {Y t} is non-stationary, a simple first-differencing


transformation yields a covariance
stationary process:

X_t = Y_t − Y_{t−1} = ε_t ∼ GWN(0, σ_ε^2).
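
A quick check in R, using the y.rw and e objects simulated above (the use of diff() here is a sketch, not code from the text):

x = diff(y.rw)                       # first difference of the simulated random walk
all.equal(as.numeric(x), e[2:250])   # TRUE: the differences are just the GWN errors
ts.plot(x, lwd=2, col="blue", main="First Difference of Random Walk")
abline(h=0)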

Example 4.9 (Random walk with drift model for log stock prices)
Let r t denote the continuously compounded monthly return on
Microsoft stock and assume that
r t ∼ GWN(μ, σ 2).
Since r t = ln(P t / P t − 1) it follows that lnP t = lnP t − 1 + r t.
Now, re-express r t as r t = μ + ε t
where ε t ∼ GWN(0, σ 2).
Then lnP t = lnP t − 1 + μ + ε t. By recursive substitution
we have
ln P_t = ln P_0 + μt + ∑_{j=1}^{t} ε_j, and so ln P_t follows a random walk process with drift value μ.
Here, E[lnP t] = μt
and var(lnP t) = σ 2t
so lnP t is non-stationary because both the mean and variance
depend on t. In this
model, prices, however, do not follow a random
walk since P_t = e^{ln P_t} = P_{t−1} e^{r_t}.
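
A short simulation sketch of this model (the values μ = 0.01, σ = 0.05 and P_0 = 100 are illustrative assumptions, not values from the text):

set.seed(123)
r = rnorm(250, mean=0.01, sd=0.05)   # monthly returns r(t) ~ GWN(0.01, (0.05)^2)
lnP = log(100) + cumsum(r)           # log price: random walk with drift 0.01
P = exp(lnP)                         # price level
par(mfrow=c(2,1))
ts.plot(lnP, col="blue", lwd=2, main="Log Price: Random Walk with Drift")
ts.plot(P, col="blue", lwd=2, main="Price")
par(mfrow=c(1,1))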

4.1.3 Ergodicity

In a strictly stationary or covariance stationary stochastic process


no assumption is made about the
strength of dependence between random
variables in the sequence. For example, in a covariance
stationary
stochastic process it is possible that ρ 1 = cor(Y t, Y t − 1) = ρ 100 = cor(Y t, Y t − 100) = 0.5,
say.
However, in many contexts it is reasonable to assume that the
strength of dependence between random
variables in a stochastic process
diminishes the farther apart they become. That is, ρ_1 > ρ_2 > ⋯
and that
eventually ρ j = 0 for j large enough. This diminishing
dependence assumption is captured by the concept
of ergodicity.

Definition 4.3 (Ergodicity (intuitive definition))


Intuitively, a stochastic process {Y t} is ergodic if
any two collections of random variables partitioned far
apart in the
sequence are essentially independent.

The formal definition of ergodicity is highly technical and requires


advanced concepts in probability theory.
However, the intuitive definition
captures the essence of the concept. The stochastic process {Y t}
is
ergodic if Y t and Y t − j are essentially independent
if j is large enough.

If a stochastic process {Y t} is covariance stationary and


ergodic then strong restrictions are placed on the
joint behavior
of the elements in the sequence and on the type of temporal dependence
allowed.

Example 4.10 (White noise processes)


If {Y t} is GWN or IWN then it is both covariance stationary
and ergodic.

Example 4.11 (Covariance stationary but not ergodic process (White 1984, page 41))
Let Y t ∼ GWN(0, 1) and let X ∼ N(0, 1) independent
of {Y t}. Define Z t = Y t + X. Then {Z t} is covariance
stationary but not ergodic. To see why {Z t} is not ergodic,
note that for all j > 0:

var(Z t) = var(Y t + X) = 1 + 1 = 2,
γ j = cov(Y t + X, Y t − j + X) = cov(Y t, Y t − j) + cov(Y t, X) + cov(Y t − j, X) + cov(X, X)
= cov(X, X) = var(X) = 1,
ρ_j = 1/2 for all j.

Hence, the correlation between random variables separated far apart


does not eventually go to zero and
so {Z t} cannot be ergodic.
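
Note that a single realization will not reveal this: within one sample the common X is constant and gets absorbed into the sample mean. One way to see the non-vanishing correlation is with an ensemble calculation across many independent realizations (a sketch, not from the text):

set.seed(123)
nsim = 5000
x = rnorm(nsim)          # one draw of X per realization, common to all dates
z.t = rnorm(nsim) + x    # Z at some date t
z.s = rnorm(nsim) + x    # Z at a date s arbitrarily far from t
cor(z.t, z.s)            # close to 0.5 no matter how far apart t and s are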

4.2 Multivariate Time Series

Consider n time series variables {Y 1t}, …, {Y nt}.


A multivariate time series is the (n × 1) vector time
series
{Y t} where the i th row of {Y t}
is {Y it}. That is, for any time t, Y t = (Y 1t, …, Y nt) ′.
Multivariate time series
analysis is used when one wants to model
and explain the interactions and co-movements among a
group of time
series variables. In finance, multivariate time series analysis is
used to model systems of
asset returns, asset prices, exchange
rates, the term structure of interest rates, and economic variables,
etc.. Many of the time series concepts described previously for univariate time series carry over to
multivariate time series in a natural way. Additionally, there are some important time series concepts that
are particular to multivariate time series. The following sections give the details of these extensions.

4.2.1 Stationary and ergodic multivariate time series


A multivariate time series {Y t} is covariance stationary
and ergodic if all of its component time series are
stationary and
ergodic. The mean of Y t is defined as the (n × 1)
vector

E[Y t] = (μ 1, …, μ n) ′ = μ,

where μ i = E[Y it] for i = 1, …, n. The variance/covariance


matrix of Y t is the (n × n) matrix

var(Y_t) = Σ = E[(Y_t − μ)(Y_t − μ)′]

  = \begin{pmatrix}
      var(Y_{1t}) & cov(Y_{1t}, Y_{2t}) & ⋯ & cov(Y_{1t}, Y_{nt}) \\
      cov(Y_{2t}, Y_{1t}) & var(Y_{2t}) & ⋯ & cov(Y_{2t}, Y_{nt}) \\
      ⋮ & ⋮ & ⋱ & ⋮ \\
      cov(Y_{nt}, Y_{1t}) & cov(Y_{nt}, Y_{2t}) & ⋯ & var(Y_{nt})
    \end{pmatrix}.

The matrix Σ has elements σ ij = cov(Y it, Y jt) which measure


the contemporaneous linear dependence
between Y it and Y jt that is time invariant.
The correlation matrix of Y t is the (n × n)
matrix

cor(Y_t) = C_0 = D^{−1} Γ_0 D^{−1},

where D is an (n × n) diagonal matrix with j th


diagonal element σ j = sd(Y jt).
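
A small numerical sketch of the relation C_0 = D^{−1} Γ_0 D^{−1} (here Γ_0 = Σ; the matrix values are illustrative and match the bivariate example used below):

Sigma = matrix(c(4, 1, 1, 1), 2, 2)   # illustrative covariance matrix
D.inv = diag(1/sqrt(diag(Sigma)))     # D^{-1}: reciprocals of the standard deviations
C0 = D.inv %*% Sigma %*% D.inv        # correlation matrix
C0
cov2cor(Sigma)                        # built-in check gives the same answer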

4.2.1.1 Cross covariance and correlation matrices

For a univariate time series {Y t}, the autocovariances, γ k,


and autocorrelations, ρ k, summarize the linear
time dependence
in the data. With a multivariate time series {Y t}
each component has autocovariances
and autocorrelations but there
are also cross lead-lag covariances and correlations between all possible
pairs of components. The lag k autocovariances and autocorrelations of Y jt,
for j = 1, …, n, are defined as

γ_{jj}^{k} = cov(Y_{jt}, Y_{jt−k}),
ρ_{jj}^{k} = corr(Y_{jt}, Y_{jt−k}) = γ_{jj}^{k} / σ_j^2,

and these are symmetric in k: γ_{jj}^{k} = γ_{jj}^{−k}, ρ_{jj}^{k} = ρ_{jj}^{−k}. The cross lag-k covariances and cross lag-k correlations between Y_{it} and Y_{jt} are defined as

γ_{ij}^{k} = cov(Y_{it}, Y_{jt−k}),
ρ_{ij}^{k} = corr(Y_{it}, Y_{jt−k}) = γ_{ij}^{k} / √(σ_i^2 σ_j^2),

and they are not necessarily symmetric in k. In general,

γ_{ij}^{k} = cov(Y_{it}, Y_{jt−k}) ≠ cov(Y_{jt}, Y_{it−k}) = γ_{ji}^{k}.

If γ_{ij}^{k} ≠ 0 for some k > 0 then Y_{jt} is said to lead Y_{it}. This implies that past values of Y_{jt}
are useful for predicting future values of Y_{it}. Similarly, if γ_{ji}^{k} ≠ 0 for some k > 0 then Y_{it}
is said to lead Y_{jt}. It is possible that Y_{it} leads Y_{jt} and vice-versa. In this case, there is said
to be dynamic feedback between the two series.

All of the lag k cross covariances and correlations are summarized


in the (n × n) lag k cross covariance
and lag k cross
correlation matrices

Γ_k = E[(Y_t − μ)(Y_{t−k} − μ)′]

  = \begin{pmatrix}
      cov(Y_{1t}, Y_{1t−k}) & cov(Y_{1t}, Y_{2t−k}) & ⋯ & cov(Y_{1t}, Y_{nt−k}) \\
      cov(Y_{2t}, Y_{1t−k}) & cov(Y_{2t}, Y_{2t−k}) & ⋯ & cov(Y_{2t}, Y_{nt−k}) \\
      ⋮ & ⋮ & ⋱ & ⋮ \\
      cov(Y_{nt}, Y_{1t−k}) & cov(Y_{nt}, Y_{2t−k}) & ⋯ & cov(Y_{nt}, Y_{nt−k})
    \end{pmatrix},

C_k = D^{−1} Γ_k D^{−1}.

The matrices Γ_k and C_k are not symmetric in k, but it is easy to show that Γ_{−k} = Γ_k′ and C_{−k} = C_k′.

Example 4.12 (Multivariate Gaussian white noise processes)


Let {Y t} be an n × 1 vector time series process.
If Y t ∼ iid N(0, Σ) then
{Y t} is called multivariate
Gaussian white noise
and is denoted Y t ∼ GWN(0, Σ).
Notice that

E[Y t] = 0,
var(Y t) = Σ,
cov(Y_{jt}, Y_{jt−k}) = γ_{jj}^{k} = 0 (for k > 0),
cov(Y_{it}, Y_{jt−k}) = γ_{ij}^{k} = 0 (for k > 0),

Hence, the elements of {Y t} are contemporaneously correlated but


exhibit no time dependence. That is,
each element of Y t exhibits no time dependence and
there is no dynamic feedback between any two
elements. Simulating
observations from GWN(0, Σ) requires
simulating from a multivariate normal
distribution, which can be done
using the mvtnorm function  rmvnorm() . For example,
to simulate and
plot T = 250 observations from a bivariate GWN(0, Σ) process with

Σ = \begin{pmatrix} 4 & 1 \\ 1 & 1 \end{pmatrix} ⇒ C = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}

use:

library(mvtnorm)

Sigma = matrix(c(4, 1, 1, 1), 2, 2)

set.seed(123)

Y = rmvnorm(250, sigma=Sigma)

colnames(Y) = c("Y1", "Y2")

ts.plot(Y, lwd=2, col=c("black", "blue"))

abline(h=0)

legend("topleft", legend=c("Y1", "Y2"),

lwd=2, col=c("black", "blue"), lty=1)


Figure 4.7: Simulated bivariate GWN process.

The simulated values are shown on the same plot in Figure 4.7.
Both series fluctuate randomly about
zero, and the first series (black
line) has larger fluctuations (volatility) than the second series
(blue line).
The two series are contemporaneously correlated (ρ 12 = 0.5)
but are both uncorrelated over time (ρ_{11}^{k} = ρ_{22}^{k} = 0, k > 0)
and are not cross-lag correlated (ρ_{12}^{k} = ρ_{21}^{k} = 0, k > 0).
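
These properties can be checked on the simulated data by calling acf() on the matrix Y, which computes the sample autocorrelations of each series along with the sample cross-lag correlations (a sketch using the objects created above):

acf(Y, lag.max=5)    # own-lag and cross-lag sample correlations near zero for k > 0
cor(Y)               # contemporaneous sample correlation near 0.5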

4.3 Time Series Models

Time series models are probability models that are used to describe
the behavior of a stochastic process.
In many cases of interest, it
is assumed that the stochastic process to be modeled is covariance
stationary and ergodic. Then, the main feature of the process to be
modeled is the time dependence
between the random variables. In this
section, we illustrate some simple models for covariance stationary
and ergodic time series that exhibit particular types of linear time
dependence captured by
autocorrelations. The univariate models, made
popular originally by (Box and Jenkins 1976), are called
autoregressive
moving average (ARMA) models. The multivariate model, made popular
by (Sims 1980),
is called the vector autoregressive (VAR) model.
These models are used extensively in economics and
finance for modeling
univariate and multivariate time series.

4.3.1 Moving average models

Moving average models are simple covariance stationary and ergodic


time series models built from linear
functions of GWN that can capture
time dependence between random variables that lasts only for a finite
number of lags.
4.3.1.1 MA(1) Model

Suppose you want to create a covariance stationary and ergodic stochastic


process {Y t} in which Y t and
Y t − 1 are correlated
but Y t and Y t − j are not correlated for j > 1. That is,
the time dependence in the
process only lasts for one period. Such
a process can be created using the first order moving average
(MA(1)) model:

Y_t = μ + ε_t + θ ε_{t−1},  −1 < θ < 1,
ε_t ∼ GWN(0, σ_ε^2).

The MA(1) model is a simple linear function of the GWN random variables
ε t and ε t − 1. This linear
structure
allows for easy analysis of the model. The moving average parameter
θ determines the sign and
magnitude of the correlation between
Y t and Y t − 1. Clearly, if θ = 0 then Y t = μ + ε t
so that {Y t} is GWN with
non-zero mean μ and exhibits
no time dependence. As will be shown below, the MA(1) model produces
a
covariance stationary and ergodic process for any (finite) value
of θ. The restriction − 1 < θ < 1 is called
the invertibility restriction and will be explained below.

To verify that the MA(1) process (4.2) is a covariance stationary


process we must show that the mean, variance and
autocovariances are
time invariant. For the mean, we have:

E[Y t] = μ + E[ε t] + θE[ε t − 1] = μ,

because E[ε t] = E[ε t − 1] = 0.

For the variance, we have

var(Y_t) = σ^2 = E[(Y_t − μ)^2] = E[(ε_t + θ ε_{t−1})^2]
  = E[ε_t^2] + 2θ E[ε_t ε_{t−1}] + θ^2 E[ε_{t−1}^2]
  = σ_ε^2 + 0 + θ^2 σ_ε^2 = σ_ε^2 (1 + θ^2).

The term E[ε tε t − 1] = cov(ε t, ε t − 1) = 0


because {ε t} is an independent process.

For γ 1 = cov(Y t, Y t − 1), we have:

cov(Y_t, Y_{t−1}) = E[(Y_t − μ)(Y_{t−1} − μ)]
  = E[(ε_t + θ ε_{t−1})(ε_{t−1} + θ ε_{t−2})]
  = E[ε_t ε_{t−1}] + θ E[ε_t ε_{t−2}] + θ E[ε_{t−1}^2] + θ^2 E[ε_{t−1} ε_{t−2}]
  = 0 + 0 + θ σ_ε^2 + 0 = θ σ_ε^2.

Here, the sign of γ 1 is the same as the sign of θ.

For ρ 1 = cor(Y t, Y t − 1) we have:

ρ_1 = γ_1 / σ^2 = θ σ_ε^2 / (σ_ε^2 (1 + θ^2)) = θ / (1 + θ^2).
Clearly, ρ 1 = 0 if θ = 0; ρ 1 > 0 if θ > 0; ρ 1 < 0
if θ < 0. Also, the largest value for | ρ 1 | is 0.5 which
occurs
when | θ | = 1. Hence, a MA(1) model cannot describe a
stochastic process that has | ρ 1 | > 0.5. Also, note
that there
is more than one value of θ that produces the same value of
ρ 1. For example, θ and 1/θ give
the same
value for ρ 1. The invertibility restriction − 1 < θ < 1
provides a unique mapping between θ and
ρ 1.
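
A quick numerical check of these last two points (a sketch):

rho1 = function(theta) theta/(1 + theta^2)
rho1(0.5)      # 0.4
rho1(1/0.5)    # theta = 2 gives the same value, 0.4
rho1(1)        # the maximum possible value, 0.5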

For γ 2 = cov(Y t, Y t − 2), we have:

cov(Y t, Y t − 2) = E[(Y t − μ)(Y t − 2 − μ)]


= E[(ε t + θε t − 1)(ε t − 2 + θε t − 3)]
= E[ε tε t − 2] + θE[ε tε t − 3]
+ θE[ε t − 1ε t − 2] + θ 2E[ε t − 1ε t − 3]
= 0 + 0 + 0 + 0 = 0.

Similar calculations can be used to show that cov(Y t, Y t − j) = γ j = 0 for j > 1.


Hence, for j > 1 we have
ρ j = 0 and there is only time dependence
between Y t and Y t − 1 but no time dependence between Y t
and
Y t − j for j > 1. Because ρ j = 0 for j > 1 the MA(1)
process is trivially ergodic.

Example 4.13 (Simulating values from MA(1) process)


Consider simulating T = 250 observations from (4.2) with
μ = 1, θ = 0.9 and σ ε = 1. When simulating
an
MA(1) process, you need to decide how to start the simulation.
The value of Y t at t = 0, y 0, is called the
initial
value and is the starting value for the simulation. Now, the first
two observations from (4.2) starting
at t = 0 are

y 0 = μ + ε 0 + θε − 1,
y 1 = μ + ε 1 + θε 0.

It is common practice to set ε_{−1} = ε_0 = 0


so that y 0 = μ, y 1 = μ + ε 1 = y 0 + ε 1
and ε 1 is the first simulated
error term. The remaining
observations for t = 2, …, T are created from (4.2).
We can implement the
simulation in a number of ways in R. The most
straightforward way is to use a simple loop:

n.obs = 250

mu = 1

theta = 0.9

sigma.e = 1

set.seed(123)

e = rnorm(n.obs, sd = sigma.e)

y = rep(0, n.obs)

y[1] = mu + e[1]

for (i in 2:n.obs) {

y[i] = mu + e[i] + theta*e[i-1]

}

head(y, n=3)

## [1] 0.4395 0.2654 2.3515


The “for loop” in R can be slow, however, especially for a very
large number of simulations. The
simulation can be more efficiently
implemented using vectorized calculations as illustrated below:

set.seed(123)

e = rnorm(n.obs, sd = sigma.e)

em1 = c(0, e[1:(n.obs-1)])

y = mu + e + theta*em1

head(y, n=3)

## [1] 0.4395 0.2654 2.3515

The vectorized calculation avoids looping all together and computes


all of the simulated values at the
same time. This can be considerably
faster than the “for loop” calculation.

The MA(1) model is a special case of the more general autoregressive


integrated moving average
(ARIMA) model. R has many built-in functions
and several packages for working with ARIMA models. In
particular,
the R function  arima.sim()  can be used to simulate observations
from a MA(1) process. It
essentially implements the simulation loop
described above. The arguments of  arima.sim()  are:

args(arima.sim)

## function (model, n, rand.gen = rnorm, innov = rand.gen(n, ...),

## n.start = NA, start.innov = rand.gen(n.start, ...), ...)

## NULL

where  model  is a list with named components describing the


ARIMA model parameters (excluding the
mean value),  n  is the
number of simulated observations,  rand.gen  specifies the
pdf for ε t,  innov  is a
vector of ε_t values
of length  n ,  n.start  is the number of pre-simulation
(burn-in) values for ε t,
 start.innov  is a
vector of  n.start  pre-simulation values for ε t,
and  ...  specify any additional

arguments for  rand.gen .


For example, to perform the same simulations as above use:

ma1.model = list(ma=0.9)

set.seed(123)

y = mu + arima.sim(model=ma1.model, n=250,

n.start=1, start.innov=0)

head(y, n=3)

## [1] 0.4395 0.2654 2.3515

The  ma  component of the  "list"  object  ma1.model 


specifies the value of θ for the MA(1) model, and
is used
as an input to the function  arima.sim() . The options  n.start = 1  and  start.innov = 0  set
the start-up initial value
ε 0 = 0. By default,  arima.sim()  sets μ = 0,
specifies ε t ∼ GWN(0, 1), and returns
ε t + θε t − 1 for t = 1, …, T. The simulated value for Y t is constructed by
adding the value of  mu  (μ = 1) to
the output of  arima.sim() .

The function  ARMAacf()  can be used to compute the theoretical


autocorrelations, ρ j, from the MA(1)
model (recall, ρ 1 = θ / (1 + θ 2)
and ρ j = 0 for j > 1). For example, to compute ρ j
for j = 1, …, 10 use:

ma1.acf = ARMAacf(ar=0, ma=0.9, lag.max=10)

ma1.acf

## 0 1 2 3 4 5 6 7 8 9

## 1.0000 0.4972 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

## 10

## 0.0000

Figure 4.8 shows the simulated data


and the theoretical ACF created using:

par(mfrow=c(2,1))

ts.plot(y,main="MA(1) Process: mu=1, theta=0.9",

xlab="time",ylab="y(t)", col="blue", lwd=2)

abline(h=c(0,1))

plot(0:10, ma1.acf,type="h", col="blue", lwd=2,

main="ACF for MA(1): theta=0.9",xlab="lag",ylab="rho(j)")

abline(h=0)
Figure 4.8: Simulated values and theoretical ACF from MA(1) process with μ = 1, θ = 0.9 and σ_ε^2 = 1.

par(mfrow=c(1,1))

Compared to the GWN process in Figure 4.2,


the MA(1) process is a bit smoother in its appearance. This is due
to the positive one-period time dependence captured by ρ 1 = 0.4972.

Example 4.14 (MA(1) model for overlapping continuously compounded returns)


Let R t denote the one-month continuously compounded return and
assume that:

R t ∼  GWN(μ, σ 2).

Consider creating a monthly time series of two-month continuously


compounded returns using:

R t(2) = R t + R t − 1.

The time series of these two-month returns, observed monthly, overlap


by one month:

R t(2) = R t + R t − 1,
R t − 1(2) = R t − 1 + R t − 2,
R t − 2(2) = R t − 2 + R t − 3,

The one-month overlap in the two-month returns implies that {R t(2)}
follows an MA(1) process. To show
this, we need to show that the autocovariances
of {R t(2)} behave like the autocovariances of an MA(1)
process.

To verify that {R t(2)} follows an MA(1) process, first we


have:

E[R t(2)] = E[R t] + E[R t − 1] = 2μ,


var(R t(2)) = var(R t + R t − 1) = 2σ 2.

Next, we have:

cov(R t(2), R t − 1(2)) = cov(R t + R t − 1, R t − 1 + R t − 2) = cov(R t − 1, R t − 1) = var(R t − 1) = σ 2,

and,

cov(R t(2), R t − 2(2)) = cov(R t + R t − 1, R t − 2 + R t − 3) = 0,


cov(R t(2), R t − j(2)) = 0 for j > 1.

Hence, the autocovariances of {R t(2)} are those of an MA(1)


process.

Notice that

ρ_1 = σ^2 / (2σ^2) = 1/2.

What MA(1) process describes {R_t(2)}? Because ρ_1 = θ / (1 + θ^2) = 0.5
it follows that θ = 1. Hence, the MA(1) process has mean 2μ
and θ = 1 and can be expressed as the MA(1) model:

R t(2) = 2μ + ε t + ε t − 1,
ε t ∼ GWN(0, σ 2).

Notice that this is a non-invertible MA(1) model.
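
A simulation sketch of this result (the values μ = 0.01 and σ = 0.05 are illustrative; stats::filter() is used here to construct the overlapping two-month returns):

set.seed(123)
r = rnorm(10000, mean=0.01, sd=0.05)
r2 = stats::filter(r, c(1, 1), sides=1)   # R_t(2) = R_t + R_{t-1}, observed monthly
r2 = na.omit(as.numeric(r2))
acf(r2, lag.max=5)    # lag 1 sample autocorrelation near 0.5, near zero afterward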

4.3.1.2 MA(q) Model

The MA(q) model has the form

Y_t = μ + ε_t + θ_1 ε_{t−1} + ⋯ + θ_q ε_{t−q},  where ε_t ∼ GWN(0, σ_ε^2).

The MA(q) model is stationary and ergodic provided θ 1, …, θ q


are finite. The moments of the MA(q) (see
end-of-chapter exercises)
are

E[Y_t] = μ,
γ_0 = σ_ε^2 (1 + θ_1^2 + ⋯ + θ_q^2),
γ_j = (θ_j + θ_{j+1} θ_1 + θ_{j+2} θ_2 + ⋯ + θ_q θ_{q−j}) σ_ε^2  for j = 1, 2, …, q,
γ_j = 0  for j > q.

Hence, the ACF of an MA(q) is non-zero up to lag q and is zero


afterward.
Example 4.15 (Overlapping returns and MA(q) models)
MA(q) models often arise in finance through data aggregation transformations.
For example, let
R t = ln(P t / P t − 1) denote the monthly continuously
compounded return on an asset with price P t. Define the
annual
return at time t using monthly returns as R t(12) = ln(P t / P t − 12) = ∑ 11
j = 0 R t − j.
Suppose
R t ∼ GWN(μ, σ 2) and consider a sample
of monthly returns of size T, {R 1, R 2, …, R T}.
A sample of annual
returns may be created using overlapping
or non-overlapping returns. Let {R 12(12), R 13(12),
…, R T(12)}
denote a sample of T ∗ = T − 11 monthly
overlapping annual returns and {R 12(12), R 24(12), …, R T(12)}
denote a sample of T / 12 non-overlapping annual returns. Researchers
often use overlapping returns in
analysis due to the apparent larger
sample size. One must be careful using overlapping returns because
the monthly annual return sequence {R t(12)} is not a Gaussian
white noise process even if the monthly
return sequence {R t}
is. To see this, straightforward calculations give:

E[R t(12)] = 12μ,


γ 0 = var(R t(12)) = 12σ 2,
γ j = cov(R t(12), R t − j(12)) = (12 − j)σ 2 for j < 12,
γ j = 0 for j ≥ 12.

Since γ j = 0 for j ≥ 12 notice that {R t(12)}


behaves like an MA(11) process:

R t(12) = 12μ + ε t + θ 1ε t − 1 + ⋯ + θ 11ε t − 11,


ε t ∼ GWN(0, σ 2).

4.3.2 Autoregressive Models

Moving average models can capture almost any kind of autocorrelation


structure. However, this may
require many moving average terms in
(4.3). Another type of simple time series model is the
autoregressive model. This model can capture complex autocorrelation
patterns with a small number of
parameters and is used more often
in practice than the moving average model.

4.3.2.1 AR(1) Model

Suppose you want to create a covariance stationary and ergodic stochastic


process {Y t} in which Y t and
Y t − 1 are correlated,
Y t and Y t − 2 are slightly less correlated, Y t and
Y t − 3 are even less correlated and
eventually Y t and Y t − j
are uncorrelated for j large enough. That is, the time dependence
in the process
decays to zero as the random variables in the process
get farther and farther apart. Such a process can
be created using
the first order autoregressive (AR(1)) model:

Y_t − μ = ϕ(Y_{t−1} − μ) + ε_t,  −1 < ϕ < 1,
ε_t ∼ iid N(0, σ_ε^2).

It can be shown that the AR(1) model is covariance stationary and


ergodic provided − 1 < ϕ < 1. We will
show that the AR(1) process
has the following properties:
E[Y_t] = μ,
var(Y_t) = σ^2 = σ_ε^2 / (1 − ϕ^2),
cov(Y_t, Y_{t−1}) = γ_1 = σ^2 ϕ,
cor(Y_t, Y_{t−1}) = ρ_1 = γ_1 / σ^2 = ϕ,
cov(Y_t, Y_{t−j}) = γ_j = σ^2 ϕ^j,
cor(Y_t, Y_{t−j}) = ρ_j = γ_j / σ^2 = ϕ^j.

Notice that the restriction | ϕ | < 1 implies that:

lim_{j→∞} ρ_j = lim_{j→∞} ϕ^j = 0,

so that Y t is essentially independent of Y t − j for large


j and so {Y t} is ergodic. For example, if ϕ = 0.5
then
ρ 10 = (0.5) 10 = 0.001; if ϕ = 0.9 then ρ 10 = (0.9) 10 = 0.349.
Hence, the closer ϕ is to unity the stronger is the
time dependence
in the process. If ϕ = 1, then (4.4) becomes the random
walk model Y t = Y t − 1 + ε t and is
a non-stationary
process.

Verifying covariance stationarity for the AR(1) model is more involved


than for the MA(1) model, and
establishing the properties (4.5)
- (4.10) involves some tricks. In what follows, we
will assume that {Y t} is
a covariance stationary process and
that | ϕ | < 1. First, consider the derivation for (4.5).
We have:

E[Y t] = μ + ϕ(E[Y t − 1] − μ) + E[ε t]


= μ + ϕE[Y t − 1] − ϕμ.

Here, we use the first trick. Given that {Y t} is covariance stationary it follows that E[Y t] = E[Y t − 1].
Substituting E[Y t] = E[Y t − 1] into the above and solving for
E[Y t] gives (4.5).

A similar trick can be used to derive (4.6):

var(Y_t) = ϕ^2 var(Y_{t−1}) + var(ε_t) = ϕ^2 var(Y_t) + σ_ε^2,

which uses the fact that Y t − 1 is independent of ε t


(because Y t − 1 only depends on t − 1 values) and
var(Y t) = var(Y t − 1)
given that {Y t} is covariance stationary. Solving for σ 2 = var(Y t)
gives (4.6).

To determine (4.7), we use another trick. Multiply both sides of (4.4) by Y t − 1 − μ and take expectations to
give:

γ1 = E [(Y t − μ )(Y t − 1 − μ )] = ϕE [(Y t − 1 − μ) 2 ] + E [ε t (Y t − 1 − μ )] = ϕσ 2,


which uses the fact that Y t − 1 is independent of ε t,
and var(Y t) = var(Y t − 1) = σ 2. Finally,
to determine (4.9),
multiply both sides of (4.4)
by Y t − j − μ and take expectations to give:

γj = E [(Y t − μ )(Y t − j − μ )] = ϕE [(Y t − 1 − μ)(Y t − j − μ) ] + E [ε t (Y t − j − μ )]


= ϕγ j − 1,

which uses the fact that Y_{t−j} is independent of ε_t,
and E[(Y_{t−1} − μ)(Y_{t−j} − μ)] = γ_{j−1} provided
{Y t} is
covariance stationary. Using recursive substitution
and γ 0 = σ 2 gives (4.9).
The AR(1) model (4.4) is written in what is called the mean-adjusted form. The
mean-adjusted form can
be re-expressed in the form of a linear regression
model as follows:

Y t − μ = ϕ(Y t − 1 − μ) + ε t ⇒
Y t = μ − ϕμ + ϕY t − 1 + ε t
= c + ϕY t − 1 + ε t,

where c = (1 − ϕ)μ ⇒ μ = c / (1 − ϕ) is the intercept of the linear


regression. This regression model
form is
convenient for estimation by ordinary least squares.
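
A minimal sketch of this idea: simulate an AR(1) series and estimate c and ϕ by regressing Y_t on Y_{t−1} with lm() (the parameter values are illustrative; this is a sketch of the regression form, not an estimation procedure developed in this chapter):

set.seed(123)
y = 1 + arima.sim(model=list(ar=0.9), n=1000)   # AR(1) with mu = 1, phi = 0.9
fit = lm(y[-1] ~ y[-length(y)])                 # regress Y_t on Y_{t-1}
coef(fit)    # intercept near c = (1 - 0.9)*1 = 0.1, slope near phi = 0.9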

Example 4.16 (Simulating values from AR(1) process)


Consider simulating 250 observations from (4.4) with μ = 1,
ϕ = 0.9 and σ ε = 1. To start the simulation,
an
initial value or start-up value for Y 0 is required. A commonly
used initial value is the mean value μ so that
Y 1 = μ + ε 1.
As with the MA(1) model, this can be performed using a simple “for loop” in R:

phi = 0.9

mu = 1

sigma.e = 1

n.obs = 250

y = rep(0, n.obs)

set.seed(123)

e = rnorm(n.obs, sd=sigma.e)

y[1] = mu + e[1]

for (i in 2:n.obs) {

y[i] = mu + phi*(y[i-1] - mu) + e[i]

}

head(y, 3)

## [1] 0.4395 0.2654 1.8976

Unfortunately, there is no easy way to vectorize the loop calculation.


However, the R function  filter() ,
with optional argument
 method = "recursive" , implements the AR(1) recursion
efficiently in C code and
so is more efficient than the for loop code
in R above:

y = mu + stats::filter(e, phi, method="recursive")

head(y, 3)

## [1] 0.4395 0.2654 1.8976

The R function  arima.sim() , which internally uses the  filter() 


function, can also be used to simulate
observations from an AR(1)
process. For the AR(1) model, the function  arima.sim()  simulates
the
components form of the AR(1) model
Y_t = μ + u_t,
u_t = ϕ u_{t−1} + ε_t.

Hence, to replicate the “for loop” simulation with  arima.sim() 


use:

ar1.model = list(ar=0.9)

mu = 1

set.seed(123)

y = mu + arima.sim(model=ar1.model,n=250,n.start=1, start.innov=0)

head(y, 3)

## [1] 0.4395 0.2654 1.8976

The R function  ARMAacf()  can be used to compute the theoretical


ACF for an AR(1) model as follows

ar1.acf = ARMAacf(ar = 0.9, ma = 0, lag.max = 10)

The simulated AR(1) values and the ACF are shown in Figure 4.9.
Compared to the MA(1) process in
Figure 4.8,
the realizations from the AR(1) process are much smoother. That is,
when Y t wanders high
above its mean it tends to stay above the
mean for a while and when it wanders low below the mean it
tends to
stay below for a while.


Figure 4.9: Simulated values and ACF from AR(1) model with μ = 1, ϕ = 0.9 and σ_ε^2 = 1.

Example 4.17 (AR(1) models and the speed of mean reversion)


To be completed: insert an example showing AR(1) models with different values of ϕ, discuss the concept of mean reversion, and discuss the RW model as a special case of the AR(1). A simulation sketch along these lines is given below.
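
A minimal simulation sketch (the ϕ values are illustrative):

phis = c(0.1, 0.5, 0.9, 0.99)
set.seed(123)
par(mfrow=c(2,2))
for (phi in phis) {
  y = arima.sim(model=list(ar=phi), n=250)
  ts.plot(y, col="blue", lwd=2, main=paste("AR(1): phi =", phi))
  abline(h=0)
}
par(mfrow=c(1,1))

The larger ϕ is, the more slowly the simulated series reverts to its mean of zero; as ϕ approaches one the behavior approaches that of the (non-stationary) random walk.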

4.3.2.2 AR(p) Model

The covariance stationary AR(p) model in mean-adjusted form is

Y t − μ = ϕ 1(Y t − 1 − μ) + ⋯ + ϕ p(Y t − p − μ) + ε t,
ε_t ∼ GWN(0, σ_ε^2),

where μ = E[Y t]. Like the AR(1), restrictions on the autoregressive


parameters ϕ 1, …, ϕ p are required for
{Y t}
to be covariance stationary and ergodic. A detailed treatment of these
restrictions is beyond the
scope of this book. However, one simple
necessary condition for {Y t} to be covariance stationary is

| ϕ | < 1 where ϕ = ϕ 1 + ⋯ + ϕ p. Hence, in the


AR(p) model the sum of the autoregressive components ϕ
has a
similar interpretation as the single autoregressive coefficient in
the AR(1) model.

The regression form of the AR(p) is


Y t = c + ϕ 1Y t − 1 + ⋯ + ϕ pY t − p + ε t,

where c = μ(1 − ϕ_1 − ⋯ − ϕ_p) = μ(1 − ϕ). This form


is convenient for estimation purposes because it is in
the form of
a linear regression.

The regression form of the AR(p) model is used very often in practice because of its simple
linear
structure and because it can capture a wide variety of autocorrelation
patterns such as exponential decay,
damped cyclical patterns, and
oscillating damped cyclical patterns. Unfortunately, the mathematical
derivation of the autocorrelations in the AR(p) model is complicated
and tedious (and beyond the scope of
this book). The exercises at the end of the chapter illustrate some of the calculations for the AR(2) model.
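
As a small illustration of these richer patterns, ARMAacf() can be used to compare the theoretical ACF of two AR(2) models, one with smooth decay and one with damped cycles (the coefficient values are illustrative):

acf.decay = ARMAacf(ar=c(0.6, 0.2), lag.max=10)    # real roots: smooth decay
acf.cycle = ARMAacf(ar=c(0.5, -0.8), lag.max=10)   # complex roots: damped cycles
par(mfrow=c(2,1))
plot(0:10, acf.decay, type="h", col="blue", lwd=2,
     main="AR(2): phi1=0.6, phi2=0.2", xlab="lag", ylab="rho(j)")
abline(h=0)
plot(0:10, acf.cycle, type="h", col="blue", lwd=2,
     main="AR(2): phi1=0.5, phi2=-0.8", xlab="lag", ylab="rho(j)")
abline(h=0)
par(mfrow=c(1,1))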

4.3.3 Autoregressive Moving Average Models

Autoregressive and moving average models can be combined into a general


model called the
autoregressive moving average (ARMA) model. The ARMA
model with p autoregressive components and
q moving average
components, denoted ARMA(p,q) is given by

Y t − μ = ϕ 1(Y t − 1 − μ) + ⋯ + ϕ p(Y t − p − μ)
+ ε t + θ 1ε t − 1 + ⋯ + θ qε t − q
ε t ∼ GWN(0, σ 2)

The regression formulation is

Y_t = c + ϕ_1 Y_{t−1} + ⋯ + ϕ_p Y_{t−p} + ε_t + θ_1 ε_{t−1} + ⋯ + θ_q ε_{t−q},

where c = μ(1 − ϕ_1 − ⋯ − ϕ_p) = μ(1 − ϕ) and ϕ = ϕ_1 + ⋯ + ϕ_p.
This model combines aspects of the pure
moving average models and
the pure autoregressive models and can capture many types of
autocorrelation
patterns. For modeling typical non-seasonal economic and financial
data, it is seldom
necessary to consider models in which p > 2 and
q > 2. For example, the simple ARMA(1,1) model

Y t − μ = ϕ 1(Y t − 1 − μ) + ε t + θ 1ε t − 1

can capture many realistic autocorrelations observed in data.

4.3.4 Vector Autoregressive Models

The most popular multivariate time series model is the vector autoregression (VAR) model. The VAR model is a multivariate
extension
of the univariate autoregressive model (4.12). For example,
a bivariate VAR(1) model for
Y t = (Y 1t, Y 2t) ′
has the form

\begin{pmatrix} Y_{1t} \\ Y_{2t} \end{pmatrix}
  = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
  + \begin{pmatrix} a_{11}^{1} & a_{12}^{1} \\ a_{21}^{1} & a_{22}^{1} \end{pmatrix}
    \begin{pmatrix} Y_{1t−1} \\ Y_{2t−1} \end{pmatrix}
  + \begin{pmatrix} ε_{1t} \\ ε_{2t} \end{pmatrix},

or

Y_{1t} = c_1 + a_{11}^{1} Y_{1t−1} + a_{12}^{1} Y_{2t−1} + ε_{1t},
Y_{2t} = c_2 + a_{21}^{1} Y_{1t−1} + a_{22}^{1} Y_{2t−1} + ε_{2t},

where

\begin{pmatrix} ε_{1t} \\ ε_{2t} \end{pmatrix}
  ∼ iid N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
                \begin{pmatrix} σ_{11} & σ_{12} \\ σ_{12} & σ_{22} \end{pmatrix} \right).

In the equations for Y 1 and Y 2, the lagged values of both


Y 1 and Y 2 are present. Hence, the VAR(1) model
allows
for dynamic feedback between Y 1 and Y 2 and can capture
cross-lag correlations between the
variables. In matrix notation,
the model is

Y t = AY t − 1 + ε t,
ε t ∼ N(0, Σ),

where

A = \begin{pmatrix} a_{11}^{1} & a_{12}^{1} \\ a_{21}^{1} & a_{22}^{1} \end{pmatrix},
Σ = \begin{pmatrix} σ_{11} & σ_{12} \\ σ_{12} & σ_{22} \end{pmatrix}.

The general VAR(p) model for Y t = (Y 1t, Y 2t, …, Y nt) ′


has the form

Y t = c + A 1Y t − 1 + A 2Y t − 2 + ⋯ + A pY t − p + ε t,

where A i are (n × n) coefficient matrices and


ε t is an (n × 1) unobservable zero
mean white noise vector
process with covariance matrix Σ.
VAR models are capable of capturing much of the complicated
dynamics
observed in stationary multivariate time series.

Example 4.18 (Simulated values from a bivariate VAR(1) process)


To be completed
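
In the meantime, a minimal sketch of how such a simulation could be done with a loop and rmvnorm() from mvtnorm (the coefficient matrix A and error covariance Σ below are assumed, illustrative values):

library(mvtnorm)
A = matrix(c(0.7, 0.2, 0.2, 0.7), 2, 2)   # VAR(1) coefficients (eigenvalues 0.9 and 0.5)
Sigma = matrix(c(1, 0.5, 0.5, 1), 2, 2)   # error covariance matrix
n.obs = 250
set.seed(123)
eps = rmvnorm(n.obs, sigma=Sigma)
Y = matrix(0, n.obs, 2)
Y[1, ] = eps[1, ]
for (t in 2:n.obs) {
  Y[t, ] = A %*% Y[t-1, ] + eps[t, ]      # VAR(1) recursion (zero intercept assumed)
}
colnames(Y) = c("Y1", "Y2")
ts.plot(Y, col=c("black", "blue"), lwd=2, main="Simulated Bivariate VAR(1)")
abline(h=0)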

4.4 Forecasting

One of the main practical uses of time series models is for forecasting future observations. The existence of time dependence in a covariance stationary time series means that we can exploit this time dependence to obtain forecasts of future observations that are superior to the unconditional mean.

To be completed: describe the basic forecasting problem, show that the conditional mean is the optimal MSE forecast, and show the chain rule of forecasting for AR(1) processes. A sketch of the AR(1) chain rule is given below.
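
For the last item, a brief sketch of the AR(1) chain rule: iterating the mean-adjusted AR(1) recursion forward and taking conditional expectations gives the h-step-ahead forecast Ŷ_{T+h|T} = μ + ϕ^h (Y_T − μ), which decays geometrically from the current value toward the unconditional mean μ. A small illustration (the numerical values are illustrative):

ar1.forecast = function(yT, mu, phi, h.max=12) {
  mu + phi^(1:h.max)*(yT - mu)     # h-step forecasts for h = 1, ..., h.max
}
ar1.forecast(yT=2, mu=1, phi=0.9)  # forecasts decay from 1.9 toward mu = 1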

4.5 Further Reading: Time Series Concepts

This chapter gives a very brief introduction to time series modeling.


More thorough treatments of time
series analysis with an orientation
towards economics and finance with examples in R are given in
(Ruppert and Matteson 2015), (Tsay 2010), and (Zivot 2016). The CRAN task
view for Time Series is an
excellent summary of R packages used for
time series analysis.
4.6 Problems: Time Series Concepts

4.6.1 Exercise 4.1

Suppose the time series {X_t}_{t=−∞}^{∞} is independent white noise. That is, X_t ∼ iid (0, σ^2). Define two new time series {Y_t}_{t=−∞}^{∞} and {Z_t}_{t=−∞}^{∞}, where Y_t = X_t^2 and Z_t = |X_t|. Are {Y_t}_{t=−∞}^{∞} and {Z_t}_{t=−∞}^{∞} also independent white noise processes? Why or why not?

4.6.2 Exercise 4.2

Realizations from four stochastic processes are


given in Figure . Which processes appear to be
covariance stationary
and which processes appear to be non-stationary? Briefly justify your
answers.

4.6.3 Exercise 4.3

Consider the MA(1) model

Y t = 0.05 + ε t + θε t − 1, − 1 < θ < 1


ε t ∼ iid N(0, (0.10) 2).

This process has mean E[Y t] = 0.05.

1. Calculate var(Y t) and ρ 1 = cor(Y t, Y t − 1) for θ = 0.5 and θ = 0.9.


2. Using the R function  arima.sim() , simulate and plot 250 observations of the MA(1) process with
θ = 0.5 and θ = 0.9 . Briefly comment on the behavior of the simulated data series. Does it look
covariance stationary? Does it show evidence of time dependence?

4.6.4 Exercise 4.4

Consider the AR(1) model

Y t − 0.05 = ϕ(Y t − 1 − 0.05) + ε t, − 1 < ϕ < 1


ε t ∼ iid N(0, (0.10) 2).

This process has mean E[Y t] = 0.05.

1. Calculate var(Y t) for ϕ = 0.5 and ϕ = 0.9.


2. Calculate ρ j = cor(Y t, Y t − j) for ϕ = 0.5 and ϕ = 0.9 and for j = 1, …, 5.
3. Using the R function  arima.sim() , simulate and plot 250 observations of the AR(1) process with
ϕ = 0.5 and ϕ = 0.9. Briefly comment on the behavior of the simulated data series. Does it look
covariance stationary? Does it show evidence of time dependence? How is it different from the MA(1)
process?
4.6.5 Exercise 4.5

Figure shows a realization of a stochastic process


representing a monthly time series of overlapping 2-
month continuously
compounded returns r t(2) = r t + r t − 1, where the 1-month continuously
compounded
returns r t follow a Gaussian White noise process
with variance 1.

1. Based on the sample autocorrelations, which time series process is most appropriate for describing
the series: MA(1) or AR(1)? Justify
your answer.
2. If you think the process is an AR(1) process, what do you think is the value of the autoregressive
parameter? If you think the process is a MA(1) process, what do you think is the value of the moving
average parameter?

4.6.6 Exercise 4.6

Let Y t represent a stochastic process. Under what conditions is Y t covariance stationary?

Realizations from four stochastic processes are given in the Figures below:

Which processes appear to be covariance stationary and which processes appear to be non-stationary?
For those processes that you think are non-stationary, explain why the process is non-stationary.

Consider the following model:

Y t = 10 − 0.67Y t − 1 + ϵ t, ϵ ∼ N(0, 1)

Is it stationary? Why or why not?

Find the mean and the variance of this process.


4.7 Solutions to Selected Problems

Exercise 4.6 Let Y t represent a stochastic process. Under what conditions is Y t covariance stationary?

Realizations from four stochastic processes are given in the Figures below:

2. Which processes appear to be covariance stationary and which processes appear to be non-
stationary?

The only process which appears to be covariance stationary is Process 1 (constant mean, volatility etc.)

For those processes that you think are non-stationary, explain why the process is non-stationary.

Process 2 has an obvious time trend so the mean is not independent of t. Process 3 has a level shift
around observation 75 (the mean shifts up) so that the mean before t = 75 is different from the mean after
t = 75. Process 4 shows an increase in variance/volatility after observation 75. The variance/volatility
before t = 75 is different from the variance/volatility after t = 75.

3. Consider the following model:

Y t = 10 − 0.67Y t − 1 + ϵ t, ϵ ∼ N(0, 1)

Is it stationary? Why or why not?

Stationary. The absolute value of the coefficient of Y t − 1 is 0.67, which is smaller than 1.

Find the mean and the variance of this process.

Mean: 10 / (1 + 0.67) = 5.99

Variance: 1 / (1 − 0.67^2) = 1.8146


References

Box, G., and G. M. Jenkins. 1976. Time Series Analysis : Forecasting and Control. San Francisco:
Holden-Day.

Ruppert, D., and D. S. Matteson. 2015. Statistics and Data Analysis for Financial Engineering with R
Examples. New York: Springer.

Sims, C. A. 1980. Macroeconomics and Reality. Econometrica.

Tsay, R. 2010. Analysis of Financial Time Series. Wiley.

Zivot, E. 2016. Modeling Financial Time Series with R. New York: Springer.


17. To conserve on notation, we will represent the stochastic process {Y_t}_{t=−∞}^{∞} simply as {Y_t}. ↩︎

18. This is also called a Cauchy distribution. For this distribution E[Y t] = var(Y t) = cov(Y t, Y t − j) = ∞. ↩︎

19. As an example of white noise, think of tuning an AM radio. In between


stations there is no signal and
all you hear is static. This is white
noise.↩︎
