1 Basic Concepts
1.1 Introduction
Time series analysis refers to statistical theories and methods used to analyze data
indexed by time.
In time series analysis, the term time series has two related meanings:
(i) A data set, actual or simulated, of observations indexed by time.
(ii) A stochastic process (see Definition 1.3) used to model observed time series data.
For clarity, we may use time series data for (i) and time series model for (ii). We
typically denote random variables and stochastic processes in uppercase (e.g., Xt ),
and observed/realized values in lowercase (e.g., xt ). That time is linearly ordered and
directed is crucial in the statistical analysis of time series data (as well as in real life).
It is essential to gain experience by examining a wide variety of real data-sets. We
give two examples to illustrate some general ideas and terminology.
Example 1.1. The S&P/TSX Composite Index represents the overall performance of the stocks listed on the Toronto Stock Exchange (TSX).¹ We consider the daily closing value of the index, which is available on, say, Yahoo Finance² and can be downloaded either directly from Yahoo Finance or through R using the function getSymbols() in the package quantmod. Familiarity with the underlying subject matter, or domain knowledge, is essential to meaningful analysis of the data.
[Figure 1.1: Left: Daily closing values of the S&P/TSX Composite Index. Right: Daily logarithmic returns. Horizontal axes: Date, 2000–2025.]
It is always a good idea to plot the data to examine its features. In Figure 1.1(left)
we plot the daily closing values from 2000-01-04 to 2024-06-14. We say that the
frequency is daily. Other common frequencies (at least for economic data) are weekly,
monthly, quarterly and yearly. While we plot the data in calendar time, the data is indexed by trading day, which skips weekends and holidays. For example, 2000-01-07 is a Friday and the next trading day is 2000-01-10 (Monday). If we represent the series by $y_t$, $t = 1, 2, \ldots, T$, where $T$ is the length of the series, we interpret $t$ as the $t$-th trading day of the data-set.

¹ More precisely, it is a capitalization-weighted market index where the influence of a constituent stock is proportional to its market capitalization.
² See https://ca.finance.yahoo.com/. The ticker symbol (unique identifier) of the S&P/TSX Composite index is ^GSPTSE.
We say that this time series is in discrete time since the time index $t$ takes values in a discrete set (here $t \in \mathbb{T} = \{1, \ldots, T\}$). For visualization purposes we often interpolate linearly between the data points, as done in the figure. Some processes, such as high-frequency trading, speech, brain activity and weather, may be considered to occur in continuous time (say $t \in \mathbb{T} = [a, b] \subset \mathbb{R}$ for some interval), although for practical
reasons they are usually sampled in discrete time.
In Figure 1.1(left), we observe that the series $y_t$ evolves in a zig-zag manner but
exhibits an overall increasing trend (in finance, a key question is the trade-off between
reward and risk). It is often useful to transform the data to reveal hidden structures
and/or make it more suitable for statistical analysis; some techniques are discussed in
Section 2. In Figure 1.1(right), we consider the daily logarithmic return (which forms another time series) given by
$$\nabla \log y_t = \log y_t - \log y_{t-1},$$
where $\nabla$, defined by
$$\nabla x_t = x_t - x_{t-1}, \tag{1.1}$$
is the difference operator. This transformation removes the apparent trend in the data: the log returns are roughly symmetric about zero. Also observe that there are
periods of high volatility (e.g., in 2008 (financial crisis) and 2020 (COVID-19)) and
they seem to cluster (rather than spread evenly across time). This phenomenon is
called volatility clustering and is observed in many financial time series.
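As an illustration of this workflow, the following R sketch downloads the index with quantmod and computes the log returns plotted in Figure 1.1(right). It assumes the quantmod package is installed and that Yahoo Finance is reachable; the date range matches the one used above.

```r
# Minimal sketch: download the S&P/TSX Composite Index and plot its log returns.
library(quantmod)

getSymbols("^GSPTSE", src = "yahoo", from = "2000-01-01", to = "2024-06-14")

px     <- Cl(GSPTSE)                 # daily closing values (an xts object)
logret <- na.omit(diff(log(px)))     # daily log returns: log(y_t) - log(y_{t-1})

op <- par(mfrow = c(1, 2))
plot(index(px), as.numeric(px), type = "l", xlab = "Date", ylab = "Closing value")
plot(index(logret), as.numeric(logret), type = "l", xlab = "Date", ylab = "Log return")
par(op)
```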
Example 1.2. We consider Canadian climate data which is publicly available on https://climate.weather.gc.ca/. Here, we consider the daily mean temperature, in degrees Celsius, recorded at a meteorological station (id: 6158731) within the Toronto Pearson International Airport (YYZ). (The concept “temperature of Toronto” is useful
in daily life but is not very specific. An observation must be recorded sometime
somewhere.) The data set is plotted in Figure 1.2. We observe immediately a strong
seasonal pattern (i.e., periodic behaviour) whose fluctuations differ from year to year.
[Figure 1.2: Daily mean temperature (degrees Celsius) at Toronto Pearson International Airport. Horizontal axis: Date.]
A raw data-set must be suitably cleaned and preprocessed before formal statistical
analysis begins. For example, there may be errors in data entry possibly revealed by
values which are unreasonably large or small. The series of interest may be derived
from other data. Here, the daily mean temperature is derived from the daily maximum
and minimum temperatures. In this data-set there are 21 missing values. Missing values may be treated by various methods, often on a case-by-case basis. Since here the number of missing values is small (21 in 4013 data points), a reasonable approach is to estimate a missing value $y_t$ by averaging nearby values, say $\frac{1}{2} y_{t-1} + \frac{1}{2} y_{t+1}$. In other cases missing, or unobserved, values are important and may be accounted for by a suitable model. It is concerning if the results of an analysis are sensitive to the
conventions used.
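As a concrete illustration of the averaging rule above, here is a minimal R sketch. The function name and the toy vector are hypothetical, and the sketch assumes the missing values are isolated (no two consecutive) and not at the endpoints.

```r
# Impute isolated missing values by averaging the neighbouring observations:
# y_t <- (y_{t-1} + y_{t+1}) / 2.
impute_isolated_na <- function(y) {
  idx <- which(is.na(y))
  for (t in idx) {
    y[t] <- (y[t - 1] + y[t + 1]) / 2
  }
  y
}

y <- c(3.1, 2.8, NA, 4.0, 5.2)   # hypothetical daily mean temperatures
impute_isolated_na(y)            # the NA is replaced by (2.8 + 4.0) / 2 = 3.4
```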
The key idea of time series analysis is to regard an observed time series (possibly
after a suitable transformation) as the realized value of a stochastic process. Statistical
methods are used to infer the properties of the process. We recall here the general
definition of a stochastic process.
Definition 1.3. Let T ⊂ R be a nonempty index set.
(i) (Stochastic process) A stochastic process indexed by T is a collection (Xt )t∈T of
random variables defined on some probability space (Ω, F, P). We simply write
(Xt ) if T is clear from the context. We say that (Xt ) is d-dimensional if each
Xt = (X1,t , . . . , Xd,t ) is a d-dimensional random vector.
(ii) (Sample path) If $(X_t)_{t\in\mathbb{T}}$ is a stochastic process, the sample path corresponding to the sample point $\omega \in \Omega$ is the function $t \in \mathbb{T} \mapsto X_t(\omega)$.
For the most part we focus on univariate processes ($d = 1$) in discrete time, where $\mathbb{T}$ is an interval of $\mathbb{Z}$ (e.g., $\mathbb{T} = \mathbb{Z}_+ = \{0, 1, 2, \ldots\}$ or $\mathbb{Z}$). A distinctive feature of
time series is that successive data points are typically dependent, so we cannot regard
them as i.i.d. samples as in “conventional statistics”. In a nutshell, time series analysis
is concerned with the modelling of dependence for stochastic processes.
Figure 1.3: Left: A sample path of Gaussian white noise $X_t \overset{\mathrm{i.i.d.}}{\sim} N(0, 1)$. Right: Ten sample paths of a Gaussian random walk where $X_t \overset{\mathrm{i.i.d.}}{\sim} N(0, 1)$.
An i.i.d. process with mean zero and finite variance $\sigma^2$, written $(X_t) \sim \mathrm{IID}(0, \sigma^2)$, is in particular a white noise process, written $(X_t) \sim \mathrm{WN}(0, \sigma^2)$: a process of uncorrelated random variables with mean zero and common variance $\sigma^2$. The converse is generally false but holds if $(X_t)$ is a Gaussian process, i.e., if $(X_{t_1}, \ldots, X_{t_k})$ has a multivariate normal distribution for all $k \geq 1$ and $t_1, \ldots, t_k \in \mathbb{T}$.
Example 1.6 (A white noise which is not i.i.d.). Let $R \geq 0$ be a non-negative, non-constant random variable with $E[R^2] = \sigma^2$, and let $(Z_t)$ be an i.i.d. process, independent of $R$, such that $P(Z_t = 1) = P(Z_t = -1) = \frac{1}{2}$. Now define
$$X_t = R Z_t.$$
By independence, we have
$$E[X_t] = E[R]\,E[Z_t] = 0.$$
Next compute, for $s \neq t$,
$$\mathrm{Cov}(X_s, X_t) = E[R^2 Z_s Z_t] = E[R^2]\,E[Z_s]\,E[Z_t] = 0,$$
while $\mathrm{Var}(X_t) = E[R^2 Z_t^2] = E[R^2] = \sigma^2$. Thus $(X_t) \sim \mathrm{WN}(0, \sigma^2)$. However, $(X_t)$ is not an i.i.d. process: $|X_t| = R$ for all $t$, so, since $R$ is non-constant, $X_s$ and $X_t$ are dependent.

Example 1.7 (Random walk). Given a starting value $S_0$ (e.g., $S_0 = 0$), define
$$S_t = S_0 + X_1 + \cdots + X_t, \quad t = 1, 2, \ldots, \tag{1.3}$$
where $(X_t)_{t=1}^{\infty}$ is a collection of i.i.d. random variables (e.g., $(X_t) \sim \mathrm{IID}(0, \sigma^2)$). See Figure 1.3(right) for an illustration. Assuming $X_t$ has finite variance, we may write $X_t = \mu + \sigma \epsilon_t$, where $\mu = E[X_t]$, $\sigma^2 = \mathrm{Var}(X_t)$ and $(\epsilon_t) \sim \mathrm{IID}(0, 1)$. Then
$$E[S_t] = S_0 + \mu t, \tag{1.4}$$
which has a linear trend.
[Figure 1.4: A sample path of an MA(3) process.]
We may recover $(X_t)$ from $(S_t)$ by taking the first difference:
$$\nabla S_t = S_t - S_{t-1} = X_t = \mu + \sigma \epsilon_t. \tag{1.5}$$
Here differencing removes the trend. Observe that the TSX series in Figure 1.1(left)
looks qualitatively somewhat similar to a random walk if we neglect the big jumps
and volatility clustering.
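Sample paths like those in Figure 1.3 are easy to simulate; a minimal R sketch (300 time points and standard normal noise, matching the setup described in the figure caption):

```r
# Simulate a Gaussian white noise path and a Gaussian random walk (cf. Figure 1.3).
set.seed(1)
n <- 300
x <- rnorm(n)       # X_t ~ i.i.d. N(0, 1)
s <- cumsum(x)      # S_t = X_1 + ... + X_t, a random walk with S_0 = 0

op <- par(mfrow = c(1, 2))
plot(x, type = "l", xlab = "t", ylab = expression(X[t]), main = "Gaussian white noise")
plot(s, type = "l", xlab = "t", ylab = expression(S[t]), main = "Gaussian random walk")
par(op)
```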
Example 1.8 (Moving average process). Let $(Z_t)_{t\in\mathbb{Z}} \sim \mathrm{WN}(0, \sigma^2)$. Given a constant $\theta \in \mathbb{R}$, define $(X_t)_{t\in\mathbb{Z}}$ by
$$X_t = Z_t + \theta Z_{t-1}, \quad t \in \mathbb{Z}. \tag{1.6}$$
More generally, given an integer $q \geq 1$ and $\theta_1, \ldots, \theta_q \in \mathbb{R}$, we may define $(X_t)_{t\in\mathbb{Z}}$ by
$$X_t = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}, \quad t \in \mathbb{Z}. \tag{1.7}$$
If $\theta_q \neq 0$, we call $(X_t)$ defined by (1.7) a moving average process of order $q$, or simply an MA($q$) process. Thus (1.6) defines an MA(1) process if $\theta \neq 0$. The idea is that $X_t$ is given as a weighted sum of the current noise $Z_t$ and up to $q$ previous noises $Z_{t-1}, \ldots, Z_{t-q}$. See Figure 1.4 for an example.
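A path like the one in Figure 1.4 can be simulated directly from the definition (1.7); a minimal R sketch with arbitrary illustrative coefficients:

```r
# Simulate an MA(3) process X_t = Z_t + theta1*Z_{t-1} + theta2*Z_{t-2} + theta3*Z_{t-3}.
set.seed(2)
n     <- 300
theta <- c(0.8, 0.5, 0.3)          # arbitrary illustrative MA coefficients
z     <- rnorm(n + length(theta))  # Gaussian white noise, with extra initial values

x <- numeric(n)
for (t in seq_len(n)) {
  k    <- t + length(theta)        # index into z so that all lags are available
  x[t] <- z[k] + sum(theta * z[(k - 1):(k - 3)])
}
plot(x, type = "l", main = "A sample path of an MA(3) process",
     xlab = "t", ylab = expression(X[t]))

# Equivalently: x <- arima.sim(model = list(ma = theta), n = n)
```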
Unlike a white noise process, successive values of a moving average process are
correlated, up to lag q. For example, for an MA(1) process (1.6) we have, for any t,
$$\begin{aligned}
\mathrm{Cov}(X_{t+1}, X_t) &= \mathrm{Cov}(Z_{t+1} + \theta Z_t,\; Z_t + \theta Z_{t-1}) \\
&= \mathrm{Cov}(Z_{t+1}, Z_t) + \theta\,\mathrm{Cov}(Z_{t+1}, Z_{t-1}) + \theta\,\mathrm{Cov}(Z_t, Z_t) + \theta^2\,\mathrm{Cov}(Z_t, Z_{t-1}) \\
&= 0 + \theta \cdot 0 + \theta \cdot \sigma^2 + \theta^2 \cdot 0 = \theta\sigma^2,
\end{aligned}$$
and, by a similar argument,
Cov(Xt+2 , Xt ) = Cov(Xt+3 , Xt ) = · · · = 0.
Also, we have
$$\mathrm{Var}(X_t) = \mathrm{Cov}(Z_t + \theta Z_{t-1},\; Z_t + \theta Z_{t-1}) = (1 + \theta^2)\sigma^2.$$
In summary,
$$\gamma_X(t+h, t) = \mathrm{Cov}(X_{t+h}, X_t) = \begin{cases} (1 + \theta^2)\sigma^2 & \text{if } h = 0; \\ \theta\sigma^2 & \text{if } |h| = 1; \\ 0 & \text{if } |h| \geq 2, \end{cases} \tag{1.8}$$
which does not depend on $t$.
1.3 Stationarity
In time series analysis, we often observe only a single series (of some length) regarded
as the realized sample path of a stochastic process. Without some assumption of
stationarity, statistical inference is not possible. For example, suppose we observe one
sample (x1 , . . . , xT ) from N (a, I) where a = (a1 , . . . , aT ) ∈ RT is arbitrary. Regardless
of the length T we cannot expect to estimate a accurately. A key idea of time series
is that the underlying process is built up from some stationary process for which
statistical inference is possible. In the following we let the index set T be either Z+
or Z.
Definition 1.9. A stochastic process (Xt )t∈T is strictly stationary if for all k ≥ 1,
distinct t1 , . . . , tk , and h, we have
d
(Xt1 , . . . , Xtk ) = (Xt1 +h , . . . , Xtk +h ).
That is, the finite dimensional distributions of (Xt ) are invariant under time shifts.
Example 1.10 (Simple examples of strictly stationary processes).
(i) Any i.i.d. process is strictly stationary.
(ii) Let (Xt )t∈Z+ be a time homogeneous Markov chain. Suppose X0 ∼ π where π
is stationary for the chain, i.e., if X0 ∼ π then Xt ∼ π for all t. Then (Xt )
is strictly stationary. To give a concrete example, suppose the state space is
X = {0, 1} and the transition matrix P (x, y) = P(Xt+1 = y|Xt = x), x, y ∈ X ,
is given by
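To make construction (ii) concrete, here is a minimal R sketch that uses a hypothetical two-state transition matrix (an arbitrary choice for illustration, not necessarily the matrix displayed in the notes), computes its stationary distribution and simulates the chain started from it:

```r
# Simulate a two-state Markov chain started from its stationary distribution.
set.seed(5)
P <- matrix(c(0.9, 0.1,
              0.3, 0.7), nrow = 2, byrow = TRUE)  # P[x+1, y+1] = P(X_{t+1} = y | X_t = x)

# Stationary distribution: solve pi P = pi, i.e., a left eigenvector of P for eigenvalue 1.
e       <- eigen(t(P))
pi_stat <- Re(e$vectors[, 1])
pi_stat <- pi_stat / sum(pi_stat)                 # for this P, pi_stat = (0.75, 0.25)

# Simulate the chain with X_0 ~ pi_stat; the resulting process is strictly stationary.
n <- 1000
x <- numeric(n + 1)
x[1] <- sample(0:1, 1, prob = pi_stat)
for (t in 1:n) x[t + 1] <- sample(0:1, 1, prob = P[x[t] + 1, ])
```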
Definition 1.11 (Weak stationarity). Let $(X_t)_{t\in\mathbb{T}}$ be a stochastic process with $E[X_t^2] < \infty$ for all $t$, with its mean function defined by $\mu_X(t) = E[X_t]$ and its autocovariance function defined by
$$\gamma_X(s, t) = \mathrm{Cov}(X_s, X_t), \quad s, t \in \mathbb{T}.$$
We say that $(X_t)$ is weakly stationary if
(i) $\mu_X(t)$ does not depend on $t$; and
(ii) $\gamma_X(s, t) = \gamma_X(s + h, t + h)$ for all $s, t$ and $h$.

Proposition 1.12. (i) A strictly stationary process with finite second moments is weakly stationary. (ii) A weakly stationary Gaussian process is strictly stationary.
Proof. The first statement is clear. The second statement holds since a (multivariate)
normal distribution is completely specified by the mean and covariance matrix.
From now on, by stationarity we mean weak stationarity unless otherwise specified.
In (ii) above, replace $t$ by $s + h$. So (ii) is equivalent to
$$\gamma_X(s, s + h) = \gamma_X(t, t + h) \quad \text{for all } s, t \text{ and } h.$$
That is, the covariance function depends only on the lag $h$. When $h = 0$,
$$\gamma_X(t, t) = \mathrm{Var}(X_t)$$
is the common variance of the $X_t$.
Definition 1.13. Let $(X_t)_{t\in\mathbb{T}}$, $\mathbb{T} = \mathbb{Z}_+$ or $\mathbb{Z}$, be a stationary process.
(i) (Autocovariance function) The autocovariance function (ACVF) of $X$ is defined by
$$\gamma_X(h) = \mathrm{Cov}(X_t, X_{t+h}), \quad h \in \mathbb{Z}. \tag{1.12}$$
(ii) (Autocorrelation function) The autocorrelation function (ACF) of $X$ is defined (when $\gamma_X(0) > 0$) by
$$\rho_X(h) = \mathrm{Cor}(X_t, X_{t+h}) = \frac{\mathrm{Cov}(X_t, X_{t+h})}{\sqrt{\mathrm{Var}(X_t)}\,\sqrt{\mathrm{Var}(X_{t+h})}} = \frac{\gamma_X(h)}{\gamma_X(0)}, \quad h \in \mathbb{Z}. \tag{1.13}$$
Remark 1.14. The autocovariance $\gamma_X(h) = \mathrm{Cov}(X_t, X_{t+h})$ only measures linear dependence between $X_t$ and $X_{t+h}$. $X_t$ and $X_{t+h}$ can be dependent even if $\mathrm{Cov}(X_t, X_{t+h}) = 0$.
Let $X$ and $Y$ be real-valued random variables. It can be shown that $X$ and $Y$ are independent if and only if
$$E[f(X)g(Y)] = E[f(X)]\,E[g(Y)] \tag{1.14}$$
for all (bounded) functions $f$ and $g$. Uncorrelatedness of $X$ and $Y$ only requires (1.14) to hold when $f$ and $g$ are affine functions (i.e., functions of the form $ax + b$).
Proposition 1.15 (Properties of ACVF and ACF). Let (Xt ) be a stationary process.
Then:
(i) (Normalization) ρX (0) = 1.
(ii) (Symmetry) γX (h) = γX (−h) and ρX (h) = ρX (−h). (Thus in (1.12) and (1.13)
we may restrict to h ≥ 0.)
(iii) (Positive semidefiniteness) For $k \geq 1$, lags $h_1, \ldots, h_k$ and constants $a_1, \ldots, a_k \in \mathbb{R}$, we have
$$\sum_{i,j=1}^{k} a_i a_j \gamma_X(h_i - h_j) \geq 0. \tag{1.15}$$
Proof. (i) Obvious since ρX (0) = Cor(Xt , Xt ) = 1.
(ii) By symmetry of the covariance, we have
$$\gamma_X(h) = \mathrm{Cov}(X_t, X_{t+h}) = \mathrm{Cov}(X_{t+h}, X_t) = \gamma_X(-h).$$
(iii) Observe that the left hand side of (1.15) is the variance of the linear combination $\sum_{i=1}^{k} a_i X_{t+h_i}$, which is non-negative:
$$0 \leq \mathrm{Var}\!\left(\sum_{i=1}^{k} a_i X_{t+h_i}\right) = \sum_{i,j=1}^{k} a_i a_j \mathrm{Cov}(X_{t+h_i}, X_{t+h_j}) = \sum_{i,j=1}^{k} a_i a_j \gamma_X(h_i - h_j).$$
(ii) Let $(X_t)$ be an MA(1) process as in (1.6) in Example 1.8. From (1.8), $(X_t)$ is stationary and its ACF is given by
$$\rho_X(h) = \begin{cases} 1 & \text{if } h = 0; \\ \dfrac{\theta}{1 + \theta^2} & \text{if } |h| = 1; \\ 0 & \text{if } |h| \geq 2. \end{cases} \tag{1.17}$$
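As a quick numerical check of (1.17), one can simulate a long MA(1) path and compare the sample ACF at lag 1 with $\theta/(1 + \theta^2)$; the value $\theta = 0.6$ below is an arbitrary illustrative choice.

```r
# Numerical check of (1.17): rho(1) = theta / (1 + theta^2), rho(h) = 0 for |h| >= 2.
set.seed(4)
theta <- 0.6
x <- arima.sim(model = list(ma = theta), n = 10000)

round(acf(x, lag.max = 3, plot = FALSE)$acf[2:4], 3)  # sample rho(1), rho(2), rho(3)
theta / (1 + theta^2)                                 # theoretical rho(1), about 0.441
```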
Thus for $s, t \geq 0$ we have $\gamma_S(s, t) = \min\{s, t\}$, which is not a function of the lag $s - t$. We conclude that $S$ is not stationary. Consider the correlation between $S_t$ and $S_{t+h}$. For $h \geq 0$, we have
$$\mathrm{Cor}(S_t, S_{t+h}) = \frac{t}{\sqrt{t}\,\sqrt{t+h}} = \frac{\sqrt{t}}{\sqrt{t+h}}, \tag{1.18}$$
which is close to 1 when $t$ is large relative to $h$.
Example 1.19 (Autoregressive process of order 1). Let $(Z_t)_{t\in\mathbb{Z}} \sim \mathrm{WN}(0, \sigma^2)$, and let $\phi \in \mathbb{R}$ be a constant satisfying $|\phi| < 1$. Define a process $(X_t)_{t\in\mathbb{Z}}$ by
$$X_t = \sum_{j=0}^{\infty} \phi^j Z_{t-j} = Z_t + \phi Z_{t-1} + \phi^2 Z_{t-2} + \cdots, \tag{1.19}$$
which is a moving average process of infinite order (an instance of an MA($\infty$) process). Since $\sum_{j=0}^{\infty} |\phi^j| < \infty$ and $(Z_t) \sim \mathrm{WN}(0, \sigma^2)$, the series converges absolutely with probability 1. To show this, consider
$$E\left[\sum_{j=0}^{\infty} |\phi^j Z_{t-j}|\right] = \sum_{j=0}^{\infty} |\phi^j|\, E[|Z_{t-j}|],$$
where the equality holds by the monotone convergence theorem. On the other hand, since $(Z_t) \sim \mathrm{WN}(0, \sigma^2)$, we have, by the Cauchy-Schwarz inequality,
$$E[|Z_{t-j}|] = E[|Z_{t-j}| \cdot 1] \leq \sqrt{E[Z_{t-j}^2]}\,\sqrt{E[1^2]} = \sigma.$$
Thus $E\bigl[\sum_{j=0}^{\infty} |\phi^j Z_{t-j}|\bigr] \leq \sigma \sum_{j=0}^{\infty} |\phi|^j < \infty$, so with probability 1 we have $\sum_{j=0}^{\infty} |\phi^j Z_{t-j}| < \infty$, i.e., the sum converges absolutely.
Since
$$X_t = Z_t + \phi \underbrace{(Z_{t-1} + \phi Z_{t-2} + \cdots)}_{X_{t-1}} = Z_t + \phi X_{t-1},$$
we obtain the recursion
$$X_t = \phi X_{t-1} + Z_t, \quad t \in \mathbb{Z}. \tag{1.20}$$
Thus Xt is a weighted sum of its previous value Xt−1 and the current noise Zt .
We say that (Xt ) follows an autoregressive process of order 1 (denoted as AR(1)).
Autoregressive processes of higher orders will be introduced later.
Let's verify that $(X_t)$ is stationary. Since $\sum_{j=0}^{\infty} |\phi^j| < \infty$ and $(Z_t) \sim \mathrm{WN}(0, \sigma^2)$, it is possible to exchange sums with expectation/covariance operators.⁶ We have
$$E[X_t] = E\left[\sum_{j=0}^{\infty} \phi^j Z_{t-j}\right] = \sum_{j=0}^{\infty} \phi^j E[Z_{t-j}] = \sum_{j=0}^{\infty} \phi^j \cdot 0 = 0,$$
⁶ We omit the proof, which can be found in [1, Section 3.1].
⁷ Letting $X_t = \mu + \sum_{j=0}^{\infty} \phi^j Z_{t-j}$, where $\mu \in \mathbb{R}$, shifts the mean to $\mu$. The recursion (1.20) becomes $X_t - \mu = \phi(X_{t-1} - \mu) + Z_t$.
[Figure 1.5: Sample paths of AR(1) processes with φ = 0.9 (left) and φ = −0.9 (right).]
A similar computation gives
$$\gamma_X(t, t + h) = \phi^h \gamma_X(t, t) = \phi^h \frac{\sigma^2}{1 - \phi^2}, \quad h \geq 0,$$
so that $\rho_X(h) = \phi^{|h|}$,
which decays geometrically. When φ ∈ (0, 1), ρX (h) decays monotonically. When
$\phi \in (-1, 0)$, the sign of $\rho_X(h)$ alternates. See Figure 1.5 to get a feel for how this relates to the behaviour of the sample paths. The ACFs (theoretical and sample) are
plotted in Figure 1.6.
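Plots in the style of Figures 1.5 and 1.6 can be produced with a few lines of R; a minimal sketch (300 time points and unit noise variance, an arbitrary illustrative setup):

```r
# Simulate AR(1) paths X_t = phi X_{t-1} + Z_t and compare the sample ACF
# with the theoretical ACF rho(h) = phi^h.
set.seed(3)
op <- par(mfrow = c(1, 2))
for (phi in c(0.9, -0.9)) {
  x <- arima.sim(model = list(ar = phi), n = 300)   # Gaussian AR(1)
  acf(x, lag.max = 20, main = paste("phi =", phi))  # sample ACF with default confidence bands
  points(0:20, phi^(0:20), pch = 16, col = "red")   # theoretical ACF
}
par(op)
```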
[Figure 1.6: Theoretical and sample ACFs for the AR(1) processes with φ = 0.9 (left) and φ = −0.9 (right). Horizontal axis: Lag.]
Definition 1.20. Let $(x_1, \ldots, x_T)$ be observed time series data.
(i) (Sample mean) The sample mean is $\bar{x} = \frac{1}{T} \sum_{t=1}^{T} x_t$.
(ii) (Sample autocovariance function) The sample autocovariance function is defined for $0 \leq h < T$ by
$$\hat\gamma(h) = \hat\gamma(-h) = \frac{1}{T} \sum_{t=1}^{T-h} (x_{t+h} - \bar{x})(x_t - \bar{x}). \tag{1.23}$$
(iii) (Sample correlation function) The sample correlation function is defined for $0 \leq h < T$ by
$$\hat\rho(h) = \hat\rho(-h) = \frac{\hat\gamma(h)}{\hat\gamma(0)}. \tag{1.24}$$
Note that in (1.23) we divide by T rather than T − h (which varies with the lag
h). This ensures that γ̂ is positive semidefinite in the sense of Proposition 1.15(iii).
Example 1.21. In Figure 1.6 we plot the sample ACFs of the two simulated paths of the AR(1) process (for φ = 0.9 and φ = −0.9 respectively). Observe that the patterns of the sample ACFs match, up to sampling errors, those of the theoretical ones.
Example 1.22. In Figure 1.7 we plot a simulated path of a random walk (as in Figure
1.3) and its sample ACF. Observe that the sample ACF stays positive for all lags
shown and decays rather slowly (cf. (1.18)). These behaviours usually indicate that
the underlying process is nonstationary.
In time series analysis it is frequently useful to know whether a process is approximately a white noise. Naturally, we may examine this by the sample ACF. Even if
(xt ) is a realization of a white noise process, due to sampling errors the sample ACF is
likely to be non-zero for non-zero lags. The typical magnitude of fluctuations is given by the following theorem, whose proof is beyond the scope of the course. We use $\xrightarrow{d}$ to denote convergence in distribution.
Theorem 1.23 (Asymptotic distribution of sample ACF). Let $(X_t) \sim \mathrm{IID}(0, \sigma^2)$. Let $\hat\rho_T$ be the sample ACF computed from $(X_1, \ldots, X_T)$. Under additional technical conditions,⁸ we have, for each fixed $h \geq 1$,
$$\sqrt{T}\,\bigl(\hat\rho_T(1), \ldots, \hat\rho_T(h)\bigr)^{\top} \xrightarrow{d} N(0, I_h), \tag{1.25}$$
where $N(0, I_h)$ is the standard normal distribution on $\mathbb{R}^h$. Thus, for any $h \neq 0$, $\hat\rho_T(h)$ is approximately distributed as $N(0, \frac{1}{T})$ when $T$ is sufficiently large.

[Figure 1.7: Left: A sample path of a random walk. Right: Its sample ACF.]
Here is a straightforward application of the theorem. Fix a lag $h$, say $h = 1$. If $(X_t)$ is an i.i.d. noise (more precisely, we mean that it satisfies the assumptions of Theorem 1.23), then $\hat\rho_T(h)$ is approximately distributed as $N(0, \frac{1}{T})$. If we observe $|\hat\rho_T(h)| > \frac{1.96}{\sqrt{T}} \approx \frac{2}{\sqrt{T}}$, then we can reject the null hypothesis that $(X_t)$ is an i.i.d. noise at the 5% significance level. The lines at $\pm \frac{1.96}{\sqrt{T}}$ are typically provided in a plot of the ACF and are useful for visual inspection of the sample ACF. For example, in Figures 1.6 and 1.7 the sample ACF is significant (by which we mean larger than $1.96/\sqrt{T}$ in magnitude) at many lags. This is strong evidence that the series is not a white noise.
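A minimal R sketch of this lag-1 check on simulated i.i.d. noise (the sample size 500 is an arbitrary choice):

```r
# Test H0: the series is i.i.d. noise, using the lag-1 sample autocorrelation.
set.seed(6)
n <- 500            # plays the role of T in the text
x <- rnorm(n)       # here H0 is true by construction

rho1 <- acf(x, lag.max = 1, plot = FALSE)$acf[2]  # sample rho_hat(1)
abs(rho1) > 1.96 / sqrt(n)                        # TRUE would reject H0 at the 5% level
```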
Corollary 1.24. Let $h \geq 1$ be a given maximum lag under consideration. Under the assumptions of Theorem 1.23, we have
$$Q_T := T \sum_{j=1}^{h} \hat\rho_T^2(j) \xrightarrow{d} \chi^2_h, \tag{1.26}$$
where $\chi^2_h$ denotes the chi-squared distribution with $h$ degrees of freedom.
Proof. This follows from the continuous mapping theorem applied to (1.25).
Given a lag $h$ and a significance level $\alpha$, the Portmanteau test⁹ (also called the Box–Pierce test) rejects the null hypothesis that the series is an i.i.d. noise if the test statistic $Q_T$ defined by (1.26) exceeds the $(1 - \alpha)$-quantile of $\chi^2_h$. Note that Corollary 1.24 is an asymptotic result, so it may not be accurate if $T$ is not sufficiently
⁸ A sufficient condition is that the fourth moment of $X_t$ is finite: $E[X_t^4] < \infty$.
⁹ The term “Portmanteau test” is also used for related tests that use a similar test statistic.
large. A refinement of the Portmanteau test, called the Ljung–Box test, uses instead the test statistic
$$Q_T := T(T + 2) \sum_{j=1}^{h} \frac{\hat\rho_T^2(j)}{T - j}. \tag{1.27}$$
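Both the Box–Pierce and Ljung–Box statistics are available in base R through Box.test(); a minimal sketch on simulated data (the lag h = 10 and the AR coefficient 0.9 are arbitrary choices):

```r
# Portmanteau (Box-Pierce) and Ljung-Box tests with maximum lag h = 10.
set.seed(7)
z <- rnorm(500)                                   # i.i.d. noise: large p-values expected
x <- arima.sim(model = list(ar = 0.9), n = 500)   # AR(1): small p-values expected

Box.test(z, lag = 10, type = "Box-Pierce")
Box.test(z, lag = 10, type = "Ljung-Box")
Box.test(x, lag = 10, type = "Ljung-Box")
```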