ECON-C4210 - Econometrics II: Capstone: Lecture 9A: Time Series I
Otto Toivanen
2 what lags (=lagged values of a variable) and differences are and how to create them
• This is (very) unlikely to hold with a time series: think of the COVID-19 employment shock.
• What most lay people consider key economic data are time series:
1 Price indices (inflation).
2 Unemployment.
3 GDP.
4 Exports and imports.
• Dynamic causal effects: Does a change in the central bank interest rate affect inflation 3
months / 12 months ahead?
• Modeling risks in financial markets (volatility).
• Time series techniques have plenty of uses outside economics (climate modeling,
engineering systems, computer science).
• Even if you are not interested in the time series nature of the data, you need to take it into account.
• Data set: {Y1 , ..., YT } are T observations on the time series variable Y .
• We will study time series that are consecutive, i.e., there are no breaks (=missing
observations) in the series.
[Figure: CPI time series; mean cpi = 365.]
• Notice that you "lose" observations from the beginning of the series when you take lags
and/or differences.
• Stata time series operators.
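The operators can be tried on a tiny example. This is a minimal sketch, assuming a made-up five-observation series; the variable names (time, y, y_lag1, y_lag2, y_diff) and the data values are illustrative only and do not come from the lecture data.

```stata
* Toy data set (hypothetical values, for illustration only)
clear
input time y
1 10
2 12
3 15
4 14
5 18
end

* Declare the time variable so that time series operators can be used
tsset time

* L.y is the first lag Y_{t-1}; L2.y is the second lag Y_{t-2}
gen y_lag1 = L.y
gen y_lag2 = L2.y

* D.y is the first difference Y_t - Y_{t-1}
gen y_diff = D.y

* The lags and the difference are missing at the beginning of the series
list time y y_lag1 y_lag2 y_diff
```

The operators can also be used directly inside commands (for example regress y L.y), so the generated variables are needed only when you want to inspect them.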
• These are population correlations, i.e., they describe the joint distribution of the
population.
• The sample autocovariance and autocorrelation are estimates of the population
equivalents.
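Sample autocorrelations can be computed along the following lines. This is a sketch under stated assumptions: the series is a simulated AR(1) with coefficient 0.6, and the seed, sample size and variable names are arbitrary choices, not the lecture's CPI data.

```stata
* Simulated AR(1) series (assumed coefficient 0.6, for illustration only)
clear
set seed 4210
set obs 500
gen time = _n
tsset time
gen u = rnormal()
gen y = u
* Build the series recursively with explicit subscripts
replace y = 0.6 * y[_n - 1] + u if time > 1

* First sample autocorrelation as the correlation between Y_t and Y_{t-1}
correlate y L.y
display "corr(Y_t, Y_t-1) = " r(rho)

* Sample autocorrelations for lags 1 to 4
corrgram y, lags(4)
```

The two commands normalize slightly differently, so the first autocorrelation reported by corrgram need not match the correlate result to the last digit.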
[Figure: lncpi time series.]
[Figure: infl time series, 1949m1 - 2024m2, with a vertical line at 2000m1; mean infl = 0.0028.]
Yt = β0 + β1 Yt−1 + ut
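. regr cpi L1.cpi if time_ind >= 13

             |      Coef.   Std. Err.         t    P>|t|    [95% Conf. Interval]
-------------+------------------------------------------------------------------
         cpi |
         L1. |   1.001957   .0002171   4615.86    0.000    1.001531    1.002383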
. estat ic
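             |      Coef.   Std. Err.         t    P>|t|    [95% Conf. Interval]
-------------+------------------------------------------------------------------
        infl |
         L1. |   .6019819   .0266804     22.56    0.000    .5496178    .6543459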
. estat ic
• Let’s set p = 4.
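             |      Coef.   Std. Err.         t    P>|t|    [95% Conf. Interval]
-------------+------------------------------------------------------------------
         cpi |
         L1. |   1.642382   .0335556     48.95    0.000    1.576524     1.70824
         L2. |  -.8545365    .063973    -13.36    0.000    -.980093     -.72898
         L3. |   .2786968    .064333      4.33    0.000    .1524338    .4049599
         L4. |  -.0655585   .0338885     -1.93    0.053   -.1320697    .0009528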
. estat ic
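             |      Coef.   Std. Err.         t    P>|t|    [95% Conf. Interval]
-------------+------------------------------------------------------------------
        infl |
         L1. |   .5733712   .0332686     17.23    0.000    .5080767    .6386658
         L2. |  -.0092169   .0383522     -0.24    0.810   -.0844888     .066055
         L3. |   .0042502   .0381294      0.11    0.911   -.0705844    .0790848
         L4. |   .1120095   .0331366      3.38    0.001     .046974    .1770449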
. estat ic
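An AR(4) regression of the kind reported above can be sketched as follows. The data here are simulated (an assumed AR(1) with coefficient 0.6), and the seed and variable names are illustrative; the point is only the L(1/4) shorthand for lags 1 to 4 and the joint F-test on the higher lags.

```stata
* Simulated series (assumption: AR(1) with coefficient 0.6, not the CPI data)
clear
set seed 4210
set obs 500
gen time = _n
tsset time
gen u = rnormal()
gen y = u
replace y = 0.6 * y[_n - 1] + u if time > 1

* AR(4): regress Y_t on its first four lags
regress y L(1/4).y

* Joint F-test of H0: the coefficients on lags 2, 3 and 4 are all zero
test L2.y L3.y L4.y

* Information criteria (AIC and BIC) for this model
estat ic
```

With four lags, the first four observations drop out of the estimation sample, so 496 of the 500 simulated observations are used.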
• F-tests?
• Just like in machine learning, BIC and AIC utilize the bias-variance tradeoff.
• Through AIC / BIC lag selection you get a model with possibly biased coefficients, but good
forecasts.
• In Problem Set 4 (or 5), you will search for the optimal # of lags using BIC and AIC.
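A search over the number of lags could look roughly like this. It is a sketch, not the problem set solution: the series is simulated, the maximum lag length of 6 is an arbitrary assumption, and the condition time > 6 is added so that every model is estimated on the same observations (presumably the role of the time_ind >= 13 condition in the lecture's regressions).

```stata
* Simulated series (assumption: AR(1) with coefficient 0.6)
clear
set seed 4210
set obs 500
gen time = _n
tsset time
gen u = rnormal()
gen y = u
replace y = 0.6 * y[_n - 1] + u if time > 1

* Estimate AR(p) for p = 1,...,6 on a common sample and report AIC and BIC
forvalues p = 1/6 {
    quietly regress y L(1/`p').y if time > 6
    quietly estat ic
    * r(S) holds: N, ll(null), ll(model), df, AIC, BIC
    matrix S = r(S)
    display "p = `p'   AIC = " %9.3f S[1,5] "   BIC = " %9.3f S[1,6]
}
```

Pick the p with the smallest criterion value. Because the sample is held fixed, the ranking of models does not depend on how the criteria are scaled.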
• Definition (for a single time series): A time series Yt is stationary if its probability
distribution does not change over time, that is, if the joint distribution of
(Ys+1 , Ys+2 , ..., Ys+T ) does not depend on s.
• Otherwise Yt is nonstationary.
• Stationarity requires that in a probabilistic sense, the future is like the past.
• It can be shown that with a random walk (which we will study more shortly), which is
nonstationary, the OLS coefficient of the lagged dependent variable (Yt−1 ) is biased toward zero:
$E[\hat{\beta}_1] \approx 1 - \frac{5.3}{T}$
Stata code
clear
set obs 1000
gen time = _n
tsset time
* i.i.d. normal shocks with standard deviation 3
gen u = 3 * invnorm(uniform())
* random walk: y_t = y_{t-1} + u_t, starting from y_1 = u_1
gen y = u
replace y = y[_n - 1] + u if time > 1

* vary T = 26, 51, 151, 1000(+1)
local T = 26
regr y L.y if time < `T'
time        u         y    u+y[t-1]
   1  -3.2819   -3.2819
   2   1.1012   -2.1807    -2.1807
   3   0.4362   -1.7445    -1.7445
   4   0.7973   -0.9471    -0.9471
   5   1.4382    0.4911     0.4911
   6  -3.7009   -3.2098    -3.2098
   7   0.9043   -2.3055    -2.3055
   8  -4.6377   -6.9433    -6.9433
   9   0.4167   -6.5265    -6.5265
  10   3.3998   -3.1267    -3.1267
[Figure: CPI time series; mean cpi = 365.]
Yt = Yt−1 + ut
• The value of Yt today is in expectation the same as what it actually was yesterday.
• Random walk = today’s value of Yt is equal to where you were yesterday + a (random)
step ut of unknown direction and length.
• More generally, the best prediction p periods into the future is Yt .
period   Yt                          ut
1        u1                          u1
2        u2 + u1                     u2
3        u3 + u2 + u1                u3
...      ...                         ...
l        $\sum_{i=1}^{l} u_i$        u_l
...      ...                         ...
• The value of a random walk in period t is the sum of the shocks to the series until and
including period t.
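Writing the series as a sum of shocks also shows why a random walk is nonstationary. The short derivation below is a sketch under assumptions not stated on the slide: the series starts at $Y_0 = 0$, and the shocks $u_i$ are i.i.d. with mean zero and variance $\sigma_u^2$.

```latex
Y_t = \sum_{i=1}^{t} u_i
\quad\Longrightarrow\quad
E[Y_t] = \sum_{i=1}^{t} E[u_i] = 0,
\qquad
\operatorname{Var}(Y_t) = \sum_{i=1}^{t} \operatorname{Var}(u_i) = t\,\sigma_u^2 .
```

The variance grows with $t$, so the distribution of $Y_t$ changes over time, which violates the definition of stationarity.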
Yt = β0 + Yt−1 + ut
• The value of Yt today is in expectation the same as what it actually was yesterday +β0 .
• Random walk = today’s value of Yt is equal to where you were yesterday +β0 + a (random) step ut
of unknown direction and length.
• More generally, the best forecast p periods ahead is
E[Yt+p |Yt ] = β0 p + Yt
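The forecast formula follows from substituting the model into itself $p$ times. The derivation below is a sketch, assuming the future shocks have conditional mean zero, $E[u_{t+j} \mid Y_t] = 0$ for $j \geq 1$.

```latex
\begin{aligned}
Y_{t+1} &= \beta_0 + Y_t + u_{t+1} \\
Y_{t+2} &= \beta_0 + Y_{t+1} + u_{t+2} = 2\beta_0 + Y_t + u_{t+1} + u_{t+2} \\
        &\;\;\vdots \\
Y_{t+p} &= \beta_0\, p + Y_t + \sum_{j=1}^{p} u_{t+j} \\
E[Y_{t+p} \mid Y_t] &= \beta_0\, p + Y_t
\end{aligned}
```

Setting $\beta_0 = 0$ gives the random walk without drift, for which the best forecast at any horizon is $Y_t$.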
Yt = β0 + β1 Yt−1 + ut
• If β1 = 1, Yt is nonstationary.
• If |β1 | < 1, Yt is stationary.
• First thing to do: Plot the data to visually inspect whether there is a (stochastic) trend.
Yt = β0 + β1 Yt−1 + ut
• BUT: how to test when we know that it does not make sense to directly estimate the
above equation unless Yt is stationary, i.e., unless we know the answer to our question?
• Need a trick: subtract Yt−1 from both sides of
Yt = β0 + β1 Yt−1 + ut
to get
∆Yt = β0 + δYt−1 + ut
where ∆Yt = Yt − Yt−1 and δ = β1 − 1.
• We know that β1 ≤ 1 → δ ≤ 0.
• H0 : δ = 0 against
• H1 : δ < 0.
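The test can be sketched in Stata as follows. The series is a simulated random walk, so H0 is true by construction; the seed, sample size and variable names are assumptions for illustration. The first regression computes the t-statistic on Yt−1 by hand, and dfuller reports the same statistic next to the Dickey-Fuller critical values.

```stata
* Simulated random walk (assumption: standard normal shocks)
clear
set seed 4210
set obs 500
gen time = _n
tsset time
gen u = rnormal()
gen y = u
replace y = y[_n - 1] + u if time > 1

* Dickey-Fuller regression: Delta Y_t on a constant and Y_{t-1}
regress D.y L.y
scalar t_ols = _b[L.y] / _se[L.y]
display "t-statistic on L.y = " t_ols

* Same statistic, reported with Dickey-Fuller critical values
dfuller y, regress
display "dfuller test statistic = " r(Zt)
```

The statistic must be compared with the Dickey-Fuller critical values rather than the usual normal ones, because under H0 the t-statistic does not have a standard normal distribution.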
• How do we get the equation for the augmented Dickey-Fuller test? Example with an
AR(2):
Yt = β0 + β1 Yt−1 + β2 Yt−2 + ut
Yt = β0 + (β1 + β2 )Yt−1 − β2 Yt−1 + β2 Yt−2 + ut
Yt = β0 + (β1 + β2 )Yt−1 − β2 [Yt−1 − Yt−2 ] + ut
Yt − Yt−1 = β0 − Yt−1 + (β1 + β2 )Yt−1 − β2 [Yt−1 − Yt−2 ] + ut
∆Yt = β0 + (β1 + β2 − 1)Yt−1 − β2 ∆Yt−1 + ut
∆Yt = β0 + δYt−1 + γ1 ∆Yt−1 + ut
where δ = β1 + β2 − 1 and γ1 = −β2 .
• If there is a unit root, i.e., z = 1 solves β1 z + β2 z² − 1 = 0, then β1 + β2 = 1 and hence δ = 0, too.
∆Yt = β0 + ut
∆Yt = β0 + γ1 ∆Yt−1 + ut
• H0 : δ = 0 against
• H1 : δ < 0.
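The AR(2) rewriting and the augmented test can be checked numerically. This is a sketch on simulated data: the AR(2) coefficients 1.2 and −0.3 (a stationary series, so δ = −0.1 and γ1 = 0.3 in the population) and all names are assumptions made for the example.

```stata
* Simulated stationary AR(2) (assumed coefficients 1.2 and -0.3)
clear
set seed 4210
set obs 500
gen time = _n
tsset time
gen u = rnormal()
gen y = u
replace y = 1.2 * y[_n - 1] - 0.3 * y[_n - 2] + u if time > 2

* AR(2) in levels
quietly regress y L.y L2.y
scalar b1 = _b[L.y]
scalar b2 = _b[L2.y]

* The same model in augmented Dickey-Fuller form
regress D.y L.y LD.y
scalar d_adf = _b[L.y]
scalar g_adf = _b[LD.y]

* delta = beta1 + beta2 - 1 and gamma1 = -beta2 hold for the OLS estimates too
display "delta:  " d_adf "  vs  " b1 + b2 - 1
display "gamma1: " g_adf "  vs  " -b2

* Augmented Dickey-Fuller test with one lagged difference
dfuller y, lags(1) regress
```

Both regressions use the same observations and span the same regressors, so the two parameterizations give identical fits and the relations between the coefficients hold exactly, up to rounding.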
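                         Interpolated Dickey-Fuller
      Test         1% Critical     5% Critical     10% Critical
    Statistic         Value           Value            Value

             |      Coef.   Std. Err.         t    P>|t|    [95% Conf. Interval]
-------------+------------------------------------------------------------------
       lncpi |
         L1. |  -.0002362   .0001404     -1.68    0.093   -.0005117    .0000393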
• Consider a break in the series: A one-time change in the mean of a series (e.g. collapse
of Lehman Brothers in 2008). Such a shock would bias conclusions towards a unit root.
• We will consider how to test for a break in the next lecture.
• Consider a large outlier (or a few large outliers). The series may then look as if it is mean-reverting
although it is not. Test results may be biased towards stationarity.
• The random walk model is the workhorse model for trends in economic time series.
• If Yt does not have a unit root, move ahead with your analysis.