
CHAPTER SIX

BASIC REGRESSION ANALYSIS WITH TIME SERIES DATA
 A time series is a set of observations on the values that a
variable takes at different times.
 Such data may be collected at regular time intervals, such as
daily (e.g., stock prices, weather reports), weekly (e.g.,
money supply figures), monthly [e.g., the unemployment
rate, the Consumer Price Index (CPI)], quarterly (e.g., GDP),
annually (e.g., government budgets), quinquennially, that
is, every 5 years (e.g., the census of manufactures), or
decennially (e.g., the census of population).
 A time series consists of a set of observations measured at
specified, usually equal, time intervals.
 Time series analysis attempts to identify those factors that
exert influence on the values in the series.
 Time series analysis is a basic tool for forecasting.
 Industry and government must forecast future activity to
make decisions and plans to meet projected changes.
 An analysis of the trend of the observations is needed to
acquire an understanding of the progress of events leading to
prevailing conditions.
 The purpose of time series analysis is to capture and examine
the dynamics of the data.
• The nature of time series data
– Temporal ordering of observations; they may not be arbitrarily reordered
– Typical features: serial correlation/non-independence of observations
– How should we think about the randomness in time series data?
 The outcome of economic variables (e.g. GNP, Dow Jones) is
uncertain; they should therefore be modeled as random
variables

 Time series are sequences of random variables (= stochastic processes)

 Randomness does not come from sampling from a population

 "Sample" = the one realized path of the time series out of the many possible paths the stochastic process could have taken
Analyzing Time Series:
Basic Regression Analysis
• Example: US inflation and unemployment rates 1948-2003

 Here, there are only two time series. There may be many more variables whose paths over time are observed simultaneously.

 Time series analysis focuses on modeling the dependency of a variable on its own past, and on the present and past values of other variables.
Analyzing Time Series:
Basic Regression Analysis
• Examples of time series regression models
• Static models
– In static time series models, the current value of one variable is
modeled as the result of the current values of explanatory variables
• Examples for static models
 There is a contemporaneous relationship between unemployment and inflation (= Phillips curve).

 The current murder rate is determined by the current conviction rate, unemployment rate, and fraction of young males in the population.
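
As a sketch, such static models can be written out explicitly; the variable names below (inf, unem, mrdrte, convrte, yngmle) are illustrative shorthands, not notation from the slides:

    inf_t = β0 + β1·unem_t + u_t                                  (static Phillips curve)
    mrdrte_t = β0 + β1·convrte_t + β2·unem_t + β3·yngmle_t + u_t  (murder rate equation)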
Analyzing Time Series:
Basic Regression Analysis
• Finite distributed lag models
– In finite distributed lag models, the explanatory variables are allowed
to influence the dependent variable with a time lag
• Example for a finite distributed lag model
– The fertility rate may depend on the tax value of a child, but for
biological and behavioral reasons, the effect may have a lag

 gfr_t = α0 + δ0·pe_t + δ1·pe_(t-1) + δ2·pe_(t-2) + u_t,
where gfr_t denotes children born per 1,000 women in year t and pe_t, pe_(t-1), pe_(t-2) the tax exemption in years t, t-1 and t-2.
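
A minimal sketch of how such a finite distributed lag model could be estimated with statsmodels, assuming a pandas DataFrame with columns gfr (births per 1,000 women) and pe (tax exemption); the file name and column names are hypothetical:

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("fertility.csv")            # hypothetical annual data set
    # build the lagged tax-exemption regressors pe_t, pe_(t-1), pe_(t-2)
    df["pe_lag1"] = df["pe"].shift(1)
    df["pe_lag2"] = df["pe"].shift(2)
    df = df.dropna()                             # the first two years are lost to the lags

    X = sm.add_constant(df[["pe", "pe_lag1", "pe_lag2"]])
    fdl = sm.OLS(df["gfr"], X).fit()
    print(fdl.summary())

    # cumulated effect of a permanent one-unit increase in pe (long-run propensity)
    print("Long-run propensity:", fdl.params[["pe", "pe_lag1", "pe_lag2"]].sum())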
Analyzing Time Series:
Basic Regression Analysis
• Interpretation of the effects in finite distributed lag models

• Effect of a past shock on the current value of the dep. variable

 Effect of a transitory shock: if there is a one-time shock in a past period, the dep. variable will change temporarily by the amount indicated by the coefficient of the corresponding lag.

 Effect of a permanent shock: if there is a permanent shock in a past period, i.e. the explanatory variable permanently increases by one unit, the effect on the dep. variable will be the cumulated effect of all relevant lags. This is a long-run effect on the dependent variable.
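
In the notation of the fertility equation above, with lag coefficients δ0, δ1, δ2, these two effects can be summarized as (a sketch):

    Effect of a one-time (transitory) shock h periods ago: δ_h
    Long-run effect of a permanent one-unit increase: LRP = δ0 + δ1 + δ2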
Analyzing Time Series:
Basic Regression Analysis
• Graphical illustration of lagged effects

 For example, the effect is biggest after a lag of one period. After that, the effect vanishes (if the initial shock was transitory).

 The long-run effect of a permanent shock is the cumulated effect of all relevant lagged effects. It does not vanish (if the initial shock is a permanent one).
• Finite sample properties of OLS under classical assumptions
• Assumption TS.1 (Linear in parameters)

The time series involved obey a linear relationship. The stochastic processes yt, xt1,…,
xtk are observed, the error process ut is unobserved. The definition of the explanatory
variables is general, e.g. they may be lags or functions of other explanatory variables.
• Assumption TS.2 (No perfect collinearity)
"In the sample (and therefore in the underlying time series process), no independent variable is constant nor a perfect linear combination of the others."
• Notation
– X: the matrix that collects all the information on the complete time paths of all explanatory variables
– x_t: the values of all explanatory variables in period number t

• Assumption TS.3 (Zero conditional mean)


The mean value of the unobserved factors is
unrelated to the values of the explanatory variables
in all periods
Analyzing Time Series:
Basic Regression Analysis
 Discussion of assumption TS.3
– Contemporaneous exogeneity: the mean of the error term is unrelated to the explanatory variables of the same period

– Strict exogeneity: the mean of the error term is unrelated to the values of the explanatory variables of all periods

 Strict exogeneity is stronger than contemporaneous exogeneity


– TS.3 rules out feedback from the dep. variable on future values of the
explanatory variables; this is often questionable esp. if explanatory variables
"adjust" to past changes in the dependent variable
– If the error term is related to past values of the explanatory variables, one
should include these values as contemporaneous regressors
Analyzing Time Series:
Basic Regression Analysis
 Theorem (Unbiasedness of OLS)
– Under assumptions TS.1 – TS.3, the OLS estimators are unbiased conditional on the explanatory variables

 Assumption TS.4 (Homoscedasticity)


The volatility of the errors must not be related
to the explanatory variables in any of the
periods
– A sufficient condition is that the volatility of the error is independent of
the explanatory variables and that it is constant over time
– In the time series context, homoscedasticity may also be easily violated,
e.g. if the volatility of the dep. variable depends on regime changes
Analyzing Time Series:
Basic Regression Analysis
 Assumption TS.5 (No serial correlation)
Conditional on the explanatory variables, the unobserved factors must not be correlated over time
 Discussion of assumption TS.5
– Why was such an assumption not made in the cross-sectional case?
– The assumption may easily be violated if, conditional on knowing the
values of the indep. variables, omitted factors are correlated over time
– The assumption may also serve as substitute for the random sampling
assumption if sampling a cross-section is not done completely randomly
– In this case, given the values of the explanatory variables, errors have to
be uncorrelated across cross-sectional units (e.g. states)
Analyzing Time Series:
Basic Regression Analysis
• Theorem (OLS sampling variances)

– Under assumptions TS.1 – TS.5, the sampling variances of the OLS estimators are given by the same formula as in the cross-sectional case

The conditioning on the values of the explanatory variables is not easy to understand. It
effectively means that, in a finite sample, one ignores the sampling variability coming from the
randomness of the regressors. This kind of sampling variability will normally not be large
(because of the sums).

 Theorem (Unbiased estimation of the error variance)
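
The two theorems above refer to formulas that are the same as in the cross-sectional case; for reference, a sketch in standard notation (the symbols SST_j, R_j² and SSR are standard, not from the slides):

    Var(β̂_j | X) = σ² / [SST_j·(1 − R_j²)], j = 1, …, k,
    where SST_j is the total sample variation in the j-th explanatory variable and R_j² the R-squared from regressing it on the other explanatory variables; and

    σ̂² = SSR / (n − k − 1)

    is an unbiased estimator of the error variance σ² under TS.1 – TS.5.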


Analyzing Time Series:
Basic Regression Analysis
• Theorem (Gauss-Markov Theorem)
– Under assumptions TS.1 – TS.5, the OLS estimators have the minimal variance
of all linear unbiased estimators of the regression coefficients
– This holds conditional as well as unconditional on the regressors

• Assumption TS.6 (Normality)
– The errors are independent of the explanatory variables and are i.i.d. Normal; this assumption implies TS.3 – TS.5

• Theorem (Normal sampling distributions)
– Under assumptions TS.1 – TS.6, the OLS estimators have the usual normal distribution (conditional on the explanatory variables). The usual F- and t-tests are valid.
Analyzing Time Series:
Basic Regression Analysis
• Example: Static Phillips curve
 Contrary to theory, the estimated Phillips curve does not suggest a tradeoff between inflation and unemployment

 The error term contains factors such as monetary shocks, income/demand shocks, oil price shocks, supply shocks, or exchange rate shocks

• Discussion of CLM assumptions

– TS.1: A linear relationship might be restrictive, but it should be a good approximation.
– TS.2: Perfect collinearity is not a problem as long as unemployment varies over time.
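
A minimal sketch of estimating this static Phillips curve with statsmodels, assuming annual series inf and unem in a DataFrame (file and column names hypothetical):

    import pandas as pd
    import statsmodels.api as sm

    phillips = pd.read_csv("phillips.csv")   # hypothetical file with columns 'inf' and 'unem'
    X = sm.add_constant(phillips["unem"])
    res = sm.OLS(phillips["inf"], X).fit()
    print(res.summary())                     # theory would predict a negative slope on unem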
Analyzing Time Series:
Basic Regression Analysis
• Discussion of CLM assumptions (cont.)

– TS.3: Easily violated. For example, past unemployment shocks may lead to future demand shocks which may dampen inflation; or an oil price shock means more inflation and may lead to future increases in unemployment.

– TS.4: Violated if monetary policy is more "nervous" in times of high unemployment.

– TS.5: Violated if exchange rate influences persist over time (they cannot be explained by unemployment).

– TS.6: Questionable.
Analyzing Time Series:
Basic Regression Analysis
• Example: Effects of inflation and deficits on interest rates

 The dependent variable is the interest rate on 3-month T-bills; the explanatory variables are the inflation rate and the government deficit as a percentage of GDP

 The error term represents other factors that determine interest rates in general, e.g. business cycle effects

• Discussion of CLM assumptions

– TS.1: A linear relationship might be restrictive, but it should be a good approximation.
– TS.2: Perfect collinearity will seldom be a problem in practice.
Analyzing Time Series:
Basic Regression Analysis
• Discussion of CLM assumptions (cont.)
– TS.3: Easily violated. For example, past deficit spending may boost economic activity, which in turn may lead to general interest rate rises; or unobserved demand shocks may increase interest rates and lead to higher inflation in future periods.

– TS.4: Violated if higher deficits lead to more uncertainty about state finances and possibly more abrupt rate changes.

– TS.5: Violated if business cycle effects persist across years (and they cannot be completely accounted for by inflation and the evolution of deficits).

– TS.6: Questionable.
Analyzing Time Series:
Basic Regression Analysis
• Using dummy explanatory variables in time series

 gfr_t = β0 + β1·pe_t + β2·ww2_t + β3·pill_t + u_t,
where gfr_t is children born per 1,000 women in year t, pe_t the tax exemption in year t, ww2_t a dummy for the World War II years (1941-45), and pill_t a dummy for the availability of the contraceptive pill (1963-present).

• Interpretation
– During World War II, the fertility rate was temporarily lower
– It has been permanently lower since the introduction of the pill in 1963
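
A sketch of constructing such dummies from a year variable, assuming a DataFrame with columns year, gfr and pe as in the fertility example (file and column names hypothetical):

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("fertility.csv")                          # hypothetical annual data set
    df["ww2"] = df["year"].between(1941, 1945).astype(int)     # 1 during World War II, 0 otherwise
    df["pill"] = (df["year"] >= 1963).astype(int)              # 1 from 1963 onwards

    X = sm.add_constant(df[["pe", "ww2", "pill"]])
    print(sm.OLS(df["gfr"], X).fit().summary())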
Analyzing Time Series:
Basic Regression Analysis
• Time series with trends

 Example for a time series with a linear upward trend
Analyzing Time Series:
Basic Regression Analysis
• Modelling a linear time trend

 Abstracting from random deviations, the dependent variable increases by a constant amount per time unit

 Alternatively, the expected value of the dependent variable is a linear function of time

• Modelling an exponential time trend

 Abstracting from random deviations, the dependent variable increases by a constant percentage per time unit
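
Written out (a sketch with generic symbols):

    Linear trend: y_t = α0 + α1·t + e_t, t = 1, 2, …, so that, abstracting from e_t, y changes by α1 units per period.

    Exponential trend: log(y_t) = β0 + β1·t + e_t, so that, abstracting from e_t, y grows at an approximately constant rate of 100·β1 percent per period.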
Analyzing Time Series:
Basic Regression Analysis
• Example for a time series with an exponential trend

Abstracting from
random deviations,
the time series has a
constant growth rate
Analyzing Time Series:
Basic Regression Analysis
• Using trending variables in regression analysis
– If trending variables are regressed on each other, a spurious relationship may arise if the variables are driven by a common trend
– In this case, it is important to include a trend in the regression
• Example: Housing investment and prices

 Per capita housing investment is regressed on a housing price index; it looks as if investment and prices are positively related
Analyzing Time Series:
Basic Regression Analysis
• Example: Housing investment and prices (cont.)

 Once a time trend is included in the regression, there is no significant relationship between price and investment anymore

• When should a trend be included?


– If the dependent variable displays an obvious trending behaviour
– If both the dependent and some independent variables have trends
– If only some of the independent variables have trends; their effect on the dep. var. may only be visible after a trend has been subtracted
Analyzing Time Series:
Basic Regression Analysis
• A detrending interpretation of regressions with a time trend
– It turns out that the OLS coefficients in a regression including a trend
are the same as the coefficients in a regression without a trend but
where all the variables have been detrended before the regression
– This follows from the general interpretation of multiple regressions
• Computing R-squared when the dependent variable is trending
– Due to the trend, the variance of the dep. var. will be overstated
– It is better to first detrend the dep. var. and then run the regression on
all the indep. variables (plus a trend if they are trending as well)
– The R-squared of this regression is a more adequate measure of fit
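
A small synthetic illustration of this equivalence (not the housing data): the slope on x from a regression that includes a time trend equals the slope from regressing the detrended y on the detrended x.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 100
    t = np.arange(n)
    x = 0.5 * t + rng.normal(size=n)                  # a trending regressor
    y = 2.0 + 0.8 * x + 0.3 * t + rng.normal(size=n)  # a trending dependent variable

    # (1) regression of y on x and a linear time trend
    b_with_trend = sm.OLS(y, sm.add_constant(np.column_stack([x, t]))).fit().params[1]

    # (2) detrend y and x separately, then regress the detrended series on each other
    T = sm.add_constant(t)
    y_dt = sm.OLS(y, T).fit().resid
    x_dt = sm.OLS(x, T).fit().resid
    b_detrended = sm.OLS(y_dt, sm.add_constant(x_dt)).fit().params[1]

    print(b_with_trend, b_detrended)                  # the two slope estimates coincide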
Analyzing Time Series:
Basic Regression Analysis
• Modelling seasonality in time series
• A simple method is to include a set of seasonal dummies:

– e.g. a December dummy: = 1 if the observation is from December, = 0 otherwise; analogous dummies are defined for the other seasons

• Similar remarks apply as in the case of deterministic time trends


– The regression coefficients on the explanatory variables can be seen as the
result of first deseasonalizing the dep. and the explanat. variables
– An R-squared that is based on first deseasonalizing the dep. var. may better
reflect the explanatory power of the explanatory variables
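
A sketch of the seasonal-dummy approach, assuming a monthly DataFrame with a dependent variable y, a regressor x, and an integer month column (all names hypothetical):

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("monthly.csv")              # hypothetical data set with 'y', 'x', 'month'
    # monthly dummies; drop one month to avoid perfect collinearity with the constant
    season = pd.get_dummies(df["month"], prefix="m", drop_first=True).astype(float)
    X = sm.add_constant(pd.concat([df[["x"]], season], axis=1))
    print(sm.OLS(df["y"], X).fit().summary())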
Stationarity
 A key concept underlying time series processes is that of
stationarity.
 A time series is covariance stationary when it has the following
three characteristics:
 exhibits mean reversion in that it fluctuates around a
constant long-run mean;
 has a finite variance that is time-invariant; and
 has a theoretical correlogram that diminishes as the lag
length increases.
 In its simplest terms a time series 𝑌𝑡 is said to be stationary if:
 E(𝑌𝑡 ) = constant for all t;
 var(𝑌𝑡 ) = constant for all t; and
 cov(𝑌𝑡 , 𝑌𝑡+𝑘 ) = constant for all t and all k ≠ 0; or, equivalently, if its
mean, variance and covariances remain constant over time.
 In short, if a time series is stationary, its mean, variance, and
autocovariance (at various lags) remain the same no matter
at what point we measure them; that is, they are time invariant.
 Such a time series will tend to return to its mean (called mean
reversion) and fluctuations around this mean (measured by its
variance) will have a broadly constant amplitude.
 If a time series is not stationary in the sense just defined, it is
called a nonstationary time series.
 In other words, a nonstationary time series will have a
time varying mean or a time-varying variance or both.
 Why are stationary time series so important?
 Because if a time series is nonstationary, we can study its
behavior only for the time period under consideration.
 Each set of time series data will therefore be for a particular
episode. As a consequence, it is not possible to generalize it to
other time periods.
 Therefore, for the purpose of forecasting, such
(nonstationary) time series may be of little practical value.
 At the formal level, stationarity can be checked by finding out if
the time series contains a unit root. The Dickey–Fuller (DF) and
augmented Dickey–Fuller (ADF) tests can be used for this
purpose.
 An economic time series can be trend stationary (TS) or
difference stationary (DS).
 A TS time series has a deterministic trend, whereas a DS time
series has a variable, or stochastic, trend.
 The common practice of including the time or trend variable in a
regression model to detrend the data is justifiable only for TS
time series.
 The DF and ADF tests can be applied to determine whether a
time series is TS or DS.
 The Dickey-Fuller (DF) test was developed and popularized
by Dickey and Fuller (1979).
 The null hypothesis of DF test is that there is a unit
root in an Autoregressive (AR) model, which
implies that the data series is not stationary.
 The alternative hypothesis is generally stationarity or trend stationarity, but it can differ depending on the version of the test being used.
 The test of the null hypothesis of nonstationarity is performed at the 1%, 5% and 10% significance levels.
Autoregressive Unit Root Testing: DF
 Decision
 p-value > 0.05: Fail to reject the null hypothesis (H0), the
data has a unit root and is non-stationary.
 p-value <= 0.05: Reject the null hypothesis (H0), the data
does not have a unit root and is stationary.
 If the test statistic is smaller in absolute value than all of the critical values, we cannot reject the null hypothesis.
 If the test statistic is greater in absolute value than all of the critical values, we reject the null hypothesis.
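
A minimal sketch of this decision rule with the ADF test in statsmodels; the illustrative series below is a simulated random walk (in practice, use the series under test):

    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    series = np.cumsum(np.random.default_rng(1).normal(size=200))   # illustrative random walk

    stat, pvalue, usedlag, nobs, crit, icbest = adfuller(series, regression="c", autolag="AIC")
    print("ADF statistic:", stat, "p-value:", pvalue, "critical values:", crit)
    if pvalue <= 0.05:
        print("Reject H0: no unit root, the series looks stationary")
    else:
        print("Fail to reject H0: the series appears to have a unit root (nonstationary)")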
 Regression of one time series variable on one or more time
series variables often can give nonsensical or spurious results.
 This phenomenon is known as spurious regression.
 A "spurious regression" is one in which the time-series variables are nonstationary and independent.
 It is spurious because the regression will most likely indicate a
non-existing relationship:
 The coefficient estimate will not converge toward zero (the true value); instead, in the limit the coefficient estimate will follow a non-degenerate distribution.
 The t value most often is significant.
 𝑅2 is typically very high.
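
A simulation that reproduces this pattern: two independent random walks are generated, yet regressing one on the other typically produces a large t-statistic and a sizeable R² (purely illustrative):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(42)
    n = 500
    y = np.cumsum(rng.normal(size=n))    # random walk
    x = np.cumsum(rng.normal(size=n))    # an independent random walk

    res = sm.OLS(y, sm.add_constant(x)).fit()
    print("t-statistic:", res.tvalues[1], "R-squared:", res.rsquared)
    # despite y and x being unrelated, the t-statistic is usually far outside +/- 2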
 One way to guard against it is to find out if the time series
are cointegrated.
 Cointegration means that despite being individually
nonstationary, a linear combination of two or more time
series can be stationary.
 Cointegration analysis is used to detect a possible long-run (equilibrium) relationship between time series processes.
 Nobel laureates Robert Engle and Clive Granger introduced
the concept of cointegration in 1987.
 The most popular cointegration tests include Engle-
Granger, the Johansen Test, and the Phillips-Ouliaris test.
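
A sketch of the Engle-Granger test as implemented in statsmodels; the two series below are simulated so that they share a common stochastic trend (purely illustrative):

    import numpy as np
    from statsmodels.tsa.stattools import coint

    rng = np.random.default_rng(7)
    common = np.cumsum(rng.normal(size=300))      # a shared stochastic trend
    y1 = common + rng.normal(size=300)            # two nonstationary series driven by it
    y2 = 0.5 * common + rng.normal(size=300)

    t_stat, pvalue, crit_values = coint(y1, y2)
    print("Engle-Granger statistic:", t_stat, "p-value:", pvalue)
    # a small p-value suggests a stationary linear combination, i.e. cointegration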
THE END OF CHAPTER SIX
THANK YOU FOR YOUR
ATTENTION!
