Lecture 3 - Predictability
Andrea Buraschi
Spring 2015
Road Map
Video 1
[http://www.cnbc.com/2015/09/15/robert-shiller-this-is-the-sign-were-in-a-bubble.html]
Video 2
[https://www.youtube.com/watch?v=A8awjaedha0]
Return Predictability for the Market
Fama and French (1988)
I Fama and French embark on a series of empirical studies to analyze
the properties of excess returns on stocks and bonds at various
horizons:
$$ExRet(t, t+T) = \alpha(T) + \beta(T) X(t) + \varepsilon(t, t+T)$$
I They identified three forecasting variables that have been used
extensively in subsequent work:
I The dividend/price ratio, D/P; the default premium; the
Baa-Aaa corporate bond spread.
I In the last 10 years it has been argued that other variables also have
predictive power. Examples include:
I The consumption-wealth ratio, cay (see Lettau and Ludvigson
(2001))
I Detrended short-term interest rates, rrel (see Ang and Bekaert
(2007))
I Ang and Bekaert (2007) argue that at short horizons, the short rate
strongly negatively predicts excess returns, while at long horizons,
the predictive power of the dividend yield is weak. These results are
robust in international data and are not due to lack of power.
I If you decompose the total return of the US stock market into (a) dividends and
(b) capital gains, the dividend component is very important.
I Total return is the sum of the blue, grey, and red bars. The largest component is
always the dividend yield.
I Form four portfolios and dynamically rebalance them according to the dividend yield.
What do you get?
I Is the UK evidence very different? No.
I Maybe high dividend yield assets are riskier. Let’s look at the Sharpe ratios of the
four portfolios.
Return Predictability
I The results give the impression that predictability gets stronger at
longer horizons (i.e. when using long-horizon regressions).
I Is this a genuine result, or a mechanical result?
1. Forecasts from persistent variables build up over time and are more
important at long horizons. Forecasts from fast-moving variables die
out more quickly.
2. Long-horizon forecasts result when the forecasting variable (D/P)
forecasts one-year returns in the same direction for many years in the
future.
3. There is nothing special or different about long-run forecasts. They
are the mechanical result of short-run forecasts and a persistent
forecasting variable.
Long Horizons
Long horizons are the same phenomenon: they result from short horizons
and a persistent forecasting variable:
$$\frac{D_t}{P_t} = a + 0.94 \frac{D_{t-1}}{P_{t-1}} + \varepsilon_t$$
Long Horizons
Denote $x_t = D_t / P_t$. Then
$$r_{t+1} = b x_t + \varepsilon_{t+1}$$
$$x_{t+1} = \phi x_t + \delta_{t+1}$$
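Summing two periods:
$$r_{t+1} + r_{t+2} = (b x_t + \varepsilon_{t+1}) + (b x_{t+1} + \varepsilon_{t+2})$$
$$= b x_t + b(\phi x_t + \delta_{t+1}) + \varepsilon_{t+1} + \varepsilon_{t+2}$$
$$= b(1 + \phi) x_t + (b \delta_{t+1} + \varepsilon_{t+1} + \varepsilon_{t+2})$$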
Similarly
$$r_{t+1} + r_{t+2} + r_{t+3} = b(1 + \phi + \phi^2) x_t + (\text{error})$$
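The pattern generalizes: summing k one-period forecasts gives a slope of b(1 + φ + ... + φ^(k-1)) on x_t. A minimal Python sketch of that arithmetic follows. The value b = 0.1 is purely illustrative (an assumption, not an estimate from the lecture), while φ = 0.94 is the AR(1) coefficient quoted for D/P.

```python
def long_horizon_slope(b, phi, k):
    """Slope of r_{t+1} + ... + r_{t+k} on x_t: b * (1 + phi + ... + phi**(k-1))."""
    return b * sum(phi ** j for j in range(k))


b = 0.1     # hypothetical one-period slope, chosen only for illustration
phi = 0.94  # persistence of the forecasting variable

for k in (1, 2, 3, 5, 10, 20):
    slope = long_horizon_slope(b, phi, k)
    # The ratio to b*k shows how far the slope falls short of linear growth.
    print(f"horizon {k:2d}: slope = {slope:.4f}, ratio to b*k = {slope / (b * k):.3f}")
```

With φ close to one the ratio stays near one at short horizons and declines as k grows, so the slopes grow almost linearly at first and then level off.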
Long Horizons
I The coefficients rise with the horizon if φ is close to one, i.e. if the
right-hand variable moves slowly over time.
I The b coefficients rise almost linearly with the horizon and then taper off
a bit.
I Ang and Bekaert (2007) rigorously discuss the issues that arise when
making statistical inference at long horizons.
I They account for the dynamics of the predicting factor to check whether
the result is just “mechanical”. They find that:
I The long-horizon results depend critically on the choice of standard
errors. With the standard Hansen-Hodrick (1980) or Newey-West
(1987) standard errors, there is some evidence of long-horizon
predictability, but it disappears when we correct for heteroskedasticity
and remove the moving-average structure in the error terms induced
by summing returns over long horizons (see Richardson and Smith,
1991; Hodrick, 1992; Boudoukh and Richardson, 1993).
I They find that the predictive ability of the dividend yield is
considerably enhanced, at short horizons, in a bivariate regression
with the short rate.
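To see why the choice of standard errors matters, the sketch below simulates returns that are unpredictable by construction, sums them over k overlapping periods, and compares the naive OLS standard error of the slope with a Newey-West one. Everything here is an assumption made for illustration: the simulated data, the sample size, the horizon k = 12 and the lag choice k - 1. It does not implement the Hodrick (1992) correction cited above.

```python
import math
import random


def simulate_null(T, phi, seed=0):
    """Persistent predictor x and returns r that x does NOT forecast (true slope 0)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(T):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    r = [rng.gauss(0.0, 1.0) for _ in range(T + 1)]
    return x, r


def overlapping_regression(x, r, k, lags):
    """Regress r_{t+1} + ... + r_{t+k} on x_t.

    Returns (slope, naive OLS s.e., Newey-West s.e. with Bartlett weights).
    """
    n = len(r) - k
    y = [sum(r[t + 1:t + 1 + k]) for t in range(n)]  # overlapping k-period sums
    xs = x[:n]
    mx, my = sum(xs) / n, sum(y) / n
    xd = [v - mx for v in xs]
    sxx = sum(v * v for v in xd)
    beta = sum(a * (b - my) for a, b in zip(xd, y)) / sxx
    resid = [(b - my) - beta * a for a, b in zip(xd, y)]

    # Naive OLS standard error: treats the residuals as serially uncorrelated.
    se_ols = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Newey-West: long-run variance of the scores h_t = (x_t - mean) * resid_t.
    h = [a * e for a, e in zip(xd, resid)]
    s = sum(v * v for v in h)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1)
        s += 2.0 * weight * sum(h[t] * h[t - lag] for t in range(lag, n))
    se_nw = math.sqrt(s) / sxx
    return beta, se_ols, se_nw


x, r = simulate_null(T=2000, phi=0.94)
beta, se_ols, se_nw = overlapping_regression(x, r, k=12, lags=11)
print(f"slope = {beta:.3f}, naive t = {beta / se_ols:.2f}, Newey-West t = {beta / se_nw:.2f}")
```

Because overlapping sums share k - 1 of their k terms with their neighbours, the residuals are strongly autocorrelated and the naive standard error understates the sampling uncertainty.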
Ang and Bekaert (2007)
• Once you account for the mechanical effect of slope coefficients increasing with the horizon, the
results are sensitive to the sample period.
• Still, in the period 1951-1990, for the U.S., the dividend yield and short-term rates are
powerful predictors up to 4 quarters out.
• The result is robust in international data.
Is the PD Ratio time-varying? If so ... Why?
What is the Implication of Predictability?
$$1 = R_{t+1}^{-1} R_{t+1} = R_{t+1}^{-1} \frac{P_{t+1} + D_{t+1}}{P_t}$$
I Multiplying both sides by $P_t / D_t$ and rearranging:
$$\frac{P_t}{D_t} = R_{t+1}^{-1} \frac{P_{t+1} + D_{t+1}}{P_t} \frac{P_t}{D_t} = R_{t+1}^{-1} \left(1 + \frac{P_{t+1}}{D_{t+1}}\right) \frac{D_{t+1}}{D_t}$$
Returns and Dividends
Taking logs, with lowercase letters denoting logs:
$$p_t - d_t = -r_{t+1} + \log\left(1 + e^{p_{t+1} - d_{t+1}}\right) + \Delta d_{t+1}$$
I Taking a Taylor series expansion of the log term about a point $P/D = \exp(\overline{p - d})$:
$$p_t - d_t = -r_{t+1} + \Delta d_{t+1} + k + \rho (p_{t+1} - d_{t+1}) \tag{1}$$
where
$$k = \log(1 + P/D) - \rho\,(\overline{p - d}), \qquad \rho = \frac{P/D}{1 + P/D}$$
Notice that D/P ≈ 4%, so that P/D ≈ 25 and ρ is about 0.96.
I Iterating the previous difference equation forward and taking conditional expectations we get:
$$p_t - d_t = \text{const.} + E_t\left[\sum_{j=1}^{\infty} \rho^{j-1} (\Delta d_{t+j} - r_{t+j})\right] \tag{2}$$
This is obtained by ruling out explosive behaviour of stock prices, i.e. by imposing
$\lim_{j \to \infty} \rho^j (p_{t+j} - d_{t+j}) = 0$. This is equivalent to ruling out bubbles.
I Let us study what this equation tells us:
$$p_t - d_t = \text{const.} + E_t\left[\sum_{j=1}^{\infty} \rho^{j-1} (\Delta d_{t+j} - r_{t+j})\right] \tag{3}$$
1. Price-dividend ratios can move if and only if there is news about current
dividends, future dividend growth or future returns.
2. If $\Delta d_t$ and $r_t$ are totally unpredictable, i.e. if $E_t(\Delta d_{t+j})$ and $E_t(r_{t+j})$ are the
same for every time t, then $p_t - d_t$ must be constant (which we know isn’t true!).
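As a rough numerical check of the log-linearization behind this present-value relation, the sketch below computes ρ and the constant k for a dividend yield of 4% and compares the exact log return with its linear approximation. The constant is taken as k = log(1 + P/D) - ρ log(P/D) at the expansion point. The function names and the test values of P/D and dividend growth are illustrative assumptions.

```python
import math


def linearization_constants(dividend_yield):
    """Return (rho, k) for an expansion around P/D = 1 / dividend_yield."""
    pd_bar = 1.0 / dividend_yield
    rho = pd_bar / (1.0 + pd_bar)
    k = math.log(1.0 + pd_bar) - rho * math.log(pd_bar)
    return rho, k


def exact_log_return(pd_now, pd_next, div_growth):
    """log R_{t+1}, with R_{t+1} = (1 + P_{t+1}/D_{t+1}) * (D_{t+1}/D_t) / (P_t/D_t)."""
    return math.log((1.0 + pd_next) * math.exp(div_growth) / pd_now)


def approx_log_return(pd_now, pd_next, div_growth, rho, k):
    """Linearized return: k + rho*(p_{t+1} - d_{t+1}) + dd_{t+1} - (p_t - d_t)."""
    return k + rho * math.log(pd_next) + div_growth - math.log(pd_now)


rho, k = linearization_constants(0.04)
print(f"rho = {rho:.4f}, k = {k:.4f}")

# Hypothetical move of P/D from 25 to 30 with 2% log dividend growth.
exact = exact_log_return(25.0, 30.0, 0.02)
approx = approx_log_return(25.0, 30.0, 0.02, rho, k)
print(f"exact = {exact:.5f}, approx = {approx:.5f}, error = {approx - exact:.5f}")
```

The approximation is exact at the expansion point and the error grows with the distance of P/D from it.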
I Cochrane argues that although predictability tests are indeed subject to several
statistical issues (see Stambaugh), there is strong reason to believe in
predictability.
I Far from being evidence of market inefficiency, predictability is simply equivalent
to time-variation in risk premia.
I Panic and fear will stop an individual from taking advantage of predictability
(time-variation in risk premia).
I Did you double up your investment in the stock market in March 2009? Most
likely not ...
I However, this implies that long-run investors (endowment funds, sovereign
wealth funds, central banks) can take advantage of this feature in their strategic
portfolios!
The Variance of Price / Dividend Ratios
I If we forget the constant, i.e. treat variables as deviations from the mean, then
multiply both sides of (2) by $(p_t - d_t)$ and take the unconditional mean:
$$E[(p_t - d_t)(p_t - d_t)] = E\left[(p_t - d_t) \sum_{j=1}^{\infty} \rho^{j-1} (\Delta d_{t+j} - r_{t+j})\right]$$
Equivalently, the variance of the price-dividend ratio splits into two components:
$$var(p_t - d_t) = cov\left(p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} \Delta d_{t+j}\right) - cov\left(p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} r_{t+j}\right)$$
I This can be read as a regression forecast: it states that if prices are moving on news of future
dividend growth - even news we cannot see - then prices should predict high
dividend growth!
I What’s the evidence? Try decomposing the total variance of D/P into the two
components!
I Although the evidence is statistically insignificant, it seems that a high
price/dividend ratio forecasts a decline in future dividends! This is the wrong
direction!
I How has this happened?
I It seems that all variation in P/D ratios is due to the discount channel and none
due to the cash flow channel!
I High prices reflect low risk premia and lower expected excess returns.
I There are doubts about the statistical significance of return predictability.
I But we must remember: Volatility in prices implies that prices must predict
something...
I Can we talk about irrational expectations of a price bubble to explain these
observations? Or is there another explanation?
Introducing dynamics in the dividend yield
I Let’s generalize the approach by allowing for a proper dynamic model. So far
$(d_t - p_t)$ was assumed to have no dynamics.
I The restriction that $1 = b_r - b_d$ depends on this assumption. Therefore, let’s
assume that
$$(d_{t+1} - p_{t+1}) = a_{dp} + \phi (d_t - p_t) + \varepsilon^{dp}_{t+1}$$
I From the Campbell-Shiller (1988) linearization:
$$r_{t+1} = \rho (p_{t+1} - d_{t+1}) + \Delta d_{t+1} - (p_t - d_t)$$
$$r_{t+1} = -\rho (d_{t+1} - p_{t+1}) + \Delta d_{t+1} + (d_t - p_t)$$
I Thus
$$r_{t+1} = a_r + b_r (d_t - p_t) + \varepsilon^{r}_{t+1} \tag{8}$$
$$\Delta d_{t+1} = a_d + b_d (d_t - p_t) + \varepsilon^{d}_{t+1} \tag{9}$$
$$d_{t+1} - p_{t+1} = a_{dp} + \phi (d_t - p_t) + \varepsilon^{dp}_{t+1} \tag{10}$$
They are linked by the restriction that
$$1 = \rho\phi + b_r - b_d.$$
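The restriction pins down any one of the three slopes given the other two. As a minimal sketch, the code below solves it for the return slope and then divides the identity by 1 - ρφ, which splits dividend-yield variation into a return share and a dividend-growth share that sum to one. The inputs ρ = 0.96 and φ = 0.94 are the values quoted in the lecture; setting the dividend-growth slope to zero is an illustrative assumption, and the function names are hypothetical.

```python
def implied_return_slope(rho, phi, b_d):
    """Return slope implied by the identity 1 = rho*phi + b_r - b_d."""
    return 1.0 - rho * phi + b_d


def long_run_shares(rho, phi, b_d):
    """Split 1 = b_r/(1 - rho*phi) - b_d/(1 - rho*phi) into its two terms.

    Returns (return share, dividend-growth share); the two sum to one.
    """
    b_r = implied_return_slope(rho, phi, b_d)
    scale = 1.0 - rho * phi
    return b_r / scale, -b_d / scale


rho, phi = 0.96, 0.94
b_d = 0.0  # assumption: the dividend yield does not forecast dividend growth

b_r = implied_return_slope(rho, phi, b_d)
ret_share, div_share = long_run_shares(rho, phi, b_d)
print(f"implied b_r = {b_r:.4f}")
print(f"return share = {ret_share:.2f}, dividend-growth share = {div_share:.2f}")
```

Under these inputs, if the dividend yield does not forecast dividend growth, it must forecast returns, and all of its variation is attributed to the discount-rate channel.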
I In the early days of the forecasting regressions, some scholars were very
skeptical of the results.
I An important example was R. Stambaugh. In his 1999 article ’Predictive regressions’,
Stambaugh shows that regressions of returns on lagged endogenous
variables are contaminated by biased coefficient estimates and incorrect standard errors.
I These biases need to be accounted for explicitly.
Stambaugh (1999)
Typical lagged explanatory variables for stock-return regressions are correlated with
contemporaneous stock returns:
I D /P negatively correlated with contemporaneous stock returns (price in the
denominator)
I cay negatively correlated (wealth in denominator)
I Model:
$$x_t = \alpha + \rho x_{t-1} + \varepsilon_t$$
I Estimate:
$$\hat{\rho} = \frac{cov(x_t, x_{t-1})}{var(x_{t-1})} = \frac{cov([\alpha + \rho x_{t-1} + \varepsilon_t], x_{t-1})}{var(x_{t-1})}$$
I Bias:
$$\hat{\rho} = \rho + \frac{cov(\varepsilon_t, x_{t-1})}{var(x_{t-1})}$$
I Bias arises because in a finite sample $\varepsilon_t$ is negatively correlated with
$x_{t-1}$.
I A high $\varepsilon_t$ means a high $x_t$ and high $x_{t+1}, \ldots, x_{t+T}$, which raises the sample mean.
Relative to that mean, $x_1, \ldots, x_{t-1}$ are low in-sample: namely $cov(\varepsilon_t, x_{t-1}) < 0$.
I Thus, standard errors and tests (p-values) are also wrong.
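The finite-sample bias of the autoregressive coefficient can be illustrated with a small Monte Carlo. The sketch below is only an illustration under stated assumptions: Gaussian shocks, a series started at zero, ρ = 0.94, 60 observations and 2000 replications are all choices made here, not values from Stambaugh (1999). A commonly quoted approximation for the bias, due to Kendall, is -(1 + 3ρ)/T.

```python
import random


def ar1_slope_estimate(x):
    """OLS slope of x_t on x_{t-1}, with an intercept."""
    lag, cur = x[:-1], x[1:]
    n = len(lag)
    m_lag, m_cur = sum(lag) / n, sum(cur) / n
    sxx = sum((a - m_lag) ** 2 for a in lag)
    sxy = sum((a - m_lag) * (b - m_cur) for a, b in zip(lag, cur))
    return sxy / sxx


def mean_rho_hat(rho, T, reps, seed=0):
    """Average OLS estimate of rho across simulated AR(1) samples of length T."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = [0.0]  # assumption: start at the unconditional mean
        for _ in range(T):
            x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
        total += ar1_slope_estimate(x)
    return total / reps


rho, T = 0.94, 60
avg = mean_rho_hat(rho, T, reps=2000)
print(f"true rho = {rho}, average estimate = {avg:.3f}")
print(f"simulated bias = {avg - rho:.3f}, Kendall approximation = {-(1 + 3 * rho) / T:.3f}")
```

The average estimate falls below the true ρ, which is the negative in-sample covariance between the shock and the lagged regressor at work.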
Bias with lagged correlated and persistent explanatory
variables
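$$R_t = \alpha + \beta x_{t-1} + u_t$$
I Assumptions:
$$cov(\varepsilon_t, u_t) < 0, \qquad 0 < \rho < 1$$
I Intuition as in the AR(1) case: a high $u_t$ means a low $\varepsilon_t$, low
$x_t, x_{t+1}, \ldots, x_{t+T}$, and high in-sample $x_1, \ldots, x_{t-1}$. Thus, in sample,
$cov(u_t, x_{t-1}) > 0$.
I High $x_{t-1}$ corresponds to high $R_t$, so the estimate of β is biased upward.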
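The same experiment can be run for the predictive regression itself. In the sketch below the true β is zero, the predictor is persistent, and the return shock is negatively correlated with the predictor shock. The correlation of -0.9, the sample length and the number of replications are illustrative assumptions, not Stambaugh's calibration.

```python
import random


def ols_slope(x, y):
    """OLS slope of y on x, with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def mean_beta_hat(T, rho, corr, reps, seed=1):
    """Average estimate of beta when the true beta is 0 and corr(eps_t, u_t) = corr."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x_prev = 0.0
        lagged_x, returns = [], []
        for _ in range(T):
            eps = rng.gauss(0.0, 1.0)  # shock to the predictor
            noise = rng.gauss(0.0, 1.0)
            u = corr * eps + (1.0 - corr ** 2) ** 0.5 * noise  # return shock
            lagged_x.append(x_prev)    # x_{t-1}
            returns.append(u)          # R_t = 0 * x_{t-1} + u_t
            x_prev = rho * x_prev + eps
        total += ols_slope(lagged_x, returns)
    return total / reps


avg_beta = mean_beta_hat(T=60, rho=0.94, corr=-0.9, reps=2000)
print(f"true beta = 0, average estimate = {avg_beta:.3f}")
```

Although returns are unpredictable by construction, the average slope estimate is positive, so an uncorrected regression would tend to find spurious predictability.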
Stambaugh (1999)
In this paper, Stambaugh calibrates the model using historical data and then
computes the bias. He then asks the question: "Is the predictability that
people find simply an artifact of a statistical bias in the coefficient β?"