Marco Gallegati · Willi Semmler
Editors

Wavelet Applications in Economics and Finance

Dynamic Modeling and Econometrics in Economics and Finance, Volume 20
Series Editors
Stefan Mittnik
University of Munich
Munich, Germany
Willi Semmler
Bielefeld University
Bielefeld, Germany
and
New School for Social Research
New York, USA
Editors

Marco Gallegati
Faculty of Economics “G. Fuà”
Polytechnic University of Marche
Ancona, Italy

Willi Semmler
New School for Social Research
The New School University
New York, USA
Mater semper certa est, pater numquam (“The mother is always certain, the father is always uncertain”) is a Roman-law principle which has the force of praesumptio iuris et de iure. This is certainly true in biology, but not for wavelets in economics, which have a true father: James Ramsey.
The most useful property of wavelets is their ability to decompose a signal into its time scale components. Economics, like many other complex systems, includes variables that interact simultaneously on different time scales, so that relationships between variables can occur at different horizons. Hence, for example, we can find a stable relationship between durable consumption and income. And the literature is soaring: from the money–income relationship to the Phillips curve, from financial market fluctuations to forecasting. But this feature threatens to undermine the very foundations of the Walrasian construction. If variables move differently at different time scales (stock market prices in nanoseconds, wages in weeks, and investments in months), then even a linear system can produce chaotic effects and market self-regulation is lost. If validated, wavelet research becomes a silver bullet.
James is also an excellent sailor (in 2003 he sailed across the Atlantic to take his boat from North America to Turkey), and his boat braves the currents with “nonchalance”. By the way, if you are able to manage wavelets, you are also ready for waves.
Preface
James Bernard Ramsey received his B.A. in Mathematics and Economics from the University of British Columbia in 1963, and his M.A. and Ph.D. in Economics from the University of Wisconsin, Madison, in 1968, with the thesis “Tests for Specification Errors in Classical Linear Least Squares Regression Analysis”. After serving as Assistant and Associate Professor in the Department of Economics of Michigan State University, he became Professor and Chair of Economics and Social Statistics at the University of Birmingham, England, from 1971 to 1973. He returned to the US as Full Professor at Michigan State University until 1976, and finally moved to New York University as Professor of Economics, serving as Chair of the Economics Department between 1978 and 1987; he remained at NYU for 37 years until his retirement in 2013. A Fellow of the American Statistical Association, Visiting Fellow at the School of Mathematics (Institute for Advanced Study) at Princeton in 1992–1993, and ex-president of the Society for Nonlinear Dynamics and Econometrics, James Ramsey was also a jury member of the Econometric Game 2009. He has published 7 books and more than 60 articles on nonlinear dynamics, stochastic processes, time series, and wavelet analysis, with special emphasis on the analysis of economic and financial data.
This book intends to honor James B. Ramsey and his contribution to economics on the occasion of his recent retirement from academic activities at the NYU Department of Economics. This festschrift, as it is called in the German tradition, intends to honor an exceptional scholar whose fundamental contributions have influenced a wide range of disciplines, from statistics to econometrics and economics, and whose lifelong ideas have inspired more than a generation of researchers and students.

He is widely acclaimed for his pioneering work in the early part of his career on the general specification test for the linear regression model, Ramsey’s RESET test, which is now part of any econometric software package. He is also well known for his contributions to the theory and empirics of chaotic and nonlinear dynamical systems. A significant part of his work has also been devoted to the development of genuinely new ways of processing data, for instance the application of functional data analysis or the use of wavelets for nonparametric analysis.
Each year the Society for Nonlinear Dynamics and Econometrics, at its Annual Conference, awards two James Ramsey prizes for top graduate papers in econometrics. This year there will also be a set of special sessions dedicated to his research. One of these sessions will be devoted to wavelet analysis, an area where James’s work has had an outstanding impact over the last twenty years. James Ramsey and his coauthors provided early applications of wavelets in economics and finance by making use of the discrete wavelet transform (DWT) to decompose economic and financial data. These works paved the way for the application of wavelet analysis in empirical economics. The articles in this book are contributions by colleagues, former students, and researchers covering a wide range of wavelet applications in economics and finance, linked to or inspired by the work of James Ramsey.
We have been working with James continuously over the last 10 years and have always been impressed by his competence, motivation, and enthusiasm. Our collaboration with James was extraordinarily productive and an inspiration to all of us. Working together we developed a true friendship, strengthened by the pleasant meetings held periodically at James’s office on the 7th floor of the NYU Department of Economics, which became an important space for discussing ongoing as well as new and exciting research projects. As one of his students recently wrote, rating James’s Statistics class: “He is too smart to be teaching!” Sometimes our impression was that he could also have been too smart for us as coauthors. This book is a way to thank him for the privilege we have had to meet and work with him.
Introduction

Although widely used in many other disciplines like geophysics, engineering (sub-band coding), physics (renormalization groups), mathematics (Calderón–Zygmund operators), signal analysis, and statistics (time series and threshold analysis), wavelets still remain largely unfamiliar to students of economics and finance. Nonetheless, in the past decade considerable progress has been made, especially in finance, and one might say that wavelets are the “wave of the future”. The early empirical results show that separation by time scale decomposition can be of great benefit for a deeper understanding of economic relationships that operate simultaneously at several time scales. The “short and the long run” can now be formally explored and studied.
The existence of time scales, or “planning horizons”, is an essential aspect
of economic analysis. Consider, for example, traders operating in the market for
securities: some, the fundamentalists, may have a very long view and trade looking
at market fundamentals and concentrate their attention on “long run variables” and
average over short run fluctuations. Others, the chartists, may operate with a time
horizon of only weeks, days, or even hours. What fundamentalists deem to be
variables, the chartists deem constants. Another example is the distinction between short run adaptations to changes in market conditions, e.g., merely altering the length of the working day, and long run changes in which the firm makes strategic decisions and installs new equipment or introduces new technology.
A corollary of this assumption is that different planning horizons are likely to affect the structure of the relationships themselves, so that they might vary over different time horizons or hold at certain time scales, but not at others. Economic relationships might also be negative over some time horizons, but positive over others. Separating out these different time scales of variation in the data may be expected to match the economic relationships more precisely than a single time scale using aggregated data. Hence, a more realistic approach is to separate out the different time scales of variation in the data and analyze the relationships among variables at each scale level, not at the aggregate level. Although the concepts of the “short-run” and of the “long-run” are central for modeling economic and financial decisions, variations in those relationships across time scales are seldom discussed or empirically studied in economics and finance.
The theoretical analysis of time, or “space”, series split early on into the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT). The latter is often the more useful for application to regular time series analysis with observations at discrete intervals. Wavelets provide a multi-resolution decomposition of the data and can produce a synthesis of the economic relationships that is parameter preserving. The output of wavelet transforms enables one to decompose the data in ways that potentially reveal relationships that are not visible using standard methods on “scale aggregated” data. Given their ability to isolate the bounds on the frequency content of a process as a function of time, a great advantage of these transforms is that they need rely only on the local stationarity induced by the system, although Gabor transforms provide a similar service for Fourier series and integrals.
The key lesson from synthesizing the wavelet transforms is that they facilitate and develop theoretical insight into the interdependence of economic and financial variables. New tools are most likely to generate new ways of looking at the data and new insights into the operation of the finance–real economy interaction.
The 11 articles collected in this volume, all strictly refereed, represent original
up-to-date research papers that reflect some of the latest developments in the area of
wavelet applications for economics and finance.
In the first chapter James provides a personal retrospective of a decade’s research that highlights the links between CWT and DWT wavelets and the more classical Fourier transforms and series. After stressing the importance of analyzing the various basis spaces, the exposition evaluates the alternative bases available to wavelet researchers and stresses the comparative advantage of wavelets relative to the alternatives considered. The appropriate choice of class of function, e.g., Haar, Morlet, Daubechies, etc., with rescaling and translation, provides appropriate bases in the synthesis to yield parsimonious approximations to the original time or space series.
The remaining papers examine a wide variety of applications in economics and finance that reveal more complex relationships in economic and financial time series and help to shed light on various long-standing puzzles in the literature: on business cycles, traded assets, foreign exchange rates, credit markets, forecasting, and labor market research. Take, for example, the latter. Most economists agree that productivity increases welfare, but whether productivity also increases employment is still controversial. As economists have shown using data from the EU and the USA, productivity may rise, but employment may be de-linked from productivity increases. Recent work has shown that the relationship between productivity and employment can only properly be analyzed after decomposition by time scale: the variation in the short run is considerably different from the variation in the long run. In the chapter “Does Productivity Affect Unemployment? A Time-Frequency Analysis for the US”, Marco Gallegati, Mauro Gallegati, James B. Ramsey, and Willi Semmler, applying parametric and
“Short and Long Term Growth Effects of Financial Crises” and “Measuring Risk Aversion Across Countries from the Consumption-CAPM: A Spectral Approach” are two articles using the spectral approach. F.N.G. Andersson and P. Karpestam investigate to what extent financial crises can explain low growth rates in developing countries. Distinguishing between different sources of crises and separating the short- and long-term growth effects of financial crises, they show that financial crises have reduced growth and that policy decisions have worsened and prolonged their effects. In their paper, E. Panopoulou and S. Kalyvitis adopt a spectral approach to estimate the values of risk aversion over the frequency domain. Their findings suggest that at lower frequencies risk aversion falls substantially across countries, thus yielding in many cases reasonable values of the implied coefficient of risk aversion.
Contents

Part I Macroeconomics

Contributors
Evžen Kočenda CERGE-EI, Charles University and the Czech Academy of Sciences, Prague, Czech Republic
Ekaterini Panopoulou Department of Statistics and Insurance Science, University
of Piraeus, Athens, Greece and University of Kent, UK
James B. Ramsey Department of Economics, New York University, New York,
NY, USA
Teresa Maria Rodrigues Economics Department, University of Minho, Braga,
Portugal
Willi Semmler Department of Economics, New School for Social Research,
New York, NY, USA
Maria Joana Soares NIPE and Department of Mathematics and Applications,
University of Minho, Braga, Portugal
Lukáš Vácha Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Prague, Czech Republic
Brandon Whitcher Pfizer Worldwide Research & Development, Cambridge, MA,
USA
Functional Representation, Approximation,
Bases and Wavelets
James B. Ramsey
Abstract After stressing the importance of analyzing the various basis spaces, the
exposition evaluates the alternative bases available to wavelet researchers. The next
step is to demonstrate the impact of choice of basis for the representation or
projection of the regressand. The similarity of formulating a basis is explored across
a variety of alternative representations. This development is followed by a very
brief overview of some articles using wavelet tools. The comparative advantage of
wavelets relative to the alternatives considered is stressed.
1 Introduction
The paper begins with a review of the main features of wavelet analysis which
are contrasted with other analytical procedures, mainly Fourier, splines, and linear
regression analysis. A review of Crowley (2007), Percival and Walden (2000), Bruce
and Gao (1996), the excellent review by Gençay et al. (2002), or the Palgrave
entry for Wavelets by Ramsey (2010) before proceeding would be beneficial to the
neophyte wavelet researcher.
The next section contains a non-rigorous development of the theory of wavelets
and contains discussions of wavelet theory in contrast to the theory of Fourier series
and splines. The third section discusses succinctly the practical use of wavelets and
compares alternative bases; the last section concludes.
Before proceeding, the reader should note that all the approximating systems are characterized by the functions that provide the basis vectors, e.g. $\sin(k\omega t)$ and $\cos(k\omega t)$ for Fourier series, $t^k$ for the monomials, $e^{\lambda_i t}$ for the exponentials, etc. These functions are highly differentiable, but are not suitable for analyzing signals with discrete changes and discontinuities.
The basis functions for splines are polynomials, which are also differentiable and are defined over a grid determined by the knots; various choices for the differentiability at the knots determine the flexibility and smoothness of the spline approximation and the degree of curvature between knots.
Obviously, the analysis of any signal involves choosing both the approximating function and the appropriate basis vectors generated from the chosen function. The concepts of “projection” and “representation” of a function are distinguished. For the former, one considers the optimal manner in which an N-dimensional basis space can be projected onto a K-dimensional subspace; for a given level of approximation one seeks the smallest K for the transformed basis. Alternatively, a given function can be represented by a series expansion, which implies that one is assuming that the function lies in a space defined in turn by a given class of functions, usually defined to be a Hilbert space.
Consider, for example, the linear regression model and a permutation of its observations:

$$Y = X\beta + u$$

$$Y_{\mathrm{perm}} = X_{\mathrm{perm}}\,\beta + u_{\mathrm{perm}}$$
One classical approximating representation is the Taylor expansion:

$$y = f(x \mid \theta) = f(a_1 \mid \theta) + f^{(1)}(a_1 \mid \theta)(x - a_1) + \frac{f^{(2)}(a_1 \mid \theta)(x - a_1)^2}{2!} + \frac{f^{(3)}(a_1 \mid \theta)(x - a_1)^3}{3!} + \frac{R(\xi)}{4!} \qquad (1)$$
for some value ξ. This equation approximately represents the variation of y in terms of powers of x. Care must be taken in that the derived relationship is not exact, as the required value for ξ in the remainder term will vary for different values of $a_1$, x, and the highest derivative used in the expansion. Under the assumption that R(ξ) is approximately zero, the parameters, given the coefficient $a_1$, can be estimated by least squares using N independent drawings on the regressand’s error term. Assuming the regressors are observed error-free, one has:
$$\min_{\theta}\left\{ \sum_{i=1}^{N} \left( y_i - f(x_i \mid \theta) \right)^2 \right\} \qquad (2)$$
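To make the least squares projection in Eq. (2) concrete, here is a minimal numerical sketch; the data, the degree of the polynomial, and all names are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

# Illustrative data (assumed): noisy observations of a smooth signal
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 200)
y = np.exp(0.5 * x) + rng.normal(scale=0.05, size=x.size)

# Least-squares fit of a cubic, i.e. a truncated Taylor-style expansion:
# min_theta sum_i (y_i - f(x_i | theta))^2 with f a degree-3 polynomial
coeffs = np.polynomial.polynomial.polyfit(x, y, deg=3)
yhat = np.polynomial.polynomial.polyval(x, coeffs)

print("fitted coefficients:", np.round(coeffs, 3))
print("residual sum of squares:", np.sum((y - yhat) ** 2))
```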
Consider next the basis space of the monomials:

$$\{1,\; t^1,\; t^2,\; t^3,\; t^4,\; t^5,\; \ldots\} \qquad (5)$$

that is, we consider the projection of a vector y on the space spanned by the monomials, $t^0, t^1, \ldots, t^k$, or, as became popular as a calculation-saving device, one considers the projection of y on the orthogonal components of the sequence in Eq. (5); see Kendall and Stuart (1961).
These first two procedures indicate that the underlying concept was that insight would be gained if the projections yielded approximations that could be specified in terms of very few estimated coefficients. Further, very little structure was imposed on the model, either in terms of the statistical properties of the model or in terms of the restrictions implied by the underlying theory.
Two other simple basis spaces are the exponentials

$$\{e^{\lambda_1 t},\; e^{\lambda_2 t},\; e^{\lambda_3 t},\; \ldots,\; e^{\lambda_k t}\} \qquad (6)$$

and the power functions

$$\{t^{\lambda_1},\; t^{\lambda_2},\; t^{\lambda_3},\; \ldots,\; t^{\lambda_k}\}. \qquad (7)$$
The B-spline basis provides the approximation

$$S_B(t) = \sum_{k=1}^{m+L-1} c_k\, B_k(t; \tau) \qquad (8)$$

where $S_B(t)$ is the spline approximation, the $c_k$ are the coefficients of the projection, and $B_k(t; \tau)$ is the B-spline function at position k with knot structure τ. The vector τ designates the number of knots, L, and their positions, which define the subintervals that are modeled in terms of polynomials of degree m. At each knot the polynomials are constrained to be equal in value for polynomials of degree 1, to agree in the first derivative for polynomials of degree 2, etc. Consequently, adjacent spline polynomials line up smoothly.
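As a hedged illustration of the basis in Eq. (8), the following sketch uses SciPy’s B-spline routines; the signal, the smoothing level, and the degree are assumptions made for demonstration only.

```python
import numpy as np
from scipy.interpolate import splrep, BSpline

# Illustrative signal with local structure (assumed, not from the text)
x = np.linspace(0.0, 10.0, 100)
y = np.sin(x) + 0.3 * np.sin(5 * x)

# splrep chooses a knot vector tau and coefficients c_k for a degree-m
# B-spline; s trades off the number of knots against fidelity to the data
t, c, m = splrep(x, y, k=3, s=0.1)
spline = BSpline(t, c, m)

print("number of knots:", len(t))
print("max abs fit error:", np.max(np.abs(spline(x) - y)))
```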
B-splines are one of the most flexible basis systems, so that they can easily fit locally complex functions. An important use of splines is to interpolate over the grid created by the knots in order to generate a differentiable function or, more generally, a differentiable surface. Smoothing is a local phenomenon.
The next procedure in terms of longevity of use is Fourier analysis. The basis for the space spanned by the Fourier coefficients is given by the sinusoids

$$\phi_k(t) \in \{\sin(k\omega t),\; \cos(k\omega t)\}, \quad k = 1, 2, \ldots \qquad (9)$$

where ω is the fundamental frequency. The approximating sequences are given most simply by:

$$y = f(t) \approx \sum_{k=1}^{K} c_k \phi_k \qquad (11)$$

where the sequence $c_k$ specifies the coefficients chosen to minimize the squared errors between the observed sequence and the known functions shown in Eq. (11), $\phi_k$ is the basis function as used in Eq. (9), and the coefficients are given by

$$c_k = \int f(t)\, \phi_k(t)\, dt \qquad (12)$$
The implied relationships between the basis function, φ, the basis space given by $\phi_k$, $k = 1, 2, 3, \ldots$, and the representation of the function f(t) are given in abstract form in Eq. (12), in order to emphasize the similarities between the various basis spaces.
We note two important aspects of this equation. We gain in understanding if the coefficients are few in number, i.e. if K is “small”. We gain if the function f is restricted to functions of a class that can be described in terms of the superposition of the basis functions, e.g. trigonometric functions and their derivatives for Fourier analysis. The fit for functions that are continuous, but not everywhere differentiable, can only be approximated using many basis functions. The equations generating the basis functions, $\phi_k$, based on the fundamental frequency, ω, are re-scaled versions of that fundamental frequency. The concept of re-scaling a “fundamental” function to provide a basis will occur in many guises.
Fourier series are useful in fitting global variation, but respond to local variation only at very high frequencies, thereby substantially increasing the number of Fourier coefficients required to achieve a given level of approximation. For example, consider fitting a Fourier basis to a “box function”: any reasonable degree of fit will require very many terms at high frequency at the points of discontinuity (see Bloomfield 1976; Korner 1988).
Economy of coefficients can be obtained for local fitting by using windows; that is, instead of

$$\hat{h}(\omega) = \frac{1}{2\pi} \sum_{s=-\infty}^{\infty} \hat{R}(s) \cos(s\omega)$$

where $\hat{R}(s)$ is the sample covariance at lag s, we consider

$$\hat{h}(\omega) = \frac{1}{2\pi} \sum_{s=-M}^{M} \lambda(s)\, \hat{R}(s) \cos(s\omega) \qquad (13)$$
$$S_n(f, t) = \sum_{r=-n}^{n} \hat{f}(r)\, e^{irt} \to f(t) \qquad (15)$$

as $n \to \infty$.
As pointed out by Korner (1988), the difficulty is due to the confusion between “the limit of the graphs and the graph of the limit of the sum”. This insight was presented by Gibbs and illustrated practically by Michelson; that is, $S_n(h, t) \to h(t)$ pointwise: the blips move towards the discontinuity, but pointwise convergence of $f_n$ to f does not imply that the graph of $f_n$ starts to look like the graph of f for large n, as shown in (16). The important point to remember is that the difference is bounded from below in this instance by:
$$\frac{2}{\pi} \int_0^{\pi} \frac{\sin x}{x}\, dx > 1.17 \qquad (17)$$
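A small numerical sketch (illustrative only, not from the text) confirms both the bound in Eq. (17) and the persistence of the overshoot in the partial sums of a square wave.

```python
import numpy as np
from scipy.integrate import quad

# Numerical check of the bound in Eq. (17): (2/pi) * int_0^pi sin(x)/x dx
val, _ = quad(lambda x: np.sin(x) / x, 0.0, np.pi)
print("Gibbs constant:", round(2.0 * val / np.pi, 4))   # about 1.1790 > 1.17

# Partial Fourier sums S_n of a unit square wave: the overshoot near the
# jump at 0 does not shrink as n grows, so sup-norm convergence fails
t = np.linspace(1e-6, 0.5, 8001)
for n in (11, 101, 1001):
    k = np.arange(1, n + 1, 2)                           # odd harmonics
    s_n = (4.0 / np.pi) * (np.sin(np.outer(t, k)) / k).sum(axis=1)
    print(f"n = {n:4d}: max S_n = {s_n.max():.4f}")      # stays near 1.18
```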
The main lesson here for the econometrician is that observed data may well contain apparently continuous functions that are not only sampled at discrete intervals, but that may in fact contain significant discontinuities. Indeed, one may well face the problem of estimating a continuous function that is nowhere differentiable, the so-called “Weierstrass functions” (see, for example, Korner 1988).
It is useful to note that, whether we are examining wavelets (to be defined below), sinusoids, or Gabor functions, we are in fact approximating f(t) by “atoms”.¹ We seek to obtain the best M atoms for a given f(t) out of a dictionary of P atoms. There are three standard methods for choosing the M atoms in this oversampled situation. The first is “matching pursuit”, in which the M atoms are chosen one at a time; this procedure is referred to as greedy and sub-optimal (see Bruce and Gao 1996). An alternative method is the best basis algorithm, which begins with a dictionary of bases. The third method, which will be discussed in the next section, is known as basis pursuit, where the dictionary is still overcomplete. The synthesis of f(t) in terms of the $\phi_i(t)$ is under-determined.

¹A collection of atoms is a “dictionary”.
This brief discussion indicates that the essential objective is to choose a good basis. A good basis depends upon the resolution of two characteristics: linear independence and completeness. Independence ensures uniqueness of representation, and completeness ensures that any f(t) within a given class of functions can be represented in terms of the basis vectors. Adding vectors will destroy independence; removing vectors will destroy completeness. Every vector v, or function v(t), can be represented uniquely as:

$$v = \sum_i b_i v_i \quad\text{or}\quad v(t) = \sum_i b_i v_i(t) \qquad (18)$$

This is the defining property of a Riesz basis (see, for example, Strang and Nguyen 1996).
If $0 < A \le B < \infty$ and Eq. (19),

$$A\,\|v\|^2 \le \sum_j \left|\langle v, e_j \rangle\right|^2 \le B\,\|v\|^2, \qquad (19)$$

holds, and the basis generating functions are defined within a Hilbert space, then we have defined a frame, and A, B are the frame bounds. If A equals B the bounds are said to be tight; if further the bounds are unity, i.e. $A = B = 1$, one has an orthonormal basis for the transformation. For example, consider a frame within a two-dimensional Hilbert space given by $e_1 = (0, 1)$, $e_2 = \left(-\tfrac{\sqrt{3}}{2}, -\tfrac{1}{2}\right)$, $e_3 = \left(\tfrac{\sqrt{3}}{2}, -\tfrac{1}{2}\right)$. For any v in the Hilbert space we have:

$$\sum_{j=1}^{3} \left|\langle v, e_j \rangle\right|^2 = \frac{3}{2}\,\|v\|^2 \qquad (20)$$

where the redundancy ratio is 3/2, i.e. three vectors in a two-dimensional space (Daubechies 1992).
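The tight-frame identity in Eq. (20) is easy to verify numerically; the following sketch (the test vector v is an arbitrary assumption) checks it for the three-vector frame above.

```python
import numpy as np

# The three unit vectors of the tight frame in Eq. (20)
e = np.array([[0.0, 1.0],
              [-np.sqrt(3.0) / 2.0, -0.5],
              [ np.sqrt(3.0) / 2.0, -0.5]])

rng = np.random.default_rng(1)
v = rng.normal(size=2)                 # an arbitrary vector in the plane

lhs = np.sum((e @ v) ** 2)             # sum_j |<v, e_j>|^2
rhs = 1.5 * np.dot(v, v)               # (3/2) ||v||^2
print(lhs, rhs)                        # equal: tight frame with A = B = 3/2
```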
Much of the usefulness of wavelet analysis has to do with its flexibility in handling a variety of nonstationary signals. Indeed, as wavelets are constructed over finite intervals of time and are not necessarily homogeneous over time, they are localized in time and scale. The projection of the analyzable signal onto the wavelet basis can be written as:

$$w = Wx \qquad (21)$$

where x is the analyzable signal and W is the matrix of wavelet basis vectors. While theoretically Eq. (21) is a very useful relationship which clarifies the link between wavelet coefficients and the original data, it is decidedly not useful in reducing the complexity of the relationships and does not provide a suitable mechanism for evaluating the coefficients (Bruce and Gao 1996).
The experienced “Waveletor” also knows to consider the shape of the basis generating function and its properties at zero scale; this concern is an often missed aspect of wavelet analysis. Wavelet analysis, unlike Fourier analysis, can consider a wide array of generating functions. For example, if the function being examined
a wide array of generating functions. For example, if the function being examined
is a linearly weighted sum of Gaussian functions, or of the second derivatives of
Gaussian functions, then efficient results will be obtained by choosing the Gaussian
function, or the second derivative of the Gaussian function in the latter case. This is
a relatively underutilized aspect of wavelet analysis, which will be discussed more fully later.
Further, any moderately experienced “Waveletor” knows to choose his wavelet generating function so as to maximize the “number of zero moments”, to ascertain the number of continuous derivatives (as a measure of smoothness), and to worry about the symmetry of the underlying filters, although one may consider models for which asymmetry in the wavelet generating function is appropriate. While many times the choice of wavelet generating function makes little or no difference, there are times when such considerations are important for the analysis at hand. Examples include the inappropriate use of the Haar function for resolving continuous smooth functions, or the use of smooth functions to represent samples of discontinuous paths. Wavelets provide a vast array of alternative wavelet generating functions, e.g. Gaussian, Gaussian first derivative, Mexican hat, the Daubechies series, the Mallat series, and so on. The key to the importance of the differences lies in choosing the appropriate degree and nature of the oscillation within the supports of the wavelet function. With the Gaussian and its first and second derivatives as exceptions, the generating functions are usually derived by applying a pair of filters to subsampled data (Percival and Walden 2000).
I have previously stated that at each scale the essential operation is one of differencing using weighted sums; the alternative rescalable wavelet functions provide an appropriate basis for such differences. Compare, for example:

$$\text{Haar:}\quad (h_0, h_1) = \left( \frac{1}{\sqrt{2}},\; -\frac{1}{\sqrt{2}} \right) \qquad (22)$$

$$\text{Daubechies (D4):}\quad (h_0, h_1, h_2, h_3) = \left( \frac{1-\sqrt{3}}{4\sqrt{2}},\; \frac{-3+\sqrt{3}}{4\sqrt{2}},\; \frac{3+\sqrt{3}}{4\sqrt{2}},\; \frac{-1-\sqrt{3}}{4\sqrt{2}} \right)$$
The Haar transform is of width two; the Daubechies (D4) is of width four. The Haar wavelet generates a sequence of paired differences at varying scales $2^j$. In comparison, the Daubechies transform provides a “nonlinear differencing” over sets of four scaled elements, at scales $2^j$.
Alternatively, wavelets can be generated by the conjunction of high and low pass filters, termed “filter banks” by Strang and Nguyen (1996), to produce pairs of functions Ψ(t), Φ(t) that with rescaling yield a basis for the analysis of a function f(t). Unlike the Fourier transform, which uses the sum of certain basis functions (sines and cosines) to represent a given function and may be seen as a decomposition on a frequency-by-frequency basis, the wavelet transform utilizes some elementary functions (the father wavelet Φ and the mother wavelet Ψ) that, being well-localized in both time and scale, provide a decomposition on a “scale-by-scale” basis as well as on a frequency basis. The inner product of Φ with f is essentially a low pass filter that produces a moving average; indeed, we recognize the filter as a linear time-invariant operator. The corresponding wavelet filter is a high pass filter that produces moving differences (Strang and Nguyen 1996). Separately, the low pass and high pass filters are not invertible, but together they separate the signal into frequency bands, or octaves. Corresponding to the low pass filter there is a continuous-time scaling function φ(t); corresponding to the high pass filter there is a wavelet ψ(t).
Any set of filters that satisfies the following conditions,

$$\sum_{l=0}^{L-1} h_l = 0 \qquad (23)$$

$$\sum_{l=0}^{L-1} h_l^2 = 1 \qquad (24)$$

$$\sum_{l=0}^{L-1} h_l\, h_{l+2n} = 0, \quad n \ne 0, \qquad (25)$$

defines a wavelet function, and these conditions are both necessary and sufficient for the analysis of a function f. However, this requirement is insufficient for defining the synthesis of a function f. To achieve synthesis, one must add the admissibility constraint that:

$$C_\psi = \int_0^{\infty} \frac{|\hat{\psi}(\omega)|^2}{\omega}\, d\omega < \infty \qquad (26)$$
The wavelet generating function is then able to represent the signal locally in both frequency and time. Fourier analysis can accommodate local non-stationarity by windowing the time series, as indicated above; the problem is that the efficacy of this approach depends critically on making the right choice of window and, more importantly, on presuming its constancy over time.
Any pair of linear filters that meets the following criteria can represent a wavelet transformation (Percival and Walden 2000). Equation (23) gives the necessary conditions for an operator to be a wavelet: $h_l$ denotes the high pass filter, and the corresponding low pass filter is given by the quadrature mirror relationship:

$$g_l = (-1)^{l+1}\, h_{L-1-l} \qquad (27)$$

Equation (27) indicates that the filter bank depends on both the low pass and high pass filters. Recall the high pass filter for the Daubechies D(4) in Eq. (22); the corresponding low pass filter is:

$$(g_0, g_1, g_2, g_3) = \left( \frac{1+\sqrt{3}}{4\sqrt{2}},\; \frac{3+\sqrt{3}}{4\sqrt{2}},\; \frac{3-\sqrt{3}}{4\sqrt{2}},\; \frac{1-\sqrt{3}}{4\sqrt{2}} \right) \qquad (28)$$
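As a hedged numerical check of conditions (23)–(25), the sketch below inspects the D(4) high pass filter as shipped by the PyWavelets library, where D(4) is named 'db2'; the use of that library (and its sign and ordering conventions) is an assumption of this illustration.

```python
import numpy as np
import pywt  # PyWavelets; Daubechies D(4) is named 'db2' there

h = np.array(pywt.Wavelet("db2").dec_hi)   # high pass (wavelet) analysis filter

print("filter h        :", np.round(h, 6))
print("sum h_l         =", round(h.sum(), 12))                  # Eq. (23): 0
print("sum h_l^2       =", round((h ** 2).sum(), 12))           # Eq. (24): 1
print("sum h_l h_{l+2} =", round((h[:-2] * h[2:]).sum(), 12))   # Eq. (25): 0
```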
For wavelet analysis, however, as we have observed, there are two basic wavelet functions, the father and mother wavelets, φ(t) and ψ(t). The former integrates to 1 and reconstructs the smooth part of the signal (low frequency), while the latter integrates to 0 and can capture all deviations from the trend. The mother wavelets, as said above, play a role similar to sines and cosines in the Fourier decomposition. They are compressed or dilated, in the time domain, to generate cycles that fit the actual data. The approximating wavelet functions $\phi_{J,k}(t)$ and $\psi_{j,k}(t)$ are generated from the father and mother wavelets through scaling and translation as follows:

$$\phi_{J,k}(t) = 2^{-J/2}\,\phi\!\left(\frac{t - 2^{J}k}{2^{J}}\right) \qquad (29)$$

and

$$\psi_{j,k}(t) = 2^{-j/2}\,\psi\!\left(\frac{t - 2^{j}k}{2^{j}}\right) \qquad (30)$$

where j indexes the scale, so that $2^j$ is a measure of the scale, or width, of the functions (scale or dilation factor), and k indexes the translation, so that $2^j k$ is the translation parameter.
Given a signal f(t), the wavelet series coefficients, representing the projections of the time series onto the basis generated by the chosen family of wavelets, are given by the following integrals:

$$d_{j,k} = \int \psi_{j,k}(t)\, f(t)\, dt, \qquad s_{J,k} = \int \phi_{J,k}(t)\, f(t)\, dt \qquad (31)$$
where $j = 1, 2, \ldots, J$, with J the number of scales, and the coefficients $d_{j,k}$ and $s_{J,k}$ are the wavelet transform coefficients representing, respectively, the projections onto the mother and father wavelets. In particular, the detail coefficients $d_{J,k}, \ldots, d_{2,k}, d_{1,k}$ represent progressively finer scale deviations from the smooth behavior (thus capturing the higher frequency oscillations), while the smooth coefficients $s_{J,k}$ correspond to the smooth behavior of the data at the coarse scale $2^J$ (thus capturing the low frequency oscillations).
Finally, given these wavelet coefficients, from the functions

$$S_J(t) = \sum_k s_{J,k}\, \phi_{J,k}(t) \quad\text{and}\quad D_j(t) = \sum_k d_{j,k}\, \psi_{j,k}(t) \qquad (32)$$

we may obtain what are called the smooth signal, $S_J$, and the detail signals, $D_j$, respectively. The sequence of terms $S_J, D_J, \ldots, D_j, \ldots, D_1$, for $j = 1, 2, \ldots, J$, represents a set of signal components that provide representations of the original signal f(t) at different scales and at an increasingly finer resolution level.
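A minimal sketch of the decomposition in Eqs. (31)–(32) using PyWavelets; the series, the wavelet 'db4', and the depth J = 4 are illustrative assumptions. Each component is rebuilt by zeroing all other coefficients before inverting the transform; because the inverse transform is linear, the smooth and detail series add back to the original signal.

```python
import numpy as np
import pywt

# Illustrative series (assumed data): trend + two oscillations + noise
rng = np.random.default_rng(2)
t = np.arange(512)
x = (0.01 * t + np.sin(2 * np.pi * t / 64)
     + 0.5 * np.sin(2 * np.pi * t / 8) + 0.2 * rng.normal(size=t.size))

J = 4
coeffs = pywt.wavedec(x, "db4", level=J)        # [s_J, d_J, ..., d_1]

def component(k):
    """Rebuild one term of Eq. (32) by zeroing all other coefficients."""
    z = [np.zeros_like(c) for c in coeffs]
    z[k] = coeffs[k]
    return pywt.waverec(z, "db4")[: x.size]

S_J = component(0)                              # smooth signal
D = [component(k) for k in range(1, J + 1)]     # details D_J, ..., D_1

# The smooth plus the details add back to the original series
print("max |x - (S_J + sum D_j)| =", np.max(np.abs(x - (S_J + sum(D)))))
```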
It is very useful to view the use of wavelets in “regression analysis” in greater generality than as a simple exercise in “least squares fitting”. As indicated above, the use of wavelets involves the properties of the implicit filters used in the construction of the wavelet function. Such an approach to determining the properties of wavelet analysis provides for a structured, but highly flexible, system that is characterized by a sparse transformation matrix; that is, most coefficients in the transformed space are zero. Indeed, the source of the benefit from creating a spanning set of basis vectors, both for Fourier analysis and for wavelets, is the reduction in degrees of freedom from N, in the given Euclidean space, to K in the transformed space, where K is very much smaller than N; simple linear regression models illustrate the same situation and perform a similar transformation.
The argument so far has compared wavelets to splines and to Fourier series or integrals. A discussion of the differences is required. Splines are easily dealt with, in that the approximation implied by the spline procedure is to interpolate smoothly a sequence of observations from a smooth, differentiable signal. The analysis is strictly local, even though most spline algorithms average over the whole sample space. The fit is almost entirely determined by the observed data points, so that little structure is imposed on the process. What structure is predetermined is generated by the position of the knots.
Fourier series, or Fourier integrals, are strictly global over time or space, notwithstanding the use of windows to obtain useful local estimates of the coefficients. Wavelets, however, can provide a mixture of local and global characteristics of the signal, and are easily modified to incorporate restrictions on the signal over time or space. Wavelets generalize Fourier integrals and series in that each frequency band, or octave, groups together frequencies separated by the supports at each scale. A research analyst can incorporate the equivalent of a windowed analysis of Fourier integrals and incorporate time scale variations as in Ramsey and Zhang (1996, 1997). Further, as illustrated by cosine wave packets (Bruce and Gao 1996) and the wide choice of low and high pass filters (Strang and Nguyen 1996), considerable detail can be captured, or suppressed, and basic oscillations can be incorporated using band pass filters to generate oscillatory wavelets.
While it is well recognized that wavelets have not been as widely used in Economics
as in other disciplines, I hope to show that there is great scope for remedying the
situation. The main issue involves the gain in insight to be stimulated by using
wavelets; quite literally, the use of wavelets encourages researchers to generalize
their conception of the problem at hand.
An early example is the analysis of foreign exchange data using waveform dictionaries of time-frequency atoms (see Ramsey and Zhang 1996, 1997), atoms of the form:

$$g_\gamma(t) = \frac{1}{\sqrt{s}}\, g\!\left(\frac{t-u}{s}\right) e^{i\xi t} \qquad (33)$$

We impose the conditions $\|g\| = 1$, where $\|\cdot\|$ is the $L^2$ norm, and $g(0) \ne 0$. For any scale parameter s, frequency modulation ξ, and translation parameter u: the factor $1/\sqrt{s}$ normalizes the norm of g(t) to 1; $g_\gamma(t)$ is centered at the abscissa u and its energy is concentrated in the neighborhood of u, whose size is proportional to s; its Fourier transform is centered at the frequency ξ and its energy is concentrated in the neighborhood of ξ, with size proportional to 1/s. Matching pursuit was used
to determine the values of the coefficients; i.e. the procedure picks the coefficients
with the greatest contribution to the variation in the function being analyzed. Raw
tick by tick data on three foreign exchange rates were obtained from October 1,
1992 to September 30, 1993 (see Ramsey and Zhang 1997). The waveform analysis
indicates that there is efficiency of structure, but only at the lowest frequencies
equivalent to periods of 2 h with little power. There are some low frequencies that
wax and wane in intensity. Most of the energy of the system seems to be in the
localized energy frequency bursts.
The frequency bursts provide insights into market behavior. One can view the
dominant market reaction to news as a sequence of short bursts of intense activity
that are represented by narrow bands of high frequencies. For example, only the
first one hundred structures provide a good fit to the data at all but the highest
frequencies. Nevertheless the isolated bursts are themselves unpredictable.
The potential for the observable frequencies to wax and wane militates against use of the Fourier approach. Further, the series is most likely a sequence of observations on a continuous, but nowhere differentiable, process. Further analysis is needed to consider the optimal basis generating function.
To begin the discussion of the “errors in variables” problem, one notes that the approaches are as unstructured as they have always been; that is, we endeavor to search for a strong instrumental variable, but have no ability to recognize one even if considered. Further, it is just as difficult to recognize a weak instrument that, if used, would yield worse results. I have labeled this approach “solution by assumption”, since one has in fact no idea whether a putative variable is, or is not, a useful instrumental variable.
Wavelets can resolve the issue: see Ramsey et al. (2010), Gençay and Gradojevic
(2011), Gallegati and Ramsey (2012) for an extensive discussion of this critical
problem. The task is simple: use wavelets to decompose the observed series into
a “noise” component and a structural component, possibly refined by thresholding
the coefficient estimates (Ramsey et al. 2010). The benefits from recognizing the
insights to be gained from this approach are only belatedly coming to be realized.
Suppose all the variables in a system of equations can be factored into a structural component, itself decomposable into a growth term and an oscillation term, and into a noise term, e.g.

$$y_i = y_i^{*} + \varepsilon_i$$
$$x_i = x_i^{*} + \eta_i$$
$$z_i = z_i^{*} + \omega_i$$

where the starred terms are structural and the terms $\varepsilon_i$, $\eta_i$, $\omega_i$ are random variables, either modeled as simple pulses or having a far more complex stochastic structure, including having distributions that are functions of the structural terms. If we wish to study the structure of the relationships between the variables, we can easily do so (see Silverman 2000; Johnstone 2000). In particular, we can query the covariance between the random error terms, select suitable instrumental variables, solve the simultaneous equation problem, and deal effectively with persistent series.
Using some simulation exercises, Ramsey et al. (2010) demonstrated how the structural components revealed by wavelet analysis yield nearly ideal instrumental variables for variables observed with error and for co-endogenous variables in simultaneous equation models. Indeed, the comparison of the outcomes with current standard procedures indicates that as the nonparametric approximation to the structural component improves, so does the convergence of the near structural estimates.
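The following sketch illustrates the denoising step described above on assumed simulated data (it is not the authors' own design): soft-thresholding the detail coefficients, here with the Donoho–Johnstone universal threshold, to recover a candidate structural component.

```python
import numpy as np
import pywt

# Illustrative errors-in-variables setup (all assumed): x* is structural,
# x_obs = x* + noise is the observed regressor
rng = np.random.default_rng(3)
t = np.arange(1024)
x_star = np.sin(2 * np.pi * t / 128) + 0.01 * t
x_obs = x_star + 0.5 * rng.normal(size=t.size)

# Extract a structural component by soft-thresholding the detail coefficients
coeffs = pywt.wavedec(x_obs, "db4", level=5)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # noise scale from finest details
lam = sigma * np.sqrt(2 * np.log(t.size))           # universal threshold
coeffs[1:] = [pywt.threshold(d, lam, mode="soft") for d in coeffs[1:]]
x_hat = pywt.waverec(coeffs, "db4")[: t.size]

# x_hat tracks the structural component: a candidate instrument for x_obs
print("corr(x_hat, x*):", round(np.corrcoef(x_hat, x_star)[0, 1], 3))
```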
While I have posed the situation in terms of linear regression, the benefits of this approach are far greater for non-linear relationships. The analysis of Donoho and Johnstone (1995) indicates that asymptotic convergence will yield acceptable results and that convergence is swift.
Most economic and financial time series evolve in a nonlinear fashion over time, are
non-stationary and their frequency characteristics are often time-dependent, that is,
the importance of the various frequency components is unlikely to remain stable
over time. Since these processes exhibit quite complicated patterns like abrupt
changes, jumps, outliers and volatility clustering, a locally adaptive filter like the
wavelet transform is particularly well suited for evaluation of such models.
An example of the potential role to be played by wavelets is provided by
the detection and location of outliers and structural breaks. Indeed, wavelets can
provide a deeper understanding of structural breaks with respect to standard classical
analysis given their ability to identify the scale as well as the time period at
which the inhomogeneity occurs. Specifically, based on two main properties of the
discrete wavelet transform (DWT), i.e. the energy preservation and approximate
decorrelation properties, a wavelet-based test for homogeneity of variance (see
Whitcher 1998; Whitcher et al. 2002) can be used for detecting and localizing
regime shifts and discontinuous changes in the variance.
Similarly, structural changes in economic relationships can be usefully detected
by the presence of shifts in their phase relationship. Indeed, although a standard
assumption in economics is that the delay between variables is fixed, Ramsey
and Lampart (1998a,b) have shown that the phase relationship (and thus the
lead/lag relationship) may well be scale dependent and vary continuously over time.
Therefore, examining “scale-by-scale” overlaid graphs between pairs of variables can provide interesting insights into the nature of the relationship between these variables and their evolution over time (Ramsey 2002). A recent example of this approach is provided in Gallegati and Ramsey (2013), where the analysis of such variations in phase proves useful for detecting and interpreting structural changes in the form of smooth changes in the q-relationship proposed by Tobin.
To consider an extreme example, suppose that the economy were composed entirely of discrete jumps; the only suitable wavelet would then be based on the Haar function. Less restrictive is the assumption that the process is continuous, except for a finite number of discontinuities. The analysis can proceed in two stages: first isolate the discontinuities using Haar wavelets, then analyze the remaining data using an appropriate continuous wavelet generating function.

Finally, wavelets provide a natural way to search for outliers, in that wavelets allow for local distributions at all scales and outliers are at the very least a “local” phenomenon (for a very brief introduction see Wei et al. 2006; Greenblatt 1996). The idea of thresholding (Bruce and Gao 1996; Nason 2008) is that the noise is spread across all coefficients at small magnitudes, while the signal is concentrated in a few large coefficients; setting the small coefficients to zero therefore suppresses the noise.
The separation of aggregate data into different time scale components by wavelets
can provide considerable insights into the analysis of economic relationships
between variables. Indeed, economics is an example of a discipline in which time
scale matters. Consider, for example, traders operating in the market for securities:
some, the fundamentalists, may have a very long view and trade looking at firm or
market fundamentals; some others, the chartists, may operate with a time horizon of
weeks or days. A corollary of this assumption is that different planning horizons
are likely to affect the structure of the relationships themselves, so that such
relationships might vary over different time horizons or hold at several time scales,
but not at others.
Although the concepts of the “short-run” and of the “long-run” are central for modeling economic and financial decisions, variations in the relationship across time scales are seldom discussed in economics and finance. We should begin by recognizing that for each variable postulated by the underlying theory we admit the possibility that:

$$y_s = g_s(y_{j,s},\, x_{i,s})$$

where $y_s$ is the dependent variable at scale s, the $g_s(\cdot)$ are arbitrary functions specified by the theory, which might differ across scales, $y_{j,s}$ represents the codependent variables at scale s, and $x_{i,s}$ represents the exogenous variables $x_i$ at scale s; that is, the relationships between economic variables may well be scale dependent.
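A hedged simulation sketch of such scale dependence (all data and loadings are assumptions chosen for illustration): a slope linking y to x is recovered scale by scale from the decomposed series, even though it differs across scales.

```python
import numpy as np
import pywt

rng = np.random.default_rng(4)
n, J, w = 1024, 4, "db4"
x = rng.normal(size=n)                      # regressor with energy at all scales

def mra(series):
    """[S_J, D_J, ..., D_1] component series via zero-and-reconstruct."""
    c = pywt.wavedec(series, w, level=J)
    out = []
    for k in range(len(c)):
        z = [np.zeros_like(a) for a in c]
        z[k] = c[k]
        out.append(pywt.waverec(z, w)[:n])
    return out

xc = mra(x)
# y loads negatively on the smooth (long run) and positively on the finest detail
y = -1.0 * xc[0] + 2.0 * xc[-1] + 0.05 * rng.normal(size=n)

yc = mra(y)
for k, (a, b) in enumerate(zip(xc, yc)):
    beta = np.dot(a, b) / np.dot(a, a)      # scale-by-scale OLS slope
    label = "S_4" if k == 0 else f"D_{J + 1 - k}"
    print(f"{label}: slope = {beta:+.2f}")  # about -1 at S_4, +2 at D_1, ~0 elsewhere
```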
Following Ramsey and Lampart (1998a,b), many authors have confirmed that allowing for different time scales of variation in the data can provide a fruitful understanding of the complex dynamics of economic relationships among variables with non-stationary or transient components. For example, relationships that are veiled when estimated at the aggregate level may be consistently revealed after allowing for a decomposition of the variables into different time scales. In general, the results indicate that by using wavelet analysis it is possible to uncover relationships that are at best puzzling using standard regression methods, and that ignoring time and frequency dependence between variables when analyzing relationships in economics and finance can lead to erroneous conclusions.
The standard concerns about forecasting carry over to the use of wavelets, but,
as might have been anticipated, wavelets incorporate a degree of refinement and
flexibility not available using conventional methods (see, for example, Diebold
1998). With wavelets, one can choose the scale at which the forecast is to be made,
treating each scale level as a separate series for forecasting purposes. Secondly, one
should note that at any given point in time the “forecast” will depend on the scales
at which one wishes to evaluate the forecast; for example, at all scales for a point
in time, $t_0$, or for a subset of scales at time $t_0$. Further, one might well choose to consider, at a given minimum scale, whether to forecast a range or a point estimate at time $t_0$.
These comments indicate a potentially fruitful line of research and suggest that the idea of “forecasting” is more subtle than has been recognized so far. Forecasts need to be expressed conditional on the relevant scales, and the usual forecasts are special cases of a general procedure. Indeed, one concern that is ignored in the conventional approach is the composition, across scales, of the variance involved, in terms of the variances at each scale level. For examples, see Gallegati et al. (2013), Yousefi et al. (2005), Greenblatt (1996). Linking forecasts to the underlying scale represents an important development in the understanding of the information generated by wavelets. There is not a single forecast at time $t_{0+h}$ made at time $t_0$, but a forecast at each relevant time scale.
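One simple way to implement the scale-wise idea is sketched below (an assumed design, not a procedure from the text): each component series gets its own small autoregression, and the component forecasts are summed; boundary effects at the end of each component are a known caveat of such a scheme.

```python
import numpy as np
import pywt
from statsmodels.tsa.ar_model import AutoReg

# Illustrative series (assumed data): trend + cycle + noise
rng = np.random.default_rng(5)
n, h, J = 512, 8, 3
t = np.arange(n)
x = 0.02 * t + np.sin(2 * np.pi * t / 32) + 0.3 * rng.normal(size=n)

coeffs = pywt.wavedec(x, "db4", level=J)
components = []
for k in range(len(coeffs)):
    z = [np.zeros_like(c) for c in coeffs]
    z[k] = coeffs[k]
    components.append(pywt.waverec(z, "db4")[:n])

# Forecast each scale separately, then aggregate
forecast = np.zeros(h)
for comp in components:
    fit = AutoReg(comp, lags=8).fit()
    forecast += fit.predict(start=n, end=n + h - 1)

print("h-step-ahead forecasts:", np.round(forecast, 2))
```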
Fan and Gençay (2010) have explored the gain in efficiency in detecting unit roots and in applying tests for cointegration using wavelet procedures. Further, using MODWT multi-resolution techniques, the authors demonstrate a significant gain in power against near unit root processes. In addition, the wavelet approach leads to a novel interpretation of von Neumann variance ratio tests.
Gallegati et al. (2009, 2011) reviewed the literature on the “wage Phillips curve”
using U.S. data. The most significant result of the multiscale analysis is the long run
one to one relationship between wage and price inflation and the close relationship
between nominal wage changes and the unemployment rate at business cycle scales. Overall, the paper suggests that allowing for different time scales of variation in the data can provide a richer understanding of the complex dynamics of economic relationships between variables. Relationships that are puzzling when tested using standard methods can be consistently estimated and structural properties revealed using timescale analysis. The authors note with some humor that Phillips himself can be considered the first user of wavelets in Economics!
One of the most cogent rationalizations for the use of wavelets and timescale
analysis is that different agents operate at different timescales. In particular, one
might examine the behavior of central banks to elucidate their objectives in the
short and long run. This is done in Aguiar-Conraria and Soares (2008) in assessing
the relationship between central bank decision-making and government decision-
making. The authors confirm that the macro relationships have changed and evolved
over time.
In Rua and Nunes (2009) and Rua (2010), interesting results are obtained which concentrate on the role of wavelets in the analysis of the co-movement between international stock markets. In addition, the authors generalize the notion of co-movement across both time and frequency. In Samia et al. (2009), a wavelet approach is taken to assessing values for VaR and compares favorably to conventional ARMA-GARCH models.
future on capturing variation within the function’s supports and thereby providing alternative determinations of very short run behavior. The implied flexibility of wavelets provides deconvolution of very short run phenomena as well as of medium run and long run phenomena.

The paper also contains brief reviews of a variety of applications of wavelets to economic examples which are of considerable importance to economists interested in accurate evaluations of policy variables. A wide variety of data sources has been examined, including both macroeconomic and financial data (Bloomfield 1976). In these models the problem of errors in the variables is critical, but wavelets provide the key to resolving the issue. Some papers examine data for structural breaks and outliers. Comments on forecasting were presented; these thoughts indicate that forecasting is more subtle than is currently believed, in that forecasts need to be calculated conditional on the scales involved in the forecast. Some forecasts might well involve only a particular subset of the time scales included in the entire system.
Does Productivity Affect Unemployment? A Time-Frequency Analysis for the US

Marco Gallegati, Mauro Gallegati, James B. Ramsey, and Willi Semmler

1 Introduction
living (Landes 1969). Economists in the past, from Ricardo to Schumpeter to Hicks, have explored the phenomenon of whether new technology and productivity in fact increase unemployment. The relationship between productivity and employment is also very important in the theoretical approach followed by the mainstream models: Real Business Cycle (RBC) and DSGE. In particular, RBC theorists have postulated technology shocks as the main driving force of business cycles. In RBC models, the responses of output and employment (measured as hours worked) to technology shocks are predicted to be positively correlated.1 This claim has been made the focus of numerous econometric studies.2 Employing the Blanchard and Quah (1989) methodology, Gali (1999), Gali and Rabanal (2005), Francis and Ramey (2005) and Basu et al. (2006) find a negative correlation between employment and productivity growth, once the technology shocks have been purified by taking out demand shocks affecting output.
Although economists mostly agree on the long run positive effects of labor productivity, significant disagreements arise over the issue as to whether productivity growth is good or bad for employment in the short run. Empirical results have been mixed (e.g. in Muscatelli and Tirelli 2001, where the relationship between productivity growth and unemployment is negative for several G7 countries and not significant for others), and some studies postulate a possible trade-off between employment and productivity growth (Gordon 1997). Such empirical findings have also been complicated by the contrasting evidence emerging during the 1990s between the US and Europe as to the relationship between (un)employment and productivity growth. Whereas the increase in productivity growth in the US in the second half of the 1990s was associated with low and falling unemployment (Staiger et al. 2001), in Europe the opposite tendency was visible: productivity growth appears to have increased unemployment.
The labor market provides an example of a market where the strategies used
by the agents involved, firms and workers (through unions), can differ by time
scale. Thus, the “true” economic relationships among variables can be found at
the disaggregated (scale) level rather than at the usual aggregate level. As a matter
of fact, aggregate data can be considered the result of a time scale aggregation
procedure over all time scales and aggregate estimates a mixture of relationships
across time scales, with the consequence that the effect of each regressor tends
to be mitigated by this averaging over all time scales.3 Blanchard et al. (1995)
were the first ones to hint at such a research agenda. They stressed that it may be
useful to distinguish between the short, medium and long-run effects of productivity
growth, as the effects of productivity growth on unemployment may show different
1. See the volume by Cooley (1995), and see also Backus and Kehoe (1992), among others.
2. For details of the evaluations, see Gong and Semmler (2006, ch. 6).
3. For example, Gallegati et al. (2011), where wavelet analysis is applied to the wage Phillips curve for the US.
co-movements depending on the time scales.4 Similar thoughts are also reported
in Solow (2000) with respect to the different ability of alternative theoretical
macroeconomic frameworks to explain the behavior of an economy at the aggregate
level in relation to their specific time frames5 and, more recently, the idea that time
scales can be relevant in this context has also been expressed by Landmann (2004).6
Following these insights, studies are now emerging arguing that researchers need to disentangle the short and long-term effects of changes in productivity growth on unemployment. For example, Tripier (2006), studying the co-movement of productivity and hours worked at different frequency components through spectral analysis, finds that co-movements between productivity and unemployment are negative in the short and long run, but positive over the business cycle.7 This paper is related to the above-mentioned literature by focusing on the relationship between unemployment and productivity growth at different frequency ranges. Indeed, compared with other filtering methods, wavelets are able to decompose macroeconomic time series, and data in general, into several components, each with a resolution matched to its scale. After the first applications of wavelet analysis in economics and finance provided by Ramsey and his co-authors (1995; 1996; 1998a; 1998b), the number of wavelet applications in economics has been growing rapidly in the last few years as a result of the interesting opportunities provided by wavelets for studying economic relationships at different time scales.8
The objective of this paper is to provide evidence on the nature of the time scale
relationship between labor productivity growth and the unemployment rate using
wavelet analysis, so as to provide a new challenging theoretical framework, new
empirical results as well as policy implications. First, we perform wavelet-based
exploratory analysis by applying the continuous wavelet transform (CWT) since
tools such as wavelet power, coherency and phase can reveal interesting features
4. Most of the attention of economic researchers who work on productivity has been devoted to measurement issues and to resolving the problem of data consistency, as there are many different approaches to the measurement of productivity linked to the choice of data, notably the combination of employment, hours worked and GDP (see for example the OECD Productivity Manual, 2001).
5. “At short term scales, I think, something sort of Keynesian is a good approximation, and surely better than anything straight neoclassical. At very long scales, the interesting questions are best studied in a neoclassical framework. . . . At the 5–10 years time scale, we have to piece things together as best as we can, and look for an hybrid model that will do the job” (Solow 2000, p. 156).
6. “The nature of the mechanism that link [unemployment and productivity growth] changes with the time frame adopted” because one needs “to distinguish between an analysis of the forces shaping long-term equilibrium paths of output, employment and productivity on the one hand and the forces causing temporary deviations from these equilibrium paths on the other hand” (Landmann 2004, p. 35).
7. Qualitatively similar results are also provided using time domain techniques separating long-run trends from short run phenomena.
8. For example, Gençay et al. (2005), Gençay et al. (2010), Kim and In (2005), Fernandez (2005), Crowley and Mayes (2008), Gallegati (2008), Ramsey et al. (2010), Gallegati et al. (2011).
The essential characteristics of wavelets are best illustrated through the development of the continuous wavelet transform (CWT).9 We seek functions ψ(u) such that:

$$\int \psi(u)\, du = 0 \qquad (1)$$

$$\int \psi(u)^2\, du = 1 \qquad (2)$$
The cosine function is a “large wave” because its square does not converge to 1,
even though its integral is zero; a wavelet, a “small wave” obeys both constraints.
An example would be the Haar wavelet function:
8 1
< p2 1 < u < 0
ˆ
H
.u/ D p1 0<u<1 (3)
:̂ 2
0 otherwise
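These two conditions are easy to verify numerically. The following is a minimal sketch in base R; the function haar below is simply an illustrative transcription of Eq. (3), not code from this paper:

# Numerically check the wavelet conditions (1)-(2) for the Haar function
haar <- function(u) ifelse(u > -1 & u < 0, -1 / sqrt(2),
                    ifelse(u >  0 & u < 1,  1 / sqrt(2), 0))
integrate(haar, -2, 2)$value                    # approximately 0, Eq. (1)
integrate(function(u) haar(u)^2, -2, 2)$value   # approximately 1, Eq. (2)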
9
Wavelets, their generation, and their potential use are discussed in intuitive terms in Ramsey (2010), while Gençay et al. (2001) give an excellent development of wavelet analysis and provide many interesting economic examples. Percival and Walden (2000) provide a more technical exposition with many examples of the use of wavelets in a variety of fields, but not in economics.
Let us choose the convention that we assess the value of the "average" at the center of the interval, and let \lambda represent the scale of the partial sums. We have the expression:

A(\lambda, t) = \frac{1}{\lambda} \int_{t-\lambda/2}^{t+\lambda/2} x(u)\, du    (4)

A(\lambda, t) is the average value of the signal centered at t with scale \lambda. But what is of more use is to examine the differences between such averages at different values of \lambda and at different values of t. We define:

D(\lambda, t) = A(\lambda, t + \lambda/2) - A(\lambda, t - \lambda/2)    (5)

This is the basis for the continuous wavelet transform, CWT, as defined by the Haar wavelet function. For an arbitrary wavelet function \psi(\cdot), the wavelet transform is:

W(\lambda, t) = \int_{-\infty}^{\infty} \psi_{\lambda,t}(u)\, x(u)\, du    (6)

\psi_{\lambda,t}(u) = \frac{1}{\sqrt{\lambda}}\, \psi\!\left(\frac{u - t}{\lambda}\right)    (7)

where \lambda is a scaling or dilation factor that controls the length of the wavelet and t is a location parameter that indicates where the wavelet is centered (see Percival and Walden 2000).
Let W_x(\lambda, t) be the continuous wavelet transform of a signal x(\cdot); |W_x|^2 represents the wavelet power and can be interpreted as the energy density of the signal in the time-frequency plane. Among the several types of wavelet families available, that is Morlet, Mexican hat, Haar, Daubechies, etc., the Morlet wavelet is the most widely used because of its optimal joint time-frequency concentration. The Morlet wavelet is a complex wavelet that produces complex transforms and thus can provide us with information on both amplitude and phase. It is defined as

\psi(t) = \pi^{-1/4}\, e^{i \omega_0 t}\, e^{-t^2/2}.    (8)
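Equation (8) is straightforward to transcribe. A minimal sketch in base R follows, assuming the common choice \omega_0 = 6 (an assumption; the paper does not state the value used) for a good balance between time and frequency localization:

# The Morlet mother wavelet of Eq. (8); returns complex values
morlet <- function(t, omega0 = 6) {
  pi^(-1/4) * exp(1i * omega0 * t) * exp(-t^2 / 2)
}
tt    <- seq(-4, 4, by = 0.1)
amp   <- Mod(morlet(tt))   # amplitude information
phase <- Arg(morlet(tt))   # phase information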
10
The CWT has been computed using the MatLab package developed by Grinsted et al. (2004).
MatLab programs for performing the bias-rectified wavelet power spectrum and partial wavelet
coherence are provided by Ng and Kwok at http://www.cityu.edu.hk/gcacic/wavelet.
11
The statistical significance of the results obtained through wavelet power analysis was first
assessed by Torrence and Compo (1998) by deriving the empirical (chi-squared) distribution for
the local wavelet power spectrum of a white or red noise signal using Monte Carlo simulation
analysis.
12
As with other types of transforms, the CWT applied to a finite-length time series inevitably suffers from border distortions; this is due to the fact that the values of the transform at the beginning and the end of the time series are always incorrectly computed, in the sense that they involve missing values of the series which are then artificially prescribed; the most common choices are zero padding (extension of the time series by zeros) or periodization. Since the effective support of the wavelet at scale \lambda is proportional to \lambda, these edge effects also increase with \lambda. The region in which the transform suffers from these edge effects is called the cone of influence. In this area of the time-frequency plane the results are unreliable and have to be interpreted carefully (see Percival and Walden 2000).
Fig. 1 Rectified wavelet power spectrum plot for labor productivity growth. Note: contours and a cone of influence are added for significance. A black contour line tests the wavelet power at the 5 % significance level against a white noise null; the cone of influence, represented by a shaded area, marks the region affected by edge effects
In Figs. 1 and 2 we report estimated wavelet spectra for labor productivity growth
and the unemployment rate, respectively.13 The comparison between the power
spectra of the two variables reveals important differences as to their characteristic
features. In the case of labor productivity growth there is evidence of highly localized patterns at lower scales, with high power regions concentrated in the first part of the sample (until the late 1980s). By contrast, for the unemployment
rate significant power regions are evident at scales corresponding to business cycle
frequencies throughout the sample.
Although useful for revealing potentially interesting features in the data like
“characteristic scales”, the wavelet power spectrum is not the best tool to deal
with the time-frequency dependencies between two time-series. Indeed, even if two
variables share similar high power regions, one cannot infer that their comovements
look alike.
13
We use quarterly data for the US between 1948:1 and 2013:4 from the Bureau of Labor Statistics. Labor productivity is defined as output per hour of all persons in the Nonfarm Business Sector, Index 1992 = 100, and transformed into its growth rate as 400 ln(x_t / x_{t-1}). The unemployment rate is defined as the percent Civilian Unemployment Rate.
Fig. 2 Rectified wavelet power spectrum for the unemployment rate. Note: see Fig. 1
In order to detect and quantify relationships between variables, suitable wavelet tools are the cross-wavelet power, the wavelet coherence and the wavelet phase difference. Let W_x and W_y be the continuous wavelet transforms of the signals x(\cdot) and y(\cdot); the cross-wavelet power of the two series is given by |W_{xy}| = |W_x W_y^*| and depicts the local covariance of the two time series at each scale and frequency (see Hudgins et al. 1993). The wavelet coherence is defined as the modulus of the wavelet cross spectrum normalized by the individual wavelet spectra and is especially useful in highlighting the time and frequency intervals where two phenomena have strong interactions. It can be considered the local correlation between two time series in time-frequency space. The statistical significance level of the wavelet coherence is estimated using Monte Carlo methods. The 5 % significance level against the null hypothesis of red noise is shown as a thick black contour. The cone of influence is marked by a thin black line: again, values outside the cone of influence should be interpreted very carefully, as they result from a significant contribution of zero padding at the beginning and the end of the time series.
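As an illustration of this toolset, the sketch below uses the R package biwavelet (an R port of the Grinsted et al. (2004) Matlab code cited in footnote 10); lp and ur are hypothetical vectors holding the quarterly productivity growth and unemployment series — this is a sketch under those assumptions, not the authors' actual code:

# Wavelet coherence with Monte Carlo significance testing
library(biwavelet)
n  <- length(lp)
d1 <- cbind(1:n, lp)               # biwavelet expects (time, value) matrices
d2 <- cbind(1:n, ur)
wc <- wtc(d1, d2, nrands = 300)    # red noise null via 300 Monte Carlo draws
plot(wc, plot.phase = TRUE)        # coherence, significance contour, COI, phase arrows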
Complex-valued wavelets like the Morlet wavelet have the ability to provide phase information, that is, a local measure of the phase delay between two time series as a function of both time and frequency. The phase information is coded by the arrow orientation. Following the trigonometric convention, the direction of the arrows shows the relative phasing of the two time series and can be interpreted as indicating a lead/lag relationship: a right (left) arrow means that the two variables are in phase (anti-phase). If the arrows point to the right and up, the unemployment rate is lagging; if they point to the right and down, the unemployment rate is leading. If the arrows point to the left and up, the unemployment rate is leading, and if they point to the left and down, the unemployment rate is lagging. The relative phase
Fig. 3 Wavelet coherence between the unemployment rate and productivity growth. The color code for coherence ranges from blue (low coherence) to red (high coherence). A pointwise significance test is performed against an almost process-independent background spectrum. 95 % confidence intervals for the null hypothesis that coherency is zero are plotted as black contours in the figure. The cone of influence is marked by black lines (Color figure online)
14
The number of papers applying the DWT is far greater than the number using the CWT. As a matter of fact, the preference for the DWT in economic applications can be explained by its ability to facilitate a more direct comparison with standard econometric tools than is permitted by the CWT, e.g. time scale regression analysis, homogeneity tests for variance, and nonparametric analysis.
15
Their dimensions change according to their scale: the windows stretch for large values of \lambda to measure the low-frequency movements and compress for small values of \lambda to measure the high-frequency movements.
f(t) \approx S_J + D_J + D_{J-1} + \dots + D_2 + D_1    (15)
where S_J contains the "smooth component" of the signal and the D_j, j = 1, 2, \dots, J, the detail signal components at ever increasing levels of detail. S_J provides the large-scale road map; D_1 shows the potholes. The previous equation indicates what is termed the multiresolution decomposition, MRD.
The orthonormal discrete wavelet transform (DWT), even if widely applied to time series analysis in many disciplines, has two main drawbacks: (1) the dyadic length requirement (i.e. a sample size divisible by 2^J), and (2) the wavelet and scaling coefficients are not shift invariant. Because of these practical limitations of the DWT, wavelet analysis is generally performed by applying the maximal overlap discrete wavelet transform (MODWT).
16
Since the J components obtained by the application of MODWT are not orthogonal, they do not
sum up to the original variable.
17
Detail levels D1 and D2 represent the very short-run dynamics of a signal (and contain most of its noise), levels D3 and D4 roughly correspond to the standard business cycle time period (Stock and Watson 1999), while the medium-run component is associated with level D5. Finally, the smooth component S5 captures oscillations with a period longer than 16 years, corresponding to the low-frequency components of a signal.
18
Although a standard assumption in economics is that the delay between variables is fixed, the
phase relationship may well be scale dependent and vary continuously over time (e.g. in Ramsey
and Lampart 1998a,b; Gallegati and Ramsey 2013).
19
This leading behavior is consistent with the findings reported in the previous section using
wavelet coherence.
Fig. 5 Phase shift relationships of smooth and detail components (S5, D5, D4 and D3) for unemployment (dotted lines) and productivity (solid lines)
panel in Fig. 5 reveals that the two components are mostly in phase at the D5 scale
level, with unemployment slightly leading productivity growth. Nonetheless, the
plot also shows that the two series at this level have been moving into antiphase at
the beginning of the 1990s, as a consequence of a shift in the phase relationship,
and then have been moving in-phase again in the last part of the sample. At the D4
scale level unemployment and productivity are in-phase throughout the sample with
the exception of the 1960s. Finally, at the lower scale levels productivity growth
and unemployment rate components show very different amplitude fluctuations.
This pattern suggests that a well-known feature of quarterly aggregate productivity growth data, namely its very high volatility, can be ascribed to its high-frequency components.
Wavelets provide a unique tool for the analysis of economic relationships on a scale-by-scale basis. Time scale regression analysis allows the researcher to examine the relationship between variables at each scale j, where the variation in both variables has been restricted to the indicated scale. In order to perform a time scale regression analysis, we first need to partition each variable into a set of different components by using the discrete wavelet transform (DWT), such that each component corresponds to a particular range of frequencies, and then run the regression analysis on a scale-by-scale basis (e.g. Ramsey and Lampart 1998a,b; Kim and In 2005):
ur[S_J]_t = \alpha + \beta \, lp[S_J]_t + \varepsilon_t

and

ur[D_j]_t = \alpha_j + \beta_j \, lp[D_j]_t + \varepsilon_{j,t}

where ur[S_J]_t and lp[S_J]_t represent the components of the variables at the longest scale, and ur[D_j]_t and lp[D_j]_t represent the components of the variables at each scale j, with j = 1, 2, \dots, J.
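A minimal sketch of this two-step procedure in R, assuming the waveslim package (whose mra() output matches the component labels lp.mra and ur.mra seen in Fig. 5) and hypothetical quarterly series lp and ur; with J = 5, mra() returns the components D1, ..., D5 and the smooth S5:

# Step 1: multiresolution decomposition of each variable
library(waveslim)
lp.mra <- mra(lp, wf = "la8", J = 5, method = "modwt", boundary = "periodic")
ur.mra <- mra(ur, wf = "la8", J = 5, method = "modwt", boundary = "periodic")
# Step 2: least squares regression of the unemployment component on the
# productivity component, scale by scale (D1..D5, then S5)
fits <- lapply(1:6, function(j) lm(ur.mra[[j]] ~ lp.mra[[j]]))
lapply(fits, summary)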
In Table 1 we present the results from least squares estimates at the aggregate and individual scale levels. First of all, we notice that although at the aggregate level the relationship between productivity and unemployment is not significant, the "scale-by-scale" regressions reveal a significant relationship at almost every scale level, and that the effects of productivity on the unemployment rate differ widely across scales in terms of sign and estimated size. Specifically, while at scales D1 and D2 the estimated effect of productivity growth on unemployment is negligible, at business cycle and medium-run scales, i.e. from D3 to D5, the size and significance of the estimated coefficients indicate a positive relationship that is highest at the D4 scale level. Finally, long-run trends are negatively related: a 1 % fall in the long-run productivity growth rate increases the unemployment rate by 1.86 %.21
20
Thus, we test for frequency dependence of the regression parameters by using time scale regression analysis, since the approaches previously used to detect and model frequency dependence, such as spectral regression (Hannan 1963; Engle 1974, 1978), present several shortcomings because of their use of the Fourier transform. For examples of the use of this procedure in economics, see Ramsey and Lampart (1998a,b) and Gallegati et al. (2011).
21
This estimated magnitude of the impact of growth on unemployment is in line with those obtained in previous studies. For example, Pissarides and Vallanti (2007), using a panel of OECD countries, estimate that a 1 % decline in the growth rate leads to a 1.3–1.5 % increase in unemployment.
This finding is not new. A negative link between unemployment and productivity growth at low frequencies is also documented in Staiger et al. (2001) and Ball and Moffitt (2002), where the trending behavior of productivity growth is invoked to explain the combination of low and falling inflation with low unemployment experienced by the US during the second half of the 1990s, as well as in Muscatelli
and Tirelli (2001) for several G7 countries. Similar results have also been obtained in Tripier (2006) and Chen et al. (2007) using different methods. In the former, using measures of co-movement in the frequency domain, it is shown that co-movements between the variables differ strongly according to the frequency, being negative in the short and long run but positive over the business cycle. In the latter, the authors, disaggregating the data into their short- and long-term components and using two different econometric methods (Maximum Likelihood and structural VAR), find that productivity growth affects unemployment positively in the short run and negatively in the long run.22
In sum, when we consider different time frames we find that the effects of productivity growth on unemployment are frequency-dependent: in the long run an increase in productivity releases forces that stimulate innovation and growth in the economy and thus determines a reduction in unemployment, whereas at intermediate and business cycle time scales productivity gains cause unemployment to increase.
In this section we apply a methodology that allows us to explore the robustness of the relationship between productivity growth and unemployment without making any a priori assumption, explicit or implicit, about the form of the relationship: nonparametric regression analysis. Indeed, nonparametric regressions can capture the shape of a relationship without prejudging the issue, as they estimate the regression function f(.) linking the dependent to the independent variables directly.23
There are several approaches available for estimating nonparametric regression models,24 and most of these methods assume that the nonlinear functions of the independent variables to be estimated are smooth continuous functions. One such model is the locally weighted polynomial regression (loess) pioneered by Cleveland (1979).
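A minimal sketch of such a fit in base R, reusing the hypothetical scale components lp.mra and ur.mra from the sketch above; span is the smoothing parameter discussed in footnotes 25 and 26, where the main findings are reported as stable for spans between 0.4 and 0.8:

# Loess fit of the unemployment component on the productivity component
# at the D4 scale (an illustrative sketch, not the authors' code)
fit.d4 <- loess(ur.mra[[4]] ~ lp.mra[[4]], span = 0.6, degree = 2)
plot(lp.mra[[4]], ur.mra[[4]], pch = 16, cex = 0.5)
ord <- order(lp.mra[[4]])
lines(lp.mra[[4]][ord], fitted(fit.d4)[ord], lwd = 2)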
22
Recently, a negative long-run relationship between productivity growth and unemployment has
also been obtained by Schreiber (2009) using a co-breaking approach and Miyamoto and Takahashi
(2011) using band-pass filtering.
23
The traditional nonlinear regression model introduces nonlinear functions of the independent variables using a limited range of transformed variables (quadratic terms, cubic terms or piecewise constant functions). An example of a methodology testing for nonlinearity without imposing any a priori assumption about the shape of the relationship is the smooth transition regression used in Eliasson (2001).
24
See Fox (2000a,b) for a discussion on nonparametric regression methods.
4 Interpretation
25
The smoothing parameter controls the flexibility of the loess regression function: large values produce the smoothest functions, which wiggle the least in response to fluctuations in the data; the smaller the smoothing parameter, the more closely the regression function conforms to the data.
26
We use different smoothing parameters, but our main findings do not show excess sensitivity
to the choice of the span in the loess function within what appear to be reasonable ranges of
smoothness (i.e. between 0.4 and 0.8).
[Figure: scale-by-scale scatterplots of the unemployment rate components against the labor productivity growth components (S5, D5, D4, D3, D2 and D1)]
which, in turn, reflect the extent to which the two forms of technical change
discussed in this literature, that is embodied and disembodied technology,27 are
embodied in production factors.
In the model with disembodied technological progress it is suggested that
higher productivity growth reduces the long run unemployment rate through the
so called “capitalization effect” (Pissarides 1990, 2000). By contrast, in the model
with embodied technological progress, faster technical change increases long run
unemployment through a “creative destruction effect” (Aghion and Howitt 1994,
1998; Postel-Vinay 2002). The inconsistency between these findings is resolved in
Mortensen and Pissarides (1998) by building up a matching model with embodied
technical progress in which both types of effects, that is “capitalization” and
“creative destruction”, can be obtained depending on “whether new technology
can be introduced into ongoing jobs, or it needs to be embodied in new job
creation” (Pissarides and Vallanti 2007). As a result, whether the overall impact of
productivity growth on unemployment is positive or negative is assumed to depend
upon the relative strength of the “capitalization” and “creative destruction” effects.
What effect is likely to prevail is a question that can be addressed by considering the different time horizons of the "capitalization" and "creative destruction" effects, and their associated effects on job creation and job destruction, respectively. The time
horizon of job creation can be radically different from that of job destruction. Indeed, a firm's time horizon when creating jobs can be very long, and definitely much longer than its horizon when destroying jobs, since job creation involves computing the expected present discounted value of future profits from new jobs. As a consequence, we can expect that the relevance of the capitalization effect relative to the creative destruction effect (and hence the net effect of productivity growth on employment) could differ across time horizons, since the latter effect induces more job destruction and less job creation than the former. In particular,
we should observe a positive relationship between productivity growth and unem-
ployment if the creative destruction effect dominates over the capitalization effect,
and conversely a negative relationship if the capitalization effect dominates.
The empirical evidence provided using wavelet analysis hints that the “creative
destruction” effect dominates over the “capitalization” effect at short- to medium
term scales, whereas the “capitalization” effect dominates at the longest scale. In this
way we can interpret the negative long-run connection between productivity growth
and unemployment as consistent with models where technological progress is purely
disembodied (see Pissarides and Vallanti 2007) or the positive “capitalization effect”
27
Embodied technical change is embedded in (new) capital goods or jobs and can benefit only
jobs that explicitly invest in new technology. By contrast, disembodied technical change is not
tied to any factor of production and can benefit all existing jobs. According to the “capitalization”
effect an increase in growth raises the capitalized value of those returns obtained from creating
jobs, thereby reducing the equilibrium rate of unemployment by increasing the job-finding rate.
The second effect is the creative destruction, according to which an increase in growth raises the
equilibrium level of unemployment both directly, by raising the job-separation rate, and indirectly,
by discouraging the creation of job vacancies.
5 Conclusion
28
These long-run effects may also be based on the sluggishness of real wage adjustments, as suggested by models where wage setting depends on backward-looking reservation wages (Blanchard and Katz 1999). Results compatible with this evidence are reported in Gallegati et al. (2011), where wages do not adjust fully to productivity changes in the long run.
29
Higher productivity growth is often accompanied by structural change wherein "old jobs" are replaced by "new ones", since technology can enhance the demand for new products.
productivity that did not show up in the labor market as higher employment. The US is now viewed as suffering from jobless growth, so the question is whether the low reaction of employment to increases in productivity is a short- or long-run phenomenon. The issue is therefore how productivity affects unemployment at different time horizons. Such relationships, and in particular the medium- and long-run relationships between productivity growth and unemployment, are generally analyzed in the empirical literature by looking at average aggregate data, generally over decades, because from a time series perspective the rate of growth of labor productivity is a very volatile series whose implications for the movements of the other supply-side variables are difficult to interpret, particularly in the short and medium run.30
The key to the empirical results obtained is to examine the empirical relationships on a "scale-by-scale" basis. This is because the result is an empirical issue: the outcome depends at each scale on the elasticity of the response of demand to price, to new products, and/or to products re-engineered around the new technology. The
results in the short and intermediate run indicate that a reduction of employment is
plausible, especially if the elasticity of response of demand to price reductions is
unsubstantial. But the opposite seems to be the case for the long run. However,
even though the sign of the relationship between employment and productivity
may well stay constant over long periods of time, one would expect there to be
large differences in the relative magnitudes of the net response over time caused by
different market and technology conditions.
In this paper, we use wavelets to analyze the productivity-unemployment rela-
tionship over different time frames and demonstrate the usefulness of wavelet
analysis in disentangling the short, medium and long run effects of changes in
productivity growth for unemployment. In a nutshell, we find a strong negative long-run relationship between labor productivity and unemployment, but also a significant positive relationship at lower scales, especially at scales corresponding to business cycle frequency bands. In the medium run, new technology is likely to be labor reducing, and thus to add to unemployment,31 as was visible in Europe during the 1990s. In the long run, however, new technology replacing labor (process innovation) increases productivity, makes firms and the economy more competitive, and may reduce unemployment.32 Finally, our results suggest some
relevant implications concerning the interpretation of search-matching models of
unemployment, Okun’s law, the RBC hypothesis of a positive co-movement of
productivity shocks and employment, and the US employment prospects.
When Thomas More (Utopia, 1516) asserted that sheep were eating men, he was, in the short run, right. Due to agricultural innovations, profits in the primary sector
30
Indeed, the relationship between productivity and the unemployment rate may appear weaker
when we reduce the time period used for aggregating data (see Steindel and Stiroh 2001).
31
A statement like this goes back to David Ricardo, who pointed out that if machinery is substituted for labor, unemployment is likely to increase.
32
This point is made clear in a simple text book illustration by Blanchard (2005).
were rising, less of the labor force was employed in agriculture, and more land was devoted to pasture. People had to "invent" new jobs, i.e. people were stimulated into creating new products that the new technology made possible.
Acknowledgements The paper has been presented at the Workshop on “Frequency domain
research in macroeconomics and finance”, held at the Bank of Finland, Helsinki, 20–21 October
2011. We thank all participants for valuable comments and suggestions, particularly Jouko
Vilmunen and Patrick Crowley.
References
Aghion P, Howitt P (1994) Growth and unemployment. Rev Econ Stud 61:477–94
Aghion P, Howitt P (1998) Endogenous growth theory. MIT Press, Cambridge
Backus DK, Kehoe PJ (1992) International evidence on the historical properties of business cycles. Am Econ Rev 82:864–888
Ball L, Moffitt R (2002) Productivity growth and the Phillips curve. In: Krueger AB, Solow R (ed)
The roaring nineties: Can full employment be sustained? Russell Sage Foundation, New York,
pp 61–90
Basu S, Fernald JG, Kimball MS (2006) Are technology improvement contractionary? Am Econ
Rev 96:1418–1448
Blanchard OJ (2005) Macroeconomics, 4th edn. Prentice Hall, New Jersey
Blanchard OJ, Quah D (1989) The dynamic effects of aggregate demand and supply disturbances.
Am Econ Rev 79:655–673
Blanchard OJ, Katz L (1999) Wage dynamics: reconciling theory and evidence. NBER Working
Paper, No. 6924
Blanchard OJ, Solow R, Wilson BA (1995) Productivity and unemployment. MIT Press,
unpublished
Chen P, Rezai A, Semmler W (2007) Productivity and unemployment in the short and long run. SCEPA Working Paper 2007-8
Cleveland WS (1979) Robust locally weighted regression and smoothing scatterplots. J Am Stat Assoc 74:829–836
Cooley TF (1995) Frontiers of business cycle research. Princeton University Press, Princeton
Crowley PM, Mayes DG (2008) How fused is the euro area core? An evaluation of growth cycle
co-movement and synchronization using wavelet analysis. J Bus Cycle Measur Anal 4:76–114
Daubechies I (1992) Ten lectures on wavelets. CBMS-NSF regional conference series in applied mathematics. SIAM, Philadelphia
Eliasson AC (2001) Detecting equilibrium correction with smoothly time-varying strength. Stud
Nonlinear Dyn Econ 5:Article 2
Engle RF (1974) Band spectrum regression. Int Econ Rev 15:1–11
Engle RF (1978) Testing price equations for stability across spectral frequency bands. Economet-
rica 46:869–881
Fernandez VP (2005) The international CAPM and a wavelet-based decomposition of value at risk.
Stud Nonlinear Dyn Econ 9(4):4
Fox J (2000a) Nonparametric simple regression: smoothing scatterplots. Sage, Thousand Oaks
Fox J (2000b) Multiple and generalized nonparametric regression. Sage, Thousand Oaks
Francis N, Ramey VA (2005) Is the technology-driven real business cycle hypothesis dead? Shocks
and aggregate fluctuations revisited. J Monet Econ 52:1379–1399
Gali J (1999) Technology, employment, and the business cycle: Do technology shocks explain
aggregate fluctuations? Am Econ Rev 89:249–271
Gali J, Rabanal P (2005) Technology shocks and aggregate fluctuations: How well does the RBC
model fit postwar U.S. data? IMF Working Papers 04/234
Gallegati M (2008) Wavelet analysis of stock returns and aggregate economic activity. Comput
Stat Data Anal 52:3061–3074
Gallegati M, Ramsey JB (2013) Structural change and phase variation: A re-examination of the
q-model using wavelet exploratory analysis. Struct Change Econ Dyn 25:60–73
Gallegati M, Gallegati M, Ramsey JB, Semmler W (2011) The US wage Phillips curve across
frequencies and over time. Oxf Bull Econ Stat 73:489–508
Gençay R, Selçuk F, Whitcher B (2001) An introduction to wavelets and other filtering methods in finance and economics. Academic Press, San Diego
Gençay R, Selçuk F, Whitcher B (2005) Multiscale systematic risk. J Int Money Financ 24:55–70
Gençay R, Gradojevic N, Selçuk F, Whitcher B (2010) Asymmetry of information flow between
volatilities across time scales. Quant Financ 10:895–915
Gordon RJ (1997) Is there a trade-off between unemployment and productivity growth? In Snower
D, de la Dehesa G (ed) Unemployment policy: government options for the labor market.
Cambridge University Press, Cambridge, pp 433–463
Gong G, Semmler W (2006) Stochastic dynamic macroeconomics: theory and empirical evidence.
Oxford University Press, New York
Grinsted A, Moore JC, Jevrejeva S (2004) Application of the cross wavelet transform and wavelet
coherence to geophysical time series. Nonlinear Processes Geophys 11:561–566
Hannan EJ (1963) Regression for time series with errors of measurement. Biometrika 50:293–302
Hudgins L, Friehe CA, Mayer ME (1993) Wavelet transforms and atmospheric turbulence. Phys
Rev Lett 71:3279–3282
Keim MJ, Percival DB (2010) Assessing Characteristic Scales Using Wavelets. arXiv:1007.4169
Kim S, In FH (2005) The relationship between stock returns and inflation: new evidence from
wavelet analysis. J Empir Financ 12:435–444
Landes DS (1969) The unbound Prometheus: technological change and industrial development in
Western Europe from 1750 to the present. Cambridge University Press, London
Landmann O (2004) Employment, productivity and output growth. In: World employment report 2004. International Labour Organization, Geneva
Liu Y, Liang XS, Weisberg RH (2007) Rectification of the bias in the wavelet power spectrum. J
Atmos Oceanic Technol 24:2093–2102
Miyamoto H, Takahashi Y (2011) Productivity growth, on-the-job search, and unemployment.
Economics & Management Series 2011–06, IUJ Research Institute
Mortensen DT, Pissarides C (1998) Technological progress, job creation and job destruction. Rev
Econ Dyn 1:733–753
Muscatelli VA, Tirelli P (2001) Unemployment and growth: some empirical evidence from
structural time series models. Appl Econ 33:1083–1088
OECD (2001) Measuring productivity OECD manual. OECD, Paris
Okun A (1962) Potential GNP: Its measurement and significance. In: Proceedings of the business
and economic statistics section, American Statistical Association
Percival DB, Walden AT (2000) Wavelet methods for time series analysis. Cambridge University
Press, Cambridge
Pissarides C (1990) Equilibrium unemployment theory. Blackwell, Oxford
Pissarides C (2000) Equilibrium unemployment theory, 2nd edn. MIT Press, Cambridge
Pissarides CA, Vallanti G (2007) The impact of TFP growth on steady-state unemployment. Int
Econ Rev 48:607–640
Postel-Vinay F (2002) The dynamics of technological unemployment. Int Econ Rev 43:737–760
Ramsey JB (2002) Wavelets in economics and finance: Past and future. Stud Nonlinear Dyn Econ
6:1–29.
Ramsey JB (2010) Wavelets. In: Durlauf SN, Blume LE (ed) The new Palgrave dictionary of
economics. Palgrave Macmillan, Basingstoke, pp 391–398
Ramsey JB, Zhang Z (1995) The analysis of foreign exchange data using waveform dictionaries. J
Empir Financ 4:341–372
Ramsey JB, Zhang Z (1996) The application of waveform dictionaries to stock market index
data. In: Kravtsov YA, Kadtke J (ed) Predictability of complex dynamical systems. Springer,
New York, pp 189–208
Ramsey JB, Lampart C (1998a) The decomposition of economic relationships by time scale using wavelets: money and income. Macroecon Dyn 2:49–71
Ramsey JB, Lampart C (1998b) The decomposition of economic relationships by time scale using wavelets: expenditure and income. Stud Nonlinear Dyn Econ 3:23–42
Ramsey JB, Usikov D, Zaslavsky GM (1995) An analysis of U.S. stock price behavior using
wavelets. Fractals 3:377–389
Ramsey JB, Gallegati M, Gallegati M, Semmler W (2010) Instrumental variables and wavelet
decomposition. Econ Model 27:1498–1513
Schreiber S (2009) Explaining shifts in the unemployment rate with productivity slowdowns and
accelerations: a co-breaking approach. Kiel Working Papers 1505, Kiel Institute for the World
Economy
Silverman B (1999) Wavelets in statistics: beyond the standard assumptions. Phil Trans R Soc
Lond A 357:2459–2473
Solow RM (2000) Towards a macroeconomics of the medium run. J Econ Perspect 14:151–158
Staiger D, Stock JH, Watson MW (2001) Prices, wages and the U.S. NAIRU in the 1990s. NBER
Working Papers no. 8320
Steindel C, Stiroh KJ (2001) Productivity: What Is It, and Why Do We Care About It? Federal
Reserve Bank of New York Working Paper
Stock JH, Watson MW (1999) Business cycle fluctuations in US macroeconomic time series. In:
Taylor JB, Woodford M (ed) Handbook of macroeconomics. North-Holland, Amsterdam
Torrence C, Compo GP (1998) A practical guide to wavelet analysis. Bull Am Meteorol Soc
79:61–78
Tripier F (2006) Sticky prices, fair wages, and the co-movements of unemployment and labor
productivity growth. J Econ Dyn Control 30:2749–2774
The Great Moderation Under the Microscope:
Decomposition of Macroeconomic Cycles in US
and UK Aggregate Demand
Abstract In this paper the relationship between the growth of real GDP compo-
nents is explored in the frequency domain using both static and dynamic wavelet
analysis. This analysis is carried out separately for both the US and the UK using
quarterly data, and the results are found to be substantially different in the two
countries. One of the key findings in this research is that the “great moderation”
shows up only at certain frequencies, and not in all components of real GDP. We use
these results to explain why the incidence of the great moderation has been so patchy
across GDP components, countries and time periods. This also explains why it has
been so hard to detect periods of moderation (or otherwise) reliably in the aggregate
data. We argue it cannot be done without breaking the GDP components down into
their frequency components across time and these results show why: the predictions
of traditional real business cycle theory often appear not to be upheld in the data.
1 Introduction
frequency and time domains, and these techniques are typically referred to as "time-frequency analysis". While time-frequency techniques are not yet part of the standard toolbox for the analysis of time series in economics, these techniques are
standard in other disciplines such as engineering, acoustics, neurological sciences,
physics, geology and environmental sciences.
The contribution contained in this paper uses discrete wavelet analysis to analyse
fluctuations in the components of US and UK growth,1 and to look at the interactions
between the components of US or UK growth over time at different frequencies.
We begin by noting the correlations for the US between the growth rates of the main components of aggregate demand in real GDP, namely personal consumption expenditures, private investment, government expenditures (both current and capital) and exports of goods and services. The data is chained real quarterly data from 1948 to end-2012, and growth rates are calculated as year-over-year changes in the logged values of each component.
Using a basic Fisher correlation test (reported in the tables, with * referring to significance at the 10 % level, ** the 5 % level and *** the 1 % level) for a null hypothesis of zero correlation, only the correlations between C, I and G are significant. Unsurprisingly, the highest correlation between annual changes in components of US growth is between consumption and investment. Government expenditures appear to be negatively related to both consumption and investment, as might be expected due to counter-cyclical fiscal policy. However, although neither of these correlations is that high, both outstrip the contemporaneous correlation of exports with consumption or investment (Table 1).
We now compare these initial correlations with those for the UK in Table 2. The data is from the UK National Statistics Office and is chained real quarterly data from 1955 to the third quarter of 2012.
Once again not all the reported correlations are significant—those between C and
I and X are, but none of the correlations with G are significant. Again, the largest
correlation is between consumption and investment; but the size of the correlation
is lower than for the US. This time the correlation between C and G is positive if
small, indicating a weak pro-cyclical (near a-cyclical) use of government spending,
whereas that between G and I is negative. The correlations between X and C and
I are all positive, with quite high correlation between X and I in particular. For the
UK the correlation between X and G is small, insignificant and negative (Table 3).
We now repeat the same exercise, but for the 1987–2007 period, which corre-
sponds to the period referred to as the “great moderation”, and we do this first for
the US:
1
An analysis of fluctuations in real GNP itself has already been undertaken in Crowley (2010).
Table 1 Correlation of US GDP components
      C     I          G           X
C     1     0.629***   -0.123**    0.073
I           1          -0.220***   0.104*
G                      1           0.000
X                                  1
Table 2 Correlation of UK GDP components
      C     I          G           X
C     1     0.589***   0.092       0.149**
I           1          -0.167**    0.309***
G                      1           -0.090
X                                  1
Table 3 Correlation of US GDP components: 1997–2007
      C     I          G           X
C     1     0.664***   0.050       0.160
I           1          0.371**     0.551***
G                      1           -0.617***
X                                  1
Table 4 Correlation of UK GDP components: 1997–2007
      C     I          G           X
C     1     -0.158     0.148       0.024
I           1          -0.376**    0.056
G                      1           0.057
X                                  1
Apart from the high correlations of C and I, these results are surprising.
Correlations between X and all other variables are much higher in this subperiod
than for the entire period, with a high correlation between I and X, and a large and
significant negative correlation between X and G.
In this table for the UK, we also get markedly different results, but in this case
the results are even more surprising. The correlations between C and I, G and X are
now not significant, which is a markedly different result from the correlations for
the whole dataset. The correlation between I and G is negative and is now higher
and significant (Table 4).
Taken together this set of four tables of correlations implies that there is a shifting
contemporaneous relationship between the components of GDP for both the US and
the UK, which likely has different underlying dynamics for each of the two countries
concerned. It also highlights the different dynamics that were in play during the
great moderation. The reason why this is the case is unclear, but it clearly merits
further investigation. To start with, not only do these simple statistics show that
many of these correlations are not significant, but they also ignore two important
considerations: (1) that lead or lag relationships may exist between components of
GDP which may change our interpretation of the facts (for example, two perfectly correlated variables that are out of phase by half a cycle will show a correlation of -1); and (2) that (possibly variable) cycle relationships might be significant between
the constituent components of GDP, which are only weakly related at other non-
business cycle lengths.
Clearly simple correlation coefficients are not going to reveal the size or causal
direction of these relationships, and more appropriate frequency domain tools are
required to explore if any “hidden” relationships exist. Two obvious examples will
make the point: (a) two perfectly correlated data series out of phase will yield
contemporaneous correlations close to zero or negative; contrast the correlations
between C and G which should be in phase, but negative if there is any smoothing,
with correlations between C and I which are likely to be out of phase but positively
correlated if they are driven by a common cycle. And (b) how strong should we
expect the C, I correlations to be? Since C will be subject to short business cycles,
and I to business and longer investment cycles, there is likely to be some (positive)
correlation—but not that strong, unless I’s cycle length is a multiple of that for C.
Perhaps this is the reason why we observe a negative and insignificant relationship
between C and I in the “great moderation” subperiod for the UK.
To address these considerations we use discrete wavelet analysis to analyze the
relationship between the components of GDP, in both the US and UK economies.
3.1 Rationale
The rationale for looking at cyclical interactions between the major components of
output is twofold:
(a) there are obviously some interactions between the components that occur
through the business cycles—notably between consumption and investment
through inventories and government policies, and between consumption and
exports through the international transmission of business cycles. These inter-
actions have important policy implications; and
(b) the real business cycle literature focused on these interactions as justification for
technology “shocks” driving fluctuations in the economy and hence the business
cycle. A deeper understanding of the interaction between the GDP components
may better inform model-building in terms of modelling the transmission of
fluctuations or shocks between spending units in the macro-economy.
The latter concern is particularly relevant here. In King et al. (1988) it was first noted that real business cycle models do not reproduce the same variability in the components of output, notably investment and consumption, and much effort has been expended in this literature in attempts to construct models that exhibit the same degree of co-movement in investment and consumption over time (see Christiano and Fitzgerald 1998; Rebelo 2005). One solution to this has been explored in models with investment-specific technology shocks.2
3.2 Data
The data used is quarterly chain-weighted real GDP data and its major components. The US data was sourced from the Bureau of Economic Analysis for 1947Q1 to 2012Q4 (giving 260 datapoints), and was transformed by logging the source data and then taking annual differences. The UK data was sourced from the National Statistics Office for 1954Q1 to 2012Q3 (giving 233 datapoints) and is transformed in the same manner. Figure 1 plots the data for the US while Fig. 2 does the same for the UK.3
It should be noted that in the recent downturn government spending is still rising,
while all the other components of aggregate demand have clearly been falling.
Discrete wavelet analysis uses wavelet filters to extract cycles at different frequen-
cies from the data. It uses a given discrete function which is passed through the
series and “convolved”4 with the data to yield a coefficient, otherwise known as a
“crystal”. In the basic approach (the discrete wavelet transform or DWT) these data
points or crystals will be increasingly sparse for lower frequency (long) cycles if the
2
These are shocks from new investment which contains new technology rather than investment that
either replaces depreciated equipment or just adds to the stock of existing capital without upgrading
the technology.
3
Note that the vertical axes are scaled differently for each component.
4
In mathematics and, in particular, functional analysis, convolution is a mathematical operation on
two functions f and g, producing a third function that is typically viewed as a modified version
of one of the original functions. Convolution is similar to cross-correlation. It has applications
that include statistics, computer vision, image and signal processing, electrical engineering, and
differential equations.
[Fig. 1: annual log changes in US real GDP and its components (US C, US I, US G, US X), 1950–2010]
[Fig. 2: annual log changes in UK real GDP and its components (UK C, UK I, UK G, UK X), 1960–2010]
wavelet function is applied to the series over consecutive data spans.5 So another
way of obtaining crystals corresponding to all data points in each frequency range
is to pass the wavelet function down the series by data observation,6 rather than
moving the whole wavelet function down the series to cover a completely new
data span. This is the basis of the maximal overlap discrete wavelet transform
(MODWT), and is the technique used here.
As shown in Bruce and Gao (1996), the wavelet coefficients can be approximated by the integrals for the father and mother wavelets as:

s_{J,k} \approx \int x(t)\, \phi_{J,k}(t)\, dt    (1)

d_{j,k} \approx \int x(t)\, \psi_{j,k}(t)\, dt    (2)

where the basis functions \phi_{J,k}(t) and \psi_{j,k}(t) are assumed to be orthogonal. The multiresolution decomposition (MRD) of the variable or signal x(t) is then defined by the set of "crystals" or coefficients:

\{ s_J, d_J, d_{J-1}, \dots, d_1 \}    (4)
The interpretation of the MRD using the DWT is of interest as it relates to the
frequency at which activity in the time series occurs.7 For example with a quarterly
time series Table 5 shows the frequencies captured by each scale crystal.
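The mapping is dyadic: detail crystal dj captures cycles of roughly 2^j to 2^(j+1) quarters. A short sketch in R reproduces the correspondence implied throughout the text (d1: 6 months to 1 year, d3–d4: business cycle lengths, and so on):

# Scale-to-period mapping for quarterly data (dyadic frequency bands)
j <- 1:6
data.frame(crystal  = paste0("d", j),
           quarters = paste(2^j, "to", 2^(j + 1)),
           years    = paste(2^j / 4, "to", 2^(j + 1) / 4))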
Note that as quarterly data is used in the present study, to capture the conventional
business cycle length scale, crystals need to be obtained for five scales. This requires
at least 64 observations. But to properly resolve at the lowest frequency it would help
to have 128 observations, and as we have at least 214 observations for all 8 series this
5
But given that we seek the same resolution of cycles at different frequencies, this is still the most
efficient way to estimate the crystals.
6
Given the previous footnote, it is obvious that by doing this, it will lead to “redundancy” as the
wavelet coefficients have already been combined with most of the same datapoints.
7
One of the issues with spectral time-frequency analysis is the Heisenberg uncertainty principle, which states that the more certainty is attached to the measurement of one dimension (frequency, for example), the less certainty can be attached to the other dimension (here, the time location).
is easily accomplished. Hence we can use six crystals here even though resolution
for the d6 crystal is not high. It should be noted that if conventional business cycles
are usually assumed to range from 12 quarters (3 years) to 32 quarters (8 years),
then crystal d4 together with the d3 crystal should contain the business cycle.
The variance decomposition for all series considered in this paper is calculated using:

E_j^d = \frac{1}{E^d} \sum_{k=1}^{n} d_{j,k}^2    (5)

where E^d = \sum_j \sum_{k=1}^{n} d_{j,k}^2 represents the total energy or variance in the detail crystals, so that E_j^d gives the share of energy captured by detail crystal j.
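A minimal sketch of this calculation in R, assuming the waveslim package; x is a hypothetical quarterly growth series, and brick.wall() blanks out the boundary-affected coefficients before the energies are computed:

# Energy (variance) share of each detail crystal, as in Eq. (5)
library(waveslim)
w  <- brick.wall(modwt(x, wf = "la8", n.levels = 6), wf = "la8")
Ej <- sapply(w[paste0("d", 1:6)], function(d) sum(d^2, na.rm = TRUE))
round(100 * Ej / sum(Ej), 2)   # percentage of energy by crystal d1..d6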
Although extremely popular due to its intuitive approach, the DWT suffers from two drawbacks: the dyadic length requirement for the series to be transformed, and the fact that the DWT is not shift invariant (so if datapoints from the beginning of the series are put aside, the lower frequencies will yield crystals with completely different values). In order to address these two drawbacks, as noted above, we use the maximal-overlap DWT (MODWT)8 in this study. The MODWT
was originally introduced by Shensa (1992) and a phase-corrected version was
added and found superior to other methods of frequency decomposition9 by Walden
and Cristan (1998). The MODWT gives up the orthogonality property of the DWT
to gain other features, given in Percival and Mofjeld (1997), such as the ability to
handle any sample size regardless of whether the series is dyadic or not, increased
resolution at coarser scales as the MODWT oversamples the data, translation-
8
As Percival and Walden (2000) note, the MODWT is also commonly referred to by various other
names in the wavelet literature such as non-decimated DWT, time-invariant DWT, undecimated
DWT, translation-invariant DWT and stationary DWT. The term “maximal overlap” comes from
its relationship with the literature on the Allan variance (the variation of time-keeping by atomic
clocks)—see Greenhall (1991).
9
The MODWT was found superior to both the cosine packet transform and the short-time Fourier
transform.
invariance, and a more asymptotically efficient wavelet variance estimator than the DWT.
Both Gençay et al. (2001) and Percival and Walden (2000) give a thorough and
accessible description of the MODWT using matrix algebra. Crowley (2007) also
provides an “intuitive” introduction to wavelets, written specifically for economists,
and references the (limited) contributions made by economists using discrete
wavelet analysis.10 The first real usage of wavelet analysis in economics was pio-
neered by James Ramsey (Ramsey and Lampart 1998), and the first application of
wavelets to economic growth (in the form of industrial production) was by Gallegati
and Gallegati (2007) and in the form of GDP in a working paper by Crowley and
Lee (2005) and then more recently in a published article by Yogo (2008). There
are now a few articles that have been published in macroeconomics using wavelet
methods in economics, most notably Crowley (2010), Aguiar-Conraria and Soares
(2011), Aguiar-Conraria et al. (2012), and Gallegati et al. (2011).
In this section and the next we review the output from the MODWT for both US
and UK real GDP and their aggregate demand components. We first review the US
results.
4.1 US Results
The plots for the US in Fig. 3 show the phase-adjusted crystals for each of the
frequency bands contained in the detail crystals (or frequency-resolved series)
d1–d6, plus the smoothed trend residual from the series (often referred to as the
"smooth"), s6, which is obtained after extracting the fluctuations corresponding
to the detail crystals. The most obvious observation is that the “great moderation”
clearly appears in the data from 1983 through to around 2007; but most noticeably in
the d1, d2 and d3 crystals (i.e. for cycles between 6 months and 4 years periodicity),
and less obviously in the 4–8 year cycle (d4 crystal) and not at all in the 8–16 year
cycle (d5 crystal). There also appears to be the possibility of a longer 30-year cycle
in the data, which appears here in s6, the smooth.11 Note that these observations
could not be made using a traditional time series analysis approach: the "great moderation", for all its appeal at the time, appears not to have been a systematic or permanent phenomenon. It is also noteworthy that in the current recovery, cycles at different frequencies are not necessarily concordant—d2 and d4 are falling while d3 and d5 are rising. This interaction of cyclical activity likely gives rise to the uneven pace of US economic growth.
10
These can also be accessed online at: http://faculty.tamucc.edu/pcrowley/Research/frequency_domain_economics.html.
11
This also appears in GNP data as shown in Crowley (2010).
[Fig. 3: phase-adjusted MODWT crystals d1–d6 and smooth s6 for US real GDP growth]
[Fig. 4: variance decomposition of US real GDP growth by crystal — d2: 2.79 %, d3: 12.95 %, d4: 37.01 %, d5: 26.08 %, d6: 14.79 %, other: 6.38 %]
Figure 4 shows the variance decomposition by crystal over the entire data span.
Clearly the strongest cycle is contained in crystal d3 (representing 2–4 year cycles),
with d4 (representing 4–8 year cycles) following close behind; then d2 (1–2 years)
and d5 (8–16 year cycles) contain roughly the same amount of energy. As noted
before though, the amount of volatility in any given crystal can change over time.
[Fig. 5: phase-adjusted MODWT crystals d1–d6 and smooth s6 for US consumption growth]
So during the “great moderation”, crystals d4 and d5 (4–16 year cycles) appear to
dominate fluctuations in growth, but not necessarily during other periods. Hence the
great moderation in fact appears to have been a phenomenon in which volatility was
shifted from short and business cycle lengths, to the longer cycles (up to 16 years
in length). This would certainly explain the observation that recessions or economic
slowdowns now appear to take place every 10–15 years, but the periods between are
more stable than they used to be.
As might be expected, the MODWT plot in Fig. 5 for consumption expenditures
shows relatively similar cyclical patterns to overall GDP, with a clear fall in volatility
after 1983 in crystal d3 (2–4 year cycles) but less so for d1, d2 or d4. This is
also reflected in the variance decomposition plot in Fig. 6 where there is now more
volatility in longer cycles, relative to the shorter cycles, reflecting the success of
consumption smoothing over time. As with the moderation in GDP volatility, this
fall in volatility after 1983 clearly shows the smoothing power of the strict monetary controls introduced by the Volcker regime at the Fed. The more recent movements in US consumption are interesting, with shorter cycles up to an 8-year frequency showing a downturn in consumption, but all longer cycles showing an upturn.
Figure 7 shows the MODWT plot for US private investment, and it is clearly
apparent that the “great moderation” for investment spending took place after
around 1987, that is later than in consumption; and again this was mostly confined
to fluctuations in d2 and d3 crystals, but does not appear in d4 and d5, and hardly at
all in d1. In terms of overall energy, the variance decomposition plots in Fig. 8 show
that most energy lies in crystal d3 (2–4 year cycles), with both d2 (1–2 year cycles)
and d4 (4–8 year cycles) also containing some cyclical activity. In d2, this mostly
occurred towards the beginning of the time series, whereas in d4 this appears to
have been more consistent through time and likely relates to the business cycle. This
finding clearly highlights the rich dynamics at play within the components of output.
It shows that the great moderation started at different points within the components
of GDP, and this observation would be missed if using only total GDP to measure
[Fig. 6: variance decomposition of US consumption growth by crystal — d1: 5.21 %, d2: 9.58 %, d3: 32.89 %, d4: 27.3 %, d5: 14.97 %, d6: 10.05 %]
[Fig. 7: phase-adjusted MODWT crystals d1–d6 and smooth s6 for US private investment growth]
the onset of lower volatility in output. When looking at more recent trends, d1, d2
and d4 are turning downwards, and d3, d5 and d6 are all turning up. Also, the d5 crystal appears to be becoming more volatile through time.
Government expenditures, since they contain both automatic stabilizers and, for more severe recessions, discretionary spending programs, should display some cyclical activity at business cycle frequencies. However, Fig. 9 shows that apart from the very beginning of the series there is relatively little cyclicality in this series, and where there is, it clearly lies at around the business cycle in crystals d3, d4 and
d5 (2–16 year cycles).
[Variance decomposition by scale: d1: 4.98%, d2: 17.08%, d3: 45.24%, d4: 23.03%, d5: 7.34%, other: 2.34%]
Compared to the other components of GDP the volatility
in the crystals of government spending is extremely weak, signifying the relatively
minor movements in government expenditures compared to private sector activity.
Interestingly, there is also virtually no energy at short-term horizons (6 month to
1 year cycles), and activity in other crystals dies down to only small fluctuations
after the mid-1970s, indicating that discretionary fiscal policies had largely been
abandoned as an instrument of demand management at that point.
[Variance decomposition by scale: d2: 1.45%, d3: 7.21%, d4: 25%, d5: 29.93%, d6: 27.07%, other: 9.33%]
These results are also to be seen in the variance decomposition by scale which
is shown in Fig. 10. Here crystals d4 and d5 have the highest variance. These
results also help answer an old debate on whether fiscal policies have been anti-
cyclical (stabilizing) or pro-cyclical (destabilizing). In the US, there is little cyclical
movement in government spending at any frequency after 1960 which suggests
it has largely been a-cyclical in practice. That means the US did not succeed in
stabilizing her economy through fiscal policy (or possibly hasn’t tried), but she
hasn’t made it worse either, as some claim. It is also worth noting that Table 1
and the text which follows indicate that G has been a better or more effective shock
absorber (stabilizer) than the export markets.
The MODWT plot shown in Fig. 11 for exports is rather surprising. It shows
a clear reduction in volatility for crystal d1 from around the mid-1970s with a
reduction in volatility in crystal d2 in roughly 1983, followed by reductions in
volatility in d3 and d4 in the late 1980s. Surprisingly, volatility then picks up again
for crystals d2, d3, d4 and d5 in the late 1990s and continues into the 2000s. This
is not matched in the d1 crystal, which shows hardly any short-term movements in
exports. Figure 12 shows that most of the energy in the series resides in crystals d3
and d4, with cycle frequencies between 2 and 8 years, corresponding to the business
cycle.
These last results require some explanation, but offer an important insight
into the vexed issue of whether the exchange rate acts as a shock absorber, or
equilibrating device, to offset various external or internal imbalances; or whether
it is an additional source of uncertainty in itself. Most business leaders and many
[Variance decomposition by scale: d1: 8.31%, d2: 15.73%, d3: 35.3%, d4: 26.49%, d5: 9.43%, d6: 4.74%]
later on. That shows the more market sensitive monetary policies of the 1980s
were used to stabilize the economy; but that this in turn affected the exchange rate,
converting it into a shock absorber and stabilizing exports at the same time. In effect,
the exchange rate becomes volatile and exports stable at those frequencies. But the
pattern changes in the mid-1990s. At that point the volatility in US exports begins
to pick up at business cycle frequencies. The explanation is that by the late-1990s
and into the 2000s, US monetary policy had become more activist in pursuit of low
and stable inflation, de facto inflation targeting.
The result of course was a more stable exchange rate in this period, and hence
rising export volatility, as can be seen in our decomposed cyclical data—except at
short cycles, reflecting the somewhat greater short run monetary policy activity. To
the extent that an inflation targeting regime depends on interest rates as a policy
instrument, it should lead to increased volatility in investment and consumption in
the same period. And it does—as can be seen in Figs. 5 and 7, principally at cycle
lengths d3 and d4, though the increases appear to be fairly small (not surprisingly
since other factors are also involved, and neither variable experiences volatility
increases back to the pre-1985 levels). The increases in the export volatility are,
by contrast, rather larger as we might expect—but again not up to the pre-1985
levels, which suggests that business cycles have become more synchronized across
countries than they were.
These results help resolve the controversy: the US exchange rate has acted as
a shock absorber more than a source of uncertainty. The US, being a relatively
flexible economy, has adjusted as required to remove or balance off external or
internal imbalances against each other. When it moved to targeting inflation, some
export volatility returned but with a persistent trade deficit since the easiest way to
keep inflation low is to let the exchange rate appreciate. The second conclusion is
therefore that what often passes for exchange rate uncertainty is in fact fluctuations
in the variables that underlie the exchange rate, not random shocks in the exchange rate itself.
To summarize, it is clear that the “great moderation”, although discernible
in GDP growth data for the US, is more apparent at various frequencies and
in various components of GDP than in others. Nor does it represent some kind
of long term paradigm shift. The timing and dynamics that lead to the “great
moderation” do not translate directly back to the components of GDP growth.
Consumption and investment appear to be the sources for the “great moderation”,
with consumption volatility moderation occurring in the early 1980s and investment
volatility moderation occurring in the later part of the 1980s. Changes in government
expenditure and export expenditures do not appear to be major sources of the origins
of lower volatility in real GDP growth. Lower volatility is also therefore not a result
of government stabilisation policies. Instead monetary policy, with effects on the
exchange rate, must be the culprit because the residuals (s6) and short term shocks
(d1) play little or no role in these moderations after the mid-1950s. These are all
features that cannot be detected from aggregate data on output, or with traditional
time series analysis.
4.2 UK Results
The same exercise is now repeated for the UK. In Fig. 13 we observe the same
patterns for UK GDP as in US GDP, with crystals d1, d2 and d3 exhibiting lower
volatility after the era of the miners' and other strikes in 1984–1985 and after the
Thatcher policies took hold, but with d4 exhibiting slightly lower volatility and d5
and d6 hardly changing. The longer residual cycle is once again weak, and has
a periodicity of approximately 35 years. Figure 14 once again shows that most
of the variance resides in d3, d4 and d5 (2–16 year periodicities), with d4 (4–8
year cycles) containing most energy. However, compared with the US, the volatility
is more evenly spread across cycles. It is also evident that d1–d3 show the great
moderation like the US, while d4 and d5 actually get less stable in the moderation
period. This again suggests a mechanism that shifts short run instability to long
term instability. Most recently, the double-dip downturn in the UK can be seen quite
clearly in d1–d3, whereas d5 and d6 (cycles over 8 years in length) point to a longer
term recovery.
In Figs. 15 and 16 the MODWT and the variance decomposition by scale are
shown for UK consumption growth. In Fig. 15 the “great moderation” is evident
from 1983 in d1, but doesn’t occur until roughly 1991 in d2 and d3, and not until
1995 in d4. In terms of volatility, d4 and d5 clearly have most energy and, although
d4 has been less volatile until the recent downturn, d5 has not. A new cycle also appears to have emerged since the mid-1970s in the d6 crystal with roughly a 16 year periodicity. There appears to be little cyclicality beyond this
frequency. Once again the volatility is spread across a wider range of frequencies
compared to the US. As with the GDP data, these results show a much richer and
more complex set of dynamics than could be captured by traditional real business
cycle models.
[Variance decomposition by scale: d1: 6.62%, d2: 9.02%, d3: 25.8%, d4: 28.95%, d5: 15.81%, d6: 13.81%]
[Variance decomposition by scale: d1: 8.17%, d2: 9.08%, d3: 19.23%, d4: 24.28%, d5: 22.97%, d6: 16.27%]
A reasonable question is, why does the UK show more volatility in investment
spending than the US—especially at frequencies shorter and longer than business
cycles? It will be observed from Fig. 17 that this higher volatility is mostly in the
boom years of the mid-1980s and late 1990s, and is largely restricted to d2–d5.
In addition, this extra volatility does not show up (proportionately) in the other components of UK GDP, nor is there any excess volatility in the residuals or short cycles; and investment itself is less well correlated with C and G but better coordinated with UK exports. We can therefore conclude that the extra
investment volatility is due to the UK's successful record of attracting FDI in those boom periods.
[Variance decomposition by scale: d1: 7.83%, d2: 10.39%, d3: 27.32%, d4: 25.23%, d5: 16.56%, d6: 12.66%]
With the log annual change in UK government expenditures, there is also much
more volatility than with the US measure, as Fig. 19 shows. Here there appears
to have been a dampening of volatility in d1 beginning only in the mid-1990s.
And while for d2 very little change has occurred, for d3 and d4 a dampening of
volatility appears to have taken place around 1982, a time when monetary policy
moved away from monetarism and fiscal policy started to be more closely managed.
[Variance decomposition by scale (partially recoverable): d1: 15.88%, d2: 12.05%, d3: 18.61%, d5: 14.52%]
There seem to be cycles operating at lower frequencies as well, with a very irregular
cycle captured by the d5 crystal and rather strange semi-cyclical movements in the
d6 crystal, which almost certainly means that the UK moderation has been achieved
by policy actions not by a smoother operating economy. The implication therefore
is that the better and smoother performance of the UK economy in the Thatcher and
Blair years was held together by policy actions, rather than by favourable market
and institutional reforms that promote smoother running markets. The contrast with
the US post-1970 for any cycle is instructive. Further, there are no obvious breaks
in behaviour (except possibly d5 and d6 after the 1970s). It is also noteworthy that
recent fiscal austerity measures in the UK can be seen at all frequencies except the 4–8 year cycles.
Figure 20 shows that most energy resides in the d4 crystal, but what is surprising
here is that a significant amount of movement is found in d1, which contains cycles
of 6 months to 1 year duration. Here the volatility is fairly evenly distributed across
different cycles with noise less important than business cycles. Hence automatic
stabilisers must have been at work. There is also no obvious shift in weight from
short to long run, so it is difficult to see a distinction between discretionary policy
vs automatic stabilizers.12
12
Separating automatic from discretionary fiscal policies in a cyclical environment is not an easy
matter. Bernoth et al. (2013) review different methods, and show how it can be done by combining
real time and ex-post data.
[Variance decomposition by scale: d1: 19.52%, d2: 19.33%, d3: 34.33%, d4: 14.95%, d5: 6%, d6: 5.88%]
Much smaller fluctuations are observed in d1–d3 (and to some extent in d4)
after 1983. But by 2005 the volatility in export expenditures had clearly returned.
At that point d5 appears to suggest that a regular 10 year cycle has emerged and d6
suggests a weak cycle at roughly a 27 year periodicity. So what is notable here is the post-1980 moderation in short-run cycles (noise, up to d3), a moderation that was lost again by 2004. The explanation for this result is the same as in the US.
During the 1980s the UK became a convinced exchange rate floater, which meant
the exchange rate became the shock absorber that lowered the volatility of exports
(at least in the shorter cycles). But, after 1994 and the unsuccessful EMS period, she adopted explicit inflation targets, which led to smoother monetary policies and
a (mostly) smoother exchange rate path, and with it higher export volatilities, once
the new monetary regime had settled down. However these results also show that
there is no case for saying that fixing the exchange rate stabilizes the economy, at
least for the UK. Significant regime changes, like fixing the exchange rate in 1990–
1992, or the EMS crisis which seriously unfixed them again, do not destabilize
the economy or exports. Those events just do not show up in the data. Figure 21
shows that higher frequency cycles (with periodicity less than 4 years) dominate the
variance decomposition in this case—once again, a quite different result from that
observed in the US—suggesting that the great moderation was transitory in the UK,
and largely took the form of shifting short-term volatility (d3 and lower) to longer
term cycles (d4 and higher). These changes are clearly seen in Fig. 21.
Once again, the conclusion is that the exchange rate has acted as a shock absorber
rather than as a source of uncertainty—albeit a little less successfully than in the US
because the UK economy is less flexible and is less well stabilized. Figure 22 shows
that d3 is the most important cycle in UK exports, but has a variance of only 0.2
(or 30 % of total variance in exports), compared to 0.65 or 34 % for the US.
5 Conclusions
In this paper we have used wavelet transformations to decompose the separate parts
of domestic expenditures which make up real GDP for the US and the UK into
their component cycles. The first finding is that decomposing the components of
real GDP growth separately into different cycles reveals characteristics of the cycles
in growth, and the relationships between them, that cannot be seen in an analysis
of the aggregate data for real GDP alone. That is because the cycles of the various
components offset each other to a degree, leading to a loss of information at the
aggregate level. The second finding is that although the “great moderation” is found
in most of the data, it is not consistent across different frequency cycle lengths,
appearing only in cycles generally shorter than or equal to the business cycle. This
is an important finding, as it demonstrates (as we now know) that the so-called
“great moderation” was not as significant as economists had thought, given that the
business cycle still was evident, and that longer cycles did not abate in strength at
all. The “great moderation” in fact appears to have been more a case of shifting
short term volatility to long cycle volatility, than moderating volatility as such. This
means that changes in volatility, like the “great moderation”, will be very difficult to
detect with any certainty without a full frequency decomposition of the components
of GDP.
In terms of the comparison between the US and the UK, the volatility in
components at the specified frequency ranges in discrete wavelet analysis is
markedly different. The analysis shows that there is much more volatility in GDP
components at very short and longer frequencies for the UK than there is in the
US. This is particularly the case for government expenditures, where activist fiscal
policies have clearly had a much greater impact than in the US. There has also been
a tendency for volatility to have been shifted from shorter cycles to longer cycles,
more in the UK than the US. This we put down to the changes in monetary regimes,
and hence exchange rate arrangements, which focussed first on stabilisation and
then on explicit or implicit inflation targeting. Fiscal policies, by contrast, have
largely been acyclical or ineffective for stabilisation in the US; but pro-cyclical and
destabilising in the UK.
Acknowledgements This research was completed while Crowley was visiting the School of
Public Policy at George Mason University in Fairfax, VA, USA during the fall of 2009. Thanks are due to Dean Kingsley Haynes for hosting Crowley at George Mason University in 2009, and
Texas A&M University - Corpus Christi is acknowledged for providing faculty development leave
funding.
References
Aguiar-Conraria L, Soares M (2011) Business cycle synchronization and the euro: a wavelet
analysis. J Macroecon 33(3):477–489
Aguiar-Conraria L, Martins M, Soares M (2012) The yield curve and the macro-economy across
time and frequencies. J Econ Dyn Control 36(12):1950–1970
Bernoth K, Hughes-Hallett A, Lewis J (2013) The cyclicality of automatic and discretionary fiscal
policy: what can real time data tell us? Macroecon Dyn 5: 1–23
Bruce A, Gao HY (1996) Applied wavelet analysis with S-PLUS. Springer, New York
Christiano L, Fitzgerald T (1998) The business cycle: it’s still a puzzle. Fed Reserve Bank Chic
Econ Perspect 22:56–83
Crowley P (2007) A guide to wavelets for economists. J Econ Surv 21(2):207–267
Crowley P (2010) Long cycles in growth: explorations using new frequency domain techniques
with US data. Bank of Finland Discussion Paper 6/2010, Helsinki
Crowley P, Lee J (2005) Decomposing the co-movement of the business cycle: a time-frequency
analysis of growth cycles in the euro area. Bank of Finland Discussion Paper 12/05, Helsinki
Furlanetto F, Seneca M (2010) Investment-specific technology shocks and consumption. Norges
Bank Discussion Paper 30-2010, Oslo
Gallegati M, Gallegati M (2007) Wavelet variance analysis of output in G-7 countries. Stud
Nonlinear Dyn Econ 11(3):1435–1455
Gallegati M, Gallegati M, Ramsey J, Semmler W (2011) The US wage Phillips curve across
frequencies and over time. Oxf Bull Econ Stat 73:489–508
Gençay R, Selçuk F, Whicher B (2001) An introduction to wavelets and other filtering methods in
finance and economics. Academic, San Diego
Greenhall C (1991) Recipes for degrees of freedom of frequency stability estimators. IEEE Trans
Instrum Meas 40:994–999
King R, Plosser C, Rebelo S (1988) Production, growth and business cycles I: the basic neoclassical
model. J Monet Econ 21:195–232
Lampart C, Ramsey J (1998) Decomposition of economic relationships by timescale using
wavelets. Macroecon Dyn 2(1):49–71
Percival D, Mofjeld H (1997) Analysis of subtidal coastal sea level fluctuations using wavelets. J
Am Stat Assoc 92:868–80
Percival D, Walden A (2000) Wavelet methods for time series analysis. Cambridge University
Press, Cambridge
Rebelo S (2005) Real business cycle models: past, present and future. Scand J Econ 107(2):217–
238
Shensa M (1992) The discrete wavelet transform: wedding the à trous and Mallat algorithms. IEEE
Trans Signal Process 40:2464–2482
Walden A, Cristan C (1998) The phase-corrected undecimated discrete wavelet packet transform
and its application to interpreting the timing of events. Proc R Soc Lond Math Phys Eng Sci
454(1976):2243–2266
Yogo M (2008) Measuring business cycles: a wavelet analysis of economic time series. Econ Lett
100(2):208–212
Nonlinear Dynamics and Wavelets for Business
Cycle Analysis
1 Introduction
1
The economic indicator used in the work is the US Industrial Production Index. We remark that
our methodology can easily be applied to any known business cycle indicator.
method with both iAAFT and wavelet-based surrogates. We then use the DVV
method to detect the nonlinear nature of the economic indicator using the values of
the optimal embedding parameters obtained via the differential entropy method with
surrogates. Finally, we perform wavelet analysis with a complex Morlet wavelet
using a continuous wavelet transform to discover patterns or hidden information
that cannot be captured with traditional methods of business cycle analysis such as
spectral analysis. Our results are consistent with business cycle dates published by
the National Bureau of Economic Research (NBER). We are able to detect these
business cycle dates and study these fluctuations in the economy over frequency
and time. This serves as an important finding in terms of forecasting and pattern
recognition.
The paper is organized as follows: The concept of wavelet analysis and our
choice of analyzing wavelet are presented in Sect. 2.1. Surrogate generation method-
ology and differential entropy based method for determining optimal embedding
parameters of the phase-space representation of time series are then presented
in Sects. 2.2 and 2.3 respectively. Lastly, we present, in Sect. 2.4, an overview
of the “delay vector variance” method with illustrations. In Sect. 3, we present
a comprehensive analysis of the feasibility of this approach to analyze the US
business cycle. Section 4 concludes.
This means that the wavelet has no zero-frequency component. The value of the admissibility constant, $C_{adm,\psi}$ or $C^{+}_{adm,\psi}$, depends on the chosen wavelet. This property allows for an effective localization in both time and frequency, contrary to the Fourier transform, which decomposes the signal in terms of sines and cosines, i.e. infinite duration waves.
There are essentially two distinct classes of wavelet transforms: the continuous
wavelet transform and the discrete wavelet transform. We refer the reader to
Addison (2005) and Percival and Walden (2000) for a review of wavelet transforms.
In this work, we employ a complex wavelet via a continuous wavelet transform
(CWT) in order to separate the phase and amplitude information, because the phase
information will be useful in detecting and explaining the cycles in the data. We
provide in Appendix “Continuous Wavelet Transform (CWT)” an overview of CWT
and its relevance to our work.
The Morlet wavelet is the most popular complex wavelet used in practice. A complex Morlet wavelet (Teolis 1998) is defined by
$$\psi(t) = \frac{1}{\sqrt{\pi f_b}}\, e^{i 2\pi f_c t}\, e^{-t^{2}/f_b} \qquad (3)$$
It can be shown that the Morlet wavelet (3) is a modulated Gaussian function and involutive, i.e. $\psi = \tilde{\psi}$. The Fourier transform $\hat{\psi}$ has a maximum value of 1, which occurs at $f_c$, since $\|\psi\|_1 := \int |\psi| = 1$. This wavelet has an optimal joint time-frequency concentration, since it has exponential decay in both the time and frequency domains, meaning that it is the wavelet which provides the best possible compromise in these two dimensions. In addition, it is infinitely regular, complex-valued, and yields an exact reconstruction of the signal after the decomposition via CWT.
In this work, the wavelet that best detects the US business cycle is the complex Morlet wavelet with $f_b = 1$ and $f_c = 0.5$. In this case, the Morlet wavelet becomes
$$\psi(t) = \frac{1}{\sqrt{\pi}}\, e^{i\pi t}\, e^{-t^{2}}, \qquad (4)$$
which we will often refer to simply as the Morlet wavelet. The nature of our choice of wavelet function and the associated center frequency is displayed in Fig. 1. It illustrates the oscillating nature of the wavelet and the short duration of its time support. In other words, the wavelet is bounded, centered around the origin, and has effectively finite time support (and, respectively, frequency support).
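To make the parameterization concrete, the following minimal sketch builds the wavelet of Eq. (4) with PyWavelets, whose 'cmorB-C' naming convention takes B as the bandwidth $f_b$ and C as the center frequency $f_c$; this is an illustrative check, not code from the chapter.

```python
import numpy as np
import pywt

wav = pywt.ContinuousWavelet('cmor1.0-0.5')  # f_b = 1, f_c = 0.5, as in Eq. (4)
psi, t = wav.wavefun(precision=10)           # sampled wavelet and its time grid

# compare with the closed form (1/sqrt(pi)) * exp(i*pi*t - t**2)
psi_closed = np.exp(1j * np.pi * t - t**2) / np.sqrt(np.pi)
print(np.max(np.abs(psi - psi_closed)))      # ~0 up to sampling differences
```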
$$R_{ent}(m, \tau) = I(m, \tau) + \frac{m \ln N}{N}, \qquad (5)$$
where $N$ is the number of delay vectors, which is kept constant for all values of $m$ and $\tau$ under consideration,
$$I(m, \tau) = \frac{H(x, m, \tau)}{\langle H(x_{s,i}, m, \tau) \rangle_i} \qquad (6)$$
$$H(x) = \sum_{j=1}^{T} \ln(T \rho_j) + \ln 2 + C_E \qquad (7)$$
where $T$ is the number of samples in the data set, $\rho_j$ is the Euclidean distance of the $j$-th delay vector to its nearest neighbor, and $C_E$ ($\approx 0.5772$) is the Euler constant.
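A minimal sketch of how Eqs. (5)–(7) might be computed is given below, using a k-d tree for the nearest-neighbor distances; it is an illustration under the stated definitions, not the authors' implementation, and the surrogate series are assumed to be supplied by the caller.

```python
import numpy as np
from scipy.spatial import cKDTree

EULER = 0.5772156649  # Euler's constant C_E

def delay_embed(x, m, tau):
    """Delay vectors [x(k), x(k + tau), ..., x(k + (m - 1) tau)]."""
    T = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau:i * tau + T] for i in range(m)])

def kl_entropy(x, m, tau):
    """Kozachenko-Leonenko differential entropy estimate, Eq. (7)."""
    dv = delay_embed(np.asarray(x, dtype=float), m, tau)
    T = len(dv)
    rho = cKDTree(dv).query(dv, k=2)[0][:, 1]  # nearest-neighbor distances
    return np.sum(np.log(T * rho)) + np.log(2) + EULER

def entropy_ratio(x, surrogates, m, tau):
    """R_ent(m, tau) of Eq. (5); 'surrogates' is a list of surrogate series."""
    N = len(x) - (m - 1) * tau  # number of delay vectors
    I = kl_entropy(x, m, tau) / np.mean([kl_entropy(s, m, tau) for s in surrogates])
    return I + m * np.log(N) / N
```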
This ratio criterion requires a time series to display a clear structure in the phase
space. Thus, for time series with no clear structure, the method will not yield a clear
minimum, and a different approach needs to be adopted, possibly one that does
not rely on a phase space representation. When this method is applied directly to
a time series exhibiting strong serial correlations, it yields embedding parameters
with a preference for τ_opt = 1. In order to ensure the robustness of this method to the dimensionality and serial correlations of a time series, Gautama et al. (2003) suggested using the iAAFT method for surrogate generation, since it retains within the surrogate both the signal distribution and, approximately, the autocorrelation structure of the original signal. In this paper, we opt to use the wavelet-based surrogate generation method WiAAFT of Keylock (2006), for reasons already discussed in the previous section.
2 This means that the underlying process that generates the data can theoretically be described precisely by a set of linear or nonlinear equations; that is, it is the component of a time series that can be predicted from a number of previous samples (Wold 1938).
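For concreteness, a minimal sketch of the iAAFT scheme of Schreiber and Schmitz (1996) is given below: it alternates between imposing the original amplitude spectrum and the original value distribution. WiAAFT (Keylock 2006) adds an initial wavelet step that is not shown here.

```python
import numpy as np

def iaaft(x, n_iter=100, seed=None):
    """One iAAFT surrogate of the series x (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    amp = np.abs(np.fft.rfft(x))      # target amplitude spectrum
    dist = np.sort(x)                 # target value distribution
    s = rng.permutation(x)            # random shuffle as a starting point
    for _ in range(n_iter):
        # impose the amplitude spectrum, keeping the current phases
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(amp * np.exp(1j * phases), n=len(x))
        # impose the original distribution by rank ordering
        s = dist[np.argsort(np.argsort(s))]
    return s
```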
Fig. 2 Nonlinear and deterministic nature of signals. The first row of (a) and (b) are DVV plots for
a linear benchmark signal: AR(2) signal and a nonlinear benchmark signal: Henon signal, where
the red line with crosses denotes the DVV plot for the average of 25 WiAAFT-based surrogates
while the blue line denotes that for the original signal. The second row of (a) and (b) denote the
DVV scatter diagrams for those two signals, where error bars denote the standard deviation of the
target variances of surrogates. (a) AR(2) signal. (b) Henon signal (Color figure online)
$$\sigma^{*2}(\varrho_d) = \frac{\frac{1}{N} \sum_{k=1}^{N} \sigma_k^{2}(\varrho_d)}{\sigma_x^{2}} \qquad (8)$$
the surrogate time series. If the surrogate time series yield DVV plots similar to that
of the original time series, as illustrated by the first row of Fig. 2a, the DVV scatter
diagram coincides with the bisector line, and the original time series is judged to be
linear, as shown in the second row of Fig. 2a. If not, as illustrated by the first row of Fig. 2b,
the DVV scatter diagram will deviate from the bisector line and the original time
series is judged to be nonlinear, as depicted in the second row of Fig. 2b. Statistical testing of the null of linearity using a non-parametric rank-order test (Theiler et al. 1992) is performed to make the conclusions obtained via the DVV analysis robust. We refer the reader to Appendix “DVV Plots of Simulated Processes” for more on the DVV analysis of some simulated processes.
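A minimal sketch of the DVV computation is given below; the span grid follows common conventions in the DVV literature rather than any prescription in this chapter, and the delay_embed helper from the entropy sketch above is assumed.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import pdist

def dvv(x, m, tau, n_spans=50, n_sigma=3.0, min_set=30):
    """Spans and normalized target variances, Eq. (8) (illustrative sketch)."""
    dv = delay_embed(np.asarray(x, dtype=float), m, tau)[:-1]
    targets = x[(m - 1) * tau + 1:]   # the sample following each delay vector
    dists = pdist(dv)                 # pairwise delay-vector distances
    spans = np.linspace(max(dists.mean() - n_sigma * dists.std(), 0.0),
                        dists.mean() + n_sigma * dists.std(), n_spans)
    tree = cKDTree(dv)
    tv = np.full(n_spans, np.nan)
    for i, r in enumerate(spans):
        sets = tree.query_ball_point(dv, r)
        v = [np.var(targets[idx]) for idx in sets if len(idx) >= min_set]
        if v:                         # normalize by the total target variance
            tv[i] = np.mean(v) / np.var(targets)
    return spans, tv
```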
We provide below a summary of our methodology which can be characterized in
two stages:
1. Stage One: Detection of Nonlinearity in the underlying time series.
(a) We study the structure of the economic indicator via a phase-space repre-
sentation using the differential entropy-based method with both iAAFT and
wiAAFT surrogates. The embedding parameters that yield the lower entropy ratio are selected for the DVV analysis in the next step. The main advantage of this differential entropy-based method is that a single measure is simultaneously used to obtain the embedding dimension, m, and the time lag, τ.
(b) In order to detect the nonlinear behavior in the underlying time series, we
use the DVV method discussed in Sect. 2.4. We are able to generate the delay vectors necessary for the DVV analysis using the (m, τ) obtained in step (a). Unlike classical nonlinearity testing procedures, this non-parametric method is essentially data-driven and carries no a priori assumptions about
the intrinsic properties or mathematical structure of the underlying time
series. In particular, this method provides a straightforward visualization and
interpretation of results. With this approach, we are able to obtain important
information on the underlying economic indicator, which is essential in
choosing the appropriate class of models suggested by the data itself. It is
noteworthy that this procedure does not need the underlying time series
to be stationary. Statistical testing of the null of linearity using a non-
parametric rank-order test (Theiler et al. 1992) is performed to enhance
robust conclusion of results obtained via the DVV-analysis.
2. Stage Two: Detecting and explaining the business cycle.
The next stage of the methodology deals with the problem of discovering patterns or hidden information that cannot be captured with traditional methods of business cycle analysis such as spectral analysis, which is only localized in
frequency. In this work, we perform wavelet analysis using a complex-valued
wavelet via a continuous wavelet transform in order to separate the phase and
amplitude information. The phase information will be useful in explaining
the economy-wide fluctuations in production that occur around a long-term
growth trend. Information on the magnitude of such cycles across time will
be obtained from the amplitude information.
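Chaining the sketches above gives a hypothetical end-to-end pass over an indicator series y (assumed already loaded); the parameter values mirror those reported for the IPI later in the chapter, and the plotting step is only described in the comments.

```python
import numpy as np

surrs = [iaaft(y, seed=k) for k in range(25)]  # 25 surrogates (>= 19 at alpha = 0.05)
m, tau = 4, 7                                  # entropy-ratio minimizer (see Sect. 3)
spans, tv_orig = dvv(y, m, tau)
tv_surr = np.nanmean([dvv(s, m, tau)[1] for s in surrs], axis=0)
# DVV scatter: plotting tv_orig against tv_surr, a deviation from the bisector
# suggests nonlinearity, to be confirmed by the rank-order test.
```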
3 Data Analysis
It is now well-known that the United States and all other modern industrial
economies experience significant swings in economic activity. In this section, we
perform an analysis to characterize and detect nonlinearities in the US business cycle, considering the monthly industrial production. The Industrial Production
index is a business cycle indicator that has been widely used in business cycle
analysis (see Artis et al. 2002, 2003; Anas et al. 2008; Billio et al. 2012a; Addo
et al. 2013b). Firstly, we characterize the nature of the time series using the DVV
method with both iAAFT and WiAAFT surrogates and then employ complex Morlet
wavelet to discover the cycles or hidden information in the data. In particular, we
show that this new methodology permits studying the dynamics of the underlying economic indicator without a priori assumptions on its statistical properties, and also allows for the detection of recession periods. In addition, we attempt to establish a
comparison between the late-2000s financial crisis and the Great Depression of the
1930s.
The monthly US Industrial Production Index (IPI) time series3 spanning over the
period January, 1919 to July, 2012 is considered for the data analysis. Figure 3a is
the plot of the monthly IPI series for the period 1919:01–2012:07, implying 1123 observations, where the shaded regions correspond to NBER4 published dates for US recessions from 1920. Figure 3b is the plot of the IPI spectrum, which can be interpreted as indicating the presence of long-memory dynamics in the data.
We now give a comprehensive analysis of the IPI in levels. To begin with, we opted for the differential-entropy based method (Gautama et al. 2003) to determine the optimal embedding parameters, i.e., the embedding dimension, m, and the time lag, τ, for the DVV method with both the iAAFT surrogates and WiAAFT surrogates. We consider two approaches for estimating (m, τ). In the first case, the optimal embedding parameters estimated using wiAAFT surrogates are m = 3 and τ = 1, with an entropy ratio R_ent(m, τ) = 0.7923, indicated as an open circle in the diagram with a clear structure in Fig. 4b. This result indicates the presence of time correlations in the time series, implying a higher degree of structure and thus a lower amount of disorder. The second case uses the iAAFT surrogates, for which the estimated values of the optimal embedding parameters are m = 4 and τ = 7, with entropy ratio R_ent(m, τ) = 0.7271, which is less than that obtained via wiAAFT surrogates. In selecting the embedding parameters used to generate the delayed vectors needed to perform the DVV analysis, we choose the estimates with the lower entropy ratio, implying a higher degree of structure. In this case, m = 4 and τ = 7 are used to generate the delayed vectors needed to perform the DVV analysis.
3
The data can be downloaded from Federal Reserve Bank of St. Louis:
http://research.stlouisfed.org/fred2/.
4
National Bureau of Economic Research: http://www.nber.org/cycles.html.
Fig. 3 US Industrial Production Index (IPI) time series. (a) is the plot of the monthly IPI series for the period 1919:01–2012:07 (n = 1123), where the shaded regions correspond to the US recessions from 1920 published by NBER. (b) is the plot of the spectrum of IPI. (a) Industrial Production Index with the shaded areas indicating the US recessions. (b) The spectrum of the Industrial Production Index (IPI), which can be interpreted as indicating the presence of long-memory dynamics
Fig. 4 The optimal embedding parameters obtained via the differential-entropy based method using the two types of surrogates are indicated as an open circle in the diagrams with a clear structure. We obtain a lower entropy ratio R_ent(m, τ) with iAAFT surrogates, which corresponds to a higher degree of structure. The values will be used in creating the delay vectors needed for the DVV analysis. (a) Differential-entropy based method with iAAFT surrogates. The optimal embedding values are m = 4 and τ = 7 with entropy ratio R_ent(m, τ) = 0.7271. (b) Differential-entropy based method with wiAAFT surrogates. The optimal embedding values are m = 3 and τ = 1 with R_ent(m, τ) = 0.7923
Fig. 5 This is the DVV analysis with iAAFT surrogates performed on the IPI using the embedding parameters obtained via the differential entropy-based method. We clearly observe a deviation from the bisector on the DVV scatter diagram. The DVV plot also indicates that the process is neither strictly deterministic nor strictly stochastic. Thus, the original time series, IPI, exhibits nonlinear dynamics, since the surrogates are linear realizations of the original (Schreiber and Schmitz 1996, 2000)
Table 1 Results of the non-parametric rank-order test. The null of linearity is rejected as soon as the rank-order is greater than the rank-threshold. The code H takes the value 0 or 1, where H = 0 corresponds to failure to reject the null of linearity and H = 1 to the rejection of linearity in favor of nonlinearity. The number of iAAFT surrogates considered for the DVV analysis is 25, which is greater than the minimum requirement of 19 surrogates for testing at the α = 0.05 level of significance
Data | Code, H | Rank-order | Rank-threshold | Decision
IPI | 1 | 26 | 24.7 | Nonlinear dynamics
Fig. 6 The optimal embedding parameters obtained via the differential-entropy based method using the two types of surrogates are indicated as an open circle in the diagrams with a clear structure. The values of the embedding parameters are m = 2 and τ = 1 in both diagrams. The result τ = 1 indicates the presence of time correlations in the growth rate of IPI, implying a higher degree of structure and thus a lower amount of disorder. (a) Differential-entropy based method with iAAFT surrogates on the growth rate of the IPI. (b) Differential-entropy based method with wiAAFT surrogates on the growth rate of the IPI
Anderson 1992; Terasvirta 1994; Dias 2003). Thus, several possible classes of nonlinear models, such as Markov switching models, smooth transition autoregressive (STAR) models, and threshold autoregressive models, could capture such nonlinear behavior (Kim and Nelson 1999; Granger and Terasvirta 1993; Franses and van Dijk 2000; Addo et al. 2014; Billio et al. 2012b).
Turning to the dynamics of the growth rate of IPI (denoting the IPI by X_t, the growth rate of IPI is defined as Y_t = log X_t − log X_{t−1}) in order to study the business cycle, we obtain the same estimates of the embedding parameters, m = 2 and τ = 1, using the differential entropy-based method with both iAAFT surrogates and WiAAFT surrogates (Fig. 6). Using the values of the embedding parameters, we are able to generate the phase space representation displayed in Fig. 7 and perform the DVV analysis in Fig. 8. The purpose of studying the growth rate of IPI is not to ensure stationarity but to enable a better comparison of IPI dynamics over time. Business cycles are usually measured by considering the growth rate of the industrial production index or the growth rate of real gross domestic product.
The DVV analysis in Fig. 8 and the statistical testing of the null of linearity using the non-parametric rank-order test, Table 2, suggest that the time series under consideration behaves more like a nonlinear stochastic process than a deterministic one.
In the following step, we perform the CWT on the IPI growth rate using wavelets of the form in Eq. (3) at different bandwidths f_b and center frequencies f_c. In detecting the recession dates, the wavelet analysis was first performed for the period 1919:02–1940:01, using the US recession dates published by NBER as a benchmark. The Morlet wavelet that captures the recession dates in this sample is then chosen as the wavelet to be used for the whole sample period. The Morlet wavelet that best detects cycles and hidden information in the data for the period 1919:02–1940:01 is given in Eq. (4).
Fig. 7 Phase space reconstruction using the embedding parameters m = 2 and τ = 1. This represents the embedding of the underlying time series, the growth rate of IPI, in phase space. The attractor is clearly visualized
Fig. 8 The DVV analysis on the growth rate of IPI indicates that it is characterized by nonlinear dynamics. (a) DVV with iAAFT surrogates. (b) DVV with wiAAFT surrogates
The colormap used in the coefficient plots and the scalogram plot ranges from blue to red, and passes through the colors cyan, yellow, and orange. The blue root-like structures on the phase-angle panel of the coefficient plots in Fig. 9 correspond to recession periods of the economy. These are the periods where the economy-wide fluctuations in production are below the long-term growth trend.
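A minimal sketch of this CWT step is given below, assuming the FRED series INDPRO has been saved to a CSV with columns DATE and INDPRO; the file name, scale grid, and column names are placeholders of this illustration, not specifications from the chapter.

```python
import numpy as np
import pandas as pd
import pywt

ipi = pd.read_csv('INDPRO.csv', parse_dates=['DATE'], index_col='DATE')['INDPRO']
y = np.diff(np.log(ipi.values))     # growth rate Y_t = log X_t - log X_{t-1}

scales = np.arange(1, 257)          # an illustrative scale range
coefs, freqs = pywt.cwt(y, scales, 'cmor1.0-0.5',
                        sampling_period=1 / 12)  # monthly data -> cycles/year
phase, modulus = np.angle(coefs), np.abs(coefs)
energy = 100 * modulus**2 / (modulus**2).sum()   # scalogram, percent of energy
```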
Table 2 Results of the non-parametric rank-order test. The null of linearity is rejected as soon as the rank-order is greater than the rank-threshold. The code H takes the value 0 or 1, where H = 0 corresponds to failure to reject the null of linearity and H = 1 to the rejection of linearity in favor of nonlinearity. The number of surrogates considered for the DVV analysis is 25, which is greater than the minimum requirement of 19 surrogates for testing at the α = 0.05 level of significance
Data | Surrogates | Code, H | Rank-order | Rank-threshold | Decision
Growth rate of IPI | wiAAFT | 1 | 25 | 24.7 | Nonlinear dynamics
Growth rate of IPI | iAAFT | 1 | 26 | 24.7 | Nonlinear dynamics
Fig. 9 Coefficient plots obtained from the CWT using the complex Morlet wavelet on the growth rate of IPI: the first row represents the phase (angle) plot and the second row is the corresponding modulus plot. The colormap ranges from blue to red, and passes through the colors cyan, yellow, and orange. The blue regions on the angle coefficient plot correspond to periods of relative stagnation in the economy from 1920; thus, we consider only such structures with a minimum of 6 months as recessions in the economy. The corresponding amplitudes can be read from the modulus plot (Color figure online)
The detected recession dates are represented by blue root-like structures on the angle coefficient plot in Fig. 9. We consider only such structures with a minimum duration of 6 months5 as recessions in the economy. The corresponding magnitude of these cycles can be read from the modulus plot of the coefficient plot in Fig. 9. The Wall Street Crash of 1929, followed by the Great Depression of the 1930s (the largest and most important economic depression of the twentieth century), is well captured on the phase-angle coefficient plot in Fig. 9 for the time period (128–235), reported in Table 3. The three recessions between 1973 and 1982, associated with the oil crises in which oil prices soared and the stock market crashed, are shown in the blue root-like structures of the phase-angle coefficient plot in Fig. 9 for the time periods (659–676), (733–739), and (751–767). Furthermore, the bursting of the dot-com bubble, when speculation concerning Internet companies collapsed, is also detected, for the time periods (987–995). The wavelet energy at each time period is displayed on the scalogram in Fig. 10, and the pseudo-frequencies corresponding to the scales are displayed in Fig. 11. This interesting finding provides support for the use of the wavelet methodology in business cycle modeling.
In order to compare the late-2000s financial crisis with the Great Depression
of the 1930s, we perform the wavelet analysis on the growth rate of the IPI. The
IPI growth rate dynamics are well captured by the phase-angle coefficient plot
in Fig. 9, where the blue root-like structures correspond to periods of relative
stagnation in the economy from 1920. The amplitudes associated with these
economic fluctuations can be read from the modulus plot in Fig. 9. Looking at
5 This is a known censoring rule accepted in the business cycle literature and by the National Bureau of Economic Research: http://www.nber.org/cycles.html.
Fig. 10 The IPI growth rate and associated Scalogram from the CWT. The bar with colour scales
on the left-hand side of the scalogram plot indicates the percentage of energy for each wavelet
coefficient. Higher energy levels can be clearly observed for the Great Depression of the 1930s
compared to the period of late-2000s financial crisis, also known as the Global Financial Crisis
(Color figure online)
Fig. 11 The pseudo-frequency associated with each scale, in Hertz (Hz). The horizontal axis represents the scales and the vertical axis corresponds to the frequency associated with each scale
the modulus plot in Fig. 9 and the scalograms in Figs. 10 and 12, we clearly observe higher amplitude and energy levels, in the interval (0.008–0.016 %), corresponding to the Great Depression of the 1930s, compared with the late-2000s financial crisis, whose energy levels lie below 0.004 %. These results, based on the data set we used, suggest that the intensity of the late-2000s financial crisis, also known as the Global Financial Crisis (GFC), is at the moment not as high as that of the Great Depression of the 1930s.
Fig. 12 A contour representation of the scalogram of Fig. 10, associated with the US IPI growth rate
4 Conclusion
Acknowledgements The authors are grateful to the Editors and anonymous referees for their
careful revision, valuable suggestions, and comments that have improved this paper. We thank the
conference participants of ISCEF 2012, CFE–ERCIM 2012, COMPSTAT 2012 and the participants
at the Econometrics Internal Seminar at Center for Operations Research and Econometrics (CORE)
for their participation and interest. We also would like to thank Sébastien Van Bellegem, Luc
Bauwens, Christian Hafner, Timo Terasvirta and Yukai Kevin Yang for their remarks and questions.
In this paper, we made use of the algorithms in the wavelet toolbox in Matlab and the DVV toolbox avail-
Appendix: Continuous Wavelet Transform (CWT)

$W_\psi : L^2(\mathbb{R}) \to L^2(\mathbb{R})$
The continuous wavelet transform (CWT) differs from the more traditional short
time Fourier transform (STFT) by allowing arbitrarily high localization in time
of high frequency signal features. The CWT permits the isolation of the high frequency features due to its variable window width related to the scale of
observation. In particular, the CWT is not limited to using sinusoidal analyzing
functions but allows for a large selection of localized waveforms that can be
employed as long as they satisfy predefined mathematical criteria (described below).
Let $H$ be a Hilbert space; the CWT may be described as a mapping parameterized by a function $\psi$,
$$C_\psi : H \to C(H). \qquad (9)$$
$$(\tau_t D_s \psi)(\cdot) = \frac{1}{|s|^{1/2}}\, \psi\!\left(\frac{\cdot - t}{s}\right) \qquad (11)$$
For each point $(t, s)$ in the time-scale plane, the wavelet transform assigns to a signal $f$ a (complex) numerical value which describes how much $f$ is like a version of $\psi$ translated by $t$ and scaled by $s$.
The CWT of a signal $f$ is defined as
$$(C_\psi f)(t, s) = \frac{1}{|s|^{1/2}} \int_{\mathbb{R}} f(\tau)\, \bar{\psi}\!\left(\frac{\tau - t}{s}\right) d\tau \qquad (13)$$
where $\bar{\psi}(\cdot)$ is the complex conjugate of the analyzing wavelet function $\psi(\cdot)$. Given that $\psi$ is chosen with enough time-frequency localization,6 the CWT gives a picture of the time-frequency characteristics of the function $f$ over the whole time-scale plane $\mathbb{R} \times (\mathbb{R} \setminus \{0\})$. When $C_{adm,\psi} < \infty$, it is possible to find the inverse continuous transformation via the relation known as Calderón's reproducing identity,
$$f(\cdot) = \frac{1}{C_{adm,\psi}} \int_{\mathbb{R}^2} \langle f, \tau_t D_s \psi \rangle\, (\tau_t D_s \psi)(\cdot)\, \frac{ds\, dt}{s^2}, \qquad (14)$$
and if $s$ is restricted to $\mathbb{R}^+$, then Calderón's reproducing identity takes the form
$$f(\cdot) = \frac{1}{C^{+}_{adm,\psi}} \int_{-\infty}^{\infty} \int_{0}^{\infty} \langle f, \tau_t D_s \psi \rangle\, (\tau_t D_s \psi)(\cdot)\, \frac{ds\, dt}{s^2}. \qquad (15)$$
6 The time-frequency concentrated functions, denoted $TF(\mathbb{R})$, form a space of complex-valued finite energy functions defined on the real line that decay faster than $1/t$ simultaneously in the time and frequency domains. This is defined explicitly as $TF(\mathbb{R}) := \{\varphi \in L^2(\mathbb{R}) : |\varphi(t)| < \kappa (1 + |t|)^{-(1+\varepsilon)}\ \text{and}\ |\hat{\varphi}(\omega)| < \kappa (1 + |\omega|)^{-(1+\varepsilon)}\ \text{for some}\ \kappa < \infty,\ \varepsilon > 0\}$.
$$E(t, s) = |(C_\psi f)(t, s)|^{2} \qquad (16)$$
Fig. 13 DVV analysis on ARIMA and Threshold Autoregressive signals. (a) DVV analysis on
ARIMA(1,1,1) signal. (b) DVV analysis on TAR(1) signal
of data points, the original time series, represented by the vector $X(n)$, where $n = 1, 2, \ldots, N$, can be decomposed on a hierarchy of time scales by details, $D_j(n)$, and a smooth part, $S_J(n)$, that shift along with $X$:
$$X(n) = S_J(n) + \sum_{j=1}^{J} D_j(n) \qquad (18)$$
We provide the structure of the DVV analysis on some simulated processes, namely: a threshold autoregressive (TAR) process, a linear autoregressive integrated moving average (ARIMA) signal, a generalized autoregressive conditional heteroskedastic (GARCH) process, and a bilinear process (Figs. 13 and 14).
Fig. 14 DVV analysis on GARCH and Bilinear signals. (a) DVV analysis on GARCH(1,1).
(b) DVV analysis on Bilinear signal
References
Addison PS (2005) Wavelet transforms and ECG: a review. Physiol Meas 26:155–199
Addo PM, Billio M, Guégan D (2012) Understanding exchange rate dynamics. In: Colubi A,
Fokianos K, Kontoghiorghes EJ (eds) Proceedings of the 20th international conference on
computational statistics, pp 1–14
Addo PM, Billio M, Guégan D (2013a) Nonlinear dynamics and recurrence plots for detecting
financial crisis. North Am J Econ Financ. http://dx.doi.org/10.1016/j.najef.2013.02.014
Addo PM, Billio M, Guégan D (2013b) Turning point chronology for the eurozone: a distance plot
approach. J Bus Cycle Meas Anal (forthcoming)
Addo PM, Billio M, Guégan D (2014) The univariate mt-star model and a new linearity and unit
root test procedure. Comput Stat Data Anal. http://dx.doi.org/10.1016/j.csda.2013.12.009
Anas J, Billio M, Ferrara L, Mazzi GL (2008) A system for dating and detecting turning points in
the euro area. Manchester School 76(5):549–577
Artis MJ, Marcellino M, Proietti T (2002) Dating the euro area business cycle. CEPR Discussion
Papers No 3696 and EUI Working Paper, ECO 2002/24
Artis MJ, Marcellino M, Proietti T (2003) Dating the euro area business cycle. CEPR Discussion
Papers (3696)
Ashley RA, Patterson DM (1989) Linear versus nonlinear macroeconomics: a statistical test. Int
Econ Rev 30:685–704
Billio M, Casarin R, Ravazzolo F, van Dijk HK (2012a) Combination schemes for turning point
predictions. Q Rev Econ Financ 52:402–412
Billio M, Ferrara L, Guégan D, Mazzi GL (2013) Evaluation of regime-switching models for real-
time business cycle analysis of the euro area. J. Forecast 32:577–586. doi:10.1002/for.2260
Brock WA, Sayers CL (1988) Is the business cycle characterised by deterministic chaos? J Monet
Econ 22:71–90
Bruce L, Koger C, Li J (2002) Dimensionality reduction of hyperspectral data using discrete
wavelet transform feature extraction. IEEE Trans Geosci Remote Sens 40:2331–2338
Burns AF, Mitchell WC (1946) Measuring business cycles. NBER
Mandic DP, Chambers JA (2001) Recurrent neural networks for prediction: learning algorithms, architectures and stability. Wiley, Chichester
Dias FC (2003) Nonlinearities over the business cycle: an application of the smooth transition autoregressive model to characterize GDP dynamics for the euro area and Portugal. Bank of Portugal, Working Paper 9-03
Franses PH, van Dijk D (2000) Non-linear time series models in empirical finance. Cambridge
University Press, Cambridge
Gallegati M (2008) Wavelet analysis of stock returns and aggregate economic activity. Comput
Stat Data Anal 52:3061–3074
Gallegati M, Gallegati M (2007) Wavelet variance analysis of output in G-7 countries. Stud
Nonlinear Dyn Econ 11(3):6
Gautama T, Mandic DP, Hulle MMV (2003) A differential entropy based method for determining
the optimal embedding parameters of a signal. In: Proceedings of ICASSP 2003, Hong Kong
IV, pp 29–32
Gautama T, Mandic DP, Hulle MMV (2004a) The delay vector variance method for detecting
determinism and nonlinearity in time series. Phys D 190(3–4):167–176
Gautama T, Mandic DP, Hulle MMV (2004b) A novel method for determining the nature of time
series. IEEE Trans Biomed Eng 51:728–736
Granger CWJ, Terasvirta T (1993) Modelling nonlinear economic relationship. Oxford University
Press, Oxford
Whitcher B, Guttorp P, Percival DB (2000) Wavelet analysis of covariance with application to
atmospheric time series. J Geophys Res 105:14941–14962
Hegger R, Kantz H, Schreiber T (1999) Practical implementation of nonlinear time series methods:
the TISEAN package. Chaos 9:413–435
Ho A, Moody K, Peng G, Mietus C, Larson J, Levy M, Goldberger D (1997) Predicting survival in
heart failure case and control subjects by use of fully automated methods for deriving nonlinear
and conventional indices of heart rate dynamics. Circulation 96:842–848
Jensen M (1999) An approximate wavelet MLE of short and long memory parameters. Stud
Nonlinear Dyn Econ 3:239–253
Kaplan D (1994) Exceptional events as evidence for determinism. Phys D 73(1):38–48
Keylock CJ (2006) Constrained surrogate time series with preservation of the mean and variance
structure. Phys Rev E 73:036707
Keylock CJ (2008) Improved preservation of autocorrelative structure in surrogate data using an
initial wavelet step. Nonlinear Processes Geophys 15:435–444
Keynes JM (1936) The general theory of employment, interest and money. Macmillan, London
Kim CJ, Nelson CR (1999) State-space models with regime-switching: classical and Gibbs-
sampling approaches with applications. MIT Press, Cambridge
Kontolemis ZG (1997) Does growth vary over the business cycle? some evidence from the g7
countries. Economica 64:441–460
Kugiumtzis D (1999) Test your surrogate data before you test for nonlinearity. Phys Rev E
60:2808–2816
Kozachenko LF, Leonenko NN (1987) Sample estimate of the entropy of a random vector. Probl Inf Transm 23:95–101
Luukkonen R, Saikkonen P, Terasvirta T (1988) Testing linearity against smooth transition
autoregressive models. Biometrika 75:491–499
Mitchell WC (1927) Business cycles: the problem and its setting. National Bureau of Economic
Research, New York
Ramsey J (1999) The contributions of wavelets to the analysis of economic and financial data.
Philos Trans R Soc Lond A 357:2593–2606
Ramsey J, Lampart C (1998) The decomposition of economic relationships by time scale using
wavelets: expenditure and income. Stud Nonlinear Dyn Econ 3:23–42
Schreiber T, Schmitz A (1996) Improved surrogate data for nonlinearity tests. Phys Rev Lett
77:635–638.
Schreiber T, Schmitz A (2000) Surrogate time series. Phys D 142:346–382
Sichel DE (1993) Business cycle asymmetry: a deeper look. Econ Inquiry 31:224–236
Stock JH, Watson MW (1999) Forecasting inflation. J Monet Econ 44(2):293–335
Teolis A (1998) Computational signal processing with wavelets. Birkhauser, Basel
Terasvirta T (1994) Specification, estimation and evaluation of smooth transition autoregressive
models. J Am Stat Assoc 89:208–218
Terasvirta T (2011) Modelling and forecasting nonlinear economic time series. WGEM workshop, European Central Bank, September 2011
Terasvirta T, Anderson HM (1992) Characterizing nonlinearities in business cycles using smooth
transition autoregressive models. J Appl Econ 7:119–136
Theiler J, Eubank S, Longtin A, Galdrikian B, Farmer JD (1992) Testing for nonlinearity in time series: the method of surrogate data. Phys D 58:77–94
van Dijk D, Terasvirta T, Franses PH (2002) Smooth transition autoregressive models -a survey of
recent developments. Econ Rev 21:1–47
Percival DB, Walden AT (2000) Wavelet methods for time series analysis. Cambridge University
Press, Cambridge
Casdagli MC, Weigend AS (1994) Exploring the continuum between deterministic and stochastic modelling. In: Time series prediction: forecasting the future and understanding the past. Addison-Wesley, Reading, pp 347–367
Wold HOA (1938) A study in the analysis of stationary time series. Almquist and Wiksell, Uppsala
Part II
Volatility and Asset Prices
Measuring the Impact Intradaily Events
Have on the Persistent Nature of Volatility
Abstract In this chapter we measure the effect a scheduled event, like the opening
or closing of a regional foreign exchange market, or an unscheduled act, such as a
market crash, a political upheaval, or a surprise news announcement, has on the
foreign exchange rate’s level of volatility and its well documented long-memory
behavior. Volatility in the foreign exchange rate is modeled as a non-stationary,
long-memory, stochastic volatility process whose fractional differencing parameter
is allowed to vary over time. This non-stationary model of volatility reveals
that long-memory is not a spurious property associated with infrequent structural
changes, but is an integral part of the volatility process. Over most of the sample
period, volatility exhibits the strong persistence of a long-memory process. It is only
after a market surprise or unanticipated economic news announcement that volatility
briefly sheds its strong persistence.
1 Introduction
The current class of long-memory, stochastic volatility models assumes the condi-
tional variance is a stationary long-memory process with a fractional differencing
parameter that is constant and does not change its value over time. These models
of volatility assume that a stationary environment exists in the financial world and
ignore the regular intradaily market micro-structure style events and unexpected
market crashes, surprises, and political disruptions that are prevalent in the data
where at time t the mean-corrected return from holding a financial instrument is y_t and h_t is the unobservable log-volatility, as a time-varying, long-memory, stochastic volatility (TVLMSV) model. At every t, |d(t)| < 1/2, and ε_t and η_t are uncorrelated Gaussian white noise processes. The fractional differencing operator, (1 − B)^{d(t)}, where B is the lag operator, x_{t−s} = B^s x_t, is defined by the binomial expansion:
$$(1 - B)^{d(t)} = \sum_{l=0}^{\infty} \frac{\Gamma(l - d(t))}{\Gamma(l + 1)\, \Gamma(-d(t))}\, B^{l},$$
where Γ(·) is the gamma function. The parameter σ_η(t) is the standard deviation of the log-volatility and σ(t) is the modal instantaneous volatility.
The TVLMSV is a non-stationary model belonging to the Dahlhaus (1997) class of locally stationary processes. In the TVLMSV model, the fractional differencing parameter, d(t), the modal instantaneous volatility, σ(t), and the volatility of volatility, σ_η(t), are smooth functions that change value over time. This time-varying property allows the TVLMSV model to produce responses to volatility shocks that are not only persistent in the classical long-memory sense of a slowly decaying autocorrelation function but, depending on when a shock takes place, can also differ in the level of persistence associated with the shock. For example, the time-varying fractional differencing parameter enables the TVLMSV to model levels of persistence unique to volatility over the operating hours of a regional exchange, or to its dynamics over an entire 24 hour trading day. The TVLMSV is also flexible enough that, by setting d(t) = d, σ(t) = σ, and σ_η(t) = σ_η for all t, it becomes the stationary, long-memory, stochastic volatility model. Hence, the time-varying differencing parameter of the TVLMSV model equips us with the means of determining whether the long-memory found in volatility is structural (d(t) > 0 over a wide range of values for t) or just a spurious artifact of unaccounted-for regime changes or shocks (d(t) = 0 over a wide range of values of t), as suggested by Russell et al. (2008), Jensen and Liu (2006), Diebold and Inoue (2001), Lamoureux and Lastrapes (1990), and Lastrapes (1989).
¹ Departures from the assumption of stationarity can be found in Stărică and Granger (2005).
where:

A1. the complex stochastic process Z(ω) is defined on [−π, π] with Z(ω) equal to the complex conjugate of Z(−ω), E[Z(ω)] = 0, and

$$E[dZ(\omega)\, dZ(\omega')] = \eta(\omega + \omega')\, d\omega\, d\omega',$$

where η(ω) = Σ_{j=−∞}^{∞} δ(ω + 2πj) is the period-2π extension of the Dirac delta function, δ(ω), with ∫ f(ω)δ(ω) dω = f(0) for all functions f continuous at 0.

A2. there exists a complex-valued, 2π-periodic function A(u, ω): [0, 1] × [−π, π] → ℂ, with A(u, −ω) equal to the complex conjugate of A(u, ω), that is uniformly Lipschitz continuous in both arguments with index α > 1/2, and a constant K such that

$$\sup_{t,\omega} \left| A^0_{t,T}(\omega) - A(t/T, \omega) \right| \le K T^{-1} \qquad (4)$$

for all T.²

A3. the drift μ(u) is continuous in u ∈ [0, 1].
Except for the transfer function, A⁰_{t,T}, having the time, t, and number of observations, T, subscripts, Eq. (3) is the same as the spectral representation of a stationary process (see Brockwell and Davis (1991), Theorem 4.8.2). Like a stationary process, Assumption A1 ensures that Z(ω) is a continuous stochastic process with orthogonal increments, such as a Brownian motion process. The only difference between the spectral representation of a stationary process and that of the locally stationary process is in Assumption A2, which precisely quantifies the smoothness of the time-varying transfer function, A(t/T, ω), over not only the frequency domain but also the time domain. The class of uniformly Lipschitz functions includes transfer functions that are not only smooth
² Our notation is such that u will always represent a time point in the rescaled time domain [0, 1]; i.e., u = t/T.
differentiable functions in time and frequency, but this function class also contains
non-differentiable transfer functions that have bounded variation.
Under Assumption A2, Dahlhaus (1996) shows that the locally stationary process, X_{t,T}, has a well-defined time-varying spectrum. In a manner analogous to the spectral density measuring a stationary process's energy at different frequencies, the time-varying spectral density of X_{t,T} equals f(u, ω) ≡ |A(u, ω)|². Combining the triangular process, X_{t,T}, with the smoothness conditions on A(u, ω), Dahlhaus (1996, Theorem 2.2) overcomes the redundant nature of the time-frequency representation of a non-stationary process. In general there is more than one spectral representation associated with X_{t,T}; however, Dahlhaus shows that if there exists a spectral representation of the form found in Definition 1 with a "smooth" A(u, ω) as quantified by Assumption A2, then it is the only one whose transfer function is "smooth", and its f(u, ω) is unique.
In Definition 1, as T → ∞, more and more of the behavior of X_{t,T} determined by A(u₀, ω), where u ∈ [u₀ − ε, u₀ + ε], will be observed. This type of asymptotics is similar to that found in nonparametric regression estimation. Since future observations of X_{t,T} tell us nothing about the behavior of a non-stationary process at an earlier t, in our setting T → ∞ has the interpretation of measuring the series over the same time period but at a higher sampling rate. Phillips (1987) referred to this form of asymptotics as continuous record asymptotics, since in the limit a continuous record of observations is obtained. In the context of Phillips (1987), the locally-stationary process X_{t,T} can be regarded as a triangular array of doubly indexed random variables, {{X_{n,t}}_{t=1}^{T_n}}_{n=1}^{∞}, where as n → ∞, T_n → ∞ and the length between observations, k_n, approaches zero such that k_n T_n = N, so that the time interval [0, N] may be considered fixed. Given that the existing asymptotic results for locally stationary processes are mostly due to Dahlhaus (1996, 1997), we choose to follow his notation and use the triangular array, X_{t,T}, to describe a locally stationary series.
Applying the asymptotics found in Definition 1 to the TVLMSV model of Eqs. (1)–(2), we define the triangular array:

$$y_{t,T} = \sigma(t/T)\,\epsilon_t \exp(h_{t,T}/2), \qquad (5)$$

$$(1 - B)^{d(t/T)} h_{t,T} = \sigma_\eta(t/T)\,\eta_t, \qquad (6)$$

where t = 1, ..., T, |d(u)| < 1/2, and d(u), σ(u), and σ_η(u) are continuous on ℝ with d(u) = d(0), σ(u) = σ(0), σ_η(u) = σ_η(0) for u < 0, and d(u) = d(1), σ(u) = σ(1), σ_η(u) = σ_η(1) for u > 1, and differentiable for u ∈ (0, 1) with bounded derivatives.
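The MA(∞) form of h_{t,T} derived in Eq. (17) of Appendix 1 suggests a direct way to simulate such a process. The sketch below is our own illustration: it truncates the moving-average representation at n_lags terms and lets d, σ, and σ_η vary with u = t/T; all function and parameter names are ours, not the chapter's.

```python
import numpy as np

def simulate_tvlmsv(T, d_fun, sigma_fun, sigma_eta_fun, n_lags=500, seed=0):
    """Simulate a TVLMSV path by truncating the MA(infinity)
    representation of h_{t,T} (cf. Eq. (17)) at n_lags terms.

    d_fun, sigma_fun, sigma_eta_fun: functions of u = t/T on [0, 1].
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(T)            # measurement noise epsilon_t
    eta = rng.standard_normal(T + n_lags)   # log-volatility noise eta_t
    y = np.empty(T)
    for t in range(T):
        u = t / T
        d = d_fun(u)
        # MA weights psi_l = Gamma(l + d) / (Gamma(l + 1) Gamma(d)), by recursion
        psi = np.empty(n_lags)
        psi[0] = 1.0
        for l in range(1, n_lags):
            psi[l] = psi[l - 1] * (l - 1 + d) / l
        h = sigma_eta_fun(u) * (psi @ eta[t + n_lags : t : -1])
        y[t] = sigma_fun(u) * np.exp(h / 2.0) * eps[t]
    return y

# persistence that dips toward zero mid-sample, as after a market surprise
y = simulate_tvlmsv(2000,
                    d_fun=lambda u: 0.45 - 0.4 * np.exp(-50.0 * (u - 0.5) ** 2),
                    sigma_fun=lambda u: 1.0,
                    sigma_eta_fun=lambda u: 0.5)
```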
The time-varying transfer function and spectral density of log y²_{t,T} are:

$$A(u, \omega) = \frac{\sigma_\eta(u)}{\sqrt{2\pi}}\, (1 - e^{-i\omega})^{-d(u)},$$

$$f(u, \omega) = \frac{\sigma_\eta^2(u)}{2\pi}\, |1 - e^{-i\omega}|^{-2d(u)} + \frac{4.93}{2\pi}, \qquad (7)$$

where μ(t/T) = log σ²(t/T) − 1.2704, z_t = √(4.93/2π) ∫ e^{iωt} dZ_z(ω), and Z_z is a complex stochastic process as defined in Assumption A1.
From Theorem 1, log y²_{t,T} is not only a random process, it is also a locally stationary process with time-varying spectrum, f(u, ω).³ The only difference between f(u, ω) and the spectrum from a stationary, long-memory stochastic volatility model is that the fractional differencing, the modal instantaneous volatility, and the variance of
³ All proofs are given in Appendix 1.
volatility parameters are allowed to change their value over time. In the TVLMSV model the risk of holding a financial asset is time-varying in the sense that volatility still follows a stochastic process, but the risk is no longer time homogeneous. In other words, the level of persistence associated with a shock to the conditional variance now depends on when the shock takes place. Suppose that at some point in time d(t) is close to 1/2. Shocks to volatility during this time period will be more persistent than if the same size shock occurred during a time period when d(t) is closer to zero.
⁴ See Percival and Walden (2000) for an introduction to the MODWT.

⁵ See Mallat (1989) for the seminal article on wavelets as presented from a multiresolution analysis point of view.
longer and longer time intervals. For example, if one observes five-minute return data and aggregates these returns into ten-minute returns, the "details" at the five-minute time scale would equal the information needed to construct the five-minute returns from the ten-minute returns.
Unlike the DWT, the MODWT is not an orthogonal basis. Instead, it is
redundant and uses an approximate zero-phase filter to produce an over-determined
representation of a time series. An advantage of this redundancy is that the “details”
at each time scale and the “smooth” have the same number of observations as the
original time series. This enables the “details” and the “smooth” to be aligned in
time with the original series so that the impact of an event can be analyzed over
different time scales.
To facilitate the introduction of our estimator of d(t), consider applying the MODWT to the locally stationary process defined in Definition 1, X_{t,T}. Let j = 1, ..., J be the scale parameters, where J ≤ log₂ T is the longest time interval over which the original time series is aggregated, and let {h̃_{j,l} | l = 0, ..., L_j − 1} be the level-j, real-valued, MODWT wavelet filters, where L₁ = L < T is an even, positive integer and L_j = (2^j − 1)(L − 1) + 1. The level-j MODWT coefficients of X_{t,T} are obtained from the linear, circular filter⁶:

$$\widetilde{W}_{j,t,T} = \sum_{l=0}^{L_j - 1} \tilde{h}_{j,l}\, X_{t-l \bmod T,\, T}, \qquad t = 1, \ldots, T. \qquad (8)$$
Notice from the MODWT coefficients, W̃_{j,t,T}, t ≥ L_j (those that do not involve the circularity assumption), that the above filter is compactly supported on the time interval [t − L_j + 1, t]. From the above definition of L_j this time support increases as the scale j increases. In Appendix 2 we show that just the opposite is the case in the frequency domain representation of the MODWT; i.e., as j increases the frequency support of the transfer function of {h̃_{j,l}} shrinks and covers a lower octave of frequencies. This inverse relationship between time and frequency
⁶ The third equation in Assumption A.4 guarantees the orthogonality of the filters to double shifts, and the first two conditions ensure that the wavelet has at least one vanishing moment and normalizes to one, respectively.
domain properties, where a large (small) time support is associated with low (high) frequencies, is one of the wavelet's many strengths.
Now define the MODWT scaling filters as the quadrature mirror filters of the MODWT wavelet filters:

$$\tilde{g}_{1,l} = (-1)^{l+1}\, \tilde{h}_{1,L-1-l}, \qquad l = 0, \ldots, L - 1. \qquad (9)$$

From A.4 it follows that the MODWT scaling filter satisfies the conditions:

$$\sum_{l=0}^{L_j - 1} \tilde{g}_{j,l} = 1, \qquad \sum_{l=0}^{L_j - 1} \tilde{g}_{j,l}^2 = 2^{-j}, \qquad \text{and} \qquad \sum_{l=0}^{L_j - 1} \tilde{g}_{j,l}\, \tilde{h}_{j,l} = 0.$$
The Jth-order scaling filters enable us to define the level-J MODWT scaling coefficients in terms of the filter:

$$\widetilde{V}_{J,t,T} = \sum_{l=0}^{L_J - 1} \tilde{g}_{J,l}\, X_{t-l \bmod T,\, T}, \qquad t = 1, \ldots, T. \qquad (10)$$
Because the wavelet filter {h̃_{j,l}} sums to zero, has the same length, L_j, as the scaling filter {g̃_{j,l}}, and the two filters are orthogonal, the wavelet filters represent the difference between two windowed weighted averages, each with an effective bandwidth of width 2^j; i.e., a MODWT coefficient tells how much a weighted moving average over a particular time period of length 2^j changes from one period to the next. The level-J MODWT scaling coefficients are associated with the output from a weighted moving average with a window of length 2^J that captures the variation in X_{t,T} over time periods associated with scale 2^J and higher.
By multiresolution analysis, the level-j wavelet "details" associated with the MODWT coefficients are defined as:

$$\widetilde{D}_{j,t,T} = \sum_{l=0}^{L_j - 1} \tilde{h}_{j,l}\, \widetilde{W}_{j,t+l \bmod T,\, T}, \qquad t = 1, \ldots, T, \qquad (11)$$

$$\widetilde{S}_{J,t,T} = \sum_{l=0}^{L_J - 1} \tilde{g}_{J,l}\, \widetilde{V}_{J,t+l \bmod T,\, T}, \qquad t = 1, \ldots, T. \qquad (12)$$
Since information about the original series is lost in calculating the weighted moving
averages, the “details” are the portion of the MODWT synthesis associated with the
changes at scale j , whereas the “smooth” is the portion attributable to variation
of scale 2^J and higher. Together the "details" and the "smooth" form an additive decomposition:

$$X_{t,T} = \sum_{j=1}^{J} \widetilde{D}_{j,t,T} + \widetilde{S}_{J,t,T}.$$
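This additive decomposition is easy to verify numerically. The sketch below is our own minimal illustration of Eqs. (8)–(12) using the Haar filter (the chapter's empirical work later uses the LA(8) filter); it runs the MODWT pyramid with circular filtering, reconstructs the details and smooth, and checks that they sum back to the original series.

```python
import numpy as np

H = np.array([0.5, -0.5])   # Haar MODWT wavelet filter h~ (DWT filter / sqrt 2)
G = np.array([0.5, 0.5])    # Haar MODWT scaling filter g~

def modwt(x, J):
    """Forward MODWT pyramid with circular filtering, cf. Eqs. (8) and (10)."""
    V = np.asarray(x, float)
    N, t, W = len(V), np.arange(len(V)), []
    for j in range(1, J + 1):
        s = 2 ** (j - 1)                       # filter upsampling factor at level j
        W.append(H[0] * V + H[1] * V[(t - s) % N])
        V = G[0] * V + G[1] * V[(t - s) % N]
    return W, V

def inverse_step(Wj, Vj, j):
    """One inverse pyramid step: recover V_{j-1} from (W_j, V_j)."""
    N, t, s = len(Wj), np.arange(len(Wj)), 2 ** (j - 1)
    return (H[0] * Wj + H[1] * Wj[(t + s) % N] +
            G[0] * Vj + G[1] * Vj[(t + s) % N])

def mra(x, J):
    """Wavelet details D_1, ..., D_J and smooth S_J, cf. Eqs. (11) and (12)."""
    W, VJ = modwt(x, J)
    N, details = len(x), []
    for j0 in range(1, J + 1):                 # invert keeping only W_{j0}
        V = np.zeros(N)
        for j in range(J, 0, -1):
            V = inverse_step(W[j0 - 1] if j == j0 else np.zeros(N), V, j)
        details.append(V)
    S = VJ                                     # invert keeping only V_J
    for j in range(J, 0, -1):
        S = inverse_step(np.zeros(N), S, j)
    return details, S

x = np.random.default_rng(0).standard_normal(256)
D, S = mra(x, J=4)
assert np.allclose(x, sum(D) + S)              # the additive decomposition holds
```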
Because the MODWT coefficients, W̃_{j,t,T}, capture the information that is lost as the MODWT intertemporally pans out over the locally stationary process, X_{t,T}, it follows that the MODWT wavelet coefficients at each scale are themselves locally-stationary processes. This is the result of the following theorem:
Theorem 2. Let X_{t,T} be a locally-stationary process as defined in Definition 1, where the 2π-periodic function A(u, ω) has a third derivative with respect to u that is uniformly bounded on u ∈ (0, 1) and |ω| ≤ π, and let {h̃_{j,l} | l = 0, ..., L_j − 1}, j = 1, ..., J ≤ log₂ T, be maximal overlap discrete wavelet transform filters that satisfy Assumption A.4. Then the MODWT wavelet coefficients for a given scale, j, that do not involve the circularity assumption are locally-stationary processes with spectral representation:

$$\widetilde{W}_{j,t,T} = \int e^{i\omega t} A^o_{j,t,T}(\omega)\, dZ(\omega) \qquad (13)$$

with sup_{t,ω} |A^o_{j,t,T}(ω) − A_j(t/T, ω)| ≤ K T^{−1}, where A_j(t/T, ω) = H̃_j(ω) A(t/T, ω), and time-varying spectral density function f_j(u, ω) = |H̃_j(ω)|² f(u, ω).
Applying Eq. (8) to h_{t,T}, we define the level-j MODWT wavelet coefficients of the locally stationary, long-memory process as:

$$\widetilde{W}^{(h)}_{j,t,T} = \sum_{l=0}^{L_j - 1} \tilde{h}_{j,l}\, h_{t-l \bmod T,\, T}, \qquad t = 1, \ldots, T.$$

Using the definition of the h̃_{j,l} transfer function found in Appendix 2, Jensen and Whitcher (1999) show that W̃^{(h)}_{j,t,T} is a locally stationary process with mean zero and time-varying variance

$$\sigma_h^2(u, j) \approx c(u)\, 2^{j(2d(u) - 1)}, \qquad (14)$$

where c(u) is a scale-independent proportionality function. Taking the logarithmic transformation of Eq. (14), we obtain the log-linear relationship

$$\log \sigma_h^2(u, j) \approx \log c(u) + (2d(u) - 1) \log 2^j. \qquad (15)$$

In practice, we replace the time-varying variance, σ_h²(u, j), with the sample variance calculated from the wavelet coefficients whose support contains the point t. Because the support of the level-j wavelet filter includes several filter coefficients with a value close to zero, a 'central portion' of the wavelet filter, Λ_j = {−[g(L_j)/2], ..., [g(L_j)/2]}, where 0 < g(L_{j−1}) < g(L_j), is utilized.⁷ In other words, σ_h²(u, j) is estimated with the time-varying sample variance of the wavelet coefficients:

$$\tilde{\sigma}_h^2(u, j) = \frac{1}{\#\Lambda_j} \sum_{l \in \Lambda_j} \big(\widetilde{W}^{(h)}_{j,t+l,T}\big)^2. \qquad (16)$$
⁷ Since the 'central portion' is dependent on the particular family and order of the wavelet filter, there is no closed-form expression for g(L_j). However, Whitcher and Jensen (2000) do tabulate the time width of Λ_j for the Daubechies family of wavelets.
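A rolling average of squared coefficients implements Eq. (16). The sketch below is our own illustration, with a user-supplied half-width standing in for [g(L_j)/2], which in practice would be read from the tabulation in Whitcher and Jensen (2000).

```python
import numpy as np

def local_wavelet_variance(Wj, half_width):
    """Eq. (16): time-varying sample variance of the level-j MODWT
    coefficients over the centred window Lambda_j = {-half_width, ...,
    half_width}, with circular indexing at the series boundaries.

    `half_width` is a stand-in for [g(L_j)/2]; the chapter takes the
    scale-dependent widths from Whitcher and Jensen (2000).
    """
    N = len(Wj)
    t = np.arange(N)
    var = np.zeros(N)
    for l in range(-half_width, half_width + 1):
        var += Wj[(t + l) % N] ** 2
    return var / (2 * half_width + 1)
```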
Although y_{t,T} is comprised of the locally stationary, long-memory process, h_{t,T}, and the white noise process, z_t, the local second order properties of y_{t,T} and h_{t,T} are asymptotically equivalent. The ratio of the time-varying spectra of y_{t,T} and h_{t,T} satisfies

$$\frac{f_y(u, \omega)}{f_h(u, \omega)} \to K, \qquad \text{as } \omega \to 0,$$

for some K < ∞. Furthermore, it is well known that wavelets are well equipped to filter out unwanted white noise (Donoho and Johnstone 1994, 1995, 1998; Jensen 2000). The variance of the MODWT coefficients of y_{t,T} equals

$$\operatorname{Var}\big[\widetilde{W}^{(y)}_{j,t,T}\big] = \operatorname{Var}\big[\widetilde{W}^{(h)}_{j,t,T}\big] + \operatorname{Var}\big[\widetilde{W}^{(z)}_{j,t,T}\big].$$

Since there is no structure to be found in the white noise process, z_t, neither the wavelet coefficients from the white noise process nor their variances will exhibit any systematic decay or relationship with the scaling parameter. Thus, the OLS wavelet estimator will provide a good estimator of the TVLMSV model.⁸
The Deutsche mark-US dollar (DM-$) exchange rate has been extensively used to
investigate the intra- and inter-daily behavior of foreign exchange rates (Andersen
et al. 2003, 2001; Andersen and Bollerslev 1997a,b, 1998; Müller et al. 1990;
Baillie and Bollerslev 1990). In its time the interbank spot DM-$ market had
the largest turnover of any financial market, was highly liquid, and had low
transaction costs. Furthermore, the spot market for DM-$ was a 24-hour market
comprised of sequential but not mutually exclusive regional trading centers. Except for the endogenous slowdowns in the level of trading that occurred on weekends and regional holidays, the spot DM-$ market was essentially always open. This
makes the market for the DM-$ ideal for analyzing the time-varying, long-memory
behavior of volatility.
5.1 Data
We use the tick-by-tick DM-$ spot exchange rate bid and ask quotes recorded by Olsen and Associates from the interbank Reuters network over the time period October 1, 1992 through September 29, 1993.
⁸ A possible alternative to our semi-parametric OLS estimator of d(u) is the MCMC methodology of Jensen (2004), but this would first require developing a Bayesian sampling method for locally stationary processes. This is a topic for future research.
Fig. 1 Intraday average of the absolute value for the 288 daily five-minute returns for the DM-$ exchange rate
Fig. 2 Multiresolution analysis (wavelet details D1–D12 and smooth S12) for the DM-$ log-squared return series wavelet and scaling coefficients over the time period October 1, 1992 through September 29, 1993
The MODWT coefficients of the log-squared DM-$ returns are computed using the Daubechies (1992) "least asymmetric" class of wavelet filters with 8 nonzero coefficients (LA(8)); i.e., L = L₁ = 8 in Eq. (8). Since the LA(8) wavelet has a near zero phase shift, the resulting wavelet coefficients line up nicely in time with the events that noticeably affect the volatility of the DM-$ return. Because its filter length is long enough to ensure the near band-pass behavior needed for measuring long-memory, the LA(8) wavelet is also a logical choice for the estimation of d(t).
If the behavior of y_{t,T} is similar at the beginning and end of the time period, the circularity assumption of Eq. (8) will have less of an effect on those W̃^{(y)}_{j,t,T} near the two boundaries. The details measure the variation of y_{t,T} over a localized interval of time, where the scales j = 1, 2, 3, 4 correspond to variations at the 5, 10, 20, 40 minute level, the scales j = 5, 6, 7 correspond to approximately the 1.5, 2.5, 5 hour level, and the scales j = 8, ..., 12 to approximately the 1/2, 1, 2, 3.5, 7 day level.⁹ For example, the wavelet details, D̃_{1,t,T}, are associated with changes (differences in weighted averages) in the original series at the 5 minute interval, whereas the D̃_{10,t,T} capture the information that is lost when volatility is computed from a two-day return rather than a daily return. Because j = 9 is associated with approximately daily variations, the details, D̃_{j,t,T}, j = 1, ..., 8, measure the intradaily variation of y_{t,T}.
The wavelet smooths, S̃_{12,t,T}, are weighted averages of y_{t,T}, approximately a week in length, that measure the log-squared returns' long-term variation at time scales associated with a week and greater in length. Because the MODWT is a dyadic transform, y_{t,T} could be decomposed up to the scale equal to the integer value of log₂ T. However, we choose j = 12 as the largest scale because behavior at frequencies lower than the day-of-the-week effect has not been found in high-frequency exchange rate data (see Harvey and Huang 1991).
Focusing on the S̃_{12,t,T} between the beginning of January (interval 19,009) and the end of July (interval 56,448), there is evidence of a slow, moderate, quarterly cycle in the log-squared returns. The plot of the smooths S̃_{12,t,T} over the January to July time period reveals a periodicity of approximately one and a half cycles in length. It is possible that this cycle continues in the data both before January and after July, but because of October's US Employment Report (Oct. 2, interval 439), the US stock market crash (Oct. 5, interval 816), and the Russian crisis (Sept. 21, interval 73,098), the value of S̃_{12,t,T} before January and after July may be artificially inflated because the above events may be compounding the boundary effects of the wavelet
⁹ Table 1 provides the actual conversion between the scaling parameter, j, of the MODWT and the time scale of the DM-$ time series.
decomposition. These boundary effects may also explain the relatively large values of D̃_{j,t,T}, for j = 10, 11, 12, found at the beginning and end of the sample.
Cyclical behavior in the log-squared returns is also visible at the weekly scale. In the plot of D̃_{12,t,T}, found in Fig. 2, a pattern of four cycles occurs each quarter. This cycle is fairly robust over the entire year, flattening out slightly during February, March, and May.
The details D̃_{j,t,T} at the scales j = 1, ..., 9 reveal the transient effect anticipated and unanticipated news events have on volatility. Some noteworthy events that our data set encompasses are the election of Bill Clinton to the US presidency (Nov. 11, interval 6,951), the floating of the Swedish krona followed by the realignment of Europe's exchange rate mechanism (Nov. 19 & 22), and the military confrontation between then Russian president Yeltsin and the old guard of the Russian Parliament (Sept. 21, interval 73,098). There are also macroeconomic news events, most notably the monthly US Employment report (first week of each month), especially the US Employment report for the month of May (June 4, interval 50,876), and also the Bundesbank meeting report (March 4, interval 32,157).
The impact of new information on log-squared returns is captured by a detail's size relative to those around it at the point in time of the event. In Fig. 2, every time the absolute value of D̃_{1,t,T} deviates noticeably from zero it corresponds with the arrival of new information to the market. A similar pattern is also found in the details at scales j = 2, ..., 9. This pattern in the details suggests that anticipated and unanticipated announcements affect the behavior of volatility over the course of the day the news is released. Two days later, however, the market has assimilated the news and volatility has reverted back to its typical behavior as quantified by the variation in volatility over a two day period. For instance, those details at the five minute to one day level of variation (j = 1, ..., 9) that correspond with May's US Employment report (June 4, interval 50,876) are clearly larger than those before and after this date. But the early June details at the scales j = 10, 11, 12 are not visibly different from those details at the same scales during other time periods when the market did not experience any news.
We now take the MODWT coefficients of y_{t,T} and calculate the time-varying wavelet variances, σ̃²(t/T, j), for t = 1, ..., 74,880 and j = 1, ..., 16. For each value of t, we take the log of σ̃²(t/T, j), where j = 5, ..., 12, and regress these time-varying wavelet variances on a constant and log 2^j. This produces our OLS estimates of D(t), from which d(t) = (D(t) + 1)/2 is calculated. Since d(t) measures the low frequency behavior of h_{t,T}, it is common practice to exclude the first few scales of the wavelet variances from the OLS regression (see Jensen 1999b). These excluded time-varying wavelet variances at the scales j = 1, ..., 4 measure the energy in h_{t,T} over those frequencies associated with behavior on the order of
five to forty minutes in length. We also exclude the wavelet variances at the four largest scales, j = 13, 14, 15, 16, from the regression. This helps to reduce the adverse impact the MODWT boundary effects might have on the estimates of d(t).
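The regression step itself is ordinary least squares applied time point by time point. The following sketch is our own illustration of that step only: it takes precomputed local wavelet variances (for example, from the Eq. (16) sketch above) and returns d(t) = (D(t) + 1)/2 from the fitted slopes.

```python
import numpy as np

def d_from_wavelet_variances(sigma2, scales):
    """Recover d(t) from local wavelet variances by OLS.

    sigma2: array of shape (len(scales), T); sigma2[m, t] is the local
    sample variance of Eq. (16) at scale scales[m] and time t.
    scales: the retained scales, e.g. np.arange(5, 13) for j = 5, ..., 12.

    For every t, log sigma2[:, t] is regressed on a constant and log 2^j;
    the fitted slope is D(t) and d(t) = (D(t) + 1) / 2.
    """
    scales = np.asarray(scales)
    X = np.column_stack([np.ones(scales.size), np.log(2.0 ** scales)])
    slope = np.linalg.lstsq(X, np.log(sigma2), rcond=None)[0][1]
    return (slope + 1.0) / 2.0
```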
Our results for d(t) are plotted in Fig. 3 (middle line). The upper and lower lines in the figure represent the 90 % confidence interval of d(t), which we construct by bootstrapping 1,000 replications in which the estimated residuals are randomly sampled with replacement. In all but a few instances, the time-varying, long-memory parameter is positive. Since a global estimate of the differencing parameter is the average of the time-varying differencing parameter estimates over the entire time period, finding a positive differencing parameter supports the conclusions of others that the volatility of intradaily foreign exchange rates exhibits long-memory behavior and that this is not a spurious artifact of structural breaks in the series.
In Fig. 3, the two largest values of d(t) can be ignored. The first occurrence corresponds with Christmas Day and the other with New Year's Day. Because these days are effectively "weekends" with low quote activity, the level of volatility during these days is meaningless with regard to the time-varying long-memory parameter. The three most negative values, d(t) = −0.3275, −0.3080, −0.2537, correspond to the second highest (June 4), eighth highest (Sept. 21), and highest (Oct. 2) volatility levels, respectively. The first and third most negative values of d(t) occur at the US Employment report announcements. The second smallest value occurs on the day of the Russian crisis (Sept. 21). On the 12 days when the monthly US Employment report is released, d(t) only stays positive during the December announcement. In all the other months, the employment report causes d(t) to fall below zero.
In every instance where the value of d(t) rapidly declines to zero or becomes negative and then quickly rebounds back to positive values, the date corresponds with either a macroeconomic announcement, a political event (the US presidential election), a meeting of the Bundesbank, or a plunge in the US stock market. This behavior in d(t) suggests that volatility becomes anti-persistent in response to the release of scheduled economic news and to expected and unexpected political events, in the sense that volatility remains strongly correlated with past observations, but the correlation is now negative. Although the new information from these events clearly affects volatility, the rapid increase in d(t) suggests the market quickly trades the asset to its new price and then relies on the long-term dynamics of risk and volatility's inherent property of long-memory in carrying out trades when information is not being disseminated.
In Fig. 4 we plot the average value of d(t) at each of the 288 daily five-minute intervals. Unlike the regional U-shape behavior of volatility found in the average absolute returns of Fig. 1, where volatility increases and the market thickens due to the opening and closing of a regional trading center (Müller et al. 1990; Baillie and Bollerslev 1990; Andersen and Bollerslev 1997b, 1998), Fig. 4 reveals an intradaily pattern in which the time-varying long-memory parameter is highest when the first and third most active trading centers (London and New York) are closed, and its lowest average value occurs as the London market is closing.
Fig. 3 Semiparametric wavelet estimates (middle line) of the time-varying differencing parameter, d(t), and the bootstrapped 90 % confidence interval (upper and lower lines) for the DM-$ log-squared returns using wavelet coefficients (W̃_{j,t,T}, j = 5, ..., 12, t = 1, ..., 74,880) calculated with the LA(8) wavelet filter

Fig. 4 Intraday average over all 260 trading days for each of the 288 daily five-minute intervals of d(t). The arrows show when each market is open: the Nikkei, 1:00–7:00 GMT; the FTSE, 10:00–16:30 GMT; and the NYSE, 14:30–22:00 GMT
Because a larger long-memory parameter leads to a smoother, less volatile process, the period of the day when both the London and New York markets are closed is a tranquil time with small and infrequent changes in the DM-$ exchange rate. In contrast, the market is on average most turbulent during those hours when the London and New York markets are both open (14:30–16:30 GMT). Since trading volume during the operation of the London market is the largest of the three regional markets, and is nearly twice that of the New York market, the small long-memory average associated with the closing of the London market (16:30 GMT) suggests that heavier market activity leads to lower degrees of long-memory with its accompanying large and frequent changes in volatility.
It is difficult to determine whether the decline in the degree of long-memory that occurs when London and New York are simultaneously trading is the result of public or private information. In the equity markets, French and Roll (1986) argue that private price information held by informed traders must be exploited prior to market closing, and thus higher levels of volatility are to be expected immediately before the market closes. On the other hand, Harvey and Huang (1991) and Ederington and Lee (1993) argue that, unlike the equity markets, the foreign exchange market is a continuous market, giving informed traders a liquid market in which to capitalize on their private information almost 24 hours out of every day. As a result, the increase
6 Conclusion
Acknowledgements Mark Jensen would like to personally thank James Ramsey for his guidance
and advice and for his openness to wavelet analysis and the inference it makes possible in
economics, finance and econometrics. Both authors thank the seminar and conference participants
at the University of Kansas, Brigham Young University, the Federal Reserve Bank of Atlanta, the
Midwest Economic Meetings, the Symposium on Statistical Applications held at the University of
Missouri–Columbia, the Conference on Financial Econometrics held at the Federal Reserve Bank
of Atlanta, and the James Ramsey Invited Session of the 2014 Symposium on Nonlinear Dynamics
and Econometrics held in New York. The views expressed here are ours and are not necessarily
those of the Federal Reserve Bank of Atlanta or the Federal Reserve System.
Appendix 1
For clarity and guidance in understanding the class of locally stationary models we
first prove the following lemma.
Lemma 1. If |d(u)| < 1/2, and d(u) and σ_η(u) are continuous on ℝ with d(u) = d(0), σ_η(u) = σ_η(0) for u < 0, and d(u) = d(1), σ_η(u) = σ_η(1) for u > 1, and differentiable for u ∈ (0, 1) with bounded derivatives, then the triangular process h_{t,T} defined in Eq. (6) is a locally stationary process with transfer function:
$$A(u, \omega) = \frac{\sigma_\eta(u)}{\sqrt{2\pi}}\, (1 - e^{-i\omega})^{-d(u)}$$

and time-varying spectral density

$$f(u, \omega) = \frac{\sigma_\eta^2(u)}{2\pi}\, |1 - e^{-i\omega}|^{-2d(u)}.$$
Proof of Lemma 1. From Stirling's formula, Γ(x) ≈ √(2π) e^{−x+1} (x − 1)^{x−1/2} as x → ∞, it follows that as l → ∞, Γ(l + d(t/T)) / (Γ(l + 1) Γ(d(t/T))) ≈ l^{d(t/T)−1} / Γ(d(t/T)),

$$\sum_{l=0}^{\infty} \left[ \frac{\Gamma(l + d(t/T))}{\Gamma(l+1)\,\Gamma(d(t/T))} \right]^2 < \infty,$$

and

$$\sum_{l=0}^{T} \frac{\Gamma(l + d(t/T))}{\Gamma(l+1)\,\Gamma(d(t/T))}\, e^{-il\omega} \to (1 - e^{-i\omega})^{-d(t/T)}, \qquad \text{as } T \to \infty.$$
It then follows by Theorem 4.10.1 in Brockwell and Davis (1991) that the triangular process

$$h_{t,T} = \sigma_\eta(t/T) \sum_{l=0}^{\infty} \frac{\Gamma(l + d(t/T))}{\Gamma(l+1)\,\Gamma(d(t/T))}\, \eta_{t-l} \qquad (17)$$

has the spectral representation h_{t,T} = ∫ e^{iωt} A⁰_{t,T}(ω) dZ(ω), where

$$A^0_{t,T}(\omega) = \frac{\sigma_\eta(t/T)}{\sqrt{2\pi}}\, (1 - e^{-i\omega})^{-d(t/T)}.$$

Now define:

$$A(u, \omega) = \frac{\sigma_\eta(u)}{\sqrt{2\pi}}\, (1 - e^{-i\omega})^{-d(u)}.$$

Since the spectral representation of h_{t,T} only involves evaluating d(·) and σ_η(·) at t/T, A⁰_{t,T}(ω) = A(u, ω) must hold. □
where A⁰_{t,T}(ω) = A(u, ω) = (σ_η(u)/√(2π)) (1 − e^{−iω})^{−d(u)}. By adding and subtracting the mean of log ε_t², −1.2704, to Eq. (18) we arrive at:

$$\log y^2_{t,T} = \mu(t/T) + \int e^{i\omega t} A^0_{t,T}(\omega)\, dZ(\omega) + \log \epsilon_t^2 + 1.2704,$$

where μ(t/T) = log σ²(t/T) − 1.2704. Since log ε_t² + 1.2704 is independent and identically distributed with mean zero and spectral density 4.93/(2π), by the Spectral Representation Theorem (Brockwell and Davis (1991), p. 145):

$$\log \epsilon_t^2 + 1.2704 = \sqrt{4.93/2\pi} \int e^{i\omega t}\, dZ_\epsilon(\omega),$$

so that:

$$\log y^2_{t,T} = \mu(t/T) + \int e^{i\omega t} A^0_{t,T}(\omega)\, dZ(\omega) + \sqrt{4.93/2\pi} \int e^{i\omega t}\, dZ_\epsilon(\omega). \qquad (19) \; \square$$
Proof of Theorem 2. Let X_{t,T} be a locally-stationary process with spectral representation

$$X_{t,T} = \int e^{i\omega t} A^o_{t,T}(\omega)\, dZ(\omega) \qquad (20)$$

and level-j MODWT wavelet coefficients

$$\widetilde{W}_{j,t,T} = \sum_{l=0}^{L_j - 1} \tilde{h}_{j,l}\, X_{t-l,T}. \qquad (21)$$

Substituting the definition of X_{t,T} from Eq. (20) into Eq. (21), the MODWT coefficient equals

$$\widetilde{W}_{j,t,T} = \sum_{l=0}^{L_j - 1} \tilde{h}_{j,l} \int e^{i\omega(t-l)} A^o_{t-l,T}(\omega)\, dZ(\omega).$$

Since A^o_{t,T}(ω) = A(u, ω) + O(T^{−1}) holds uniformly on u ∈ (0, 1) and |ω| ≤ π,

$$\widetilde{W}_{j,t,T} = \int e^{i\omega t} \left\{ \sum_{l=0}^{L_j - 1} e^{-i\omega l}\, \tilde{h}_{j,l} \left[ A(u - l/T, \omega) + O(T^{-1}) \right] \right\} dZ(\omega).$$
Expanding A(u + v, ω) in a Taylor series about u,

$$A(u + v, \omega) = A(u, \omega) + v\, \frac{\partial}{\partial u} A(u, \omega) + \frac{v^2}{2}\, \frac{\partial^2}{\partial u^2} A(u, \omega) + O(v^3),$$

we find

$$\widetilde{W}_{j,t,T} = \int e^{i\omega t} \left\{ \sum_{l=0}^{L_j - 1} e^{-i\omega l}\, \tilde{h}_{j,l} \left[ A(u, \omega) - \frac{l}{T}\, \frac{\partial}{\partial u} A(u, \omega) + \frac{1}{2}\left(\frac{l}{T}\right)^{\!2} \frac{\partial^2}{\partial u^2} A(u, \omega) + O\!\left(\left(\frac{l}{T}\right)^{\!3}\right) + O(T^{-1}) \right] \right\} dZ(\omega). \qquad (22)$$

By the power scaling rule, whereby if g(t) = t h(t) then G(ω) = (2πi)^{−1} H′(ω), with G(·) and H(·) the respective Fourier transforms, Eq. (22) becomes

$$\widetilde{W}_{j,t,T} = \int e^{i\omega t} \left[ \widetilde{H}_j(\omega) A(u, \omega) + O(T^{-1}) + (2\pi i)^{-1} \frac{1}{T}\, \frac{\partial \widetilde{H}_j(\omega)}{\partial \omega}\, \frac{\partial A(u, \omega)}{\partial u} + \frac{1}{2}\, \frac{1}{T^2}\, \frac{\partial^2 \widetilde{H}_j(\omega)}{\partial \omega^2}\, \frac{\partial^2 A(u, \omega)}{\partial u^2} + O(T^{-3}) \right] dZ(\omega) \qquad (23)$$
where H̃_j(ω) = Σ_{l=0}^{L_j−1} e^{−iωl} h̃_{j,l}. Since

$$(2\pi i)^{-1} \frac{1}{T}\, \frac{\partial \widetilde{H}_j(\omega)}{\partial \omega}\, \frac{\partial A(u, \omega)}{\partial u} = O(T^{-1}),$$

$$\frac{1}{2}\, \frac{1}{T^2}\, \frac{\partial^2 \widetilde{H}_j(\omega)}{\partial \omega^2}\, \frac{\partial^2 A(u, \omega)}{\partial u^2} = O(T^{-2}),$$

Eq. (23) defines a transfer function A^o_{j,t,T}(ω) with

$$\sup_{u, \omega} \left| A^o_{j,t,T}(\omega) - A_j(u, \omega) \right| \le K T^{-1}. \qquad \square$$
Appendix 2

To determine the MODWT frequency domain properties, define the transfer function of the filter {h̃_{1,l}} as:

$$\widetilde{H}(\omega) \equiv \sum_{l=0}^{L-1} \tilde{h}_{1,l}\, e^{-i\omega l}.$$

Since the filtered output of the MODWT wavelet filter {h̃_{1,l}} produces the information lost by filtering the series with a weighted moving average, {h̃_{1,l}} is a high-pass filter whose transfer function H̃(·) is supported on the nominal band-pass set of frequencies [−π, −π/2) ∪ (π/2, π]. From Eq. (9) it follows that {g̃_{1,l}} is a low-pass filter whose transfer function

$$\widetilde{G}(\omega) \equiv \sum_{l=0}^{L-1} \tilde{g}_{1,l}\, e^{-i\omega l} = e^{-i\omega(L-1)}\, \widetilde{H}(\pi - \omega)$$

is supported on the nominal low-pass band [−π/2, π/2]. The level-j wavelet transfer function is

$$\widetilde{H}_j(\omega) \equiv \widetilde{H}(2^{j-1}\omega) \prod_{k=0}^{j-2} \widetilde{G}(2^k \omega),$$

which by the definition of H̃ has support on the octave (±2π/2^{j+1}, ±2π/2^j]. The Jth-order scaling filter has the transfer function:

$$\widetilde{G}_J(\omega) \equiv \prod_{k=0}^{J-1} \widetilde{G}(2^k \omega),$$

which is supported on the low-frequency band [−2π/2^{J+1}, 2π/2^{J+1}].
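The octave-band structure of H̃_j(ω) can be checked numerically. The sketch below is our own illustration with the Haar filter: it builds the level-j transfer function from the cascade above and confirms that its squared gain peaks inside the nominal octave.

```python
import numpy as np

def H1(w):
    """Unit-scale Haar MODWT wavelet transfer function H~(w)."""
    return 0.5 * (1.0 - np.exp(-1j * w))

def G1(w):
    """Unit-scale Haar MODWT scaling transfer function G~(w)."""
    return 0.5 * (1.0 + np.exp(-1j * w))

def Hj(w, j):
    """Level-j transfer function H~_j(w) = H~(2^{j-1} w) prod_k G~(2^k w)."""
    out = H1(2 ** (j - 1) * w)
    for k in range(j - 1):
        out = out * G1(2 ** k * w)
    return out

w = np.linspace(1e-4, np.pi, 10_000)
for j in (1, 2, 3):
    gain = np.abs(Hj(w, j)) ** 2
    peak = w[np.argmax(gain)]
    # the squared gain peaks inside the nominal octave (pi/2^j, pi/2^{j-1}]
    assert np.pi / 2 ** j < peak <= np.pi / 2 ** (j - 1)
print("octave-band structure confirmed for j = 1, 2, 3")
```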
References
Andersen TG, Bollerslev T (1997a) Heterogeneous information arrivals and return volatility
dynamics: Uncovering the long-run in high frequency returns. J Finance 52:975–1005
Andersen TG, Bollerslev T (1997b) Intraday periodicity and volatility persistence in financial
markets. J Empir Finance 4:115–158
Andersen TG, Bollerslev T (1998) Deutsche mark-dollar volatility: Intraday activity patterns,
macroeconomic announcements, and longer run dependencies. J Finance 53:219–265
Andersen TG, Bollerslev T, Diebold FX, Labys P (2001) The distribution of realized exchange rate
volatility. J Am Stat Assoc 96:42–55
Andersen TG, Bollerslev T, Diebold FX, Labys P (2003) Modeling and forecasting realized
volatility. Econometrica 71:579–625
Baillie RR, Bollerslev T (1990) Intra-day and inter-market volatility in foreign exchange rates. Rev
Econ Stud 58:565–585
Breidt FJ, Crato N, de Lima P (1998) The detection and estimation of long memory in stochastic
volatility. J Econometrics 83:325–348
Brockwell PJ, Davis RA (1991) Time series: theory and methods, 2nd edn. Springer, New York
Crowley PM (2007) A guide to wavelets for economists. J Econ Surv 21:207–267
Dahlhaus R (1996) On the Kullback-Leibler information divergence of locally stationary processes.
Stoch Process Appl 62:139–168
Dahlhaus R (1997) Fitting time series models to nonstationary processes. Ann Stat 25:1–37
Daubechies I (1992) Ten lectures on wavelets. SIAM, Philadelphia
Diebold FX, Inoue A (2001) Long memory and structural change. J Econometrics 105:131–159
Donoho DL, Johnstone IM (1994) Ideal spatial adaptation via wavelet shrinkage. Biometrika
81:425–455
Donoho DL, Johnstone IM (1995) Adapting to unknown smoothness by wavelet shrinkage. J Am Stat Assoc 90:1200–1224
Donoho DL, Johnstone IM (1998) Minimax estimation via wavelet shrinkage. Ann Stat 26:879–
921
Ederington LH, Lee JH (1993) How markets process information: News releases and volatility. J
Finance 48:1161–1191
French KR, Roll R (1986) Stock return variances: The arrival of information and the reaction to
traders. J Financ Econ 17:5–26
Gallegati M, Ramsey JB, Semmler W (2014) Interest rate spreads and output: A time scale
decomposition analysis using wavelets. Comput Stat Data Anal 76:283–290
Gençay R, Selçuk F, Whitcher B (2001) An introduction to wavelets and other filtering methods
for finance and economics. Academic Press, San Diego
Gençay R, Selçuk F, Whitcher B (2005) Multiscale systemic risk. J Int Money Finance 24:55–70
Harvey AC (2002) Long memory in stochastic volatility. In: Forecasting volatility in the financial markets, 2nd edn. Butterworth-Heinemann, Oxford, pp 307–320
Harvey CR, Huang RD (1991) Volatility in the foreign currency futures market. Rev Financ Stud 4:543–569
Hong Y, Kao C (2004) Wavelet-based testing for serial correlation of unknown form in panel
models. Econometrica 72:1519–1563
Hong Y, Lee J (2001) One-sided testing for ARCH effects using wavelets. Economet Theory
6:1051–1081
Jensen MJ (1999a) An approximate wavelet MLE of short and long memory parameters. Stud
Nonlinear Dynam Econometrics 3:239–253
Jensen MJ (1999b) Using wavelets to obtain a consistent ordinary least squares estimator of the
fractional differencing parameter. J Forecast 18:17–32
Jensen MJ (2000) An alternative maximum likelihood estimator of long-memory processes using
compactly supported wavelets. J Econ Dynam Control 24:361–387
Jensen MJ (2004) Semiparametric Bayesian inference of long-memory stochastic volatility. J Time
Ser Anal 25:895–922
Jensen MJ, Liu M (2006) Do long swings in the business cycle lead to strong persistence in output?
J Monetary Econ 53:597–611
Jensen MJ, Whitcher B (1999) A semiparametric wavelet-based estimator of a locally stationary
long-memory model. Tech. rep., Department of Economics, University of Missouri
Lamoureux CG, Lastrapes WD (1990) Persistence in variance, structural change and the GARCH
model. J Bus Econ Stat 8:225–234
Lastrapes WD (1989) Exchange rate volatility and U.S. monetary policy: An ARCH application.
J Money Credit Bank 21:66–77
Mallat S (1989) A theory of multiresolution signal decomposition: The wavelet representation.
IEEE Trans Pattern Anal Mach Intell 11:674–693
Müller U, Dacorogna M, Olsen R, Pictet O, Schwarz M, Morgenegg C (1990) Statistical study of
foreign exchange rates, empirical evidence of a price change scaling law, and intraday analysis.
J Bank Finance 14:1189–1208
Percival DB, Walden AT (2000) Wavelet methods for time series analysis. Cambridge University
Press, Cambridge
Phillips PCB (1987) Time series regression with a unit root. Econometrica 55:277–301
Ramsey JB (1999) The contribution of wavelets to the analysis of economic and financial data.
Phil Trans R Soc Lond A 357:2593–2606
Ramsey JB, Lampart C (1998a) The decomposition of economic relationships by time scale using
wavelets: Expenditure and income. Stud Nonlinear Dynam Econometrics 3:23–42
Ramsey JB, Lampart C (1998b) Decomposition of economic relationships by time scale using
wavelets: Money and income. Macroeconomic Dynamics 2:49–71
Rua A, Nunes LC (2009) International comovement of stock market returns: A wavelet analysis. J
Empir Finance 16:632–639
Russell JR, Ohanissian A, Tsay RS (2008) True or spurious long memory? A new test. J Bus Econ Stat 26:161–175
Stărică C, Granger C (2005) Nonstationarities in stock returns. Rev Econ Stat 87:503–522
Whitcher B, Jensen MJ (2000) Wavelet estimation of a local long-memory parameter. Explor Geophys 31:94–103
Wavelet Analysis and the Forward Premium
Anomaly
Michaela M. Kiermeier
Abstract Forward and corresponding spot rates on foreign exchange markets differ
so that forward rates cannot be used as unbiased predictors for future spot rates. This
phenomenon has entered the literature under the heading of the Forward Premium
Anomaly. We argue that standard econometric analyses implicitly assume that the
relationship is time scale independent. We use wavelet analysis to decompose the
exchange rate changes, and the forward premia, using the maximal overlap discrete
wavelet transform (MODWT). Then we estimate the relationship on a scale-by-
scale basis, thereby allowing for market inefficiencies such as noise, technical, and
feedback trading as well as fundamental and rational trading. The results show
that the forward premia serve as unbiased predictors for exchange rate changes
(unbiasedness hypothesis) for certain time scales only. Monthly and weekly data concerning the Euro, the US dollar, and the British Pound for forward periods from 1 month to 5 years are analysed. We find that the unbiasedness hypothesis cannot be rejected if the data is reconstructed using medium-term and long-term components. This is most prevalent for forward transaction periods up to 1 year.
1 Introduction
Spot and forward exchange rates are determined by current expectations about
future events. The theory of rational expectations links expectations about future inflation rates and interest rates with changes in prices in currency markets. Currency price adjustments result from various attempts of market participants to manage
risks and returns. The Interest Rate Parity states that investors demand a premium or a discount in forward exchange markets according to differentials in interest rates; see for example Shapiro (2009). Since forward rates and interest rates
are theoretically linked through the Uncovered Interest Rate Parity (UIP), we
investigate the rational expectation theory by focusing on the forward rate’s ability
to forecast future exchange rate changes. We thereby test whether current forward rates provide unbiased predictors for the next period's spot exchange rates, which is what
we call the unbiasedness hypothesis throughout this paper. Tests for the floating
exchange rate era indicate that future exchange rates are negatively correlated with
current forward rates. The interpretation is that the forward rate serves better as a contrary indicator than as an unbiased predictor for future spot rates.
According to Fama (1984) these empirical results are referred to as the “Forward
Premium Puzzle” or the “Forward Premium Anomaly”. Engel (1996) stresses that
the unbiasedness hypothesis is routinely rejected in empirical tests, which is also the finding of related research; see for example Hodrick (1987), MacDonald and Taylor
(1992), Taylor (1995), and Wang and Jones (2002).
Cutler et al. (1990) summarize three important characteristics of data concerning
forward premia and exchange rate changes. They find that monthly returns exhibit
positive autocorrelation with regards to previous months. Additionally, they point
to the fact that there is negative auto-correlation in the medium or long term, and
that returns are best explained by fundamentals in the long run. Analyses of survey
data support these findings. Allen and Taylor (1990) find that for the London foreign
exchange market 90 % of intraday and short term traders use technical analysis for
their decisions. In the long run, however, they argue that fundamentals are used by
85 % of market participants in forming expectations. Similar results are found by
Cheung and Wong (2000) on the Asian interbank markets for foreign exchange.
Positive feedback traders continue to buy when prices increase, and sell when prices decrease, whereas negative feedback traders do exactly the opposite. They
sell when prices increase and buy when prices decrease. Motivation for positive
feedback trading can arise from strategies that include portfolio insurance, positive
wealth elasticity, or simply from technical trading. Negative feedback trading, on
the other hand, can be induced by profit taking or investment strategies that ask
for a constant share of wealth in pre-defined asset classes. Cutler et al. (1990)
explain the positive 1 month autocorrelation of exchange rates by assuming that
fundamental or positive feedback traders only learn about the fundamentals with a
time lag. The use of technical analysis (i.e. noise trading) also results in positive
auto-correlation of exchange rates. The negative medium, or long term, auto-
correlation is then a direct result from misperceptions in the short run which are
corrected in the medium or long term. For these time periods fundamentals become
the main driving force. Other possible explanations for overshooting prices and
deviations from long term fundamental equilibrium are outlined by Black (1988),
and Frankel and Froot (1986), who assume that investors change their willingness
to take on risks according to non-fundamental factors, or as a result of the success
of forecasting models in the previous period. This way Frankel and Froot (1986) are
able to explain the continued deviation of the US$ from its equilibrium rate during
the time period 1980 to 1985.
If exchange rate markets are efficient, past observations of exchange rate changes, or forward premia, cannot be significant in explaining current exchange rates.
However, De Long et al. (1990, 1991) demonstrate that even rational investors can
have different opinions on the data generating process with regards to the near future
and the long run. They argue that rational investors can correctly perceive positive
feedback trading as the driving force behind price changes in the near future, and
at the same time acknowledge a reversion to a fundamental equilibrium in the long
run.
In this paper we apply wavelet analysis to be able to allow for various types
of trading as outlined above. The wavelet decomposition allows us to specifically
distinguish short, medium, and long run periods. At the same time we can allow
information from past observations to continue to be of importance for the respective
time periods. Within these time periods investors can either learn about the relevant
information with a time delay, or use feedback, noise, technical, fundamental,
rational trading as their respective strategy. We investigate if averaging over various
time periods veils the fact that the unbiasedness hypothesis holds true for certain
time scales only, i.e. that the fundamental relationship between forward premia and
exchange rate changes holds true only at certain time horizons. For that purpose
we decompose exchange rate changes, and the forward premia, into their time-scale
components using the maximal overlap discrete wavelet transform (MODWT). We
thereby restrict the variation of the data to be of influence for a certain time period
only. Decomposing weekly and monthly data to their respective time scales allows
us to distinguish one short term, three medium term, and three long term periods.
We then proceed by estimating the impact of the forward premia on exchange
rate changes on a scale-by-scale basis. The robustness of the results is tested by
analyzing forward transaction time periods that vary from 1 month to 5 years.
Only recently have researchers analyzed whether relationships hold over various time horizons, and not only for the short and the long run. This is why wavelet analysis has been applied to macro-economic and financial theories; see for example Ramsey and Lampart (1996), Kiermeier (2014), Kim and In (2003), Raihan et al. (2005), Gallegati et al. (2011), and Gencay et al. (2009).
This paper is organized in the following way. In Sect. 2 we briefly review the underlying theories and attempts to explain the forward premium anomaly. In Sect. 3 we introduce the basic ideas of wavelet analysis and motivate its use to test the unbiasedness hypothesis. Section 4 describes the data and the results from performing regression analyses on a scale-by-scale basis. Section 5 concludes.
To test the hypothesis that the forward rate (F) is an unbiased predictor for future spot rates (S), Eq. (1) has to be analyzed econometrically:

$$s_{t+1} - s_t = a + b\,(f_t - s_t) + u_{t+1} \qquad (1)$$
with u_t being a white noise error term. The lower-case letters s and f indicate the logarithmic transformations of the variables S and F.
For the unbiasedness hypothesis to hold, "a" needs to be equal to zero and "b" to one. If "b" equals one, the above specification becomes equal to Eq. (2):

$$s_{t+1} = a + b\,f_t + \epsilon_{t+1} \qquad (2)$$
The empirical evidence rejects the unbiasedness hypothesis. In general, the slope
coefficient in a regression of the ex post exchange rate changes on a constant and
the forward rate differential is significantly negative, see Engel (1996). Therefore, an
approach of time varying risk premia was introduced (see Hodrick and Srivastava
1986; Kaminski and Peruga 1990; Bensberg 2012, among others). Backus et al.
(1996) conclude that these models have serious shortcomings since they are not in
line with market data.
Froot and Thaler (1990) summarize models that attempt to explain the forward
premium anomaly including the peso problem and give an outlook for a possible
explanation. They argue that the assumption of efficient currency exchange markets
cannot be made because in practice investors have different response times to new
information. They therefore include past interest rate changes in their econometric
specification which results in some positive estimates of the coefficient “b”. Chinn
and Meredith (2004) use a macro-economic model to give a theoretical foundation
for the necessity that different time periods have to be considered in the econometric
analysis.
In this paper we extend the idea of analyzing different time horizons, and allow
for inefficiencies, such as delayed learning about relevant information, or other
forms of feedback, or technical trading as outlined above.
Standard econometric estimation techniques are able to distinguish between
short and long term dynamics only. Non-stationary features of the data are usually
removed prior to performing an analysis, resulting in the known problem that
relationships seem to change in times of financial distress. Different data generating
processes (regimes) seem to govern price movements in financial markets. We do
not adjust the data prior to the regression analyses. Instead, we decompose the data with wavelet analysis, which allows various forms of non-stationarity to be present in the data without causing problems in our analysis when we estimate the relationship on a scale-by-scale basis.
3 Estimation Techniques
Time series analysis and standard econometric methods cannot account for changes
in frequency behavior. We use wavelets as a time–frequency analysis that provides
information about the frequency behavior of time series at a given point in
time. Wavelet analysis estimates the frequency structure of a time series (forward
premium and exchange rate changes). In addition, it retains the information about when an event in the time series takes place.
as a rotation in the function space. The basis functions used in that transformation
are wavelets which have finite support on the time axis, i.e. are small waves. For
the purpose of transforming the time series, the basis function (wavelet) is dilated,
or compressed, to capture frequency behavior, and is shifted along the time axis to
capture the date when a certain event takes place. This is how it is possible for a
disturbance to be of influence for certain frequencies, or finite time periods only.
The result is a representation of the time series in the time and frequency domain.
The wavelet approach allows an analysis of processes whose behavior differs across scales, i.e. processes that exhibit different behavior over different time horizons.
This is most likely the case for the (forward) currency exchange market due to the
aforementioned reasons.
For the purpose of allowing different behavior at different time horizons, the variables exchange rate change and forward premium are decomposed into their time-scale components applying the maximal overlap discrete wavelet transform (MODWT). This procedure allows for any length of time series and is able to produce robust estimators. Wavelets (ψ_{j,k} and φ_{J,k}), when multiplied with their respective coefficients at a certain level "j" or "J", are called atoms D_{j,k} and S_{J,k} (i.e. d_{j,k}·ψ_{j,k} = D_{j,k} and s_{J,k}·φ_{J,k} = S_{J,k}), with ψ_{j,k} and φ_{J,k} being the wavelet and scaling functions at level "j" or "J" and "k" indicating the location of the wavelet on the time axis. The sums of all atoms, S_{J,k}(t) and D_{j,k}(t), over all locations on the time axis, k = 1, ..., 2^{n−j}, at a certain level "j" or "J" are called crystals and are given by Eqs. (3) and (4):

$$S_J(t) = \sum_{k=1}^{2^{n-J}} S_{J,k}(t), \qquad (3)$$

$$D_j(t) = \sum_{k=1}^{2^{n-j}} D_{j,k}(t). \qquad (4)$$
Defining the importance of information to be valid for a specific time period only, the time series are decomposed to their respective resolutions in time (time scales). The time series forward premia and exchange rate changes are then approximated using only parts of the coefficients and their respective wavelets. To analyze the impact of information for a certain time period, the multiresolution decomposition of Eq. (5) is applied to the time series (s_{t+1} − s_t) and (f_t − s_t):

$$x(t) = S_J(t) + D_J(t) + D_{J-1}(t) + \cdots + D_1(t). \qquad (5)$$
The wavelets used in the analysis are "symmlets", which are smooth and comparatively symmetric filter functions. The decomposition is performed sequentially from the smallest (high frequencies) to the largest (low frequencies) scales. The support width on the time axis doubles in size with each following level, i.e. scale. The number of scales used in this analysis equals five (i.e. J = 5), which is a direct result of the number of observations available (see Crowley 2005). We then perform the regression analysis at each level. Changes in exchange rates are regressed on the forward premia at different time scales, i.e. Eq. (1) is estimated at every time scale 1, ..., J using the reconstructed time series as outlined in Eqs. (6) and (7):
$$(s_{t+1} - s_t)\big[D^e_j(t)\big] = a + b\,(f_t - s_t)\big[D^p_j(t)\big] + u_{t+1} \qquad \forall\, D1, \ldots, D5 \qquad (6)$$

$$(s_{t+1} - s_t)\big[S^e_5(t)\big] = a + b\,(f_t - s_t)\big[S^p_5(t)\big] + u_{t+1} \qquad S5 \qquad (7)$$
The unbiasedness hypothesis is then tested by imposing and testing the linear restriction on the estimated parameter as in the aggregate analysis, i.e. the linear restriction is imposed on "b" to equal one.
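Operationally, the scale-by-scale estimation of Eqs. (6)–(7) and the Wald test of b = 1 amount to a loop of OLS regressions. The sketch below is our own illustration: the crystal series are assumed to have been reconstructed beforehand (e.g. by a MODWT multiresolution decomposition with symmlets and J = 5, as in this chapter), and the function and variable names are ours, not the author's.

```python
import pandas as pd
import statsmodels.formula.api as smf

def test_unbiasedness(ds_crystals, fp_crystals):
    """Estimate Eq. (1) crystal by crystal and Wald-test b = 1.

    ds_crystals / fp_crystals: dicts mapping crystal names ('D1', ..., 'S5')
    to the reconstructed exchange-rate-change and forward-premium series
    (arrays of equal length).
    """
    rows = []
    for name, ds in ds_crystals.items():
        df = pd.DataFrame({"ds": ds, "fp": fp_crystals[name]})
        res = smf.ols("ds ~ fp", data=df).fit()
        wald = res.f_test("fp = 1")              # H0: unbiasedness, b = 1
        rows.append((name, res.params["fp"], res.rsquared, float(wald.pvalue)))
    return pd.DataFrame(rows, columns=["crystal", "b_hat", "R2", "p_b_eq_1"])
```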
4 Empirical Analysis
The data used in this analysis are taken from Bank of America/Merrill Lynch. Weekly and monthly Eurocurrency rates for the Euro area, the UK, and the US are used. Forward rates are calculated for time periods of 1, 3, 6, and 12 months, and 2 and 5 years. The weekly rates are available from January 2000 to March 2012; the monthly closing data are available from January 1998 until June 2013. The exchange rates for the currencies US$/Euro and US$/British Pound are available as weekly and monthly observations for all estimation periods from the same source. We begin our analysis with monthly observations and a 1-month forward transaction period. The forward rates are calculated according to the interest rate parity from the monthly data for spot exchange rates and interest rates, using Eq. (8):
$$F_t = S_t\, \frac{1 + i_f}{1 + i_d} \qquad (8)$$
with

F_t = forward exchange rate (foreign currency per one unit of domestic currency) at t
S_t = spot exchange rate at time t
i_f = foreign 1-month Eurocurrency rate
i_d = domestic 1-month Eurocurrency rate
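Eq. (8) is a one-line computation. The sketch below is our own illustration, with the Eurocurrency rates expressed per forward period in decimal form.

```python
def forward_rate(spot, i_foreign, i_domestic):
    """Covered interest rate parity, Eq. (8): F_t = S_t (1 + i_f) / (1 + i_d).

    Rates are per forward period (here, 1 month), in decimal form.
    """
    return spot * (1.0 + i_foreign) / (1.0 + i_domestic)

# e.g. spot = 1.10, 1-month rates of 0.2 % (foreign) and 0.1 % (domestic)
print(forward_rate(1.10, 0.002, 0.001))   # ~1.1011
```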
We then proceed by applying wavelet analysis to the data to allow for the
possibility that averaging over time scales veils the fact that the forward rate is an
unbiased predictor for future spot rates with regards to certain time periods only.
Table 1 Variation of the time series explained by crystals (in %) for a forward transaction period of 1 month

Crystal   Forward premium   Exchange rate change   Forward premium   Exchange rate change
          (US/Euro)         (US/Euro)              (US/BP)           (US/BP)
D1         0.455            50.065                  0.718            52.012
D2         0.461            23.744                  0.508            16.781
D3         1.013            13.291                  0.934            14.841
D4         4.164             7.469                  5.295             9.321
D5        23.457             2.692                 19.533             4.160
S5        70.450             2.738                 73.013             2.886
Table 2 Regression results for the US$–Euro and US$–British Pound exchange rate changes regressed on forward premia using reconstructed time series (1 month forward rates)

US$/Euro                                     US$/British Pound
Crystals  Intercept  Forward prem.  R²       Crystals  Intercept  Forward prem.  R²
D1        0.00*       3.1*          0.02     D1        0.00*       1.08          0.01
D2        0.00*       1.17          0.01     D2        0.00*       1.7*          0.03
D3        0.00*       0.33          0.00     D3        0.00*       1.1*          0.02
D4        0.00*       1.7*          0.21     D4        0.00*       1.3*          0.26
D5        0.00*       0.067         0.01     D5        0.00*       0.43*         0.25
S5        0.00*      −0.23*         0.3      S5        0.00*      −0.47*         0.6

*Significant at the 5 % level
Table 2 reports the results of regressing the exchange rate changes on the respective forward premia using the reconstructed time series at scales "D1"–"S5" for a forward transaction period of 1 month.
For the monthly data of the US$–Euro exchange rate we find a significant influence of forward premia in explaining exchange rate changes at scales "D1", "D4", and "S5". The significant influence that we find at scales "D1" and "D4" is positive. They represent the short medium term (2–4 months) and the short long term (1.3–2.6 years), respectively. At level "S5" (more than 5 years) the relationship is, however, significantly negative. The amount of variation explained is highest at levels "D4" and "S5", and in all three cases the F-statistics support the estimation design. We therefore conclude that if information from the forward rate is allowed to be of influence for 2–4 months and 1.3–2.6 years, respectively, then the variables from Eqs. (6)–(7) are significantly, positively linked. That is, if we allow information from the previous 2–4 months at scale "D1", and from the previous 1.3–2.6 years at scale "D4", to be relevant in explaining the corresponding adjustment periods of the exchange rate changes, the estimated relationship is positive, as predicted by rational expectation theory. In the long run (more than 5 years), however, at level "S5" the relation is significantly negative. This indicates that at this level reversion to a mean is the main driving force for market prices. The mean reversion in the long run can result from error correction.
For the US$–British Pound, changes in exchange rates can be significantly explained by forward premia at levels "D2", "D3", "D4", "D5", and "S5". Again,
Table 3 Wald-test of the unbiasedness hypothesis for a forward transaction time period of 1 month

US$/Euro                                    US$/British Pound
Crystals  Test of unbiasedness hypothesis   Crystals  Test of unbiasedness hypothesis
D1        Not rejected                      D1        Not rejected
D2        Not rejected                      D2        Not rejected
D3        Not rejected                      D3        Not rejected
D4        Rejected                          D4        Not rejected
D5        Rejected                          D5        Rejected
S5        Rejected                          S5        Rejected
with the exception of “S5” the estimated relationship is significantly positive in the
medium term. For the medium term ("D4") and long term ("D5" and "S5") the amount
of variation in exchange rate changes explained by the respective components of the
forward premia is highest. The F-test supports the estimation set-up. For the US$–
British Pound, rational expectations theory is supported at two medium-term
scales and two long-term scales. In general, the statistical evidence for rational
expectations theory as the main driving force behind exchange rate changes is
stronger than in the case of the US$–Euro market, because the estimated relationship
is significantly positive at all scales except "D1" and "S5". At the highest frequency,
"D1", the data are not conclusive. This can be a result of the continued importance
of technical analysis for that time period, as was pointed out by Cheung and Wong
(2000). As in the case of the US$–Euro exchange rate, at the longest time period
(more than 5 years) the relationship is significantly negative. This supports the idea
that in the very long run reversions to a fundamental equilibrium take place, which
is one of the stylized facts about capital markets put forth by Cutler et al. (1990).
Determining the significant components gives us insight into how long the time
periods for processing information are. To test the unbiasedness hypothesis, however,
we need to impose the linear restriction that the estimated coefficients from Eqs. (6)–
(7) comply with rational expectations theory. We apply the Wald test for linear
restrictions in a regression model. The null hypothesis (unbiasedness hypothesis)
requires the estimated coefficients of the forward premia "b" to be equal to one.
Under the null hypothesis the Wald statistic follows an F-distribution, with
degrees of freedom given by the number of restrictions (i.e. one) and the number
of observations in the respective regression analyses; see Wald (1943). The results
of these tests are summarized in Table 3.
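A minimal sketch of this Wald test in Python, using statsmodels and simulated stand-ins for the reconstructed series (the chapter's data and scale-wise reconstruction are not reproduced here):

```python
import numpy as np
import statsmodels.api as sm

# Simulated stand-ins for a reconstructed forward premium and the matching
# exchange rate change at one scale (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
premium = rng.normal(size=240)
change = 0.9 * premium + rng.normal(scale=0.5, size=240)

res = sm.OLS(change, sm.add_constant(premium)).fit()

# Wald test of the unbiasedness restriction b = 1 on the slope coefficient;
# with one restriction the statistic is F-distributed under the null.
print(res.wald_test("x1 = 1", use_f=True))
```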
For the US$–Euro exchange rate we find that the unbiasedness hypothesis is not
rejected at levels "D1", "D2", and "D3". A significant positive influence of forward
premia at these scales is, however, only given at scale "D1". In other words,
the null hypothesis that the estimated coefficient is equal to one is statistically
meaningful only at "D1" (short medium term, 2–4 months), the only one of these
scales at which the premium is significant. Although the hypothesis is not rejected
at scales "D2" and "D3" as well, the regression results indicate that at these time
scales the forward premium is not significant in explaining changes in exchange
rates. We therefore conclude that, for the US$–Euro exchange rate, the forward
premium is significant and the unbiasedness hypothesis is not rejected
only at a time scale where characteristics of the data are influential for 2–4 months
(short medium term). At time scales where information is of importance for a
longer time period, either the forward premium is not significant or the unbiasedness
hypothesis is rejected.
For the US$–British Pound exchange rate the forward premia are significant
in explaining future exchange rate changes at levels "D2"–"S5". The hypothesis
that the estimated coefficient of the forward premia equals one is not rejected at the
significant levels "D2", "D3", and "D4". At time scale "D1" the hypothesis is not
rejected either, but the regression results indicate that the forward premium is not
significant as an explanatory variable. The probability of the unbiasedness hypothesis
holding is highest at level "D4". The US$–British Pound exchange rate thus depicts
different characteristics than the US$–Euro exchange rate: we find a significant
influence at three levels ("D2", "D3", and "D4"), and in addition the unbiasedness
hypothesis cannot be rejected at these levels. The time scale "D2" represents data
characteristics that prevail for 4–8 months, "D3" for 8–16 months, and "D4" for
1.3–2.6 years.
We conclude that aggregating over the time scales "D1"–"S5" results in misleading
interpretations of the influence of the forward premia in explaining future exchange
rate changes, because the data demonstrate different behavior in the medium and
long term. Only at time scales that represent the medium term does the premium
have a significant, positive influence on future exchange rate changes. We find
different time scales to be significant for different exchange rates.
Finally, to analyze the short-term period (2–4 weeks) as well, the above analysis
is repeated using weekly data, which allow the definition of such a period. The
hypothesis is not supported by statistical inference in the short run, which is what
the survey data suggest (see Allen and Taylor 1990; Cheung and Wong 2000). At
the short horizon, technical trading is perceived to be the most important influence
in forming expectations; therefore, the insignificance of the forward rate in explaining
exchange rate changes for that time period is in line with previous results and market
data.
In order to check the robustness of our findings, we also analyze the forward
premium anomaly for times to maturity of the forward contract of 3 months, 6
months, and 1, 2, and 5 years in the same manner as described above.
The MODWT shows similar influences of the crystals "D1"–"S5" in explaining the
variance of the time series of exchange rate changes and forward premia in the case
of the 3-, 6-, and 12-month forward transaction periods. Again, the exchange rate
changes are best explained by information at every time scale, whereas for the
forward premia lower scales carry more influence. Once again, we reconstruct the
time series of exchange rate changes and forward premia for the various forward
transaction periods with the information captured at the time scales "D1"–"S5" for
the two exchange rates US$–Euro and US$–British Pound. With the reconstructed
time series we estimate Eqs. (6) and (7). In the case of the 3- and 6-month forward
transaction periods we find results similar to those for the 1-month forward transaction
period. However, there are differences with regard to the significantly positive
relationships in the medium term: the relationships are less significant.

5 Conclusion
In this paper we argue that the assumptions made in standard econometric procedures
to test the unbiasedness hypothesis might be responsible for the failure of
the theory to be validated in practice. We use the maximal overlap discrete wavelet
transform to decompose the data into their time-scale components, to allow for
inefficiencies in the exchange rate markets. We assume feedback, noise, technical,
fundamental, and rational trading to be present. The decomposition of the time series
of exchange rate changes and forward premia allows information to remain
relevant in price formation for specific, pre-defined time periods. In this way we
analyze the forward premium anomaly at different time scales. We then test the
unbiasedness hypothesis at the respective scales and find that the hypothesis cannot
be rejected at certain time scales. We obtain different results for the US$–
Euro and the US$–British Pound exchange rates. For the forward transaction
period of 1 month, a significant positive relationship between forward premia and
exchange rate changes can be found in the medium term for both currencies. It is
more pronounced in the case of the US$–British Pound exchange rate. The
unbiasedness hypothesis is supported for the US$–Euro exchange rate at a time
scale of 2–4 months. For the US$–British Pound exchange rate, the unbiasedness
hypothesis is supported at medium-term time scales and for the time period of
1.3–2.6 years (i.e. the long term). The analysis of weekly data allows for the
definition of a short-term period; we find that the unbiasedness hypothesis is not
supported in the short run, which is in line with survey data for exchange rate
markets. The findings are similar when the forward transaction period is extended
from 1 month to 5 years. For forecasting periods above a year, the influence of the
forward market is mostly significantly negative. We conclude that the adjustment
time period to new information is crucial for the validity of the unbiasedness
hypothesis. Aggregating over the time scales veils the fact that the theory seems
appropriate for certain time periods only. The unbiasedness hypothesis is supported
for the medium term.
References
Allen H, Taylor MP (1990) Charts, noise and fundamentals in the London foreign exchange market.
Econ J 100:49–59
Backus D, Foresi S, Telmer C (1996) Affine models of currency pricing. NBER Working Paper
5623
Bensberg D (2012) Das forward premium puzzle als ergebnis adverser selektion: eine untersuchung
auf theoretischer basis. SVH-Verlag, Saarbruecken
Black F (1988) An equilibrium model of the crash. NBER Macroeconomics Annual, pp 269–395
Cheung Y, Wong C (2000) A survey of market practitioners’ views on exchange rate dynamics.
J Int Econ 51:401–419
Chinn MD, Meredith G (2004) Monetary policy and long horizon uncovered interest parity, IMF
Staff Papers 51(3)
Crowley PM (2005) An intuitive guide to wavelets for economists. Bank of Finland Research
Discussion Papers
Cutler DM, Poterba JM, Summers LH (1990) Speculative dynamics and the role of feedback
traders. Am Econ Rev 80(2):63–68
De Long JB, Shleifer A, Summers LH, Waldmann RJ (1990) Positive-feedback investment
strategies and destabilizing rational speculation. J Finance 45(2):379–395
De Long JB, Shleifer A, Summers LH, Waldmann RJ (1991) The survival of noise traders in
financial markets. J Bus 64:1–19
Engel C (1996) The forward discount anomaly and the risk premium: a survey of recent evidence.
J Empirical Finance 3(2):123–192
Fama E (1984) Forward and spot exchange rates. J Monetary Econ 14:319–338
Frankel JA, Froot KA (1986) Understanding the US dollar in the eighties: the expectations of
chartists and fundamentalists. Economic Record Special Issue, 24–38
Froot KA, Thaler RH (1990) Anomalies: foreign exchange. J Econ Perspect 4(3):179–192
Gallegati M, Gallegati M, Ramsey JB, Semmler W (2011) The US wage Phillips curve across
frequencies and over time. Oxf Bull Econ Stat 73(4):489–508
Gençay R, Selçuk F, Whitcher B (2009) An introduction to wavelets and other filtering methods
in finance and economics. Academic, Philadelphia
Hodrick RJ (1987) The empirical evidence on the efficiency of forward and futures foreign
exchange market. Harwood Academic Publisher, Switzerland
Hodrick RJ, Srivastava S (1986) The covariation of risk premia and expected future spot rates. J
Int Money Finance 3:5–30
Kaminski G, Peruga R (1990) Can a time varying risk premium explain excess returns in the
forward market for foreign exchange? J Int Econ 28:47–70
Kim S, In F (2003) The relationship between financial variables and real economic activity:
evidence from spectral and wavelet analysis. Stud Nonlinear Dynamics Econ 7(4):1–18
Kiermeier MM (2014) Essay on wavelet analysis and the European term structure of interest rates.
Business and Economic Horizons 9(4):18–26. doi:10.15208/beh.2013.19
MacDonald R, Taylor MP (1992) Exchange rate economics: a survey. IMF Staff Papers 39:1–57
Raihan S, Wen Y, Zeng B (2005) Wavelet: a new tool for business cycle analysis. The Federal
Reserve Bank of St. Louis, Working Paper 2005-050A
Ramsey JB, Lampart C (1996) The decomposition of economic relationships by time scale using
wavelets. New York University, New York
Shapiro AC (2009) Multinational financial management. Wiley, Hoboken
Taylor MP (1995) The economics of exchange rates. J Econ Lit 33(1):13–47
Wald A (1943) Tests of statistical hypothesis concerning several parameters when the number of
observations is large. Trans Am Math Soc 54:426–482
Wang P, Jones T (2002) Testing for efficiency and rationality in foreign exchange markets: a
review of the literature and research on foreign exchange market efficiency and rationality with
comments. J Int Money Finance 21:223–239
Oil Shocks and the Euro as an Optimum
Currency Area
Abstract We use wavelet analysis to study the impact of the Euro adoption on
the oil price macroeconomy relation in the Euroland. We uncover evidence that
the oil-macroeconomy relation changed in the past decades. We show that after
the Euro adoption some countries became more similar with respect to how their
macroeconomies react to oil shocks. However, we also conclude that the adoption
of the common currency did not contribute to a higher degree of synchronization
between Portugal, Ireland and Belgium and the rest of the countries in the Euroland.
On the contrary, in these countries the macroeconomic reaction to an oil shock
became more asymmetric after adopting the Euro.
L. Aguiar-Conraria
NIPE and Economics Department, University of Minho, Braga, Portugal
e-mail: [email protected]

T.M. Rodrigues
Economics Department, University of Minho, Braga, Portugal
e-mail: [email protected]

M.J. Soares
NIPE and Department of Mathematics and Applications, University of Minho, Braga, Portugal
e-mail: [email protected]

1 Introduction
necessary condition: a country with an asynchronous business cycle will face several
difficulties in a monetary union, because of the ‘wrong’ stabilization policies.
In the economics literature, to test if a group of countries form an Optimum
Currency Area (OCA), it is common to check if the different countries face
essentially symmetric or asymmetric exogenous shocks (e.g. see Peersman 2011). In
the latter case, it is more difficult to argue for a monetary union. However, even if the
shock is symmetric, one still has to check if its impact is similar across countries.
If this is not the case, the symmetric shock will have asymmetric effects, which
deteriorates the case for a monetary union.
There is a caveat to the previous argument. Some authors argue that even if
a region is not ex ante an OCA it may, ex post, become one. The argument for
this endogenous OCA is simple and intuitive: by itself the creation of a common
currency area will create the conditions for the area to become an OCA. For
example, Frankel and Rose (1998) and Rose and Engel (2002) argue that, because
currency union members have more trade, business cycles are more synchronized
across currency union countries. Imbs (2004) makes a similar argument for financial
links. After the creation of a currency area, the finance sector will become more
integrated and hence business cycles will become more synchronized. In effect,
Inklaar et al. (2008) conclude that convergence in monetary and fiscal policies
has a significant impact on business cycle synchronization. However, Baxter and
Kouparitsas (2005) conclude otherwise and Camacho et al. (2008) present evidence
that differences between business cycles in Europe have not been disappearing.
We tackle this issue by focusing on one shock that every country faces: oil
price changes. We study the relation between oil and the macroeconomy in the 11
countries that first joined the Euro in 1999. We investigate how this relation changed
after the adoption of the Euro and test if it became more or less asymmetric after
the Euro adoption. The analysis is performed in the time-frequency domain, using
wavelet analysis.
We are not the first authors to use wavelets to analyse the oil price-
macroeconomy relationship. Naccache (2011) and Aguiar-Conraria and Soares
(2011a) have already relied on this technique to assess this relation. Actually,
wavelet analysis is particularly well suited for this purpose for several reasons.
First, because oil price dynamics is highly nonstationary, it is important to use
a technique, such as wavelet analysis, that does not require stationarity. Second,
wavelet analysis is particularly useful to study how relations evolve not only
across time, but also across frequencies, as it is unlikely that these relations remain
invariant. Third, Kyrtsou et al. (2009) presented evidence showing that several
energy markets display consistent nonlinear dependencies. Based on their analysis,
the authors call for nonlinear methods to analyze the impact of oil shocks. Wavelet
analysis is one such method. We should also add that wavelets have already proven
to be insightful when studying business cycles synchronizations, e.g. see Aguiar-
Conraria and Soares (2011b) and Crowley and Mayes (2008).
We use data on the Industrial Production for the first countries joining the Euro
and estimate the coherence between this variable and oil prices. The statistical
procedure is similar to the one used by Vacha and Barunik (2012) to study
where $s$ is a scaling or dilation factor that controls the width of the wavelet and $\tau$ is a
translation parameter controlling the location of the wavelet. Here, and throughout,
the bar denotes complex conjugation.

When the wavelet $\psi(t)$ is chosen as a complex-valued function, as we do, the
wavelet transform $W_x(\tau,s)$ is also complex-valued. In this case, the transform can be
separated into its real part, $\Re(W_x)$, and imaginary part, $\Im(W_x)$, or into its amplitude,
$|W_x(\tau,s)|$, and phase, $\phi_x(\tau,s)$: $W_x(\tau,s) = |W_x(\tau,s)|\, e^{i \phi_x(\tau,s)}$.¹ For real-valued
wavelet functions, the imaginary part is identically zero and the phase is, therefore,
undefined.
When one is interested in studying the oscillatory behavior of a variable, or
a set of variables, it is almost mandatory to use a complex wavelet, because the
phase yields important information about the position of the variable in the cycle. In
particular, if one is comparing two time-series, one can compute the phases and the
phase-difference of the wavelet transform of each series and thus obtain information
about the possible delays in the oscillations of the two series as a function of time
and frequency.
In order to describe the time-frequency localization properties of the CWT, we
have to assume that both the wavelet $\psi$ and its Fourier transform $\hat{\psi}$ are well-localized
functions. More precisely, these functions must have sufficient decay to
guarantee that the quantities defined below are all finite.² In what follows, for
simplicity, assume that the wavelet has been normalized so that $\int_{-\infty}^{\infty} |\psi(t)|^2\, dt = 1$.
With this normalization, $|\psi(t)|^2$ defines a probability density function. The mean
and standard deviation of this distribution are called, respectively, the center, $\mu$,
and radius, $\sigma$, of the wavelet. They are, naturally, measures of localization and
spread of the wavelet. The center $\hat{\mu}$ and radius $\hat{\sigma}$ of $\hat{\psi}$, the Fourier transform of
the wavelet $\psi$, are defined in a similar manner. The interval $[\mu - \sigma, \mu + \sigma]$
is the set where $\psi(t)$ attains its "most significant" values, whilst the interval
$[\hat{\mu} - \hat{\sigma}, \hat{\mu} + \hat{\sigma}]$ plays the same role for $\hat{\psi}(f)$. The rectangle $H :=
[\mu - \sigma, \mu + \sigma] \times [\hat{\mu} - \hat{\sigma}, \hat{\mu} + \hat{\sigma}]$ in the $(t, f)$ plane is called the
Heisenberg box, or window, for the function $\psi$. We then say that $\psi$ is localized
around the point $(\mu, \hat{\mu})$ of the time-frequency plane, with uncertainty given
by $\sigma \hat{\sigma}$. The Heisenberg uncertainty principle establishes that the uncertainty is
bounded from below by the quantity $1/2$.
The Morlet wavelet became the most popular of the complex-valued wavelets for
several reasons.³ Among them we highlight two: (1) the Heisenberg box area reaches
its lower bound with this wavelet, i.e. the uncertainty attains the minimum possible
value; (2) the time radius and the frequency radius are equal, $\sigma = \hat{\sigma} = \frac{1}{\sqrt{2}}$, and,
therefore, this wavelet represents the best compromise between time and frequency
concentration. The Morlet wavelet is given by
$$\psi_{\omega_0}(t) = \pi^{-1/4}\, e^{i \omega_0 t}\, e^{-\frac{t^2}{2}}. \qquad (2)$$

¹ The phase-angle $\phi_x(\tau,s)$ of the complex number $W_x(\tau,s)$ can be obtained from the formula
$\tan(\phi_x(\tau,s)) = \Im(W_x(\tau,s))/\Re(W_x(\tau,s))$, using the information on the signs of $\Re(W_x)$ and $\Im(W_x)$ to determine
the quadrant to which the angle belongs.
² The precise requirements are that $|\psi(t)| \leq C(1+|t|)^{-(1+\epsilon)}$ and $|\hat{\psi}(f)| \leq C(1+|f|)^{-(1+\epsilon)}$,
for some $C < \infty$ and $\epsilon > 0$.
³ Actually, it is also common to call it the Gabor wavelet. Authors who do this usually reserve the
name Morlet for the real part of Eq. (2).
In analogy with the terminology used in the Fourier case, the (local) wavelet
power spectrum (sometimes called scalogram or wavelet periodogram) is defined
as $WPS_x(\tau,s) = |W_x(\tau,s)|^2$. This gives us a measure of the variance distribution
of the time series in the time-scale (frequency) plane.
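To make Eq. (2) and the wavelet power spectrum concrete, here is a small, self-contained Python sketch. The direct O(N²) summation, the customary choice ω₀ = 6 (the excerpt does not state the value used), and the test signal are our own illustrative assumptions; production code would typically use FFT-based convolution.

```python
import numpy as np

def morlet(t, omega0=6.0):
    """Morlet wavelet, Eq. (2): pi**(-1/4) * exp(i*omega0*t) * exp(-t**2/2)."""
    return np.pi ** -0.25 * np.exp(1j * omega0 * t) * np.exp(-t ** 2 / 2)

def cwt(x, scales, dt=1.0):
    """Continuous wavelet transform by direct summation (illustrative)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(n) * dt
    W = np.empty((len(scales), n), dtype=complex)
    for k, s in enumerate(scales):
        for i in range(n):
            # W(tau, s) = sum_t x(t) * conj(psi((t - tau) / s)) / sqrt(s) * dt
            W[k, i] = np.sum(x * np.conj(morlet((t - t[i]) / s)) / np.sqrt(s)) * dt
    return W

x = np.sin(2 * np.pi * np.arange(512) / 32) + 0.5 * np.random.randn(512)
W = cwt(x, scales=2.0 ** np.arange(1, 7))
power = np.abs(W) ** 2   # wavelet power spectrum WPS_x = |W_x|^2
```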
In our applications, we are interested in detecting and quantifying relationships
between two time series. The concepts of cross-wavelet power, cross-wavelet
coherency and wavelet phase-difference are natural generalizations of the basic
wavelet analysis tools that enable us to deal with the time-frequency dependencies
between two time-series.
The cross-wavelet transform of two time series, $x(t)$ and $y(t)$, is defined as

$$W_{xy}(\tau,s) = W_x(\tau,s)\, \overline{W_y(\tau,s)}, \qquad (3)$$

where $W_x$ and $W_y$ are the wavelet transforms of $x$ and $y$, respectively. The cross-wavelet
power is simply given by $|W_{xy}(\tau,s)|$. While we can interpret the wavelet
power spectrum as depicting the local variance of a time series, the cross-wavelet
power of two time series depicts the local covariance between these time series at
each time and frequency.
In analogy with the concept of coherency used in Fourier analysis, given two
time series $x(t)$ and $y(t)$, one can define their complex wavelet coherency $\varrho_{xy}$ by

$$\varrho_{xy} = \frac{S(W_{xy})}{\left[ S(|W_x|^2)\, S(|W_y|^2) \right]^{1/2}}, \qquad (4)$$

where $S$ denotes a smoothing operator in both time and scale; without smoothing,
the coherency would be identically one at all scales and times.⁴ Time and scale
smoothing can be achieved by convolution with appropriate windows; see
Aguiar-Conraria and Soares (2014) for details.

The absolute value of the complex wavelet coherency is called the wavelet
coherency and is denoted by $R_{xy}$, i.e.

$$R_{xy} = \frac{|S(W_{xy})|}{\left[ S(|W_x|^2)\, S(|W_y|^2) \right]^{1/2}}. \qquad (5)$$
A phase-difference⁵ of zero indicates that the time series move together at the
specified time-frequency; if $\phi_{xy} \in (0, \pi/2)$, then the series move in phase, but the time
series $x$ leads $y$; if $\phi_{xy} \in (-\pi/2, 0)$, then it is $y$ that is leading; a phase-difference of
$\pi$ (or $-\pi$) indicates an anti-phase relation; if $\phi_{xy} \in (\pi/2, \pi)$, then $y$ is leading; time
series $x$ is leading if $\phi_{xy} \in (-\pi, -\pi/2)$.
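These phase conventions translate directly into code. In the sketch below the function names are ours; the phase-difference is obtained as the angle of $W_x \overline{W_y}$ (or of the smoothed cross-wavelet quantity):

```python
import numpy as np

def phase_difference(Wx, Wy):
    """Pointwise phase-difference of two complex wavelet transforms, in (-pi, pi]."""
    return np.angle(Wx * np.conj(Wy))

def lead_lag(phi):
    """Classify a single phase-difference value following the text's convention."""
    if np.isclose(phi, 0.0):
        return "series move together"
    if np.isclose(abs(phi), np.pi):
        return "anti-phase relation"
    if 0 < phi < np.pi / 2:
        return "in phase, x leads y"
    if -np.pi / 2 < phi < 0:
        return "in phase, y leads x"
    if np.pi / 2 < phi < np.pi:
        return "anti-phase region, y leading"
    return "anti-phase region, x leading"   # phi in (-pi, -pi/2)
```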
To test for statistical significance of the wavelet coherency we rely on Monte
Carlo simulations. However, there are no such tests for the phase-differences,
because there is no consensus on how to define the null hypothesis. The advice is
that we should only interpret the phase-difference on the regions where coherency
is statistically significant.
$$C_{xy} = U \Sigma V^H, \qquad (7)$$

⁴ In the above formula and in what follows, we omit the arguments $(\tau, s)$.
⁵ Some authors prefer a slightly different definition, $\operatorname{Arctan}\!\left(\Im(W_{xy})/\Re(W_{xy})\right)$. In this case, one would

The leading patterns $l_{kx}$ and $l_{ky}$ are defined as

$$l_{kx} := u_k^H C_x \quad \text{and} \quad l_{ky} := v_k^H C_y. \qquad (8)$$

It can be shown that

$$C_x = \sum_{k=1}^{F} u_k\, l_{kx}, \qquad C_y = \sum_{k=1}^{F} v_k\, l_{ky}, \qquad (9)$$

and also that very good approximations can be obtained by using only a small
number $K < F$ of terms in the above expressions.
After reducing the information contained in the complex coherency matrices Cx
and Cy to a few components, say the K most relevant leading patterns and singular
vectors, the idea is to define a distance between the two matrices by appropriately
measuring the distances from these components. We compute the distance between
two vectors (leading patterns or leading vectors) by measuring the angles between
each pair of corresponding segments, defined by the consecutive points of the two
vectors, and take the mean of these values. This would be easy to perform if all the
values were real. In our case, because we use a complex wavelet, we need to define
an angle in a complex vector space. Aguiar-Conraria and Soares (2011b) discuss
several alternatives. In this paper, we make use of the Hermitian inner product
$\langle a, b\rangle_C = a^H b$ and corresponding norm $\|a\| = \sqrt{\langle a, a\rangle_C}$, and compute the so-called
Hermitian angle between the complex vectors $a$ and $b$, $\Theta_H(a,b)$, by the formula

$$\cos(\Theta_H) = \frac{|\langle a, b\rangle_C|}{\|a\|\,\|b\|}, \qquad \Theta_H \in \left[0, \tfrac{\pi}{2}\right]. \qquad (10)$$

Given two vectors $p$ and $q$ with $M$ components each, their distance is computed as

$$d(p,q) = \frac{1}{M-1} \sum_{i=1}^{M-1} \Theta_H\!\left(s_i^p, s_i^q\right), \qquad (11)$$

where the $i$-th segment $s_i^p$ is the two-vector $s_i^p := (i+1, p_{i+1}) - (i, p_i) = (1, p_{i+1} - p_i)$.
To compare the matrix $C_x$ of the complex wavelet coherencies of country $x$
with the corresponding matrix for country $y$, $C_y$, we then compute the following
distance:

$$\operatorname{dist}(C_x, C_y) = \frac{\sum_{k=1}^{K} \sigma_k^2 \left[ d(l_k^x, l_k^y) + d(u_k, v_k) \right]}{\sum_{k=1}^{K} \sigma_k^2}, \qquad (12)$$

where $\sigma_k$ are the $k$-th largest singular values, corresponding to the first $K$ leading
patterns and $K$ leading vectors.
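A Python sketch of Eqs. (10)-(12) follows. Because the excerpt does not show how the matrix factored in Eq. (7) is built from $C_x$ and $C_y$, the line constructing it below is a pure assumption (flagged in the comments); the Hermitian angle and the segment-based distance implement the formulas above, assuming square coherency matrices of equal size.

```python
import numpy as np

def hermitian_angle(a, b):
    """Hermitian angle between complex vectors, Eq. (10); value in [0, pi/2]."""
    num = abs(np.vdot(a, b))   # |<a, b>_C|; np.vdot conjugates its first argument
    den = np.linalg.norm(a) * np.linalg.norm(b)
    return np.arccos(np.clip(num / den, 0.0, 1.0))

def segment_distance(p, q):
    """Eq. (11): mean Hermitian angle between corresponding segments of p and q."""
    sp = np.stack([np.ones(len(p) - 1), np.diff(p)])   # segments (1, p_{i+1} - p_i)
    sq = np.stack([np.ones(len(q) - 1), np.diff(q)])
    return float(np.mean([hermitian_angle(sp[:, i], sq[:, i])
                          for i in range(sp.shape[1])]))

def coherency_distance(Cx, Cy, K=3):
    """Eq. (12), using the SVD of Eq. (7)."""
    C = Cx.conj().T @ Cy                    # ASSUMPTION: how C_xy is formed is not
    U, sv, Vh = np.linalg.svd(C)            # shown in this excerpt
    num = den = 0.0
    for k in range(K):
        u_k, v_k = U[:, k], Vh[k].conj()    # k-th singular vectors
        lkx = u_k.conj() @ Cx               # leading patterns, Eq. (8)
        lky = v_k.conj() @ Cy
        w = sv[k] ** 2
        num += w * (segment_distance(lkx, lky) + segment_distance(u_k, v_k))
        den += w
    return num / den
```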
The above distance is computed for each pair of countries and, with this
information, we can then fill a matrix of distances.
We analyze the oil price-macroeconomy relation in the Euro area by looking both at
the coherency and at the phasing of cycles. We look at the first 11 countries joining
the Euro: Austria, Belgium, Finland, France, Germany, Ireland, Italy, Luxembourg,
Netherlands, Portugal and Spain. To measure real economic activity, most studies
use either real GDP or an Industrial Production Index. We use the Industrial
Production Index because wavelet analysis is quite data demanding, and having
monthly data is a plus. We use seasonally adjusted data from the OECD Main
Economic Indicators database, from January 1986 to December 2011. We have,
therefore, 26 years of data: exactly 13 years before and 13 years after the Euro
adoption. The oil price data is the West Texas Intermediate Spot Oil Price taken
from the Federal Reserve Economic Data (FRED), St. Louis Fed.
In Fig. 1, we can see the behavior of the Industrial Production Index for three
distinct countries: Finland, Germany, and Portugal. These three countries, as we will
see next, have distinct behaviors. While Germany is part of the Euro core, Finland
and Portugal are not. In particular, while in the first half of the sample Finland is
not synchronized with the rest of Europe, in the second half the convergence is
obvious. This convergence is not observed in the case of Portugal. If one computes
the correlation between the series, these results can be reasonably predicted. For
example, before 1999, the correlation between Finland's and Germany's IP is 0.05,
which increases to 0.83 after 1999. Between Portugal and Germany, the correlation
remains relatively constant (0.5 in both samples).
Fig. 1 Industrial production of Finland, Germany, and Portugal, 1986–2011

Note, however, that we do not plan to compare the industrial production indexes
by themselves. We want to compare their reactions to the same oil price shocks. For
each country, we estimate the wavelet coherency between the yearly rate of growth
of Industrial Production and the oil price.⁶ It is known that oil price increases are
more important than oil price decreases. Because of that, Hamilton (1996 and 2003)
proposed a nonlinear transformation of the oil price series. In our computations, we
use Hamilton's Net Oil Price. Because we focus our analysis on business cycle
frequencies, we estimate the coherence for frequencies corresponding to periods
between 1.5 and 8 years.

⁶ To replicate our results, the reader can use a Matlab wavelet toolbox that we wrote. It is
freely available at http://sites.google.com/site/aguiarconraria/joanasoares-wavelets. Our data is
also available on that website.
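Hamilton's net oil price transformation just mentioned can be sketched as follows. Hamilton (1996) compares the current (log) price with its maximum over the previous 12 months, while Hamilton (2003) uses 3 years; the chapter does not state which window it adopts, so the default below is an assumption.

```python
import numpy as np

def net_oil_price_increase(log_price, window=12):
    """Net oil price increase: the amount (if positive) by which the log oil
    price exceeds its maximum over the previous `window` observations, else 0."""
    p = np.asarray(log_price, dtype=float)
    nopi = np.zeros_like(p)
    for t in range(window, len(p)):
        nopi[t] = max(0.0, p[t] - p[t - window:t].max())
    return nopi
```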
In Fig. 2, we have our first set of results. For each country, on the left (a) we have
the wavelet coherency between Industrial Production and Oil Prices.⁷ On the right,
we have the phase-differences: on the top (b), the phase-difference in the 2–4 years
frequency band (chosen to capture the region of high coherency that appears
in most of the countries after 2000); on the bottom (c), the phase-difference
in the 4–8 years frequency band, which captures the regions of high coherency in
the late 1980s and in the first half of the 1990s (recall that it only makes sense to
interpret the phase-differences in the regions of high coherency).
For most countries, the region with the strongest coherency is located between the
mid-1980s and mid-1990s in the 4–8 years frequency band. And for most countries,
the phase-difference is consistently between $\pi/2$ and $\pi$, suggesting that oil price
increases anticipate downturns in Industrial Production. After the Euro adoption,
in 1999, for most of the countries the strongest region of high coherency is in
the 2–4 years frequency band, after 2005. Again, the phase-differences are located
between $\pi/2$ and $\pi$, consistent with the idea that negative oil shocks anticipate
downturns in Industrial Production. The most interesting aspect is this change
in the predominant frequencies.

⁷ The grey contour designates the 5 % significance level, obtained by 1,000 Monte Carlo
simulations based on two independent ARMA(1,1) processes as the null. Coherency ranges from
white/light grey (low coherency) to black/dark grey (high coherency). The cone of influence, which
is the region subject to border distortions, is shown with a thick line.

Fig. 2 On the left—wavelet coherency between each country's Industrial Production and Oil
Prices. The grey scale ranges from white/light grey (low coherency) to black/dark grey (high
coherency). The grey contour designates the 5 % significance level, based on Monte Carlo
simulations. On the right—phase-difference between Industrial Production and Oil Prices at 2–4
years (top) and 4–8 years (bottom) frequency bands
These results are consistent with the results of other authors, who conclude
that, in the more recent times, the negative impact of oil shocks is shorter-lived
than before. This may happen because the oil exporting countries follow different
pricing strategies—see, for example, Aguiar-Conraria and Wen (2012)—, because
the nature of oil shocks was different—see, for example, Hamilton (2009) or Kilian
(2008 and 2009)—or because the western macroeconomies became more flexible—
see, for example, Blanchard and Galí (2010) who argue that less rigid wages as
well as a smaller share of oil in the production are candidate explanations for the
shorter-lived impact of oil shocks.

Fig. 3 Lower triangle: complex wavelet dissimilarities before the Euro. Upper triangle: complex
wavelet dissimilarities after the Euro. Grey scale code: p < 0.01, p < 0.05, p < 0.10

             Au    Be    Fi    Fr    Ge    Ir    It    Lx    Ne    Pt    Sp
Austria            0.091 0.055 0.042 0.050 0.097 0.048 0.066 0.056 0.095 0.049
Belgium      0.056       0.084 0.085 0.082 0.126 0.077 0.087 0.074 0.070 0.093
Finland      0.077 0.067       0.061 0.078 0.097 0.053 0.073 0.059 0.093 0.065
France       0.049 0.076 0.075       0.054 0.100 0.049 0.069 0.049 0.086 0.047
Germany      0.041 0.067 0.074 0.053       0.089 0.043 0.061 0.062 0.083 0.054
Ireland      0.056 0.063 0.072 0.054 0.064       0.075 0.106 0.098 0.104 0.078
Italy        0.060 0.066 0.078 0.048 0.058 0.060       0.059 0.051 0.084 0.040
Luxembourg   0.056 0.066 0.079 0.065 0.057 0.059 0.050       0.072 0.083 0.067
Netherlands  0.059 0.075 0.067 0.073 0.059 0.060 0.067 0.060       0.080 0.053
Portugal     0.075 0.077 0.074 0.068 0.071 0.070 0.056 0.069 0.062       0.092
Spain        0.063 0.076 0.078 0.052 0.065 0.045 0.060 0.066 0.055 0.083
To assess if the oil price-macroeconomy relation is similar between two countries,
we compute the distance between the complex wavelet coherency matrices
associated with both countries, using formula (12). This measure takes into account
both the real and the imaginary parts of the complex coherency. A value very
close to zero means that (1) the contribution of cycles at each frequency to the
total correlation between oil prices and the industrial production is similar in
both countries, (2) this contribution happens at the same time in both countries
and, finally, (3) the leads and lags between the oil price cycles and the industrial
production cycles are similar in both countries. Note that the Anna Karenina
principle applies. If the distance is zero, or close to zero, the two series are similar
in every regard. If the distance is not zero, the origin of the distance may be any
of the three referred aspects. To distinguish between them, one may look at the
pictures with the coherency and phase-differences between the two series. To test if
the similarity is statistically significant, we again rely on Monte Carlo simulations.
For each pair of countries we estimate two distances: one before the Euro
adoption and the other after the adoption. It is as if we divided each of the pictures in
Fig. 2 into two halves: left and right. To measure the distances between two countries
before the Euro, we compare the left halves. And we compare the right halves
to measure the distance after 1999. Given that, by definition, a distance matrix is
symmetric, to save space we use the lower triangle for the distances before the Euro
adoption and the upper triangle for the distances after the Euro adoption.
These results are described in Fig. 3.
It is interesting to note that the endogeneity of the OCAs does not survive our
analysis, at least when one considers the case of Portugal, Belgium and, even more
strongly, Ireland. Before the Euro adoption, Portugal was synchronized with Italy
(1 % significance), France, Netherlands, Luxembourg (5 % significance), Austria,
Finland, Germany and Ireland (10 % significance). In the second half of the sample,
Portugal is only synchronized with Belgium. Similar results hold for Belgium.
The case of Ireland is even stronger. Before the birth of the Euro, Ireland was
synchronized with every country except Finland, with 1 % significance in the
majority of cases. After the Euro adoption, Ireland is synchronized only with
Italy and Spain, at the 10 % significance level. The only country that clearly became
more synchronized after the Euro adoption was Finland.
The same information is displayed in Fig. 4, where we use the distances of
Fig. 3 to plot a map of the countries in a two-axis system—see Camacho et al.
(2006).⁸ This cannot be performed with perfect accuracy because the distances are not
Euclidean. In these maps it is clear that while most of the countries became slightly
tighter, particularly Finland, which moved to the core after 1999, this was not the
case for Belgium, Portugal and Ireland, which now look like three isolated islands
with no strong connections to the mainland.

⁸ Basically, we reduce each of the distance matrices to a two-column matrix, called the configuration
matrix, which contains the position of each country on two orthogonal axes.
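A configuration matrix of this kind can be obtained, for instance, with metric multidimensional scaling on a precomputed distance matrix. Whether the authors use exactly this projection is not stated in the excerpt, so the sketch below, with placeholder distances instead of the values of Fig. 3, is only illustrative.

```python
import numpy as np
from sklearn.manifold import MDS

countries = ["Au", "Be", "Fi", "Fr", "Ge", "Ir", "It", "Lx", "Ne", "Pt", "Sp"]

# Placeholder symmetric distance matrix; in practice, fill it with either the
# pre-Euro (lower) or post-Euro (upper) triangle of Fig. 3.
rng = np.random.default_rng(0)
d = rng.random((11, 11))
dist = (d + d.T) / 2
np.fill_diagonal(dist, 0.0)

# Embed the countries into two orthogonal axes (the "configuration matrix").
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
config = mds.fit_transform(dist)
for c, (x, y) in zip(countries, config):
    print(f"{c}: ({x:+.3f}, {y:+.3f})")
```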
4 Conclusions

In this paper, we used wavelet analysis to study the impact of the Euro adoption on the member
countries’ macroeconomic reaction to one of the most common shocks: oil shocks.
Given that energy is such an important production input, and that due to several
reasons (including ecological, political and economic reasons) it is such a volatile
sector, the transmission mechanism of oil shocks to the macroeconomy is bound
to have important effects. If a group of countries have asymmetric responses to the
same oil shock, it is highly unlikely that those countries form an OCA.
We estimated the wavelet coherency between the Industrial production of the
11 countries that first joined the Euro and the oil price. We uncovered evidence
that shows that the oil-macroeconomy relation changed in the past decades. In the
second half of 1980s and in the first half of 1990s, oil price increases preceded
macroeconomic downturns. This effect occurred at frequencies with periods around
6 years. However, in the last decade, the regions of high coherencies were located
at frequencies that corresponded to shorter-run cycles (cycles with periods around 3
years).
We also showed that after the Euro adoption some countries became more similar
with respect to how their macroeconomies react to oil shocks. This is true for
Austria, France, Germany, Italy, Luxembourg, Netherlands, and Spain and even
more true for Finland, who had a rather asymmetric reaction to oil shocks before
the Euro adoption. However, we also showed that at least three countries do not
share a common response to oil shocks: Portugal, Ireland and Belgium. Particularly
interesting is the conclusion that the adoption of the common currency did not
contribute to a higher degree of synchronization between these countries and the
rest of the countries in the Euroland. This effect is particularly surprising in the case
of Ireland, which was highly synchronized before 1999.
Acknowledgements We offer this paper as a token of our intellectual respect for James Ramsey,
who, in a series of papers, some of them co-authored with Camille Lampart, got us interested in
wavelet applications to Economics. We thank an anonymous referee for his comments. The usual
disclaimer applies. Financial support from Fundação para a Ciência e a Tecnologia, research grants
PTDC/EGE-ECO/100825/2008 and PEst-C/EGE/UI3182/2013, through Programa Operacional
Temático Factores de Competitividade (COMPETE) is gratefully acknowledged.
References
Aguiar-Conraria L, Soares MJ (2011a) Oil and the macroeconomy: using wavelets to analyze old
issues. Empir Econ 40(3):645–655
Aguiar-Conraria L, Soares MJ (2011b) Business cycle synchronization and the Euro: a wavelet
analysis. J Macroecon 33(3):477–489
Aguiar-Conraria L, Soares MJ (2014) The continuous wavelet transform: moving beyond uni- and
bivariate analysis. J Econ Surv. 28(2):344–375
Aguiar-Conraria L, Wen Y (2012) OPEC’s oil exporting strategy and macroeconomic (in)stability.
Energy Econ 34(1):132–136
Aguiar-Conraria L, Magalhães PC, Soares MJ (2012) Cycles in politics: wavelet analysis of
political time-series. Am J Polit Sci 56(2):500–518
Wavelet-Based Correlation Analysis of the Key Traded Assets
Robe (2013). Fratzscher et al. (2013) show that oil was not correlated with stocks
until 2001, but as oil began to be used as a financial asset, the link between oil
and other assets strengthened. Finally, stocks reflect the economic and financial
development of firms and market perceptions of a company's standing; they also
represent investment opportunities and a link to perceptions of aggregate economic
development. Further, stock prices provide helpful information on financial stability
and can serve as an indicator of crises (Gadanecz and Jayaram 2009). Thus, a broad
market index can be used to convey information on the status and stability of the
economy. In our analysis, we consider the S&P 500, which is frequently used as a
benchmark for the overall U.S. stock market. In our analysis, stocks complement the
commodities of gold and oil to represent the financial assets traded by the modern
financial industry.
What motivates our analysis of the links among the three assets above? The
literature analyzing the dynamic correlations among assets proposes a number of
important reasons that the issue should be investigated. An obvious motivation
for analyzing co-movements is that substantial correlations among assets greatly
reduce their potential to be included in a portfolio from the perspective of risk
diversification. Even if assets in a portfolio initially exhibit low correlation, a
potential change in correlation patterns represents an imperative to redesign such
a portfolio. Both issues are also linked to the Modern Portfolio Theory (MPT) of
Markowitz (1952). MPT assumes, among other things, that correlations between
assets are constant over time. However, correlations between assets may well depend
on the systemic relationships between them and change when these relationships
change. Thus, evidence of time-varying correlations between assets substantially
undermines MPT results and, more important, its use to protect investors from risk.
Empirical evidence on co-movements among assets may well depend on the
choice of assets, technique employed, and the period under study. In a seminal
study on co-movements in the monthly prices of unrelated commodities, Pindyck
and Rotemberg (1990) find excess co-movement among seven major commodities,
including gold and oil. However, the co-movements are measured in a rather
simple manner as individual cross-correlations over the entire period (1960–1985).
The excess co-movements were attributed to irrational or herding behavior in
markets. Using a concordance measure, Cashin et al. (1999) analyze the same
set of commodities over the same period as Pindyck and Rotemberg (1990) and
find no evidence for co-movements in the prices of the analyzed commodities.
When they extend the period to 1957–1999, the co-movements are again absent
and they contend that the entire notion of co-movements in the prices of unrelated
commodities is a myth. A single exception is the co-movement in gold and oil prices
that Cashin et al. (1999) credit to inflation expectations and further provide evidence
that booms in oil and gold prices often occur at the same time (Cashin et al. 2002).
Still, it has to be noted that gold may well be traded independently from other assets
on the pretext of being a store of value during downward market swings. Hence, it
does not necessarily co-move with related or unrelated commodities.
An extension of the co-movement analysis to the time-frequency domain offers
the potential for an interesting comparison of how investment horizons influence
the diversification of market risk. The importance of various investment horizons
for portfolio selection has been recognized by Samuelson (1989). In this respect,
Marshall (1994) demonstrates that investor preferences for risk are inversely related
to time and different investment horizons have direct implications for portfolio
selection. Graham et al. (2013) provide empirical evidence related to the issues
studied in this chapter by studying the co-movements of various assets using wavelet
coherence and demonstrating that at the long-term investment horizon co-movement
among stocks and commodities increased at the onset of the 2007–2008 financial
crisis. Thus, the diversification benefits of using these assets are rather limited.
With the above motivations and findings in mind, in this chapter we adopt a
comprehensive approach and contribute to the literature by analyzing the prices
of three assets that have unique economic and financial characteristics: the key
commodities gold and oil and important stocks represented by the S&P 500 index.
To this end, we consider a long period (1987–2012) at both intra-day and daily
frequencies and an array of investment horizons to deliver a comprehensive study
in the time-frequency domain based on wavelet analysis. Our key empirical results
can be summarized as follows: (1) correlations among the three assets are low or
even negative at the beginning of our sample but subsequently increase, and the
change in the patterns becomes most pronounced after decisive structural breaks
take place (breaks occur during the 2006–2009 period at different dates for specific
asset pairs); (2) correlations before the 2007–2008 crisis exhibit different patterns
at different investment horizons; (3) during and after the crisis, the correlations
exhibit large swings and their differences at shorter and longer investment horizons
become negligible. This finding indicates vanishing potential for risk diversification
based on these assets: after the structural change, gold, oil, and stocks could not be
combined to yield effective risk diversification during the post-break period studied.
The chapter is organized as follows. In Sect. 2, we introduce the theoretical
framework for the wavelet methodology we use to perform our analysis. Our large
data set is described in detail in Sect. 3 with a number of relevant commentaries. We
present our empirical results in Sect. 4. Section 5 briefly concludes.
$$H_t = D_t R_t D_t, \qquad (1)$$

where $R_t$ is the conditional correlation matrix and $D_t = \operatorname{diag}\{\sqrt{h_{i,t}}\}$ is a diagonal
matrix of time-varying standard deviations from the $i$-th univariate (G)ARCH$(p,q)$
processes $h_{i,t}$. Parameter $n$ represents the number of assets at time $t = 1, \dots, T$.
The correlation matrix is then given by the transformation

$$R_t = \operatorname{diag}\!\left(q_{11,t}^{-1/2}, \dots, q_{nn,t}^{-1/2}\right) Q_t \operatorname{diag}\!\left(q_{11,t}^{-1/2}, \dots, q_{nn,t}^{-1/2}\right), \qquad (2)$$

where $Q_t = (q_{ij,t})$ follows the standard DCC evolution

$$Q_t = (1 - \alpha - \beta)\,\bar{Q} + \alpha\, \epsilon_{t-1} \epsilon_{t-1}' + \beta\, Q_{t-1}, \qquad (3)$$

with $\epsilon_t$ the standardized residuals and $\bar{Q}$ their unconditional covariance matrix.

¹ Bauwens and Laurent (2005) demonstrate that the one-step and two-step methods provide very
similar estimates.
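A compact sketch of the recursion in Eqs. (1)-(3). The parameters α and β are fixed here purely for illustration; in practice they are estimated by (quasi-)maximum likelihood, and the standardized residuals come from the univariate (G)ARCH fits:

```python
import numpy as np

def dcc_correlations(eps, alpha=0.05, beta=0.93):
    """Conditional correlation matrices R_t from standardized residuals eps (T x n),
    following the DCC recursion of Eq. (3) and the rescaling of Eq. (2)."""
    T, n = eps.shape
    Qbar = eps.T @ eps / T                      # unconditional covariance of residuals
    Q = Qbar.copy()
    R = np.empty((T, n, n))
    for t in range(T):
        d = 1.0 / np.sqrt(np.diag(Q))
        R[t] = Q * np.outer(d, d)               # Eq. (2): rescale Q_t to correlations
        e = eps[t:t + 1].T                      # n x 1 residual vector
        Q = (1 - alpha - beta) * Qbar + alpha * (e @ e.T) + beta * Q   # Eq. (3)
    return R

rng = np.random.default_rng(0)
R = dcc_correlations(rng.normal(size=(500, 3)))
print(R[-1])   # latest conditional correlation matrix
```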
$$\widehat{RC}_{t,h} = \sum_{i=1}^{M} r_{t-h+\left(\frac{i}{M}\right)h}\; r'_{t-h+\left(\frac{i}{M}\right)h}, \qquad (4)$$

where $r$ denotes the vector of intraday returns sampled $M$ times per day.
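Eq. (4) amounts to summing the outer products of a day's M intraday return vectors; a minimal sketch (the 5-min sampling and the asset dimension are assumptions for the example):

```python
import numpy as np

def realized_covariance(returns):
    """Eq. (4): realized covariance for one day from its M intraday return
    vectors; `returns` has shape (M, n) for n assets."""
    r = np.asarray(returns, dtype=float)
    return r.T @ r   # identical to summing the outer products r_i r_i'

day_returns = np.random.default_rng(0).normal(scale=1e-3, size=(78, 3))  # 78 5-min bars
print(realized_covariance(day_returns))
```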
As we are interested in how the correlations vary over time and at different
investment horizons, we need to conduct a wavelet analysis that allows us to work
simultaneously in the time and frequency domains. The DCC GARCH and realized
volatility methods outlined above do not allow the researcher to extend the analysis
to the frequency domain; hence we are only able to study the covariance matrix in
the time domain.

² This is the optimal sampling frequency determined based on the substantial research on the noise-to-signal
ratio. The literature is well surveyed by Hansen and Lunde (2006), Bandi and Russell
(2006), McAleer and Medeiros (2008), and Andersen and Benzoni (2007).
Wavelet time-frequency domain analysis is a very powerful tool when we expect
changes in economic relationships such as structural breaks. Wavelet analysis can
react to these changes because the wavelet transform uses a localized function with
finite support for the decomposition—a wavelet. In contrast, when using a pure
frequency approach, represented by the Fourier transform, one obtains information
on all of the frequency components, but because the amplitude is fixed throughout
the period considered, the time information is completely lost. Thus, in the event
of sudden changes in economic relationships or the presence of breaks during
the period studied, one is unable to locate precisely where this change occurs.
Additionally, due to the non-stationarity induced by such breaks, Fourier transform-
based estimates may not be precise. Therefore, the wavelet transform has substantial
advantages over the Fourier transform when the time series is non-stationary or is
only locally stationary (Roueff and Sachs 2011).
An important feature of wavelet analysis is the decomposition of the economic
relationship into time-frequency components. Wavelet analysis often uses scale
instead of frequency, as scale typically characterizes frequency bands. The set
of wavelet scales can be further interpreted as investment horizons at which we
can study the economic relationships separately. Thus, every scale describes the
development of the economic relationship at a particular frequency while retaining
the time dynamics. Subsequently, the wavelet decomposition generally provides a
more complex picture compared to the time domain approach, which aggregates
all investment horizons. Therefore, if we expect that economic relationships follow
different patterns at various investment horizons, then a wavelet analysis can
uncover interesting characteristics of the data that would otherwise remain hidden.
An introduction to the wavelet methodology with a remarkable application to
economics and finance is provided in Gençay et al. (2002) and Ramsey (2002).
While we use a discrete version of the wavelet transform, we begin our introduction
with the continuous wavelet transform (CWT), as it is the cornerstone of the wavelet
methodology. Next, we continue by describing a special form of discrete wavelet
transform named the “maximal overlap discrete wavelet transform” (MODWT).
Following standard notation, we define the continuous wavelet transform $W_x(j,s)$
as a projection of a wavelet function³ $\psi_{j,s}(t) = \frac{1}{\sqrt{j}}\, \psi\!\left(\frac{t-s}{j}\right) \in L^2(\mathbb{R})$ onto the time
series $x(t) \in L^2(\mathbb{R})$,

$$W_x(j,s) = \int_{-\infty}^{\infty} x(t)\, \frac{1}{\sqrt{j}}\, \psi\!\left(\frac{t-s}{j}\right) dt, \qquad (5)$$

³ We use the least asymmetric wavelet with length L = 8, denoted as LA(8).
where s determines the position of the wavelet in time. The scaling, or dilatation
parameter j controls how the wavelet is stretched or dilated. If the scaling parameter
j is low (high), then the wavelet is more (less) compressed and able to detect high
(low) frequencies. One of the most important conditions a wavelet must fulfill is
the admissibility condition: $C_\psi = \int_0^{\infty} \frac{|\Psi(f)|^2}{f}\, df < \infty$, where $\Psi(f)$ is the Fourier
transform of the wavelet $\psi(\cdot)$. The decomposed time series $x(t)$ can subsequently be
recovered using the wavelet coefficients as follows
$$x(t) = \frac{1}{C_\psi} \int_0^{\infty} \left[ \int_{-\infty}^{\infty} W_x(j,s)\, \psi_{j,s}(t)\, ds \right] \frac{dj}{j^2}, \qquad j > 0. \qquad (6)$$
Further, the continuous wavelet transform preserves the energy, or variance, of the
analyzed time series; hence

$$\|x\|^2 = \frac{1}{C_\psi} \int_0^{\infty} \left[ \int_{-\infty}^{\infty} |W_x(j,s)|^2\, ds \right] \frac{dj}{j^2}. \qquad (7)$$
Equation (7) is an important property that allows us to work with the wavelet
variance, covariance and the wavelet correlation. For a more detailed introduction
to continuous wavelet transform and wavelets, see Daubechies (1992), Chui (1992),
and Percival and Walden (2000).
As we study discrete time series, we only require a limited number of scales, and
some form of discretization is needed. The counterpart of the continuous wavelet
transform in discrete time is the discrete wavelet transform (DWT),⁴ which is a
parsimonious form of the continuous transform, but it has some limiting properties
that make its application to real time series relatively difficult. These limitations
primarily concern the restriction of the sample size to the power of two and the
sensitivity to the starting point of the transform. Therefore, in our analysis, we use
a modified version of the discrete wavelet transform—MODWT—which has some
advantageous properties that are summarized below.
In contrast to the DWT, the MODWT does not use downsampling; as a
consequence, the vectors of wavelet coefficients at all scales have equal length,
corresponding to the length of the transformed time series. Thus, the MODWT is not
restricted to sample sizes that are powers of two. However, the MODWT wavelet
coefficients are no longer orthogonal to each other at any scale. Additionally, the
MODWT is a translation-invariant type of transform; therefore, it is not sensitive
to the choice of the starting point of the examined process. Both the DWT and
MODWT wavelet and scaling coefficients can be used for energy decomposition
and analysis of variance of a time series in the time-frequency domain; however,
Percival (1995) demonstrates the dominance of the MODWT estimator of variance
over the DWT estimator. Furthermore, Serroukh et al. (2000) analyze the statistical
properties of the MODWT variance estimator for non-stationary and non-Gaussian
processes. For additional details on the MODWT, see Mallat (1998) and Percival
and Walden (2000).

⁴ For a definition and detailed discussion of the discrete wavelet transform, see Mallat (1998),
Percival and Walden (2000), and Gençay et al. (2002).
$$W_x(1,s) = \sum_{l=0}^{L-1} h_{1,l}\, x(s - l \bmod N), \qquad V_x(1,s) = \sum_{l=0}^{L-1} g_{1,l}\, x(s - l \bmod N). \qquad (8)$$
The second step of the algorithm uses the scaling coefficients $V_x(1,s)$ instead of $x_t$.
The wavelet and scaling filters have width $L_j = 2^{j-1}(L-1) + 1$; therefore, for
the second scale, the length of the filter is $L_2 = 15$. After filtering, we obtain the
wavelet coefficients at scale $j = 2$:

$$W_x(2,s) = \sum_{l=0}^{L-1} h_{2,l}\, V_x(1, s - l \bmod N), \qquad V_x(2,s) = \sum_{l=0}^{L-1} g_{2,l}\, V_x(1, s - l \bmod N). \qquad (9)$$
After the two steps of the algorithm we have two vectors of MODWT wavelet
coefficients, at scales $j = 1$ and $j = 2$, $W_x(1,s)$ and $W_x(2,s)$, and one vector
of MODWT scaling coefficients at scale two, $V_x(2,s)$, where $s = 0, 1, \dots, N-1$
is the same for all vectors. The vector $W_x(1,s)$ represents wavelet coefficients that
reflect variations at the frequency band $f \in [1/4, 1/2]$, $W_x(2,s)$: $f \in [1/8, 1/4]$,
and $V_x(2,s)$: $f \in [0, 1/8]$.
The transfer function of the filter $h_l$, $l = 0, 1, \dots, L-1$, where $L$ is the width
of the filter, is denoted $H(\cdot)$. The pyramid algorithm exploits the fact that if we
increase the width of the filter to $2^{j-1}(L-1) + 1$ by using the impulse response
sequence⁵

$$\{h_0, 0, \dots, 0, h_1, 0, \dots, 0, h_2, \dots, 0, \dots, 0, h_{L-1}\}, \qquad (10)$$

then the resulting filter has a transfer function defined as $H(2^{j-1} f)$. Using this
feature of the filters, we can write the pyramid algorithm simply in the following
form:

$$W_x(j,s) = \sum_{l=0}^{L-1} h_l\, V_x\!\left(j-1,\, s - 2^{j-1} l \bmod N\right), \qquad s = 0, 1, \dots, N-1, \qquad (11)$$

$$V_x(j,s) = \sum_{l=0}^{L-1} g_l\, V_x\!\left(j-1,\, s - 2^{j-1} l \bmod N\right), \qquad s = 0, 1, \dots, N-1, \qquad (12)$$
where for the first stage we set $V_x(0,s) = x_s$. Thus, after performing the MODWT,
we obtain $J \leq \log_2(N)$ vectors of wavelet coefficients and one vector of
scaling coefficients. The $j$-th level wavelet coefficients in vector $W_x(j,s)$ represent
the frequency band $f \in [1/2^{j+1}, 1/2^j]$, while the $j$-th level scaling coefficients in
vector $V_x(j,s)$ represent $f \in [0, 1/2^{j+1}]$. In the subsequent analysis of the wavelet
correlations we apply the MODWT with the wavelet filter LA(8) and reflecting
boundary conditions.
⁵ The number of zeros between filter coefficients is $2^{j-1} - 1$; i.e., for the filter at the first stage we
have no zeros, and for the second stage there is just one zero between each pair of coefficients,
hence the width of the filter is 15.
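The pyramid algorithm of Eqs. (11)-(12) can be coded directly. For brevity, this sketch uses periodic rather than the chapter's reflecting boundary conditions, and Haar MODWT filters instead of LA(8); both substitutions are our own.

```python
import numpy as np

def modwt(x, h, g, J):
    """MODWT pyramid algorithm, Eqs. (11)-(12): h and g are the MODWT wavelet
    and scaling filters (the DWT filters divided by sqrt(2))."""
    x = np.asarray(x, dtype=float)
    N, L = len(x), len(h)
    V = x.copy()
    W = []
    for j in range(1, J + 1):
        Wj, Vj = np.zeros(N), np.zeros(N)
        for s in range(N):
            for l in range(L):
                idx = (s - 2 ** (j - 1) * l) % N   # the 2^(j-1) stride plays the
                Wj[s] += h[l] * V[idx]             # role of the inserted zeros in (10)
                Vj[s] += g[l] * V[idx]
        W.append(Wj)
        V = Vj
    return W, V   # W_x(j, .) for j = 1..J, and the scaling coefficients V_x(J, .)

h = np.array([0.5, -0.5])   # Haar MODWT wavelet filter
g = np.array([0.5, 0.5])    # Haar MODWT scaling filter
Wx, Vx = modwt(np.random.randn(256), h, g, J=4)
```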
The wavelet correlation $\rho_{xy}(j)$ between time series $x_t$ and $y_t$ at scale $j$ is then defined
as (Whitcher et al. 2000):

$$\rho_{xy}(j) = \frac{\operatorname{cov}\!\left(W_x(j,s), W_y(j,s)\right)}{\left[\operatorname{var}\!\left(W_x(j,s)\right)\, \operatorname{var}\!\left(W_y(j,s)\right)\right]^{1/2}} = \frac{\gamma_{xy}(j)}{\nu_x(j)\, \nu_y(j)}, \qquad (13)$$

where $\nu_x^2(j)$ and $\gamma_{xy}(j)$ denote the wavelet variance and covariance, respectively.
Additional details on the wavelet variance and covariance are provided in Appendices
"Wavelet Variance" and "Wavelet Covariance". The wavelet correlation
estimator directly uses the definition of the wavelet correlation in Eq. (13); thus we
can write:

$$\hat{\rho}_{xy}(j) = \frac{\hat{\gamma}_{xy}(j)}{\hat{\nu}_x(j)\, \hat{\nu}_y(j)}, \qquad (14)$$
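Given MODWT coefficients such as those produced by the pyramid sketch above, the estimator in Eq. (14) reduces to a scale-by-scale correlation of the two series' wavelet coefficients. An unbiased version would additionally discard boundary-affected coefficients, which this sketch, for simplicity, does not do:

```python
import numpy as np

def wavelet_correlation(Wx, Wy):
    """Eq. (14): wavelet correlation at each scale j, as the sample correlation
    of the two series' MODWT wavelet coefficients at that scale."""
    return [float(np.corrcoef(wx, wy)[0, 1]) for wx, wy in zip(Wx, Wy)]
```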
3 Data
In the empirical section, we analyze the prices of gold, oil, and a representative U.S.
stock market index, the S&P 500. The data set contains the tick prices of the S&P
500 and the futures prices of gold and oil, where we use the most active rolling
contracts from the pit (floor traded) session. All of the assets are traded on the
platforms of the Chicago Mercantile Exchange (CME).⁶
We restrict our study to the intraday 5-min and daily data sampled during the
business hours of the New York Stock Exchange (NYSE), as most of the liquidity
of the S&P 500 comes from the period when the U.S. markets are open. The
sample period runs from January 2, 1987 until December 31, 2012.⁷ To synchronize
the data, we employ Greenwich Mean Time (GMT) stamp matching. Further, we
exclude transactions executed on Saturdays and Sundays, U.S. federal holidays,
December 24 to 26, and December 31 to January 2, as the low activity on these
days could lead to estimation bias. Therefore, we use data from 6,472 trading
6
Oil (Light Crude) is traded on the New York Mercantile Exchange (NYMEX) platform, gold is
traded on the Commodity Exchange, Inc. (COMEX), a division of NYMEX, and the S&P 500 is
traded at the CME in Chicago. All data were acquired from Tick Data, Inc.
7
The CME introduced the Globex(R) electronic trading platform in December 2006 and began to
offer nearly continuous trading.
168 J. Baruník et al.
Table 1 Descriptive statistics for high-frequency and daily gold, oil and, stock (S&P 500) returns
over the sample period extending from January 2, 1987 until December 31, 2012
High-frequency data Daily data
Gold Oil Stocks Gold Oil Stocks
Mean 1.00e06 3.19e06 2.46e06 2.22e04 2.42e04 2.70e04
St. dev. 0.001 0.002 0.001 0.010 0.023 0.012
Skewness 0:714 1.065 0.326 0:147 1:063 0:392
Kurtosis 47.627 104.561 32.515 10.689 19.050 11.474
Minimum 0:042 0:045 0:024 0:077 0:384 0:098
Maximum 0.023 0.163 0.037 0.103 0.136 0.107
Fig. 1 Normalized prices of gold (thin black), oil (black), and stocks (gray). The figure highlights
several important recession periods in gray (described in greater detail in the text), and crashes
using black lines: (a) Black Monday; (b) the Asian crisis; (c) the Russian ruble devaluation; (d) the
dot-com bubble burst; (e) the WTC 9/11 attacks; (f) the Lehman Brothers Holdings bankruptcy;
and (g) the Flash Crash
days. Descriptive statistics of the intra-day and daily returns of the data that form
our sample are presented in Table 1. Overall, the statistics are standard with the
remarkable exception of a very high excess kurtosis of 104.561 for oil. This is
mainly a consequence of a single positive price change of 16.3 % (January 19,
1991), when the worst deliberate environmental damage in history was caused by
Iraqi leader Saddam Hussein, who ordered a large amount of oil to be spilled into
the Persian Gulf (Khordagui and Al-Ajmi 1993).
Figure 1 depicts the development of the prices of the three assets, in which several
recessions and crisis periods can be detected. Following the National Bureau of
Economic Research (NBER),8 there were three recessions in the U.S. during the
period studied: July 1990 to March 1991, March 2001 to November 2001, and
December 2007 to June 2009. These recessions are highlighted by gray bands.
Furthermore, black lines depict 1-day crashes associated with large price drops.
Specifically, Black Monday (October 19, 1987), the Asian crisis (October 27, 1997),
8 US Business Cycle Expansions and Contractions, NBER, accessed April 5, 2013 (http://www.nber.org/cycles.html).
the Russian ruble devaluation (August 17, 1998), the dot-com bubble burst (March
10, 2000), the World Trade Center attacks (September 11, 2001), the Lehman
Brothers Holdings bankruptcy (September 15, 2008), and the Flash Crash (May 6, 2010). The largest 1-day drops in the studied sample occurred on the following
dates, with percentage declines given in parentheses: October 19, 1987 (20.47 %),
October 26, 1987 (8.28 %), September 29, 2008 (8.79 %), October 9, 2008 (7.62 %),
October 15, 2008 (9.03 %), and December 1, 2008 (8.93 %).
The above crashes differ in nature, and we discuss them briefly below. On
Monday, October 19, 1987, known as Black Monday, stock markets around the
world dropped in a very short time and recorded the largest 1-day crash in
history. After this extreme event, many expected the most troubled years since the
1930s. Nevertheless, stock markets quickly recovered from the losses and closed
1987 in positive territory. There is still no consensus on the cause of the crash;
potential reasons include illiquidity, program trading, overvaluation and market
psychology.9
For many consecutive years, stock markets did not record large shocks until 1997, when the Asian financial crisis emerged. Investors were leaving overheated emerging Asian shares, which on October 27, 1997 resulted in a mini-crash of the U.S. markets. On August 17, 1998 the Russian government devalued the ruble, defaulted on domestic debt and declared a moratorium on payments to foreign creditors, which also caused an international crash. The 1997 and 1998 crashes are believed to be exogenous shocks to U.S. stock markets. The inflation of the so-called
dot-com bubble emerged in the period 1997–2000, when several internet-based
companies entered the markets and fascinated many investors confident in their
future profits, while overlooking the companies’ fundamental value. Ultimately,
this resulted in a gradual collapse, or bubble burst, during the years 2000–2001.
The World Trade Centre was attacked on September 11, 2001. Although markets
recorded a sudden drop, the shock was exogenous and should not be attributed to
internal market forces. The recent financial crisis of 2007–2008, also called the global financial crisis (for a detailed treatment, see Bartram and Bodnar (2009)),
was initiated by the bursting of the U.S. housing-market bubble. Consequently, in
September and October 2008, stock markets experienced large declines. On May
6, 2010, financial markets witnessed the largest intraday drop in history known
as the Flash Crash or The Crash of 2:45. The Dow Jones Industrial Average
declined by approximately 1,000 points (9 %), but the loss was recovered within
a few minutes. The crash was likely caused by high-frequency trading or large
directional bets.
9 For additional information on the crash, see Waldrop (1987) and Carlson (2007).
Fig. 2 Dynamics in gold-oil correlations. The upper plot of the panel contains the realized
correlation for each day of the sample and daily correlations estimated from the DCC GARCH
model. The lower plot contains time-frequency correlations based on the wavelet correlation
estimates from high-frequency data for each month separately. We report correlation dynamics
at 10-min, 40-min, 2.66 h (approximate), and 1.6-day (approximate) horizons depicted by the thick
black to thin black lines. The plots highlight several important recession periods in gray (described
in greater detail in the text), and crashes using black lines: (a) Black Monday; (b) the Asian crisis;
(c) the Russian ruble devaluation; (d) the dot-com bubble burst; (e) the WTC 9/11 attacks; (f) the
Lehman Brothers Holdings bankruptcy; and (g) the Flash Crash
Dynamic correlations for each pair of assets are depicted in Figs. 2, 3, 4. Each
figure consists of two panels that plot correlations obtained by the three methods
described in Sect. 2. The upper panels of the figures display realized volatility-
based correlations computed on 5-min returns for each day and daily correlations
from the parametric DCC GARCH(1,1) estimates. The lower panels depict the
evolution of time-frequency correlations obtained through a wavelet decomposition
Fig. 3 Dynamics in gold-stocks correlations. The upper plot of the panel contains the realized
correlation for each day of the sample and daily correlations estimated from the DCC GARCH
model. The lower plot contains time-frequency correlations based on the wavelet correlation
estimates from high-frequency data for each month separately. We report correlation dynamics
at 10-min, 40-min, 2.66 h (approximate), and 1.6-day (approximate) horizons depicted by the thick
black to thin black lines. The plots highlight several important recession periods in gray (described
in greater detail in the text), and crashes using black lines: (a) Black Monday; (b) the Asian crisis;
(c) the Russian ruble devaluation; (d) the dot-com bubble burst; (e) the WTC 9/11 attacks; (f) the
Lehman Brothers Holdings bankruptcy; and (g) the Flash Crash
of 5-min data.10 The panels display only four investment horizons as examples: 10 min, 40 min, 160 min, and 1.6 days.
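For comparison, the realized correlation in the upper panels follows the standard high-frequency definition (Barndorff-Nielsen and Shephard 2004); a minimal sketch, assuming two already synchronized vectors of one day's intraday returns:

```python
import numpy as np

def realized_correlation(rx, ry):
    """Daily realized correlation: realized covariance over the product
    of realized volatilities, from synchronized intraday returns."""
    return (rx * ry).sum() / np.sqrt((rx ** 2).sum() * (ry ** 2).sum())
```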
The correlations between asset pairs exhibit stable and similar patterns, where for the majority of the time the correlations are low or even negative: until 2001 between gold and stocks, until 2004 between oil and stocks, and until 2005 between gold and oil. After these stable years, the pattern of the correlations fundamentally changes. The general pattern of dynamic correlations between the pairs of variables is the same regardless of which method is used. Nevertheless, there are noticeable differences. Correlations based on realized volatility provide very rough evidence. More contoured correlation patterns are inferred from the DCC GARCH method. The wavelet correlations illustrate the method's advantages over the two previous ones.11
10 For the sake of clarity in the plot, we report monthly correlations, computed on monthly price time series.
Fig. 4 Dynamics in oil-stocks correlations. The upper plot of the panel contains the realized
correlation for each day of the sample and daily correlations estimated from the DCC GARCH
model. The lower plot contains time-frequency correlations based on the wavelet correlation
estimates from high-frequency data for each month separately. We report correlation dynamics
at 10-min, 40-min, 2.66 h (approximate), and 1.6-day (approximate) horizons depicted by the thick
black to thin black lines. The plots highlight several important recession periods in gray (described
in greater detail in the text), and crashes using black lines: (a) Black Monday; (b) the Asian crisis;
(c) the Russian ruble devaluation; (d) the dot-com bubble burst; (e) the WTC 9/11 attacks; (f) the
Lehman Brothers Holdings bankruptcy; and (g) the Flash Crash
11 While the wavelet method is superior to the other two methods in terms of dynamic correlation analysis, we employ the other two methods as a benchmark.
The 95 % confidence intervals are narrow, with values ranging from ±0.014 for the first scale j = 1 up to ±0.04 for the last scale j = 4.12 Thus, based on the 95 % confidence intervals, all reported correlation point estimates are statistically significant.
4.1.1 Gold–Oil
The analysis of the intraday data for the gold-oil pair reveals a short period (1990–
1991) of higher correlations, corresponding to the spike visible in Figs. 2, 3, 4,
which should be associated with the economic downturn in the U.S. from July
12 For the sake of brevity, we do not report confidence intervals for all estimates. These results are available from the authors upon request.
1990 to March 1991. During the period 1992–2005, the intraday correlations are
remarkably low at short and longer horizons; see Table 2. In 2006, a significant
increase in correlation begins, reaching its maximum in 2012 at all investment
horizons. In contrast to the period 1990–1991, the recent financial crisis changed
the correlation structure of the gold and oil pair, indicating the existence of an
important structural break in the correlation structure. This result is in line with
the detected structural break on September 8, 2006 (Sect. 4.2). Therefore, in terms
of risk diversification, the situation changed dramatically for traders active at short-
term investment horizons, as there is a significant increase in correlation after 2008
at all available investment horizons.
Dynamic correlations based on daily data reveal a more complex pattern. From
1987 until just before the global financial crisis erupted, correlations at diverse
investment horizons seem quite heterogeneous (Table 2). We observe very low
correlations at short investment horizons measured in days, whereas at longer investment horizons the correlations differ considerably.
In comparison to the gold-oil pair, the gold-stocks and oil-stocks pairs provide a
rather different picture (Tables 3 and 4). During the period 1991–1992, negative
correlations dominate, especially at longer horizons. The negative correlations are
quite frequent for the two pairs, but they occur more often for the gold-stocks pair.
Since 2001, the gold-stocks pair exhibits very rich correlation dynamics. The period
of negative correlation begins in 2001, reaching its minimum in 2003, followed by a
steady increase. After 2005, this pair exhibits significantly higher correlation, except
for two short periods in 2008 and 2009.13 In 2012, we observe a significant increase in the correlation between gold and stocks at all available scales; its magnitude is roughly three times larger than in the previous year. This finding indicates a very limited possibility to diversify risk between stocks and gold in 2012.
The correlations of the oil-stocks pair also increased after the recent financial
crisis began. Nevertheless, unlike the other two pairs, the correlation between oil
and stocks before the crisis was considerably lower than after the crisis. This implies
that the developments in 2008 had the strongest impact on the correlation structure
of this pair. Further, from 2008 on, this pair has the highest correlation of the three
examined pairs and highly homogeneous correlations at all scales. Therefore, from 2008 until the end of our sample, we observe only a negligible possibility for risk diversification across investment horizons.
13 On an annual basis, there was only a small decrease in 2011, as shown in Table 3.
to zero. Similar patterns are observed for the gold-stocks and oil-stocks pairs, for
which structural breaks were detected on May 5, 2009 and September 26, 2008,
respectively.
Thus, we observe that the correlations between asset pairs were very hetero-
geneous across investment horizons before the structural break. Conversely, after
the structural break, the correlation pattern became mostly homogeneous, which
implies that gold, oil and stocks could no longer be simultaneously included in a
single portfolio for risk diversification purposes. This finding contradicts the results
of Baur and Lucey (2010), who find gold to be a good hedge against stocks and
therefore a safe haven during financial market turmoil. However, our result is in line
with the argument of Bartram and Bodnar (2009) that diversification provided little
help for investors during the financial crisis.
The change in the correlation structure described above can also be attributed to changes in investors' beliefs,14 which became mostly homogeneous across investment horizons after the structural break. The homogeneity can be partially induced by broader uncertainty regarding financial markets' pricing fundamentals.15 Investors' tendency to favor more aggressive strategies may be one of the reasons that we observe increased homogeneity in the correlations across investment horizons. Furthermore, the homogeneity in correlations may have been increased by the introduction of completely electronic trading on exchange platforms in 2005, which was accompanied by an increased volume of automatic trading.
14 Additional information on the role of investors' beliefs can be found in Ben-David and Hirshleifer (2012).
15 Connolly et al. (2007) study the importance of time-varying uncertainty for asset correlations, which subsequently influences the availability of diversification benefits.

5 Conclusions
In this paper, we have studied the linkages among key traded assets from the investment horizon perspective. Thus, we are able to provide unique evidence on how correlations among major assets vary over time and at different investment horizons. We analyze dynamic correlations in the prices of gold, oil, and a broad U.S. stock market index, the S&P 500, over 26 years from January 2, 1987 until December 31, 2012. The analysis is performed on both intra-day and daily data.
Our findings suggest that the wavelet analysis outperforms the standard bench-
mark approaches. Further, it offers a crucial message based on the evidence of very
different patterns in linkages among assets over time. During the period before
the pairs of assets suffered from structural breaks, our results revealed very low,
even negative, but heterogeneous correlations for all pairs. After the breaks, the
correlations for all pairs increased on average, but their magnitudes exhibited large
positive and negative swings. Surprisingly, despite this strongly varying behavior,
the correlations between pairs of assets became homogeneous and did not differ
across distinct investment horizons. A strong implication emerges. Prior to the
structural break, it was possible to use all three assets in a well-diversified portfolio.
However, after the structural changes occurred, gold, oil, and stocks could not be
used in conjunction for risk diversification purposes during the post-break period
studied.
Wavelet Variance

The variance of a time series can be decomposed into its frequency components, which are called scales in the wavelet methodology. Using wavelets, we can identify the portion of variance attributable to a specific frequency band of the examined time series. In this appendix, we show how to estimate the wavelet variance and demonstrate that the summation of all of its scale components yields the variance of the time series.
Let us suppose a real-valued stochastic process x_i, i = 1, ..., N, whose (L/2)-th backward difference is a covariance stationary stochastic process with mean zero. Then the sequence of MODWT wavelet coefficients W_x(j,s) unaffected by the boundary conditions is, for all scales j = 1, 2, ..., J^m, also a stationary process with mean zero. As we use the least asymmetric wavelet of length L = 8, we can expect stationarity of the MODWT wavelet coefficients. Following Percival (1995), we define the wavelet variance at scale j as the variance of the wavelet coefficients at that scale:

$$\nu_x^2(j) = \mathrm{Var}\big(W_x(j,s)\big). \qquad (15)$$
For coefficients unaffected by the boundary conditions, whose number is defined for each scale separately as M_j = N − L_j + 1 > 0, the unbiased estimator of the wavelet variance at scale j reads:

$$\hat{\nu}_x^2(j) = \frac{1}{M_j} \sum_{s=L_j-1}^{N-1} W_x(j,s)^2. \qquad (16)$$
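As a numerical illustration of Eq. (16) and of the variance decomposition discussed next, the following sketch (reusing the hypothetical modwt helper and Haar filters from the sketch given earlier in this chapter) estimates the wavelet variances across scales and checks that, together with the variance of the scaling coefficients, they approximately recover the sample variance:

```python
import numpy as np
# reuses modwt(), h, g from the earlier Haar sketch (filter width L = 2)

def wavelet_variance(Wj, j, L=2):
    """Unbiased estimator of Eq. (16) at level j (boundary terms dropped)."""
    Lj = (2 ** j - 1) * (L - 1) + 1
    return (Wj[Lj - 1:] ** 2).sum() / (Wj.size - Lj + 1)

rng = np.random.default_rng(1)
x = rng.standard_normal(1024)
J = 6
W, V = modwt(x, h, g, J)
nu2 = [wavelet_variance(Wj, j) for j, Wj in enumerate(W, start=1)]
# Finite-scale analogue of Eq. (19): wavelet variances plus the variance of
# the scaling coefficients approximately recover var(x); the match is not
# exact because the unbiased estimator excludes boundary coefficients.
print(sum(nu2) + V.var(), x.var())
```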
The wavelet variance also has a spectral representation:

$$\nu_x^2(j) = \int_{-1/2}^{1/2} H_j(f)\, S_x(f)\, df, \qquad (17)$$

where H_j(f) is the squared gain function of the wavelet filter h_j and S_x(f) denotes the spectral density of x_i (Percival and Walden 2000). As the variance of the process x_i is the sum of the contributions of the wavelet variances at all scales, we can write:

$$\mathrm{var}(x) = \sum_{j=1}^{\infty} \nu_x^2(j). \qquad (18)$$
In case we have only a finite number of scales, we must also add the variance of the vector of scaling coefficients; thus we can write:

$$\mathrm{var}(x) = \mathrm{Var}\big(V_x(J^m,s)\big) + \sum_{j=1}^{J^m} \nu_x^2(j). \qquad (19)$$

Wavelet Covariance

The wavelet covariance at scale j is defined analogously as the covariance of the scale-j MODWT wavelet coefficients of x_t and y_t:

$$\gamma_{xy}(j) = \mathrm{Cov}\big(W_x(j,s), W_y(j,s)\big), \qquad (20)$$

and its unbiased estimator, using only the M_j coefficients unaffected by the boundary conditions, reads:

$$\hat{\gamma}_{xy}(j) = \frac{1}{M_j} \sum_{s=L_j-1}^{N-1} W_x(j,s)\, W_y(j,s). \qquad (21)$$
For a finite real time series, the number of scales is limited by J^m ≤ log₂(T); the covariance of x_t and y_t is then the sum of the covariances of the MODWT wavelet coefficients γ_xy(j) at all scales j = 1, 2, ..., J^m and the covariance of the scaling coefficients at scale J^m:

$$\mathrm{Cov}(x_t, y_t) = \mathrm{Cov}\big(V_x(J^m,s), V_y(J^m,s)\big) + \sum_{j=1}^{J^m} \gamma_{xy}(j). \qquad (23)$$
Finally, we recall the basic properties of the MODWT wavelet and scaling filters {h_l} and {g_l} of width L:

$$\sum_{l=0}^{L-1} h_l = 0, \qquad \sum_{l=0}^{L-1} h_l^2 = \frac{1}{2}, \qquad \sum_{l=-\infty}^{\infty} h_l\, h_{l+2n} = 0 \;\text{ for all nonzero integers } n, \qquad (24)$$

$$\sum_{l=0}^{L-1} g_l = 1, \qquad \sum_{l=0}^{L-1} g_l^2 = \frac{1}{2}, \qquad \sum_{l=-\infty}^{\infty} g_l\, g_{l+2n} = 0 \;\text{ for all nonzero integers } n. \qquad (25)$$
The transfer function of a MODWT filter {h_l} at frequency f is defined via the Fourier transform as

$$H(f) = \sum_{l=-\infty}^{\infty} h_l\, e^{-i 2\pi f l} = \sum_{l=0}^{L-1} h_l\, e^{-i 2\pi f l}. \qquad (26)$$
References
Aggarwal R, Lucey BM (2007) Psychological barriers in gold prices? Rev Financ Econ 16(2):217–
230
Aguiar-Conraria L, Martins M, Soares MJ (2012) The yield curve and the macro-economy across
time and frequencies. J Econ Dyn Control 36:1950–1970
Andersen T, Benzoni L (2007) Realized volatility. In: Andersen T, Davis R, Kreiss J, Mikosch T
(eds) Handbook of financial time series. Springer, Berlin
Andersen T, Bollerslev T, Diebold F, Labys P (2003) Modeling and forecasting realized volatility.
Econometrica 71(2):579–625
Andrews DW (1993) Tests for parameter instability and structural change with unknown change point. Econometrica 61:821–856
Andrews DW, Ploberger W (1994) Optimal tests when a nuisance parameter is present only under
the alternative. Econometrica 62:1383–1414
Bandi F, Russell J (2006) Volatility. In: Birge J, Linetsky V (eds) Handbook of financial
engineering. Elsevier, Amsterdam
Barndorff-Nielsen O, Shephard N (2004) Econometric analysis of realized covariation: high
frequency based covariance, regression, and correlation in financial economics. Econometrica
72(3):885–925
Bartram SM, Bodnar GM (2009) No place to hide: the global crisis in equity markets in 2008/2009.
J Int Money Finance 28(8):1246–1292
Baur DG, Lucey BM (2010) Is gold a hedge or a safe haven? an analysis of stocks, bonds and gold.
Financ Rev 45(2):217–229
Bauwens L, Laurent S (2005) A new class of multivariate skew densities, with application to
generalized autoregressive conditional heteroscedasticity models. J Bus Econ Stat 23(3):346–
354
Ben-David I, Hirshleifer D (2012) Are investors really reluctant to realize their losses? trading
responses to past returns and the disposition effect. Rev Financ Stud 25(8):2485–2532
Bollerslev T (1990) Modelling the coherence in short-run nominal exchange rates: a multivariate
generalized arch approach. Rev Econ Stat 72(3):498–505
Büyükşahin B, Robe MA (2013) Speculation, commodities and cross-market linkages. J Int Money
Finance 42:38–70
Carlson M (2007) A brief history of the 1987 stock market crash with a discussion of the federal
reserve response. Divisions of Research & Statistics and Monetary Affairs, Federal Reserve
Board
Cashin P, McDermott C, Scott A (1999) The myth of co-moving commodity prices. IMF working
paper WP/99/169 international monetary fund, Washington
Cashin P, McDermott CJ, Scott A (2002) Booms and slumps in world commodity prices. J Dev
Econ 69(1):277–296
Chui C (1992) An introduction to wavelets. Academic, New York
Connolly RA, Stivers C, Sun L (2007) Commonality in the time-variation of stock–stock and
stock–bond return comovements. J Financ Mark 10(2):192–218
Daubechies I (1992) Ten lectures on wavelets. SIAM, Philadelphia
Engle R (2002) Dynamic conditional correlation: a simple class of multivariate generalized
autoregressive conditional heteroskedasticity models. J Bus Econ Stat 20(3):339–350
Engle RF, Sheppard K (2001) Theoretical and empirical properties of dynamic conditional
correlation multivariate garch. Technical report, National Bureau of Economic Research
Faÿ G, Moulines E, Roueff F, Taqqu M (2009) Estimators of long-memory: Fourier versus
wavelets. J Econom 151(2):159–177
Fratzscher M, Schneider D, Van Robays I (2013) Oil prices, exchange rates and asset prices.
CESifo working paper no 4264
Gadanecz B, Jayaram K (2009) Measures of financial stability–a review. Bank for international
settlements, IFC bulletin 3
Gallegati M, Gallegati M, Ramsey JB, Semmler W (2011) The us wage phillips curve across
frequencies and over time. Oxf Bull Econ Stat 74(4):489–508
Gençay R, Selçuk F, Whitcher B (2002) An introduction to wavelets and other filtering methods in
finance and economics. Academic, San Diego
Graham M, Kiviaho J, Nikkinen J (2013) Short-term and long-term dependencies of the s&p 500
index and commodity prices. Quant Finance 13(4):583–592
Hamilton J (2009) Causes and consequences of the oil shock of 2007-08. Brookings papers in
economic activity 40(1):215–283
Hansen BE (1992) Tests for parameter instability in regressions with i (1) processes. J Bus Econ
Stat 10:321–335
Hansen BE (1997) Approximate asymptotic p values for structural-change tests. J Bus Econ Stat 15(1):60–67
Hansen P, Lunde A (2006) Realized variance and market microstructure noise. J Bus Econ Stat
24(2):127–161
Khordagui H, Al-Ajmi D (1993) Environmental impact of the gulf war: an integrated preliminary
assessment. Environ Manage 17(4):557–562
Mallat S (1998) A wavelet tour of signal processing. Academic, San Diego
Markowitz H (1952) Portfolio selection. J Finance 7(1):77–91
Marshall JF (1994) The role of the investment horizon in optimal portfolio sequencing (an intuitive
demonstration in discrete time). Financ Rev 29(4):557–576
McAleer M, Medeiros MC (2008) Realized volatility: a review. Econom Rev 27(1–3):10–45
Percival DB (1995) On estimation of the wavelet variance. Biometrika 82:619–631
Percival D, Walden A (2000) Wavelet methods for time series analysis. Cambridge University
Press, Cambridge
Pindyck RS, Rotemberg JJ (1990) The excess co-movement of commodity prices. Econ J
100(403):1173–89
Ramsey JB (2002) Wavelets in economics and finance: past and future. Stud Nonlin Dyn Econom
6(3): Article 1, 1–27
Roueff F, Sachs R (2011) Locally stationary long memory estimation. Stoch Process Appl
121(4):813–844
Samuelson PA (1989) The judgment of economic science on rational portfolio management:
indexing, timing, and long-horizon effects. J Portf Manage 16(1):4–12
Serroukh A, Walden AT, Percival DB (2000) Statistical properties and uses of the wavelet variance
estimator for the scale analysis of time series. J Am Stat Assoc 95:184–196
Vacha L, Barunik J (2012) Co-movement of energy commodities revisited: evidence from wavelet
coherence analysis. Energy Econ 34(1):241–247
Waldrop MM (1987) Computers amplify black monday: the sudden stock market decline raised questions about the role of computers; they may not have actually caused the crash, but may well have amplified it. Science 238(4827):602
Whitcher B, Guttorp P, Percival DB (2000) Wavelet analysis of covariance with application to atmospheric time series. J Geophys Res 105(D11):14941–14962
Forecasting via Wavelet Denoising: The Random Signal Case
Joanna Bruzda
1 Introduction
Time-scale (wavelet) analysis is well known for its ability to examine the frequency content of the processes under scrutiny with a good joint time-frequency resolution. The endogenously varying time window, which underlies this type of frequency analysis, makes the approach computationally efficient, enables a precise timing of events causing or influencing economic fluctuations, and makes it possible to analyze economic relationships decomposed according to time horizons. The latter characteristic of wavelet analysis exploits the fact that this type of study is not limited to the short and the long run: it permits an examination for octave frequency bands, as is the case for the discrete wavelet transform (DWT) and the so-called non-decimated discrete wavelet methodology, or even for arbitrary frequency bands when continuous wavelet methods are used. As Ramsey and Lampart (1998) noticed, economists have long recognized that
J. Bruzda ()
Nicolaus Copernicus University, Toruń, Poland
e-mail: [email protected]
1 For the forecasting procedures suggested further in the text, the best outcomes are usually produced by the shortest Daubechies wavelets and a small number (1–2) of decomposition stages. In any case, however, a reasonable strategy may be to use an optimized forecasting setup; see also our remarks in the concluding section.
The wavelet transform consists in decomposing a signal into rescaled and shifted
versions of a function, called the mother wavelet, which integrates to 0 and whose
energy is equal to 1. It is assumed that dilated and translated copies of the mother
wavelet (called the wavelet atoms or the daughter wavelets) are well localized on
both the time and frequency axes. This way the wavelet transform is well suited to
examine phenomena exhibiting various periodicities evolving with time. In contrast
to the short-time Fourier transform, which uses the same data windows to analyze
different frequencies, the wavelet transform is a windowing technique with a varying
window size, i.e., it examines high frequency oscillations in small data windows
and low frequency components in large windows. This distinguishing property of
wavelets, which makes it possible to ‘see both the forest and the trees’, is referred to
as the wavelet zoom and stands behind the success of wavelets in applied science.
Further in the text we will be working with discretely sampled wavelets (i.e., we will conduct cascade filtering known as the discrete wavelet transform, DWT)
because, following Percival and Walden (2000), we feel that they are a natural tool for analyzing discrete time series.
Let us consider a discrete signal of length N = 2^{J_0} of the form x = (x_0, x_1, ..., x_{N-1})^T. The DWT of x is defined via a couple of quadrature mirror filters: the (half-band) high-pass wavelet filter {h_l}_{l=0,...,L-1} and the corresponding (half-band) low-pass scaling filter {g_l}_{l=0,...,L-1}, where L is the filters' width and L is even. The two filters fulfill the so-called quadrature mirror relationship, i.e., g_l = (−1)^{l+1} h_{L-1-l}; they are even-shift orthogonal; the coefficients of the wavelet filter sum to 0 and those of the scaling filter to √2, while in both cases their squares sum to 1 (see Percival and Walden 2000, Chap. IV). In our simulation and empirical studies in Sects. 5 and 6 we concentrate exclusively on the two shortest filters belonging to the family of Daubechies filters. These are the Haar filters, defined as:
filters defined as:
h0 D 1
p
2
; h1 D p1 2 ; g0 D 1
p
2
; g1 D 1
p
2
; (1)
and the Daubechies extremal phase filters of width 4 (denoted d4), for which:

$$h_0 = \frac{1-\sqrt{3}}{4\sqrt{2}}, \quad h_1 = \frac{\sqrt{3}-3}{4\sqrt{2}}, \quad h_2 = \frac{3+\sqrt{3}}{4\sqrt{2}}, \quad h_3 = -\frac{1+\sqrt{3}}{4\sqrt{2}}. \qquad (2)$$
The DWT wavelet and scaling coefficients are computed via the pyramid algorithm:

$$W_{j,t} = \sum_{l=0}^{L-1} h_l\, V_{j-1, (2t+1-l) \bmod N_{j-1}}, \qquad V_{j,t} = \sum_{l=0}^{L-1} g_l\, V_{j-1, (2t+1-l) \bmod N_{j-1}}, \qquad (3)$$

while the corresponding non-decimated (MODWT) coefficients are obtained as:

$$\widetilde{W}_{j,t} = \sum_{l=0}^{L-1} \tilde{h}_l\, \widetilde{V}_{j-1, (t-2^{j-1} l) \bmod N}, \qquad \widetilde{V}_{j,t} = \sum_{l=0}^{L-1} \tilde{g}_l\, \widetilde{V}_{j-1, (t-2^{j-1} l) \bmod N} \qquad (4)$$

for t = 0, ..., N − 1, j = 1, ..., J_0, where \widetilde{V}_{0,t} = x_t (t = 0, ..., N − 1), \tilde{h}_l = h_l/\sqrt{2} and \tilde{g}_l = g_l/\sqrt{2}.
In the definitions above the circular convolution is used in order to define the
boundary coefficients, although according to our forecast procedures described in
further sections of this paper the coefficients affected by the extrapolation method
are removed or replaced with coefficients computed with backcasted values of the
signal.
In our further considerations we will adopt the following matrix notation. First, the coefficients of the DWT at level J ≤ J_0 can be written in the form of a vector:

$$\mathbf{W} = \begin{bmatrix} \mathbf{W}_1 \\ \mathbf{W}_2 \\ \vdots \\ \mathbf{W}_J \\ \mathbf{V}_J \end{bmatrix}_{N \times 1}, \qquad \mathbf{W} = \mathcal{W} x,$$

where \mathcal{W} denotes the N × N orthonormal matrix defining the DWT. Inverting the transform yields the wavelet multiresolution analysis (MRA):

$$x = \mathcal{W}^T \mathbf{W} = \sum_{j=1}^{J} \mathcal{W}_j^T \mathbf{W}_j + \mathcal{V}_J^T \mathbf{V}_J = \sum_{j=1}^{J} D_j + S_J. \qquad (5)$$
Analogously, for the MODWT we write:

$$\widetilde{\mathbf{W}}_j = \widetilde{\mathcal{W}}_j x, \qquad \widetilde{\mathbf{V}}_J = \widetilde{\mathcal{V}}_J x,$$

and the corresponding MODWT-based MRA reads:

$$x = \sum_{j=1}^{J} \widetilde{\mathcal{W}}_j^T \widetilde{\mathbf{W}}_j + \widetilde{\mathcal{V}}_J^T \widetilde{\mathbf{V}}_J = \sum_{j=1}^{J} \widetilde{D}_j + \widetilde{S}_J. \qquad (6)$$
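To see the MRA of Eq. (5) in practice, the following Python sketch (our illustration) uses PyWavelets to compute the details D_j and the smooth S_J by inverting the DWT with all other coefficient bands zeroed; note that pywt's sign and alignment conventions may differ slightly from those used here:

```python
import numpy as np
import pywt

x = np.random.default_rng(2).standard_normal(128)
J = 3
# coeffs = [V_J, W_J, ..., W_1], with periodized (circular) boundaries
coeffs = pywt.wavedec(x, 'haar', level=J, mode='periodization')

components = []
for k in range(len(coeffs)):
    kept = [c if i == k else np.zeros_like(c) for i, c in enumerate(coeffs)]
    components.append(pywt.waverec(kept, 'haar', mode='periodization'))

# components[0] is the smooth S_J; components[1:] are details D_J, ..., D_1
print(np.allclose(np.sum(components, axis=0), x))  # Eq. (5): x = sum D_j + S_J
```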
We finish this section with two further remarks. First, for our considerations in Sect. 4 we notice that, exclusively in the case of the Haar filters, the following additive decomposition also takes place:

$$x = \widetilde{\mathbf{W}}_1 + \widetilde{\mathbf{W}}_2 + \cdots + \widetilde{\mathbf{W}}_J + \widetilde{\mathbf{V}}_J. \qquad (7)$$

This becomes obvious by noting that the following relationships hold for J ≤ J_0:

$$\widetilde{W}_{J,t} = \tfrac{1}{2}\big(\widetilde{V}_{J-1,t} - \widetilde{V}_{J-1,\,t-2^{J-1}}\big), \qquad \widetilde{V}_{J,t} = \tfrac{1}{2}\big(\widetilde{V}_{J-1,t} + \widetilde{V}_{J-1,\,t-2^{J-1}}\big),$$
Among the methods of wavelet forecasting of univariate time series, Schlüter and
Deuschle (2010) distinguished the following:
1. Forecasting signals estimated via wavelet thresholding
2. Structural time series modeling with component processes obtained through the
wavelet MRA
3. Modeling wavelet and scaling coefficients
4. Forecasting locally stationary wavelet processes.
The first approach makes use of the fact that a white noise stochastic component in a representation of the form 'signal + noise' affects wavelet coefficients at all decomposition levels in the same way. As a result, for deterministic signals it is suggested to remove all the coefficients that are smaller in magnitude than a certain threshold. The modified wavelet coefficients constitute the building blocks of the signal estimate, which is eventually obtained via the inverse wavelet transform. In order to compute the modified coefficients, the so-called hard and soft thresholding rules are usually used. The former is defined as:

$$W_{jt}^{(\text{hard})} = \begin{cases} W_{jt}, & |W_{jt}| > \lambda, \\ 0, & |W_{jt}| \le \lambda, \end{cases} \qquad (8)$$

where λ is the assumed threshold value. For DWT-based denoising, the threshold can be set globally, while in the case of the non-decimated wavelet transform the threshold changes with the decomposition stage.2 So far, DWT-based thresholding has mainly been used for forecasting purposes (see Alrumaih and Al-Fawzan 2002; Ferbar et al. 2009; Schlüter and Deuschle 2010), and the thresholds were those introduced by Donoho and Johnstone (1994, 1995).
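A compact sketch of this first approach (our illustration, using PyWavelets, the hard rule of Eq. (8), and the Donoho-Johnstone universal threshold with the usual MAD estimate of the noise scale):

```python
import numpy as np
import pywt

def denoise_hard(y, wavelet='haar', level=2):
    """DWT denoising with the hard rule of Eq. (8) and a global universal
    threshold lambda = sigma*sqrt(2*log N); sigma is estimated by the MAD
    of the finest-level wavelet coefficients."""
    coeffs = pywt.wavedec(y, wavelet, level=level, mode='periodization')
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    lam = sigma * np.sqrt(2.0 * np.log(len(y)))
    den = [coeffs[0]] + [pywt.threshold(c, lam, mode='hard') for c in coeffs[1:]]
    return pywt.waverec(den, wavelet, mode='periodization')
```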
In Schlüter and Deuschle (2010), the authors document that wavelet denoising
improves short-term forecasts from ARMA and ARIMA models. This conclusion
was obtained for an oil price series and a Euro/Dollar exchange rate. Alrumaih and
Al-Fawzan (2002) came to similar conclusions for the Saudi Stock Index. Finally,
Ferbar et al. (2009) showed in a simulation study that wavelet thresholding is an
attractive alternative to exponential smoothing for forecasting with an asymmetric
loss function.
The second approach to wavelet forecasting makes use of the wavelet multires-
olution analysis defined via Eqs. (5) and (6) and consists in a separate modeling
and forecasting of details and smooths. The final forecast is obtained as a sum
of predictions for the component series. The approach was suggested in Arino
(1995) and further applied in different forms by, among others, Yu et al. (2001),
Zhang et al. (2001), Wong et al. (2003) and Fernandez (2008). In particular, Arino
(1995) analyzes monthly car sales in Spain by defining two component processes:
seasonal and irregular fluctuations D_1 + D_2 + D_3 + D_4, and the trend component S_4.
(S)ARIMA models are then estimated for each of these series. By contrast, Zhang
et al. (2001) use MODWT-based multiresolution analysis with five decomposition
levels and combine it with neural network processing, while in Yu et al. (2001)
2 For a presentation and discussion of thresholding rules and methods of threshold selection, see Ogden (1997), Chaps. VII and VIII; Vidakovic (1999), Chap. VI; Percival and Walden (2000), Chap. X; and Nason (2008), Chap. III.
Signal estimation based on wavelet thresholding assumes that the observed process
is composed of a deterministic signal and a random (white or colored) noise. Such an
assumption results, in practice, in elimination of the high frequency spectrum of the
process, i.e., the variance of wavelet coefficients from lower decomposition levels
(especially from the first level) is entirely attributed to the noise. This, however, will
be inappropriate when the estimated signal is stochastic (but see wavelet denoising
for the so-called sparse signals discussed in Percival and Walden 2000, Chap. X).
Let the observed process be given as:

$$y_t = x_t + \epsilon_t, \qquad (9)$$

where x_t is the (random) signal of interest and ε_t is a noise term. The scaling estimator of Percival and Walden (2000) rescales the wavelet coefficients of y_t as:

$$\widehat{W}_{jt}^x = a_j\, W_{jt}^y. \qquad (10)$$

Minimizing the mean squared error

$$E\|x - \hat{x}\|^2, \qquad (11)$$

where x and \hat{x} are column vectors of length N of the signal and its estimate, the following solution is obtained:

$$a_j = \frac{E\big(W_{jt}^x W_{jt}^y\big)}{E\big[(W_{jt}^y)^2\big]} = \frac{E\big[(W_{jt}^x)^2\big]}{E\big[(W_{jt}^y)^2\big]}. \qquad (12)$$
The final signal estimate is computed via the inverse DWT, assuming that the
scaling coefficients of yt are left unchanged.
In Bruzda (2013a), in order to estimate the signal x_t, a similar idea to the one above was applied to the MODWT coefficients, leading to two kinds of estimators. The first among them has the following form for a given J ≤ J_0 = log₂ N:

$$\tilde{x} = \sum_{j=1}^{J} \widetilde{\mathcal{W}}_j^T \widehat{\widetilde{\mathbf{W}}}_j^x + \widetilde{\mathcal{V}}_J^T \widehat{\widetilde{\mathbf{V}}}_J^x = \sum_{j=1}^{J} a_j \widetilde{D}_j^y + b_J \widetilde{S}_J^y, \qquad (13)$$
where \widehat{\widetilde{W}}_{jt}^x is defined similarly to \widehat{W}_{jt}^x in Eq. (10), i.e.:

$$\widehat{\widetilde{W}}_{jt}^x = a_j\, \widetilde{W}_{jt}^y, \qquad (10a)$$

while \widehat{\widetilde{\mathbf{V}}}_J^x is again equal to \widetilde{\mathbf{V}}_J^y (i.e., b_J is set to 1) or is obtained via rescaling \widetilde{\mathbf{V}}_J^y. The second ('non-inverse') estimator skips the inverse transform and sums the rescaled MODWT coefficients directly:

$$\tilde{\tilde{x}} = \sum_{j=1}^{J} a_j \widetilde{\mathbf{W}}_j^y + b_J \widetilde{\mathbf{V}}_J^y, \qquad (14)$$

where this time we exclusively consider the Haar filters defined in Eq. (1). The reasons for choosing the Haar wavelet are the following. First, other wavelet filters are associated with larger phase shifts and do not necessarily generate decreasing smoothing weights. Moreover, for the Haar wavelet the additive decomposition defined in Eq. (7) holds, which results in leaving signals not corrupted by noise unchanged.
The smoothing constants a_j are obtained as in Eq. (12), although now it is more convenient to define them on the basis of the non-decimated coefficients, i.e.:

$$a_j = \frac{E\big(\widetilde{W}_{jt}^x \widetilde{W}_{jt}^y\big)}{E\big[(\widetilde{W}_{jt}^y)^2\big]} = \frac{E\big[(\widetilde{W}_{jt}^x)^2\big]}{E\big[(\widetilde{W}_{jt}^y)^2\big]} = 1 - \frac{E\big[(\widetilde{W}_{jt}^\epsilon)^2\big]}{E\big[(\widetilde{W}_{jt}^y)^2\big]}. \qquad (15)$$
Assuming that the wavelet variance of the noise at the first stage of the wavelet decomposition (i.e., for scale τ₁ = 1) is given as:

$$\nu_\epsilon^2(\tau_1) = E\big[(\widetilde{W}_{1t}^\epsilon)^2\big] = h\, \nu_y^2(\tau_1) = h\, E\big[(\widetilde{W}_{1t}^y)^2\big]$$

for certain h ∈ [0, 1], from our assumption about the noise term ε_t we finally obtain:

$$\widehat{\widetilde{\mathbf{W}}}_j^x = \left[1 - \frac{h\, \nu_y^2(\tau_1)}{2^{j-1}\, \nu_y^2(\tau_j)}\right] \widetilde{\mathbf{W}}_j^y, \qquad (16)$$

where ν_y²(τ_j) = E[(\widetilde{W}_{jt}^y)²] is the wavelet variance of y_t at scale τ_j = 2^{j-1} (j = 1, 2, ..., J) and it is assumed that the expression in the square brackets is positive.
A similar reasoning can be applied to the scaling coefficients. If the scaling coefficients are (trend-)stationary with a mean value m_t, the smoothing formula becomes:

$$\widehat{\widetilde{V}}_{Jt}^x = m_t + \left[1 - \frac{h\, \nu_y^2(\tau_1)}{2^{J-1}\, E\big(\widetilde{V}_{Jt}^y - m_t\big)^2}\right]\big(\widetilde{V}_{Jt}^y - m_t\big). \qquad (17)$$
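Putting Eqs. (14), (16) and (17) together, a Python sketch of the 'non-inverse' Haar wavelet smoother follows (our illustration: it treats m_t as a constant mean and, for brevity, estimates the wavelet variances from all coefficients rather than with the boundary-free unbiased estimator used in the chapter):

```python
import numpy as np

def haar_modwt(y, J):
    """Haar MODWT pyramid with circular boundaries, cf. Eq. (4)."""
    W, V = [], np.asarray(y, dtype=float)
    for j in range(1, J + 1):
        shift = np.roll(V, 2 ** (j - 1))     # V_{j-1, t - 2^(j-1)}
        W.append(0.5 * (V - shift))
        V = 0.5 * (V + shift)
    return W, V

def wavelet_smooth(y, J, h_par=0.75):
    """'Non-inverse' Haar wavelet smoothing, Eqs. (14) and (16)-(17)."""
    W, V = haar_modwt(y, J)
    nu2 = [np.mean(Wj ** 2) for Wj in W]     # wavelet variances of y
    xhat = np.zeros(len(y))
    for j, Wj in enumerate(W, start=1):
        # Eq. (16); weights are truncated at zero if the bracket turns negative
        a_j = max(1.0 - h_par * nu2[0] / (2 ** (j - 1) * nu2[j - 1]), 0.0)
        xhat += a_j * Wj
    m = np.mean(y)                           # constant-mean treatment of m_t
    b_J = max(1.0 - h_par * nu2[0] / (2 ** (J - 1) * np.mean((V - m) ** 2)), 0.0)  # Eq. (17)
    return xhat + m + b_J * (V - m)          # Eq. (14): sum of rescaled coefficients
```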
Because the wavelet coefficients \widetilde{W}_{jt} are obtained via filtering which embeds difference operators, while the scaling coefficients \widetilde{V}_{Jt} are computed via cascade filtering with filters whose coefficients sum up to 1, for a level-J decomposition the sum of the smoothing weights defining \tilde{\tilde{x}} is always equal to b_J. Because the details \widetilde{D}_{jt} and smooths \widetilde{S}_{Jt} share the above-mentioned properties of \widetilde{W}_{jt} and \widetilde{V}_{Jt}, the weights of the filter which defines the estimator \tilde{x} also sum up to b_J, but this time the filter is symmetric and has wider support. For example, for the wavelet filters considered further in the simulation and empirical parts of the paper, given in Eqs. (1) and (2), a level J = 1, 2, 3, 4 decomposition results in filters of width 3, 7, 15, 31 in the case of the Haar wavelet and 7, 19, 43, 91 in the case of the d4 wavelet, respectively. On the other hand, the width of the causal filter defining the estimator in Eq. (14) is equal to 2^J.
The scaling estimator of Percival and Walden requires that wavelet filters be
(relatively) good at decorrelating processes, both within and between scales.3 For
the estimators in Eqs. (13) and (14) we assume that the MODWT coefficients
corresponding to different dyadic frequency intervals can be treated as approximate
band-pass white noises and that between-scale wavelet decorrelation is relatively
effective.
In order to use the methods suggested here in real-data applications, the wavelet variance ν_y²(τ_j) must be replaced with its estimate. To this end we propose to use an estimator utilizing, again, the MODWT coefficients and excluding all boundary values, since such an estimator is unbiased and efficient.4 It was shown in Bruzda (2013a) via extensive computer simulations that combining
3 For a discussion of the Daubechies filters' decorrelation properties see, e.g., Bruzda (2013b), Chap. II.
4 This estimator is more efficient than its DWT-based counterpart (for details see Percival 1995; Percival and Walden 2000, pp. 308–310). There are, however, instances when a biased MODWT-based estimator can outperform the unbiased estimator; see, e.g., Bruzda (2011) for time delay estimation at higher scales, as well as references in Percival and Walden (2000), p. 378.
Fig. 1 Signal estimates obtained with the Haar wavelet and the estimators \tilde{\tilde{x}} [graphs (a) and (b); estimates H1–H4 correspond to J = 1, ..., 4] and \tilde{x} [graphs (c) and (d); estimates invH1–invH4 correspond to J = 1, ..., 4]; series N2883 from the M3-IJF Competition database; h = 1
this estimator of the wavelet variance with the formulae in Eqs. (13) and (14) leads,
in small samples, to signal approximations which are often associated with smaller
MSEs than a signal extraction based on parametric models and maximum likelihood
estimation.
As an illustration we applied our two signal estimators, using the Haar filters in each case, to one of the series analyzed further in Sect. 5: series N2883 of length 76 from the M3-IJF Competition database (see Makridakis and Hibon 2000); see Figs. 1, 2, and 3. For the estimator based on the inverse wavelet transform, the mean value of the series was used to extrapolate the data at the end of the sample, while the values at the beginning of the signal were not estimated. Both the wavelet and scaling coefficients were rescaled assuming that the smoothing parameter h is equal to 1, 0.75 and 0.5, respectively. Up to four decomposition levels were considered. Clearly, the 'non-inverse' method produces a phase shift, which is, however, relatively small. Besides, this example seems to suggest that assuming h = 1 may result in too smooth signal estimates for short-term forecasting.
We end this section with an outline of the forecasting procedures that are used further in Sects. 5 and 6. The 'non-inverse' estimator \tilde{\tilde{x}} is directly applicable to forecasting, i.e., after signal estimation we can, e.g., build some parametric models and use them to compute forecasts, whereas the estimator \tilde{x} requires considering a
Fig. 2 Signal estimates obtained with the Haar wavelet; h = 0.75; see Fig. 1 for more details
Fig. 3 Signal estimates obtained with the Haar wavelet; h = 0.5; see Fig. 1 for more details
suggested approach can reduce forecast error variance in relatively small samples,
short Daubechies wavelets seem to be the most promising devices in order to reach
this aim. In what follows, we refer to the suggested methods of signal estimation and
forecasting as wavelet smoothing because they resemble exponential smoothing but
use a specific weighting scheme depending, additionally, on the spectral content of
the processes under scrutiny.
In the following two sections we concentrate on examining the predictive abilities
of the procedures proposed here with some simulated and real data.
5 A Simulation Study
The simulation study aimed to examine whether wavelet smoothing can help to reduce forecast error variance in the case of short-term predictions based on relatively small samples. Below we report our results obtained for samples of length N = 50, although a portion of the data-generating processes (DGPs) considered here was also examined for N = 35 and 100, and we comment briefly on these simulations as well. Forecasts up to five steps ahead were computed, i.e., series of length N + 5 were generated and the last five observations were used in the forecast evaluation. All computations were performed in Matlab R2008b with Optimization Toolbox Version 4.1.5 The DGPs were of the form:

$$y_t = x_t + \epsilon_t, \qquad \epsilon_t \sim \text{n.i.i.d.}(0, \sigma_\epsilon^2),$$

with, among others, the signal

(a) $x_t = 0.9\, x_{t-1} + \eta_t$,

and $\eta_t \sim \text{n.i.i.d.}(0, 1)$. The parameter σ_ε² took on two values: 1 and 4. Also, we experimented with other parameter values (e.g., with 0.9 replaced by 0.6 in the DGP (a)) as well as certain nonstationary specifications, especially the following signal
5 The Matlab codes used in the simulation and empirical studies are available upon request.
models:

$$x_t = 1.3\, x_{t-1} - 0.5\, x_{t-2} + \eta_t \qquad \text{and} \qquad x_t = x_{t-1} - 0.5\, x_{t-2} + \eta_t,$$

where $\eta_t \sim \text{n.i.i.d.}(0, 1)$. Each time all initial observations were set to 0.
The series y_t was forecasted up to five steps ahead with the following methods:
(I) Simple exponential smoothing with a given/estimated value of the smoothing parameter α and the starting value equal to the first observation in the sample;
(II) ARMA(p, q) models with (p, q) chosen according to the AIC (BIC) criterion
and maximum orders (1, 1) in the case of the signal (a) and (3, 3) for the
second DGP, estimated by the maximum likelihood method with the Matlab
function armax under the default settings6 ;
(III) ARIMA(p, 1, q) models, identified and estimated as in (II);
(IV) The naïve (no-change) method
and the procedures suggested here:
(V) The ‘non-inverse’ method with h D 1 (0.75) and the maximum level of
decomposition from 1 to 4, combined with (II)–(IV);
(VI) The ‘inverse’ method based on the Haar wavelet with h D 1 (0.75) and the
maximum level of decomposition from 1 to 4, where the same forecasting
methods were used in both steps, (II) and (III), respectively, or an ARIMA
model in the first step was combined with the no-change method (IV) in the
second;
(VII) The ‘inverse’ method based on the Daubechies d4 wavelet with h D 1 (0.75)
and the maximum level of decomposition 1 and 2, combined with (II)–(IV)
as in (VI).
Two methods of scaling coefficient treatment were considered: leaving them
unchanged and scaling according to the formula in Eq. (17), used under the
assumption of an estimated constant mean value. Since the latter approach produced
better outcomes on average, this one is exclusively reported here. Also, we
experimented with wavelet smoothing under h = 0.5, but the results were usually less satisfying than those presented here. The MODWT-based estimators were used
in the estimation of the wavelet variance and the variance of scaling coefficients,
and they were computed after removing all of the boundary values. This means, in
particular, that estimation of the wavelet variance was unbiased. The ARMA models
in (II) as well as the ARMA models for the original data and the estimated signals in
(V)–(VII) were all estimated assuming a constant mean value, while the appropriate
ARIMA(p, 1, q) models were estimated assuming no deterministic terms.
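A stripped-down Python analogue of one cell of this experiment may clarify the design (our illustration only: fixed ARMA(1,1) orders replace AIC/BIC selection, few replications are used, the wavelet_smooth helper is the sketch from Sect. 4 above, and statsmodels stands in for the Matlab routines used by the author):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
# wavelet_smooth: the Haar 'non-inverse' smoother sketched in Sect. 4

rng = np.random.default_rng(3)

def simulate_dgp_a(N, sigma2=1.0):
    """DGP (a): y_t = x_t + eps_t with x_t = 0.9 x_{t-1} + eta_t."""
    x = np.zeros(N)
    for t in range(1, N):
        x[t] = 0.9 * x[t - 1] + rng.standard_normal()
    return x + np.sqrt(sigma2) * rng.standard_normal(N)

mse_plain, mse_smooth = [], []
for rep in range(200):                 # the chapter uses 5,000 replications
    y = simulate_dgp_a(55)
    train, test = y[:50], y[50:]
    f_plain = ARIMA(train, order=(1, 0, 1)).fit().forecast(5)
    f_smooth = ARIMA(wavelet_smooth(train, J=1), order=(1, 0, 1)).fit().forecast(5)
    mse_plain.append(np.mean((test - f_plain) ** 2))
    mse_smooth.append(np.mean((test - f_smooth) ** 2))
print(np.mean(mse_smooth) / np.mean(mse_plain))   # < 1 favors wavelet smoothing
```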
Tables 1, 2, 3, and 4 present the ratios of the forecast MSEs to the benchmark
forecast MSE, defined as the best result among those obtained with the benchmark
(I)–(IV) methods, computed on the basis of 5,000 and 2,000 replications for signals (a) and (b), respectively. Exponential smoothing considered among the benchmark models was the one with an estimated α. The case with a constant α (set equal to 0.5 or 0.25) is presented in the notes to Tables 1, 2, 3, and 4. To save space, each time
6 The default settings mean, in particular, automatic choices of a search method and treatment of initial conditions.
Table 1 Simulation results for the signal (a); ratios of forecast MSEs for forecasts from 1 to 5 steps ahead; σ_ε² = 1

Methods:          (A1)   (B1)   (C1)   (A2)   (B2)   (C2)   (A3)   (B3)
H = 1  h = 0.75   0.954  1.029  0.972  0.961  1.000  0.969  0.971  1.011
       h = 1      0.991  1.057  1.000  1.020  1.016  0.995  1.042  1.031
H = 2  h = 0.75   0.943  1.024  0.955  0.951  0.984  0.956  0.959  0.997
       h = 1      0.961  1.071  0.966  0.987  0.996  0.969  1.004  1.007
H = 3  h = 0.75   0.940  0.997  0.950  0.950  0.965  0.954  0.955  0.975
       h = 1      0.944  1.050  0.946  0.969  0.974  0.958  0.981  0.983
H = 4  h = 0.75   0.937  0.982  0.944  0.947  0.951  0.951  0.952  0.960
       h = 1      0.931  1.026  0.939  0.955  0.958  0.952  0.965  0.969
H = 5  h = 0.75   0.982  0.982  0.987  0.990  0.966  0.993  0.992  0.976
       h = 1      0.968  1.022  0.978  0.989  0.971  0.992  0.995  0.980

Methods:          (C3)   (A4)   (B4)   (C4)   (D1)   (E1)   (F1)   (D2)
H = 1  h = 0.75   0.982  0.974  1.023  0.987  0.962  1.059  0.993  0.959
       h = 1      1.012  1.047  1.038  1.022  0.974  1.065  1.011  0.972
H = 2  h = 0.75   0.965  0.960  1.010  0.965  0.947  1.024  0.970  0.952
       h = 1      0.982  1.006  1.018  0.988  0.945  1.016  0.986  0.954
H = 3  h = 0.75   0.961  0.954  0.986  0.960  0.943  0.991  0.962  0.953
       h = 1      0.966  0.981  0.989  0.969  0.932  0.981  0.971  0.947
H = 4  h = 0.75   0.957  0.950  0.977  0.956  0.943  0.974  0.961  0.955
       h = 1      0.963  0.964  0.980  0.964  0.924  0.961  0.964  0.941
H = 5  h = 0.75   0.996  0.990  0.989  0.994  0.986  0.974  1.003  1.001
       h = 1      0.999  0.993  0.990  1.000  0.960  0.964  0.997  0.981

Methods:          (E2)   (F2)   (D3)   (E3)   (F3)   (D4)   (E4)   (F4)
H = 1  h = 0.75   1.015  0.981  0.964  1.014  0.986  0.967  1.028  0.988
       h = 1      1.023  0.993  0.981  1.028  1.000  0.994  1.037  1.010
H = 2  h = 0.75   0.984  0.972  0.959  0.988  0.978  0.954  1.000  0.970
       h = 1      0.980  0.990  0.965  0.984  0.998  0.967  0.992  0.996
H = 3  h = 0.75   0.961  0.969  0.961  0.964  0.977  0.952  0.974  0.969
       h = 1      0.952  0.991  0.958  0.956  0.999  0.955  0.960  0.992
H = 4  h = 0.75   0.950  0.972  0.962  0.952  0.979  0.950  0.967  0.969
       h = 1      0.937  0.994  0.952  0.941  1.003  0.944  0.947  0.990
H = 5  h = 0.75   0.964  1.019  1.009  0.969  1.024  0.995  0.983  1.011
       h = 1      0.955  1.037  0.992  0.960  1.047  0.982  0.965  1.032

Table 1 (continued)
Methods:          (G1)   (H1)   (I1)   (G2)   (H2)   (I2)
H = 4  h = 0.75   0.950  0.974  0.982  0.961  0.949  1.000
       h = 1      0.932  0.950  1.012  0.951  0.937  1.051
H = 5  h = 0.75   0.996  0.975  1.025  1.010  0.962  1.049
       h = 1      0.972  0.956  1.048  0.994  0.958  1.109

Note: The forecast models were selected according to the AIC criterion. For the benchmark methods (I)–(IV) the ratios of the forecast MSEs to the forecast MSE of the benchmark are: for H = 1: 1, 1.041, 1.029, 1.153; for H = 2: 1, 1.045, 1.035, 1.130; for H = 3: 1, 1.020, 1.026, 1.114; for H = 4: 1.003, 1, 1.034, 1.119; and for H = 5: 1.051, 1, 1.077, 1.174, respectively; i.e., the best results among the benchmark methods (I)–(IV) were obtained with exponential smoothing (I) for shorter-term predictions, while stationary ARMA models produced the best forecasts 4 and 5 steps ahead. The corresponding relative MSEs for forecasts from 1 to 5 steps ahead with exponential smoothing with a constant α equal to 0.5 (0.25) are, respectively: 0.972 (1.124), 0.981 (1.061), 0.982 (1.017), 0.984 (0.993), 1.036 (1.015). Method codes: (A1)–(A4): (V) combined with the naïve method (IV); (B1)–(B4): (V) combined with (II); (C1)–(C4): (V) combined with (III); (D1)–(D4): (VI) combined with the naïve method (IV); (E1)–(E4): (VI) combined with (II); (F1)–(F4): (VI) combined with (III); (G1), (G2): (VII) combined with the naïve method (IV); (H1), (H2): (VII) combined with (II); (I1), (I2): (VII) combined with (III). Numbers in brackets in the heading rows denote maximum decomposition levels; all results better than the benchmark are in bold; the best results for each horizon are underlined.
Table 2 Simulation results for the signal (a); ratios of forecast MSEs for forecasts from 1 to 5 steps ahead; σ_ε² = 4

Methods:          (A1)   (B1)   (C1)   (A2)   (B2)   (C2)   (A3)   (B3)
H = 1  h = 0.75   0.993  1.031  1.007  0.967  0.992  0.981  0.969  0.995
       h = 1      1.011  1.069  1.024  0.989  1.002  0.990  0.999  1.008
H = 2  h = 0.75   0.982  1.047  0.991  0.964  0.996  0.968  0.966  1.000
       h = 1      0.987  1.096  0.994  0.974  1.001  0.976  0.984  1.004
H = 3  h = 0.75   0.972  1.016  0.980  0.964  0.979  0.970  0.968  0.988
       h = 1      0.963  1.061  0.975  0.964  0.987  0.966  0.975  0.994
H = 4  h = 0.75   0.965  0.997  0.969  0.959  0.966  0.961  0.961  0.977
       h = 1      0.949  1.031  0.961  0.951  0.967  0.953  0.959  0.978
H = 5  h = 0.75   0.977  0.996  0.982  0.976  0.973  0.975  0.979  0.982
       h = 1      0.953  1.024  0.965  0.962  0.973  0.964  0.972  0.983

Methods:          (C3)   (A4)   (B4)   (C4)   (D1)   (E1)   (F1)   (D2)
H = 1  h = 0.75   0.986  0.971  1.001  0.986  0.999  1.045  1.011  0.982
       h = 1      0.999  1.006  1.014  1.007  1.005  1.061  1.020  0.983
H = 2  h = 0.75   0.973  0.969  1.008  0.974  0.988  1.042  0.993  0.977
       h = 1      0.985  0.992  1.009  0.996  0.981  1.043  0.997  0.970
H = 3  h = 0.75   0.971  0.970  0.995  0.974  0.976  1.010  0.985  0.975
       h = 1      0.978  0.981  0.997  0.987  0.958  1.008  0.972  0.958
H = 4  h = 0.75   0.963  0.960  0.983  0.963  0.969  0.991  0.974  0.968
       h = 1      0.963  0.962  0.981  0.968  0.944  0.986  0.956  0.945
H = 5  h = 0.75   0.981  0.977  0.988  0.979  0.982  0.990  0.986  0.987
       h = 1      0.976  0.971  0.987  0.981  0.949  0.985  0.963  0.958

Methods:          (E2)   (F2)   (D3)   (E3)   (F3)   (D4)   (E4)   (F4)
H = 1  h = 0.75   1.005  0.990  0.982  1.004  0.989  0.990  1.011  0.998
       h = 1      1.023  0.993  0.986  1.022  0.996  1.010  1.032  1.020
H = 2  h = 0.75   0.998  0.980  0.980  0.998  0.984  0.982  1.004  0.984
       h = 1      1.001  0.985  0.976  1.000  0.990  0.991  1.007  1.005
H = 3  h = 0.75   0.980  0.980  0.981  0.982  0.984  0.979  0.987  0.981
       h = 1      0.977  0.976  0.968  0.977  0.986  0.977  0.982  0.994
H = 4  h = 0.75   0.968  0.972  0.973  0.971  0.976  0.970  0.975  0.972
       h = 1      0.959  0.960  0.953  0.960  0.971  0.960  0.962  0.977
H = 5  h = 0.75   0.971  0.989  0.995  0.976  0.999  0.988  0.976  0.990
       h = 1      0.962  0.980  0.971  0.964  0.994  0.973  0.963  0.995
we report only the outcomes obtained with one of the two information criteria—
those producing the smaller forecast MSE for the best benchmark method (or the
subsequent best when the best was exponential smoothing or the naïve approach).
The other results, also for our experimentations with other data lengths (as well as
parameter values), are available upon request.
The most important finding from the simulations is that wavelet smoothing is
able to produce lower forecast MSEs of both short- and longer-term predictions than
Table 3 Simulation results for the signal (b); ratios of forecast MSEs for forecasts from 1 to 5 steps ahead; σ_ε² = 1

Methods:          (A1)   (B1)   (C1)   (A2)   (B2)   (C2)   (A3)   (B3)
H = 1  h = 0.75   0.962  0.978  0.945  0.939  0.953  0.938  0.941  0.965
       h = 1      0.997  0.965  0.942  0.978  0.951  0.944  0.984  0.950
H = 2  h = 0.75   1.023  0.972  0.973  0.996  0.960  0.980  1.000  0.974
       h = 1      0.967  0.968  0.942  0.940  0.946  0.958  0.949  0.953
H = 3  h = 0.75   1.011  0.995  0.992  0.997  0.976  0.994  1.000  0.990
       h = 1      0.962  0.998  0.967  0.951  0.978  0.983  0.958  0.983
H = 4  h = 0.75   0.967  0.989  0.963  0.963  0.969  0.971  0.963  0.979
       h = 1      0.937  1.003  0.957  0.938  0.984  0.971  0.942  0.981
H = 5  h = 0.75   1.073  1.002  1.021  1.062  0.988  1.047  1.059  0.995
       h = 1      1.039  1.012  1.002  1.032  1.009  1.030  1.032  1.006

Methods:          (C3)   (A4)   (B4)   (C4)   (D1)   (E1)   (F1)   (D2)
H = 1  h = 0.75   0.936  0.942  0.974  0.941  0.939  0.996  0.959  0.936
       h = 1      0.947  0.989  0.957  0.956  0.948  0.997  0.962  0.946
H = 2  h = 0.75   0.976  0.999  0.994  0.994  1.032  0.988  1.001  1.025
       h = 1      0.966  0.950  0.957  0.976  0.964  0.982  0.964  0.958
H = 3  h = 0.75   1.008  0.997  0.987  0.994  1.023  0.995  1.007  1.023
       h = 1      0.984  0.956  0.986  1.000  0.964  0.999  0.978  0.966
H = 4  h = 0.75   0.978  0.960  0.996  0.959  0.974  0.987  0.979  0.978
       h = 1      0.963  0.940  0.995  0.981  0.932  1.001  0.960  0.940
H = 5  h = 0.75   1.049  1.055  1.017  1.031  1.060  0.999  1.044  1.066
       h = 1      1.022  1.028  1.022  1.036  1.007  1.008  0.995  1.018

Methods:          (E2)   (F2)   (D3)   (E3)   (F3)   (D4)   (E4)   (F4)
H = 1  h = 0.75   0.977  0.954  0.938  0.977  0.961  0.935  0.982  0.959
       h = 1      0.978  0.962  0.952  0.979  0.969  0.961  0.986  0.975
H = 2  h = 0.75   0.985  1.021  1.031  0.990  1.027  1.018  1.000  1.017
       h = 1      0.971  0.991  0.967  0.976  0.995  0.962  0.973  0.989
H = 3  h = 0.75   0.982  1.026  1.028  0.990  1.040  1.015  0.999  1.023
       h = 1      0.994  0.996  0.975  1.001  1.008  0.968  0.993  1.001
H = 4  h = 0.75   0.979  0.989  0.982  0.981  1.007  0.969  0.988  0.977
       h = 1      0.997  0.980  0.947  0.992  0.987  0.940  0.996  0.980
H = 5  h = 0.75   0.997  1.062  1.071  0.997  1.083  1.054  1.009  1.053
       h = 1      1.011  1.040  1.026  1.002  1.044  1.016  1.009  1.034

Table 3 (continued)
Methods:          (G1)   (H1)   (I1)   (G2)   (H2)   (I2)
H = 4  h = 0.75   0.987  1.000  0.993  0.989  0.990  1.007
       h = 1      0.944  1.029  0.962  0.952  1.058  0.981
H = 5  h = 0.75   1.081  1.004  1.047  1.082  0.997  1.076
       h = 1      1.029  1.031  1.023  1.037  1.080  1.039

Note: The forecast models were selected according to the BIC criterion. For the benchmark methods (I)–(IV) the ratios of the forecast MSEs to the forecast MSE of the benchmark are: for H = 1: 1, 1.012, 1.028, 1.217; for H = 2: 1.025, 1, 1.071, 1.475; for H = 3: 1.034, 1, 1.062, 1.420; for H = 4: 1.016, 1, 1.049, 1.317; and for H = 5: 1.112, 1, 1.126, 1.441, respectively; i.e., the best results among the benchmark methods (I)–(IV) were obtained with the ARMA models (II), except for horizon 1, in which case exponential smoothing produced better outcomes. The corresponding relative MSEs for forecasts from 1 to 5 steps ahead with exponential smoothing with a constant α equal to 0.5 (0.25) are, respectively: 0.991 (0.982), 1.064 (0.959), 1.063 (0.978), 1.032 (0.974), 1.163 (1.055). See the note to Table 1 for the description of forecast methods and other details.
Table 4 Simulation results for the signal (b); ratios of forecast MSEs for forecasts from 1 to 5
steps ahead; 2 D 4
Forecasting methods (A1) (B1) (C1) (A2) (B2) (C2) (A3) (B3)
Horizon H HD1 h D 0.75 0.980 0.967 0.987 0.977 0.965 0.987 0.980 0.973
hD1 0.977 0.967 0.962 0.983 0.970 0.979 0.993 0.978
HD2 h D 0.75 1.031 0.981 1.004 1.026 0.975 0.990 1.027 0.983
hD1 0.985 0.980 0.966 0.987 0.974 0.982 0.994 0.969
HD3 h D 0.75 1.022 0.974 0.972 1.022 0.963 0.973 1.031 0.969
hD1 0.969 0.983 0.955 0.978 0.977 0.967 0.996 0.988
HD4 h D 0.75 1.013 0.988 0.992 1.007 0.981 0.985 1.009 0.988
hD1 0.976 0.995 0.970 0.976 0.989 0.973 0.985 0.989
HD5 h D 0.75 1.056 0.995 1.018 1.052 0.986 1.015 1.059 0.993
hD1 1.004 0.996 0.983 1.009 0.998 1.004 1.023 0.999
Forecasting methods (C3) (A4) (B4) (C4) (D1) (E1) (F1) (D2)
Horizon H H D 1 h D 0.75 0.996 0.989 0.988 1.008 0.975 0.973 0.984 0.976
hD1 0.987 1.013 0.994 1.011 0.962 0.978 0.965 0.969
H D 2 h D 0.75 1.007 1.032 0.988 1.028 1.030 0.978 0.993 1.032
hD1 0.985 1.008 0.983 0.999 0.977 0.981 0.972 0.984
H D 3 h D 0.75 0.992 1.039 0.986 1.009 1.025 0.976 0.973 1.026
hD1 0.987 1.014 0.998 1.011 0.967 0.984 0.953 0.972
H D 4 h D 0.75 0.999 1.012 0.985 0.999 1.014 0.984 0.993 1.012
hD1 0.986 0.996 0.990 1.000 0.970 0.992 0.967 0.972
H D 5 h D 0.75 1.031 1.068 0.998 1.057 1.055 0.991 1.020 1.058
hD1 1.013 1.042 1.005 1.045 0.996 0.995 0.983 1.004
Forecasting methods (E2) (F2) (D3) (E3) (F3) (D4) (E4) (F4)
Horizon H H D 1 h D 0.75 0.968 0.995 0.980 0.979 1.001 0.985 0.992 1.010
hD1 0.976 0.975 0.978 0.977 0.984 1.000 0.995 1.007
(continued)
Table 4 (continued)
Forecasting methods  (E2) (F2) (D3) (E3) (F3) (D4) (E4) (F4)
H = 2  h = 0.75  0.978 1.004 1.032 0.991 1.013 1.033 0.992 1.025
       h = 1     0.978 0.987 0.988 0.976 0.990 1.003 0.984 1.009
H = 3  h = 0.75  0.968 0.985 1.029 0.974 0.990 1.032 0.987 1.005
       h = 1     0.974 0.968 0.981 0.979 0.968 0.999 0.991 0.996
H = 4  h = 0.75  0.982 0.998 1.012 0.986 1.005 1.014 0.989 0.999
       h = 1     0.989 0.979 0.977 0.993 0.982 0.992 1.000 0.994
H = 5  h = 0.75  0.985 1.025 1.062 0.988 1.036 1.064 0.996 1.044
       h = 1     0.992 0.997 1.015 1.001 1.011 1.031 1.004 1.024
the AR(I)MA models. Applying the wavelet signal estimation before constructing
AR(I)MA models or forecasting via the no-change method can lead to a ca. 5 %
reduction of the forecast MSEs relative to the best conventional approach. On the
other hand, when comparing the forecast MSEs of the ordinary ARMA (ARIMA)
models with those for the ARMA (ARIMA) models estimated on smoothed data,
we can find reductions even up to 8–10 %.
To assess if the predictive gains are statistically significant, chosen forecast
outcomes were examined with the Diebold and Mariano (DM) test for equal pre-
dictive ability (see Diebold and Mariano 1995), assuming quadratic loss functions.
For example, for our first simulation, whose outcomes are presented in Table 1,
comparing the method denoted in the tables as (D1) under h = 0.75 (which produced
the following relative forecast MSEs for forecasts from 1 to 5 steps ahead: 0.962,
0.947, 0.943, 0.943, 0.986; see Table 1) with exponential smoothing with an
estimated parameter α (and the relative forecast MSEs equal to 1 for forecasts
from 1 to 4 steps ahead and 1.051 for predictions five steps ahead), the following
DM statistics (and the corresponding p-values) are obtained: 5.486 (0.000), 8.964
(0.000), 10.45 (0.000), 12.12 (0.000), and 13.37 (0.000). Comparing the same
implementation scheme of wavelet smoothing with exponential smoothing with a
constant α equal to 0.5 (with the relative forecast MSEs, reported in the note to
Table 1, equal to 0.972, 0.981, 0.982, 0.984, 1.034) gives the following test results:
1.412 (0.158), 5.310 (0.000), 6.830 (0.000), 7.743 (0.000), and 9.305 (0.000). This
means that wavelet signal estimation (alone or combined with some parametric
models) is able to produce significantly better results than the other smoothing
methods that are used in forecasting as well as AR(I)MA models, which in the first
half of our simulation exercises were generally outperformed at shorter horizons by
exponential smoothing.
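For readers who wish to replicate such comparisons, the following minimal sketch computes the DM statistic under quadratic loss with the usual rectangular-kernel long-run variance (h − 1 autocovariances for h-step-ahead forecasts). It is a textbook implementation in Python (numpy and scipy assumed), not the code used in this study, and all names are illustrative.

import numpy as np
from scipy.stats import norm

def diebold_mariano(e1, e2, h=1):
    # e1, e2: forecast errors of the two competing methods; h: forecast horizon
    d = np.asarray(e1) ** 2 - np.asarray(e2) ** 2       # loss differential
    n = d.size
    dbar = d.mean()
    # long-run variance of the mean loss differential (rectangular kernel)
    gamma = [np.sum((d[k:] - dbar) * (d[:n - k] - dbar)) / n for k in range(h)]
    var_dbar = (gamma[0] + 2.0 * sum(gamma[1:])) / n
    stat = dbar / np.sqrt(var_dbar)
    pval = 2.0 * (1.0 - norm.cdf(abs(stat)))            # asymptotically N(0, 1)
    return stat, pval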
Depending on the DGP and the particular implementation of wavelet
smoothing, the decrease in the forecast MSE can result from: very good signal
estimation with wavelets in terms of the MSE in small samples, and, in particular,
good signal estimation under the assumption of the smallest possible value of
the variance of the signal's error term (see the simulation studies in Bruzda 2013a);
better performance of information criteria in discovering the true structure of the
signal; in some cases better small-sample properties of maximum likelihood
estimators of parameters (for example, quite often a lower bias of the estimators of
autoregressive parameters); or a combination of these phenomena. However, using
wavelet smoothing in practice will usually require many arbitrary decisions, or some
optimization of the settings, concerning the value of the smoothing constant h, the
number of decomposition stages J, and the other implementation details of
the method (i.e., the choice of the wavelet signal estimator, the kind of parametric
model, the method of scaling coefficient treatment and the wavelet filter).
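To make the shape of such a procedure concrete, the sketch below smooths a series with a two-stage Haar MODWT whose detail coefficients are scaled by a single constant gain, inverts the transform exactly, and then fits an ARMA model to the smoothed series with statsmodels. The constant gain is only a stand-in for the role played by the smoothing constant h; Bruzda's actual random-signal estimators use level-dependent gains derived from the signal and noise variances, so everything here is an illustrative assumption rather than the chapter's method.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def haar_modwt_smooth(x, J=2, gain=0.75):
    # Haar MODWT analysis with circular boundaries ...
    x = np.asarray(x, dtype=float)
    V, W = x.copy(), []
    for j in range(1, J + 1):
        s = 2 ** (j - 1)
        Vl = np.roll(V, s)
        W.append((V - Vl) / 2.0)     # wavelet (detail) coefficients at level j
        V = (V + Vl) / 2.0           # scaling coefficients at level j
    # ... exact synthesis with shrunken detail coefficients
    for j in range(J, 0, -1):
        s = 2 ** (j - 1)
        Wj = gain * W[j - 1]         # illustrative constant shrinkage
        V = (Wj - np.roll(Wj, -s)) / 2.0 + (V + np.roll(V, -s)) / 2.0
    return V

def smooth_then_arma_forecast(x, order=(1, 0, 1), steps=5, **kwargs):
    # denoise first, then model and forecast the smoothed series
    fit = ARIMA(haar_modwt_smooth(x, **kwargs), order=order).fit()
    return fit.forecast(steps=steps)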
Other conclusions from the experiments can be summarized as follows:
• In the case of shorter-term forecasts (usually 1 and 2 steps ahead) the simple
‘non-inverse’ approach is able to outperform the ‘inverse’ method. On the
other hand, the ‘inverse’ estimator produces better forecasts at longer horizons.
Moreover, this approach slightly beats the ‘non-inverse’ method in samples of
length 100.
• The optimal value of the smoothing constant h depends inter alia on the
prediction horizon. When forecasting 1–2 steps ahead, better outcomes are
obtained under h = 0.75. Longer prediction horizons are often associated with
higher optimal values of h.
• In our experiments we do not observe systematic predictive gains from using
a longer wavelet filter than the Haar, i.e., the Daubechies d4 filter. However, it
is worth adding that the Haar filter embeds just one difference operator and, as
such, it may not always be well suited to nonstationary signals, even
if it is used in the 'inverse' method. Its application is limited not only due to the
high dynamic ranges (i.e., the ratios of the maximum and minimum values of the
spectral densities) of the conventional within-scale Haar wavelet coefficients, but
also because of the high between-scale correlations. Also, the other Daubechies
filters, such as d4 and la8, then give worse results compared to the stationary
case. However, as was noted by Craigmile and Percival (2005), in the class of
processes with stationary backward differences the between-scale correlation
of wavelet coefficients diminishes as the width of the wavelet filter increases.
Nevertheless, the within-scale correlations remain substantial (see the discussion
in Bruzda 2013b, Chap. II).
• Most often the lowest (1–2) maximum decomposition levels J produce the
best outcomes. This results from a decreasing number of non-boundary wavelet
coefficients available at higher decomposition stages and, at the same time, an
increasing number of parameters to be estimated. The first part of this remark
suggests that backcasting applied prior to wavelet signal estimation may further
improve the relative performance of our methods. Our experimentation with
nonstationary DGPs indicates that, in this case, higher decomposition
stages should often be considered.
• The relative forecast MSEs on average clearly increase (and the relative per-
formance of wavelet smoothing on average deteriorates) with the length of the
series. However, the gains from wavelet smoothing are present even in samples
of length 100. By contrast, for samples of length 35 the relative MSEs are often
smaller than those for N = 50.
• For stationary processes, smoothing of scaling coefficients usually lowers the
forecast MSEs. The problem with high dynamic ranges of the scaling coefficients
Ṽ_Jt can, to some extent, be mitigated by applying higher maximum levels of
decomposition.
• The predictive gains from our methods applied to nonstationary DGPs are often
rather modest. Because of this we consider wavelet smoothing as an alternative to
(and a generalization of) exponential smoothing mainly in the case of forecasting
stationary processes, though not exclusively so.7
6 An Empirical Illustration
To practically verify the approach suggested here, wavelet smoothing in its two
variants was applied to compute forecasts of 16 time series from the M3-IJF
Competition database (see Makridakis and Hibon 2000). To simplify matters, we
chose series N2868–N2883 of a length in the range 76–79 without a clear increasing
or decreasing tendency (see Fig. 4). As previously, two approaches to handling
the scaling coefficients were considered, and we report the (slightly) better results
obtained via smoothing the coefficients. This remains in accordance with our
previous findings, especially as a more careful examination with unit root tests
points to the stationarity of about 12 of the series. Each series was forecasted up to
7 For example, under structural instability short wavelet filters may provide a better tool for
smoothing than other procedures.
[Fig. 4 Time series N2868–N2883 from the M3-IJF Competition database]
five steps ahead, starting with forecasts computed on the basis of samples of length
52–55 and then increasing the samples by one observation, recomputing all of the
wavelet quantities and re-estimating the parametric models. In this way, 20 forecasts
for each horizon, method and time series were obtained. The forecasting methods
were as in the simulation study, except for the fact that the smoothing constant h
took on the values: 0.5, 0.75 and 1, whereas the maximum orders p and q for the
ARMA(p, q) and ARIMA(p, 1, q) models were set to 4.8 All the other settings
(e.g., the numbers of decomposition levels and the information criteria used) were
as in the simulation study, except that the function armax was run with the option
'InitialState = Backcast', which on average gave better forecasts from
the ordinary AR(I)MA models.
Table 5 summarizes the results obtained with the BIC criterion and h = 0.75.
It contains the ratios of the average MSPEs (mean squared percentage errors) to
the average benchmark MSPE chosen as the best result among the four standard
approaches, denoted in the table as 'RW' (the random walk, i.e., the no-change model),
8 In our earlier study based on a similar dataset (see Bruzda 2013b, Chap. VI), we exclusively
considered purely autoregressive specifications for the original and smoothed data, thus reaching
conclusions that differed from those here in some respects (in particular, the average predictive
gains from wavelet smoothing clearly rose with the forecast horizon). Here we evaluate the
suggested forecasting procedures in a different class of processes, utilizing a different estimation
procedure, and we also deepen our analysis of the empirical results.
Table 5 Ratios of MSPEs; real data example; BIC criterion and h = 0.75
(A1) (B1) (C1) (A2) (B2) (C2) (A3) (B3) (C3) (A4) (B4) (C4)
(D1) (E1) (F1) (D2) (E2) (F2) (D3) (E3) (F3) (D4) (E4) (F4)
(G1) (H1) (I1) (G2) (H2) (I2) ES1 ES2 RW ARMA ARIMA ES
All (16)
1-step 1.208 0.957 0.968 1.230 0.999 0.990 1.217 0.950 0.908 1.216 0.974 0.953
1.035 0.991 0.977 1.044 0.968 0.977 1.043 0.962 0.931 1.037 0.960 0.934
1.049 0.970 0.983 1.056 0.999 0.989 1.276 1.571 1.017 1 1.044 1.015
2-step 1.112 0.981 1.015 1.099 0.996 1.045 1.092 0.966 0.988 1.090 1.004 1.069
1.058 1.005 1.025 1.056 0.977 1.033 1.055 0.966 0.999 1.043 1.008 0.992
1.087 0.995 1.125 1.082 1.006 1.123 1.080 1.054 1.147 1 1.096 1.145
3-step 1.117 0.964 0.987 1.090 0.969 1.047 1.088 0.951 1.027 1.086 0.970 1.119
1.102 0.995 0.978 1.096 0.964 1.012 1.096 0.942 1.036 1.083 1.032 1.028
1.083 1.032 1.028 1.130 1.207 1.114 1.057 0.961 1.225 1 1.068 1.225
4-step 1.153 0.975 1.013 1.119 0.971 1.036 1.120 0.965 1.064 1.117 0.979 1.147
1.158 1.002 0.966 1.150 0.979 1.035 1.150 0.952 1.083 1.137 1.063 1.078
1.196 0.990 1.074 1.183 0.967 1.089 1.086 0.975 1.311 1 1.057 1.305
5-step 1.071 0.988 1.020 1.053 0.969 0.993 1.054 0.958 1.052 1.051 0.980 1.079
1.099 1.017 0.985 1.094 0.982 1.027 1.095 0.957 1.084 1.083 1.048 1.063
1.125 0.989 1.057 1.118 0.962 1.044 1.040 0.968 1.247 1 1.027 1.238
All stationary models
1-step (6) 1.444 0.992 1.070 1.466 1.058 1.099 1.447 1.044 1.069 1.446 1.034 1.080
1.219 1.004 1.013 1.228 1.006 1.032 1.224 0.976 1.014 1.216 1.001 1.014
1.243 0.925 0.934 1.252 1.042 1.080 1.509 1.868 1.183 1 1.137 1.191
(continued)
Table 5 (continued)
(A1) (B1) (C1) (A2) (B2) (C2) (A3) (B3) (C3) (A4) (B4) (C4)
(D1) (E1) (F1) (D2) (E2) (F2) (D3) (E3) (F3) (D4) (E4) (F4)
(G1) (H1) (I1) (G2) (H2) (I2) ES1 ES2 RW ARMA ARIMA ES
2-step (8) 1.178 0.999 1.081 1.177 1.014 1.099 1.167 0.974 1.056 1.166 1.027 1.145
1.111 1.031 1.062 1.118 1.012 1.092 1.117 0.975 1.061 1.100 1.046 1.036
1.145 0.977 1.203 1.149 0.977 1.173 1.164 1.127 1.190 1 1.150 1.191
3-step (8) 1.203 0.944 1.060 1.168 0.977 1.153 1.168 0.988 1.120 1.166 1.013 1.211
1.197 0.993 1.012 1.191 0.974 1.042 1.191 0.952 1.094 1.174 1.098 1.103
1.242 0.970 1.096 1.229 1.036 1.124 1.131 1.021 1.341 1 1.143 1.340
4-step (10) 1.269 0.943 1.122 1.230 0.975 1.194 1.235 1.013 1.216 1.232 1.040 1.317
1.296 1.014 1.072 1.289 1.007 1.110 1.291 0.996 1.236 1.270 1.169 1.262
1.341 0.995 1.148 1.327 1.125 1.200 1.203 1.083 1.486 1 1.210 1.481
5-step (9) 1.339 0.898 1.203 1.327 0.943 1.229 1.336 0.993 1.325 1.333 0.961 1.347
1.376 1.008 1.156 1.382 0.987 1.183 1.387 0.980 1.358 1.360 1.047 1.295
1.420 0.940 1.172 1.419 0.976 1.103 1.339 1.247 1.600 1 1.280 1.586
First part (8)
1-step 1.228 0.960 0.972 1.251 1.004 0.992 1.237 0.952 0.907 1.236 0.976 0.955
1.046 0.995 0.983 1.055 0.971 0.983 1.054 0.965 0.933 1.047 0.963 0.937
1.060 0.974 0.988 1.067 1.004 0.997 1.298 1.606 1.015 1 1.053 1.018
2-step 1.119 0.983 1.016 1.105 0.998 1.048 1.098 0.968 0.989 1.096 1.006 1.074
1.063 1.006 1.027 1.061 0.979 1.037 1.059 0.966 1.001 1.047 1.009 0.995
1.093 0.997 1.127 1.088 1.002 1.131 1.083 1.056 1.147 1 1.099 1.148
3-step 1.120 0.964 0.984 1.091 0.968 1.045 1.089 0.950 1.026 1.087 0.967 1.121
1.104 0.995 0.976 1.099 0.963 1.011 1.098 0.940 1.035 1.085 1.030 1.028
1.143 0.993 1.090 1.132 0.957 1.116 1.055 0.957 1.226 1 1.067 1.227
4-step 1.152 0.974 1.009 1.118 0.969 1.032 1.119 0.963 1.061 1.116 0.975 1.148
1.158 1.002 0.960 1.149 0.977 1.031 1.149 0.948 1.081 1.136 1.059 1.077
1.197 0.985 1.070 1.183 0.954 1.089 1.082 0.970 1.307 1 1.053 1.307
5-step 1.068 0.987 1.016 1.049 0.968 0.987 1.051 0.956 1.048 1.048 0.976 1.077
1.096 1.017 0.980 1.091 0.980 1.022 1.092 0.953 1.081 1.080 1.042 1.060
1.123 0.983 1.052 1.116 0.944 1.040 1.035 0.964 1.239 1 1.023 1.237
Second part (8)
1-step 0.986 0.993 0.998 0.997 1.011 1.044 0.998 1.013 1.017 1.006 1.034 1.013
0.963 1.022 0.976 0.964 1.002 0.985 0.969 1.005 0.988 0.971 1.001 0.973
0.960 0.991 0.996 0.967 1.025 0.948 1.030 1.124 1.152 1.098 1 1.067
2-step 0.918 0.922 0.990 0.951 0.939 0.974 0.948 0.935 0.970 0.950 0.982 0.958
0.923 0.974 0.976 0.937 0.935 0.951 0.941 0.954 0.946 0.938 1.004 0.935
0.924 0.968 1.063 0.940 1.106 0.934 0.998 1.009 1.140 1.005 1 1.051
3-step 1.037 0.965 1.103 1.052 0.990 1.092 1.047 0.991 1.080 1.046 1.046 1.051
1.012 1.001 1.052 1.029 0.999 1.041 1.033 1.004 1.053 1.027 1.080 1.017
1.024 1.048 1.115 1.040 1.328 1.035 1.120 1.094 1.195 1 1.097 1.169
4-step 1.176 0.987 1.172 1.156 1.031 1.175 1.147 1.030 1.163 1.145 1.099 1.121
1.158 1.023 1.164 1.161 1.050 1.165 1.163 1.086 1.168 1.152 1.173 1.124
1.174 1.161 1.214 1.171 1.434 1.081 1.229 1.121 1.445 1 1.192 1.245
5-step 1.194 1.008 1.179 1.181 1.039 1.218 1.169 1.026 1.193 1.165 1.148 1.161
1.197 1.013 1.176 1.197 1.068 1.211 1.199 1.097 1.201 1.188 1.271 1.175
1.209 1.211 1.276 1.209 1.650 1.191 1.250 1.124 1.537 1 1.207 1.273
(continued)
Table 5 (continued)
(A1) (B1) (C1) (A2) (B2) (C2) (A3) (B3) (C3) (A4) (B4) (C4)
(D1) (E1) (F1) (D2) (E2) (F2) (D3) (E3) (F3) (D4) (E4) (F4)
(G1) (H1) (I1) (G2) (H2) (I2) ES1 ES2 RW ARMA ARIMA ES
Nonstationary models––first part
1-step (3) 1.194 1.053 1.011 1.222 1.086 1.028 1.212 0.997 0.888 1.211 1.052 0.971
1.027 1.111 1.078 1.040 1.064 1.064 1.040 1.078 0.983 1.034 1.052 0.991
1.036 1.137 1.157 1.043 1.097 1.050 1.280 1.582 1 1.124 1.113 1.004
2-step (2) 0.996 0.987 0.875 0.943 1.007 0.953 0.944 1.004 0.841 0.940 0.992 0.912
0.974 0.982 0.970 0.945 0.932 0.927 0.941 0.994 0.872 0.945 0.944 0.924
0.995 1.118 0.953 0.959 1.142 1.052 0.898 0.898 1.092 1.053 1 1.087
3-step (2) 1.042 1.150 0.915 1.032 1.078 0.895 1.022 0.975 0.907 1.018 0.970 1.022
0.997 1.136 1.014 0.993 1.065 1.074 0.990 1.036 1.018 0.986 0.978 0.965
1.022 1.200 1.231 1.017 0.874 1.242 1.000 0.917 1.082 1.133 1.000 1.080
4-step (2) 1.228 1.373 1.058 1.197 1.272 0.958 1.188 1.149 1.008 1.183 1.122 1.078
1.179 1.294 0.999 1.163 1.221 1.176 1.159 1.136 1.031 1.159 1.109 0.946
1.210 1.299 1.234 1.196 0.843 1.149 1.136 1.011 1.280 1.321 1.000 1.280
5-step (3) 1.041 1.324 1.066 1.011 1.231 0.976 1.004 1.148 1.008 1.001 1.229 1.051
1.066 1.271 1.037 1.048 1.213 1.108 1.045 1.157 1.048 1.044 1.287 1.068
1.080 1.281 1.188 1.065 1.163 1.229 0.969 0.899 1.164 1.240 1.000 1.163
Nonstationary models––second part
1-step (7) 0.998 1.016 1.004 0.991 1.010 1.048 0.988 1.009 1.010 0.999 1.034 1.006
0.979 1.052 1.004 0.970 1.018 0.995 0.971 1.019 0.994 0.977 1.014 0.980
0.976 1.015 1.010 0.969 1.041 0.958 1.009 1.089 1.188 1.142 1 1.030
2-step (6) 0.880 0.988 0.998 0.903 0.962 0.938 0.910 0.971 0.963 0.918 1.060 0.954
0.904 1.069 1.043 0.912 0.984 0.925 0.916 1.015 0.952 0.923 1.104 0.944
0.903 1.011 1.109 0.910 1.217 0.996 0.932 1.001 1.084 1.120 1.035 1
3-step (6) 0.923 0.997 1.022 0.933 0.981 0.975 0.939 0.989 1.007 0.942 1.089 0.961
0.877 1.013 0.961 0.886 0.960 0.894 0.890 0.979 0.940 0.896 1.097 0.894
0.885 0.984 1.003 0.891 1.371 1.034 0.977 1.041 1 1.058 1.066 1.036
4-step (4) 1.069 1.198 1.078 0.998 1.174 1.065 0.997 1.188 1.124 1.008 1.426 1.040
1.082 1.220 1.058 1.042 1.228 1.063 1.041 1.353 1.142 1.059 1.640 1.051
1.082 1.133 1.084 1.041 1.616 1.168 0.998 0.959 1.360 1.275 1.084 1
5-step (4) 0.955 1.191 1.057 0.926 1.156 1.037 0.932 1.173 1.107 0.938 1.516 1.024
0.997 1.208 1.054 0.970 1.228 1.002 0.974 1.325 1.086 0.990 1.825 1.046
0.990 1.140 1.135 0.967 1.843 1.253 0.908 0.924 1.193 1.232 1.046 1
Note: Forecast methods are as defined below Table 1, except for: RW, the naïve (no-change) method; ARMA, ARMA(p, q) models; ARIMA, ARIMA(p, 1, q) models;
ES, exponential smoothing with an estimated smoothing constant; ES1, exponential smoothing with α = 0.5; ES2, exponential smoothing with α = 0.25
(the case of α = 0.75 was excluded from the presentation, as it generally produced worse results than the other values of α); all results that are better than the
benchmark are underlined; the four benchmark outcomes are in bold, and ES1 and ES2 are presented in bold italics; the numbers of the series are given in
brackets
'ARMA', 'ARIMA' and 'ES' (exponential smoothing with an estimated α). The
MSPE for a single series was computed as

MSPE = \frac{\sum_{t=1}^{20} (y_t - y_t^p)^2}{\sum_{t=1}^{20} y_t^2},

i.e., as Theil's U coefficient.9
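In code, this criterion and the relative entries of Tables 5 and 6 can be reproduced along the following lines (a Python sketch with hypothetical argument names, not the script used for the study):

import numpy as np

def mspe(y, f):
    # sum of squared forecast errors normalized by the sum of squared actuals
    y, f = np.asarray(y), np.asarray(f)
    return np.sum((y - f) ** 2) / np.sum(y ** 2)

def mspe_ratio(y, f_method, f_benchmark):
    # one table entry: the method's MSPE relative to the benchmark's MSPE
    return mspe(y, f_method) / mspe(y, f_benchmark)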
Setting the smoothing constant h to 0.5 or 1 resulted in fewer cases in which
wavelet smoothing beat the other methods, although, especially if h was set to 1,
the predictive gains from wavelet smoothing
were sometimes higher (see the results collated in Table 6). Neither of the two
information criteria uniformly outperformed the other in choosing better forecast
models, although the proportion of targeted indications (i.e., indications of models
producing lower MSPEs among the best benchmark ARMA/ARIMA models) somewhat
favored the BIC criterion, at about 9:7. Because of this, the
outcomes obtained with the BIC criterion are exclusively presented here. We note
in passing that the AIC criterion gave slightly more arguments in favor of wavelet
smoothing, thus leading more often to predictive gains from the wavelet approach.10
Because a more careful examination pointed out that the individual MSPEs in
the first half of our data are much higher than those in the second half, the empirical
results are also presented separately for these two groups of series. In fact, the
aggregated outcomes for all 16 series closely resemble those for the first eight
series only (see Tables 5 and 6). It is also worth mentioning that the series in these
two groups seem to have different dynamic properties. In particular, the majority
(5 out of 8) of the series in the second dataset can be identified as ARMA(1,1)
with a negative MA parameter, while those in the first group are either purely
autoregressive, sometimes exhibiting a sort of periodicity (i.e., certain significant
higher-order autoregressive terms), or have a more complicated structure than the
simple 'AR + noise'. Also, we note that in the second dataset there are series with
evident level shifts. Besides, to have some further insight into possible gains from
wavelet smoothing, the outcomes are also presented separately for the series for
which the best benchmark forecasts were obtained with ARMA models and for
all the others divided into those from the first and second half of the series. The
subgroups are denoted as: ‘All stationary models’, ‘Nonstationary models––first
part’ and ‘Nonstationary models––second part’, respectively.
9 The change of the evaluation criterion from MSE to MSPE was dictated by the aggregation of
the forecasting results.
10 Detailed results are available upon request.
Table 6 Ratios of MSPEs; real data example; BIC criterion and h = 1
(A1) (B1) (C1) (A2) (B2) (C2) (A3) (B3) (C3) (A4) (B4) (C4)
(D1) (E1) (F1) (D2) (E2) (F2) (D3) (E3) (F3) (D4) (E4) (F4)
(G1) (H1) (I1) (G2) (H2) (I2) ES1 ES2 RW ARMA ARIMA ES
All (16)
1-step 1.347 0.955 0.967 1.399 0.981 1.003 1.381 0.943 0.960 1.381 0.936 0.955
1.076 1.012 0.985 1.092 1.013 1.008 1.092 1.017 1.024 1.088 1.025 0.989
1.085 0.993 1.001 1.098 0.969 1.009 1.276 1.571 1.017 1 1.044 1.015
2-step 1.137 0.993 1.031 1.131 0.968 1.053 1.122 0.923 1.019 1.120 0.931 1.030
1.045 1.022 1.026 1.045 0.970 1.057 1.044 0.957 1.069 1.033 1.005 0.994
1.080 1.002 1.107 1.075 1.014 1.074 1.080 1.054 1.147 1 1.096 1.145
3-step 1.110 0.991 1.022 1.082 0.964 1.083 1.079 0.922 1.075 1.076 0.942 1.086
1.074 1.022 0.980 1.068 0.944 1.099 1.068 0.943 1.117 1.056 1.042 1.038
1.120 1.070 1.131 1.109 1.027 1.126 1.057 0.961 1.225 1 1.068 1.225
4-step 1.126 1.005 1.055 1.088 0.980 1.103 1.089 0.939 1.126 1.085 0.951 1.130
1.119 1.014 1.035 1.109 0.946 1.132 1.109 0.975 1.149 1.096 1.075 1.068
1.166 1.201 1.136 1.150 1.024 1.142 1.086 0.975 1.311 1 1.057 1.305
5-step 1.036 0.997 1.061 1.018 0.961 1.053 1.020 0.927 1.084 1.016 0.934 1.095
1.060 1.039 1.093 1.055 0.982 1.127 1.057 1.072 1.152 1.044 1.083 1.054
1.093 1.365 1.135 1.084 1.077 1.102 1.040 0.968 1.247 1 1.027 1.238
All stationary models
1-step (6) 1.616 1.084 1.129 1.674 1.105 1.176 1.649 1.060 1.123 1.648 1.083 1.118
1.263 1.043 1.064 1.279 1.047 1.086 1.275 1.056 1.091 1.270 1.084 1.078
1.288 1.064 1.128 1.305 1.028 1.114 1.509 1.868 1.183 1 1.137 1.191
(continued)
Table 6 (continued)
(A1) (B1) (C1) (A2) (B2) (C2) (A3) (B3) (C3) (A4) (B4) (C4)
(D1) (E1) (F1) (D2) (E2) (F2) (D3) (E3) (F3) (D4) (E4) (F4)
(G1) (H1) (I1) (G2) (H2) (I2) ES1 ES2 RW ARMA ARIMA ES
2-step (8) 1.212 0.999 1.073 1.221 0.984 1.114 1.207 0.954 1.073 1.205 0.980 1.098
1.102 1.030 1.080 1.113 0.968 1.130 1.112 0.964 1.144 1.096 1.034 1.036
1.141 1.014 1.176 1.149 1.020 1.102 1.164 1.127 1.190 1 1.150 1.191
3-step (8) 1.184 1.004 1.105 1.146 0.992 1.138 1.146 0.955 1.142 1.144 0.986 1.154
1.161 0.931 1.021 1.155 0.913 1.145 1.155 0.940 1.139 1.139 1.081 1.141
1.218 1.021 1.255 1.203 1.006 1.146 1.131 1.021 1.341 1 1.143 1.340
4-step (10) 1.223 1.005 1.160 1.180 1.008 1.191 1.186 0.982 1.227 1.183 1.003 1.252
1.244 0.949 1.128 1.237 0.933 1.250 1.240 1.020 1.257 1.220 1.167 1.241
1.301 1.200 1.301 1.285 1.142 1.207 1.203 1.083 1.486 1 1.210 1.481
5-step (9) 1.281 0.954 1.229 1.274 0.965 1.265 1.286 0.960 1.293 1.282 0.968 1.351
1.315 0.954 1.286 1.325 0.973 1.348 1.333 1.208 1.409 1.306 1.271 1.339
1.369 1.236 1.463 1.370 0.918 1.282 1.339 1.247 1.600 1 1.280 1.586
First part (8)
1-step 1.374 0.957 0.972 1.427 0.986 1.010 1.408 0.946 0.965 1.407 0.938 0.957
1.088 1.017 0.992 1.105 1.019 1.017 1.104 1.024 1.032 1.099 1.031 0.996
1.098 0.997 1.010 1.111 0.970 1.008 1.298 1.606 1.015 1 1.053 1.018
2-step 1.146 0.994 1.034 1.137 0.968 1.055 1.128 0.922 1.022 1.125 0.932 1.032
1.050 1.024 1.029 1.050 0.971 1.061 1.048 0.957 1.073 1.037 1.006 0.996
1.086 1.002 1.112 1.081 1.004 1.062 1.083 1.056 1.147 1 1.099 1.148
3-step 1.112 0.991 1.023 1.082 0.962 1.082 1.079 0.920 1.075 1.076 0.940 1.086
1.076 1.023 0.979 1.069 0.943 1.101 1.068 0.942 1.119 1.056 1.044 1.038
1.124 1.067 1.134 1.111 1.012 1.108 1.055 0.957 1.226 1 1.067 1.227
4-step 1.125 1.004 1.054 1.087 0.978 1.101 1.088 0.936 1.125 1.084 0.948 1.129
1.119 1.013 1.034 1.109 0.943 1.132 1.109 0.973 1.148 1.096 1.076 1.067
1.167 1.198 1.137 1.151 1.006 1.118 1.082 0.970 1.307 1 1.053 1.307
5-step 1.033 0.996 1.060 1.015 0.958 1.049 1.017 0.924 1.081 1.013 0.931 1.092
1.058 1.039 1.093 1.053 0.980 1.125 1.054 1.072 1.151 1.042 1.082 1.051
1.091 1.364 1.136 1.083 1.055 1.075 1.035 0.964 1.239 1 1.023 1.237
Second part (8)
1-step 1.025 1.002 0.964 1.056 0.995 0.993 1.065 0.994 0.985 1.081 0.996 1.022
0.975 1.025 0.969 0.983 1.003 0.965 0.993 1.008 0.983 1.004 1.024 0.973
0.967 1.038 0.952 0.990 1.034 1.129 1.030 1.124 1.152 1.098 1 1.067
2-step 0.917 0.974 0.950 0.972 0.954 1.008 0.975 0.941 0.968 0.982 0.933 0.978
0.908 0.964 0.955 0.932 0.953 0.936 0.940 0.952 0.967 0.944 0.980 0.941
0.906 1.002 0.978 0.938 1.269 1.428 0.998 1.009 1.140 1.005 1 1.051
3-step 1.051 1.008 1.006 1.082 1.026 1.119 1.081 1.006 1.073 1.084 1.000 1.091
1.003 0.996 1.011 1.031 0.977 1.037 1.040 0.983 1.064 1.040 0.999 1.022
1.017 1.151 1.051 1.049 1.522 1.748 1.120 1.094 1.195 1 1.097 1.169
4-step 1.149 1.035 1.074 1.132 1.063 1.179 1.126 1.046 1.155 1.126 1.044 1.156
1.111 1.032 1.078 1.120 1.028 1.143 1.125 1.039 1.169 1.120 1.057 1.121
Forecasting via Wavelet Denoising: The Random Signal Case
1.131 1.289 1.106 1.136 1.644 1.952 1.229 1.121 1.445 1 1.192 1.245
5-step 1.143 1.026 1.085 1.137 1.081 1.214 1.126 1.061 1.183 1.124 1.074 1.189
1.133 1.042 1.086 1.139 1.076 1.187 1.143 1.076 1.214 1.138 1.119 1.173
1.148 1.408 1.120 1.157 1.902 2.147 1.250 1.124 1.537 1 1.207 1.273
(continued)
Table 6 (continued)
(A1) (B1) (C1) (A2) (B2) (C2) (A3) (B3) (C3) (A4) (B4) (C4)
(D1) (E1) (F1) (D2) (E2) (F2) (D3) (E3) (F3) (D4) (E4) (F4)
(G1) (H1) (I1) (G2) (H2) (I2) ES1 ES2 RW ARMA ARIMA ES
Nonstationary models––first part
1-step (3) 1.330 0.965 0.957 1.386 1.006 0.994 1.371 0.966 0.949 1.371 0.931 0.939
0.995 1.044 1.045 1.001 1.022 1.049 1.001 1.051 1.079 1.004 1.052 1.058
0.995 1.085 1.074 0.998 1.097 1.058 1.021 1.153 1 1.262 1.164 1.029
2-step (2) 1.000 1.030 0.970 0.940 0.977 0.934 0.942 0.880 0.920 0.938 0.832 0.886
0.946 1.058 0.924 0.912 1.034 0.902 0.907 0.988 0.909 0.911 0.970 0.928
0.973 1.019 0.969 0.929 1.029 0.996 0.898 0.898 1.092 1.053 1 1.087
3-step (2) 1.064 1.085 0.928 1.057 1.009 1.075 1.043 0.948 1.037 1.037 0.940 1.046
0.983 1.418 0.994 0.979 1.155 1.127 0.974 1.074 1.215 0.971 1.075 0.893
1.012 1.343 0.944 1.007 1.187 1.153 1.000 0.917 1.082 1.133 1 1.080
4-step (2) 1.242 1.314 1.119 1.207 1.213 1.248 1.192 1.119 1.240 1.187 1.109 1.190
1.158 1.509 1.129 1.138 1.279 1.194 1.132 1.165 1.243 1.132 1.172 0.961
1.196 1.593 1.071 1.179 1.020 1.269 1.134 0.993 1.290 1.309 1 1.290
5-step (3) 1.022 1.281 1.135 0.987 1.185 1.075 0.978 1.109 1.118 0.974 1.115 1.080
1.041 1.380 1.147 1.019 1.225 1.161 1.014 1.178 1.153 1.013 1.130 0.998
1.058 1.839 1.053 1.038 1.487 1.121 0.969 0.899 1.164 1.240 1 1.163
Nonstationary models––second part
1-step (7) 1.032 1.025 0.977 1.038 0.992 0.980 1.041 0.988 0.974 1.061 0.994 1.015
0.983 1.036 0.972 0.979 1.014 0.964 0.984 1.014 0.978 1.001 1.035 0.970
0.976 1.057 0.955 0.984 1.028 1.160 1.009 1.089 1.188 1.142 1 1.030
2-step (6) 0.891 1.073 0.989 0.936 0.989 0.999 0.951 0.981 0.977 0.967 0.983 0.984
0.904 1.051 1.001 0.920 1.029 0.962 0.927 1.036 0.989 0.945 1.083 0.971
0.900 1.104 1.012 0.922 1.332 1.709 0.932 1.001 1.084 1.120 1.035 1
3-step (6) 0.934 1.029 0.969 0.958 1.000 1.021 0.970 0.988 0.985 0.978 0.998 0.991
0.899 1.018 0.975 0.915 0.982 0.956 0.923 0.993 0.977 0.938 1.017 0.937
0.907 1.178 1.001 0.926 1.477 1.951 0.942 1.004 0.965 1.021 1.028 1
4-step (4) 1.063 1.289 1.110 0.969 1.296 1.161 0.970 1.301 1.181 0.990 1.387 1.175
1.060 1.349 1.136 1.008 1.310 1.086 1.006 1.336 1.135 1.038 1.461 1.118
1.059 1.380 1.150 1.009 1.819 3.172 0.998 0.959 1.360 1.275 1.084 1
5-step (4) 0.955 1.259 1.076 0.917 1.283 1.139 0.926 1.317 1.141 0.940 1.417 1.148
0.994 1.327 1.107 0.958 1.346 1.090 0.963 1.394 1.133 0.993 1.543 1.121
0.984 1.432 1.103 0.957 2.305 3.433 0.908 0.924 1.193 1.232 1.046 1
See note to Table 5
holds true even for purely autoregressive processes. This has been confirmed in
another simulation exercise in which we generated nearly nonstationary AR(1)
processes (with the autoregressive parameter set to 0.9 and 0.95) and, under
h = 1, obtained gains from wavelet smoothing for forecasts from 2–3 to 5 steps
ahead (with the maximum horizon set to 5). This may result from the fact that
setting h equal to 1 leads to very good estimates of signals with the lowest
possible variance of the error term,11 which makes it possible to reduce the
forecast error variance. Finally, the good outcomes obtained for the first portion
of our data, consisting mainly of purely autoregressive (and sometimes periodic)
processes, or processes with a more complicated structure than 'AR + noise',
can also be explained by the fact that wavelet smoothing captures the influence
of higher-order terms in a representation of these processes, thus leading to less
complicated parametric models.
It is also worth adding that a portion of all the forecasting results obtained with
the wavelet methods was tested for equal predictive ability with those produced
by the best benchmark approaches, using the DM test under the assumption of the
quadratic loss function, but no significant predictive gains were found. This, at least
partially, results from a relatively small number (20) of forecasts considered for each
series and forecast horizon and, at the same time, from large forecast error variances.
On the other hand, the repeatability of changes in forecast MSEs across different
methods and forecast horizons discussed above, observed especially at the
aggregate level, seems to support our findings from the simulation studies.
7 Conclusions
Random signal estimation based on wavelet shrinkage combined with the MODWT
can be interesting for extracting components of economic processes as well as for
forecasting purposes. It relies, however, on the assumptions that the time series
under study are composed of a stochastic signal and an observational (white)
noise and that the conventional wavelet transform is relatively effective at within-
and between-scale decorrelation of these series. Although these assumptions are
certainly restrictive, both our empirical and simulation studies document that
wavelet random signal estimation applied prior to constructing parametric AR(I)MA
models can moderately reduce the forecast MSEs in the case of short- and medium-
sized samples. Because of the conceptual simplicity of the approach and due to the
fact that the computational complexity of the pyramid algorithm used to perform
the MODWT is quite low (strictly speaking, it is of the same order as the famous
fast Fourier transform), the method may be useful in automatic forecasting systems
11 Under h = 1 our signal estimates roughly correspond to those obtained via the so-called
canonical decomposition (see, e.g., Kaiser and Maravall 2005).
applied to large datasets comprising time series with relatively similar dynamic
properties. In fact, it takes only about 10 h to compute one million forecasts on
a medium-class personal computer. However, any application of wavelet smoothing
will usually require making many arbitrary decisions concerning the procedure’s
configuration (e.g., choosing the value of the smoothing constant and the maximum
level of decomposition). Splitting a single time series into estimation and validation
subsamples or, alternatively, applying a wavelet classification based on the (normal-
ized) wavelet variance at different scales to find the most similar cluster of historical
time series should help to optimize the settings in practical applications.
The forecasting procedures based on random signal estimation with wavelets
can be compared with analogous methods relying on wavelet thresholding (see
Alrumaih and Al-Fawzan 2002; Ferbar et al. 2009; Schlüter and Deuschle 2010). In
our opinion, the methods suggested here are better suited for short-term forecasting
of economic time series because wavelet thresholding builds on the assumption of
deterministic signals, transforms Gaussian processes into non-Gaussian ones, pre-
serves outliers and reduces the high frequency spectra to 0. The latter characteristic
is certainly problematic in the presence of certain high frequency oscillations in
the data. By contrast, wavelet smoothing is based on linear time-invariant filters
and offers more flexibility as to the level of noise reduction. In conclusion, we
recommend this approach for short-term forecasting based on wavelet denoising.
Acknowledgments The author acknowledges financial support from the Polish National Science
Center (Decision no. DEC-2013/09/B/HS4/02716). The author would also like to thank the
anonymous Reviewer for valuable comments and suggestions which helped improve the paper.
References
Alrumaih RM, Al-Fawzan MA (2002) Time series forecasting using wavelet denoising. J King
Saud Univ Eng Sci 14:221–234
Arino M (1995) Time series forecasts via wavelets: an application to car sales in the Spanish
market. Discussion Paper No. 95-30, Institute of Statistics and Decision Sciences, Duke
University. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.9279&rep=rep1&type=pdf. Accessed 18 Feb 2014
Bruzda J (2011) Some aspects of the discrete wavelet analysis of bivariate spectra for business
cycle synchronisation. Economics 16:1–46
Bruzda J (2013a) On simple wavelet estimators of random signals and their small-sample
properties. J Stat Comput Simul, in press. http://dx.doi.org/10.1080/00949655.2014.941843
Bruzda J (2013b) Wavelet analysis in economic applications. Toruń University Press, Toruń
Chen H, Nicolis O, Vidakovic B (2010) Multiscale forecasting method using ARMAX models.
Curr Dev Theory Appl Wavelets 4:267–287
Craigmile PF, Percival DB (2005) Asymptotic decorrelation of between-scale wavelet coefficients.
IEEE T Inf Theory 51:1039–1048
Diebold FX, Mariano RS (1995) Comparing predictive accuracy. J Bus Econ Stat 13:253–263
Donoho DL, Johnstone IM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika
81:425–455
Donoho DL, Johnstone IM (1995) Adapting to unknown smoothness via wavelet shrinkage. J Am
Stat Assoc 90:1200–1224
Ferbar L, Čreslovnik D, Mojškerc B, Rajgelj M (2009) Demand forecasting methods in a supply
chain: smoothing and denoising. Int J Prod Econ 118:49–54
Fernandez V (2008) Traditional versus novel forecasting techniques: how much do we gain? J
Forecasting 27:637–648
Fryźlewicz P, Van Bellegem S, von Sachs R (2003) Forecasting nonstationary time series by
wavelet process modelling. Ann I Stat Math 55:737–764
Kaboudan M (2005) Extended daily exchange rates forecasts using wavelet temporal resolution.
New Math Nat Comput 1:79–107
Kaiser R, Maravall A (2005) Combining filter design with model-based filtering (with an
application to business-cycle estimation). Int J Forecasting 21:691–710
Li TH, Hinich MJ (2002) A filter bank approach for modeling and forecasting seasonal patterns.
Technometrics 44:1–14
Makridakis S, Hibon M (2000) The M3-competition: results, conclusions and implications. Int J
Forecasting 16:451–476
Minu KK, Lineesh MC, Jessy John C (2010) Wavelet neural networks for nonlinear time series
analysis. Appl Math Sci 4:2485–2495
Nason GP (2008) Wavelet methods in statistics with R. Springer-Business Media, New York
Ogden RT (1997) Essential wavelets for statistical applications and data analysis. Birkhäuser,
Boston
Percival DB (1995) On estimation of the wavelet variance. Biometrika 82:619–631
Percival DB, Walden AT (2000) Wavelet methods for time series analysis. Cambridge University
Press, Cambridge
Renaud O, Starck J-L, Murtagh F (2002) Wavelet-based forecasting of short and long memory
time series. Working Paper No. 2002.04, University of Geneva. http://www.unige.ch/ses/metri/cahiers/2002_04.pdf. Accessed 18 Feb 2014
Ramsey JB (1996) If nonlinear models cannot forecast, what use are they? Stud Nonlinear Dyn E
1:65–86
Ramsey JB (2002) Wavelets in economics and finance: past and future. Stud Nonlinear Dyn E
6(3):1–27, article 1
Ramsey JB, Lampart C (1998) The decomposition of economic relationships by time scale using
wavelets: expenditure and income. Stud Nonlinear Dyn E 3:23–42
Schlüter S, Deuschle C (2010) Using wavelets for time series forecasting: does it pay off?
Diskussionspapier No. 4/2010, Institut für Wirtschaftspolitik und Quantitative Wirtschaftsforschung, Friedrich-Alexander-Universität. http://www.econstor.eu/obitstream/10419/36698/1/626829879.pdf. Accessed 18 Feb 2014
Vidakovic B (1999) Statistical modeling by wavelets. Wiley, New York
Wong H, Ip W-C, Xie Z, Lui X (2003) Modelling and forecasting by wavelets, and the application
to exchange rates. J Appl Stat 30:537–553
Yu P, Goldenberg A, Bi Z (2001) Time series forecasting using wavelets with predictor-corrector
boundary treatment. In: Proceedings of the 7th ACM SIGKDD international conference on
knowledge discovery and data mining, San Francisco, 26–29 August 2001
Zhang B-L, Coggins R, Jabri MA, Dersch D, Flower B (2001) Multiresolution forecasting for
futures trading using wavelet decompositions. IEEE T Neural Networ 12:765–775
Short and Long Term Growth Effects
of Financial Crises
Abstract Growth theory predicts that poor countries will grow faster than rich
countries. Yet, growth in developing countries has been consistently lower than
growth in developed countries. The poor economic performance of developing
countries coincides with both long-lasting and short-lived financial crises. In this
paper, we analyze to what extent financial crises can explain low growth rates in
developing countries. We distinguish between inflation, currency, banking, debt, and
stock-market crises and separate their short- and long-run effects. Our results
show that financial crises have reduced growth and that policy decisions have
worsened and/or extended them.
1 Introduction
From 1973 to 2007, the labor productivity growth of developed countries averaged
2 % per year. Over the same period, the average labor productivity growth in Africa
and Latin America averaged 0.5 and 0.8 % per year, respectively. Only developing
countries in Asia were able to match (and exceed) growth in the developed world
(3.2 % per year).1 During this period, Africa and Latin America, in particular, faced
several financial crises (Wilson et al. 2000; Reinhart and Rogoff 2011). For example,
Latin America suffered economically due to persistent financial crises throughout
most of the 1970s and the 1980s (De Gregorio and Guidotti 1995), while large
1 http://www.conference-board.org/.
F.N.G. Andersson (✉) • P. Karpestam
Department of Economics, Lund University, P.O. Box 7082, 220 07 Lund, Sweden
e-mail: [email protected]; [email protected]
M. Gallegati and W. Semmler (eds.), Wavelet Applications in Economics and Finance, 227
Dynamic Modeling and Econometrics in Economics and Finance 20,
DOI 10.1007/978-3-319-07061-2__10,
© Springer International Publishing Switzerland 2014
228 F.N.G. Andersson and P. Karpestam
parts of Africa faced “near-permanent banking-stress” for 20 years (Kane and Rice
2001).
In this paper, we analyze to what extent the poor economic performance in 30
developing countries between 1973 and 2007 can be explained by the occurrence
of financial crises. Previous studies of the effects of financial crises have given
inconclusive results: the short-run growth effects are mostly negative (Norman and
Romain 2006; Ramey and Ramey 1995; Hausmann and Gavin 1996; Easterly et
al. 2001), but estimates of the long-term growth effects of financial crises are less
conclusive. Some studies show that long-run growth is reduced by financial crises
(Englebrecht and Langley 2001; Rousseau and Wachtel 2002; Bordo et al. 2010;
Eichengreen and Hausmann 1999), while others suggest that long-run growth
is even enhanced by financial crises (Bruno and Easterly 1998; Ranciere et al. 2008).
Arguably, how severe the growth effects of a financial crisis are depends on (1)
what kind of financial crisis it is, (2) whether more than one crisis coincides, (3) through
which growth channel it affects the economy and (4) the time horizon. The literature
on financial crises commonly distinguishes between five different types of financial
crises: inflation, currency, banking, debt, and stock market crises (see e.g. Reinhart
and Rogoff 2011). The respective types of financial crises are sometimes linked.
Sovereign debt crises, for example, are often preceded by a banking crisis, forcing
the national government to take over debts in the banking sector (Velasco 1987;
Reinhart and Rogoff 2011). In turn, debt crises often spill over into currency crises
(Kaminsky and Reinhardt 1999) and countries facing insolvency sometimes inflate
the economy to reduce the debt burden (Labán and Sturzenegger 1994). This action
may, in turn, cause an inflation crisis as well.
Each type of financial crisis can affect economic growth, but how much they
affect growth is likely to differ. Stock market crashes, for example, can
affect investments (Tobin 1969; von Furstenberg 1977) and/or private consumption
through a wealth effect (Friedman 1957; Paiella 2009). But in stock markets in
developing countries, only a limited number of people own shares (Enisan and
Olufisayo 2009), which causes wealth effects to be small at the aggregate level.
A currency crisis, however, is likely to have more severe effects on the economy than
a stock market crash, especially for developing countries that are dependent on
foreign investment capital and technology. Similarly, a banking crisis that affects the
channeling of capital from savers to borrowers is likely to affect growth more than
an inflation crisis, against which agents in the economy can hedge, for example,
by price-indexing contracts (McNelis 1988).
Bonfiglioli (2008) argues that financial crises that only affect productivity growth
are worse for a developing country than financial crises that only affect capital accu-
mulation. Most of the income difference among countries is due to differences in
productivity and not in capital intensity (Gourinchas and Jeanne 2006). A financial
crisis that affects productivity growth causes greater welfare effects than a financial
crisis that affects capital accumulation for a developing country because it reduces
that country’s ability to catch up economically with developed countries (Bonfiglioli
2008). Understanding through which growth channel a crisis affects growth is
thus important.
The time horizon is also likely to affect the impact of financial crises on
economic growth. An inflation crisis is less likely to affect growth over the long term,
given that agents index their contracts, but if the crisis is unexpected it could have
negative effects over the short term. A debt crisis, on the other hand, that only affects
the country's access to capital for a year is unlikely to cause a major reduction
in capital accumulation, while a debt crisis that continues to restrict the country's
access to foreign capital for several years is likely to impact capital accumulation
over the long term.
Most papers only consider the growth effects of one kind of financial crisis at a
time, and either the short-run or the long-run growth effects of that specific kind of
crisis. In this paper, we analyze the growth effects of five different types of financial
crises (inflation, currency, banking, debt, and stock market crashes) simultaneously
on labor productivity growth and its two growth channels (total factor productivity
growth and capital accumulation). The growth effects are separated into short-run
effects and long-run effects using a Band Spectrum Regression model (see e.g.
Andersson 2011a). Our focus is on developing countries, but we compare and
contrast the results with developments among developed countries over the same
time period.
Our results show that financial crises have negative growth effects both in the
short-run and the long-run. Inflation crises have mostly only short-run effects on
growth and persistent inflation crises have no long-run growth effects. Persistent
debt and currency crises on the other hand are associated with a decline in the long-
run growth rates. The long-run growth effects mainly occur through the total factor
productivity channel, although there is an effect on capital accumulation as well.
Based on these results we find that, in the absence of financial crises, growth in
Latin America would have kept pace with growth in developed countries. Growth
among African countries would
also have been higher had they not suffered from financial crises, but the average
growth rate would still have been lower than among developed countries.
The remainder of the paper is organized as follows: Sect. 2 presents the model,
Sect. 3 contains the empirical results, and Sect. 4 concludes the paper.
2 The Model

The starting point for our model is a Cobb–Douglas production function with Harrod-neutral
technology and constant returns to scale,

Y_{it} = K_{it}^{\alpha} (A_{it} L_{it})^{1-\alpha},  (1)

which, expressed in growth rates of labor productivity, becomes

y_{it} = (1-\alpha) a_{it} + \alpha k_{it},  (2)

where y_{it} = \ln(Y_{it}/L_{it}) - \ln(Y_{it-1}/L_{it-1}), a_{it} = \ln(A_{it}) - \ln(A_{it-1}) and
k_{it} = \ln(K_{it}/L_{it}) - \ln(K_{it-1}/L_{it-1}). In Eq. (2) we observe that labor productivity growth
comes from two channels: either total factor productivity growth (a_{it}) or capital
accumulation (k_{it}). Financial crises may thus either affect labor productivity
growth through total factor productivity, through capital accumulation or through
both.
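A quick numerical check of the decomposition in Eq. (2), with invented levels of A and K/L (only the capital elasticity of 0.27 anticipates the long-run estimate reported in Table 4 below; everything else is illustrative):

import numpy as np

alpha = 0.27                           # capital output elasticity
A0, A1 = 1.00, 1.02                    # illustrative TFP levels
k0, k1 = 10.0, 10.5                    # illustrative capital per worker
y0 = k0 ** alpha * A0 ** (1 - alpha)   # labor productivity, Y/L
y1 = k1 ** alpha * A1 ** (1 - alpha)
dy = np.log(y1) - np.log(y0)
da = np.log(A1) - np.log(A0)
dk = np.log(k1) - np.log(k0)
assert np.isclose(dy, (1 - alpha) * da + alpha * dk)   # Eq. (2) holds exactly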
We estimate three regression models using a band spectrum regression (see Engle
1974) to test which financial crises affect labor productivity growth, through
which growth channel, and over which time horizon they affect economic growth.
A band spectrum regression is an estimation technique for separating effects
over different time horizons. The estimation is carried out in two steps. First, all
the variables are transformed to the frequency domain, where each time horizon (in
our case the short run and the long run) is easily identified. Second, parameter
estimates for each time horizon are obtained by estimating the models on a subset
of frequencies representing a given horizon rather than on the entire frequency
band (for more information see Engle 1974; Andersson 2011a). In a small sample,
the band spectrum regression generally performs better than other methods for
distinguishing between short-run and long-run effects, such as cointegration analysis
(Corbae et al. 2002; Andersson 2008) or a simple moving average.
Any band pass filter can be used to transform the series to the frequency domain.
In this paper we use the Maximal Overlap Discrete Wavelet Transform (MODWT).2
This transform is chosen because it combines time and frequency resolution,
which makes it suitable for transforming series that contain nonrecurring
events such as structural breaks and outliers (Percival and Walden 2006).3
The band spectrum regression is not limited to two time horizons and the model
can include several time horizons (e.g. short-run, medium-run, long-run, etc.). But,
in line with standard economic theory, we limit the analysis to a short-run business
cycle component and the long-run trend component. Baxter and King (1999),
Assenmacher-Wesche and Gerlach (2008a, b) and Andersson (2011b) show that the
business cycle in general lasts between 4 and 8 years, and we consequently define
the short-run as fluctuations lasting up to 8 years.
Specifically, we estimate the following three regression models: one model for
labor productivity growth and one model for each of the two growth channels. The
model for labor productivity growth is given by,
2 To employ the maximal overlap discrete wavelet transform one can use several different sets
of basis functions. We chose to use Haar wavelet basis functions because they minimize the
potential effect of boundary coefficients (see Percival and Walden 2006). Alternative bases, such
as the Daubechies(4) and Daubechies(6) wavelets, have been employed, but the results are similar
irrespective of the filter.
3 For more information about the MODWT, see e.g., Ramsey and Lampart (1998), Percival and
Walden (2006), Crowley (2007), and Andersson (2008).
y_{it} = \beta_{y0} + \beta_{y1} F_{it}^{SR} + \beta_{y2} F_{it}^{LR} + \gamma_{y1} C_{it}^{SR} + \gamma_{y2} C_{it}^{LR} + \varepsilon_{yit},  (3)

the model for total factor productivity growth is given by,

a_{it} = \beta_{a0} + \beta_{a1} F_{it}^{SR} + \beta_{a2} F_{it}^{LR} + \gamma_{a1} C_{it}^{SR} + \gamma_{a2} C_{it}^{LR} + \varepsilon_{ait},  (4)

and the model for capital per employee growth is given by,

k_{it} = \beta_{k0} + \beta_{k1} F_{it}^{SR} + \beta_{k2} F_{it}^{LR} + \gamma_{k1} C_{it}^{SR} + \gamma_{k2} C_{it}^{LR} + \varepsilon_{kit},  (5)
where F_{it} is a vector of dummy variables indicating the respective types of
financial crises, C_{it} is a vector of commonly used control variables (see below),
SR denotes the short run, LR denotes the long run, \beta and \gamma are the parameters to be
estimated, and \varepsilon is a stochastic error term.
The variables are decomposed into a short-run and a long-run component using
the MODWT. Applying the MODWT we get the following decomposition of F
and C,

F_{it} = D_{1,it}^{F} + D_{2,it}^{F} + S_{2,it}^{F}  (6)

and

C_{it} = D_{1,it}^{C} + D_{2,it}^{C} + S_{2,it}^{C},  (7)

where the wavelet details D1 and D2 represent 2–4-year and 4–8-year cycles,
respectively, and the wavelet smooth S2 is the trend component representing
persistent long-run developments in the economy lasting 8 years and beyond.4 Given
our definition of the length of the business cycle, the short-run components of F and
C are represented by the two wavelet details,

F_{it}^{SR} = D_{1,it}^{F} + D_{2,it}^{F}  (8)

and

C_{it}^{SR} = D_{1,it}^{C} + D_{2,it}^{C},  (9)
4 The decomposition is made variable by variable and country by country. Not just the dependent
variable is decomposed; all variables are decomposed into time horizons.
while the long-run components are given by the wavelet smooth,

F_{it}^{LR} = S_{2,it}^{F}  (10)

and

C_{it}^{LR} = S_{2,it}^{C}.  (11)
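A compact sketch of this decomposition, and of the per-band OLS that follows, is given below for a single series, using a Haar MODWT with circular boundaries. It is a simplification under stated assumptions: the chapter's panel setting, its treatment of boundary coefficients and its inference are richer, and all function names are ours.

import numpy as np

def haar_modwt_mra(x, J=2):
    # multiresolution analysis: x = D1 + ... + DJ + SJ (circular boundaries)
    x = np.asarray(x, dtype=float)
    V, W = x.copy(), []
    for j in range(1, J + 1):                 # analysis pyramid
        s = 2 ** (j - 1)
        Vl = np.roll(V, s)
        W.append((V - Vl) / 2.0)
        V = (V + Vl) / 2.0

    def upsteps(c, j, detail_first):
        # one synthesis step at level j, then scaling steps down to level 0
        s = 2 ** (j - 1)
        c = ((c - np.roll(c, -s)) if detail_first else (c + np.roll(c, -s))) / 2.0
        for k in range(j - 1, 0, -1):
            sk = 2 ** (k - 1)
            c = (c + np.roll(c, -sk)) / 2.0
        return c

    D = [upsteps(W[j - 1], j, True) for j in range(1, J + 1)]
    S = upsteps(V, J, False)
    return D, S                               # details D1..DJ and smooth SJ

def band_spectrum_ols(y, X, J=2):
    # OLS run separately on the short-run band (D1 + D2) and the long-run
    # band (S2); the constant belongs to the low-frequency band only
    def split(v):
        D, S = haar_modwt_mra(v, J)
        return sum(D), S
    ys, yl = split(y)
    Xs = np.column_stack([split(x)[0] for x in X.T])
    Xl = np.column_stack([np.ones(yl.size)] + [split(x)[1] for x in X.T])
    beta_sr, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    beta_lr, *_ = np.linalg.lstsq(Xl, yl, rcond=None)
    return beta_sr, beta_lr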
3 Empirical Analysis
3.1 Data
Our data set contains 51 countries (see Table 9 in Appendix) covering the period of
1973–2007. The final year is dictated by availability of real investment data (Penn
World Table 6.3) that is needed to generate national capital stock estimates. Of the
51 countries, the World Bank classifies 21 as developed countries, and 30 countries
are classified as developing countries.6 We rely on external data sources for labor
productivity, financial crises, institutions, education, and globalization. A detailed
description of the data and the data sources are available in Table 10 (Appendix).
Our indicators of financial crises are collected from Reinhart and Rogoff’s (2011)
database.7 This database distinguishes between five different types of crises that
are indicated with dummy variables: inflation, currency, banking, debt, and stock
market crises.
5 Data availability makes it impossible to find external instruments for each of the five financial
crises, and we rely instead on internal instruments.
6 See http://data.worldbank.org/about/country-classifications.
7 The database can be obtained from Reinhart's webpage: http://terpconnect.umd.edu/~creinhar/Courses.html.
where h denotes the time horizon (i.e., SR or LR). Using Eq. (12) and the parameter
estimates, \hat{\gamma}, total factor productivity is then estimated as

\hat{a}_{it}^{h} = y_{it}^{h} - \hat{\gamma}_{y2} k_{it}^{h}.  (13)
For education, we use the total years of schooling among the labor force.10
Education data are only available at a 5-year interval, and without higher frequency
data, we cannot include the variable in the short-run models. Therefore, education
is only included in the long-run models. To capture the effect of globalization
on the financial system and the overall economy, we use the KOF index, which
is a combined measure of economic, social and political globalization (Dreher
2006). Recently, the KOF index has been used in empirical research to capture
the macroeconomic effects of the current globalization process (see e.g., Bergh
and Nilsson 2010). Based on Cavallo and Cavallo’s (2010) discussion of the link
between democratic institutions and financial crises, we use the Freedom House
political rights index to control for institutional quality. Each country is scored by
Freedom House between 1 and 7, where countries with a score between 1 and 2.5
are defined as free. Countries with a score between 3.0 and 5.0 are partly free, and
countries with a score between 5.5 and 7 are not free. Because we are modeling
growth rates, we use the percentage change in education, political rights and the
KOF index in the regression models.
8 We also tested alternative depreciation rates (3 and 7 %), but changing the depreciation rate has
only a minor effect on the estimated capital output elasticity, and no significant effect on the
estimates of the effects of financial crises.
9 This regression model is derived from Eq. (2).
10 Alternative measures, such as secondary schooling, were also considered, but models including
total schooling have better statistical properties than models using secondary schooling.
Labor productivity growth rates: growth is, on average, the highest among
developing Asian countries, with an average yearly growth rate of 3.19 %, while it is
the lowest among African countries, at 0.51 % per year. Among Latin American
countries, average labor productivity growth is 0.78 % per year, and among
developed countries 2.00 %. Labor productivity growth is more volatile among
developing countries than among developed countries. While growth remains within
a span of −2 to 5 % per year among developed countries, among African countries
yearly growth fluctuations of ±15 percentage points are common.
As can be seen in Table 1, developed countries have experienced fewer financial
crises than developing countries (0.54 per year). A stock market crisis is the most
common (0.27 per year), and a debt crisis is the least common (0 per year). Among
African countries, the average is 1.21 per year, and a stock market crash (0.34 per
year) is the most common followed by debt (0.28 per year), currency (0.23 per year)
and inflation crises (0.20 per year). Developing Asian countries experience 0.84
crises per year of which a stock market crash (0.27 per year) and bank crisis (0.23
per year) are the two most common types. Latin America has the highest frequency
of financial crises (1.73 per year). In Latin America, inflation crises are the most
common (0.45 per year), followed by debt (0.43 per year), and currency crises (0.42
per year).
Currency and debt crises often coincide in the long run (see Table 2). The
correlation between inflation and currency crises is 0.77, and the correlation between
inflation and debt crises is 0.43. There is, however, no significant correlation
between any of the other financial crises. Over the short term (Table 3), the
highest correlation is between inflation and currency crises, at 0.14, but this is not
significantly different from zero. Although financial crises occur simultaneously
over the long term, they are independent over the short term.
The high long-term correlation between inflation and currency crises implies
that we can interpret these two crises as a joint monetary crisis instead of two
separate crises (over the long term). The significant and positive correlation with
the Freedom House political rights index suggests that policy decisions are at least
in part responsible for causing the monetary crises.
For each regression model, we present two regression results: the results from a
complete model that includes all variables and the results from a reduced model
where the insignificant variables have been removed. The error term in the model
is specified as a two-way error component model that includes fixed effects for
both cross-sectional and time effects. We use robust standard errors to account for
heteroskedasticity (see e.g., Arellano 1987; Baltagi 2008). The regression results
are available in Table 4 (long run) and Table 5 (short run).11
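To make the specification concrete, the sketch below estimates a two-way fixed-effects model with robust standard errors using the linearmodels package on a simulated panel shaped like ours. The data and variable names are invented; the coefficients used to generate the data merely echo the reduced long-run model of Table 4.

import numpy as np
import pandas as pd
from linearmodels.panel import PanelOLS

rng = np.random.default_rng(0)
idx = pd.MultiIndex.from_product(
    [range(51), range(1973, 2008)], names=["country", "year"])
df = pd.DataFrame({
    "dk": rng.normal(size=len(idx)),                         # capital growth
    "currency": rng.integers(0, 2, len(idx)).astype(float),  # crisis dummies
    "debt": rng.integers(0, 2, len(idx)).astype(float),
}, index=idx)
df["dy"] = (0.27 * df["dk"] - 1.27 * df["currency"] - 1.09 * df["debt"]
            + rng.normal(scale=0.5, size=len(idx)))

mod = PanelOLS(df["dy"], df[["dk", "currency", "debt"]],
               entity_effects=True, time_effects=True)       # country and year effects
res = mod.fit(cov_type="robust")                             # White-type robust SEs
print(res.params)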
Labor productivity growth responds negatively to a financial crisis both over the
long term and the short term. However, the impacts of the different types of crises
are not the same in the short and the long run. Inflation, currency, and banking
crises affect growth in the short run, but in the long run, only currency and debt
crises have significant effects. Stock market crashes have no growth effect at all,
irrespective of the time horizon. Banking crises have the largest short-term effect
on growth, −1.33 % points per year, and currency crises have the largest long-term
effect, −1.27 % points.
Overall, the short-run growth models explain little of the variation in the data (R² is 0.11). Short-term crises have no long-run effect, and their most negative effect comes from increasing volatility in the economy. But even if financial crises do cause higher short-term volatility, as is indicated by the low R² values, most of the short-term volatility in the data is due to other factors. Because of this, the impact of financial crises over the short term is limited. Over the long term, the explanatory power of the models is higher: R² is between 0.36 and 0.39.
The high long-term correlation between inflation and currency crises creates a multicollinearity problem in the model, and it is only possible to include one of the two at a time. However, because of the high correlation between the two, we interpret them as representing a monetary crisis. A long-run monetary (currency) crisis reduces growth by 1.27 percentage points per year. When occurring jointly with a debt crisis (which is often the case), growth is reduced by another 1.09 percentage points. Combined, the two crises thus reduce growth by 2.36 percentage points per year.
Turning to the growth channels, we find a stronger effect of financial crises on total factor productivity than on capital accumulation. This result is in accordance with Bonfiglioli (2008), who found that financial development has a stronger effect on productivity than on capital accumulation. In the short run, financial crises have a negative impact on total factor productivity but no effect on capital accumulation. Because these negative effects on productivity capture both demand and productivity effects over the short term, and capital accumulation is unaffected by financial crises, these results indicate that aggregate demand matters more than aggregate supply in the short-run response to financial crises.
In the long run, financial crises (i.e., debt crises) have a negative impact on both capital accumulation and total factor productivity. Debt crises reduce capital accumulation growth by 2.14 percentage points and total factor productivity by 0.98 percentage points. Total factor productivity is also negatively affected by monetary (currency) crises, at 1.18 percentage points. Considering that these crises often coincide, the combined effect on total factor productivity is 2.16 percentage points for each year the crisis lasts.
11 We have assumed that the errors in the respective regression models are normally distributed in order to perform inference on the parameters. The normality hypothesis is supported by a Jarque–Bera normality test for all but one case: the long-run African labor productivity growth model. However, once we include two dummy variables to control for outliers, we do not reject the normality assumption for this growth model either.
Table 4 Regression results of the long-run growth models

              Labor productivity growth      Capital growth                  Total factor productivity
              Full model     Reduced model   Full model     Reduced model    Full model     Reduced model
Capital       0.27***(0.05)  0.27***(0.06)   –              –                –              –
Inflation     0.08(0.92)     –               0.38(1.28)     –                0.10(0.73)     –
Currency      1.39*(0.81)    1.27**(0.58)    1.80(1.42)     –                1.28(0.81)     1.18**(0.58)
Banking       0.81(0.51)     –               0.10(0.90)     –                0.80(0.51)     –
Debt          0.95**(0.47)   1.09**(0.48)    1.79**(0.81)   2.14***(0.73)    0.84*(0.46)    0.98**(0.47)
Stock market  0.96(0.83)     –               1.02(1.46)     –                0.89(0.84)     –
All countries have experienced short-run financial crises, but only Africa and Latin America have experienced persistent long-run financial crises. To explore whether the crisis effects are the same for both continents, we estimate two sub-panels using long-run data: one for African countries and one for Latin American countries.
These long-run estimation results are presented in Table 6.
For Africa as well as Latin America, a debt crisis has a significant and negative
impact on capital accumulation. However, a debt crisis affects total factor produc-
tivity in Africa but not in Latin America. Instead, total factor productivity in Latin
America is affected negatively by inflation crises. Further, capital accumulation in
Latin America is negatively affected by banking crises, which is not the case for
Africa.
A positive and significant correlation between monetary crises and the Freedom
House index suggests that monetary crises are partially caused by monetary policy
decisions over the long term. For example, debt crises during the early 1980s
created a need for many developing countries to become less dependent on foreign
sources of capital and adjust their economies. Latin American economies postponed
this process by inflating their currency (Labán and Sturzenegger 1994). Not all
developing countries have followed this path (Djikstra 1997), and consequently those countries have not suffered as much from the inflation and currency crises that resulted from the policy response. For example, during the Southeast Asian crises in 1996–
1997, policy makers responded quickly and inflation never rose to the same levels
as in Latin America. As a result, Southeast Asia recovered quickly from the crisis
(Pilbeam 2006).
12 The sum of the capital accumulation effect and the total factor productivity effect equals labor productivity growth; see Eq. (2).
Table 6 Regression results of the long-run growth models for Africa and Latin America

                     Growth                          Capital growth                  Total factor productivity
                     Africa         Latin America    Africa         Latin America    Africa         Latin America
Capital              0.12(0.12)     0.35***(0.06)    –              –                –              –
Inflation            –              –                –              –                –              2.00***(0.80)
Currency             –              1.02**(0.48)     –              –                –              –
Banking              –              1.77***(0.54)    –              2.59**(1.21)     –              –
Debt                 2.15***(0.73)  –                2.17***(0.92)  1.94**(0.85)     1.79**(0.72)   –
Stock market         –              –                –              –                –              –
Political rights     0.53**(0.25)   0.47***(0.24)    –              –                0.42*(0.25)    0.43***(0.18)
Education            –              –                –              –                –              0.41**(0.20)
Globalization (KOF)  –              –                –              –                –              –
Adjusted R²          0.38           0.89             0.12           0.14             0.20           0.28
BIC                  0.04           0.32             0.55           0.42             0.06           0.47

*Significant at the 10 % level; **significant at the 5 % level; ***significant at the 1 % level
Table 7 Long-run labor productivity, capital, and total factor productivity growth for the average African and the average Latin American country

            Latin America (%)                     Africa (%)
Average     Growth  Capital  Total factor         Growth  Capital  Total factor
                    growth   productivity                 growth   productivity
1973–1980   0.94    1.12     0.19                 1.32    1.78     0.47
1981–1990   0.42    0.39     0.81                 0.32    0.24     0.56
1991–2000   1.19    0.32     0.86                 0.18    0.34     0.16
2001–2007   1.48    0.13     1.32                 1.32    0.23     1.09
1973–2007   0.78    0.49     0.28                 0.51    0.48     0.03

            Latin America potential growth (%)    Africa potential growth (%)
1973–1980   2.19    1.34     0.85                 1.60    1.97     0.37
1981–1990   1.49    1.00     0.49                 0.56    0.50     0.06
1991–2000   2.45    0.77     1.68                 0.60    0.14     0.74
2001–2007   2.00    0.33     1.67                 2.00    0.44     1.56
1973–2007   2.03    0.86     1.17                 1.19    0.69     0.50
Table 8 Long-run labor productivity, capital, and total factor productivity growth for the average developed and the average Asian country

            Developed countries (%)               Asia (%)
            Growth  Capital  Total factor         Growth  Capital  Total factor
                    growth   productivity                 growth   productivity
1973–1980   2.54    1.63     0.91                 3.08    2.56     0.53
1981–1990   1.98    1.10     0.88                 2.76    1.87     0.89
1991–2000   1.93    0.87     1.06                 3.01    1.60     1.41
2001–2007   1.51    0.79     0.72                 3.50    1.15     2.35
1973–2007   1.99    1.10     0.89                 3.19    1.80     1.40
Fig. 1 Estimated long-run labor productivity growth for the average developed, African, Asian and Latin American country, 1972–2007 (series plotted: developed countries' long-run productivity, Asia's long-run productivity, Africa's potential productivity, and Latin America's potential productivity)
Financial crises are estimated to have reduced average growth in Latin America by 1.25 percentage points per year and in African countries by 0.68 percentage points per year. The average potential growth over the entire period 1973–2007 in Latin America equals the observed growth of the developed countries: 2.03 % compared to 2.00 %, respectively. Average potential African growth is lower, at 1.19 %. From 2000 to 2007, however, potential African growth exceeded observed growth among developed countries (2.00 % compared to 1.51 %).
On average, growth is highest in Asia. Table 8 shows that more than 50 % of the high Asian growth rates can be explained by capital accumulation. Latin American growth lags behind observed Asian growth due to lower capital accumulation rates. Additionally, Africa trails Asia because of lower potential capital accumulation rates and lower potential total factor productivity growth.
In relation to developed countries, these results show that Latin America would have been falling behind during the 1970s and 1980s even had there been no financial crises, but would have been catching up from the 1990s onward. Similarly, Africa would have been falling behind from the 1970s and throughout the 1990s but catching up thereafter. Without the financial crises, growth would have been higher, but limited investment (due to factors other than financial crises) would still have prevented African and Latin American countries from catching up with developed countries and developing Asian countries.
In Fig. 1, the potential African and Latin American labor productivity levels are plotted together with the observed long-run labor productivity levels for Asia and the developed countries. Because our data set begins in 1973, we set the productivity level to 1 in 1972. As can be seen in Fig. 1, developing Asian countries outpace all other countries. Latin American countries catch up with developed countries in the late 1980s, and both sets of countries double their productivity levels between 1972 and 2007. African countries, however, still lag behind.
Fig. 2 Observed and potential long-run labor productivity growth for the average African and
Latin American country. (Panel a) Africa long-run productivity level. (Panel b) Latin America
long-run productivity level
The difference between the estimated long-run productivity level and the estimated long-run potential labor productivity level is shown in Fig. 2: Africa in Panel A and Latin America in Panel B. As can be seen in the figure, this difference grows persistently over time. In 2007, the actual productivity level was 36.2 % below the potential level in Latin America and 22.2 % below in Africa. Considering that productivity has been below the potential level since the 1970s, we define, similarly to Boyd et al. (2002), the cost of financial crises as the cumulative difference between the potential and the actual productivity level,
$$\sum_{i=1973}^{2007}\Big[\ln\big(\text{potential productivity level}_i\big)-\ln\big(\text{long-run productivity level}_i\big)\Big] \qquad (17)$$
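As a back-of-the-envelope illustration of Eq. (17), the following sketch computes the cumulative cost from two made-up productivity series (placeholders, not the estimated levels used in the chapter).

```python
import numpy as np

years = np.arange(1973, 2008)

# Illustrative placeholder levels, indexed to 1 in 1972: potential growing
# at 2.0 % per year, the actual (observed) level at 0.8 % per year.
potential = np.cumprod(np.full(years.size, 1.020))
actual = np.cumprod(np.full(years.size, 1.008))

# Eq. (17): the cumulative log-difference between the potential and the
# actual productivity level; each yearly term approximates the fraction of
# one year's output per employee lost, so the sum is measured in years of
# production per employee.
cost = float(np.sum(np.log(potential) - np.log(actual)))
print(f"cumulative cost: {cost:.2f} years of production per employee")
```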
For African countries, the cumulative cost of financial crises equals 3.92 years of production per employee between 1973 and 2007; for Latin American countries it equals 9.14 years of production per employee. Even though financial crises cannot fully explain why Latin American and African countries lag behind productivity in developed countries and developing Asian countries, the costs of long-term financial crises are substantial over time.
4 Conclusions
Our results show that long-run financial crises can in part explain the poor economic performance of African and Latin American developing countries since the 1970s. Without financial crises over the entire period of 1972–2007, Latin American growth would have equaled that of developed countries. However, Africa would still have lagged behind. Our results suggest that the most influential of all crises are
debt crises, which have affected both African and Latin American countries over the long term. Debt crises are also significantly correlated with inflation and currency crises. Moreover, inflation and currency crises are correlated with Freedom House's political rights index, which suggests that the policy response to the debt crises of the early 1980s made the economic growth consequences of those crises worse.
These results also show that even without financial crises, African and Latin American capital accumulation rates would have lagged behind the rates of developed countries and, in particular, the capital accumulation rates of developing Asian countries. Over the period considered, Asian countries have grown the fastest, and more than 50 % of their growth is explained by capital growth. Even if financial crises can explain part of the African and Latin American countries' poor economic performance, other factors affecting capital growth have contributed significantly.
Bonfiglioli (2008) and Gourinchas and Jeanne (2006) have argued that low productivity growth is worse for a developing country than low capital accumulation rates, as the potential to catch up with rich countries is conditional on productivity. Our results show that financial crises, over the long term, affect both capital accumulation and total factor productivity. They thus indicate that the crises and their subsequent policy responses have had a severe negative impact on the ability of developing countries to catch up with developed countries.
Financial crises have both short- and long-term economic effects. However, financial crises explain little of the short-term variation in the data. Although financial crises have a negative impact on all countries (not just developing countries), they generate little volatility compared to the "normal" short-term volatility in the data (caused by non-crisis factors). The short-term consequences are therefore small compared to the long-term consequences.
Appendix
Table 10 (continued)
Variable Description
Financial crisis We rely on Reinhart and Rogoff's (2011) database of financial crises. The database distinguishes between five different crises (inflation, currency, debt, banking and stock market crises). An inflation crisis is defined as annual inflation exceeding 20 %. A currency crisis is defined as the domestic currency losing 15 % of its value against the USD or another relevant currency. A banking crisis is defined as a bank run leading to a bank closure, merger or takeover by the public sector, or as a bank needing assistance in a way that spreads to other banking institutions. A debt crisis is when a country defaults on its external debt. The database can be found at http://terpconnect.umd.edu/~creinhar/Courses-html, together with a detailed description of the data. (A minimal coding sketch of these threshold definitions follows this table.)
Education The education variable measures the increase in the total number of years
of schooling among the labor force. The data are collected from the
World Development Indicators (http://data.worldbank.org/indicator)
Political rights We use Freedom House’s political rights index. The database can be
found here: www.freedomhouse.org
Globalization To measure globalization, we use the KOF index by Dreher (2006), which combines three dimensions of globalization (economic, social, and political). Economic globalization accounts for 36 % of the index, social globalization for 38 %, and political globalization for 26 %. The database is available from http://globalization.kof.ethz.ch/
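A minimal coding sketch of the inflation and currency crisis thresholds described above (the column names and the toy panel are illustrative assumptions, not the Reinhart–Rogoff database layout):

```python
import pandas as pd

def flag_crises(panel: pd.DataFrame) -> pd.DataFrame:
    """Add 0/1 dummies for two of the Reinhart-Rogoff crisis definitions.

    Expects columns 'inflation' (annual rate in %) and 'fx_depreciation'
    (annual % loss of the domestic currency against the USD).
    """
    out = panel.copy()
    out["inflation_crisis"] = (out["inflation"] > 20.0).astype(int)       # > 20 % inflation
    out["currency_crisis"] = (out["fx_depreciation"] > 15.0).astype(int)  # > 15 % depreciation
    return out

toy = pd.DataFrame({
    "country": ["A", "A", "B"],
    "year": [1980, 1981, 1980],
    "inflation": [35.0, 12.0, 8.0],
    "fx_depreciation": [22.0, 5.0, 16.0],
})
print(flag_crises(toy))
```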
References
Arellano M (1987) Computing robust standard errors for within-groups estimators. Oxf Bull Econ Stat 49:431–434
Baltagi BH (2008) Econometric analysis of panel data. Wiley, Chichester
Bonfiglioli A (2008) Financial integration, productivity and capital accumulation. J Int Econ 76:337–355
Bordo MD, Meissner CM, Stuckler D (2010) Foreign currency debt, financial crises and economic growth: a long-run view. J Int Money Finance 29:642–665
Boyd JH, Kwak S, Smith B (2002) The real output losses associated with modern banking crises.
J Money Credit Bank 37:977–999
Bruno M, Easterly W (1998) Inflation crises and long-run growth. J Monetary Econ 41:3–26
Cavallo AF, Cavallo EA (2010) Are crises good for long-term growth? The role of political
institutions. J Macroecon 32:838–857
Corbae D, Ouliaris S, Phillips PCB (2002) Band spectral regression with trending data. Economet-
rica 70:1067–1109
Crowley PM (2007) A guide to wavelets for economists. J Econ Surv 21:207–267
De Gregorio J, Guidotti P (1995) Financial development and economic growth. World Dev 23(1):433–448
Djikstra AG (1997) Fighting inflation in Latin America. Dev Change 28:531–557
Dreher A (2006) Does globalization affect growth? empirical evidence from a new index. Appl
Econ 38:1091–1110
Easterly W, Islam R, Stiglitz J (2001) Volatility and macroeconomic paradigm for rich and poor
countries: advances in macroeconomic theory. In: Drèze JD (ed) Advances in macroeconomic
theory. Palgrave, New York
Eichengreen B, Hausman R (1999) Exchange rates and financial fragility. In: Proceedings Federal
Reserve Bank of Kansas City, pp 329–368
Englebrecht H-J, Langley C (2001) Inflation crisis, deflation, and growth: further evidence. Appl
Econ 33:1157–1165
Engle RF (1974) Bandspectrum regressions. Int Econ Rev 15(1):1–11
Enisan AA, Olufisayo AO (2009) Stock market development and economic growth: evidence from
seven sub-Saharan African countries. J Econ Bus 61:162–171
Friedman M (1957) A theory of the consumption function. Princeton University Press, Princeton
von Furstenberg GM (1977) Corporate investment: does market valuation matter in the aggregate?
Brookings Papers Econ Act 8:347–408
Gourinchas P, Jeanne O (2006) The elusive gains from international financial integration. Rev Econ
Stud 73(3):715–741
Hausmann R, Gavin M (1996) Securing stability and growth in a shock-prone region: the policy challenge for Latin America. Inter-American Development Bank Research Department Working Paper 315
Kaminsky GL, Reinhart CM (1999) The twin crises: the causes of banking and balance-of-payments problems. Am Econ Rev 89:473–500
Kane EJ, Rice T (2001) Bank runs and banking policies: lessons for African policy makers. J Afr
Econ 10:36–71
Labán R, Sturzenegger F (1994) Fiscal conservatism as a response to the debt crisis. J Dev Econ
45:305–324
Larson DF, Butzer R, Mundlak Y, Crego A (2000) A cross-country database for sector investment
and capital. World Bank Econ Rev 14:371–91
McNelis PD (1988) Indexation and stabilization: theory and experience. World Bank Res Obs
3:157–169
Loayza NV, Ranciere R (2006) Financial development, financial fragility, and growth. J Money Credit Bank 38:1051–1076
Paiella M (2009) The stock market, housing and consumption spending: a survey of the evidence
on wealth effects. J Econ Surv 23:947–973
Percival D, Walden T (2006) Wavelet methods for time series analysis. Cambridge University
Press, New York
Pilbeam K (2006) International finance. Palgrave Macmillan, New York
Ranciere R, Tornell A, Westermann F (2008) Systemic crises and growth. Q J Econ 123:359–406
Ramey G, Ramey VA (1995) Cross country evidence on the link between volatility and growth. Am Econ Rev 85:1138–1159
Ramsey JB, Lampart C (1998) The decomposition of economic relationships by time scale using
wavelets: expenditure and income. Stud Non-Linear Dyn Econom 3:23–42
Reinhart C, Rogoff KS (2011) From financial crisis to debt crisis. Am Econ Rev 101(5):1676–1706
Rousseau PL, Wachtel P (2002) Inflation thresholds and the finance-growth nexus. J Int Money
Finance 21:777–793
Rodrik D (2000) Institutions for high-quality growth: what they are and how to acquire them.
NBER working paper 7540.
Tobin J (1969) A general equilibrium approach to monetary theory. J Money Credit Bank 1:15–29
Tommasi M (2004) Crisis, political institutions, and policy reform: the good, the bad, and the ugly.
In: Tungodden B, Stern N, Kolstad I (eds) Annual World Bank conference on development
economic–Europe 2003: toward pro-poor policies: aid, institutions and globalization. World
Bank and Oxford University Press, Oxford
Velasco A (1987) Financial crises and balance of payments crises. A simple model of the southern
cone experience. J Dev Econ 27:263–283
Wilson B, Saunders A, Gerard CJR (2000) Financial fragility and Mexico’s 1994 Peso Crisis: an
event-window analysis of market-valuation effects. J Money Credit Bank 32(3):450–468
Measuring Risk Aversion Across Countries from the Consumption-CAPM: A Spectral Approach

E. Panopoulou and S. Kalyvitis
A recurrent puzzle in the macroeconomics and finance literature has been the failure of financial theory to explain the magnitude of excess stock returns by their covariance with consumption growth over the same period, termed the “equity premium puzzle” (Mehra and Prescott 1985). Standard asset pricing models, like the Consumption Capital Asset Pricing Model (henceforth C-CAPM),
models, like the Consumption Capital Asset Pricing Model (henceforth C-CAPM),
can only match the data if investors are extremely risk averse in order to reconcile
the large differential between real equity returns and real returns available on
E. Panopoulou (✉)
Kent Business School, University of Kent, Canterbury CT2 7PE, UK
e-mail: [email protected]
S. Kalyvitis
Department of International and European Economic Studies, Athens University of Economics
and Business, Patision Str 76, Athens 10434, Greece
e-mail: [email protected]
short-term debt instruments.1 Much of the resulting empirical literature has focused
on the US markets where longer data series exist, whereas Campbell (1996, 2003)
focuses on some smaller stock markets and finds evidence that the “equity premium
puzzle” persists. Specifically, Campbell (2003) reports evidence from 11 countries that implies extremely high values of risk aversion, often exceeding many times over the value of 10 considered plausible by Mehra and Prescott (1985), and claims “…that the equity premium puzzle is a robust phenomenon in international data”.
Most empirical studies on the “equity premium puzzle” have focused on
relatively short horizons; however, examining the long-run components (“low
frequencies”) of the puzzle is important because the majority of investors typically
have long holding horizons. Indeed, Brainard et al. (1991) have shown that the
performance of the C-CAPM improves as the horizon increases, a finding confirmed
by Daniel and Marshall (1997) who have found that at lower frequencies aggregate
returns and consumption growth are more correlated and the behavior of the equity
premium becomes less puzzling. In a series of papers, Parker (2001, 2003) and
Parker and Julliard (2005) have allowed for long-term consumption dynamics by
focusing on the ultimate risk to consumption, defined as the covariance between an
asset’s return during a quarter and consumption growth over the quarter of the return
and several following quarters, and have found that it explains the cross-sectional
variation in returns surprisingly well, but also show that the “equity premium
puzzle” is not eliminated.
In this paper we follow step-by-step the approach adopted by Campbell (2003)
using the same model and data, in order to re-evaluate over the frequency domain
his assessment that the standard, representative agent, consumption-based asset
pricing theory based on constant relative risk aversion utility fails to explain the
average returns of risky assets in international markets. We choose to proceed using
Campbell’s (2003) theoretical setup and dataset in order to make our results as
comparable as possible and we adopt a spectral approach to re-estimate the values
of risk aversion over the frequency domain. According to the spectral representation
theorem (Granger and Hatanaka 1964) a time series can be seen as the sum of
waves of different periodicity and, hence, there is no reason to believe that economic
variables should present the same lead/lag cross-correlation at all frequencies. We
incorporate this rationale into Campbell’s (2003) approach and dataset in order
to separate different layers of dynamic behavior of “equity premium puzzle” by
distinguishing between the short run (fluctuations from 2 to 8 quarters), the medium
run or business cycle (lasting from 8 to 32 quarters), and the long run (oscillations of
duration above 32 quarters). Our findings indicate that in the short run and medium
run, the coefficients of risk aversion for the countries at hand are implausibly
high, confirming the evidence reported by Campbell (2003). However, at lower
frequencies risk aversion falls substantially across countries, thus yielding in many
cases reasonable values of the implied coefficient of risk aversion.
1 Mehra (2003) and Cochrane (2005) provide extensive surveys of the relevant literature.
Our results are in line with evidence from long-run asset pricing. Bansal and
Yaron (2004), Bansal et al. (2005) and Hansen et al. (2008) have shown that
when consumption risk is measured by the covariance between long-run cashflows
from holding a security and long-run consumption growth in the economy, the
differences in consumption risk provide useful information about the expected
return differentials across assets. Theoretical research on asset pricing using loss
aversion theory suggests that time-varying expected asset returns follow a low
frequency movement (Barberis et al. 2001; Grüne and Semmler 2008). Semmler
et al. (2009) have shown that when there are time-varying investment opportunities,
due to low frequency movements in the returns, a buy and hold strategy is not
optimal. Readjustments of consumption and rebalancing of the portfolio should
therefore follow the low frequency component of the returns from the financial
assets in order to increase wealth and welfare.
It is worth noting that the spectral estimation of consumption-based models has
also been considered by Berkowitz (2001) and Cogley (2001). Berkowitz (2001)
has proposed a one-step Generalized Spectral estimation technique for estimating
parameters of a wide class of dynamic rational expectations models in the frequency
domain. By applying his method to the C-CAPM he finds that when the focus is
oriented towards lower frequencies, risk aversion attains more plausible values at
the cost of a risk-free rate puzzle generated by low estimates of the discount factor.
Cogley (2001) decomposes approximation errors over the frequency domain from
a variety of stochastic discount factor models and finds that their fit improves at
low frequencies, but only for high degrees of calibrated risk aversion. Recently, Kalyvitis and Panopoulou (2013) have shown how low frequencies of consumption risk can be incorporated in the standard Fama and French (1992) two-step estimation methodology, finding that these lower frequencies can explain the cross-sectional variation of expected returns in the U.S. and eliminate the “equity premium puzzle”.
In this paper we show how low frequencies of consumption risk can be incorporated
in Campbell’s (2003) empirical setup in an easily implementable way, in order to
separate and compare different layers of dynamic behavior of the “equity premium
puzzle” across countries by distinguishing between the short run, the medium run
(business cycle), and the long run.
We close the introductory section by noting that our approach complements
standard time-domain analysis by interpreting (high) low-frequency estimates as
the (short) long-run component of the “equity premium puzzle”. Yet we stress that the maintained hypothesis is that, over any subsegment of the observed time series, precisely the same frequencies hold at the same amplitudes, resulting in a signal that is homogeneous over time. A straightforward extension to address
the empirical limitations of the standard model is to consider state-dependent
preferences.2 As is well known, equity risk premia are higher at business-cycle troughs than at peaks (Campbell and Cochrane 1999). In turn, a number of
papers have explored the implications for asset pricing of allowing the coefficient of
2 We thank an anonymous Referee for pointing out this extension to us.
relative risk aversion to vary with key macroeconomic aggregates. Danthine et al.
(2004) allow the pricing kernel to depend on the level of consumption, in addition
to its growth rate. In a similar vein, Gordon and St-Amour (2004) provide strong
empirical evidence for countercyclical risk aversion, rising during recessions and
falling during expansions, by postulating a model with time varying risk aversion
depending on per capita consumption. Lettau and Ludvigson (2009) show that the
leading asset pricing models fundamentally mischaracterize the observed positive
joint behavior of consumption and asset returns in recessions, when aggregate
consumption is falling. Another related extension involves the differential impact of
structural breaks, crises or ‘rare events’ in the ex post equity risk premium, which
can be correlated in their timing across countries (Barro 2006; Ghosh and Julliard
2012; Nakamura et al. 2013).
Relaxing the assumption of time invariance and allowing for a decomposition of
a series into orthogonal components according to scale (time components) gives
rise to the wavelet approach, recently applied to economics and finance in the
pioneering papers by Ramsey and Lampart (1998a) and Ramsey (1999, 2002). Wavelet
analysis encompasses both time or frequency domain approaches and can assess
simultaneously the strength of the comovement at different frequencies and how
such strength has evolved over time.3 In the context of asset pricing, Gençay
et al. (2003, 2005) and Fernandez (2006) have established that the predictions of
the Capital Asset Pricing Model are more relevant at medium-term rather than at short-term horizons. Our approach provides a further step towards
understanding the frequency components of the “equity premium puzzle” and
additional research is warranted to integrate our findings with their time-domain
counterpart in the context of wavelet analysis.
The structure of the paper is as follows. Section 2 describes briefly the methodol-
ogy employed, while Sect. 3 presents the empirical results. Section 4 concludes the
paper.
3 Crowley (2007) and Rua (2012) provide excellent surveys of wavelet analysis, which has been applied to, among others, the examination of foreign exchange data using waveform dictionaries (Ramsey and Zhang 1997), the decomposition of the economic relationships of expenditure and income (Ramsey and Lampart 1998a,b), the decomposition of the relationship between wage inflation and unemployment (Gallegati et al. 2011), and the analysis of the relationship between stock market returns and economic activity (Kim and In 2003).
where $\delta$ is the discount factor. The left-hand side of Eq. (1) is the marginal utility cost of consumption, while the right-hand side is the expected marginal utility benefit of investing in asset $i$ at time $t$, selling it at time $t+1$ and consuming the profits. Given that the investor equates marginal cost and marginal benefit, Eq. (1) describes the optimum. Dividing Eq. (1) by $U'(C_t)$ yields

$$E_t\left[(1 + R_{i,t+1})\,\delta\,\frac{U'(C_{t+1})}{U'(C_t)}\right] = 1 \qquad (2)$$

where $\delta\,U'(C_{t+1})/U'(C_t)$ is the intertemporal marginal rate of substitution of the investor, or the stochastic discount factor. Following Rubinstein (1976), Lucas (1978), Breeden (1979), Grossman and Shiller (1981), Mehra and Prescott (1985) and Campbell (2003), we employ a time-separable power utility function $U(C_t) = \frac{C_t^{1-\gamma}}{1-\gamma}$, where $\gamma$ is the coefficient of relative risk aversion, and we get from (2) that:

$$E_t\left[(1 + R_{i,t+1})\,\delta\left(\frac{C_{t+1}}{C_t}\right)^{-\gamma}\right] = 1 \qquad (3)$$
The power utility specification has many desirable features. Firstly, it is scale-
invariant when returns have constant distributions, implying that risk premia are not
influenced by increases in aggregate wealth or the scale of the economy. Secondly,
even when individuals have different initial wealth, we can still aggregate them
in a power utility function as long as each individual can be characterized by the
same power utility function. The major shortcoming of this traditionally adopted
utility function is that it restricts the elasticity of intertemporal substitution to be
the reciprocal of the coefficient of relative risk aversion. Weil (1989) and Epstein
and Zin (1991) have proposed an alternative utility specification that retains the
property of scale invariance without placing any restrictive linkages between the
coefficient of relative risk aversion and the elasticity of intertemporal substitution.
However, in this study, we concentrate on the power utility specification in order to
aid comparison with other studies on developed markets. Furthermore, Kocherlakota
(1996) reports that modifications to preferences such as those proposed by Epstein
and Zin, habit formation due to Constantinides (1990) or “keeping up with the
Joneses” as proposed by Abel (1990) fail to resolve the puzzle.
Following Hansen and Singleton (1983), we assume that the joint conditional distribution of asset returns and consumption is lognormal. Taking logs of Eq. (3) and assuming constant volatility, we get:

$$E_t[r_{i,t+1}] + \ln\delta - \gamma\,E_t[\Delta c_{t+1}] + \frac{1}{2}\left(\sigma_i^2 + \gamma^2\sigma_c^2 - 2\gamma\,\sigma_{i,c}\right) = 0 \qquad (4)$$

where $c_t \equiv \log(C_t)$, $r_{i,t} \equiv \log(1 + R_{i,t})$, $\sigma_i^2$ and $\sigma_c^2$ denote the unconditional variances of log stock return innovations and log consumption innovations respectively, and $\sigma_{i,c}$ represents the unconditional covariance of innovations between log stock returns and consumption growth. Consider now that an asset with a riskless return, $r_{f,t+1}$, exists. For this asset the return innovation variance $\sigma_f^2$ and the unconditional covariance of innovations between the log risk-free return and consumption growth, $\sigma_{f,c}$, are both zero. Equation (4) becomes:

$$r_{f,t+1} + \ln\delta - \gamma\,E_t[\Delta c_{t+1}] + \frac{\gamma^2\sigma_c^2}{2} = 0 \qquad (5)$$
Letting $e_{i,t+1} \equiv E_t[r_{i,t+1} - r_{f,t+1}]$ denote the expected excess return over the riskfree rate and subtracting Eq. (5) from Eq. (4), we get:

$$e_{i,t+1} + \frac{\sigma_i^2}{2} = \gamma\,\sigma_{i,c} \qquad (6)$$
Equation (6) suggests that the excess return on any asset over the riskless rate is constant, and therefore the risk premium on all assets is linear in expected consumption growth, with the slope coefficient $\gamma$ given by:

$$\gamma = \frac{e_{i,t+1} + 0.5\,\sigma_i^2}{\sigma_{i,c}} \qquad (7)$$
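To make Eq. (7) concrete, a minimal time-domain sketch (with simulated placeholder series rather than the Campbell 2003 data) computes the implied coefficient of relative risk aversion from sample moments:

```python
import numpy as np

rng = np.random.default_rng(42)
T = 120  # quarters

# Simulated placeholders: smooth consumption growth and volatile excess
# returns with a small positive covariance, mimicking the stylized facts.
dc = 0.005 + 0.005 * rng.standard_normal(T)                          # log consumption growth
e = 0.015 + 2.0 * (dc - dc.mean()) + 0.08 * rng.standard_normal(T)   # log excess returns

# Eq. (7): gamma = (mean excess return + 0.5 * var) / cov(returns, growth).
sigma_i2 = e.var(ddof=1)
sigma_ic = np.cov(e, dc, ddof=1)[0, 1]
gamma = (e.mean() + 0.5 * sigma_i2) / sigma_ic
print(f"implied relative risk aversion: {gamma:.0f}")
```

Because the simulated consumption growth is smooth, the implied γ comes out implausibly large, which is exactly the equity premium puzzle the chapter revisits.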
Now, departing from the time domain to the frequency domain, we can rewrite (7) for each frequency. After dropping the time subscript for simplicity, the coefficient of risk aversion at frequency $\omega$, where $\omega$ is a real variable in the range $0 \le \omega \le \pi$, is given by:

$$\gamma_\omega = \frac{e + 0.5\,f_{ee}(\omega)}{f_{ec}(\omega)} \qquad (8)$$

where $e$ denotes the excess log return of the stock market over the risk-free rate, $f_{ee}(\omega)$ denotes the spectrum of excess returns, and $f_{ec}(\omega)$ denotes the co-spectrum of consumption and excess returns.
The spectrum shows the decomposition of the variance of a series and is defined as the discrete Fourier transform of its autocovariance function:

$$f_{ee}(\omega) = \sum_{k=-\infty}^{\infty} \varphi_k\, e^{-ik\omega}$$
4 For a detailed analysis, see Hannan (1969), Anderson (1971), Koopmans (1974) and Priestley (1981).
Using the symmetry of the autocovariances, $\varphi_k = \varphi_{-k}$, along with the trigonometric property that $e^{i\omega} + e^{-i\omega} = 2\cos(\omega)$, the spectrum can be rewritten as:

$$f_{ee}(\omega) = \varphi_0 + 2\sum_{k=1}^{\infty} \varphi_k \cos(k\omega)$$
Consider now the bivariate spectrum $F_{ec}(\omega)$ for a bivariate zero-mean covariance-stationary process $Z_t = [e_t, c_t]^{\top}$ with autocovariance matrix $\Phi(\tau)$, of which the spectrum is the frequency-domain analogue. The diagonal elements of $F_{ec}(\omega)$ are the spectra of the individual processes, $f_{ee}(\omega)$ and $f_{cc}(\omega)$, while the off-diagonal ones refer to the cross-spectrum, or cross-spectral density, of $e_t$ and $c_t$. In detail:

$$F_{ec}(\omega) = \begin{bmatrix} f_{ee}(\omega) & f_{ec}(\omega) \\ f_{ce}(\omega) & f_{cc}(\omega) \end{bmatrix}$$

where $F_{ec}(\omega)$ is a Hermitian, non-negative definite matrix, i.e. $F_{ec}(\omega) = F_{ec}^{*}(\omega)$, where $F^{*}$ is the complex conjugate transpose of $F$, since $f_{ec}(\omega) = \overline{f_{ce}(\omega)}$. As is well known, the cross-spectrum $f_{ec}(\omega)$ between $e$ and $c$ is complex-valued and can be decomposed into its real and imaginary components, given here by:

$$f_{ec}(\omega) = C_{ec}(\omega) - i\,Q_{ec}(\omega)$$
where $C_{ec}(\omega)$ denotes the co-spectrum and $Q_{ec}(\omega)$ the quadrature spectrum. The measure of comovement between returns and consumption over the frequency domain is then given by:

$$c_{ec}^{2}(\omega) = \frac{C_{ec}^{2}(\omega) + Q_{ec}^{2}(\omega)}{f_{ee}(\omega)\, f_{cc}(\omega)}$$

where $0 \le c_{ec}^{2}(\omega) \le 1$ is the squared coherency, which provides a measure of the correlation between the two series at each frequency and can be interpreted intuitively as the frequency-domain analogue of the correlation coefficient.5
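As a rough cross-check (not the authors' code), squared coherency can also be estimated with off-the-shelf tools; scipy.signal.coherence uses Welch-type segment averaging rather than the Bartlett lag window employed below, so the estimates differ in detail.

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(0)
T = 200  # quarterly observations

# Placeholder series sharing a slow-moving common component, so that the
# squared coherency concentrates at low frequencies.
common = np.cumsum(rng.standard_normal(T)) / 25.0
e = common + 0.5 * rng.standard_normal(T)    # "excess returns"
dc = common + 0.5 * rng.standard_normal(T)   # "consumption growth"

# f: frequencies in cycles per quarter; c2: squared coherency at each f.
f, c2 = coherence(e, dc, fs=1.0, nperseg=64)
for freq, coh in zip(f[1:6], c2[1:6]):
    print(f"period {1.0 / freq:6.1f} quarters: squared coherency {coh:.2f}")
```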
The spectra and co-spectra of a vector of time series for a sample of $T$ observations can be estimated at the set of frequencies $\omega_n = 2\pi n / T$, $n = 1, 2, \ldots, T/2$. The relevant quantities are estimated through the periodogram, which is based on a representation of the observed time series as a superposition of sinusoidal waves of various frequencies; a frequency of $\pi$ corresponds to a time period of two observations.
5 Engle (1976) gives an early treatment of frequency-domain analysis and its time-domain counterpart.
The spectra are estimated by smoothing the sample autocovariances with a lag window:

$$\hat{f}_{ee}(\omega) = \frac{1}{2\pi} \sum_{k=-(T-1)}^{T-1} w(k)\,\hat{\varphi}_k\, e^{-ik\omega}$$

where the kernel $w(k)$ is a series of lag windows. We use Bartlett's window, which assigns linearly decreasing weights to the autocovariances and cross-covariances in the neighborhood of the frequencies considered and zero weight thereafter. The number of ordinates, $m$, is set using the rule $m = 2\sqrt{T}$, as suggested by Chatfield (1989), where $T$ is the number of observations.
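Putting the pieces together, a minimal sketch of the Bartlett-windowed spectral estimates and the implied $\gamma_\omega$ of Eq. (8), evaluated at short-, medium- and long-run periodicities (simulated placeholder data; not the chapter's estimation code):

```python
import numpy as np

def crosscov(x, y, k):
    """Sample cross-covariance between x_t and y_{t-k} for lag k >= 0."""
    x = x - x.mean()
    y = y - y.mean()
    return np.dot(x[k:], y[: x.size - k]) / x.size

def bartlett_cospectrum(x, y, omega, m):
    """Bartlett lag-window estimate of the (co-)spectrum at frequency omega."""
    s = crosscov(x, y, 0)
    for k in range(1, m + 1):
        w = 1.0 - k / (m + 1.0)  # linearly decreasing weights, zero beyond m
        s += w * (crosscov(x, y, k) + crosscov(y, x, k)) * np.cos(k * omega)
    return s / (2.0 * np.pi)

rng = np.random.default_rng(1)
T = 120
dc = 0.005 + 0.01 * rng.standard_normal(T)                           # consumption growth
e = 0.015 + 2.0 * (dc - dc.mean()) + 0.02 * rng.standard_normal(T)   # excess returns

m = int(2 * np.sqrt(T))  # number of ordinates, m = 2*sqrt(T) (Chatfield 1989)
for period in (4, 16, 64):  # quarters: short, medium and long run
    omega = 2.0 * np.pi / period
    fee = bartlett_cospectrum(e, e, omega, m)   # spectrum of excess returns
    fec = bartlett_cospectrum(e, dc, omega, m)  # co-spectrum with consumption
    gamma = (e.mean() + 0.5 * fee) / fec        # Eq. (8)
    print(f"period {period:3d} quarters: gamma_omega = {gamma:,.0f}")
```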
3 Empirical Findings
To calculate the coefficient of risk aversion from (8) we use the Campbell (2003) dataset, which combines quarterly data on consumption, interest rates and prices. In more detail, returns are calculated from stock market data sourced from Morgan Stanley Capital International (MSCI), while macroeconomic data on consumption, short-term interest rates and price levels are sourced from the International Financial Statistics (IFS).7 We present our estimates only for the countries for which at least 100 observations are available in the dataset, namely Australia (1970:1–1998:4), Canada (1970:1–1998:4), France (1973:2–1998:3), Italy (1971:2–1998:1), Japan (1970:2–1998:4), Sweden (1970:1–1999:2), UK (1970:1–1999:1), and the US (1947:2–1998:3 and 1970:1–1998:3). To allow for a direct comparison with the evidence in Campbell (2003), we present two measures of risk aversion. The first, termed RRA(1), is calculated directly from (8), whereas the second, denoted RRA(2), assumes a unitary correlation of excess returns with consumption growth. Although this is a counterfactual exercise, we follow Campbell (2003) closely and postulate a unitary elasticity between returns and consumption growth to account for the sensitivity of the implied risk aversion to the smoothness of consumption rather than to its low correlation with excess returns. We then identify the short-run
6 For example, the periodogram of $f_{ee}(\omega)$ is given by $I_{ee}(\omega) = \hat{g}_0 + 2\sum_{k=1}^{T-1} \hat{g}_k \cos(k\omega)$.
7 The data are available from http://scholar.harvard.edu/campbell/data. Details on sources and data transformations are given in Campbell (2003).
continues to hold under the assumption of a unitary elasticity between excess returns
and consumption growth and is in line with the findings typically reported in the
literature on the C-CAPM.
Table 2 performs the same exercise for the medium-run, or business-cycle, frequencies. As the time horizon increases, the variabilities of consumption growth and returns, along with their correlation, in general increase. The medium-run coherency exceeds 0.59 for all the countries at hand and reaches 0.80 for the US. As a result, risk aversion is in general lower, but it is still found to be implausibly high, exceeding the value of 10 even when a unitary correlation is imposed. The lowest estimates, 21.9 for RRA(1) and 17.3 for RRA(2), are both found for Italy. Thus we find that the equity premium puzzle persists at business-cycle frequencies.
Next, we turn our attention to the long run, i.e. the low frequencies, where we find that the performance of the C-CAPM improves substantially. As shown in Table 3, the coefficients of risk aversion now range from 5.0 (Australia) to 28.5 (Sweden).
When a unitary correlation coefficient is imposed, these estimates are slightly
reduced for all the countries at hand and range from 4.1 to 28.1. This improvement
in the low-frequency estimates of relative risk aversion is driven by the spectral
properties of the data at hand. As we move to lower frequencies, the variability of
consumption growth, $\sigma_c^2$, increases significantly, reaching up to ten times its high-frequency value and matching the variability of log excess returns, $\sigma_e^2$; as a result, the covariance of returns and consumption increases. For most of the countries at hand, this property is coupled with a rise in the estimated coherency between consumption and returns.8
4 Conclusions
This paper re-addresses the empirical issue of implausibly high risk aversion within the context of the C-CAPM by looking at the pattern of risk aversion over the frequency domain. Our results show that as lower frequencies are taken into account, risk aversion falls substantially across countries and, in many cases, is consistent with more reasonable values of the coefficient of risk aversion. This evidence marks some progress towards understanding the dynamics of the C-CAPM by reconciling its standard single-factor version with lower values of risk aversion; over the frequency domain, the equity premium thus appears to be less of a puzzle.
However, we emphasize that a limitation of our paper is that the point estimates
of long-run risk aversion remain relatively high and a more in-depth analysis is
warranted to align the model with reasonable coefficients of risk aversion across
countries. To this end, a number of studies provide interesting insights on the cross-
country aspects of the ‘equity premium puzzle’. For instance, Bekaert (1995) and Henry (2000) point out that the cost of capital decreases as markets become integrated, since risk sharing implies a lower required risk premium. Wavelet analysis,
which can assess simultaneously the strength of the comovement at different
frequencies and how this strength has evolved over time, offers a promising route
for further research in this area.
References
Abel AB (1990) Asset prices under habit formation and catching up with the Joneses. Am Econ
Rev 80:38–42
Anderson T (1971) The statistical analysis of time series. Wiley, New York
Bansal R, Yaron A (2004) Risks for the long run: a potential resolution of asset pricing puzzles. J
Financ 59:1481–1509
Bansal R, Dittmar RF, Lundblad CT (2005) Consumption, dividends, and the cross section of
equity returns. J Financ 60:1639–72
8 It is worth mentioning that the coherency is not maximised at the lowest frequency for all countries. The coherency reaches its maximum in the short run (2–6 quarters) for Italy and Japan and in the medium run (2–4 years) for Australia and Canada.
Barberis N, Huang M, Santos T (2001) Prospect theory and asset prices. Q J Econ 116:1–53
Barro RJ (2006) Rare disasters and asset markets in the twentieth century. Q J Econ 121:823–66
Bekaert G (1995) Market integration and investment barriers in emerging equity markets. World
Bank Econ Rev 9:75–107
Berkowitz J (2001) Generalized spectral estimation of the consumption-based asset pricing model.
J Econom 104:269–288
Brainard WC, Nelson WR, Shapiro MD (1991) The consumption beta explains expected returns at
long horizons. mimeo, Yale University
Breeden DT (1979) An intertemporal asset pricing model with stochastic consumption and
investment opportunities. J Financ Econ 7:265–296
Campbell JY (1996) Consumption and the stock market: interpreting international experience.
NBER Working Papers 5610
Campbell JY (2003) Consumption-based asset pricing. In: Constantinides G, Harris M, Stulz R (eds) Handbook of the economics of finance. North-Holland, Amsterdam
Campbell JY, Cochrane J (1999) Force of habit: a consumption-based explanation of aggregate
stock market behavior. J Polit Econ 107:205–251
Chatfield C (1989) The analysis of time series. Chapman and Hall, London
Cochrane J (2005) Financial markets and the real economy. Found Trends Financ 1:1–101
Cogley T (2001) A frequency decomposition of approximation errors in stochastic discount factor
models. Int Econ Rev 42:473–503
Constantinides G (1990) Habit formation: a resolution to the equity premium puzzle. J Polit Econ
98:519–543
Crowley P (2007) A guide to wavelets for economists. J Econ Surv 21:207–264
Daniel K, Marshall DA (1997) Equity-premium and risk free-rate puzzles at long horizons.
Macroecon Dyn 1:452–484
Danthine J-P, Donaldson JB, Giannikos C, Guirguis H (2004) On the consequences of state
dependent preferences for the pricing of financial assets. Financ Res Lett 1:143–153
Engle RF (1976) Interpreting spectral analysis in terms of time-domain models. Ann Econ Soc
Meas 5:89–109
Epstein L, Zin SE (1991) Substitution, risk aversion and the temporal behaviour of consumption
growth and asset returns: an empirical investigation. J Polit Econ 99:263–286
Fama EF, French KR (1992) The cross-section of expected stock returns. J Financ 47:427–465
Fernandez V (2006) The CAPM and value at risk at different time-scales. Int Rev Financ Anal
15:203–219
Gallegati M, Gallegati M, Ramsey JB, Semmler W (2011) The US wage Phillips curve across frequencies and over time. Oxford Bull Econ Stat 73:489–508
Gençay R, Whitcher B, Selçuk F (2003) Systematic risk and time scales. Quant Financ 3:108–16
Gençay R, Whitcher B, Selçuk F (2005) Multiscale systematic risk. J Int Money Financ 24:55–70
Ghosh A, Julliard C (2012) Can rare events explain the equity premium puzzle? Rev Financ Stud
25:3037–3076
Gordon S, St-Amour P (2004) Asset returns with state-dependent risk preferences. J Bus Econ Stat
22:241–252
Granger CWJ, Hatanaka M (1964) Spectral analysis of economic time series. Princeton University
Press, Princeton
Grossman SJ, Shiller RJ (1981) The determinants of the variability of stock market prices. Am
Econ Rev 71:222–227
Grüne L, Semmler W (2008) Asset pricing with loss aversion. J Econ Dyn Control 32:3253–3374
Hannan EJ (1969) Multiple time series. Wiley, New York
Hansen LP, Singleton KJ (1983) Stochastic consumption, risk aversion and the temporal behavior of asset returns. J Polit Econ 91:249–268
Hansen LP, Heaton JC, Li N (2008) Consumption strikes back? Measuring long run risk. J Polit Econ 116:260–302
Henry P (2000) Market integration, economic reform, and emerging market equity prices. J Financ
55:529–564
Kalyvitis S, Panopoulou E (2013) Estimating C-CAPM and the equity premium over the frequency
domain. Stud Nonlinear Dyn Econom 17(5):551–572
Kim S, In FH (2003) The relationship between financial variables and real economic activity:
evidence from spectral and wavelet analyses. Stud Nonlinear Dyn Econom 7:1–18
Kocherlakota NR (1996) The equity premium: it's still a puzzle. J Econ Lit 34:42–71
Koopmans LH (1974) The spectral analysis of time series. Academic, New York
Lettau M, Ludvigson SC (2009) Euler equation errors. Rev Econ Dyn 12:255–283
Lucas R (1978) Asset prices in an exchange economy. Econometrica 46:1429–1445
Mehra R (2003) The equity premium: why is it a puzzle? Finan Anal J 59:54–69
Mehra R, Prescott EC (1985) The equity premium: a puzzle. J Monet Econ 15:145–161
Nakamura E, Steinsson J, Barro RJ, Ursúa J (2013) Crises and recoveries in an empirical model of
consumption disasters. Am Econ J Macroecon 5:35–74
Parker JA (2001) The consumption risk of the stock market. Brookings Pap Econ Act 2:279–348
Parker JA (2003) Consumption risk and expected stock returns. Am Econ Rev Pap Proc 93:376–
382
Parker JA, Julliard C (2005) Consumption risk and the cross-section of expected returns. J Polit
Econ 113:185–222
Priestley MB (1981) Spectral analysis and time series, vol I and II. Academic, New York
Ramsey JB (1999) The contribution of wavelets to the analysis of economic and financial data.
Phil Trans R Soc A Math Phys Eng Sci 357:2593–2606
Ramsey JB (2002) Wavelets in economics and finance: past and future. Stud Nonlinear Dyn Econ
6(3). doi:10.2202/1558-3708.1090
Ramsey JB, Lampart C (1998a) Decomposition of economic relationships by time scale using
wavelets. Macroecon Dyn 2:49–71
Ramsey JB, Lampart C (1998b) The decomposition of economic relationship by time scale using
wavelets: expenditure and income. Stud Nonlinear Dyn Econ 3:23–42
Ramsey JB, Zhang Z (1997) The analysis of foreign exchange data using waveform dictionaries. J
Empir Financ 4:341–372
Rua A (2012) Wavelets in economics. Economic Bulletin and Financial Stability Report Articles,
Bank of Portugal
Rubinstein M (1976) The valuation of uncertain income streams and the pricing of options. Bell J
Econ 7:407–425
Semmler W, Grüne L, Örlein C (2009) Dynamic consumption and portfolio decisions with time varying asset returns. J Wealth Manage 12:21–47
varying asset returns. J Wealth Manage 12:21–47
Weil P (1989) The equity premium puzzle and the risk-free rate puzzle. J Monet Econ 24:401–421