Complete Coin Teg Example


Long-Run Covariance Estimation

The long-run (variance) covariance matrix (LRCOV) occupies an important
role in modern econometric analysis. This matrix is, for example, central to
the calculation of efficient GMM weighting matrices (Hansen 1982) and
heteroskedasticity and autocorrelation consistent (HAC) robust standard
errors (Newey and West 1987), and is employed in unit root (Phillips and
Perron 1988) and cointegration analysis (Phillips and Hansen 1990, Hansen
1992b).
EViews offers tools for computing symmetric LRCOV and the one-sided
LRCOV using nonparametric kernel (Newey-West 1987, Andrews 1991),
parametric VARHAC (Den Haan and Levin 1997), and prewhitened kernel
(Andrews and Monahan 1992) methods. In addition, EViews supports
Andrews (1991) and Newey-West (1994) automatic bandwidth selection
methods for kernel estimators, and information criteria based lag length
selection methods for VARHAC and prewhitening estimation.

Technical Discussion
Nonparametric Kernel
Kernel Functions
Bandwidth
Automatic Bandwidth Selection
Andrews Automatic Selection
Newey-West Automatic Selection
Parametric VARHAC
Prewhitened Kernel

Our basic discussion and notation follows the framework of Andrews (1991)
and Hansen (1992a).

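Consider a sequence of mean-zero random $p$-vectors $\{V_t(\theta)\}$ that may
depend on a $K$-vector of parameters $\theta$, and let $V_t \equiv V_t(\theta_0)$ where
$\theta_0$ is the true value of $\theta$. We are interested in estimating the LRCOV
matrix $\Omega$,

$$\Omega = \sum_{j=-\infty}^{\infty} \Gamma(j) \tag{F.1}$$

where

$$\Gamma(j) = \begin{cases} E\left(V_t V_{t-j}'\right), & j \ge 0 \\ \Gamma(-j)', & j < 0 \end{cases} \tag{F.2}$$

is the autocovariance matrix of $V_t$ at lag $j$. When $V_t$ is second-order
stationary, $\Omega$ equals $2\pi$ times the spectral density matrix of $V_t$
evaluated at frequency zero (Hansen 1982, Andrews 1991, Hamilton 1994).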

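Closely related to $\Omega$ are two measures of the one-sided LRCOV matrix:

$$\Lambda_1 = \sum_{j=1}^{\infty} \Gamma(j), \qquad \Lambda_0 = \sum_{j=0}^{\infty} \Gamma(j) = \Gamma(0) + \Lambda_1 \tag{F.3}$$

The matrix $\Lambda_1$, which we term the strict one-sided LRCOV, is the sum of
the lag covariances, while $\Lambda_0$ also includes the contemporaneous
covariance $\Gamma(0)$. The two-sided LRCOV matrix $\Omega$ is related to the
one-sided matrices through $\Omega = \Gamma(0) + \Lambda_1 + \Lambda_1'$ and
$\Omega = \Lambda_0 + \Lambda_0' - \Gamma(0)$.
Despite the important role the one-sided LRCOV matrix plays in the
literature, we will focus our attention on $\Omega$, since results are generally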
applicable to all three measures; exception will be made for specific issues
that require additional comment.
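
As an illustrative check of the relationships among the three measures, consider a scalar AR(1) process with coefficient rho and innovation variance sigma2. This process is an assumption made only for this sketch; its autocovariances are sigma2 * rho^|j| / (1 - rho^2), and in the scalar case the transpose in the identities is trivial. The Python sketch below truncates the infinite sums at a large lag; the function name and parameter values are hypothetical.

```python
# Illustrative sketch: scalar AR(1) autocovariances, assumed for this example only.
def ar1_autocov(rho, sigma2, j):
    """Gamma(j) for V_t = rho * V_{t-1} + e_t with Var(e_t) = sigma2."""
    return sigma2 * rho ** abs(j) / (1.0 - rho ** 2)

rho, sigma2 = 0.5, 1.0   # hypothetical parameter values
J = 200                  # truncation point standing in for infinity

gamma0 = ar1_autocov(rho, sigma2, 0)
lambda1 = sum(ar1_autocov(rho, sigma2, j) for j in range(1, J + 1))   # strict one-sided
lambda0 = gamma0 + lambda1                                            # one-sided
omega = sum(ar1_autocov(rho, sigma2, j) for j in range(-J, J + 1))    # two-sided

# Scalar case: Lambda_1' = Lambda_1, so both identities reduce to simple sums.
print(omega, gamma0 + 2 * lambda1, 2 * lambda0 - gamma0)
# Closed-form long-run variance of an AR(1), for comparison.
print(sigma2 / (1.0 - rho) ** 2)
```

All printed values should agree up to floating-point error, which is one way to see that the one-sided matrices carry the same information as the two-sided matrix once the contemporaneous covariance is known.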
In the econometric literature, methods for using a consistent
estimator $\hat{\theta}$ and the corresponding $\hat{V}_t = V_t(\hat{\theta})$ to form a
consistent estimate of $\Omega$ are often referred to as heteroskedasticity and
autocorrelation consistent (HAC) covariance matrix estimators.

There have been three primary approaches to estimating $\Omega$:

1. The nonparametric kernel approach (Andrews 1991, Newey-West 1987) forms
estimates of $\Omega$ by taking a weighted sum of the sample autocovariances of
the observed data.
2. The parametric VARHAC approach (Den Haan and Levin 1997) specifies and
fits a parametric time series model to the data, then uses the estimated
model to obtain the implied autocovariances and corresponding $\Omega$.
3. The prewhitened kernel approach (Andrews and Monahan 1992) is a hybrid
method that combines the first two approaches, using a parametric model to
obtain residuals that “whiten” the data, and a nonparametric kernel
estimator to obtain an estimate of the LRCOV of the whitened data. The
estimate of $\Omega$ is obtained by “recoloring” the prewhitened LRCOV to undo
the effects of the whitening transformation.
Below, we offer a brief description of each of these approaches, paying
particular attention to issues of kernel choice, bandwidth selection, and lag
selection.

Nonparametric Kernel
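The class of kernel HAC covariance matrix estimators in Andrews (1991)
may be written as:

$$\hat{\Omega} = \frac{T}{T-K} \sum_{j=-\infty}^{\infty} k\!\left(\frac{j}{b_T}\right) \hat{\Gamma}(j) \tag{F.4}$$

where the sample autocovariances $\hat{\Gamma}(j)$ are given by

$$\hat{\Gamma}(j) = \begin{cases} \dfrac{1}{T} \sum_{t=j+1}^{T} \hat{V}_t \hat{V}_{t-j}', & j \ge 0 \\ \hat{\Gamma}(-j)', & j < 0 \end{cases} \tag{F.5}$$

$k$ is a symmetric kernel (or lag window) function that, among other
conditions, is continuous at the origin and satisfies $|k(x)| \le 1$ for
all $x$ with $k(0) = 1$, and $b_T > 0$ is a bandwidth parameter. The
leading $T/(T-K)$ term is an optional correction for degrees-of-freedom
associated with the estimation of the $K$ parameters in $\theta$.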
The choice of a kernel function and a value for the bandwidth parameter
completely characterizes the kernel HAC estimator.
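
A minimal Python sketch of Equations (F.4) and (F.5) follows. It is an illustration only, not the EViews implementation: the function names are hypothetical, the data are assumed to be mean zero and stored as a list of T rows of length p, the Bartlett weight 1 - |x| on [-1, 1] is used as the example kernel, and `n_params` plays the role of the number of estimated parameters in the optional degrees-of-freedom correction.

```python
def sample_autocov(V, j):
    """Sample autocovariance at lag j as in (F.5): (1/T) * sum V_t V_{t-j}'
    for j >= 0, and the transpose of the lag -j matrix for j < 0."""
    T, p = len(V), len(V[0])
    if j < 0:
        G = sample_autocov(V, -j)
        return [[G[c][r] for c in range(p)] for r in range(p)]
    G = [[0.0] * p for _ in range(p)]
    for t in range(j, T):
        for r in range(p):
            for c in range(p):
                G[r][c] += V[t][r] * V[t - j][c] / T
    return G

def bartlett(x):
    """Bartlett kernel weight (used here only as an example kernel)."""
    return max(0.0, 1.0 - abs(x))

def kernel_lrcov(V, kernel, bandwidth, n_params=0):
    """Kernel LRCOV estimate as in (F.4); n_params = 0 disables the d.o.f. correction."""
    T, p = len(V), len(V[0])
    omega = [[0.0] * p for _ in range(p)]
    for j in range(-(T - 1), T):          # sample autocovariances vanish for |j| >= T
        w = kernel(j / bandwidth)
        if w == 0.0:
            continue
        G = sample_autocov(V, j)
        for r in range(p):
            for c in range(p):
                omega[r][c] += w * G[r][c]
    scale = T / (T - n_params)
    return [[scale * x for x in row] for row in omega]

# Made-up data with T = 4 observations on p = 2 variables.
V = [[1.0, 0.5], [-1.0, 0.2], [1.0, -0.3], [-1.0, 0.4]]
omega_hat = kernel_lrcov(V, bartlett, bandwidth=2.0)
print(omega_hat)
```

Because the lag j and lag -j terms enter with the same weight and are transposes of one another, the resulting estimate is symmetric by construction.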
Kernel Functions
There are a large number of kernel functions that satisfy the required
conditions. EViews supports use of the following kernel shapes:

Truncated uniform
Bartlett
Bohman
Daniell
Parzen
Parzen-Riesz
Parzen-Geometric
Parzen-Cauchy
Quadratic Spectral
Tukey-Hamming
Tukey-Hanning
Tukey-Parzen

Note that $k(x) = 0$ for $|x| > 1$ for all kernels with the exception of the Daniell
and the Quadratic Spectral. The Daniell kernel is presented in truncated form
in Neave (1972), but EViews uses the more common untruncated form. The
Bartlett kernel is sometimes referred to as the Fejer kernel (Neave 1972).
A wide range of kernels have been employed in HAC estimation. The
truncated uniform is used by Hansen (1982) and White (1984), the Bartlett
kernel is used by Newey and West (1987), and the Parzen is used by Gallant
(1987). The Tukey-Hanning and Quadratic Spectral were introduced to the
econometrics literature by Andrews (1991), who shows that the latter is
optimal in the sense of minimizing the asymptotic truncated MSE of the
estimator (within a particular class of kernels). The remaining kernels are
discussed in Parzen (1958, 1961, 1967).
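
For concreteness, the following Python sketch codes four of the kernel shapes using their standard forms from the literature (e.g., Andrews 1991). The formulas are supplied here as an assumption, since the kernel definitions are not reproduced in the text above, and the function names are hypothetical.

```python
import math

def truncated(x):
    """Truncated uniform: weight 1 on [-1, 1], 0 elsewhere."""
    return 1.0 if abs(x) <= 1.0 else 0.0

def bartlett(x):
    """Bartlett: linearly declining weight 1 - |x| on [-1, 1]."""
    return 1.0 - abs(x) if abs(x) <= 1.0 else 0.0

def parzen(x):
    """Parzen: piecewise cubic on [-1, 1]."""
    a = abs(x)
    if a <= 0.5:
        return 1.0 - 6.0 * a * a * (1.0 - a)
    if a <= 1.0:
        return 2.0 * (1.0 - a) ** 3
    return 0.0

def quadratic_spectral(x):
    """Quadratic Spectral: non-zero weight at (almost) all x, so no truncation."""
    if x == 0.0:
        return 1.0
    z = 6.0 * math.pi * x / 5.0
    return 25.0 / (12.0 * math.pi ** 2 * x ** 2) * (math.sin(z) / z - math.cos(z))

for name, k in [("truncated", truncated), ("bartlett", bartlett),
                ("parzen", parzen), ("quadratic spectral", quadratic_spectral)]:
    # Weight at the origin, at the endpoint |x| = 1, and beyond the endpoint.
    print(name, k(0.0), k(1.0), k(1.5))
```

The printout shows the pattern described above: the first three kernels vanish beyond the endpoint, while the Quadratic Spectral kernel continues to assign (possibly negative) weight there.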
Bandwidth
The bandwidth $b_T$ operates in concert with the kernel function to determine
the weights for the various sample autocovariances in Equation (F.4). While
some authors restrict the bandwidth values to integers, we follow Andrews
(1991), who argues in favor of allowing real-valued bandwidths.
To construct an operational nonparametric kernel estimator, we must choose
a value for the bandwidth $b_T$. Under general conditions (Andrews 1991),
consistency of the kernel estimator requires that $b_T$ is chosen so
that $b_T \to \infty$ and $b_T / T \to 0$ as $T \to \infty$. Alternately, Kiefer
and Vogelsang (2002) propose setting $b_T = T$ in a testing context.

For the great majority of supported kernels $k(j / b_T) = 0$ for $|j| > b_T$ so that
the bandwidth acts indirectly as a lag truncation parameter. Relating $b_T$ to
the corresponding integer number of included lags $m$ requires, however,
examining the properties of the kernel at the endpoints ($|j / b_T| = 1$). For
kernel functions where $k(1) \ne 0$ (e.g., Truncated, Parzen-Geometric, Tukey-
Hamming), $b_T$ is simply a real-valued truncation lag, with at
most $\lfloor b_T \rfloor$ autocovariances having non-zero weight. Alternately, for
kernel functions where $k(1) = 0$ (e.g., Bartlett, Bohman, Parzen), the
relationship is slightly more complex, with $\lceil b_T \rceil - 1$ autocovariances
entering the estimator with non-zero weights.
The varying relationship between the bandwidth and the lag-truncation
parameter implies that one should examine the kernel function when
choosing bandwidth values to match computations that are quoted in lag
truncation form. For example, matching Newey-West’s (1987) Bartlett kernel
estimator which uses $m$ weighted autocovariance lags requires
setting $b_T = m + 1$. In contrast, Hansen’s (1982) or White’s (1984)
estimators, which sum the first $m$ unweighted autocovariances, should be
implemented using the Truncated kernel with $b_T = m$.
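
The bandwidth-to-lag mapping can be checked with a short Python sketch. The two kernels are coded in their standard forms (an assumption, since their formulas are not reproduced in the text), and the helper name `nonzero_lags` and the choice of four lags are hypothetical.

```python
def truncated(x):
    """Truncated uniform kernel: k(1) = 1, so the endpoint lag is included."""
    return 1.0 if abs(x) <= 1.0 else 0.0

def bartlett(x):
    """Bartlett kernel: k(1) = 0, so the endpoint lag receives zero weight."""
    return 1.0 - abs(x) if abs(x) <= 1.0 else 0.0

def nonzero_lags(kernel, bandwidth, max_lag):
    """Positive lags j <= max_lag whose weight k(j / bandwidth) is non-zero."""
    return [j for j in range(1, max_lag + 1) if kernel(j / bandwidth) != 0.0]

m = 4  # hypothetical number of lags quoted in lag-truncation form

# Bartlett needs bandwidth m + 1 to give non-zero weight to lags 1..m.
bartlett_lags = nonzero_lags(bartlett, m + 1, 20)
bartlett_weights = [bartlett(j / (m + 1)) for j in bartlett_lags]

# The Truncated kernel needs bandwidth m to include lags 1..m with unit weight.
truncated_lags = nonzero_lags(truncated, m, 20)

print(bartlett_lags, bartlett_weights)
print(truncated_lags)
```

With the Bartlett bandwidth set to the number of lags plus one, the weights decline linearly across the included lags and reach zero at the next lag, which is the lag-truncation form of the Newey-West (1987) estimator.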
Automatic Bandwidth Selection
Theoretical results on the relationship between bandwidths and the
asymptotic truncated MSE of the kernel estimator provide finer
discrimination in the rates at which bandwidths should increase. The optimal
bandwidths may be written in the form:

$$b_T^* = \gamma \, T^{1/(2q+1)} \tag{F.6}$$

where $\gamma$ is a constant, and $q$ is a parameter that depends on the kernel
function that you select (Parzen 1958, Andrews 1991). For the Bartlett and
Parzen-Geometric kernels ($q = 1$), $b_T^*$ should grow (at most) at the rate $T^{1/3}$.
The Truncated kernel does not have a theoretical optimal rate, but Andrews
(1991) reports Monte Carlo simulations that suggest that $T^{1/5}$ works well.
The remaining EViews supported kernels have $q = 2$ so their optimal
bandwidths grow at rate $T^{1/5}$ (though we point out that the Daniell kernel
does not satisfy the conditions for the optimal bandwidth theorems).
While theoretically useful, knowledge of the rate at which bandwidths should
increase as $T \to \infty$ does not tell us the optimal bandwidth for a given sample
size, since the constant $\gamma$ remains unspecified.
Andrews (1991) and Newey and West (1994) offer two approaches to
estimating $\gamma$. We may term these techniques automatic bandwidth selection
methods, since they involve estimating the optimal bandwidth from the data,
rather than specifying a value a priori. Both the Andrews and Newey-West
estimators for $\gamma$ may be written as:

$$\hat{\gamma} = c_k \left( \hat{\alpha}(q) \right)^{1/(2q+1)} \tag{F.7}$$

where $q$ and the constant $c_k$ depend on properties of the selected kernel
and $\hat{\alpha}(q)$ is an estimator of $\alpha(q)$, a measure of the smoothness of the
spectral density at frequency zero that depends on the autocovariances
$\Gamma(j)$. Substituting into Equation (F.6), the resulting plug-in estimator for the
optimal automatic bandwidth is given by:

$$\hat{b}_T^* = c_k \left( \hat{\alpha}(q) \, T \right)^{1/(2q+1)} \tag{F.8}$$

The $\hat{\alpha}(q)$ that one uses depends on properties of the selected kernel function.
The Bartlett and Parzen-Geometric kernels should use $\hat{\alpha}(1)$ since they
have $q = 1$. $\hat{\alpha}(2)$ should be used for the other EViews supported kernels
which have $q = 2$. The Truncated kernel does not have a theoretically
prescribed choice, but Andrews recommends using $\hat{\alpha}(2)$. The Daniell kernel
has $q = 2$, though we remind you that it does not satisfy the conditions for
Andrews’s theorems. “Kernel Function Properties” summarizes the values
of $c_k$ and $q$ for the various kernel functions.
It is of note that the Andrews and Newey-West estimators both require an
estimate of $\alpha(q)$ that requires forming preliminary estimates of $\Omega$ and the
smoothness of $\Omega$. Andrews and Newey-West offer alternative methods for
forming these estimates.
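
As a small illustration of the plug-in formula in Equation (F.8), the Python sketch below evaluates the bandwidth for a given smoothness estimate and sample size. The kernel constants and exponents are those reported by Andrews (1991) for four common kernels; they are supplied here as an assumption because the “Kernel Function Properties” table is not reproduced in this section. The dictionary and function names are hypothetical.

```python
# (q, c_k) pairs as reported by Andrews (1991); not taken from the text above.
KERNEL_CONSTANTS = {
    "bartlett": (1, 1.1447),
    "parzen": (2, 2.6614),
    "tukey-hanning": (2, 1.7462),
    "quadratic spectral": (2, 1.3221),
}

def plugin_bandwidth(kernel_name, alpha_hat, T):
    """Plug-in bandwidth c_k * (alpha_hat * T) ** (1 / (2q + 1)) as in (F.8)."""
    q, c_k = KERNEL_CONSTANTS[kernel_name]
    return c_k * (alpha_hat * T) ** (1.0 / (2 * q + 1))

# Hypothetical smoothness estimate and sample size.
for name in KERNEL_CONSTANTS:
    print(name, plugin_bandwidth(name, alpha_hat=1.5, T=200))
```

The result is generally not an integer, which is consistent with the real-valued bandwidths discussed earlier.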
Andrews Automatic Selection
The Andrews (1991) method estimates $\alpha(q)$ parametrically: fitting a simple
parametric time series model to the original data, then deriving the
autocovariances $\Gamma(j)$ and corresponding $\alpha(q)$ implied by the estimated
model.

Andrews derives $\hat{\alpha}(q)$ formulae for several parametric models, noting that
the choice between specifications depends on a tradeoff between simplicity
and parsimony on one hand and flexibility on the other. EViews employs the
parsimonious approach used by Andrews in his Monte Carlo simulations,
estimating $p$ univariate AR(1) models (one for each element of $\hat{V}_t$), then
combining the estimated coefficients into an estimator for $\alpha(q)$.
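For the univariate AR(1) approach, we have:

$$\hat{\alpha}(q) = \frac{\sum_{s=1}^{p} w_s \left( \hat{f}_s^{(q)} \right)^2}{\sum_{s=1}^{p} w_s \left( \hat{f}_s^{(0)} \right)^2} \tag{F.9}$$

where $\hat{f}_s^{(q)}$ are parametric estimators of the smoothness of the spectral
density for the $s$-th variable (Parzen’s (1957) $q$-th generalized spectral
derivatives) at frequency zero. Estimators for $\hat{f}_s^{(q)}$ are given by:

$$\hat{f}_s^{(q)} = \frac{1}{2\pi} \sum_{j=-\infty}^{\infty} |j|^q \, \tilde{\Gamma}_s(j) \tag{F.10}$$

for $s = 1, \ldots, p$ and $q = 0, 1, 2$, where $\tilde{\Gamma}_s(j)$ are the estimated
autocovariances at lag $j$ implied by the univariate AR(1) specification for
the $s$-th variable.

Substituting the univariate AR(1) estimated coefficients $\hat{\rho}_s$ and standard
errors $\hat{\sigma}_s$ into the theoretical expressions for $\tilde{\Gamma}_s(j)$, we have:

$$\hat{\alpha}(1) = \frac{\sum_{s=1}^{p} w_s \dfrac{4 \hat{\rho}_s^2 \hat{\sigma}_s^4}{(1-\hat{\rho}_s)^6 (1+\hat{\rho}_s)^2}}{\sum_{s=1}^{p} w_s \dfrac{\hat{\sigma}_s^4}{(1-\hat{\rho}_s)^4}}, \qquad \hat{\alpha}(2) = \frac{\sum_{s=1}^{p} w_s \dfrac{4 \hat{\rho}_s^2 \hat{\sigma}_s^4}{(1-\hat{\rho}_s)^8}}{\sum_{s=1}^{p} w_s \dfrac{\hat{\sigma}_s^4}{(1-\hat{\rho}_s)^4}} \tag{F.11}$$

which may be inserted into Equation (F.8) to obtain expressions for the
optimal bandwidths.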

Lastly, we note that the expressions for $\hat{\alpha}(q)$ depend on the weighting
vector $w$ which governs how we combine the individual $\hat{f}_s^{(q)}$ into a single
measure of relative smoothness. Andrews suggests using either $w_s = 1$ for
all $s$ or $w_s = 1$ for all but the instrument corresponding to the intercept in
regression settings. EViews adopts the first suggestion, setting $w_s = 1$ for
all $s$.
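
The Andrews AR(1) plug-in calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the EViews implementation: each series is fit by OLS without an intercept, the innovation variance uses the number of residuals as its divisor, the smoothness formulas are the standard Andrews (1991) AR(1) expressions, and the function names are hypothetical.

```python
def fit_ar1(v):
    """OLS fit of v_t = rho * v_{t-1} + e_t (no intercept); returns (rho, sigma2)."""
    num = sum(v[t] * v[t - 1] for t in range(1, len(v)))
    den = sum(v[t - 1] ** 2 for t in range(1, len(v)))
    rho = num / den
    resid = [v[t] - rho * v[t - 1] for t in range(1, len(v))]
    sigma2 = sum(e * e for e in resid) / len(resid)   # divisor is an assumption
    return rho, sigma2

def andrews_alpha(rhos, sigma2s, weights=None):
    """AR(1)-based alpha_hat(1) and alpha_hat(2); weights default to w_s = 1.
    sigma2s holds innovation variances, so sigma^4 is sigma2 ** 2."""
    if weights is None:
        weights = [1.0] * len(rhos)
    terms = list(zip(weights, rhos, sigma2s))
    denom = sum(w * s2 ** 2 / (1 - r) ** 4 for w, r, s2 in terms)
    num1 = sum(w * 4 * r ** 2 * s2 ** 2 / ((1 - r) ** 6 * (1 + r) ** 2)
               for w, r, s2 in terms)
    num2 = sum(w * 4 * r ** 2 * s2 ** 2 / (1 - r) ** 8 for w, r, s2 in terms)
    return num1 / denom, num2 / denom

# Two made-up mean-zero series standing in for the columns of the data.
series = [
    [0.3, -0.1, 0.4, 0.2, -0.5, -0.2, 0.1, 0.6],
    [1.0, 0.7, 0.2, -0.4, -0.6, -0.1, 0.3, 0.5],
]
fits = [fit_ar1(v) for v in series]
alpha1, alpha2 = andrews_alpha([f[0] for f in fits], [f[1] for f in fits])
print(alpha1, alpha2)
```

Either value can then be passed to the plug-in bandwidth formula in Equation (F.8), with the choice between them determined by the kernel.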
Newey-West Automatic Selection
Newey-West (1994) employ a nonparametric approach to estimating $\alpha(q)$.
In contrast to Andrews, who computes parametric estimates of the
individual $f_s^{(q)}$, Newey-West uses a Truncated kernel estimator to estimate
the $f^{(q)}$ corresponding to aggregated data.
First, Newey and West define, for various lags, the scalar autocovariance
estimators:

$$\hat{\sigma}(j) = \frac{1}{T} \sum_{t=j+1}^{T} w' \hat{V}_t \hat{V}_{t-j}' w = w' \hat{\Gamma}(j) w \tag{F.12}$$

The $\hat{\sigma}(j)$ may either be viewed as the sample autocovariance of a weighted
linear combination of the data using weights $w$, or as a weighted
combination of the sample autocovariances.

Next, Newey and West use the $\hat{\sigma}(j)$ to compute nonparametric truncated
kernel estimators of the Parzen measures of smoothness:

$$\hat{f}^{(q)} = \frac{1}{2\pi} \sum_{j=-n}^{n} |j|^q \, \hat{\sigma}(j) \tag{F.13}$$

for $q = 0, 1, 2$. These nonparametric estimators are weighted sums of the
scalar autocovariances $\hat{\sigma}(j)$ obtained above for $j$ from $-n$ to $n$, where $n$,
which Newey and West term the lag selection parameter, may be viewed as
the bandwidth of a kernel estimator for $\hat{f}^{(q)}$.

The Newey and West estimator for $\alpha(q)$ may then be written as:

$$\hat{\alpha}(q) = \left( \frac{\hat{f}^{(q)}}{\hat{f}^{(0)}} \right)^2 \tag{F.14}$$

for $q = 1, 2$. This expression may be inserted into Equation (F.8) to obtain
the expression for the plug-in optimal bandwidth estimator.
In comparing the Andrews estimator in Equation (F.11) with the Newey-West
estimator in Equation (F.14), we see two very different methods of distilling
results from the $p$ dimensions of the original data into a scalar
measure $\alpha(q)$. Andrews computes parametric estimates of the generalized
derivatives for the individual elements, then aggregates the estimates into
a single measure. In contrast, Newey and West aggregate early, forming
linear combinations of the autocovariance matrices, then use the scalar
results to compute nonparametric estimators of the Parzen smoothness
measures.
To implement the Newey-West optimal bandwidth selection method we
require a value for $n$, the lag-selection parameter, which governs how many
autocovariances to use in forming the nonparametric estimates of $f^{(q)}$.
Newey and West show that $n$ should increase at (less than) a rate that
depends on the properties of the kernel. For the Bartlett and the Parzen-
Geometric kernels, the rate is $T^{2/9}$. For the Quadratic Spectral kernel, the
rate is $T^{2/25}$. For the remaining kernels, the rate is $T^{4/25}$ (with the
exception of the Truncated and the Daniell kernels, for which the Newey-
West theorems do not apply).

In addition, one must choose a weight vector $w$. Newey-West (1994) leave
open the choice of $w$, but follow Andrews’s (1991) suggestion of $w_s = 1$ for
all but the intercept in their Monte Carlo simulations. EViews differs from this
choice slightly, setting $w_s = 1$ for all $s$.
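
The Newey-West calculation can be sketched as follows. The Python code is a minimal illustration, assuming mean-zero data stored as a list of T rows of length p, a user-supplied weight vector and lag selection parameter, and a smoothness ratio formed as the squared ratio of the q-th order measure to the zeroth-order measure. The function names are hypothetical.

```python
import math

def nw_sigma(V, w, j):
    """Scalar autocovariance at lag j of the weighted combination w'V_t,
    computed as (1/T) * sum_{t=|j|+1}^T (w'V_t)(w'V_{t-|j|})."""
    T = len(V)
    x = [sum(wi * vi for wi, vi in zip(w, row)) for row in V]   # w'V_t
    j = abs(j)   # scalar autocovariances are symmetric in the lag
    return sum(x[t] * x[t - j] for t in range(j, T)) / T

def nw_alpha(V, w, n, q):
    """Smoothness ratio (f_q / f_0) ** 2 using lags -n..n."""
    def f_hat(order):
        # abs(j) ** order equals 1 at j = 0 when order = 0, and 0 otherwise.
        total = sum(abs(j) ** order * nw_sigma(V, w, j) for j in range(-n, n + 1))
        return total / (2.0 * math.pi)
    return (f_hat(q) / f_hat(0)) ** 2

# Made-up data with p = 2 and unit weights on both elements.
V = [[0.3, 1.0], [-0.1, 0.7], [0.4, 0.2], [0.2, -0.4],
     [-0.5, -0.6], [-0.2, -0.1], [0.1, 0.3], [0.6, 0.5]]
print(nw_alpha(V, w=[1.0, 1.0], n=2, q=1))
```

Note that the data are aggregated into a scalar series before any autocovariances are computed, which is the “aggregate early” feature that distinguishes this approach from the Andrews method.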

Parametric VARHAC
Den Haan and Levin (1997) advocate the use of parametric methods,
notably VARs, for LRCOV estimation. The VAR spectral density estimator,
which they term VARHAC, involves estimating a parametric VAR model to
filter the $\hat{V}_t$, computing the contemporaneous covariance of the filtered
data, then using the estimates from the VAR model to obtain the implied
autocovariances and corresponding LRCOV matrix of the original data.

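Suppose we fit a VAR($q$) model to the $\{\hat{V}_t\}$. Let $\hat{A}_j$ be the $p \times p$ matrix of
estimated $j$-th order AR coefficients, $j = 1, \ldots, q$. Then we may define the
innovation (filtered) data and estimated innovation covariance matrix as:

$$\hat{V}_t^* = \hat{V}_t - \sum_{j=1}^{q} \hat{A}_j \hat{V}_{t-j} \tag{F.15}$$

and

$$\hat{\Sigma}^* = \frac{1}{T-q} \sum_{t=q+1}^{T} \hat{V}_t^* \hat{V}_t^{*\prime} \tag{F.16}$$

Given an estimate of the innovation contemporaneous variance
matrix $\hat{\Sigma}^*$ and the VAR coefficients $\hat{A}_j$, we can compute the implied
theoretical autocovariances $\Gamma(j)$ of $\hat{V}_t$. Summing the autocovariances yields
a parametric estimator for $\Omega$, given by:

$$\hat{\Omega} = \frac{T}{T-K} \hat{D} \hat{\Sigma}^* \hat{D}' \tag{F.17}$$

where

$$\hat{D} = \left( I_p - \sum_{j=1}^{q} \hat{A}_j \right)^{-1} \tag{F.18}$$

Implementing VARHAC requires a specification for $q$, the order of the VAR.
Den Haan and Levin use model selection criteria (AIC or BIC-Schwarz) using
a maximum lag of $T^{1/3}$ to determine the lag order, and provide simulations
of the performance of the estimator using data-dependent lag order.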
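
To make the filtering and recoloring steps concrete, the Python sketch below works through the univariate case, in which the coefficient matrices reduce to scalars. It is an illustration only: the AR coefficients are taken as given rather than estimated, the function and argument names are hypothetical, and `n_params` stands in for the number of estimated parameters in the degrees-of-freedom correction.

```python
def varhac_scalar(v, coefs, n_params=0):
    """Univariate analogue of (F.15)-(F.18) for given AR coefficients a_1..a_q."""
    T, q = len(v), len(coefs)
    # (F.15): filtered (innovation) data for t = q+1, ..., T
    resid = [v[t] - sum(a * v[t - j - 1] for j, a in enumerate(coefs))
             for t in range(q, T)]
    # (F.16): innovation variance with divisor T - q
    sigma_star = sum(e * e for e in resid) / (T - q)
    # (F.18): D = (1 - sum of AR coefficients)^(-1)
    d = 1.0 / (1.0 - sum(coefs))
    # (F.17): recolored long-run variance with optional d.o.f. correction
    return T / (T - n_params) * d * sigma_star * d

# Made-up mean-zero data and a hypothetical AR(1) coefficient.
v = [0.3, -0.1, 0.4, 0.2, -0.5, -0.2, 0.1, 0.6]
print(varhac_scalar(v, coefs=[0.4]))
```

In the multivariate case the scalar division becomes a matrix inverse and the product becomes a sandwich of matrices, but the sequence of steps is the same.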
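The corresponding VARHAC estimators for the one-sided
matrices $\Lambda_1$ and $\Lambda_0$ do not have simple expressions in terms
of $\hat{A}_j$ and $\hat{\Sigma}^*$. We can, however, obtain insight into the construction of
the one-sided VARHAC LRCOVs by examining results for the VAR(1) case.
Given estimation of a VAR(1) specification, the estimators for the one-sided
long-run variances may be written as:

$$\begin{aligned} \hat{\Lambda}_0 &= \frac{T}{T-K} \sum_{j=0}^{\infty} \hat{A}_1^{\,j} \, \hat{\Gamma}(0) = \frac{T}{T-K} \left( I_p - \hat{A}_1 \right)^{-1} \hat{\Gamma}(0) \\ \hat{\Lambda}_1 &= \frac{T}{T-K} \sum_{j=1}^{\infty} \hat{A}_1^{\,j} \, \hat{\Gamma}(0) = \frac{T}{T-K} \hat{A}_1 \left( I_p - \hat{A}_1 \right)^{-1} \hat{\Gamma}(0) \end{aligned} \tag{F.19}$$

Both estimators require the VAR(1) coefficient estimates $\hat{A}_1$, as
well as an estimate $\hat{\Gamma}(0)$ of $\Gamma(0)$, the contemporaneous covariance matrix
of $\hat{V}_t$. One could, as in Park and Ogaki (1991) and Hansen (1992b), use the
sample covariance matrix $\frac{1}{T} \sum_{t} \hat{V}_t \hat{V}_t'$ so that the estimates
of $\Lambda_0$ and $\Lambda_1$ employ a mix of parametric and non-parametric
autocovariance estimates. Alternately, in keeping with the spirit of the
parametric methodology, EViews constructs a parametric
estimator of $\Gamma(0)$ using the estimated VAR(1) coefficients $\hat{A}_1$ and $\hat{\Sigma}^*$.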
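
For intuition, the one-sided calculation can be written out for a univariate AR(1). The Python sketch below assumes that the parametric contemporaneous variance is the stationary variance implied by the AR(1) coefficient and the innovation variance; this is our reading of a “parametric estimator” and is not confirmed by the text. The function name and inputs are hypothetical.

```python
def one_sided_ar1(a, sigma_star, T, n_params=0):
    """Univariate AR(1) one-sided long-run variances.
    gamma0 is the stationary variance implied by the model (an assumption)."""
    gamma0 = sigma_star / (1.0 - a * a)
    scale = T / (T - n_params)                 # optional d.o.f. correction
    lambda0 = scale * gamma0 / (1.0 - a)       # includes the lag-0 term
    lambda1 = scale * a * gamma0 / (1.0 - a)   # strict one-sided: lags 1, 2, ...
    return gamma0, lambda0, lambda1

# Hypothetical AR(1) coefficient, innovation variance, and sample size.
gamma0, lambda0, lambda1 = one_sided_ar1(a=0.5, sigma_star=1.0, T=100)

# In the scalar case, the two-sided long-run variance is 2 * lambda0 - gamma0,
# which should match sigma_star / (1 - a) ** 2.
print(2 * lambda0 - gamma0, 1.0 / (1.0 - 0.5) ** 2)
```

Under this assumption the one-sided and two-sided estimators are mutually consistent, since all three are built from the same implied autocovariances.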

Prewhitened Kernel
Andrews and Monahan (1992) propose a simple modification of the kernel
estimator which performs a parametric VAR prewhitening step to reduce
autocorrelation in the data followed by kernel estimation performed on the
whitened data. The resulting prewhitened LRCOV estimate is
then recolored to undo the effects of the transformation. The Andrews and
Monahan approach is a hybrid that combines the parametric VARHAC and
nonparametric kernel techniques.
There is evidence (Andrews and Monahan 1992, Newey-West 1994) that this
prewhitening approach has desirable properties, reducing bias, improving
confidence interval coverage probabilities and improving sizes of test
statistics constructed using the kernel HAC estimators.
The Andrews and Monahan estimator follows directly from our earlier
discussion. As in a VARHAC, we first fit a VAR($q$) model to the $\hat{V}_t$ and obtain
the whitened data (residuals):

$$\hat{V}_t^* = \hat{V}_t - \sum_{j=1}^{q} \hat{A}_j \hat{V}_{t-j} \tag{F.20}$$

In contrast to the VAR specification in the VARHAC estimator, the
prewhitening VAR specification is not necessarily believed to be the true time
series model, but is merely a tool for obtaining $\hat{V}_t^*$ values that are closer to
white noise. (In addition, Andrews and Monahan adjust their VAR(1)
estimates to avoid singularity when the VAR is near unstable, but EViews
does not perform this eigenvalue adjustment.)
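Next, we obtain an estimate of the LRCOV of the whitened data by applying
a kernel estimator to the residuals:

$$\hat{\Omega}^* = \sum_{j=-\infty}^{\infty} k\!\left(\frac{j}{b_T}\right) \hat{\Gamma}^*(j) \tag{F.21}$$

where the sample autocovariances $\hat{\Gamma}^*(j)$ are given by

$$\hat{\Gamma}^*(j) = \begin{cases} \dfrac{1}{T-q} \sum_{t=j+q+1}^{T} \hat{V}_t^* \hat{V}_{t-j}^{*\prime}, & j \ge 0 \\ \hat{\Gamma}^*(-j)', & j < 0 \end{cases} \tag{F.22}$$

Lastly, we recolor the estimator to obtain the VAR prewhitened kernel
LRCOV estimator:

$$\hat{\Omega} = \frac{T}{T-K} \hat{D} \hat{\Omega}^* \hat{D}' \tag{F.23}$$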
The prewhitened kernel procedure differs from VARHAC only in the
computation of the LRCOV of the residuals. The VARHAC estimator
in Equation (F.17) assumes that the residuals $\hat{V}_t^*$ are white noise so that
the LRCOV may be estimated using the contemporaneous variance
matrix $\hat{\Sigma}^*$, while the prewhitening kernel estimator
in Equation (F.21) allows for residual heteroskedasticity and serial
dependence through its use of the HAC estimator $\hat{\Omega}^*$. Accordingly, it may be
useful to view the VARHAC procedure as a special case of the prewhitened
kernel with $k(0) = 1$ and $k(x) = 0$ for $x \ne 0$.
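
The whiten, estimate, and recolor sequence can be sketched for the univariate case in Python. As with the earlier sketches, this is an illustration under stated assumptions: the AR coefficients are taken as given, the Bartlett weight 1 - |x| is used as the example kernel, and all function names are hypothetical.

```python
def bartlett(x):
    """Bartlett kernel weight (example kernel only)."""
    return max(0.0, 1.0 - abs(x))

def prewhitened_lrv(v, coefs, kernel, bandwidth, n_params=0):
    """Univariate analogue of (F.20)-(F.23) for given AR coefficients."""
    T, q = len(v), len(coefs)
    # (F.20): whitened data (AR residuals)
    resid = [v[t] - sum(a * v[t - j - 1] for j, a in enumerate(coefs))
             for t in range(q, T)]
    n = len(resid)   # equals T - q

    def gamma_star(j):
        # (F.22): residual autocovariance at lag j >= 0 with divisor T - q
        return sum(resid[i] * resid[i - j] for i in range(j, n)) / n

    # (F.21): kernel estimator; lags j and -j coincide in the scalar case
    omega_star = kernel(0.0) * gamma_star(0) + 2.0 * sum(
        kernel(j / bandwidth) * gamma_star(j) for j in range(1, n))
    # (F.23): recolor using D = (1 - sum of AR coefficients)^(-1)
    d = 1.0 / (1.0 - sum(coefs))
    return T / (T - n_params) * d * omega_star * d

v = [0.3, -0.1, 0.4, 0.2, -0.5, -0.2, 0.1, 0.6]   # made-up mean-zero data
print(prewhitened_lrv(v, [0.4], bartlett, bandwidth=3.0))

# A kernel that puts weight only on lag 0 reproduces the VARHAC-style estimate.
lag0_only = lambda x: 1.0 if x == 0 else 0.0
print(prewhitened_lrv(v, [0.4], lag0_only, bandwidth=3.0))
```

The second call illustrates the special-case relationship described above: with all weight on the contemporaneous term, the kernel step collapses to the residual variance.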
The recoloring step for one-sided prewhitened kernel estimators is
complicated when we allow for HAC estimation of $\Lambda_1^*$ (Park and Ogaki,
1991). As in the VARHAC setting, the expressions for one-sided LRCOVs are
quite involved but the VAR(1) specification may be used to provide insight.
Suppose that the VARHAC estimators of the one-sided LRCOV matrices
defined in Equation (F.19) are given by $\hat{\Lambda}_0^A$ and $\hat{\Lambda}_1^A$, and let $\hat{\Lambda}_1^*$ be
the strict one-sided kernel estimator computed using the prewhitened data:

$$\hat{\Lambda}_1^* = \sum_{j=1}^{\infty} k\!\left(\frac{j}{b_T}\right) \hat{\Gamma}^*(j) \tag{F.24}$$

Then the prewhitened kernel one-sided LRCOV estimators are given by:

References
Andrews, Donald W. K., and J. Christopher Monahan (1992). “An Improved
Heteroskedasticity and Autocorrelation Consistent Covariance Matrix
Estimator,” Econometrica, 60, 953-966.
Andrews, Donald W. K. (1991). “Heteroskedasticity and Autocorrelation
Consistent Covariance Matrix Estimation,” Econometrica, 59, 817-858.
den Haan, Wouter J. and Andrew Levin (1997). “A Practitioner’s Guide to Robust
Covariance Matrix Estimation,” Chapter 12 in Maddala, G. S. and C. R. Rao
(eds.), Handbook of Statistics Vol. 15, Robust Inference, North-Holland:
Amsterdam, 291-341.
Gallant, A. Ronald (1987). Nonlinear Statistical Models. New York: John Wiley &
Sons.
Hamilton, James D. (1994). Time Series Analysis, Princeton University Press.
Hansen, Bruce E. (1992a). “Consistent Covariance Matrix Estimation for
Dependent Heterogeneous Processes,” Econometrica, 60, 967-972.
Hansen, Bruce E. (1992b). “Tests for Parameter Instability in Regressions with
I(1) Processes,” Journal of Business and Economic Statistics, 10, 321-335.
Hansen, Lars Peter (1982). “Large Sample Properties of Generalized Method of
Moments Estimators,” Econometrica, 50, 1029-1054.
Kiefer, Nicholas M., and Timothy J. Vogelsang (2002). “Heteroskedasticity-
Autocorrelation Robust Standard Errors Using the Bartlett Kernel Without
Truncation,” Econometrica, 70, 2093-2095.
Neave, Henry R. (1972). “A Comparison of Lag Window Generators,” Journal of
the American Statistical Association, 67, 152-158.
Newey, Whitney K. and Kenneth D. West (1987). “A Simple, Positive Semi-
Definite, Heteroskedasticity and Autocorrelation Consistent Covariance
Matrix,” Econometrica, 55, 703-708.
Newey, Whitney K. and Kenneth D. West (1994). “Automatic Lag Length
Selection in Covariance Matrix Estimation,” Review of Economic Studies, 61,
631-653.
Park, Joon Y. and Masao Ogaki (1991). “Inferences in Cointegrated Models
Using VAR Prewhitening to Estimate Shortrun Dynamics,” Rochester Center
for Economic Research Working Paper No. 281.
Parzen, Emanuel (1957). “Consistent Estimates of the Spectrum of a Stationary
Time Series,” The Annals of Mathematical Statistics, 28, 329-348.
Parzen, Emanuel (1958). “On Asymptotically Efficient Consistent Estimates of
the Spectral Density Function of a Stationary Time Series,” Journal of the
Royal Statistical Society, B, 20, 303-322.
Parzen, Emanuel (1961). “Mathematical Considerations in the Estimation of
Spectra,” Technometrics, 3, 167-190.
Parzen, Emanuel (1967). “On Empirical Multiple Time Series Analysis,” in Lucien
M. Le Cam and Jerzy Neyman (eds.), Proceedings of the Fifth Berkeley
Symposium on Mathematical Statistics and Probability, 1, 305-340.
White, Halbert (1984). Asymptotic Theory for Econometricians. Orlando:
Academic Press.
