Econometrics Journal (2001), volume 4, pp. 287–310.

Maximum eigenvalue versus trace tests for the cointegrating rank of a VAR process

HELMUT LÜTKEPOHL†, PENTTI SAIKKONEN‡, AND CARSTEN TRENKLER†


† Institut für Statistik und Ökonometrie, Humboldt-Universität zu Berlin, Spandauer Str. 1,
D-10178 Berlin, Germany
E-mail: [email protected]; [email protected]
‡ Department of Statistics, P.O. Box 54, FIN-00014 University of Helsinki, Finland

E-mail: [email protected]

Received: February 2001

Summary The properties of a range of maximum eigenvalue and trace tests for the coin-
tegrating rank of a vector autoregressive process are compared. The tests are all likelihood-
ratio-type tests and operate under different assumptions regarding the deterministic part of
the data generation process. The asymptotic distributions under local alternatives are given
and the local power is derived. It is found that the local power of corresponding maximum
eigenvalue and trace tests is very similar. A Monte Carlo comparison shows, however, that
there may be differences in small samples. The trace tests tend to have more distorted sizes
whereas their power is in some situations superior to that of the maximum eigenvalue tests.

Keywords: Cointegration, Local power analysis, Vector autoregressive process.

1. INTRODUCTION

In empirical studies of systems of economic time series, the number of cointegrating relations is
often of major interest because it affects the model setup and inference procedures at other stages
of the analysis. Therefore, the cointegrating rank of a system is usually investigated at an early
stage. If a vector autoregressive (VAR) model is an adequate description of the data generation
process (DGP), likelihood ratio (LR) type tests as proposed by Johansen (1988, 1995) are the
most commonly used inference tools in this context. Two variants of these tests are available, the
so-called maximum eigenvalue tests and the trace tests. Both types of test are frequently applied
in empirical studies. Given how long the two tests have coexisted, it is surprising that
little is known about the relative performance of the two types of LR tests.
Toda (1994) reports on a Monte Carlo experiment comparing the small-sample properties
of the two types of test and finds that neither of the tests is uniformly superior but that the
trace tests perform better in some situations where the power is low. His simulation study is
limited in a number of respects, however. First, he considers bivariate DGPs only. For these
processes, maximum eigenvalue and trace tests differ only when the null hypothesis states that
the cointegrating rank is zero, which is clearly a special situation. Second, Toda considers only
tests which allow for a deterministic linear trend in the DGP. The performance of tests which
allow not for a linear trend but just for a mean term is known to be quite different. Further work
on the powers of the maximum eigenvalue and trace tests is reported by Johansen (1995), Hansen
and Johansen (1998) and Paruolo (2001). These authors also consider the asymptotic local power
of the maximum eigenvalue tests for the case when no deterministic term is included. From
the point of view of applied work, this case is less important than the cases which allow for a
deterministic mean term or a linear trend. The local power of the maximum eigenvalue test for
the special case where a cointegrating rank of zero is tested against rank one is also treated by
Pesavento (2000). She compares the local power and small-sample performance of the maximum
eigenvalue test to other tests for a single cointegration relation and allows for the possibility of
deterministic mean and trend terms.

© Royal Economic Society 2001. Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street,
Malden, MA, 02148, USA.
1368423x, 2001, 2, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/1368-423X.00068 by Babes-Bolyai University Cluj-Napoca, Wiley Online Library on [30/05/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

In this study we perform a local power and small-sample comparison between a more com-
plete set of maximum eigenvalue and trace tests. In particular, we compare tests based on differ-
ent assumptions regarding the deterministic part: that is, we consider tests for DGPs with deter-
ministic mean and linear trend terms. Different treatments of the mean and deterministic trend
term which have been proposed in the literature are also considered. More precisely, a number of
LR-type tests reviewed in Hubrich et al. (2001) will be included in the comparison. Moreover,
we compare the local power of the different tests for more general situations than previously. In
particular, tests for a cointegrating rank larger than zero with several cointegration relations are
examined. Finally, our small-sample Monte Carlo study takes into account higher-dimensional
processes as well.
The study is structured as follows. In the next section the model framework and the hypothe-
ses of interest are discussed. The tests and their asymptotic properties are presented in Section 3.
A local power comparison is performed in Section 4 and the results of a small-sample simula-
tion study are presented in Section 5. Conclusions are drawn in Section 6. Some mathematical
derivations are deferred to the Appendix.
The following terminology and notation are used throughout. The differencing operator is
denoted by $\Delta$: that is, for a time series or stochastic process $y_t$ we have
$\Delta y_t = y_t - y_{t-1}$. The symbol $I(d)$ denotes an integrated process of order $d$: that is,
the purely stochastic part of the process is stationary or asymptotically stationary after
differencing $d$ times while it is still nonstationary after differencing just $d - 1$ times.
Convergence in distribution or weak convergence is signified by $\xrightarrow{d}$. The trace, the
rank and the maximal eigenvalue of the matrix $A$ are denoted by $\mathrm{tr}(A)$, $\mathrm{rk}(A)$
and $\lambda_{\max}(A)$, respectively. If $A$ is an $(n \times m)$ matrix of full column rank
$(n > m)$, we denote an orthogonal complement by $A_\perp$ so that $A_\perp$ is an
$(n \times (n - m))$ matrix of full column rank and such that $A'A_\perp = 0$. The orthogonal
complement of a nonsingular square matrix is zero and the orthogonal complement of a zero
matrix is an identity matrix of suitable dimension. An $(n \times n)$ identity matrix is denoted
by $I_n$. GLS is used to abbreviate generalized least squares, RR regression stands for reduced
rank regression and r.h.s. is short for right-hand side. VECM abbreviates vector error correction
model. A sum is defined to be zero if the lower bound of the summation index exceeds its upper bound.
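
The orthogonal complement defined above can be computed numerically in several ways. The following is a minimal sketch using a singular value decomposition; the function name `orth_complement` and the tolerance are illustrative assumptions, not part of the paper. The "zero" complement of a nonsingular square matrix is represented here as an array with zero columns.

```python
import numpy as np

def orth_complement(A, tol=1e-12):
    """Return A_perp with A.T @ A_perp = 0 for an (n x m) matrix A.

    Conventions from the text: a zero matrix gets the identity, and a
    nonsingular square matrix gets an (n x 0) array (the 'zero' complement).
    The name and the tolerance are hypothetical choices for this sketch.
    """
    A = np.atleast_2d(np.asarray(A, dtype=float))
    n = A.shape[0]
    if not A.any():
        return np.eye(n)
    # Left singular vectors beyond rank(A) span the complement of col(A).
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    return U[:, rank:]
```

Any nonsingular transformation of the returned columns is an equally valid orthogonal complement, since the definition only fixes the column space.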

2. MODELS AND HYPOTHESES OF INTEREST

2.1. The models

An observable $n$-dimensional time series $y_t = (y_{1t}, \ldots, y_{nt})'$ $(t = 1, \ldots, T)$ is
considered which is generated by the DGP

$$y_t = \mu_0 + \mu_1 t + x_t, \qquad t = 1, 2, \ldots, \tag{1}$$


where µ0 and µ1 are the (n × 1) parameter vectors of the deterministic term. The first one,
µ0 , will be referred to as the mean term and µ1 is called the trend parameter. The process xt
represents the stochastic part which has mean zero and is unobservable unless µ0 and µ1 are
known. Thus, the deterministic and stochastic parts are simply added in (1). It is assumed that
the components of xt are at most I (1) and do not have roots on the unit circle other than 1. This
implies that the same is true for yt . Of course, there may be cointegration among the variables of
xt and, hence, of yt . Moreover, xt is assumed to be generated by a VAR process. For convenience,
we write the process in VECM form,

$$\Delta x_t = \Pi x_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta x_{t-j} + \varepsilon_t, \qquad t = 1, 2, \ldots, \tag{2}$$

where $\Pi$ and the $\Gamma_j$ are $(n \times n)$ matrices of unknown parameters. Assuming that
$\mathrm{rk}(\Pi) = r$, the matrix $\Pi$ can be written as a product $\Pi = \alpha\beta'$, where
$\alpha$ and $\beta$ are $(n \times r)$ matrices. Our requirement that all variables are at most
$I(1)$ implies that $\alpha_\perp'(I_n - \sum_{j=1}^{p-1} \Gamma_j)\beta_\perp$ has to be
nonsingular (see Johansen (1995, p. 49)). The rank $r$ is the cointegrating rank of $x_t$ and,
hence, of $y_t$. It is the quantity of interest in what follows. The error process
$\varepsilon_t \sim \text{i.i.d.}(0, \Omega)$ is independently, identically distributed white noise
with zero mean and nonsingular covariance matrix $\Omega$. Moreover, in setting up LR-type tests
it is assumed that the $\varepsilon_t$ are Gaussian, that is
$\varepsilon_t \sim \text{i.i.d.}\,N(0, \Omega)$ and, hence, $x_t$ and $y_t$ are also Gaussian
processes. The sum on the r.h.s. of (2) represents the short-term dynamics of the process. It
disappears if $p = 1$. The initial values $x_t$ $(t = -p + 1, \ldots, 0)$ are assumed to be zero
for convenience. The asymptotic analysis remains valid if they are assumed to be any random
variables which do not depend on the sample size.
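
To make the additive structure of (1) and (2) concrete, here is a minimal simulation sketch for the first-order case $p = 1$. The function name `simulate_dgp`, the identity error covariance and the example parameter values are assumptions chosen for illustration only; they are not the designs used in the paper's Monte Carlo study.

```python
import numpy as np

def simulate_dgp(alpha, beta, T, mu0=None, mu1=None, seed=0):
    """Simulate y_t = mu0 + mu1*t + x_t with Delta x_t = alpha beta' x_{t-1} + eps_t.

    Hypothetical helper for the case p = 1; the errors are Gaussian with
    covariance I_n (an assumption) and the initial value is x_0 = 0.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n = alpha.shape[0]
    mu0 = np.zeros(n) if mu0 is None else np.asarray(mu0, dtype=float)
    mu1 = np.zeros(n) if mu1 is None else np.asarray(mu1, dtype=float)
    Pi = alpha @ beta.T                    # (n x n) matrix of rank r
    eps = rng.standard_normal((T, n))
    x = np.zeros((T + 1, n))               # row 0 holds x_0 = 0
    for t in range(1, T + 1):
        x[t] = x[t - 1] + Pi @ x[t - 1] + eps[t - 1]
    trend = np.arange(1, T + 1)[:, None]   # t = 1, ..., T as a column
    # Deterministic and stochastic parts are simply added, as in (1).
    return mu0 + mu1 * trend + x[1:]

# Example: bivariate system with one cointegrating relation and a mean term.
alpha = np.array([[-0.5], [0.0]])
beta = np.array([[1.0], [-1.0]])
y = simulate_dgp(alpha, beta, T=200, mu0=[1.0, 1.0])
```

In the example, $I_r + \beta'\alpha = 0.5$, so the single cointegrating relation is stationary and the system has one common trend.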
Defining $Y_{t-1}^{t-p} = (y_{t-1}', \ldots, y_{t-p}')'$, it is easy to see that the process $y_t$
also has a VECM representation which can be written as

$$\Delta y_t = \nu_0 + \nu_1 t + \Pi y_{t-1} + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t = \nu + \Pi^+ y_{t-1}^+ + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t, \qquad t = p + 1, p + 2, \ldots, \tag{3}$$

where $\nu_0 = -\Pi\mu_0 + (I_n + \Pi - \sum_{j=1}^{p-1} \Gamma_j)\mu_1$, $\nu_1 = -\Pi\mu_1$,
$\nu = \nu_0 + \nu_1$, $\Pi^+ = [\nu_1 : \Pi]$, $y_{t-1}^+ = [t - 1 : y_{t-1}']'$,
$\Gamma = [\Gamma_1 : \cdots : \Gamma_{p-1}]$ and
$\Delta Y_{t-1}^{t-p+1} = Y_{t-1}^{t-p+1} - Y_{t-2}^{t-p}$. Depending
on the assumptions for $\mu_0$ and $\mu_1$, different restrictions will be imposed on $\nu$, $\nu_0$
and $\nu_1$. On the other hand, the parameter $\mu_0$ cannot in general be recovered uniquely from
(3) due to the singularity of $\Pi$. A brief discussion of the different cases will be given in
Section 3, where the cointegration tests are presented. The hypotheses of interest are considered below.
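
The link between the parameters of (1), (2) and those of (3) can be checked by direct substitution. The worked step below is a sketch added for illustration; it uses only the definitions given above and takes the error term of (3) to equal the error term of (2), which is what the substitution yields.

```latex
% Substitute x_t = y_t - \mu_0 - \mu_1 t into the VECM (2), term by term:
\begin{align*}
\Delta x_t &= \Delta y_t - \mu_1, \\
\Pi x_{t-1} &= \Pi y_{t-1} - \Pi\mu_0 - \Pi\mu_1 (t-1), \\
\Gamma_j \Delta x_{t-j} &= \Gamma_j \Delta y_{t-j} - \Gamma_j \mu_1 .
\end{align*}
% Collecting the constant and the terms in t gives
\begin{align*}
\Delta y_t
 &= \underbrace{-\Pi\mu_0 + \Bigl(I_n + \Pi - \sum_{j=1}^{p-1}\Gamma_j\Bigr)\mu_1}_{\nu_0}
    + \underbrace{(-\Pi\mu_1)}_{\nu_1}\, t
    + \Pi y_{t-1} + \sum_{j=1}^{p-1}\Gamma_j \Delta y_{t-j} + \varepsilon_t .
\end{align*}
% Writing \nu_0 + \nu_1 t = (\nu_0 + \nu_1) + \nu_1 (t-1) gives the second form in (3),
% with \nu = \nu_0 + \nu_1 and \Pi^+ y^+_{t-1} = \nu_1 (t-1) + \Pi y_{t-1}.
```

Because $\nu_1 = -\Pi\mu_1$ lies in the column space of $\Pi$, the trend term can be absorbed into the cointegration relations, which is what the second form of (3) expresses.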

2.2. Hypotheses of interest

It is well known (see, for example, Johansen (1995)) that the number of linearly independent
cointegrating relations of $y_t$ is equal to the rank of $\Pi$ which will be denoted by $r$ in what
follows. This number is the quantity of interest in the tests considered in this study. We discuss
tests designed for checking the pairs of hypotheses

$$H(r_0): \mathrm{rk}(\Pi) = r_0 \quad \text{vs.} \quad \bar{H}(r_0): \mathrm{rk}(\Pi) > r_0 \tag{4}$$


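Table 1. Models and LR-type tests.

| Deterministic terms | Model | Test statistic | Reference |
|---|---|---|---|
| $\mu_0 = \mu_1 = 0$ | $\Delta y_t = \Pi y_{t-1} + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t$ | $LR^0$ | Johansen (1988, 1995) |
| $\mu_0$ arbitrary, $\mu_1 = 0$ | $\Delta y_t = \nu_0 + \Pi y_{t-1} + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t$ | $LR^{i0}$ | Johansen (1991) |
| | $\Delta y_t = \Pi^* \begin{bmatrix} 1 \\ y_{t-1} \end{bmatrix} + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t$ | $LR^*$ | Johansen and Juselius (1990) |
| | $\Delta y_t = \Pi (y_{t-1} - \mu_0) + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t$ | $LR^{SL}$ | Saikkonen and Luukkonen (1997); Saikkonen and Lütkepohl (2000a) |
| $\mu_0$, $\mu_1$ arbitrary | $\Delta y_t = \nu + \Pi^+ \begin{bmatrix} t - 1 \\ y_{t-1} \end{bmatrix} + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t$ | $LR^+$ | Johansen (1992, 1994, 1995) |
| | $\Delta y_t = \nu_0 + \nu_1 t + \Pi y_{t-1} + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t$ | $LR^{PC}$ | Perron and Campbell (1993) |
| | $\Delta y_t - \mu_1 = \Pi (y_{t-1} - \mu_0 - \mu_1 (t - 1)) + \sum_{j=1}^{p-1} \Gamma_j (\Delta y_{t-j} - \mu_1) + u_t$ | $LR^{GLS}$ | Saikkonen and Lütkepohl (2000a); Lütkepohl and Saikkonen (2000) |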

and

$$H(r_0): \mathrm{rk}(\Pi) = r_0 \quad \text{vs.} \quad H(r_0 + 1): \mathrm{rk}(\Pi) = r_0 + 1. \tag{5}$$

Tests of the former pair of hypotheses are often referred to as trace tests and those for the latter
pair are known as maximum eigenvalue tests.
We are also interested in the power of the tests. If $H(r_0)$ is true, the matrix $\Pi$ in the VECM
form (3) can be written as a product $\Pi = \alpha\beta'$, where $\alpha$ and $\beta$ are
$(n \times r_0)$ matrices of rank $r_0$. Thus, we may write the null hypothesis as
$H(r_0): \Pi = \alpha\beta'$ with $\alpha$, $\beta$ $(n \times r_0)$ and
$\mathrm{rk}(\alpha) = \mathrm{rk}(\beta) = r_0$. Furthermore, if $\mathrm{rk}(\Pi) = r > r_0$, there
exist $(n \times r)$ matrices $[\alpha : \alpha_1]$ and $[\beta : \beta_1]$ of rank $r$, such that

$$\Pi = [\alpha : \alpha_1] \begin{bmatrix} \beta' \\ \beta_1' \end{bmatrix} = \alpha\beta' + \alpha_1\beta_1'. \tag{6}$$

In the next section, tests for different assumptions regarding the deterministic terms will be
discussed briefly.

3. THE TESTS

In this section we will consider LR-type tests for the hypotheses in (4), (5) under alternative
assumptions for the deterministic mean and trend terms. An overview of the assumptions for µ0
and µ1 and the associated models and tests is given in Table 1, which is adopted from Hubrich et
al. (2001). Under Gaussian assumptions, the LR-type statistics related to the models in Table 1
may be obtained by concentrating out the short-run dynamics and performing a RR regression.
The LR statistics of interest can then be obtained by solving a suitable eigenvalue problem. For
example, for the case where no deterministic terms appear in the model (µ0 = µ1 = 0) the
LR statistics can be obtained as follows. Denote the residuals from regressing $\Delta y_t$ and
$y_{t-1}$ on $\Delta Y_{t-1}^{t-p+1}$ by $R_{0t}$ and $R_{1t}$, respectively, and define
$S_{ij} = T^{-1} \sum_{t=1}^{T} R_{it} R_{jt}'$ $(i, j = 0, 1)$.
Moreover, denote the ordered (generalized) eigenvalues from solving
$\det(\lambda S_{11} - S_{10} S_{00}^{-1} S_{01}) = 0$ by $\lambda_1 \ge \cdots \ge \lambda_n$. Then
the trace statistic for testing the pair of hypotheses in (4) can be shown to be

$$LR^0_{\mathrm{trace}}(r_0) = -T \sum_{j=r_0+1}^{n} \log(1 - \lambda_j) \tag{7}$$

and the maximum eigenvalue statistic for testing (5) is

$$LR^0_{\max}(r_0) = -T \log(1 - \lambda_{r_0+1}) \tag{8}$$

(see Johansen (1995, Ch. 6)).
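
The computation just described, for the case without deterministic terms, can be sketched in a few lines of NumPy. The function name `johansen_lr0` and the array layout are assumptions made for this illustration, and the sketch scales by the effective sample size $T - p$ rather than $T$, which is one common convention and not necessarily the one used by the authors.

```python
import numpy as np

def johansen_lr0(y, p=1):
    """Trace and maximum eigenvalue statistics without deterministic terms.

    y is a (T x n) array. Returns (lam, trace_stats, max_stats), where
    trace_stats[r0] and max_stats[r0] refer to H(r0) for r0 = 0, ..., n-1.
    Hypothetical helper; uses the effective sample size T - p.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y, axis=0)                # Delta y_t
    Z0 = dy[p - 1:]                        # Delta y_t for the usable sample
    Z1 = y[p - 1:-1]                       # y_{t-1}
    T_eff = Z0.shape[0]
    if p > 1:
        # Stack Delta y_{t-1}, ..., Delta y_{t-p+1} and concentrate them out.
        Z2 = np.hstack([dy[p - 1 - j: dy.shape[0] - j] for j in range(1, p)])
        R0 = Z0 - Z2 @ np.linalg.lstsq(Z2, Z0, rcond=None)[0]
        R1 = Z1 - Z2 @ np.linalg.lstsq(Z2, Z1, rcond=None)[0]
    else:
        R0, R1 = Z0, Z1
    S00 = R0.T @ R0 / T_eff
    S01 = R0.T @ R1 / T_eff
    S11 = R1.T @ R1 / T_eff
    # det(lambda*S11 - S10 S00^{-1} S01) = 0  <=>  eigenvalues of S11^{-1} S10 S00^{-1} S01
    M = np.linalg.solve(S11, S01.T @ np.linalg.solve(S00, S01))
    lam = np.sort(np.linalg.eigvals(M).real)[::-1]
    terms = -T_eff * np.log(1.0 - lam)
    trace_stats = np.cumsum(terms[::-1])[::-1]   # sums over j = r0+1, ..., n, as in (7)
    max_stats = terms                            # term j = r0+1, as in (8)
    return lam, trace_stats, max_stats
```

By construction the two statistics coincide for $r_0 = n - 1$, and the trace statistic for $r_0 = 0$ is the sum of all maximum eigenvalue terms. The resulting values still have to be compared with critical values from the references in Table 1.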


If $\mu_1 = 0$ and $\mu_0$ is unrestricted, a DGP with possibly nonzero mean term is considered
whereas a deterministic linear trend term is excluded by assumption. In this case the VECM form
of the model for $y_t$ reduces to
$\Delta y_t = \Pi^* y^*_{t-1} + \Gamma \Delta Y_{t-1}^{t-p+1} + u_t$, where $\Pi^*$ is an
$(n \times (n + 1))$ matrix with rank $r$ and $y_t^* = [1, y_t']'$. In other words, the mean term
becomes an intercept in the
cointegration relations. Three variants of LR-type tests have been considered in the literature for
this situation. The first test does not take into account the fact that the mean term can be absorbed
into the cointegrating relations and includes an unrestricted intercept term in the VECM (see
Johansen (1991)). The second test enforces the restriction on the constant term by including it in
the cointegration relations (see Johansen (1995, Ch. 11)). RR regression can be applied for both
model variants in computing the corresponding LR statistics.
Finally, in the third testing setup the mean term $\mu_0$ is estimated in a first step by a GLS
procedure and then subtracted from $y_t$. The tests are performed on the mean-adjusted data,
$\tilde{x}_t = y_t - \tilde{\mu}_0$, noting that $\Delta\tilde{x}_t = \Delta y_t$. Suitable
estimators $\tilde{\mu}_0$ are proposed by Saikkonen and Luukkonen (1997) and Saikkonen and
Lütkepohl (2000a). We use the latter variant, which proceeds by estimating the VECM parameters by
RR regression as in the previous model with the intercept restricted to the cointegration
relations and imposing the cointegrating rank specified under the null hypothesis. These
estimates are then used in accounting for the autocovariance structure of $x_t$ in a feasible GLS
estimation of $\mu_0$ in the model $y_t = \mu_0 + x_t$. Finally the cointegrating rank test is
performed using the VECM (2) where $x_t$ is replaced by the mean-adjusted observations
$\tilde{x}_t$. We chose the Saikkonen and Lütkepohl (2000a) estimator for the simulations reported
in Section 5 because it resulted in slightly better size properties of the test than the
Saikkonen and Luukkonen estimator in preliminary simulations.
If both µ0 and µ1 are unrestricted, a deterministic linear trend is allowed for. Again three
different LR-type tests have been proposed for this situation. The first model is set up so as
to impose the linearity of the trend term as in the latter part of (3). This test version
was proposed and analysed by Johansen (1992, 1994, 1995). The second model includes the trend
term outside the cointegration term as in the first VECM version in (3). In principle, such a model
can generate quadratic trends if no restrictions are imposed on the deterministic parameters.
The corresponding test was considered by Perron and Campbell (1993). In both models the LR
statistics for the cointegrating rank are readily available by RR regression.
Finally, the last test in Table 1 is based on prior trend adjustment via a feasible GLS
procedure. For this purpose the model is estimated by RR regression under the null hypothesis
with restricted trend term as in the Johansen test. The resulting parameter estimators of $\Pi$
and $\Gamma$ are used to construct the feasible GLS estimators for $\mu_0$ and $\mu_1$ based on
the model $y_t = \mu_0 + \mu_1 t + x_t$. Denoting the estimators by $\tilde{\mu}_0$ and
$\tilde{\mu}_1$, the cointegration test is then based on RR regression applied to the model (2)
with $x_t$ replaced by the trend-adjusted observations
$\tilde{x}_t = y_t - \tilde{\mu}_0 - \tilde{\mu}_1 t$. This test was first proposed by Saikkonen
and Lütkepohl (2000a). Critical values for all these tests may be found in the references given
in Table 1. The notation for the different tests also corresponds to that used in the survey by
Hubrich et al. (2001).

Table 2. Limiting distributions of tests based on $\left(\int_0^1 F\,dN'\right)' \left(\int_0^1 FF'\,ds\right)^{-1} \left(\int_0^1 F\,dN'\right)$.

| Test statistic | $F(s)$ | $F_t$ |
|---|---|---|
| $LR^0$ | $N(s)$ | $N_{t-1}$ |
| $LR^{i0}$ | $N(s) - \int_0^1 N(u)\,du$ | $N_{t-1} - T^{-1} \sum_{t=1}^{T} N_{t-1}$ |
| $LR^*$ | $[N(s)' : 1]'$ | $[N_{t-1}' : 1]'$ |
| $LR^{SL}$ | $N(s)$ | $N_{t-1}$ |
| $LR^+$ | $\left[\left(N(s) - \int_0^1 N(u)\,du\right)' : s - \tfrac{1}{2}\right]'$ | $\left[\left(N_{t-1} - T^{-1} \sum_{t=1}^{T} N_{t-1}\right)' : s - \tfrac{1}{2}\right]'$ |
| $LR^{PC}$ | trend adjusted $N(s)$ | trend adjusted $N_{t-1}$ |

It may be worth noting that there is also another group of tests which is applied frequently
in practice. They assume that the variables may have deterministic linear trends whereas a trend
in the cointegrating relations is excluded. These tests are discussed in detail in Saikkonen and
Lütkepohl (2000b). Despite their popularity in practice, it is questionable that the necessary
assumptions can be justified easily in many situations. Moreover, their asymptotic distributions
under local alternatives result in more complicated expressions. Therefore, we do not include
them in the present comparison.
The limiting distributions of the test statistics listed in Table 1, under the local alternatives

$$H_T(r_0): \Pi = \alpha\beta' + \frac{1}{T}\alpha_1\beta_1', \tag{9}$$

have been derived for DGPs with order $p = 1$ only. The results are also valid for higher-order
processes. The proofs can be extended to that case along the lines of Hansen and Johansen (1998,
Ch. 12) who consider the case of a process without deterministic terms. For our VAR(1) case we
assume that the parameter matrices $\alpha$, $\beta$, $\alpha_1$ and $\beta_1$ are such that the
eigenvalues of $I_{r_0} + \beta'\alpha$ and $I_r + [\beta : \beta_1]'[\alpha : \alpha_1]$ are less
than one in modulus. This assumption ensures that all variables are at most $I(1)$ under the null
and alternative hypotheses (see Johansen (1995, p. 202)).
Under $H_T(r_0)$, most trace and maximum eigenvalue statistics have limiting distributions
$LR_{\mathrm{trace}}(r_0) \xrightarrow{d} \mathrm{tr}(D)$ and
$LR_{\max}(r_0) \xrightarrow{d} \lambda_{\max}(D)$, respectively, where

$$D = \left(\int_0^1 F\,dN'\right)' \left(\int_0^1 FF'\,ds\right)^{-1} \left(\int_0^1 F\,dN'\right),$$

$F(s)$ is an $(n - r_0)$-dimensional stochastic process and $N(s)$ is the Ornstein–Uhlenbeck
process defined by $N(s) = B(s) + ab' \int_0^s N(u)\,du$, where $B(s)$ is a standard Brownian
motion, $a = (\alpha_\perp' \Omega \alpha_\perp)^{-1/2} \alpha_\perp' \alpha_1$ and
$b = (\alpha_\perp' \Omega \alpha_\perp)^{1/2} (\beta_\perp' \alpha_\perp)^{-1} \beta_\perp' \beta_1$.
The specific form of $F(s)$ varies with the assumptions regarding the deterministic term. The
different $F(s)$ processes are given in Table 2 together with discrete counterparts which will be
discussed in the following section when simulations are considered. For example,
$LR^0_{\mathrm{trace}}(r_0) \xrightarrow{d} \mathrm{tr}(D^0)$ and
$LR^0_{\max}(r_0) \xrightarrow{d} \lambda_{\max}(D^0)$, where

$$D^0 = \left(\int_0^1 N\,dN'\right)' \left(\int_0^1 NN'\,ds\right)^{-1} \left(\int_0^1 N\,dN'\right)$$

under the local alternatives (9). Obviously, the local power depends on the number of common
trends, $n - r_0$, under the null hypothesis and, via $a$ and $b$, on the ‘distance’ from the null
hypothesis. The null distribution results by setting $a = 0$.
The asymptotic distributions of the other statistics, except $LR^{GLS}$, are obtained in an
analogous way. Most of the limiting distributions in Table 2 were derived by Johansen (1995) and
Saikkonen and Lütkepohl (1999). The other limiting distributions for trace statistics are also
given in the references presented in Table 1. The corresponding results for some of the maximum
eigenvalue statistics have not been derived explicitly to the best of our knowledge (see, however,
Hansen and Johansen (1998, Exercise 12.1) for the case of no deterministic term). Therefore
proofs for those cases which are not available in the literature are given in the Appendix.
For the $LR^{GLS}$ tests, the asymptotic distributions are based on

$$\left(\int_0^1 N_*\,dN_*'\right)' \left(\int_0^1 N_* N_*'\,ds\right)^{-1} \left(\int_0^1 N_*\,dN_*'\right),$$

where $N_*(s) = N(s) - sN(1)$ and $\int_0^1 N_*\,dN_*'$ abbreviates
$\int_0^1 N\,dN' - \int_0^1 N\,ds\,N(1)' - N(1) \int_0^1 s\,dN' + \int_0^1 s\,ds\,N(1)N(1)'$.
This limiting distribution is given in Lütkepohl and Saikkonen (2000).

4. LOCAL POWER COMPARISON

From the limiting distributions in Table 2 it follows that the local power, that is, the relative
rejection frequency if $H(r_0)$ is tested and $H_T(r_0)$ is true, depends on $\alpha$, $\beta$,
$\Omega$, $\alpha_1$ and $\beta_1$ only through $a$ and $b$. This implies, for instance, for the
case $r - r_0 = 1$, where $\alpha_1$ and $\beta_1$ are $(n \times 1)$ vectors, that the limiting
distributions can be written as functions of the two parameters

$$l^2 = a'a\,b'b \quad \text{and} \quad d^2 = (b'a)^2 / (a'a\,b'b).$$

A different parametrization is used in Johansen (1995) and Saikkonen and Lütkepohl (1999). The
present one is used in Hubrich et al. (2001) because it simplifies the interpretation of the
results.¹ Note that $l^2 = 0$ if and only if the null hypothesis holds. Hence, $l = \sqrt{l^2}$
may be thought of as the distance of the local alternative from the null hypothesis. Moreover,
$0 < d^2 \le 1$. Recall our assumption that all eigenvalues of
$I_r + [\beta : \beta_1]'[\alpha : \alpha_1]$ are less than one in modulus which ensures that the
variables are at most $I(1)$. In turn, if this eigenvalue condition is not satisfied, the process
has components with more than one root on the complex unit circle. If the eigenvalues are all
within the complex unit circle, then the matrix
$b'a = \beta_1' \beta_\perp (\alpha_\perp' \beta_\perp)^{-1} \alpha_\perp' \alpha_1$
is nonsingular (see Johansen (1995, p. 204)). Hence, in our specific case where $b'a$ is a scalar,
$b'a \ne 0$ is necessary to ensure that $y_t$ is an $I(1)$ process, as required by our
assumptions. In this sense, $d^2$ close to zero may be viewed as representing processes close to
ones with higher-order integration. The quantity $d = \sqrt{d^2}$ may therefore be interpreted as
the distance from the parameter space associated with higher-order integration. We will use the
quantities $l$ and $d$ in comparing the local power of the tests.

¹ Using the notation in Johansen (1995), the relation to the parameters used in the present study
is as follows: $l^2 = g^2 + f^2$, $d^2 = f^2 / (f^2 + g^2)$.

For a local power comparison we consider first the case where $\alpha_1$ and $\beta_1$ are
$(n \times 1)$ vectors and simulate the discrete time counterpart of the $(n - r_0)$-dimensional
Ornstein–Uhlenbeck process $N(s)$ as $\Delta N_t = \frac{1}{T} ab' N_{t-1} + \epsilon_t$
$(t = 1, \ldots, T = 1000)$ with $\epsilon_t \sim \text{i.i.d.}\,N(0, I_{n-r_0})$, $N_0 = 0$,

$$b' = \begin{cases} (1, 0) & \text{for } n - r_0 = 2 \\ (1, 0, 0) & \text{for } n - r_0 = 3 \end{cases}$$

and

$$a' = \begin{cases} \left(-\sqrt{l^2 d^2},\ \sqrt{l^2 (1 - d^2)}\right) & \text{for } n - r_0 = 2 \\ \left(-\sqrt{l^2 d^2},\ \sqrt{l^2 (1 - d^2)},\ 0\right) & \text{for } n - r_0 = 3. \end{cases}$$

From the $N_t$ we compute $A_T = T^{-2} \sum_{t=1}^{T} F_t F_t'$ and
$B_T = T^{-1} \sum_{t=1}^{T} F_t \Delta N_t'$, where the $F_t$ are the discrete counterparts of
the $F(s)$ given in Table 2. The limiting distributions of the trace and maximum eigenvalue test
statistics are then simulated as $\mathrm{tr}(B_T' A_T^{-1} B_T)$ and
$\lambda_{\max}(B_T' A_T^{-1} B_T)$, respectively. Furthermore, using

$$A_T = \frac{1}{T^2} \sum_{t=1}^{T} \left[\sum_{k=1}^{t-1} (\Delta N_k - \overline{\Delta N})\right] \left[\sum_{k=1}^{t-1} (\Delta N_k - \overline{\Delta N})\right]'$$

and

$$B_T = \frac{1}{T} \sum_{t=1}^{T} \left[\sum_{k=1}^{t-1} (\Delta N_k - \overline{\Delta N})\right] (\Delta N_t - \overline{\Delta N})',$$

with $\overline{\Delta N} = T^{-1} \sum_{t=1}^{T} \Delta N_t$, gives the limiting distributions of
the $LR^{GLS}$ statistics in an analogous fashion. The resulting rejection frequencies for the
cases $n - r_0 = 2$ and $n - r_0 = 3$ obtained from 10 000 replications for different values of
$d$ and $l$ are plotted in Figures 1 and 2. Note that the two types of tests are identical for
$n - r_0 = 1$ and, hence, a comparison is not meaningful for this case. Note also that $LR^0$ and
$LR^{SL}$ have the same asymptotic distributions. Therefore only the local power curves for the
latter tests are depicted in the figures.
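
The simulation design described above can be sketched as follows for the simplest case $F_t = N_{t-1}$ (the $LR^0$ and $LR^{SL}$ row of Table 2). The function names, the seed handling and the reduced number of replications are assumptions for illustration; the paper uses $T = 1000$ and 10 000 replications, and critical values here are simply taken as empirical quantiles of the simulated null distribution.

```python
import numpy as np

def local_alternative_params(l, d, k):
    """Vectors a and b of length k = n - r0 (k >= 2), following the design in the text."""
    b = np.zeros(k)
    b[0] = 1.0
    a = np.zeros(k)
    a[0] = -np.sqrt(l**2 * d**2)
    a[1] = np.sqrt(l**2 * (1.0 - d**2))
    return a, b

def simulate_lr0_limit(l, d, k=2, T=1000, reps=1000, seed=0):
    """Draws of tr(B'A^{-1}B) and lambda_max(B'A^{-1}B) with F_t = N_{t-1}.

    Hypothetical helper: one replication simulates the discrete
    Ornstein-Uhlenbeck counterpart Delta N_t = (1/T) a b' N_{t-1} + e_t.
    """
    rng = np.random.default_rng(seed)
    a, b = local_alternative_params(l, d, k)
    C = np.eye(k) + np.outer(a, b) / T        # N_t = C N_{t-1} + e_t
    trace_draws = np.empty(reps)
    max_draws = np.empty(reps)
    for i in range(reps):
        e = rng.standard_normal((T, k))
        N = np.zeros((T + 1, k))              # row 0 holds N_0 = 0
        for t in range(1, T + 1):
            N[t] = C @ N[t - 1] + e[t - 1]
        F = N[:-1]                            # F_t = N_{t-1}
        dN = np.diff(N, axis=0)               # Delta N_t
        A = F.T @ F / T**2
        B = F.T @ dN / T
        D = B.T @ np.linalg.solve(A, B)
        eig = np.linalg.eigvalsh((D + D.T) / 2.0)   # ascending order
        trace_draws[i] = eig.sum()
        max_draws[i] = eig[-1]
    return trace_draws, max_draws

# Local power at the 5% level for one (l, d) pair, using simulated critical values.
null_tr, null_mx = simulate_lr0_limit(0.0, 0.5, T=200, reps=300, seed=1)
alt_tr, alt_mx = simulate_lr0_limit(60.0, 0.75, T=200, reps=300, seed=2)
power_trace = np.mean(alt_tr > np.quantile(null_tr, 0.95))
power_max = np.mean(alt_mx > np.quantile(null_mx, 0.95))
```

Setting $l = 0$ gives $a = 0$ and therefore the null distribution. The other rows of Table 2 would only change how `F` is built from `N`, for example by demeaning it or appending a constant.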
Figures 1 and 2 reveal that the results for the trace and the maximum eigenvalue versions
of the tests are quite similar. This result is not surprising because, if there is just one
additional cointegration relation, both tests check against the appropriate alternative
hypothesis. Notice, however, that due to the large number of replications used in evaluating the
rejection probabilities, even relatively small power differences in these figures are significant.
Using 10 000 replications, the standard error of an estimator of a true rejection probability $P$
is $s_P = \sqrt{P(1 - P)/10\,000}$ so that, in the worst case where $P = 0.5$, $s_{0.5} = 0.005$,
ignoring the Monte Carlo variation due to estimation of the critical values. Moreover, for a given
set of parameters, the simulations are based on the same random numbers and are therefore not
independent. As a consequence, quite small differences in the rejection probabilities are
statistically significant. For practical purposes, however, the differences between the maximum
eigenvalue and associated trace tests are so small that they are of little importance.
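
The standard error quoted above is the usual binomial one and is easy to reproduce. The helper name `mc_standard_error` is an assumption for this small sketch.

```python
import math

def mc_standard_error(p, reps=10_000):
    """Standard error of an estimated rejection probability p from `reps` independent draws."""
    return math.sqrt(p * (1.0 - p) / reps)

worst_case = mc_standard_error(0.5)   # sqrt(0.25 / 10000) = 0.005
```

Since $P(1 - P)$ is largest at $P = 0.5$, this value bounds the standard error for every point on the power curves, apart from the variation in the simulated critical values that the text sets aside.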
Comparing the different proposals for treating the deterministic terms, it can be seen that the
tests assuming no linear trend outperform the tests allowing for a trend. However, the relative
performance of the test variants is not influenced by the inclusion of a trend (see panels A and
B in Figures 1 and 2). The same can be said about the parameter d measuring the distance from


[Figure 1. Panel A: Tests with a time trend (LR PC, LR +, LR GLS; max and trace versions). Panel B: Tests without a time trend (LR i0, LR *, LR SL; max and trace versions). Rows: d = 0.25, 0.50, 0.75. Horizontal axis: l from 0 to 60. Vertical axis: rejection frequency from 0.0 to 1.0.]

Figure 1. Local power of LR-type tests for n − r0 = 2.

[Figure 2. Panel A: Tests with a time trend (LR PC, LR +, LR GLS; max and trace versions). Panel B: Tests without a time trend (LR i0, LR *, LR SL; max and trace versions). Rows: d = 0.25, 0.50, 0.75. Horizontal axis: l from 0 to 60. Vertical axis: rejection frequency from 0.0 to 1.0.]

Figure 2. Local power of LR-type tests for n − r0 = 3.


[Figure 3: plots omitted. Panel A: tests with a time trend (LR^PC, LR^+, LR^GLS); Panel B: tests without a time trend (LR^i0, LR^*, LR^SL); maximum eigenvalue and trace variants in each panel. Rows: c = 2.5, 5.0, 7.5. Horizontal axis: l from 0.0 to 10.0; vertical axis: local power from 0.0 to 1.0.]

Figure 3. Local power of LR-type tests for n − r0 = 2 and d = 0.25.

the region with a higher order of integration. These results are in line with those of Saikkonen
and Lütkepohl (1999) and Hubrich et al. (2001). For the special case of a bivariate process,
n − r0 = 2 corresponds to testing H(0): r0 = 0, which is also the case treated by Pesavento
(2000). She finds that, under specific conditions, gains in local power are possible in this situation
if single-equation cointegration tests are used. In other words, the knowledge that there is at most
one cointegrating relation and that both variables are I(1) can be used to obtain local power gains.
Because the maximum eigenvalue tests consider the alternative hypothesis that there is just
one extra cointegration relation whereas the trace variants test against a more general alternative,
we have also determined the local power for the case r − r0 = 2 (where, under the local alternative,
the process has two extra cointegration relations) to investigate whether our results are
robust with respect to this feature of the DGP. In this case we use

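\[
b_0 = \begin{cases}
I_2 & \text{for } n - r_0 = 2, \\
[\, I_2 : 0 \,] & \text{for } n - r_0 = 3,
\end{cases}
\]
and
\[
a_0 = \begin{cases}
\begin{bmatrix} -\sqrt{l^2 d^2} & \sqrt{l^2 (1 - d^2)} \\ c & 0 \end{bmatrix} & \text{for } n - r_0 = 2, \\[1ex]
\begin{bmatrix} -\sqrt{l^2 d^2} & \sqrt{l^2 (1 - d^2)} & 0 \\ c & 0 & 0 \end{bmatrix} & \text{for } n - r_0 = 3.
\end{cases}
\]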

The resulting local power of the tests for d = 0.25 and different values of c and l is depicted in
Figures 3 and 4 where the number of replications is again 10 000 and, hence, the precision of the
simulations is as before.


[Figure 4: plots omitted. Panel A: tests with a time trend (LR^PC, LR^+, LR^GLS); Panel B: tests without a time trend (LR^i0, LR^*, LR^SL); maximum eigenvalue and trace variants in each panel. Rows: c = 2.5, 5.0, 7.5. Horizontal axis: l from 0.0 to 10.0; vertical axis: local power from 0.0 to 1.0.]

Figure 4. Local power of LR-type tests for n − r0 = 3 and d = 0.25.

It can be seen that the trace and the maximum eigenvalue tests again perform quite similarly.
On the other hand, the differences between the alternative proposals for treating deterministic
terms are more pronounced now. LR^+ and LR^SL have the highest local power among the tests
in their respective groups, whereas LR^i0 is generally outperformed by its competitors. As in the
case of one extra cointegrating relation, the parameter d has no important effect on the relative
properties of the test versions. Therefore we do not present the results for values of d other than
0.25.
In summary, based on local power, no clear recommendation regarding the preferred use of
either the trace or the maximum eigenvalue tests can be given. Notice, however, that local power
properties are informative about the performance of the tests in large samples when alternatives
close to the null hypothesis are considered. Therefore a small-sample comparison of the tests is
performed in the next section in order to get further insights into the relative performance of the
two different test versions.

5. SMALL-SAMPLE COMPARISON OF TESTS

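We use the following process x_t to compare the maximum eigenvalue and trace tests in small
samples:

\[
x_t = \begin{bmatrix} \Psi & 0 \\ 0 & I_{n-r} \end{bmatrix} x_{t-1} + \varepsilon_t, \qquad
\varepsilon_t \sim \text{i.i.d. } N\!\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\begin{bmatrix} I_r & \Theta \\ \Theta' & I_{n-r} \end{bmatrix} \right), \tag{10}
\]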


Table 3. Relative rejection frequencies of tests for H(0): r = 0 based on a bivariate DGP with cointegrating
rank r = 0, VAR order p = 1, θ = 0, sample size T = 100, nominal significance level 0.05.

Test statistic             Test statistic
LR^i0_max      0.069       LR^+_max       0.063
LR^i0_trace    0.063       LR^+_trace     0.055
LR^*_max       0.062       LR^PC_max      0.058
LR^*_trace     0.062       LR^PC_trace    0.058
LR^SL_max      0.049       LR^GLS_max     0.053
LR^SL_trace    0.053       LR^GLS_trace   0.046

where Ψ = diag(ψ1, . . . , ψr) is an (r × r) diagonal matrix and Θ is (r × (n − r)). This process
was also used in simulations by Toda (1994, 1995) and a number of other authors. For the
purposes of cointegration testing it represents a canonical form from which other processes can
be obtained by linear transformations. The LR tests for cointegration are invariant under such
transformations (see Toda (1994)). Bivariate, three- and four-dimensional processes will be considered.
As mentioned in the introduction, Toda (1994) reports results for bivariate processes
only. In the bivariate case, if r = 0, Ψ and Θ vanish and the process consists of two nonstationary
components. If the cointegrating rank is 1, Ψ = ψ1 with |ψ1| < 1 and Θ = θ is a
scalar which represents the instantaneous correlation between the two components. For three-
and four-dimensional processes the matrices Ψ and Θ have analogous interpretations.
In the simulations the parameter values of the deterministic components are all zero, i.e.
µi = 0 (i = 0, 1) throughout, so that the deterministic part is actually zero. This choice
is not restrictive because the test statistics are invariant to the specific parameter values of their
respective deterministic terms: LR^i0, LR^* and LR^SL are invariant to the choice
of µ0, and LR^+, LR^PC and LR^GLS are invariant with respect to the values assigned to µ0 and
µ1. Notice that this invariance also holds in small samples, so that the same small-sample results
are obtained for any choice of deterministic parameters.
The sample size used in the simulations is T = 100. In addition, 50 presample values were
generated, starting with an initial value of zero. By discarding the start-up values we average
out the effects of initial values. This is done although the initial values may have an impact on
the test results if, for example, these values are very unusual. In practice, outliers at the sample
beginning can usually only be detected after a model has been fitted. If they are detected, it may
be preferable to exclude them from the analysis. Therefore, we shall not analyse the impact of
unusual initial values and prefer to average out the impact of initial values by discarding the
start-up values for each time series.
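
The simulation design described above can be sketched for the bivariate case of process (10) as follows. This is only an illustrative sketch, not the authors' code: the function name, its arguments and the way the correlated innovations are constructed are assumptions made here, and only the Python standard library is used.

```python
import math
import random


def simulate_bivariate_dgp(psi1, theta, T=100, burn_in=50, seed=None):
    """Simulate the bivariate case of process (10) (hypothetical helper).

    psi1 = 1 with theta = 0 gives two independent random walks (r = 0);
    |psi1| < 1 gives cointegrating rank r = 1, with theta the
    instantaneous correlation between the two innovations.
    """
    if abs(theta) >= 1.0:
        raise ValueError("theta must lie strictly between -1 and 1")
    rng = random.Random(seed)
    x1, x2 = 0.0, 0.0  # initial value of zero, as in the text
    sample = []
    for t in range(burn_in + T):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        e1 = z1
        # Unit variance and correlation theta with e1.
        e2 = theta * z1 + math.sqrt(1.0 - theta ** 2) * z2
        x1 = psi1 * x1 + e1  # stationary component if |psi1| < 1
        x2 = x2 + e2         # random-walk component
        if t >= burn_in:     # discard the start-up (presample) values
            sample.append((x1, x2))
    return sample


# Example: one series with r = 1, psi1 = 0.8 and theta = 0.8.
series = simulate_bivariate_dgp(psi1=0.8, theta=0.8, seed=12345)
print(len(series))
```

Fixing the seed makes a replication reproducible, which is convenient when all test statistics are to be computed from the same generated series.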
Table 4. Relative rejection frequencies of tests for three-dimensional DGPs with cointegrating rank r = 0
or 1, VAR order p = 1, sample size T = 100, nominal significance level 0.05.

                 r = 0 (a)        r = 1 (H(1): r = 1)             r = 1 (H(1): r = 1)
                 (H(0): r = 0)    Θ = (0, 0)                      Θ = (0.4, 0.8)
Test statistic   ψ1 = 1           ψ1 = 0.9  ψ1 = 0.8  ψ1 = 0.7    ψ1 = 0.9  ψ1 = 0.8  ψ1 = 0.7
LR^i0_max        0.059            0.010     0.027     0.048       0.071     0.075     0.073
LR^i0_trace      0.068            0.013     0.029     0.048       0.072     0.073     0.071
LR^*_max         0.063            0.007     0.019     0.039       0.066     0.070     0.068
LR^*_trace       0.067            0.009     0.025     0.043       0.077     0.076     0.072
LR^SL_max        0.055            0.016     0.036     0.048       0.065     0.064     0.056
LR^SL_trace      0.062            0.020     0.042     0.050       0.075     0.069     0.062
LR^+_max         0.062            0.005     0.015     0.034       0.064     0.078     0.076
LR^+_trace       0.069            0.009     0.021     0.037       0.074     0.082     0.078
LR^PC_max        0.060            0.005     0.015     0.033       0.050     0.071     0.072
LR^PC_trace      0.061            0.010     0.022     0.038       0.062     0.076     0.072
LR^GLS_max       0.055            0.010     0.024     0.040       0.040     0.047     0.051
LR^GLS_trace     0.056            0.008     0.022     0.037       0.032     0.039     0.044
(a) The results for r = 0 are invariant with respect to the innovation correlation.

Table 5. Relative rejection frequencies of tests for H(1): r = 1 based on four-dimensional DGPs with
cointegrating rank r = 1, Θ = (0.4, 0.4, 0.8), VAR order p = 1, sample size T = 100, nominal significance
level 0.05.

                 r = 1                                           r = 1
Test statistic   ψ1 = 0.9  ψ1 = 0.8  ψ1 = 0.7   Test statistic   ψ1 = 0.9  ψ1 = 0.8  ψ1 = 0.7
LR^i0_max        0.099     0.087     0.081      LR^+_max         0.104     0.094     0.085
LR^i0_trace      0.125     0.105     0.093      LR^+_trace       0.144     0.119     0.106
LR^*_max         0.101     0.089     0.083      LR^PC_max        0.104     0.091     0.083
LR^*_trace       0.125     0.103     0.092      LR^PC_trace      0.118     0.104     0.092
LR^SL_max        0.091     0.077     0.067      LR^GLS_max       0.063     0.057     0.060
LR^SL_trace      0.109     0.087     0.077      LR^GLS_trace     0.059     0.053     0.058

The number of replications is m = 10 000. The rejection frequencies given in Tables 3–5
and Figures 5–9 are based on asymptotic critical values for a test level of 5%. The rejection
frequencies are not size corrected because a size correction is generally not available in practice.
As in the local power simulations, the results for the test statistics are based on the same generated
time series for a given set of parameter values and a given sample size. Hence, the corresponding
entries in the tables and the figures are not independent. Again the standard error of an estimator
of a true rejection probability P is s_P = √(P(1 − P)/10 000) so that, in the worst case where
P = 50%, a precision of ±1% is obtained, ignoring the Monte Carlo variation due to estimation
of the critical values.
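
The precision statement above can be checked with a few lines of arithmetic. The sketch below assumes that the quoted ±1% corresponds to a band of two standard errors around the estimate; that reading, and the function name, are assumptions made here for illustration.

```python
import math


def mc_standard_error(p, m=10_000):
    """Standard error of a Monte Carlo rejection frequency: sqrt(p(1 - p)/m)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return math.sqrt(p * (1.0 - p) / m)


worst_case = mc_standard_error(0.5)    # largest standard error, at P = 50%
at_nominal = mc_standard_error(0.05)   # standard error when P equals the 5% level
print(f"P = 0.50: s_P = {worst_case:.4f}, two s_P = {2 * worst_case:.4f}")
print(f"P = 0.05: s_P = {at_nominal:.4f}")
```

At P = 50% the standard error is 0.005, so two standard errors give the ±1% mentioned in the text; for rejection probabilities near the nominal level the standard error is considerably smaller.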
Note also that the results for testing H(1): rk(Π) = 1 are not conditioned on the outcome
of the test of H(0): rk(Π) = 0 etc. In other words, the test results do not refer to sequential test
procedures. Notice that the properties of a sequential testing procedure as it is required here are
studied in general terms by Johansen (1992) and Paruolo (2001). The statistical properties of the
sequential procedure depend essentially on the properties of the individual tests. Consequently,
studying the tests individually may be viewed as a prerequisite for investigating the sequential
procedure. We will focus on the properties of the individual tests. The small-sample properties of
LR^0 are not presented here because the test's assumptions regarding the deterministic terms are
not very realistic for applied work and are therefore not of much interest from that perspective.

[Figure 5: plots omitted. Panels A (θ = 0) and B (θ = 0.8): LR^PC, LR^+, LR^GLS; Panels C (θ = 0) and D (θ = 0.8): LR^i0, LR^*, LR^SL; maximum eigenvalue and trace variants in each panel. Horizontal axis: 0.7 to 1.0; vertical axis: 0.0 to 1.0.]

Figure 5. Relative rejection frequencies of tests for H(0): r = 0 based on bivariate DGPs with r = 0 or
r = 1, T = 100.

[Figure 6: plots omitted. Panels A (θ = 0) and B (θ = 0.8): LR^+, LR^PC, LR^GLS; Panels C (θ = 0) and D (θ = 0.8): LR^i0, LR^*, LR^SL. Horizontal axis: 0.7 to 1.0; vertical axis: 0.8 to 1.2.]

Figure 6. Relative power of tests for H(0): r = 0 based on bivariate DGPs with r = 0 or r = 1, T = 100.

[Figure 7: plots omitted. Panels A (Θ = (0, 0)') and B (Θ = (0.4, 0.8)'): LR^PC, LR^+, LR^GLS; Panels C (Θ = (0, 0)') and D (Θ = (0.4, 0.8)'): LR^i0, LR^*, LR^SL; maximum eigenvalue and trace variants in each panel. Horizontal axis: 0.7 to 1.0; vertical axis: 0.0 to 1.0.]

Figure 7. Relative rejection frequencies of tests for H(0): r = 0 based on three-dimensional DGPs with
r = 1 or r = 2, ψ2 = 0.9, T = 100.

[Figure 8: plots omitted. Panels A (Θ = (0, 0)') and B (Θ = (0.4, 0.8)'): LR^PC, LR^+, LR^GLS; Panels C (Θ = (0, 0)') and D (Θ = (0.4, 0.8)'): LR^i0, LR^*, LR^SL; maximum eigenvalue and trace variants in each panel. Horizontal axis: 0.7 to 1.0; vertical axis: 0.0 to 1.0.]

Figure 8. Relative rejection frequencies of tests for H(1): r = 1 based on three-dimensional DGPs with
r = 1 or r = 2, ψ2 = 0.9, T = 100.

[Figure 9: plots omitted. Panels A and B: H(0): r = 0; Panels C and D: H(1): r = 1. Panels A and C: LR^PC, LR^+, LR^GLS; Panels B and D: LR^i0, LR^*, LR^SL; maximum eigenvalue and trace variants in each panel. Horizontal axis: 0.7 to 1.0; vertical axis: 0.0 to 1.0.]

Figure 9. Relative rejection frequencies of tests for three-dimensional DGPs with r = 1 or r = 2, ψ2 = 0.7,
Θ = (0, 0)', T = 100.

The sizes of the tests for the bivariate DGP with r = 0 are shown in Table 3. Because
the trace and the maximum eigenvalue tests are identical for testing H(1): r = 1 we just
present the results for H(0): r = 0. All tests have roughly the correct size. The
observed sizes for both the trace and the maximum eigenvalue tests are very similar irrespective
of the specific test proposal. Notice, however, that we are working under ideal conditions here
by considering VAR(1) processes only. Pesavento (2000) finds considerable size distortions of
the maximum eigenvalue tests in her special setup when the actual process order is unknown
and may be infinite. The same result was also found for three-dimensional DGPs when H(0):
r = 0 or H(1): r = 1 are tested (see Table 4). Generally, the sizes of the trace variants of
the LR^+ and LR^PC tests exceed the sizes of the corresponding maximum eigenvalue versions
a bit for ψ1 = 0.9 and ψ1 = 0.8 in the case of large innovation correlation (Θ = (0.4, 0.8)).
For Θ = (0.4, 0.8) we also observe that the trace variants of LR^+ and LR^PC reject slightly too
often. Both the trace and the maximum eigenvalue versions of LR^i0 and LR^* are affected by this
problem as well. In contrast, in the absence of innovation correlation (Θ = (0, 0)) the tests are
quite conservative, especially the maximum eigenvalue variants. However, the LR^SL tests have
reasonable size properties for both kinds of innovation correlation and therefore outperform the
other tests in this respect.
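
The remark above that the trace and maximum eigenvalue tests are identical for H(1): r = 1 in the bivariate case can be illustrated with a small sketch. It assumes the standard Johansen-type forms of the two statistics, LR_trace(r0) = −T Σ_{i>r0} ln(1 − λ_i) and LR_max(r0) = −T ln(1 − λ_{r0+1}), with eigenvalues ordered from largest to smallest. The definitions used in this paper are given in an earlier section and may differ in how the eigenvalues are obtained for each treatment of the deterministic terms; the eigenvalues below are hypothetical numbers, not simulation output.

```python
import math


def lr_trace(eigenvalues, r0, T):
    """Trace-type statistic: -T * sum of log(1 - lambda_i) for i > r0."""
    lam = sorted(eigenvalues, reverse=True)
    return -T * sum(math.log(1.0 - l) for l in lam[r0:])


def lr_max(eigenvalues, r0, T):
    """Maximum-eigenvalue-type statistic: -T * log(1 - lambda_{r0+1})."""
    lam = sorted(eigenvalues, reverse=True)
    return -T * math.log(1.0 - lam[r0])


# Hypothetical eigenvalues for a bivariate system (n = 2), T = 100.
eigs = [0.20, 0.05]
for r0 in (0, 1):
    print(r0, lr_trace(eigs, r0, T=100), lr_max(eigs, r0, T=100))
```

For r0 = n − 1 only one eigenvalue enters the sum, so the two statistics coincide; for smaller r0 the trace statistic adds the contributions of all remaining eigenvalues.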


In the absence of innovation correlation (Θ = (0, 0, 0)) we observe similar size properties
of the tests for four-dimensional DGPs with r = 1 as for three-dimensional ones. However, in
the case of high innovation correlation (Θ = (0.4, 0.4, 0.8)) the problem of an excessive size
distortion is much more severe, as shown in Table 5. All tests, except LR^GLS, reject far too
often. Gonzalo and Pitarakis (1999) have pointed out that this kind of size distortion is a typical
problem of LR-type tests that seems to emerge in large systems. However, the excessive size
distortion is less pronounced for the maximum eigenvalue tests than for the trace tests. So the
former have a slight advantage in this respect.
The four-dimensional case also allows us to compare the size of the test variants for processes
with two cointegration relations (r = 2). As the relative performance of the trace and maximum
eigenvalue tests is almost identical, we do not show the results here. Nevertheless, we mention
that all tests are very conservative in this case. Generally, the sizes of the tests do not exceed the
value of 0.02 and are often below 0.01. Only the LR^SL tests are a bit less conservative.
To sum up, the size properties of the trace and the maximum eigenvalue variants are in general
rather similar for all test proposals. However, in specific situations the trace tests suffer more from
an excessive size distortion than the maximum eigenvalue tests.
The small-sample power results of the tests are depicted in Figures 5–9. Figure 5 shows
that the trace and maximum eigenvalue tests perform quite similarly in the bivariate case with
r = 0 or 1 when H(0): r = 0 is tested. In accordance with results by Toda (1994) for tests which
allow for a time trend, the trace tests are slightly better for alternatives close to the null hypothesis
and the maximum eigenvalue tests have an advantage in situations further away from the null
hypothesis. Interestingly, this pattern is observed independently of the innovation correlation and
the tests' assumption regarding the deterministic terms. The differences are quite small
and may be due to sampling variability in our simulations, although we use 10 000 replications.
To further investigate the significance of the differences in our results we also show the
powers of the maximum eigenvalue tests relative to (i.e. divided by) the powers of the corresponding trace tests
in Figure 6 where statistically significant differences at the 5% level are indicated by coloured
circles, boxes, etc. The significance is checked using the statistic

\[
\left\{ \frac{m}{\hat{p}_{tr}(1 - \hat{p}_{tr}) + \hat{p}_{max}(1 - \hat{p}_{max}) - 2\hat{\tau}} \right\}^{1/2} (\hat{p}_{tr} - \hat{p}_{max}),
\]

which is asymptotically standard normal if the underlying rejection probabilities of the two tests
are equal. Here m denotes the number of replications in our Monte Carlo experiment, as before
(i.e. m = 10 000), p̂_tr and p̂_max are the observed relative rejection frequencies of the trace and
maximum eigenvalue tests, respectively, and τ̂ = γ̂ − (p̂_max · p̂_tr), where γ̂ is the proportion
of joint rejections of both tests. The statistic was also used by Paruolo (2001) in comparing his
local power results. More details can be found in his study.
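
The statistic above can be computed directly from the replication-by-replication rejection decisions of the two tests. The following is a minimal sketch under the assumption that these decisions are available as two equally long sequences of booleans; the function name and the toy data are hypothetical and not taken from the study.

```python
import math


def paired_rejection_statistic(reject_trace, reject_max):
    """Statistic comparing the rejection frequencies of two tests that were
    applied to the same simulated series (dependent proportions)."""
    m = len(reject_trace)
    if m == 0 or m != len(reject_max):
        raise ValueError("need two non-empty sequences of equal length")
    p_tr = sum(1 for a in reject_trace if a) / m
    p_max = sum(1 for b in reject_max if b) / m
    # gamma: proportion of replications in which both tests reject.
    gamma = sum(1 for a, b in zip(reject_trace, reject_max) if a and b) / m
    tau = gamma - p_max * p_tr  # estimated covariance of the two indicators
    variance = p_tr * (1.0 - p_tr) + p_max * (1.0 - p_max) - 2.0 * tau
    if variance <= 0.0:
        raise ValueError("statistic undefined: the two tests never disagree")
    return math.sqrt(m / variance) * (p_tr - p_max)


# Toy data: 100 replications, trace rejects in 30, max in 20, jointly in 20.
trace_decisions = [i < 30 for i in range(100)]
max_decisions = [i < 20 for i in range(100)]
print(paired_rejection_statistic(trace_decisions, max_decisions))
```

The term 2τ̂ accounts for the dependence between the two tests, which arises because both are computed from the same generated time series; ignoring it would overstate the variance of the difference when the tests mostly agree.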
Figure 6 shows that most of the power differences are statistically significant at
the 5% level. On the other hand, the differences between corresponding maximum eigenvalue
and trace tests are so small (usually less than 10%) that they are hardly of practical importance.
Because most curves cross the line at one, Figure 6 also shows that in most cases power advantages
are not uniform over the full range of parameter values.
It is worth noting that the tests assuming no linear trend have higher small-sample power
than those allowing for a trend (see Figure 5). Hence, it pays to specify the deterministic terms
properly. On the other hand, comparing the different test versions within the respective groups
(µ1 = 0 or µ1 ≠ 0), a clear winner for all situations is not found. Pesavento (2000) compares the
small-sample power of maximum eigenvalue tests for the case of testing H(0): rk(Π) = 0 in a
bivariate setting to single-equation tests which are applicable if it is known that both variables are
I(1) and there is at most one cointegrating relation. If this knowledge is utilized, considerable
power gains may be possible by using single-equation cointegration tests.
Compared with the foregoing bivariate setup, all tests have a lower small-sample power when
the corresponding three- and four-dimensional DGPs with r = 0 or 1 are considered. Nevertheless,
the relative characteristics of the test variants remain unchanged, so we do not present the
relevant graphs here.
We have also simulated the small-sample power for three- and four-dimensional processes
with two cointegration relations when testing H(0) and H(1). The qualitative results are largely
the same for both dimensions. Therefore we just refer to the three-dimensional case. In Figure 7,
the power for the case of testing H(0) is depicted. In this case, for |ψ1| < 1, there are two more
cointegrating relations than are specified in the null hypothesis (i.e. r − r0 = 2). Hence, for the
maximum eigenvalue tests both the null and the alternative hypotheses are incorrect and, thus, the trace
tests may have an advantage. This is clearly reflected in the power curves. In the absence of
innovation correlation (Θ = (0, 0)') the small-sample power of the trace tests exceeds the power
of the maximum eigenvalue tests by up to about 20 percentage points. In some cases, the power
curves of the trace tests are clearly steeper than those of the maximum eigenvalue tests so that
the superior performance of the former is only partly attributable to their larger size.
When testing H(1), there is again just one extra cointegration relation under the alternative
(r − r0 = 1) and the power of all tests is rather low for ψ2 = 0.9, as can be seen in Figure 8.
Obviously, in some situations it is not very likely that a cointegrating rank of two will be found.
Setting the second autoregressive eigenvalue to ψ2 = 0.7 increases the small-sample power of
the tests for Θ = (0, 0)' remarkably (see Figure 9). In the case of testing H(0) so that r − r0 = 2
(panels A and B), the advantage of the trace tests is even more pronounced than for the DGPs
with ψ2 = 0.9. When the null hypothesis is H(1) and, hence, r − r0 = 1, both test versions are
once more rather similar (panels C and D).
We have also extended our simulations in different directions. First, we have generated
processes with fat-tailed, non-normal residuals. More precisely, we have used t-distributions
with 5 degrees of freedom. The results are very similar to those with normal residuals and are
therefore not shown.² A second extension considered is processes with a component close to
I(2). The resulting test sizes were similar to the previously discussed I(1) cases. The power of
the tests applied to the near I(2) processes is not necessarily worse than that in the I(1) cases
because some of the power results for those cases are also very poor. Hence, there is little
additional insight to be gained from the near I(2) case. Therefore the results are also not shown.³
In summary, we can conclude that with respect to small-sample power, both the trace and the
maximum eigenvalue tests have similar properties, in line with the local power results. However,
in some cases, the trace tests are clearly superior to the corresponding maximum eigenvalue ver-
sions in terms of power. This happens in particular if the actual rank r exceeds the rank specified
in the null hypothesis, r0 , by more than one, r − r0 > 1. In those cases where the maximum
eigenvalue tests dominate, their power advantage is only minor. On the other hand, the latter
tests seem to have smaller size distortions than the trace tests. Still, our overall recommendation
is to use the trace tests if one wants to apply just one test version. Of course, there is nothing
wrong with the common practice of using both versions simultaneously.

² Detailed results are posted on the internet at http://ise.wiwi.hu-berlin.de/oekonometrie/engl/addoneng.html
³ Details are also posted on the internet.


6. CONCLUSIONS

In this study we have compared maximum eigenvalue and trace tests for the cointegrating rank
of a VAR process. The comparison is performed for test variants suitable for different types
of deterministic terms. More precisely, a couple of tests allowing for a nonzero mean and a
group of tests allowing in addition for a deterministic linear trend are considered. The asymptotic
distributions under local alternatives are given and a local power comparison is presented. In that
comparison no major differences between corresponding maximum eigenvalue and trace tests are
detected. In a small-sample simulation comparison it is found, however, that in some situations
trace tests tend to have more heavily distorted sizes whereas their power performance is superior
to that of the maximum eigenvalue competitors. In particular, the trace tests are advantageous if
there are at least two more cointegrating relations in the process than are specified under the null
hypothesis. Based on our simulations we have a preference for the trace tests. This result justifies
the common practice in empirical work of using either both types of tests simultaneously or
applying the trace tests exclusively.
In accordance with other authors (see, for example, Hubrich et al. (2001)) we also found
that the alternative LR-type test versions for a specific deterministic term sometimes differ more
substantially than the corresponding maximum eigenvalue and trace tests. Therefore, based on
our simulation results, it appears to be more sensible to apply different versions with respect to
the treatment of deterministic terms rather than the maximum eigenvalue and the trace variant of
one specific test version. The choice of the deterministic term may be based, for example, on the
tests presented in Johansen (1995, Ch. 11) for this purpose.
In special cases it may also be possible to increase the power by using other tests designed
for specific situations. For example, Pesavento (2000) shows that, in a bivariate system with two
I(1) variables, testing for a cointegrating rank of zero is perhaps better based on single-equation
tests. Also, if some cointegration relation is known, power improvements may be possible as in
Horvath and Watson (1995) by using this prior knowledge.

ACKNOWLEDGEMENTS

We thank Ralf Brüggemann, two anonymous referees and Karim Abadir for helpful comments
and we are grateful to the Deutsche Forschungsgemeinschaft, SFB 373, the Yrjö Jahnsson Foun-
dation and the European Commission under the Training and Mobility of Researchers Pro-
gramme (contract No. ERBFMRXCT980213), for financial support. The second author also
thanks the Alexander von Humboldt Foundation for financial support under a Humboldt research
award. Part of this research was done while he was visiting the Humboldt University in Berlin.

REFERENCES

Gonzalo, J. and J.-Y. Pitarakis (1999). Dimensionality effect in cointegration analysis. In R. Engle and H.
White (eds), Cointegration, Causality, and Forecasting. A Festschrift in Honour of Clive W. J. Granger,
pp. 212–29. Oxford: Oxford University Press.
Hansen, P. and S. Johansen (1998). Workbook on Cointegration. Oxford: Oxford University Press.
Horvath, M. T. K. and M. W. Watson (1995). Testing for cointegration when some of the cointegrating
vectors are prespecified. Econometric Theory 11, 984–1014.
Hubrich, K., H. Lütkepohl and P. Saikkonen (2001). A review of systems cointegration tests. Econometric
Reviews 20, 247–318.
Johansen, S. (1988). Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Con-
trol 12, 231–54.
Johansen, S. (1991). Estimation and hypothesis testing of cointegration vectors in Gaussian vector autore-
gressive models. Econometrica 59, 1551–80.
Johansen, S. (1992). Determination of cointegration rank in the presence of a linear trend. Oxford Bulletin
of Economics and Statistics 54, 383–97.
Johansen, S. (1994). The role of the constant and linear terms in cointegration analysis of nonstationary
time series. Econometric Reviews 13, 205–31.
Johansen, S. (1995). Likelihood Based Inference in Cointegrated Vector Autoregressive Models. Oxford:
Oxford University Press.
Johansen, S. and K. Juselius (1990). Maximum likelihood estimation and inference on cointegration—with
applications to the demand for money. Oxford Bulletin of Economics and Statistics 52, 169–210.
Lütkepohl, H. and P. Saikkonen (2000). Testing for the cointegrating rank of a VAR process with a time
trend. Journal of Econometrics 95, 177–98.
Paruolo, P. (2001). The power of lambda max. Oxford Bulletin of Economics and Statistics 63, 395–403.
Perron, P. and J. Y. Campbell (1993). A note on Johansen’s cointegration procedure when trends are present.
Empirical Economics 18, 777–89.
Pesavento, E. (2000). Analytical evaluation of the power of tests for the absence of cointegration. Discussion
Paper 2000-24. San Diego: University of California.
Saikkonen, P. and H. Lütkepohl (1999). Local power of likelihood ratio tests for the cointegrating rank of a
VAR process. Econometric Theory 15, 50–78.
Saikkonen, P. and H. Lütkepohl (2000a). Trend adjustment prior to testing for the cointegrating rank of a
vector autoregressive process. Journal of Time Series Analysis 21, 435–56.
Saikkonen, P. and H. Lütkepohl (2000b). Testing for the cointegrating rank of a VAR process with an
intercept. Econometric Theory 16, 373–406.
Saikkonen, P. and R. Luukkonen (1997). Testing cointegration in infinite order vector autoregressive pro-
cesses. Journal of Econometrics 81, 93–126.
Toda, H. Y. (1994). Finite sample properties of likelihood ratio tests for cointegrating ranks when linear
trends are present. Review of Economics and Statistics 76, 66–79.
Toda, H. Y. (1995). Finite sample performance of likelihood ratio tests for cointegrating ranks in vector
autoregressions. Econometric Theory 11, 1015–32.

APPENDIX. DERIVATION OF ASYMPTOTIC DISTRIBUTIONS OF TEST STATISTICS

As mentioned in Section 3, some of the limiting distributions of the test statistics under consid-
eration have been given by Johansen (1995), Hansen and Johansen (1998) and Paruolo (2001).
In particular, the case when no deterministic term is present has been treated in detail by these
authors. Moreover, Saikkonen and Lütkepohl (1999) discuss the limiting distributions of the
trace test versions when deterministic terms are present. To derive the limiting distributions of
the maximum eigenvalue tests, we show that the general framework of the latter paper can be
used. This result does not follow automatically from Saikkonen and Lütkepohl (1999) because
the method of proof used in that paper was based on an idea in which only the limiting
distribution of the sum of the relevant eigenvalues, and not the limiting distribution of the
individual eigenvalues, was derived. Note also that the proof is not a simple extension of the one
used by Johansen (1995) for the case without deterministic terms because we also include the tests
which are based on prior mean and trend adjustment. The following extension of Theorem 1 of
Saikkonen and Lütkepohl (1999) shows that the joint limiting distribution of the relevant
eigenvalues can be obtained without strengthening the previously used assumptions.
Thus, consider the RR regression model

$$ Y_t = A B' X_t + Z_t, \qquad t = 1, \ldots, T, \tag{11} $$

where $Y_t$ and $Z_t$ are $(n \times 1)$ vectors, $X_t$ is an $(m \times 1)$ vector with $m \geq n$, and $A$ and $B$ are
$(n \times r_0)$ and $(m \times r_0)$ coefficient matrices of full column rank, respectively. The error term $Z_t$ is
assumed to be of the form

$$ Z_t = T^{-1} A_1 B_1' X_t + E_t, \tag{12} $$

where $A_1$ and $B_1$ are $(n \times (r - r_0))$ and $(m \times (r - r_0))$ matrices, respectively, with $r - r_0 > 0$
and $E_t$ is the error term under the null hypothesis that (11) is a correctly specified model. The
matrices $[A : A_1]$ and $[B : B_1]$ are supposed to be of full column rank unless the null hypothesis
holds, in which case $A_1 = 0$ and $B_1$ may also be zero.
It is well known that if $E_t$ is Gaussian white noise and $X_t$ is strictly exogenous or
predetermined, the statistical analysis of the RR regression model (11) is based on a generalized
eigenvalue problem in which one solves the determinantal equation

$$ \det(M_{XY} M_{YY}^{-1} M_{YX} - \ell M_{XX}) = 0, \tag{13} $$

where $M_{XX} = T^{-1} \sum_{t=1}^{T} X_t X_t'$, $M_{XY} = M_{YX}' = T^{-1} \sum_{t=1}^{T} X_t Y_t'$ and $M_{YY} = T^{-1} \sum_{t=1}^{T} Y_t Y_t'$.
In particular, if $\hat\ell_1 \geq \cdots \geq \hat\ell_n$ are the ordered solutions of (13) then the LR test for the null
hypothesis that the rank specification in (11) is correct is based on the trace test statistic analogous
to that given in (7). The alternative hypothesis of this LR test is that the column rank of $A$ and
$B$ is larger than $r_0$. If the alternative is considered that the parameter matrix has rank $r_0 + 1$, the
corresponding LR test is based on the maximum eigenvalue test statistic analogous to (8).
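
As a rough numerical illustration of (13), the following Python sketch computes the ordered eigenvalues and the two statistics. It assumes that NumPy is available and that the statistics in (7) and (8), which are not reproduced in this appendix, take the usual forms $-T \sum_{j=r_0+1}^{n} \log(1 - \hat\ell_j)$ and $-T \log(1 - \hat\ell_{r_0+1})$. The function names `rr_eigenvalues` and `lr_statistics` are hypothetical and not part of the paper.

```python
import numpy as np


def rr_eigenvalues(Y, X):
    """Ordered solutions l_1 >= ... >= l_n of
    det(M_XY M_YY^{-1} M_YX - l M_XX) = 0.

    Y is a (T, n) array and X is a (T, m) array with m >= n.
    """
    T, n = Y.shape
    M_XX = X.T @ X / T
    M_XY = X.T @ Y / T
    M_YY = Y.T @ Y / T
    # C = M_XY M_YY^{-1} M_YX (note M_YX = M_XY')
    C = M_XY @ np.linalg.solve(M_YY, M_XY.T)
    # With M_XX = L L', the problem is equivalent to the ordinary symmetric
    # eigenvalue problem for L^{-1} C L^{-T}.
    L = np.linalg.cholesky(M_XX)
    L_inv = np.linalg.inv(L)
    sym = L_inv @ C @ L_inv.T
    eig = np.linalg.eigvalsh((sym + sym.T) / 2.0)  # ascending order
    return eig[::-1][:n]  # the n largest, in descending order


def lr_statistics(Y, X, r0):
    """Trace and maximum eigenvalue statistics for H0: rank = r0.

    Assumption: the statistics have the standard logarithmic form.
    """
    T = Y.shape[0]
    ell = np.clip(rr_eigenvalues(Y, X), 0.0, 1.0 - 1e-12)
    lr_trace = -T * np.sum(np.log1p(-ell[r0:]))
    lr_max = -T * np.log1p(-ell[r0])
    return lr_trace, lr_max
```

Calling `lr_statistics(Y, X, r0)` returns the pair of statistics; when $r_0 = n - 1$ only one eigenvalue remains, so the two statistics coincide.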
Saikkonen and Lütkepohl (1999) obtain the limiting distribution of the trace statistics under
the following general assumptions.

Assumption 1.
(i) $T^{-1} \sum_{t=1}^{T} B' X_t X_t' B \xrightarrow{p} \Sigma_{BB} > 0$
(ii) $T^{-1} \sum_{t=1}^{T} B_\perp' X_t X_t' B = O_p(1)$
(iii) $T^{-2} \sum_{t=1}^{T} X_t X_t' \xrightarrow{d} G$ for some (generally) random $(m \times m)$ matrix $G$ with $B_\perp' G B_\perp > 0$
and $B' G = 0$ (a.s.)
(iv) $T^{-1/2} \sum_{t=1}^{T} E_t X_t' B = O_p(1)$
(v) $T^{-1} \sum_{t=1}^{T} E_t X_t' B_\perp \xrightarrow{d} S$ for some random $(n \times (m - r_0))$ matrix $S$
(vi) $T^{-1} \sum_{t=1}^{T} E_t E_t' = \Sigma_{EE} + O_p(T^{-1/2})$ for some fixed matrix $\Sigma_{EE} > 0$
Furthermore, the sequences in (iii) and (v) converge jointly in distribution.
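
The orders of magnitude in Assumption 1(i) to (iii) can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a bivariate $X_t$ whose first component is i.i.d. standard normal (playing the role of the stationary combination $B' X_t$) and whose second component is a Gaussian random walk (playing the role of $B_\perp' X_t$). The function name `scaled_moments` and the choices of $B$ and $B_\perp$ are hypothetical.

```python
import numpy as np


def scaled_moments(T, seed=0):
    """Scaled second moments of a toy X_t with one stationary and one I(1) direction."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T)             # stationary component, B'X_t
    w = np.cumsum(rng.standard_normal(T))  # random walk component, B_perp'X_t
    X = np.column_stack([u, w])
    B = np.array([[1.0], [0.0]])
    B_perp = np.array([[0.0], [1.0]])
    S = X.T @ X                            # sum over t of X_t X_t'
    m_i = (B.T @ S @ B / T).item()         # 1(i): close to Sigma_BB = 1 here
    m_ii = (B_perp.T @ S @ B / T).item()   # 1(ii): stays bounded in probability
    G_T = S / T**2                         # 1(iii): approximates the random limit G
    return m_i, m_ii, G_T, B, B_perp
```

For large `T`, `B.T @ G_T` is close to zero while `B_perp.T @ G_T @ B_perp` stays strictly positive and varies with the seed, in line with $B' G = 0$ and $B_\perp' G B_\perp > 0$ for a random $G$.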

Using Assumption 1 we can now prove the following theorem.

Theorem A.1. If Assumption 1 holds, the random variables $T \hat\ell_j$, $j = r_0 + 1, \ldots, n$, converge
weakly and jointly to the solutions of the equation

$$ \det[(F B_\perp' G B_\perp + A_\perp' S)' (A_\perp' \Sigma_{EE} A_\perp)^{-1} (F B_\perp' G B_\perp + A_\perp' S) - \rho (B_\perp' G B_\perp)] = 0 $$

where $F = A_\perp' A_1 B_1' B_\perp (B_\perp' B_\perp)^{-1}$. $\Box$

Proof. From the proof of Lemma 1 of Saikkonen and Lütkepohl (1999) we first obtain the
following simple consequences of Assumption 1:

$$ M_{YX} B = A \Sigma_{BB} + o_p(1) \overset{def}{=} \Sigma_{YB} + o_p(1) \tag{14} $$

and

$$ M_{YY} = A \Sigma_{BB} A' + \Sigma_{EE} + o_p(1) \overset{def}{=} \Sigma_{YY} + o_p(1). \tag{15} $$

Using the notation $\Sigma_{YB} = \Sigma_{BY}' = A \Sigma_{BB}$, we then also have

$$ \Sigma_{YY} = A \Sigma_{BY} + \Sigma_{EE}. \tag{16} $$

A further straightforward consequence of Assumption 1 to be used below is

$$ M_{YX} = O_p(1). \tag{17} $$

Now consider the solutions of the generalized eigenvalue problem (13). In the proof of Lemma
1 of Saikkonen and Lütkepohl (1999) it is shown that the eigenvalues converge weakly to those
of the equation

$$ \det(\Sigma_{BY} \Sigma_{YY}^{-1} \Sigma_{YB} - \ell \Sigma_{BB}) \det(\ell B_\perp' G B_\perp) = 0. $$

This implies that the $r_0$ largest eigenvalues of (13) converge in probability to positive constants
and the $n - r_0$ smallest eigenvalues converge in probability to zero. Since $B_\perp' M_{XX} B_\perp = O_p(T)$ it
is not difficult to check that $\hat\ell_j = O_p(T^{-1})$ for $j = r_0 + 1, \ldots, n$. Thus, to analyse the asymptotic
behaviour of these eigenvalues, we use the equation

$$ \det(S(\rho)) = 0, \tag{18} $$

where $S(\rho) = M_{XY} M_{YY}^{-1} M_{YX} - \rho (T^{-1} M_{XX})$ and, if $\hat\rho_1 \geq \cdots \geq \hat\rho_n$ are the solutions of this
equation, then $\hat\rho_j = O_p(1)$, $j = r_0 + 1, \ldots, n$, can be assumed.


The solutions of (18) do not change if $S(\rho)$ is premultiplied by the matrix $[B : B_\perp]'$ and
postmultiplied by $[B : B_\perp]$. Thus, in the same way as in Johansen (1995, p. 159) we consider the
decomposition

$$ \begin{aligned}
\det([B : B_\perp]' S(\rho) [B : B_\perp])
&= \det(B' S(\rho) B) \det(B_\perp' \{S(\rho) - S(\rho) B [B' S(\rho) B]^{-1} B' S(\rho)\} B_\perp) \\
&\overset{def}{=} \det(S_1(\rho)) \det(S_2(\rho)).
\end{aligned} $$
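
The decomposition is the usual partitioned-determinant (Schur complement) identity, and it can be checked numerically. The following sketch is only a sanity check under assumed inputs: a randomly drawn $B$, an orthogonal complement obtained from a singular value decomposition, and an arbitrary symmetric matrix standing in for $S(\rho)$. The names `orth_complement` and `split_determinant` are hypothetical.

```python
import numpy as np


def orth_complement(B):
    """Orthonormal basis of the orthogonal complement of the column space of B.

    B is (m, r) with full column rank; the result is (m, m - r).
    """
    r = B.shape[1]
    U, _, _ = np.linalg.svd(B, full_matrices=True)
    return U[:, r:]


def split_determinant(S, B, B_perp):
    """Return both sides of
    det([B : B_perp]' S [B : B_perp]) = det(S_1) det(S_2).

    Requires B' S B to be nonsingular.
    """
    S1 = B.T @ S @ B
    # S_2 = B_perp' {S - S B (B' S B)^{-1} B' S} B_perp
    S2 = B_perp.T @ (S - S @ B @ np.linalg.solve(S1, B.T @ S)) @ B_perp
    Q = np.hstack([B, B_perp])
    lhs = np.linalg.det(Q.T @ S @ Q)
    rhs = np.linalg.det(S1) * np.linalg.det(S2)
    return lhs, rhs
```

The two returned values agree up to rounding error for any symmetric `S` with nonsingular `B.T @ S @ B`, which is why the roots of $\det(S(\rho)) = 0$ can be studied through $S_1(\rho)$ and $S_2(\rho)$ separately.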
For $j = r_0 + 1, \ldots, n$, we have

$$ S_1(\hat\rho_j) = B' M_{XY} M_{YY}^{-1} M_{YX} B - \hat\rho_j (T^{-1} B' M_{XX} B) = \Sigma_{BY} \Sigma_{YY}^{-1} \Sigma_{YB} + o_p(1), \tag{19} $$

where the latter equality is based on (14), (15), Assumption 1(i) and the fact that $\hat\rho_j = O_p(1)$.
This shows that asymptotically $\hat\rho_{r_0+1}, \ldots, \hat\rho_n$ are not roots of $\det(S_1(\rho))$. Hence, we can consider
$S_2(\rho)$.

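For $j = r_0 + 1, \ldots, n$,

$$ \begin{aligned}
B_\perp' S(\hat\rho_j) B &= B_\perp' M_{XY} M_{YY}^{-1} M_{YX} B - \hat\rho_j (T^{-1} B_\perp' M_{XX} B) \\
&= B_\perp' M_{XY} \Sigma_{YY}^{-1} \Sigma_{YB} + o_p(1).
\end{aligned} \tag{20} $$

Since here $\hat\rho_j = O_p(1)$, the latter equality is obtained by using (14), (15), (17) and
Assumption 1(ii). By the same arguments we also find that, for $j = r_0 + 1, \ldots, n$,

$$ \begin{aligned}
B_\perp' S(\hat\rho_j) B_\perp &= B_\perp' M_{XY} M_{YY}^{-1} M_{YX} B_\perp - \hat\rho_j (T^{-1} B_\perp' M_{XX} B_\perp) \\
&= B_\perp' M_{XY} \Sigma_{YY}^{-1} M_{YX} B_\perp - \hat\rho_j (T^{-1} B_\perp' M_{XX} B_\perp) + o_p(1).
\end{aligned} \tag{21} $$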

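From (19)–(21) it follows that

$$ S_2(\hat\rho_j) = B_\perp' M_{XY} P M_{YX} B_\perp - \hat\rho_j (T^{-1} B_\perp' M_{XX} B_\perp) + o_p(1), \qquad j = r_0 + 1, \ldots, n, \tag{22} $$

where

$$ \begin{aligned}
P &= \Sigma_{YY}^{-1} - \Sigma_{YY}^{-1} \Sigma_{YB} (\Sigma_{BY} \Sigma_{YY}^{-1} \Sigma_{YB})^{-1} \Sigma_{BY} \Sigma_{YY}^{-1} \\
&= \Sigma_{YY}^{-1} - \Sigma_{YY}^{-1} A (A' \Sigma_{YY}^{-1} A)^{-1} A' \Sigma_{YY}^{-1} \\
&= A_\perp (A_\perp' \Sigma_{EE} A_\perp)^{-1} A_\perp'.
\end{aligned} $$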
Here the second equality is obtained from (14) and the third one can be justified by using the
identity $A_\perp' \Sigma_{YY} = A_\perp' \Sigma_{EE}$ obtained from (16) in conjunction with the argument used by
Johansen (1995, p. 142) to prove his result (10.6). Hence, from the definitions it follows that

$$ \begin{aligned}
B_\perp' M_{XY} P M_{YX} B_\perp &= B_\perp' T^{-1} \sum_{t=1}^{T} X_t Z_t' \, P \, T^{-1} \sum_{t=1}^{T} Z_t X_t' B_\perp \\
&\xrightarrow{d} (A_1 B_1' G B_\perp + S)' P (A_1 B_1' G B_\perp + S) \overset{def}{=} H,
\end{aligned} $$

where the weak convergence is due to the definition of $Z_t$ and Assumption 1(iii) and (v). From
this result, (22), Assumption 1(iii), and the continuity of generalized eigenvalues, we can
conclude that $\hat\rho_{r_0+1}, \ldots, \hat\rho_n$ converge weakly to the solutions of the equation

$$ \det[H - \rho (B_\perp' G B_\perp)] = 0. \tag{23} $$

Observing that $B_1' G B_\perp = B_1' B_\perp (B_\perp' B_\perp)^{-1} B_\perp' G B_\perp$ (see the proof of Theorem 1 of Saikkonen
and Lütkepohl (1999)) and using the definitions it can be readily seen that (23) is identical to the
equation in the theorem. This completes the proof. $\Box$
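
The three expressions for the matrix $P$ used in the proof can be compared numerically. The sketch below is an illustrative check under assumed inputs, not part of the original argument: $A$, $\Sigma_{BB}$ and $\Sigma_{EE}$ are drawn at random, $\Sigma_{YB} = A \Sigma_{BB}$ and $\Sigma_{YY} = A \Sigma_{BB} A' + \Sigma_{EE}$ are formed as in (14) to (16), and $A_\perp$ is taken from a singular value decomposition. The names `null_space_basis`, `random_pd` and `p_matrices` are hypothetical.

```python
import numpy as np


def null_space_basis(A):
    """Orthonormal (n, n - r) matrix A_perp with A' A_perp = 0, A being (n, r)."""
    r = A.shape[1]
    U, _, _ = np.linalg.svd(A, full_matrices=True)
    return U[:, r:]


def random_pd(k, rng):
    """A well-conditioned random positive definite (k, k) matrix."""
    R = rng.standard_normal((k, k))
    return R @ R.T + k * np.eye(k)


def p_matrices(A, Sigma_BB, Sigma_EE):
    """Evaluate the three expressions given for P in the proof."""
    Sigma_YB = A @ Sigma_BB
    Sigma_YY = A @ Sigma_BB @ A.T + Sigma_EE
    Si = np.linalg.inv(Sigma_YY)
    # First expression, in terms of Sigma_YB and Sigma_BY = Sigma_YB'
    P1 = Si - Si @ Sigma_YB @ np.linalg.solve(
        Sigma_YB.T @ Si @ Sigma_YB, Sigma_YB.T @ Si)
    # Second expression, in terms of A
    P2 = Si - Si @ A @ np.linalg.solve(A.T @ Si @ A, A.T @ Si)
    # Third expression, in terms of A_perp and Sigma_EE
    A_perp = null_space_basis(A)
    P3 = A_perp @ np.linalg.solve(A_perp.T @ Sigma_EE @ A_perp, A_perp.T)
    return P1, P2, P3
```

All three matrices coincide up to rounding error, and each annihilates $A$, which is the property that removes the term $A B' X_t$ from $M_{YX} B_\perp$ in the limit defining $H$.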

Using Theorem A.1 and the results in Saikkonen and Lütkepohl (1999) and Lütkepohl and
Saikkonen (2000), it is now a simple matter to demonstrate that the maximum eigenvalue tests
considered in this paper have the limiting distributions stated in Section 3. First, note that if
$\rho_1 \geq \cdots \geq \rho_n$ are the solutions of the equation in Theorem A.1 or, equivalently, (23), then

$$ LR_{trace}(r_0) = \sum_{j=r_0+1}^{n} \hat\rho_j + o_p(1) \xrightarrow{d} \sum_{j=r_0+1}^{n} \rho_j = \mathrm{tr}\{H (B_\perp' G B_\perp)^{-1}\}. $$

Notice that the last expression is identical to that in Theorem 1 of Saikkonen and Lütkepohl
(1999) by the definitions and well known results of matrix algebra. Similarly,

$$ LR_{max}(r_0) = \hat\rho_{r_0+1} + o_p(1) \xrightarrow{d} \rho_{r_0+1} = \lambda_{max}\{H (B_\perp' G B_\perp)^{-1}\}. $$
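
The first equality in each of the two displays rests on a first-order expansion of the logarithm. Assuming, as is standard for LR rank tests, that the statistics in (7) and (8) have the logarithmic form shown below (those equations are not reproduced in this appendix), the step can be written out as follows:

```latex
\begin{align*}
-T \log(1 - \hat\ell_j)
  &= T \hat\ell_j + \tfrac{1}{2} T \hat\ell_j^{\,2} + \cdots
   = \hat\rho_j + O_p(T^{-1}),
  \qquad \hat\rho_j = T \hat\ell_j = O_p(1), \quad j = r_0 + 1, \ldots, n, \\
LR_{trace}(r_0)
  &= -T \sum_{j=r_0+1}^{n} \log(1 - \hat\ell_j)
   = \sum_{j=r_0+1}^{n} \hat\rho_j + o_p(1), \\
LR_{max}(r_0)
  &= -T \log(1 - \hat\ell_{r_0+1})
   = \hat\rho_{r_0+1} + o_p(1).
\end{align*}
```

The remainder is negligible because $\hat\ell_j = O_p(T^{-1})$ for $j = r_0 + 1, \ldots, n$, as noted in the proof of Theorem A.1.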

Because it has been shown in the aforementioned previous work that the general framework of
equations (11), (12) and Assumption 1 can be applied for deriving the limiting distributions of
the LR trace tests considered in Section 3, it follows that the same is true for the corresponding
maximum eigenvalue tests. Moreover, it follows that if we have $LR_{trace}(r_0) \xrightarrow{d} \mathrm{tr}(D)$ then
$LR_{max}(r_0) \xrightarrow{d} \lambda_{max}(D)$ obtains.
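
For the simplest case, the pair $\mathrm{tr}(D)$ and $\lambda_{max}(D)$ can be approximated by simulation. The sketch below assumes that the null hypothesis holds, that no deterministic terms are present, and that $D = (\int_0^1 W \, dW')' (\int_0^1 W W' \, ds)^{-1} (\int_0^1 W \, dW')$ with $W$ a $d$-dimensional standard Brownian motion and $d = n - r_0$. This is the familiar form from Johansen (1995) and is taken here as an assumption because Section 3 is not reproduced in this appendix. The function name `simulate_null_limits`, the number of discretization steps and the number of replications are illustrative choices.

```python
import numpy as np


def simulate_null_limits(d, n_steps=400, n_reps=2000, seed=0):
    """Draws of (tr(D), lambda_max(D)) with the Brownian motion W replaced
    by a scaled Gaussian random walk with n_steps increments."""
    rng = np.random.default_rng(seed)
    tr_draws = np.empty(n_reps)
    max_draws = np.empty(n_reps)
    for i in range(n_reps):
        dW = rng.standard_normal((n_steps, d)) / np.sqrt(n_steps)
        # W evaluated at the left end point of each increment (W_0 = 0)
        W = np.vstack([np.zeros((1, d)), np.cumsum(dW, axis=0)[:-1]])
        WW = W.T @ W / n_steps   # approximates the integral of W W' ds
        WdW = W.T @ dW           # approximates the integral of W dW'
        D = WdW.T @ np.linalg.solve(WW, WdW)
        eig = np.linalg.eigvalsh((D + D.T) / 2.0)  # ascending order
        tr_draws[i] = eig.sum()
        max_draws[i] = eig[-1]
    return tr_draws, max_draws
```

Empirical quantiles of the two returned arrays, for example via `np.quantile`, give approximate critical values for the trace and maximum eigenvalue variants computed from the same draws of $D$. For $d = 1$ the two arrays coincide, since $D$ is then a scalar.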
To illustrate this result, consider the test statistics $LR^{SL}_{trace}(r_0)$ and $LR^{SL}_{max}(r_0)$ and define
$K(s) = (\alpha_\perp' \Omega \alpha_\perp)^{1/2} N(s)$. From Section A.4.4 of Saikkonen and Lütkepohl (1999) we can
conclude that Theorem A.1 applies with the counterparts of the matrices $B_\perp' G B_\perp$ and
$(F B_\perp' G B_\perp + A_\perp' S)'$ given by $\int_0^1 K(s) K(s)' \, ds$ and $\int_0^1 K(s) \, dK(s)'$, respectively. Because the
counterpart of $A_\perp' \Sigma_{EE} A_\perp$ is $\alpha_\perp' \Omega \alpha_\perp$, it follows from Theorem A.1 that the limiting distributions
of the test statistics $LR^{SL}_{trace}(r_0)$ and $LR^{SL}_{max}(r_0)$ are given by $\mathrm{tr}(D^0)$ and $\lambda_{max}(D^0)$, respectively,
where $D^0$ is as defined in Section 3. Details of the other maximum eigenvalue tests can be worked
out similarly.
