Ranks and Signs-Based Multiple Variance Ratio Test
All content following this page was uploaded by Jorge Belaire-Franch on 27 May 2014.
Abstract
∗ The authors acknowledge financial support from the Consejería de Ciencia y
Tecnología de Castilla-La Mancha, project PAC–02–001, and from the Ministerio de
Ciencia y Tecnología, project SEC2003-09205. We thank Jonathan Wright, Richard
Luger, Yoo-Jae Whang and Kamil Yilmaz for kindly sharing their computer codes.
† Corresponding author. Department of Economic Analysis. University of Valencia.
Avgda. dels Tarongers s/n. 46022 Valencia (Spain). Phone: 34-96-3828246,
fax: 34-96-3828249, e-mail address: [email protected]
‡ Department of Economic Analysis. University of Valencia. Avgda. dels Tarongers
s/n. 46022 Valencia (Spain). Phone: 34-96-3828246, fax: 34-96-3828249, e-mail
address: [email protected]
1 Introduction
There exists a long tradition in the literature concerning the test of the
the random walk hypothesis provides a means to test the weak form efficiency
Given a time series $\{X_t\}_{t=1}^{T}$, the random walk hypothesis corresponds to
$\phi = 1$ in
\[ X_t = \mu + \phi X_{t-1} + \varepsilon_t, \]
where µ is an unknown drift parameter and the error terms εt are, in general,
some asset price series follow a random walk process (see Meese and Singleton,
1982; Baillie and Bollerslev, 1989; Hsieh, 1988). However, alternative
in finance, since the seminal work of Lo and MacKinlay (1988), who derive
The variance ratio methodology consists of testing the random walk hypothesis
against stationary alternatives, by exploiting the fact that the vari-
variance ratio at lag k is defined as the ratio of (1/k) times the variance of
the kth difference to the variance of the first difference. Hence, one would expect
that, for a random walk process, the variance ratio computed at each individual
lag interval k = 2, 3, . . . would be equal to unity.
1 Nevertheless, the uncorrelatedness of asset returns is neither a necessary nor a
sufficient condition for market efficiency; see Campbell et al. (1997), p. 31.
examine M1 (k) and/or M2 (k) for several k values. The null is rejected if
this context, Chow and Denning propose using Hochberg’s (1974) mul-
On the other hand, as stressed by Luger (2003), the validity of the variance
ratio depends on several assumptions that often might not hold in time
may cause the variance ratio test to suffer from large size distortions.
Alternative exact parametric tests have been proposed to test for the random
walk hypothesis (Dufour and Kiviet, 1998; Dufour and Torrès, 2000,
among many others). In a recent paper, Wright (2000) proposes exact
nonparametric variance-ratio tests based on signs and ranks which do not
depend on the existence of moments, and the sign-based version is exact even
instead of the original observations, has also been considered by other au-
and Koenker, 1997; Breitung and Gouriéroux, 1997; So and Shin, 2001).
In a more recent work, Luger (2003) uses ranks and signs to extend the
test of the same null hypothesis. Then, alternative adjustments for multiple
may control the size of the sequential variance ratio testing. The other
ratio tests adapted to the ranks and signs context. Both approaches allow
us to control the test size, and they can be more powerful than alternative
Wright’s tests and assess the size distortion due to sequential testing. Section
2 Random walk tests
The variance ratio test proposed by Lo and MacKinlay (1988, 1989) is based
on the fact that, for a random walk series, the variance of its k-th difference
is k times the variance of its first difference. For example, if a series follows
a random walk, the variance of its four-day difference will be four times as
vs. H1 : the time series does not follow a random walk. Let {Xt } denote a
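\[ VR(k) = \frac{\hat{\sigma}^2(k)}{\hat{\sigma}^2(1)}, \tag{1} \]
where:
\[ \hat{\sigma}^2(k) = \frac{1}{m} \sum_{t=k}^{T} (X_t - X_{t-k} - k\hat{\mu})^2, \]
and:
\[ \hat{\sigma}^2(1) = \frac{1}{T-1} \sum_{t=1}^{T} (X_t - X_{t-1} - \hat{\mu})^2. \]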
\[ M_1(k) = \frac{VR(k) - 1}{\phi(k)^{1/2}}, \tag{2} \]
which, under the assumption of homoscedasticity, is asymptotically dis-
\[ \phi(k) = \frac{2(2k-1)(k-1)}{3kT}. \tag{3} \]
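As a rough illustration of equations (1) to (3), the following Python sketch computes VR(k) and M1(k) from a list of levels. It is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the drift estimate is taken to be the mean first difference, and the normalisation m = k(T − k + 1)(1 − k/T) is an assumption borrowed from the usual Lo and MacKinlay estimator, since neither is spelled out in this section.

```python
import math

def variance_ratio(x, k):
    """VR(k) of eq. (1) for levels x = [X_0, X_1, ..., X_T]."""
    T = len(x) - 1                       # number of first differences
    mu = (x[T] - x[0]) / T               # ASSUMED drift estimate: mean first difference
    # sigma^2(1): variance of the first differences
    s1 = sum((x[t] - x[t - 1] - mu) ** 2 for t in range(1, T + 1)) / (T - 1)
    # sigma^2(k): overlapping k-th differences; m is an ASSUMED normalisation
    m = k * (T - k + 1) * (1 - k / T)
    sk = sum((x[t] - x[t - k] - k * mu) ** 2 for t in range(k, T + 1)) / m
    return sk / s1

def phi(k, T):
    """Asymptotic variance of VR(k) under homoscedasticity, eq. (3)."""
    return 2 * (2 * k - 1) * (k - 1) / (3 * k * T)

def m1(x, k):
    """Standardised statistic M1(k) of eq. (2)."""
    T = len(x) - 1
    return (variance_ratio(x, k) - 1) / math.sqrt(phi(k, T))
```

For a strongly mean-reverting series such as 0, 1, 0, 1, ... the two-period differences vanish, so VR(2) is zero and M1(2) is large and negative.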
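\[ M_2(k) = \frac{VR(k) - 1}{\phi^*(k)^{1/2}}, \tag{4} \]
where
\[ \phi^*(k) = \sum_{j=1}^{k-1} \left[ \frac{2(k-j)}{k} \right]^2 \delta(j), \tag{5} \]
\[ \delta(j) = \frac{\sum_{t=j+1}^{T} (X_t - X_{t-1} - \hat{\mu})^2 (X_{t-j} - X_{t-j-1} - \hat{\mu})^2}{\left[ \sum_{t=1}^{T} (X_t - X_{t-1} - \hat{\mu})^2 \right]^2}. \tag{6} \]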
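The heteroscedasticity-robust standardisation in equations (4) to (6) can be sketched along the same lines. The helper names below are hypothetical, the input is the list of first differences rather than levels, and the drift estimate is again assumed to be the mean first difference.

```python
import math

def delta(y, j):
    """delta(j) of eq. (6) for first differences y = [y_1, ..., y_T]."""
    T = len(y)
    mu = sum(y) / T                          # ASSUMED drift estimate
    d2 = [(v - mu) ** 2 for v in y]          # squared demeaned differences
    num = sum(d2[t] * d2[t - j] for t in range(j, T))
    return num / sum(d2) ** 2

def phi_star(y, k):
    """phi*(k) of eq. (5)."""
    return sum((2 * (k - j) / k) ** 2 * delta(y, j) for j in range(1, k))

def m2(vr_k, y, k):
    """Robust statistic M2(k) of eq. (4), given a precomputed VR(k)."""
    return (vr_k - 1) / math.sqrt(phi_star(y, k))
```

When all squared demeaned differences are equal, delta(j) reduces to (T − j)/T², so phi*(k) is close to the homoscedastic phi(k).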
multiple k values leads to over rejection of the null hypothesis, above the
nominal size.
The Chow and Denning test (CD test hereafter) provides a procedure
for the multiple comparison of the set of variance ratio estimates with unity.
For a set of m test statistics, the random walk null hypothesis is rejected
one. Hence only the maximum absolute value in the set of test statistics is
\[ \Pr\left[ \max(|M_j(k_1)|, |M_j(k_2)|, \ldots, |M_j(k_m)|) \le SMM(\alpha, m, T) \right] \ge 1 - \alpha \]
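A minimal sketch of the resulting decision rule is given below. It assumes the usual large-sample approximation in which the SMM critical value is the upper α⁺/2 standard normal quantile with α⁺ = 1 − (1 − α)^(1/m); the finite-T value SMM(α, m, T) is not computed here, and the function names are hypothetical.

```python
from statistics import NormalDist

def smm_critical_value(alpha, m):
    """ASSUMED asymptotic SMM critical value for m statistics at level alpha."""
    alpha_plus = 1 - (1 - alpha) ** (1 / m)
    return NormalDist().inv_cdf(1 - alpha_plus / 2)

def chow_denning_reject(stats, alpha=0.05):
    """Reject the null if the largest |M_j(k_i)| exceeds the critical value."""
    return max(abs(s) for s in stats) > smm_critical_value(alpha, len(stats))
```

With a single lag the rule collapses to the ordinary two-sided normal test, while for four lags the 5% critical value rises to roughly 2.49.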
Therefore, Chow and Denning (1993) suggest comparing the maximum
(1991) joint variance ratio test on daily nominal exchange rates, as suggested
(m × 1) unit vector, and Φ is the covariance matrix of VR. For any lags
r and s, the analytical expression for the covariance matrix is (Fong et al.,
1997):
\[ \Phi = \begin{pmatrix} \dfrac{2(2r-1)(r-1)}{3r} & \dfrac{2(3s-r-1)(r-1)}{3s} \\[1ex] \dfrac{2(3s-r-1)(r-1)}{3s} & \dfrac{2(2s-1)(s-1)}{3s} \end{pmatrix} \tag{8} \]
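For two lags r < s, the matrix in (8) and a Wald-type statistic can be sketched as follows. The quadratic form T (VR − 1)' Φ⁻¹ (VR − 1) is an assumption about the form of the joint statistic, chosen to match the quadratic forms used later for the ranks and signs versions; the function names are hypothetical and the 2 × 2 inverse is written out in closed form.

```python
def phi_cov(r, s):
    """2 x 2 covariance matrix of eq. (8) for lags r < s."""
    a = 2 * (2 * r - 1) * (r - 1) / (3 * r)
    b = 2 * (3 * s - r - 1) * (r - 1) / (3 * s)
    d = 2 * (2 * s - 1) * (s - 1) / (3 * s)
    return [[a, b], [b, d]]

def wald_two_lags(vr_r, vr_s, r, s, T):
    """ASSUMED joint statistic T * u' Phi^{-1} u with u = (VR(r) - 1, VR(s) - 1)."""
    (a, b), (_, d) = phi_cov(r, s)
    u, v = vr_r - 1, vr_s - 1
    det = a * d - b * b
    # closed-form inverse of a symmetric 2 x 2 matrix
    return T * (d * u * u - 2 * b * u * v + a * v * v) / det
```

For r = 2 and s = 4 the matrix is [[1, 1.5], [1.5, 3.5]], whose diagonal equals T times φ(k) from (3).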
The usefulness of this test relies on the fact that, whenever the variance
ratio tests are computed over long lags with overlapping observations, the
and MacKinlay test nor Chow and Denning’s procedure is valid for drawing
inferences. Fong et al. (1997) show that the Richardson and Smith test
(RS test hereafter) can be more powerful than Chow and Denning’s multiple
comparison test for empirically relevant alternatives, and it displays low size
Whang and Kim (2003) develop a multiple variance ratio test (WK test
totic critical values.
where $M_r(k) = VR(k) - 1$. To test the null hypothesis, Whang and Kim
They show that the asymptotic null distribution of the statistic is that of a
Given the block size, the $MV_T$ statistic is computed for each subsample,
and the original $MV_T$ statistic is compared to the 95% quantile of the
critical value.
In a recent paper, Wright (2000) proposes the use of signs and ranks of dif-
demonstrates that his nonparametric variance ratio tests based on ranks (R1
and R2) and signs (S1 and S2) can be more powerful than the tests suggested
by Lo and MacKinlay. They have high power against a wide range
ternatives. The tests based on ranks are exact under the independence
and identical distribution assumption, whereas the tests based on signs are
shows that ranks-based tests display low size distortion, under conditional
heteroscedasticity.
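\[ R_1(k) = \left( \frac{\frac{1}{Tk} \sum_{t=k}^{T} (r_{1,t} + \cdots + r_{1,t-k+1})^2}{\frac{1}{T} \sum_{t=1}^{T} r_{1,t}^2} - 1 \right) \times \phi(k)^{-1/2}, \tag{9} \]
\[ R_2(k) = \left( \frac{\frac{1}{Tk} \sum_{t=k}^{T} (r_{2,t} + \cdots + r_{2,t-k+1})^2}{\frac{1}{T} \sum_{t=1}^{T} r_{2,t}^2} - 1 \right) \times \phi(k)^{-1/2}, \tag{10} \]
where
\[ r_{1,t} = \left( r(y_t) - \frac{T+1}{2} \right) \Big/ \sqrt{\frac{(T-1)(T+1)}{12}}, \]
\[ r_{2,t} = \Phi^{-1}\!\left( \frac{r(y_t)}{T+1} \right). \]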
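The rank transformations and the common form of (9) and (10) can be illustrated with a short Python sketch. The function names are hypothetical, r(y_t) is taken to be the rank of y_t among y_1, ..., y_T with 1 for the smallest value, and ties are not handled; these are assumptions of the sketch rather than details stated above.

```python
import math
from statistics import NormalDist

def ranks(y):
    """Rank of each y_t among y_1..y_T (1 = smallest); ties are not handled."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    r = [0] * len(y)
    for pos, i in enumerate(order, start=1):
        r[i] = pos
    return r

def r1_series(y):
    """Standardised ranks r_{1,t}: zero mean and unit variance."""
    T = len(y)
    scale = math.sqrt((T - 1) * (T + 1) / 12)
    return [(r - (T + 1) / 2) / scale for r in ranks(y)]

def r2_series(y):
    """Inverse-normal (van der Waerden) scores r_{2,t}."""
    T = len(y)
    nd = NormalDist()
    return [nd.inv_cdf(r / (T + 1)) for r in ranks(y)]

def vr_stat(z, k):
    """Common form of eqs. (9)-(10) applied to a transformed series z."""
    T = len(z)
    num = sum(sum(z[t - k + 1 : t + 1]) ** 2 for t in range(k - 1, T)) / (T * k)
    den = sum(v * v for v in z) / T
    phi = 2 * (2 * k - 1) * (k - 1) / (3 * k * T)
    return (num / den - 1) / math.sqrt(phi)
```

Under these definitions R1(k) would be obtained as vr_stat(r1_series(y), k) and R2(k) as vr_stat(r2_series(y), k).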
tests based on the signs of first differences are given by:
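\[ S_1(k) = \left( \frac{\frac{1}{Tk} \sum_{t=k}^{T} (s_t + \cdots + s_{t-k+1})^2}{\frac{1}{T} \sum_{t=1}^{T} s_t^2} - 1 \right) \times \phi(k)^{-1/2}, \tag{11} \]
\[ S_2(k) = \left( \frac{\frac{1}{Tk} \sum_{t=k}^{T} (s_t(\mu) + \cdots + s_{t-k+1}(\mu))^2}{\frac{1}{T} \sum_{t=1}^{T} s_t(\mu)^2} - 1 \right) \times \phi(k)^{-1/2}, \tag{12} \]
where $\phi(k)$ is defined in (3), $s_t = 2u(y_t, 0)$, $s_t(\mu) = 2u(y_t, \mu)$, and
\[ u(x_t, q) = \begin{cases} 0.5 & \text{if } x_t > q, \\ -0.5 & \text{otherwise.} \end{cases} \]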
Thus, S1 assumes a zero drift value. If the value of the drift parameter is
strategy.
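The sign transformation and the statistics in (11) and (12) can be sketched as follows. The function names are hypothetical; passing mu = 0 gives the S1 form, while a nonzero candidate drift gives the S2 form evaluated at that value. The confidence-interval step of the S2 strategy is not reproduced here.

```python
import math

def u(x, q):
    """Indicator function used to build the signs: 0.5 if x > q, else -0.5."""
    return 0.5 if x > q else -0.5

def sign_series(y, mu=0.0):
    """Signs s_t(mu) = 2 u(y_t, mu), each equal to +1 or -1."""
    return [2 * u(v, mu) for v in y]

def s_stat(y, k, mu=0.0):
    """Sign-based variance ratio statistic of eqs. (11)-(12) at drift mu."""
    s = sign_series(y, mu)
    T = len(s)
    num = sum(sum(s[t - k + 1 : t + 1]) ** 2 for t in range(k - 1, T)) / (T * k)
    den = sum(v * v for v in s) / T          # equals 1 because s_t is +1 or -1
    phi = 2 * (2 * k - 1) * (k - 1) / (3 * k * T)
    return (num / den - 1) / math.sqrt(phi)
```

Because only the signs enter the statistic, the magnitude of the increments plays no role, which is why these tests do not depend on the existence of moments.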
First, an exact confidence interval for the drift parameter µ, valid under
the null hypothesis, is established. Denote y(1) , . . . , y(T ) the order statistics
trials T and probability of success 1/2. The second step consists of computing
the S2 statistic, for each candidate value b for the drift parameter in
the confidence interval. The value of the S2 statistic (retaining the sign) at
where, given b ∈ CIµ (α1 ), S2 (k, b) is computed by defining st (b) = 2u(yt , b).
α2 level test, such that the overall level of the strategy is bounded by
1. Model 1.
xt ∼ N (0, 1)
2. Model 2.
xt ∼ t3
3. Model 3.
ht = 0.95ht−1 + ξt
xt = exp(0.5ht ) εt
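A minimal simulation sketch of Models 2 and 3 is given below. The function names are hypothetical, and several details are assumptions of the sketch rather than facts stated in this section: the volatility shock ξ_t is drawn as N(0, 0.1) with 0.1 read as a variance, ε_t is standard normal, the volatility recursion starts at h_0 = 0, and the t3 draws are built from normal variates.

```python
import math
import random

def simulate_model2(T, rng):
    """Student t draws with 3 degrees of freedom: Z / sqrt(chi2_3 / 3)."""
    out = []
    for _ in range(T):
        z = rng.gauss(0.0, 1.0)
        chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))
        out.append(z / math.sqrt(chi2 / 3))
    return out

def simulate_model3(T, rng, xi_var=0.1):
    """Stochastic volatility: h_t = 0.95 h_{t-1} + xi_t, x_t = exp(0.5 h_t) eps_t."""
    h = 0.0                                  # ASSUMED starting value h_0 = 0
    out = []
    for _ in range(T):
        h = 0.95 * h + rng.gauss(0.0, math.sqrt(xi_var))   # ASSUMED Var(xi) = 0.1
        out.append(math.exp(0.5 * h) * rng.gauss(0.0, 1.0))
    return out
```

For example, simulate_model3(320, random.Random(0)) returns one artificial sample of first differences of length 320.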
For each artificial sample, Wright’s variance ratio ranks and signs tests
are computed for two k values (2 and 4) when T = 320, three k values (2, 4
and 8) when T = 640 and four k values (2, 4, 8 and 16) when T = 1,280,
and we analyze the impact on the empirical size of each individual test. The
null hypothesis of random walk is rejected if, for some k value, the statistic
tabulated distribution, respectively.4
Insert Table 1
Results in Table 1 confirm the argument that using variance ratio tests at
various aggregation intervals leads to rejection rates larger than the nominal
size.5
tests
As shown in the previous section, Wright’s (2000) tests suffer from size distortions
when they are sequentially applied at several k values. This section proposes
\[ \tilde{p}_{ji}^{(S)} = 1 - (1 - p_{ji})^m, \]
\[ \tilde{p}_{ji}^{(H)} = \min\{[k - R(p_{ji}) + 1]\, p_{ji},\; 1\}, \]
4 Critical values can be found in Wright (2000), Table 1, p. 3. Critical values for the
S2 test at α2 = 4% level have been simulated through 100,000 replications.
5 Note that false rejections notably increase for R1 and R2 tests against Model 3, since
those tests are not robust in the presence of heteroscedasticity.
Given a significance level α, the decision rule states that, using the
variance ratio test j, the null is rejected if
$\tilde{p}_j^{(S)} = \min_{1 \le i \le m} \tilde{p}_{ji}^{(S)} \le \alpha$ or
$\tilde{p}_j^{(H)} = \min_{1 \le i \le m} \tilde{p}_{ji}^{(H)} \le \alpha$.
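The two adjustments and the decision rule can be sketched in Python as follows. The function names are hypothetical. For the Hochberg-type formula, the sketch assumes that the multiplier counts the m tests being combined and that R(·) is the ascending rank of the p-value (1 for the smallest); this reading is an assumption, not something stated explicitly above.

```python
def sidak_adjust(pvals):
    """Sidak-type adjusted p-values: 1 - (1 - p)^m."""
    m = len(pvals)
    return [1 - (1 - p) ** m for p in pvals]

def hochberg_adjust(pvals):
    """Hochberg-type adjusted p-values: min{(m - R(p) + 1) p, 1} (ASSUMED reading)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    rank = [0] * m
    for pos, i in enumerate(order, start=1):
        rank[i] = pos                         # ascending rank, 1 = smallest p-value
    return [min((m - rank[i] + 1) * pvals[i], 1.0) for i in range(m)]

def reject(pvals, alpha=0.05, adjust=sidak_adjust):
    """Reject the null if the smallest adjusted p-value is at most alpha."""
    return min(adjust(pvals)) <= alpha
```

For instance, two individual p-values of 0.03 and 0.5 do not lead to rejection at the 5% level under the Sidak-type rule, since the smaller adjusted value is about 0.059.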
original first differences. Then, for the nth bootstrap sample, we compute
the value of the m variance ratio test statistics and the associated
\[ \tilde{p}_{ji}^{(N)} = \frac{1}{N} \sum_{n=1}^{N} I_{(-\infty,0]}\!\left( \min_{1 \le l \le m} p^{*}_{jl,n} - p_{ji} \right), \quad i = 1, \ldots, m \]
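In code, the adjusted p-value above is simply the share of bootstrap replications whose smallest p-value does not exceed the observed one. The sketch below uses a hypothetical function name and takes the N bootstrap minima as given; how the bootstrap samples themselves are drawn is not reproduced here.

```python
def bootstrap_adjusted_pvalue(p_obs, boot_min_pvals):
    """Fraction of replications with min_l p*_{jl,n} - p_obs <= 0."""
    N = len(boot_min_pvals)
    return sum(1 for p in boot_min_pvals if p - p_obs <= 0) / N
```

With an observed p-value of 0.03 and bootstrap minima 0.01, 0.05, 0.02 and 0.2, two of the four replications satisfy the condition, giving an adjusted p-value of 0.5.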
S1 and S2 tests for several k values and applying either a Hochberg (H),
Sidak (S) or bootstrap (B) adjustment for multiplicity. Note, however,
that the bootstrap adjustment would be valid for the $R_1^{(\cdot)}$ and $R_2^{(\cdot)}$ tests
standard variance ratio tests by Wright’s ranks and signs-based tests, in the
definition of Chow and Denning (1993) and Richardson and Smith (1991)
procedures.
Thus, we can define the following statistics, based on Chow and Denning:
\[ CD(R_1) = \max\{|R_1(k_1)|, |R_1(k_2)|, \ldots, |R_1(k_m)|\} \tag{13} \]
\[ CD(R_2) = \max\{|R_2(k_1)|, |R_2(k_2)|, \ldots, |R_2(k_m)|\} \tag{14} \]
\[ CD(S_1) = \max\{|S_1(k_1)|, |S_1(k_2)|, \ldots, |S_1(k_m)|\} \tag{15} \]
\[ CD(S_2) = \max\{|S_2(k_1)|, |S_2(k_2)|, \ldots, |S_2(k_m)|\} \tag{16} \]
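Since the signs-based statistics can be simulated from an i.i.d. sequence of +1 and −1 values with equal probability, critical values for a statistic such as CD(S1) can be approximated by Monte Carlo. The sketch below is illustrative only: the function names, the number of replications and the seed are hypothetical choices, and the simulated quantile is an approximation rather than a tabulated value.

```python
import math
import random

def s1_stat(signs, k):
    """S1(k) of eq. (11) for a sequence of +1/-1 signs."""
    T = len(signs)
    num = sum(sum(signs[t - k + 1 : t + 1]) ** 2 for t in range(k - 1, T)) / (T * k)
    den = sum(s * s for s in signs) / T      # equals 1 for +/-1 signs
    phi = 2 * (2 * k - 1) * (k - 1) / (3 * k * T)
    return (num / den - 1) / math.sqrt(phi)

def cd_s1(signs, lags):
    """CD(S1) of eq. (15): largest absolute S1(k) over the chosen lags."""
    return max(abs(s1_stat(signs, k)) for k in lags)

def simulated_critical_value(T, lags, alpha=0.05, reps=2000, seed=0):
    """Approximate (1 - alpha) quantile of CD(S1) under i.i.d. fair signs."""
    rng = random.Random(seed)
    draws = sorted(
        cd_s1([rng.choice((-1, 1)) for _ in range(T)], lags) for _ in range(reps)
    )
    return draws[min(reps - 1, math.ceil((1 - alpha) * reps) - 1)]
```

The null would then be rejected when the CD(S1) value computed from the observed signs exceeds the simulated critical value for the same T and set of lags.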
where Rj∗ (k1 ) is the ranks-based test computed with any random permuta-
follow a martingale difference sequence) in Wright (2000), the CD(Sj ) tests
are distributed as
where $S_j^*(k_1)$ is the signs-based test computed with an i.i.d. sequence
$\{s_t^*\}_{t=1}^{T}$, each element of which is 1 with probability 1/2 and −1 otherwise.
where T is the sample size, the matrix Φ was defined for two arbitrary lag
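\[ R_1^{\dagger}(k) = \left( \frac{\frac{1}{Tk} \sum_{t=k}^{T} (r_{1,t} + \cdots + r_{1,t-k+1})^2}{\frac{1}{T} \sum_{t=1}^{T} r_{1,t}^2} - 1 \right), \tag{21} \]
\[ R_2^{\dagger}(k) = \left( \frac{\frac{1}{Tk} \sum_{t=k}^{T} (r_{2,t} + \cdots + r_{2,t-k+1})^2}{\frac{1}{T} \sum_{t=1}^{T} r_{2,t}^2} - 1 \right), \tag{22} \]
\[ S_1^{\dagger}(k) = \left( \frac{\frac{1}{Tk} \sum_{t=k}^{T} (s_t + \cdots + s_{t-k+1})^2}{\frac{1}{T} \sum_{t=1}^{T} s_t^2} - 1 \right), \tag{23} \]
\[ S_2^{\dagger}(k) = \left( \frac{\frac{1}{Tk} \sum_{t=k}^{T} (s_t(\mu) + \cdots + s_{t-k+1}(\mu))^2}{\frac{1}{T} \sum_{t=1}^{T} s_t(\mu)^2} - 1 \right). \tag{24} \]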
as
\[ T\, \mathbf{R}_j^{*\dagger\prime}\, \Phi^{-1}\, \mathbf{R}_j^{*\dagger} \]
where $\mathbf{R}_j^{*\dagger}$ is the vector of ranks-based statistics (21)–(22) computed with
\[ T\, \mathbf{S}_j^{*\dagger\prime}\, \Phi^{-1}\, \mathbf{S}_j^{*\dagger} \]
where $\mathbf{S}_j^{*\dagger}$ is the vector of signs-based statistics (23)–(24) computed
with an i.i.d. sequence $\{s_t^*\}_{t=1}^{T}$, each element of which is 1 with probability
1/2 and −1 otherwise. As in the case of the CD tests, the exact sampling
ance ratio tests are sensitive to deviations from the stronger i.i.d. assumption.
However, signs-based multiple variance ratio tests are robust against
under the additional assumption of zero drift value. Regarding the CD(S2 )
variate context.
Table 2 displays critical values of the proposed ranks and signs-based
multiple variance ratio tests, at the 5% significance level, for several combi-
Insert Table 2
6 CD(S2) and RS(S2) tests critical values have been computed by setting α1 = 0.01 and
α2 = 0.04.
5 Comparison among multiple variance ratio tests
The purpose of this section is to show the empirical size and power properties
robust version), the RS test and the W K test. To that end, a large Monte
For each procedure, 1,000 artificial samples have been generated of the
\[ y_t = \varepsilon_t, \tag{25} \]
\[ y_t = \exp(0.5 h_t)\, \varepsilon_t, \tag{26} \]
\[ (1 - L)^{0.1} y_t = \varepsilon_t, \tag{29} \]
where εt and ξt ∼ N (0, 0.1) are independent random variables. Model (25)
stochastic volatility model for the first difference of the variable Xt , models
(27) and (28) correspond to correlated first differences, whereas (29) and
In addition, we explore the finite sample behavior of the tests against the
processes:
which correspond to the alternative hypothesis of a mean-reverting process
for the variable in levels. Hence, the cases (25) and (26) are designed to
analyze the empirical size of the tests, while the rest of cases will be useful
to assess the empirical power of the tests in finite samples. In all cases, we
the 5% significance level for three sample sizes and their corresponding sets
(k = 2, 4, 8 and 16).7
Insert Table 3
terms of superior power against all the considered alternative models. For
more powerful against model (28), whereas the W K test is more powerful
On the other hand, Table 3 shows that the RS test can be seriously
cess, model (26) with Gaussian innovations, is about 34%. The WK test
may display size distortion as well, although smaller than that of the
RS test, and it seems to diminish for larger sample sizes. Moreover,
results in Table 3 allow us to claim that the WK test is not especially
7 We report the results just for T = 1280 (k = 2, 4, 8 and 16). Results for the rest of
sample sizes are available upon request.
8 The WK test has been computed for six different block lengths as suggested by Whang
and Kim (2003), in the range 2.5 T^0.3 < b < 3.5 T^0.6. Rejection rates displayed in the column
labeled WK are the sample mean of the rejection rates computed for each block length.
suited to detect ARIMA(1,1,0) processes, since power against models (27)
The results in Tables 4–5 show the performance of ranks and signs-
based variance ratio tests computed at several k values with Hochberg and
The results show that the Sidak and Hochberg-type p-value adjustment
of a loss of power, with the exception of R1 and R2 tests against the het-
test against model (26), results concerning both R1 and R2 tests are not reliable
when the time series are conditionally heteroscedastic. On the other
rection, well beyond the nominal level, since the two-step procedure leads
to a conservative test.
Insert Table 6
violated by model (26), now the variance ratio tests which are sensitive to
deviations from this assumption (R1 and R2 ) are oversized. Nevertheless, the
S2 test, which was clearly undersized, now displays an empirical size closer
to the 5% nominal level, mostly at larger sample sizes. This procedure is
of the signs of the data, regardless of the (absolute) size of the increments.
Therefore, the bootstrap approach for S1 and S2 does not need to replicate
However, regardless of the type of adjustment for multiple tests, ranks and
signs-based variance ratio tests are not very powerful against AR(1) stationary
(in levels) processes, although they may improve the power of standard
RS (33.9%) and WK (11.6%) tests against model (26) with normal random
perturbations.
Regarding the CD(·) and RS(·) tests, Tables 7 and 8 allow us to conclude
that the ranks-based tests CD(R1) and CD(R2), and RS(R1) and RS(R2), are
more powerful than their signs-based counterparts, CD(S1) and CD(S2), and
RS(S1) and RS(S2). Moreover, CD(·) tests are slightly more powerful than
RS(·) tests, whereas the size distortion of ranks-based RS(·) tests is somewhat
larger than that of ranks-based CD(·) tests. Hence, it seems that
ranks and signs-based CD(·) tests are superior to RS(·) tests. Moreover,
note that by construction, the CD(R1), CD(R2), RS(R1) and RS(R2) tests are
exact (size identically equal to 5%) in the i.i.d. case (25), while the CD(S1),
CD(S2), RS(S1) and RS(S2) tests are exact in both i.i.d. and uncorrelated
inferior, to the CD(S1) test, with the exception of AR(1) stationary alternatives.
But recall that the CD(S1) test is constructed under the
data sets.9 The first data set consists of dollar-based weekly nominal exchange
rates for the Canadian dollar, French franc, German mark, Japanese
yen, and British pound, and covers the period from August 7, 1974, to May
29, 1996. For each week, the exchange rate is observed on Wednesday, or
the next trading day if the markets are closed on Wednesday. The returns
were constructed as the first differences of the log exchange rates. Wright
(2000) analyzed the same data set, which was obtained from the Journal of
Business and Economic Statistics Data Archive. The original source was the
out by Wright (2000), the original data source is the same as that used by
Liu and He (1991), although the data analyzed by Wright spans a longer
time period.
Insert Table 9
Table 9 (upper panel) shows the results from applying standard multi-
test allows us to reject the null hypothesis just in the Japanese yen/US dollar
case. The RS test rejects the null at the 5% significance level for the
Canadian dollar and Japanese yen exchange rates, and at the 10% level for
the French franc and the German mark. The WK test rejects the null at
different significance levels for several block lengths for all the currencies but
the Canadian dollar. However, we must recall from our Monte Carlo ex-
alternatives, with the worst results in the case of the RS test. Hence, we
should not base our final conclusion about the random walk hypothesis on
these procedures.
variance ratio tests. More specifically, since we do not know a priori whether
the corresponding true data generating processes have zero drift or not, we
Table 9 (lower panel) presents the new results. The Hochberg-type adjustment
for multiple tests allows us to reject the null at the 5% level in
two cases (Canada and Japan) and at the 10% level for the rest of exchange
rates (France, Germany and UK). These conclusions are replicated when us-
Conclusions using the bootstrap-adjusted p-values, however, strongly reject
the null hypothesis for all the exchange rates, at significance levels well below
1%.
Regarding the CD(S2) test, the hypothesis of random walk can be clearly
rejected for all the considered exchange rates, supporting the conclusions
reached with the S2(B) test. Finally, the RS(S2) test rejects the null for all the
All in all, the results suggest that the analyzed exchange rates deviate
from random walk behavior, so we could claim that they do not follow a
random walk. This conclusion aligns with Wright’s (2000) findings,
although this author’s conclusions were not based
in our Monte Carlo work, it is unlikely that the log nominal exchange rates
are stationary in levels, although the exchange rate returns are likely to be
rates for the same currencies, frequency and time span as in the previous
analysis, although now the data source is different. The data used in this
second analysis were certified by the Federal Reserve Bank of New York
and are the noon buying rates in New York for cable transfers payable in
foreign currencies. These are the same data as used by Luger (2003), and
they are available on the Board of Governors of the Federal Reserve System
Web site. For each week, the exchange rate is observed on Wednesday if
the markets are open, or the next trading day otherwise. Again, the returns
Insert Table 10
Table 10 (upper panel) shows the results from applying standard multiple
variance ratio tests. As before, the heteroscedasticity-robust version of the
CD test allows us to reject the null hypothesis just in the Japanese yen/US
dollar case. Now, the RS test rejects the null at the 5% significance level for
the Canadian dollar, Japanese yen and British pound exchange rates, and
at the 10% level for the French franc. As in the previous analysis, the WK
test rejects the null at different significance levels for several block lengths for
ment allows us to reject the null at the 5% level for all the currencies but one,
the German mark, for which the null is rejected at the 10% level. The results
with the Sidak-type adjustment are less conclusive, since the null is rejected
at the 5% level in two cases (France and Japan) and at the 10% level for the
the bootstrap-adjusted p-values strongly reject the null hypothesis for all
Germany and Japan at the 5% level, and for UK at the 10% level.
Again, the results suggest that the major exchange rates against the US
dollar deviate from the random walk behavior, confirming that exchange
rates are not martingale processes. This conclusion sharply contrasts with
Luger (2003), who concludes that the random walk hypothesis cannot be
rejected on the basis of his results using a set of nonparametric random walk
Wright (2000) and Liu and He (1991) findings. Finally, as in the analysis of
the first data set, we could claim that, possibly, the log nominal exchange
rates are not stationary in levels, although the exchange rate returns might
7 Conclusions
The main aim of this paper has been to suggest some alternative multiple
test.
The second set consisted of the ranks and signs versions of Chow and
Denning, and Richardson and Smith tests. The ranks-based procedures are
exact under the i.i.d. assumption whereas the signs-based procedures are
exact under both the i.i.d. and martingale difference sequence assumptions.
These tests work rather well, although the CD(·) tests seem superior
tives.
the CD, the RS and the WK tests. The RS test may display a large size
somewhat oversized against the same alternatives. On the other hand, these
models than our tests, which are more sensitive against ARIMA(1,1,0) mod-
nal exchange rate data against the US dollar, which was already analyzed in
two previous works: Wright (2000) and Luger (2003). We have computed the
sets, strongly reject the null hypothesis of random walk behavior for all the
currencies. Therefore, we could claim that exchange rates for those currencies
do not follow a martingale. This conclusion supports the findings of Liu and
He (1991) and Wright (2000), whereas it contrasts sharply with the conclusions
of Luger (2003). Moreover, the pattern of rejections supports the hypothesis that log
8 References
Breitung, J. and Gouriéroux, C., 1997. Rank tests for unit roots. Journal
Campbell, J.Y., Lo, A.W. and MacKinlay, A.C. 1997. The econometrics of
Cochrane, J.H. 1988. How big is the random walk in GNP? Journal of
Dufour, J-M. and Kiviet, J.F. 1998. Exact inference methods for first-order
Fama, E. 1970. Efficient Capital markets: review of theory and empiri-
Fama, E. 1991. Efficient Capital Markets: II. Journal of Finance, Vol XLVI,
No 5, 1575–1617.
the martingale hypothesis for exchange rates. Journal of Business & Eco-
Hasan, M.N. and Koenker, R.W. 1997. Robust rank tests of the unit root
low random walks: evidence from a simple specification test. The Review of
Financial Studies, vol. 1, no. 1, 41–66.
Lo, A. W. and MacKinlay, A. C. 1989. The size and power of the variance ratio
Luger, R. 2003. Exact non-parametric tests for a random walk with un-
Meese, R. A. and Singleton, K. J. 1982. On the unit roots and the em-
So, B.S. and Shin, D.W. 2001. An invariant sign test for random walks
229.
ing: Examples and Methods for p-Value Adjustment. New York: Wiley.
Whang, Y.-J., Kim, J. 2003. A multiple variance ratio test using subsampling.
Economics Letters 79, 225–230.
1–9.
Table 1: Wright’s individual tests size
R1 R2 S1 S2
T = 320
T = 640
T = 1,280
Table 2: Multiple variance ratio tests critical values
T = 320 T = 640 T = 1,280
k = 2, 4 k = 2, 4, 8 k = 2, 4, 8, 16
Table 3: Size and power of standard multiple VR tests
εt Model CD RS WK
N (0, 1) Size
(25) 0.075 0.054 0.062
(26) 0.040 0.339 0.116
Power
(27) 0.901 0.835 0.162
(28) 0.505 0.840 0.167
(29) 0.990 0.967 0.719
(30) 0.832 0.950 0.584
(31) 0.500 0.446 0.824
(32) 0.161 0.681 0.710
t3 Size
(25) 0.041 0.047 0.062
(26) 0.040 0.273 0.118
Power
(27) 0.819 0.866 0.176
(28) 0.551 0.845 0.188
(29) 0.984 0.985 0.739
(30) 0.879 0.965 0.624
(31) 0.436 0.399 0.825
(32) 0.186 0.631 0.738
The entries are rejection rates, at the 5% significance level. The column WK
displays mean rejection rates over six block lengths in the range 2.5 T^0.3 < b <
3.5 T^0.6.
Table 4: Size and power of ranks and signs VR tests (Hochberg adjustment)
εt Model R1(H) R2(H) S1(H) S2(H)
N (0, 1) Size
(25) 0.036 0.033 0.037 0.002
(26) 0.046 0.091 0.020 0.000
Power
(27) 0.856 0.892 0.511 0.139
(28) 0.817 0.828 0.511 0.191
(29) 0.971 0.980 0.793 0.531
(30) 0.969 0.964 0.848 0.618
(31) 0.573 0.617 0.232 0.004
(32) 0.350 0.514 0.101 0.002
t3 Size
(25) 0.034 0.033 0.036 0.000
(26) 0.041 0.064 0.030 0.002
Power
(27) 0.943 0.955 0.680 0.320
(28) 0.940 0.940 0.732 0.368
(29) 1.000 1.000 0.968 0.847
(30) 0.999 0.996 0.980 0.897
(31) 0.159 0.353 0.036 0.000
(32) 0.117 0.226 0.095 0.018
The entries are rejection rates, at the 5% significance level.
Table 5: Size and power of ranks and signs VR tests (Sidak adjustment)
εt Model R1(S) R2(S) S1(S) S2(S)
N (0, 1) Size
(25) 0.034 0.034 0.037 0.001
(26) 0.043 0.090 0.019 0.000
Power
(27) 0.854 0.891 0.504 0.148
(28) 0.816 0.830 0.503 0.173
(29) 0.968 0.978 0.785 0.435
(30) 0.965 0.964 0.841 0.528
(31) 0.566 0.624 0.202 0.003
(32) 0.375 0.532 0.100 0.002
t3 Size
(25) 0.032 0.033 0.036 0.000
(26) 0.039 0.062 0.030 0.000
Power
(27) 0.941 0.954 0.675 0.290
(28) 0.938 0.941 0.727 0.347
(29) 1.000 0.999 0.963 0.785
(30) 0.999 0.997 0.976 0.844
(31) 0.156 0.323 0.062 0.003
(32) 0.117 0.227 0.094 0.017
The entries are rejection rates, at the 5% significance level.
Table 6: Size and power of ranks and signs VR tests (bootstrap adjustment)
εt Model R1(B) R2(B) S1(B) S2(B)
N (0, 1) Size
(25) 0.042 0.046 0.054 0.047
(26) 0.080 0.134 0.049 0.050
Power
(27) 0.871 0.902 0.541 0.601
(28) 0.844 0.851 0.496 0.629
(29) 0.974 0.989 0.817 0.844
(30) 0.971 0.980 0.853 0.900
(31) 0.619 0.664 0.251 0.152
(32) 0.407 0.579 0.135 0.064
t3 Size
(25) 0.051 0.043 0.047 0.047
(26) 0.072 0.094 0.049 0.036
Power
(27) 0.961 0.961 0.704 0.807
(28) 0.928 0.937 0.697 0.743
(29) 0.997 1.000 0.957 0.969
(30) 0.997 0.996 0.964 0.984
(31) 0.199 0.378 0.072 0.047
(32) 0.112 0.259 0.111 0.122
The entries are rejection rates, at the 5% significance level. Number of bootstrap
replications N = 200.
Table 7: Size and power of CD(·) tests
εt Model CD(R1) CD(R2) CD(S1) CD(S2)
N (0, 1) Size
(25) 0.050 0.050 0.050 0.000
(26) 0.081 0.122 0.050 0.000
Power
(27) 0.868 0.891 0.494 0.297
(28) 0.822 0.850 0.528 0.342
(29) 0.981 0.987 0.838 0.696
(30) 0.978 0.977 0.880 0.762
(31) 0.586 0.641 0.227 0.028
(32) 0.397 0.577 0.117 0.012
t3 Size
(25) 0.050 0.050 0.050 0.000
(26) 0.061 0.081 0.050 0.000
Power
(27) 0.970 0.970 0.719 0.531
(28) 0.949 0.958 0.738 0.550
(29) 0.997 0.998 0.968 0.924
(30) 0.998 0.999 0.976 0.940
(31) 0.191 0.392 0.063 0.008
(32) 0.117 0.248 0.125 0.062
The entries are rejection rates, at the 5% significance level.
Table 8: Size and power of RS(·) tests
εt Model RS(R1) RS(R2) RS(S1) RS(S2)
N (0, 1) Size
(25) 0.050 0.050 0.050 0.007
(26) 0.085 0.150 0.050 0.009
Power
(27) 0.794 0.826 0.449 0.230
(28) 0.784 0.815 0.488 0.277
(29) 0.960 0.974 0.755 0.538
(30) 0.938 0.942 0.773 0.618
(31) 0.431 0.487 0.159 0.014
(32) 0.340 0.548 0.114 0.015
t3 Size
(25) 0.050 0.050 0.050 0.009
(26) 0.065 0.105 0.050 0.005
Power
(27) 0.905 0.920 0.612 0.428
(28) 0.904 0.926 0.646 0.445
(29) 0.996 0.996 0.943 0.865
(30) 0.993 0.994 0.961 0.906
(31) 0.158 0.301 0.051 0.006
(32) 0.121 0.233 0.101 0.039
The entries are rejection rates, at the 5% significance level.
Table 9: Application of multiple variance ratio tests. Wright (2000) data
Test      Canada     France     Germany    Japan      UK
S2(H)     0.044∗∗    0.058∗     0.068∗     0.020∗∗    0.057∗
S2(S)     0.049∗∗    0.077∗     0.066∗     0.039∗∗    0.072∗
S2(B)     0.001∗∗    0.002∗∗    0.004∗∗    0.000∗∗    0.005∗∗
CD(S2)    3.293∗∗    2.630∗∗    2.757∗∗    4.733∗∗    2.800∗∗
RS(S2)    12.193∗∗   13.611∗∗   10.996∗∗   31.260∗∗   8.676
Statistics computed by setting four k values (2, 4, 8 and 16). ∗ and ∗∗ denote
significance at the 10% and 5% level, respectively. Rows b1, . . . , b6 display the 5%
critical values of the WK test for each block length in the range 2.5 T^0.3 < b <
3.5 T^0.6. Critical values used to test the significance of the S2(H), S2(S), CD(S2) and
RS(S2) tests correspond to the sample size T = 1,280. The bootstrap-adjusted
p-values have been computed by using N = 1,000 replications.
Table 10: Application of multiple variance ratio tests. Luger (2003) data
Test      Canada     France     Germany    Japan      UK
S2(H)     0.043∗∗    0.024∗∗    0.054∗     0.010∗∗    0.046∗∗
S2(S)     0.050∗     0.043∗∗    0.077∗     0.039∗∗    0.058∗
S2(B)     0.000∗∗    0.001∗∗    0.005∗∗    0.000∗∗    0.001∗∗
CD(S2)    3.270∗∗    3.613∗∗    2.630∗∗    5.498∗∗    3.040∗∗
RS(S2)    12.193∗∗   14.225∗∗   11.281∗∗   31.928∗∗   9.665∗
Statistics computed by setting four k values (2, 4, 8 and 16). ∗ and ∗∗ denote
significance at the 10% and 5% level, respectively. Rows b1, . . . , b6 display the 5%
critical values of the WK test for each block length in the range 2.5 T^0.3 < b <
3.5 T^0.6. Critical values used to test the significance of the S2(H), S2(S), CD(S2) and
RS(S2) tests correspond to the sample size T = 1,280. The bootstrap-adjusted
p-values have been computed by using N = 1,000 replications.