Ranks and Signs-Based Multiple Variance Ratio Test

Article · January 2004

Ranks and signs-based multiple variance ratio tests∗
Jorge Belaire-Franch† Dulce Contreras‡
Abstract
This study proposes multiple variance ratio tests based on ranks
and signs. One set of procedures is based on adjustments for multiple
tests applied on ranks and signs-based univariate variance ratio tests
recently developed by Wright (2000). The other set of tests consists of
straightforward adaptations of standard multiple variance ratio tests.
Simulations confirm that the procedures are reliable, and their power
can be superior to alternative multiple variance ratio tests. The meth-
ods are illustrated by tests of the random walk hypothesis in exchange
rates for five major currencies against the US dollar.
Key words: Variance ratio, ranks, signs, adjustment for multiplicity.
JEL Classification: G12, G14, G15.
∗ The authors acknowledge financial support from the Consejería de Ciencia y
Tecnología de Castilla-La Mancha, project PAC–02–001, and from the Ministerio de
Ciencia y Tecnología, project SEC2003-09205. We thank Jonathan Wright, Richard
Luger, Yoo-Jae Whang and Kamil Yilmaz for kindly sharing their computer codes.
† Corresponding author. Department of Economic Analysis. University of Valencia.
Avgda. dels Tarongers s/n. 46022 Valencia (Spain). Phone: 34-96-3828246, fax:
34-96-3828249, e-mail address: [email protected]
‡ Department of Economic Analysis. University of Valencia. Avgda. dels Tarongers
s/n. 46022 Valencia (Spain). Phone: 34-96-3828246, fax: 34-96-3828249, e-mail
address: [email protected]
1 Introduction
There exists a long tradition in the literature concerning the test of the
random walk hypothesis, both in macroeconomics and finance. For instance,
the random walk hypothesis provides a means to test the weak-form efficiency
(and hence, non-predictability) of financial markets (Fama, 1970; 1991).
Given a time series $\{X_t\}_{t=1}^{T}$, the random walk hypothesis corresponds to
$\phi = 1$ in the first-order autoregressive model
$$X_t = \mu + \phi X_{t-1} + \varepsilon_t,$$
where µ is an unknown drift parameter and the error terms εt are, in general,
neither independent nor identically distributed. Past evidence suggests that
some asset price series follow a random walk process (see Meese and Single-
ton, 1982; Baillie and Bollerslev, 1989; Hsieh, 1988). However, alternative
test procedures may point towards conflicting conclusions.1
There is a large number of statistical tests with different properties and
sensitivity to specific alternatives. Among them, Cochrane’s (1988)
variance-ratio methodology is one of the most popular methods, especially
in finance, since the seminal work of Lo and MacKinlay (1988), who derive
the asymptotic distribution for the variance-ratio statistic.
The variance ratio methodology consists of testing the random walk hy-
pothesis against stationary alternatives, by exploiting the fact that the vari-
ance of random walk increments is linear in all sampling intervals. The
variance ratio at lag k is defined as the ratio of (1/k) times the variance of
the kth difference to the variance of the first difference. Hence, one would
expect that, for a random walk process, the variance ratio computed at each
individual lag interval k = 2, 3, . . . would be equal to unity.
1 Nevertheless, the uncorrelatedness of asset returns is neither a necessary nor a
sufficient condition for market efficiency; see Campbell et al. (1997), p. 31.
Lo and MacKinlay (1988) propose two statistics for testing an individ-
ual variance ratio estimate: the homoscedasticity-robust M1 (k) and the
heteroscedasticity-robust M2 (k). The latter is especially important due to
the changing volatility of financial assets. In practice, it is customary to
examine M1 (k) and/or M2 (k) for several k values, and the null is rejected if
it is rejected for some k value. However, as stressed by Chow and Denning
(1993), this sequential procedure leads to an oversized testing strategy. In
this context, Chow and Denning propose using Hochberg’s (1974) multiple
comparison tests, which allow us to examine a vector of individual
variance ratio tests while controlling for overall test size.
Nevertheless, as pointed out by Fong et al. (1997), Hochberg’s approach
is valid only if the vector of test statistics is multivariate normal. This
condition is satisfied by variance ratios if there is little overlap in the data
(i.e., if k/N is small).
On the other hand, as stressed by Luger (2003), the validity of the variance
ratio test depends on several assumptions that often might not hold in time
series. For instance, the tail fatness encountered in some financial applications
may cause the variance ratio test to suffer from large size distortions.
Alternative exact parametric tests have been proposed to test for the ran-
dom walk hypothesis (Dufour and Kiviet, 1998; Dufour and Torrès, 2000,
among many others). In a recent paper, Wright (2000) proposes exact non-
parametric variance-ratio tests based on signs and ranks which do not de-
pend on the existence of moments, and the sign-based version is exact even
in the presence of conditional heteroscedasticity. Using signs and ranks,
instead of the original observations, has also been considered by other au-
thors to construct robust versions of standard random walk tests (Hasan
and Koenker, 1997; Breitung and Gourièroux, 1997; So and Shin, 2001).
In a more recent work, Luger (2003) uses ranks and signs to extend the
Campbell and Dufour (1997) nonparametric approach to test for random
walk with unknown drift.
This paper explores straightforward extensions of Wright’s ranks and
signs-based variance ratio tests. One approach is based on considering the
variance ratio test computed at each lag value k as a different statistical
test of the same null hypothesis. Then, alternative adjustments for multiple
tests, as suggested by Psaradakis (2000) in the context of linearity testing,
may control the size of the sequential variance ratio testing. The other
approach consists of using the distribution of standard multiple variance
ratio tests adapted to the ranks and signs context. Both approaches allow
us to control the test size, and they can be more powerful than alternative
multiple variance ratio tests.
The remainder of the paper is organized as follows. Section 2 reviews
standard univariate and multiple variance ratio tests. Section 3 reviews
Wright’s tests and assesses the size distortion due to sequential testing. Section
4 presents the alternative methods to test for random walk in a multiple
variance ratio context. Section 5 presents an extensive Monte Carlo study to
analyze the performance of the tests in finite samples. Evidence concerning
exchange rate returns against the US dollar is revisited in section 6 using
the suggested methods. Finally, section 7 concludes.
2 Random walk tests
2.1 Lo and MacKinlay (1989) variance ratio tests
The variance ratio test proposed by Lo and MacKinlay (1988, 1989) is based
on the fact that, for a random walk series, the variance of its k-th difference
is k times the variance of its first difference. For example, if a series follows
a random walk, the variance of its four-day difference will be four times as
large as the variance of its daily difference.
The hypothesis to be tested is H0 : the time series follows a random walk,
vs. H1 : the time series does not follow a random walk. Let {Xt } denote a
time series consisting of T + 1 observations X0 , . . . , XT . The variance ratio
statistic of the k-th difference of the variable is defined as:
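$$VR(k) = \frac{\hat\sigma^2(k)}{\hat\sigma^2(1)}, \qquad (1)$$
where:
$$\hat\sigma^2(k) = \frac{1}{m}\sum_{t=k}^{T}\left(X_t - X_{t-k} - k\hat\mu\right)^2,$$
$$m = k(T-k+1)(T-k)/T,$$
$$\hat\mu = \frac{1}{T}\sum_{t=1}^{T}\left(X_t - X_{t-1}\right)$$
and:
$$\hat\sigma^2(1) = \frac{1}{T-1}\sum_{t=1}^{T}\left(X_t - X_{t-1} - \hat\mu\right)^2.$$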
The test statistic $M_1(k)$ is given by:
$$M_1(k) = \frac{VR(k) - 1}{\phi(k)^{1/2}}, \qquad (2)$$
which, under the assumption of homoscedasticity, is asymptotically
distributed as $N(0, 1)$. The asymptotic variance, $\phi(k)$, is given by:
$$\phi(k) = \frac{2(2k-1)(k-1)}{3kT}. \qquad (3)$$
The test statistic $M_2(k)$, which is robust under heteroscedasticity, is:
$$M_2(k) = \frac{VR(k) - 1}{\phi^*(k)^{1/2}}, \qquad (4)$$
where
$$\phi^*(k) = \sum_{j=1}^{k-1}\left[\frac{2(k-j)}{k}\right]^2 \delta(j), \qquad (5)$$
$$\delta(j) = \frac{\sum_{t=j+1}^{T}\left(X_t - X_{t-1} - \hat\mu\right)^2\left(X_{t-j} - X_{t-j-1} - \hat\mu\right)^2}{\left[\sum_{t=1}^{T}\left(X_t - X_{t-1} - \hat\mu\right)^2\right]^2}. \qquad (6)$$
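As an illustration of equations (1) to (6), the following is a minimal Python sketch that computes VR(k), M1(k) and M2(k) from a series of levels. The function name `variance_ratio_tests` and the simulated input are assumptions made for this example and are not taken from the authors' code.

```python
import math
import random


def variance_ratio_tests(x, k):
    """Return (VR(k), M1(k), M2(k)) for levels x = [X_0, ..., X_T], k >= 2."""
    T = len(x) - 1
    mu = (x[T] - x[0]) / T                              # mean first difference
    d = [x[t] - x[t - 1] - mu for t in range(1, T + 1)]  # demeaned increments
    ssq = sum(e * e for e in d)
    var1 = ssq / (T - 1)
    m = k * (T - k + 1) * (T - k) / T
    vark = sum((x[t] - x[t - k] - k * mu) ** 2 for t in range(k, T + 1)) / m
    vr = vark / var1                                    # eq. (1)
    phi = 2 * (2 * k - 1) * (k - 1) / (3 * k * T)       # eq. (3)
    phi_star = 0.0
    for j in range(1, k):
        # d[i] is the increment at time t = i + 1, so d[i - j] is time t - j.
        delta = sum(d[i] ** 2 * d[i - j] ** 2 for i in range(j, T)) / ssq ** 2
        phi_star += (2 * (k - j) / k) ** 2 * delta      # eq. (5)
    m1 = (vr - 1) / math.sqrt(phi)                      # eq. (2)
    m2 = (vr - 1) / math.sqrt(phi_star)                 # eq. (4)
    return vr, m1, m2


# Illustrative use on a simulated Gaussian random walk (hypothetical data).
rng = random.Random(0)
walk = [0.0]
for _ in range(500):
    walk.append(walk[-1] + rng.gauss(0, 1))
print(variance_ratio_tests(walk, 4))
```

Under the random walk null, VR(k) should be close to one and both statistics should behave approximately like standard normal draws for small k relative to T.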
2.2 Multiple variance ratio tests
As stressed in the Introduction, the application of variance ratio tests for
multiple k values leads to over-rejection of the null hypothesis, that is,
rejection rates above the nominal size.
The Chow and Denning test (CD test hereafter) provides a procedure
for the multiple comparison of the set of variance ratio estimates with unity.
For a set of m test statistics, the random walk null hypothesis is rejected
if any one of the estimated variance ratios is significantly different from
one. Hence only the maximum absolute value in the set of test statistics is
considered. The CD test is based on the result
$$\text{Prob}\left[\max\left(|M_j(k_1)|, |M_j(k_2)|, \dots, |M_j(k_m)|\right) \le SMM(\alpha; m; T)\right] \ge 1 - \alpha.$$
Therefore, Chow and Denning (1993) suggest comparing the maximum
M1 or M2 statistic (in absolute value) with the asymptotic α-point critical
value of the studentized maximum modulus, SMM(α; m; ∞), where m is
the number of k values.2
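The asymptotic critical value can be sketched in a few lines of Python, assuming the usual result that SMM(α; m; ∞) equals the upper α*/2 point of the standard normal distribution with α* = 1 − (1 − α)^(1/m). The function names below are hypothetical and chosen only for this illustration.

```python
from statistics import NormalDist


def smm_critical_value(alpha, m):
    """Asymptotic SMM(alpha; m; infinity) critical value (assumed formula)."""
    alpha_star = 1 - (1 - alpha) ** (1 / m)   # per-comparison level
    return NormalDist().inv_cdf(1 - alpha_star / 2)


def chow_denning_rejects(stats, alpha=0.05):
    """Reject the null if max |M(k_i)| exceeds the SMM critical value."""
    return max(abs(s) for s in stats) > smm_critical_value(alpha, len(stats))


# For m = 4 and alpha = 0.05 the result should be close to the value
# quoted in footnote 2.
print(round(smm_critical_value(0.05, 4), 3))
```

The decision rule uses only the largest statistic in absolute value, so a single extreme M(k) is enough to reject.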
In a recent work, Yilmaz (2003) computes Richardson and Smith’s
(1991) joint variance ratio test on daily nominal exchange rates, as suggested
by Fong et al. (1997). This test is based on the Wald statistic:
$$T\,(\mathbf{VR} - \mathbf{1}_m)'\,\Phi^{-1}\,(\mathbf{VR} - \mathbf{1}_m) \sim \chi^2_m \qquad (7)$$
where $\mathbf{VR}$ is the $(m \times 1)$ vector of $m$ sample variance ratios, $\mathbf{1}_m$ is the
$(m \times 1)$ unit vector, and $\Phi$ is the covariance matrix of $\mathbf{VR}$. For any lags
$r$ and $s$, the analytical expression for the covariance matrix is (Fong et al.,
1997):
$$\Phi = \begin{pmatrix} \dfrac{2(2r-1)(r-1)}{3r} & \dfrac{2(3s-r-1)(r-1)}{3s} \\[2ex] \dfrac{2(3s-r-1)(r-1)}{3s} & \dfrac{2(2s-1)(s-1)}{3s} \end{pmatrix} \qquad (8)$$
The usefulness of this test relies on the fact that, whenever the variance
ratio tests are computed over long lags with overlapping observations, the
distribution of the variance ratio test is nonnormal; then, neither the Lo
and MacKinlay test nor Chow and Denning’s procedure is valid for drawing
inferences. Fong et al. (1997) show that the Richardson and Smith’s test
(RS test hereafter) can be more powerful than Chow and Denning’s multiple
comparison test for empirically relevant alternatives, and it displays low size
distortion in the presence of heteroscedastic increments.
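For the two-lag case, the Wald statistic in (7) with the covariance matrix in (8) can be sketched as follows. This is an illustrative Python fragment under the assumption r < s; the function names and the numerical inputs are hypothetical, and the 2 × 2 inverse is written out by hand to keep the example free of external libraries.

```python
import math


def rs_covariance(r, s):
    """2x2 matrix Phi of eq. (8) for lags r < s."""
    off = 2 * (3 * s - r - 1) * (r - 1) / (3 * s)
    return [[2 * (2 * r - 1) * (r - 1) / (3 * r), off],
            [off, 2 * (2 * s - 1) * (s - 1) / (3 * s)]]


def rs_statistic(vr_r, vr_s, r, s, T):
    """Wald statistic T (VR - 1)' Phi^{-1} (VR - 1) for two lags."""
    (a, b), (_, d) = rs_covariance(r, s)
    det = a * d - b * b
    u, v = vr_r - 1, vr_s - 1
    # Quadratic form using the closed-form inverse of a symmetric 2x2 matrix.
    return T * (d * u * u - 2 * b * u * v + a * v * v) / det


# With m = 2 the chi-square critical value has the closed form -2 ln(alpha).
crit_5pct = -2 * math.log(0.05)
stat = rs_statistic(1.10, 1.00, 2, 4, 1000)   # hypothetical variance ratios
print(stat, stat > crit_5pct)
```

For more than two lags the same quadratic form applies with an m × m matrix, which would require a general linear solver.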
Whang and Kim (2003) develop a multiple variance ratio test (WK test
hereafter) which uses a subsampling procedure to approximate the asymptotic
critical values.
2 For a number of k values equal to four, the asymptotic critical value is 2.491, at the
5% significance level.
Whang and Kim consider the following joint null hypothesis:
$$H_0 : M_r(k) = 0 \quad \text{for all } k = 2, \dots, k_{max},$$
against the alternative:
$$H_1 : M_r(k) \neq 0 \quad \text{for some } k,$$
where $M_r(k) = VR(k) - 1$. To test the null hypothesis, Whang and Kim
suggest the statistic:
$$MV_T = \max_{k}\left|\sqrt{T}\, M_r(k)\right|.$$
They show that the asymptotic null distribution of the statistic is that of a
maximum of a multivariate normal vector with unknown covariance matrix,
which is complicated to estimate. Therefore, Whang and Kim propose to
approximate the null distribution by means of the subsampling approach.
Given the block size, the M VT statistic is computed for each subsam-
ple, and the original M VT statistic is compared to the 95% quantile of the
corresponding tabulated empirical distribution. The null is rejected at the
5% level if the original M VT is larger than the empirical (one-tailed) 95%
critical value.
3 Wright (2000) variance ratio tests
In a recent paper, Wright (2000) proposes the use of signs and ranks of dif-
ferences in place of the differences in the Lo and MacKinlay tests. Wright
demonstrates that his nonparametric variance ratio tests based on ranks (R1
and R2) and signs (S1 and S2), can be more powerful than the tests suggested
by Lo and MacKinlay. They have high power against a wide range
of models displaying serial correlation, including fractionally integrated al-
ternatives. The tests based on ranks are exact under the independence
and identical distribution assumption, whereas the tests based on signs are
exact even under conditional heteroscedasticity. Moreover, Wright (2000)
shows that ranks-based tests display low size distortion, under conditional
heteroscedasticity.
Given T observations of first differences of a variable, $\{y_1, \dots, y_T\}$, Wright’s
proposed $R_1$ and $R_2$ are defined as:3
$$R_1(k) = \left(\frac{\frac{1}{Tk}\sum_{t=k}^{T}\left(r_{1,t} + \dots + r_{1,t-k+1}\right)^2}{\frac{1}{T}\sum_{t=1}^{T} r_{1,t}^2} - 1\right) \times \phi(k)^{-1/2}, \qquad (9)$$
$$R_2(k) = \left(\frac{\frac{1}{Tk}\sum_{t=k}^{T}\left(r_{2,t} + \dots + r_{2,t-k+1}\right)^2}{\frac{1}{T}\sum_{t=1}^{T} r_{2,t}^2} - 1\right) \times \phi(k)^{-1/2}, \qquad (10)$$
where
$$r_{1,t} = \left(r(y_t) - \frac{T+1}{2}\right) \Big/ \sqrt{\frac{(T-1)(T+1)}{12}},$$
$$r_{2,t} = \Phi^{-1}\left(r(y_t)/(T+1)\right).$$
$\phi(k)$ is defined in (3), $r(y_t)$ is the rank of $y_t$ among $y_1, \dots, y_T$, and $\Phi^{-1}$ is
the inverse of the standard normal cumulative distribution function. The
tests based on the signs of first differences are given by:
$$S_1(k) = \left(\frac{\frac{1}{Tk}\sum_{t=k}^{T}\left(s_t + \dots + s_{t-k+1}\right)^2}{\frac{1}{T}\sum_{t=1}^{T} s_t^2} - 1\right) \times \phi(k)^{-1/2}, \qquad (11)$$
$$S_2(k) = \left(\frac{\frac{1}{Tk}\sum_{t=k}^{T}\left(s_t(\mu) + \dots + s_{t-k+1}(\mu)\right)^2}{\frac{1}{T}\sum_{t=1}^{T} s_t(\mu)^2} - 1\right) \times \phi(k)^{-1/2}, \qquad (12)$$
where $\phi(k)$ is defined in (3), $s_t = 2u(y_t, 0)$, $s_t(\mu) = 2u(y_t, \mu)$, and
$$u(x_t, q) = \begin{cases} 0.5 & \text{if } x_t > q, \\ -0.5 & \text{otherwise.} \end{cases}$$
3 Note that formulas (9), (10), (11) and (12) differ from Wright’s original formulas.
In private correspondence, Wright acknowledges the typographical errors in his paper.
Numerical results and estimated critical values in his article were computed using the
formulation displayed in the present work.
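Statistics (9) to (12) share a common form: a variance ratio computed on a transformed series, standardized by φ(k). The Python sketch below illustrates this under stated assumptions: the helper names (`vr_stat`, `ranks`, `wright_R1`, `wright_R2`, `wright_S`) are hypothetical, ties in the ranks are broken by order of appearance, and the same function covers S1 (drift 0) and S2(k, b) (drift b).

```python
import math
from statistics import NormalDist


def phi(k, T):
    """Asymptotic variance of eq. (3)."""
    return 2 * (2 * k - 1) * (k - 1) / (3 * k * T)


def vr_stat(z, k):
    """Common form of (9)-(12) applied to a transformed series z."""
    T = len(z)
    num = sum(sum(z[t - k + 1:t + 1]) ** 2 for t in range(k - 1, T)) / (T * k)
    den = sum(v * v for v in z) / T
    return (num / den - 1) / math.sqrt(phi(k, T))


def ranks(y):
    """Rank of each y_t (1 = smallest); ties broken by order of appearance."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    r = [0] * len(y)
    for pos, i in enumerate(order, start=1):
        r[i] = pos
    return r


def wright_R1(y, k):
    T = len(y)
    scale = math.sqrt((T - 1) * (T + 1) / 12)
    return vr_stat([(r - (T + 1) / 2) / scale for r in ranks(y)], k)


def wright_R2(y, k):
    T = len(y)
    inv_cdf = NormalDist().inv_cdf
    return vr_stat([inv_cdf(r / (T + 1)) for r in ranks(y)], k)


def wright_S(y, k, drift=0.0):
    """S1(k) when drift = 0; S2(k, b) when drift = b."""
    return vr_stat([1.0 if v > drift else -1.0 for v in y], k)
```

Since the signs take values ±1, the denominator of the signs-based statistic is identically one, so only the numerator varies with the data.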
Thus, S1 assumes a zero drift value. If the value of the drift parameter is
unknown, the procedure described in Luger (2003), based on Campbell and
Dufour (1997), is applied to compute S2 . This method consists of a two-step
strategy.
First, an exact confidence interval for the drift parameter µ, valid under
the null hypothesis, is established. Denote by $y_{(1)}, \dots, y_{(T)}$ the order statistics
of the sample $y_1, \dots, y_T$. An exact confidence interval $CI_\mu(\alpha_1)$ for µ with
level $1 - \alpha_1$ is given by $[y_{(h+1)}, y_{(T-h)}]$, where $h$ is the largest integer such
that $\Pr[B \le h] \le \alpha_1/2$, for $B$ a binomial random variable with number of
trials $T$ and probability of success 1/2. The second step consists of computing
the $S_2$ statistic for each candidate value $b$ for the drift parameter in
the confidence interval. The value of the $S_2$ statistic (retaining the sign) at
aggregation interval $k$ is then defined as
$$S_2(k) = \inf\{|S_2(k, b)| : b \in CI_\mu(\alpha_1)\}$$
where, given $b \in CI_\mu(\alpha_1)$, $S_2(k, b)$ is computed by defining $s_t(b) = 2u(y_t, b)$.
The chosen $S_2$ value is compared to the appropriate critical values for an
$\alpha_2$ level test, such that the overall level of the strategy is bounded by
$\alpha = \alpha_1 + \alpha_2$. In this paper, we have set $\alpha_1 = 0.01$ and $\alpha_2 = 0.04$.
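The two-step strategy can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the confidence interval is taken as closed, and the infimum is searched over the order statistics inside the interval, which suffices because the signs $s_t(b)$ change only when $b$ crosses an observation.

```python
import math


def sign_stat(y, k, b):
    """S2(k, b): signs-based statistic with s_t(b) = 1 if y_t > b, else -1."""
    T = len(y)
    s = [1.0 if v > b else -1.0 for v in y]
    num = sum(sum(s[t - k + 1:t + 1]) ** 2 for t in range(k - 1, T)) / (T * k)
    phi = 2 * (2 * k - 1) * (k - 1) / (3 * k * T)
    return (num - 1) / math.sqrt(phi)   # (1/T) * sum of s_t^2 equals 1


def largest_h(T, alpha1):
    """Largest h with Pr[B <= h] <= alpha1 / 2, B ~ Binomial(T, 1/2)."""
    cum, h = 0, -1
    for i in range(T + 1):
        cum += math.comb(T, i)
        if cum / 2 ** T > alpha1 / 2:
            break
        h = i
    return h


def two_step_S2(y, k, alpha1=0.01):
    """Signed S2(k, b) of smallest absolute value over b in CI_mu(alpha1)."""
    T = len(y)
    h = largest_h(T, alpha1)
    if h < 0:
        raise ValueError("sample too small for the requested alpha1")
    ys = sorted(y)
    lo, hi = ys[h], ys[T - h - 1]       # y_(h+1) and y_(T-h), 1-based
    candidates = [v for v in ys if lo <= v <= hi]
    return min((sign_stat(y, k, b) for b in candidates), key=abs)
```

Taking the value closest to zero makes the second step conservative, in line with the bound α = α1 + α2 on the overall level.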
However, as pointed out by Wright (2000), using several k values would
lead to an over-rejection of the null hypothesis, as in the context of Lo and
MacKinlay’s tests. This phenomenon is illustrated through the following
experiment. 1,000 samples of three alternative models with zero population
autocorrelation have been generated:
1. Model 1.
$$x_t \sim N(0, 1)$$
2. Model 2.
$$x_t \sim t_3$$
3. Model 3.
$$h_t = 0.95\,h_{t-1} + \xi_t$$
$$x_t = \exp(0.5\,h_t)\,\varepsilon_t$$
where $\varepsilon_t \sim N(0, 1)$ and $\xi_t \sim N(0, 0.1)$ are independent random variables,
and Model 3 is a stochastic volatility model.
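Samples from the three models can be generated with a short Python sketch such as the one below. Two points are assumptions of this illustration rather than statements from the text: N(0, 0.1) is read as a variance of 0.1, and the volatility process is started from its stationary distribution. The function name `simulate_model` is hypothetical.

```python
import math
import random


def simulate_model(model, T, rng):
    """Draw T observations x_t from Model 1, 2 or 3."""
    if model == 1:
        return [rng.gauss(0, 1) for _ in range(T)]
    if model == 2:
        # Student t with 3 degrees of freedom: Z / sqrt(chi2_3 / 3).
        out = []
        for _ in range(T):
            chi2 = sum(rng.gauss(0, 1) ** 2 for _ in range(3))
            out.append(rng.gauss(0, 1) / math.sqrt(chi2 / 3))
        return out
    if model == 3:
        sd_xi = math.sqrt(0.1)                 # assumption: 0.1 is a variance
        h = rng.gauss(0, sd_xi / math.sqrt(1 - 0.95 ** 2))  # stationary start
        out = []
        for _ in range(T):
            h = 0.95 * h + rng.gauss(0, sd_xi)
            out.append(math.exp(0.5 * h) * rng.gauss(0, 1))
        return out
    raise ValueError("model must be 1, 2 or 3")


rng = random.Random(123)
samples = {m: simulate_model(m, 320, rng) for m in (1, 2, 3)}
```

All three models have zero population autocorrelation, so any rejection of the random walk null on these samples is a false rejection.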
For each artificial sample, Wright’s ranks and signs-based variance ratio tests
are computed for two k values (2 and 4) when T = 320, three k values (2, 4
and 8) when T = 640 and four k values (2, 4, 8 and 16) when T = 1,280,
and we analyze the impact on the empirical size of each individual test. The
null hypothesis of random walk is rejected if, for some k value, the statistic
is above the 97.5% percentile or below the 2.5% percentile of the corresponding
tabulated distribution.4
Insert Table 1
Results in Table 1 confirm the argument that using variance ratio tests at
various aggregation intervals leads to rejection rates larger than the nominal
size.5
4 Ranks and signs-based multiple variance ratio tests
As shown in the previous section, Wright’s (2000) tests suffer from size distortions
when they are sequentially applied at several k values. This section proposes
different approaches to control the size of Wright’s tests.
4.1 Multiplicity adjustments
A first approach consists of applying p-value adjustments for multiplicity, in
line with Psaradakis (2000).
We compute the Sidak-adjusted p-value for each test j as:
$$\tilde p^{(S)}_{ji} = 1 - (1 - p_{ji})^m,$$
$i = 1, \dots, m$, where $p_{ji}$ is the p-value corresponding to the variance ratio
test $j$ computed for an individual $k$ value, and $m$ is the number of $k$ values. The
Hochberg (1988) adjusted p-values are obtained as:
$$\tilde p^{(H)}_{ji} = \min\{[m - R(p_{ji}) + 1]\,p_{ji},\ 1\},$$
where $R(p_{ji})$ is the rank of $p_{ji}$ among the $m$ p-values of test $j$.
Given a significance level α, the decision rule states that, using
variance ratio test $j$, the null is rejected if $\tilde p^{(S)}_j = \min_{1\le i\le m} \tilde p^{(S)}_{ji} \le \alpha$ or
$\tilde p^{(H)}_j = \min_{1\le i\le m} \tilde p^{(H)}_{ji} \le \alpha$.
4 Critical values can be found in Wright (2000), Table 1, p. 3. Critical values for the
S2 test at α2 = 4% level have been simulated through 100,000 replications.
5 Note that false rejections notably increase for R1 and R2 tests against Model 3, since
those tests are not robust in the presence of heteroscedasticity.
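Both adjustments are simple to compute once the m individual p-values are available. The Python sketch below is illustrative: the function names are hypothetical, and it assumes that the rank R(·) is taken in ascending order (the smallest p-value has rank 1) and that the multiplier is the number of tests m.

```python
def sidak_adjust(pvals):
    """Sidak-adjusted p-values: 1 - (1 - p_i)^m."""
    m = len(pvals)
    return [1 - (1 - p) ** m for p in pvals]


def hochberg_adjust(pvals):
    """Hochberg-type adjustment: min{(m - R(p_i) + 1) * p_i, 1}."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    rank = [0] * m
    for pos, i in enumerate(order, start=1):   # rank 1 = smallest p-value
        rank[i] = pos
    return [min((m - rank[i] + 1) * pvals[i], 1.0) for i in range(m)]


def rejects(pvals, alpha=0.05, adjust=sidak_adjust):
    """Reject the null if the smallest adjusted p-value is at most alpha."""
    return min(adjust(pvals)) <= alpha


# Hypothetical p-values of one test computed at k = 2, 4, 8 and 16.
p = [0.030, 0.012, 0.200, 0.450]
print(sidak_adjust(p), hochberg_adjust(p), rejects(p))
```

In this hypothetical example the smallest raw p-value (0.012) would reject at the 5% level on its own, and it still does after either adjustment, although with a larger adjusted value.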
These methods, however, assume that the test statistics computed at
different intervals are uncorrelated. In order to take into account possible
correlations among the statistics, bootstrap-adjusted p-values can be com-
puted as described in Westfall and Young (1993) and Psaradakis (2000).
The goal of the procedure is to obtain an approximation to the null sampling
distribution of $\min_{1\le i\le m} p_{ji}$, as follows. First, one simulates $N$ bootstrap
samples, each of size $T$, by resampling with replacement from the
original first differences. Then, for the $n$th bootstrap sample, we compute
the value of the $m$ variance ratio test statistics and the associated
p-values $p^*_{j1,n}, \dots, p^*_{jm,n}$, repeating the same process for $n = 1, \dots, N$, obtaining
the sample $\{\min_{1\le i\le m} p^*_{ji,n} : n = 1, \dots, N\}$. The empirical distribution
of $\{\min_{1\le i\le m} p^*_{ji,n} : n = 1, \dots, N\}$ is an estimate of the bootstrap
approximation to the sampling distribution of $\min_{1\le i\le m} p_{ji}$ under the null
hypothesis of independent and identically distributed (i.i.d.) increments.
Bootstrap-adjusted p-values are computed as
$$\tilde p^{(N)}_{ji} = \frac{1}{N}\sum_{n=1}^{N} I_{(-\infty,0]}\left(\min_{1\le l\le m} p^*_{jl,n} - p_{ji}\right), \qquad i = 1, \dots, m,$$
where $I_A(\omega)$ is an indicator function, equal to 1 if $\omega \in A$ and 0 otherwise.
We reject the null hypothesis with test $j$ if $\min_{1\le i\le m} \tilde p^{(N)}_{ji} \le \alpha$.
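The resampling scheme above can be sketched in Python as follows. The argument `pvalue_fn` is a hypothetical callable, assumed to map a sample of first differences to the list of m p-values of one test (one per k value); how those p-values are obtained is left outside this sketch.

```python
import random


def bootstrap_adjusted_pvalues(y, pvalue_fn, n_boot=200, seed=0):
    """Bootstrap-adjusted p-values based on the distribution of min p*."""
    rng = random.Random(seed)
    T = len(y)
    p_obs = pvalue_fn(y)
    min_p_star = []
    for _ in range(n_boot):
        # Resample the first differences with replacement.
        sample = [y[rng.randrange(T)] for _ in range(T)]
        min_p_star.append(min(pvalue_fn(sample)))
    # Share of bootstrap samples whose smallest p-value is <= observed p_i,
    # i.e. the indicator I(min p* - p_i <= 0) averaged over n = 1, ..., N.
    return [sum(mp <= p for mp in min_p_star) / n_boot for p in p_obs]


def rejects_bootstrap(y, pvalue_fn, alpha=0.05, n_boot=200, seed=0):
    """Reject the null if the smallest adjusted p-value is at most alpha."""
    return min(bootstrap_adjusted_pvalues(y, pvalue_fn, n_boot, seed)) <= alpha
```

Because each bootstrap sample yields all m p-values jointly, the adjustment reflects whatever dependence exists among the statistics at different k values.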
Therefore, we may define the ranks-based tests $R_1^{(H)}$, $R_1^{(S)}$, $R_1^{(B)}$, $R_2^{(H)}$,
$R_2^{(S)}$, $R_2^{(B)}$ and the signs-based tests $S_1^{(H)}$, $S_1^{(S)}$, $S_1^{(B)}$, $S_2^{(H)}$, $S_2^{(S)}$, $S_2^{(B)}$,
which are constructed by computing the corresponding Wright’s $R_1$, $R_2$,
$S_1$ and $S_2$ tests for several k values and applying either a Hochberg (H),
Sidak (S) or bootstrap (B) adjustment for multiplicity. Note, however,
that the bootstrap adjustment would be valid for the $R_1^{(\cdot)}$ and $R_2^{(\cdot)}$ tests
only under the more restrictive assumption of i.i.d. differences, whereas
it remains reliable for the $S_1^{(\cdot)}$ and $S_2^{(\cdot)}$ tests when increments are
uncorrelated but heteroscedastic.
4.2 Multiple variance ratio tests
In the present section we suggest straightforward extensions of Wright’s tests
to the multiple variance ratio context. More specifically, we replace the
standard variance ratio tests with Wright’s ranks and signs-based tests in the
definition of the Chow and Denning (1993) and Richardson and Smith (1991)
procedures.
Thus, we can define the following statistics, based on Chow and Denning’s
extremum statistics approach:
$$CD_{(R_1)} = \max\{|R_1(k_1)|, |R_1(k_2)|, \dots, |R_1(k_m)|\} \qquad (13)$$
$$CD_{(R_2)} = \max\{|R_2(k_1)|, |R_2(k_2)|, \dots, |R_2(k_m)|\} \qquad (14)$$
$$CD_{(S_1)} = \max\{|S_1(k_1)|, |S_1(k_2)|, \dots, |S_1(k_m)|\} \qquad (15)$$
$$CD_{(S_2)} = \max\{|S_2(k_1)|, |S_2(k_2)|, \dots, |S_2(k_m)|\} \qquad (16)$$
Under Assumption 0 (i.i.d. first differences) in Wright (2000), the $CD_{(R_j)}$
tests are distributed as
$$\max\{|R_j^*(k_1)|, |R_j^*(k_2)|, \dots, |R_j^*(k_m)|\}$$
where $R_j^*(k_i)$ is the ranks-based test computed with any random permutation
of the elements $\{1, 2, \dots, T\}$. Under Assumptions 1 and 2 (increments
follow a martingale difference sequence) in Wright (2000), the $CD_{(S_j)}$ tests
are distributed as
$$\max\{|S_j^*(k_1)|, |S_j^*(k_2)|, \dots, |S_j^*(k_m)|\}$$
where $S_j^*(k_i)$ is the signs-based test computed with an i.i.d. sequence
$\{s_t^*\}_{t=1}^{T}$, each element of which is 1 with probability 1/2 and −1 otherwise.
Therefore, the exact sampling distribution of the $CD_{(R_j)}$ and $CD_{(S_j)}$
tests ($j = 1, 2$) can be simulated with any arbitrary degree of accuracy.
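As a rough illustration of how such a null distribution can be simulated, the Python sketch below draws i.i.d. ±1 sequences and records the maximum absolute signs-based statistic over a set of lags. The function names, the number of replications and the empirical quantile rule are assumptions of this example; a ranks-based version would instead draw random permutations of {1, ..., T}, for instance with `random.Random.shuffle`.

```python
import math
import random


def sign_vr(s, k):
    """Signs-based statistic of eq. (11) for a sequence s of +1/-1 values."""
    T = len(s)
    num = sum(sum(s[t - k + 1:t + 1]) ** 2 for t in range(k - 1, T)) / (T * k)
    phi = 2 * (2 * k - 1) * (k - 1) / (3 * k * T)
    return (num - 1) / math.sqrt(phi)


def cd_stat(s, lags):
    """Maximum absolute statistic over the chosen lags."""
    return max(abs(sign_vr(s, k)) for k in lags)


def simulate_cd_critical_value(T, lags, alpha=0.05, n_rep=2000, seed=0):
    """Empirical (1 - alpha) quantile of the CD statistic under the null."""
    rng = random.Random(seed)
    draws = sorted(
        cd_stat([rng.choice((-1.0, 1.0)) for _ in range(T)], lags)
        for _ in range(n_rep)
    )
    return draws[min(n_rep, math.ceil((1 - alpha) * n_rep)) - 1]


print(simulate_cd_critical_value(320, (2, 4), n_rep=500))
```

Increasing `n_rep` tightens the Monte Carlo error of the simulated critical value at the cost of computing time.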
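If we rely on Richardson and Smith’s joint variance ratio test, we can
alternatively define the following statistics:
$$RS_{(R_1)} = T\,\mathbf{R}_1^{\dagger\prime}\,\Phi^{-1}\,\mathbf{R}_1^{\dagger} \qquad (17)$$
$$RS_{(R_2)} = T\,\mathbf{R}_2^{\dagger\prime}\,\Phi^{-1}\,\mathbf{R}_2^{\dagger} \qquad (18)$$
$$RS_{(S_1)} = T\,\mathbf{S}_1^{\dagger\prime}\,\Phi^{-1}\,\mathbf{S}_1^{\dagger} \qquad (19)$$
$$RS_{(S_2)} = T\,\mathbf{S}_2^{\dagger\prime}\,\Phi^{-1}\,\mathbf{S}_2^{\dagger}, \qquad (20)$$
where $T$ is the sample size, the matrix $\Phi$ was defined for two arbitrary lag
values in (8), and:
$$R_1^{\dagger}(k) = \frac{\frac{1}{Tk}\sum_{t=k}^{T}\left(r_{1,t} + \dots + r_{1,t-k+1}\right)^2}{\frac{1}{T}\sum_{t=1}^{T} r_{1,t}^2} - 1, \qquad (21)$$
$$R_2^{\dagger}(k) = \frac{\frac{1}{Tk}\sum_{t=k}^{T}\left(r_{2,t} + \dots + r_{2,t-k+1}\right)^2}{\frac{1}{T}\sum_{t=1}^{T} r_{2,t}^2} - 1, \qquad (22)$$
$$S_1^{\dagger}(k) = \frac{\frac{1}{Tk}\sum_{t=k}^{T}\left(s_t + \dots + s_{t-k+1}\right)^2}{\frac{1}{T}\sum_{t=1}^{T} s_t^2} - 1, \qquad (23)$$
$$S_2^{\dagger}(k) = \frac{\frac{1}{Tk}\sum_{t=k}^{T}\left(s_t(\mu) + \dots + s_{t-k+1}(\mu)\right)^2}{\frac{1}{T}\sum_{t=1}^{T} s_t(\mu)^2} - 1. \qquad (24)$$
Under Assumption 0 in Wright (2000), the $RS_{(R_j)}$ tests are distributed
as
$$T\,\mathbf{R}_j^{*\dagger\prime}\,\Phi^{-1}\,\mathbf{R}_j^{*\dagger}$$
where $\mathbf{R}_j^{*\dagger}$ is the vector of ranks-based statistics (21)–(22) computed with
any random permutation of the elements $\{1, 2, \dots, T\}$. Under Assumptions
1 and 2 in Wright (2000), the $RS_{(S_j)}$ tests are distributed as
$$T\,\mathbf{S}_j^{*\dagger\prime}\,\Phi^{-1}\,\mathbf{S}_j^{*\dagger}$$
where $\mathbf{S}_j^{*\dagger}$ is the vector of signs-based statistics (23)–(24) computed
with an i.i.d. sequence $\{s_t^*\}_{t=1}^{T}$, each element of which is 1 with probability
1/2 and −1 otherwise. As in the case of the CD tests, the exact sampling
distribution of the $RS_{(R_j)}$ and $RS_{(S_j)}$ tests ($j = 1, 2$) can be simulated with any
arbitrary degree of accuracy.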
Analogously to Wright’s individual tests, the ranks-based multiple vari-
ance ratio tests are sensitive to deviations from the stronger i.i.d. assump-
tion. However, signs-based multiple variance ratio tests are robust against
conditional heteroscedasticity, although CD(S1 ) and RS(S1 ) are constructed
under the additional assumption of zero drift value. Regarding the CD(S2 )
and RS(S2 ) tests, we advocate following Luger’s (2003) two-step procedure
based on Campbell and Dufour (1997), as described in section 3 in the uni-
variate context.
Table 2 displays critical values of the proposed ranks and signs-based
multiple variance ratio tests, at the 5% significance level, for several
combinations of sample size and k values.6
Insert Table 2
6 CD(S2) and RS(S2) tests critical values have been computed by setting α1 = 0.01 and
α2 = 0.04.
5 Comparison among multiple variance ratio tests
The purpose of this section is to show the empirical size and power properties
of the procedures suggested in this paper in finite samples, and to compare
them to those of existing procedures: the CD test (in its heteroscedasticity-
robust version), the RS test and the WK test. To that end, a large Monte
Carlo experiment has been carried out.
For each procedure, 1,000 artificial samples have been generated from the
data generating process $X_t = X_{t-1} + y_t$, where alternatively:
$$y_t = \varepsilon_t, \qquad (25)$$
$$y_t = \exp(0.5\,h_t)\,\varepsilon_t, \qquad (26)$$
$$y_t = 0.1\,y_{t-1} + \varepsilon_t, \qquad (27)$$
$$y_t = 0.1\,y_{t-1} + \exp(0.5\,h_t)\,\varepsilon_t, \qquad (28)$$
$$(1 - L)^{0.1}\,y_t = \varepsilon_t, \qquad (29)$$
$$(1 - L)^{0.1}\,y_t = \exp(0.5\,h_t)\,\varepsilon_t, \qquad (30)$$
where $\varepsilon_t$ and $\xi_t \sim N(0, 0.1)$ are independent random variables, and
$h_t = 0.95\,h_{t-1} + \xi_t$ as in Model 3 of Section 3. Model (25)
states that the first difference of $X_t$ is an i.i.d. process, (26) defines a
stochastic volatility model for the first difference of the variable $X_t$, models
(27) and (28) correspond to correlated first differences, whereas (29) and
(30) are fractionally integrated processes with integration order d = 0.1.
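The homoscedastic designs (25), (27) and (29) can be simulated with the following Python sketch. It rests on assumptions not stated in the text: Gaussian innovations, a burn-in period, and a truncated moving-average expansion of $(1 - L)^{-d}$ for the fractionally integrated case. The function names are hypothetical, and the stochastic volatility variants are omitted for brevity.

```python
import random


def frac_diff_weights(d, n):
    """MA weights of (1 - L)^(-d): psi_0 = 1, psi_j = psi_{j-1} (j-1+d)/j."""
    w = [1.0]
    for j in range(1, n):
        w.append(w[-1] * (j - 1 + d) / j)
    return w


def simulate_increments(T, kind, rng, burn=200):
    """Increments y_t for kind in {'iid', 'ar1', 'frac'}: (25), (27), (29)."""
    n = T + burn
    eps = [rng.gauss(0, 1) for _ in range(n)]
    if kind == "iid":
        y = eps
    elif kind == "ar1":
        y = [0.0] * n
        for t in range(1, n):
            y[t] = 0.1 * y[t - 1] + eps[t]
    elif kind == "frac":
        w = frac_diff_weights(0.1, n)
        # Truncated expansion y_t = sum_j psi_j * eps_{t-j}.
        y = [sum(w[j] * eps[t - j] for j in range(t + 1)) for t in range(n)]
    else:
        raise ValueError("kind must be 'iid', 'ar1' or 'frac'")
    return y[burn:]                      # drop the burn-in observations


def levels(y, x0=0.0):
    """Accumulate increments into levels X_t = X_{t-1} + y_t."""
    x = [x0]
    for v in y:
        x.append(x[-1] + v)
    return x


rng = random.Random(42)
x_frac = levels(simulate_increments(320, "frac", rng))
```

With d = 0.1 the weights decay slowly, which is what gives the increments their long-memory character; the truncation at the sample start is a simplification.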
In addition, we explore the finite sample behavior of the tests against the
processes:
$$X_t = 0.95\,X_{t-1} + \varepsilon_t, \qquad (31)$$
$$X_t = 0.95\,X_{t-1} + \exp(0.5\,h_t)\,\varepsilon_t, \qquad (32)$$
which correspond to the alternative hypothesis of a mean-reverting process
for the variable in levels. Hence, cases (25) and (26) are designed to
analyze the empirical size of the tests, while the remaining cases are used
to assess the empirical power of the tests in finite samples. In all cases, we
consider both $\varepsilon_t \sim N(0, 1)$ and $\varepsilon_t \sim t_3$. Rejection rates have been computed at
the 5% significance level for three sample sizes and their corresponding sets
of k values: T = 320 (k = 2 and 4), T = 640 (k = 2, 4 and 8), and T = 1,280
(k = 2, 4, 8 and 16).7
Insert Table 3
Regarding the existing multiple variance ratio methods (CD, RS and
WK tests8), Table 3 shows that there is no “winner” in
terms of superior power against all the alternative models considered. For
instance, the CD test is more powerful than the RS and WK tests against
model (29) with normal random perturbations. However, the RS test is
more powerful against model (28), whereas the WK test is more powerful
than the other procedures against model (32).
On the other hand, Table 3 shows that the RS test can be seriously
oversized against heteroscedastic processes. For instance, when T = 1,280
the rejection rate against increments following a stochastic volatility process,
model (26) with Gaussian innovations, is about 34%. The WK test
may display size distortion as well, although smaller than that of the
RS test, and it seems to diminish for larger sample sizes. Moreover,
the results in Table 3 indicate that the WK test is not especially
well suited to detecting ARIMA(1,1,0) processes, since power against models (27)
and (28) is rather low even for large sample sizes.
7 We report the results just for T = 1,280 (k = 2, 4, 8 and 16). Results for the rest of
sample sizes are available upon request.
8 The WK test has been computed for six different block lengths as suggested by Whang
and Kim (2003), in the range $2.5\,T^{0.3} < b < 3.5\,T^{0.6}$. Rejection rates displayed in the column
labeled WK are the sample mean of the rejection rates computed for each block length.
The results in Tables 4–5 show the performance of ranks and signs-based
variance ratio tests computed at several k values with Hochberg and
Sidak-type multiplicity adjustments, respectively. They have been produced
following Wright’s procedures to compute the empirical quantiles,
using 100,000 replications in order to compute the empirical p-values for
each variance ratio test.
Insert Tables 4–5
The results show that the Sidak and Hochberg-type p-value adjustment
procedures lead to undersized testing strategies, most probably at the cost
of a loss of power, with the exception of the R1 and R2 tests against the
heteroscedastic model (26). However, despite the satisfactory results with the R1
test against model (26), results concerning both R1 and R2 tests are not
reliable when the time series are conditionally heteroscedastic. On the other
hand, the S2 test is clearly undersized with either the Sidak or the Hochberg-type
correction, with rejection rates well below the nominal level, since the two-step
procedure leads to a conservative test.
Insert Table 6
The bootstrap adjustments, shown in Table 6, have been computed using
N = 200 bootstrap samples. Since the empirical distributions have been
constructed under the null hypothesis of i.i.d. increments, which is clearly
violated by model (26), now the variance ratio tests which are sensitive to
deviations from this assumption (R1 and R2 ) are oversized. Nevertheless, the
S2 test, which was clearly undersized, now displays an empirical size closer
to the 5% nominal level, mostly at larger sample sizes. This procedure is
valid for S1 and S2 tests even under conditional heteroscedasticity because
the uncorrelatedness of the original series is equivalent to the i.i.d. behavior
of the signs of the data, regardless of the (absolute) size of the increments.
Therefore, the bootstrap approach for S1 and S2 does not need to replicate
the empirical characteristics of the conditional variance since it suffices to
simply bootstrap the increments. The positive impact of the bootstrap
adjustment on S2 test is so large that it outperforms S1 test in some cases,
as in models (27) and (28).
However, regardless of the type of adjustment for multiple tests, ranks and
signs-based variance ratio tests are not very powerful against stationary AR(1)
processes (in levels), although they may improve on the power of standard
multiple variance ratio tests against ARIMA(1,1,0) alternatives. For example,
the $R_1^{(H)}$ test power of 94.3% (at T = 1,280) against model (27) with $t_3$
perturbations compares well to that of the CD (81.9%), RS (86.6%) and WK
(17.6%) tests, whereas it displays better empirical size (4.6% at T = 1,280) than
the RS (33.9%) and WK (11.6%) tests against model (26) with normal random
perturbations.
Insert Tables 7–8
Regarding the CD(·) and RS(·) tests, Tables 7 and 8 allow us to conclude
that the ranks-based tests CD(R1) and CD(R2), and RS(R1) and RS(R2), are
more powerful than their signs-based counterparts, CD(S1) and CD(S2), and
RS(S1) and RS(S2). Moreover, CD(·) tests are slightly more powerful than
RS(·) tests, whereas the size distortion of ranks-based RS(·) tests is somewhat
larger than that of ranks-based CD(·) tests. Hence, it seems that
ranks and signs-based CD(·) tests are superior to RS(·) tests. Moreover,
note that by construction, the CD(R1), CD(R2), RS(R1) and RS(R2) tests are
exact (size identically equal to 5%) in the i.i.d. case (25), while the CD(S1),
CD(S2), RS(S1) and RS(S2) tests are exact in both the i.i.d. and the uncorrelated
heteroscedastic cases, (25) and (26).
Comparing the performance of CD(·) tests to the performance of multiplicity
adjustments, no procedure stands out against all alternatives.
However, the $S_2^{(H)}$ and $S_2^{(S)}$ tests are clearly inferior to the rest
of the signs-based tests. On the other hand, the $S_2^{(B)}$ test is superior, or not
inferior, to the CD(S1) test, with the exception of stationary AR(1) alternatives.
Recall, however, that the CD(S1) test is constructed under the
additional assumption of zero drift.
6 Empirical analysis of US dollar exchange rates
In this section we illustrate the use of the suggested procedures by
re-examining the behavior of some major currencies against the US dollar.
More specifically, we assess the random walk null hypothesis for two data
sets.9 The first data set consists of dollar-based weekly nominal exchange
rates for the Canadian dollar, French franc, German mark, Japanese yen, and
British pound, and covers the period from August 7, 1974, to May 29, 1996.
For each week, the exchange rate is observed on Wednesday, or on the next
trading day if the markets are closed on Wednesday. The returns were
constructed as the first differences of the log exchange rates. Wright (2000)
analyzed the same data set, which was obtained from the Journal of Business
and Economic Statistics Data Archive. The original source was the
International Monetary Fund's International Financial Statistics. As pointed
out by Wright (2000), the original data source is the same as that used by
Liu and He (1991), although the data analyzed by Wright span a longer time
period.

9
Since every time series in each data set consists of 1,138 returns, we use
the critical values corresponding to T = 1,280 and k = 2, 4, 8 and 16.
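The return construction described above can be sketched in a few lines of Python. This is only an illustration: the function name `log_returns` and the sample quotes are hypothetical and are not taken from either data set. A series of T + 1 weekly levels yields T returns.

```python
import math


def log_returns(levels):
    """Return the first differences of the log levels: r_t = ln(X_t) - ln(X_{t-1})."""
    if any(x <= 0 for x in levels):
        raise ValueError("exchange rate levels must be strictly positive")
    logs = [math.log(x) for x in levels]
    # Pair each log level with its successor and take the difference.
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


# Hypothetical weekly quotes (illustrative values only, not actual data).
weekly_levels = [1.50, 1.53, 1.47, 1.47]
returns = log_returns(weekly_levels)
```

Because the differences telescope, the returns sum to the log of the ratio of the last level to the first.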
Insert Table 9
Table 9 (upper panel) shows the results of applying standard multiple
variance ratio tests. The heteroscedasticity-robust version of the CD test
allows us to reject the null hypothesis only in the Japanese yen/US dollar
case. The RS test rejects the null at the 5% significance level for the
Canadian dollar and Japanese yen exchange rates, and at the 10% level for
the French franc and the German mark. The WK test rejects the null at
various significance levels, for several block lengths, for all the
currencies but the Canadian dollar. However, recall from our Monte Carlo
experiment that the RS and WK tests can be oversized under uncorrelated
heteroscedastic processes, with the worst results in the case of the RS
test. Hence, we should not base our final conclusion about the random walk
hypothesis on these procedures.
Alternatively, we apply heteroscedasticity-robust signs-based multiple
variance ratio tests. More specifically, since we do not know a priori whether
the corresponding true data generating processes have zero drift or not, we
compute the version of the procedures based on the S2 test, by setting
α1 = 0.01 and α2 = 0.04.
Table 9 (lower panel) presents the new results. The Hochberg-type adjustment
for multiple tests allows us to reject the null at the 5% level in two cases
(Canada and Japan) and at the 10% level for the remaining exchange rates
(France, Germany and UK). The same conclusions are reached with the
Šidák-type adjustment, although, in general, with larger p-values. The
bootstrap-adjusted p-values, however, strongly reject the null hypothesis for
all the exchange rates, at significance levels well below 1%.
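To make the two analytical adjustments concrete, the following Python sketch collapses the m individual p-values (one per lag k) into a single adjusted p-value for the joint null. It assumes the textbook forms: the Šidák value 1 − (1 − p_min)^m, and the joint p-value implied by Hochberg's (1988) step-up rule, min over i of (m − i + 1) p_(i) with the p-values sorted in ascending order. The paper's own definitions are given in an earlier section and may differ in detail. The function names and the example p-values are hypothetical.

```python
def sidak_adjusted(p_values):
    """Sidak-type joint p-value, assumed form: 1 - (1 - min p)^m."""
    m = len(p_values)
    return 1.0 - (1.0 - min(p_values)) ** m


def hochberg_adjusted(p_values):
    """Joint p-value implied by Hochberg's step-up rule (assumed form).

    With p_(1) <= ... <= p_(m), the joint null is rejected at level alpha
    when some p_(i) <= alpha / (m - i + 1), which is equivalent to
    min_i (m - i + 1) * p_(i) <= alpha.
    """
    m = len(p_values)
    ordered = sorted(p_values)
    # enumerate starts at 0, so (m - j) equals (m - i + 1) for i = j + 1.
    return min(1.0, min((m - j) * p for j, p in enumerate(ordered)))


# Hypothetical individual p-values for k = 2, 4, 8, 16 (illustrative only).
example_p = [0.010, 0.030, 0.012, 0.200]
p_sidak = sidak_adjusted(example_p)
p_hochberg = hochberg_adjusted(example_p)
```

The joint null is rejected at level α when the adjusted p-value does not exceed α. Under these assumed forms the Hochberg value can never exceed the Bonferroni bound m·p_min.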
Regarding the CD(S2) test, the hypothesis of random walk can be clearly
rejected for all the considered exchange rates, supporting the conclusions
reached with the $S_2^{(B)}$ test. Finally, the RS(S2) test rejects the null
for all the currencies but one: the British pound.
All in all, the results suggest that the analyzed exchange rates do not
follow a random walk. This conclusion agrees with the findings of Wright
(2000), although that author’s conclusions were not based on a
size-controlled testing strategy. Moreover, given the estimated power in our
Monte Carlo work, it is unlikely that the log nominal exchange rates are
stationary in levels, although the exchange rate returns are likely to be
correlated and heteroscedastic with fat-tailed innovations.
Our second data set consists of dollar-based weekly nominal exchange
rates for the same currencies, frequency and time span as in the previous
analysis, although now the data source is different. The data used in this
second analysis were certified by the Federal Reserve Bank of New York
and are the noon buying rates in New York for cable transfers payable in
foreign currencies. These are the same data as used by Luger (2003), and
they are available on the Board of Governors of the Federal Reserve System
Web site. For each week, the exchange rate is observed on Wednesday if
the markets are open, or the next trading day otherwise. Again, the returns
were constructed as the first differences of the log exchange rates.
Insert Table 10
Table 10 (upper panel) shows the results of applying standard multiple
variance ratio tests. As before, the heteroscedasticity-robust version of the
CD test allows us to reject the null hypothesis only in the Japanese yen/US
dollar case. Now, the RS test rejects the null at the 5% significance level
for the Canadian dollar, Japanese yen and British pound exchange rates, and
at the 10% level for the French franc. As in the previous analysis, the WK
test rejects the null at various significance levels, for several block
lengths, for all the currencies but the Canadian dollar.
Table 10 (lower panel) presents the results using the
heteroscedasticity-robust signs-based multiple variance ratio tests. The
Hochberg-type adjustment allows us to reject the null at the 5% level for all
the currencies but one, the German mark, for which the null is rejected at
the 10% level. The results with the Šidák-type adjustment are less
conclusive, since the null is rejected at the 5% level in two cases (France
and Japan) and at the 10% level for the other three currencies (Canada,
Germany and UK). As we found previously, the bootstrap-adjusted p-values
strongly reject the null hypothesis for all the exchange rates, at
significance levels below 1%.
Regarding the CD(S2) test, it supports again the conclusions reached with
the $S_2^{(B)}$ test, whereas the RS(S2) test rejects the null for Canada,
France, Germany and Japan at the 5% level, and for UK at the 10% level.
Again, the results suggest that the major exchange rates against the US
dollar deviate from random walk behavior, confirming that exchange rates are
not martingale processes. This conclusion contrasts sharply with Luger
(2003), who concludes that the random walk hypothesis cannot be rejected on
the basis of his results using a set of nonparametric random walk tests.10
Conversely, these results provide additional and stronger support for the
findings of Wright (2000) and Liu and He (1991). Finally, as in the analysis
of the first data set, we could claim that, possibly, the log nominal
exchange rates are not stationary in levels, although the exchange rate
returns might be correlated and heteroscedastic with heavy-tailed
innovations.

10
Luger (2003) computed the individual S2 test as well. His individual results
at several k values supported the rejection of the null hypothesis for all
the currencies. However, as Luger claimed, that conclusion could not be
reached on the basis of a sequential testing procedure without control of
the overall size. Moreover, Luger calculated the statistics using Wright’s
original formulation, see footnote 2.

7 Conclusions
The main aim of this paper has been to suggest some alternative multiple
variance ratio tests based on ranks and signs, as originally proposed by
Wright (2000) in the univariate context.
We have presented two sets of procedures. The first set treats the variance
ratio tests computed at different lags as separate individual tests, to
which different adjustments for multiple tests are then applied. The
resulting statistics, $R_1^{(H)}$, $R_2^{(H)}$, $S_1^{(H)}$, $S_2^{(H)}$,
$R_1^{(S)}$, $R_2^{(S)}$, $S_1^{(S)}$, $S_2^{(S)}$, $R_1^{(B)}$,
$R_2^{(B)}$, $S_1^{(B)}$ and $S_2^{(B)}$, work reasonably well. In the case
of the bootstrap adjustments, validity is confined to the signs-based tests,
and the adjustment substantially improves the behavior of the S2 test.
The second set consists of the ranks and signs versions of the Chow and
Denning and the Richardson and Smith tests. The ranks-based procedures are
exact under the i.i.d. assumption, whereas the signs-based procedures are
exact under both the i.i.d. and martingale difference sequence assumptions.
These tests work rather well, although the CD(·) tests seem superior to the
RS(·) tests. Compared with the rest of the suggested procedures, no
heteroscedasticity-robust test has superior power against all the
alternatives.
In addition, we have compared our tests with three existing methods: the CD,
the RS and the WK tests. The RS test may display a large size distortion
under martingale difference sequences, and the WK test may be somewhat
oversized in the same cases. On the other hand, these alternative tests are
better able to detect stationary AR(1) alternatives than our tests, which
are more sensitive to ARIMA(1,1,0) models with heteroscedasticity and
heavy-tailed innovations.
Finally, we have illustrated the procedures through the analysis of nominal
exchange rate data against the US dollar, which had already been analyzed in
two previous works: Wright (2000) and Luger (2003). We have computed the
existing multiple variance ratio tests, in addition to the
heteroscedasticity-robust $S_2^{(H)}$, $S_2^{(S)}$, $S_2^{(B)}$, CD(S2) and
RS(S2) tests. Our results, in both data sets, strongly reject the null
hypothesis of random walk behavior for all the currencies. Therefore, we
could claim that exchange rates for those currencies do not follow a
martingale. This conclusion supports the findings of Liu and He (1991) and
Wright (2000), whereas it contrasts sharply with the conclusions of Luger
(2003). Moreover, the pattern of rejections supports the hypothesis that log
exchange rates follow an ARIMA(1,1,0) process with heteroscedastic
increments and heavy-tailed innovations.
8 References
Baillie, R. T. and Bollerslev, T. 1989. Common stochastic trends in a system of exchange rates. Journal of Finance 44, 167–181.
Breitung, J. and Gouriéroux, C. 1997. Rank tests for unit roots. Journal of Econometrics 81, 7–27.
Campbell, B. and Dufour, J.-M. 1997. Exact nonparametric tests of orthogonality and random walk in the presence of a drift parameter. International Economic Review 38, 151–173.
Campbell, J. Y., Lo, A. W. and MacKinlay, A. C. 1997. The Econometrics of Financial Markets. Princeton University Press, Princeton, New Jersey.
Chow, K. V. and Denning, K. C. 1993. A simple multiple variance ratio test. Journal of Econometrics 58, 385–401.
Cochrane, J. H. 1988. How big is the random walk in GNP? Journal of Political Economy 96, 893–920.
Dufour, J.-M. and Kiviet, J. F. 1998. Exact inference methods for first-order autoregressive distributed lag models. Econometrica 66, 79–104.
Dufour, J.-M. and Torrès, O. 2000. Markovian processes, two-sided autoregressions and finite-sample inference for stationary and nonstationary autoregressive processes. Journal of Econometrics 99, 255–289.
Fama, E. 1970. Efficient capital markets: a review of theory and empirical work. Journal of Finance (May), 383–417.
Fama, E. 1991. Efficient capital markets: II. Journal of Finance 46(5), 1575–1617.
Fong, W. M., Koh, S. K. and Ouliaris, S. 1997. Joint variance-ratio tests of the martingale hypothesis for exchange rates. Journal of Business & Economic Statistics 15(1), 51–59.
Hasan, M. N. and Koenker, R. W. 1997. Robust rank tests of the unit root hypothesis. Econometrica 65, 133–161.
Hochberg, Y. 1974. Some conservative generalizations of the T-method in simultaneous inference. Journal of Multivariate Analysis 4, 224–234.
Hochberg, Y. 1988. A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75, 800–802.
Hsieh, D. 1988. The statistical properties of daily foreign exchange rates: 1974–1983. Journal of International Economics 24, 129–145.
Liu, C. Y. and He, J. 1991. A variance ratio test of random walks in foreign exchange rates. Journal of Finance 46, 773–785.
Lo, A. W. and MacKinlay, A. C. 1988. Stock market prices do not follow random walks: evidence from a simple specification test. Review of Financial Studies 1(1), 41–66.
Lo, A. W. and MacKinlay, A. C. 1989. The size and power of the variance ratio test in finite samples: a Monte Carlo investigation. Journal of Econometrics 40, 203–238.
Luger, R. 2003. Exact non-parametric tests for a random walk with unknown drift under conditional heteroscedasticity. Journal of Econometrics 115, 259–276.
Meese, R. A. and Singleton, K. J. 1982. On unit roots and the empirical modeling of exchange rates. Journal of Finance 37, 1029–1034.
Psaradakis, Z. 2000. p-Value adjustments for multiple tests for nonlinearity. Studies in Nonlinear Dynamics and Econometrics 4(3), Article 1.
Richardson, M. and Smith, T. 1991. Tests of financial models in the presence of overlapping observations. Review of Financial Studies 4, 227–254.
So, B. S. and Shin, D. W. 2001. An invariant sign test for random walks based on recursive median adjustment. Journal of Econometrics 102, 197–229.
Westfall, P. H. and Young, S. S. 1993. Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. Wiley, New York.
Whang, Y.-J. and Kim, J. 2003. A multiple variance ratio test using subsampling. Economics Letters 79, 225–230.
Wright, J. H. 2000. Alternative variance-ratio tests using ranks and signs. Journal of Business & Economic Statistics 18(1), 1–9.
Yilmaz, K. 2003. Martingale property of exchange rates and central bank interventions. Journal of Business & Economic Statistics 21, 383–395.
Table 1: Wright’s individual tests size
               R1      R2      S1      S2
T = 320
  Model 1      0.079   0.073   0.077   0.005
  Model 2      0.071   0.070   0.085   0.005
  Model 3      0.091   0.132   0.067   0.006
T = 640
  Model 1      0.101   0.085   0.095   0.016
  Model 2      0.102   0.105   0.097   0.010
  Model 3      0.132   0.189   0.089   0.015
T = 1,280
  Model 1      0.126   0.115   0.125   0.021
  Model 2      0.108   0.121   0.096   0.019
  Model 3      0.159   0.240   0.126   0.023
The entries are the rejection rates of Wright’s variance ratio tests at the (individual) 5% level.
Table 2: Multiple variance ratio tests critical values
                      T = 320     T = 640        T = 1,280
                      k = 2, 4    k = 2, 4, 8    k = 2, 4, 8, 16
CD(R1)                2.14        2.25           2.33
CD(R2)                2.14        2.25           2.33
CD(S1)                2.12        2.26           2.35
CD(S2) (α2 = 4%)      2.24        2.35           2.43
RS(R1)                5.98        7.73           9.45
RS(R2)                5.96        7.74           9.42
RS(S1)                5.98        7.84           9.58
RS(S2) (α2 = 4%)      6.40        8.34           10.14
The entries are 5% critical values, computed through 100,000 replications. Critical
values for the CD(S2) and RS(S2) tests correspond to the 4% critical values of the
CD(S1) and RS(S1) tests, respectively.
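Critical values of this kind can be approximated by simulation, because under the null the signs are i.i.d. and equal to +1 or −1 with probability 1/2 each. The Python sketch below does this for a CD(S1)-type statistic. It rests on two assumptions that should be checked against the definitions given earlier in the paper: the S1(k) statistic is written as recalled from Wright (2000), and the CD-type statistic is taken to be the maximum of |S1(k)| over the chosen lags. The function names are hypothetical, and far fewer replications are used than the 100,000 behind Table 2.

```python
import math
import random


def s1_statistic(signs, k):
    """Signs-based variance ratio statistic S1(k), form assumed from Wright (2000)."""
    T = len(signs)
    window = sum(signs[:k])          # sum of the first k signs
    total = window ** 2
    for t in range(k, T):            # slide the k-term window over the sample
        window += signs[t] - signs[t - k]
        total += window ** 2
    mean_square = sum(s * s for s in signs) / T   # equals 1 for +/-1 signs
    ratio = total / (T * k) / mean_square
    scale = (2.0 * (2 * k - 1) * (k - 1) / (3.0 * k * T)) ** -0.5
    return (ratio - 1.0) * scale


def cd_statistic(signs, lags):
    """Chow-Denning-type statistic: largest absolute S1(k) over the lags (assumed)."""
    return max(abs(s1_statistic(signs, k)) for k in lags)


def simulate_critical_value(T, lags, level=0.05, replications=500, seed=0):
    """Approximate the upper `level` quantile of the CD-type statistic under the null."""
    rng = random.Random(seed)
    draws = sorted(
        cd_statistic([rng.choice((-1, 1)) for _ in range(T)], lags)
        for _ in range(replications)
    )
    index = min(replications - 1, math.ceil((1.0 - level) * replications) - 1)
    return draws[index]


critical_value = simulate_critical_value(T=320, lags=(2, 4))
```

If the assumed definitions match those of the paper, the simulated value can be compared with the corresponding CD(S1) entry of Table 2; the approximation becomes more stable as the number of replications grows.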
Table 3: Size and power of standard multiple VR tests
εt Model CD RS WK
N (0, 1) Size
(25) 0.075 0.054 0.062
(26) 0.040 0.339 0.116
Power
(27) 0.901 0.835 0.162
(28) 0.505 0.840 0.167
(29) 0.990 0.967 0.719
(30) 0.832 0.950 0.584
(31) 0.500 0.446 0.824
(32) 0.161 0.681 0.710
t3 Size
(25) 0.041 0.047 0.062
(26) 0.040 0.273 0.118
Power
(27) 0.819 0.866 0.176
(28) 0.551 0.845 0.188
(29) 0.984 0.985 0.739
(30) 0.879 0.965 0.624
(31) 0.436 0.399 0.825
(32) 0.186 0.631 0.738
The entries are rejection rates, at the 5% significance level. The column WK
displays mean rejection rates over six block lengths in the range
$2.5\,T^{0.3} < b < 3.5\,T^{0.6}$.
Table 4: Size and power of ranks and signs VR tests (Hochberg adjustment)
εt Model $R_1^{(H)}$ $R_2^{(H)}$ $S_1^{(H)}$ $S_2^{(H)}$
N (0, 1) Size
(25) 0.036 0.033 0.037 0.002
(26) 0.046 0.091 0.020 0.000
Power
(27) 0.856 0.892 0.511 0.139
(28) 0.817 0.828 0.511 0.191
(29) 0.971 0.980 0.793 0.531
(30) 0.969 0.964 0.848 0.618
(31) 0.573 0.617 0.232 0.004
(32) 0.350 0.514 0.101 0.002
t3 Size
(25) 0.034 0.033 0.036 0.000
(26) 0.041 0.064 0.030 0.002
Power
(27) 0.943 0.955 0.680 0.320
(28) 0.940 0.940 0.732 0.368
(29) 1.000 1.000 0.968 0.847
(30) 0.999 0.996 0.980 0.897
(31) 0.159 0.353 0.036 0.000
(32) 0.117 0.226 0.095 0.018
The entries are rejection rates, at the 5% significance level.
Table 5: Size and power of ranks and signs VR tests (Šidák adjustment)
εt Model $R_1^{(S)}$ $R_2^{(S)}$ $S_1^{(S)}$ $S_2^{(S)}$
N (0, 1) Size
(25) 0.034 0.034 0.037 0.001
(26) 0.043 0.090 0.019 0.000
Power
(27) 0.854 0.891 0.504 0.148
(28) 0.816 0.830 0.503 0.173
(29) 0.968 0.978 0.785 0.435
(30) 0.965 0.964 0.841 0.528
(31) 0.566 0.624 0.202 0.003
(32) 0.375 0.532 0.100 0.002
t3 Size
(25) 0.032 0.033 0.036 0.000
(26) 0.039 0.062 0.030 0.000
Power
(27) 0.941 0.954 0.675 0.290
(28) 0.938 0.941 0.727 0.347
(29) 1.000 0.999 0.963 0.785
(30) 0.999 0.997 0.976 0.844
(31) 0.156 0.323 0.062 0.003
(32) 0.117 0.227 0.094 0.017
Table 6: Size and power of ranks and signs VR tests (bootstrap adjustment)
εt Model $R_1^{(B)}$ $R_2^{(B)}$ $S_1^{(B)}$ $S_2^{(B)}$
N (0, 1) Size
(25) 0.042 0.046 0.054 0.047
(26) 0.080 0.134 0.049 0.050
Power
(27) 0.871 0.902 0.541 0.601
(28) 0.844 0.851 0.496 0.629
(29) 0.974 0.989 0.817 0.844
(30) 0.971 0.980 0.853 0.900
(31) 0.619 0.664 0.251 0.152
(32) 0.407 0.579 0.135 0.064
t3 Size
(25) 0.051 0.043 0.047 0.047
(26) 0.072 0.094 0.049 0.036
Power
(27) 0.961 0.961 0.704 0.807
(28) 0.928 0.937 0.697 0.743
(29) 0.997 1.000 0.957 0.969
(30) 0.997 0.996 0.964 0.984
(31) 0.199 0.378 0.072 0.047
(32) 0.112 0.259 0.111 0.122
The entries are rejection rates, at the 5% significance level. Number of bootstrap
replications N = 200.
Table 7: Size and power of CD(·) tests
εt Model CD(R1 ) CD(R2 ) CD(S1 ) CD(S2 )
N (0, 1) Size
(25) 0.050 0.050 0.050 0.000
(26) 0.081 0.122 0.050 0.000
Power
(27) 0.868 0.891 0.494 0.297
(28) 0.822 0.850 0.528 0.342
(29) 0.981 0.987 0.838 0.696
(30) 0.978 0.977 0.880 0.762
(31) 0.586 0.641 0.227 0.028
(32) 0.397 0.577 0.117 0.012
t3 Size
(25) 0.050 0.050 0.050 0.000
(26) 0.061 0.081 0.050 0.000
Power
(27) 0.970 0.970 0.719 0.531
(28) 0.949 0.958 0.738 0.550
(29) 0.997 0.998 0.968 0.924
(30) 0.998 0.999 0.976 0.940
(31) 0.191 0.392 0.063 0.008
(32) 0.117 0.248 0.125 0.062
Table 8: Size and power of RS(·) tests
εt Model RS(R1 ) RS(R2 ) RS(S1 ) RS(S2 )
N (0, 1) Size
(25) 0.050 0.050 0.050 0.007
(26) 0.085 0.150 0.050 0.009
Power
(27) 0.794 0.826 0.449 0.230
(28) 0.784 0.815 0.488 0.277
(29) 0.960 0.974 0.755 0.538
(30) 0.938 0.942 0.773 0.618
(31) 0.431 0.487 0.159 0.014
(32) 0.340 0.548 0.114 0.015
t3 Size
(25) 0.050 0.050 0.050 0.009
(26) 0.065 0.105 0.050 0.005
Power
(27) 0.905 0.920 0.612 0.428
(28) 0.904 0.926 0.646 0.445
(29) 0.996 0.996 0.943 0.865
(30) 0.993 0.994 0.961 0.906
(31) 0.158 0.301 0.051 0.006
(32) 0.121 0.233 0.101 0.039
Table 9: Application of multiple variance ratio tests. Wright (2000) data
Test           Canada      France      Germany     Japan       UK
CD             2.092       1.967       1.958       3.800∗∗     1.832
RS             10.415∗∗    8.737∗      8.770∗      25.796∗∗    7.111
WK             4.272       9.868       9.615       16.871      10.497
  b1           10.297      9.638∗∗     9.997       10.868∗∗    9.635∗∗
  b2           9.193       10.511      11.529      12.049∗∗    10.969∗
  b3           7.316       10.043∗     9.971       11.702∗∗    10.544∗
  b4           7.641       10.303      10.347      12.466∗∗    12.810
  b5           6.621       9.351∗∗     8.595∗∗     12.205∗∗    12.373
  b6           6.563       10.031∗     9.429∗∗     12.527∗∗    12.872
$S_2^{(H)}$    0.044∗∗     0.058∗      0.068∗      0.020∗∗     0.057∗
$S_2^{(S)}$    0.049∗∗     0.077∗      0.066∗      0.039∗∗     0.072∗
$S_2^{(B)}$    0.001∗∗     0.002∗∗     0.004∗∗     0.000∗∗     0.005∗∗
CD(S2)         3.293∗∗     2.630∗∗     2.757∗∗     4.733∗∗     2.800∗∗
RS(S2)         12.193∗∗    13.611∗∗    10.996∗∗    31.260∗∗    8.676
Statistics computed using four k values (2, 4, 8 and 16). ∗ and ∗∗ denote
significance at the 10% and 5% levels, respectively. Rows b1, ..., b6 display the 5%
critical values of the WK test for each block length in the range
$2.5\,T^{0.3} < b < 3.5\,T^{0.6}$. Critical values used to test the significance of
the $S_2^{(H)}$, $S_2^{(S)}$, CD(S2) and RS(S2) tests correspond to the sample size
T = 1,280. The bootstrap-adjusted p-values have been computed using N = 1,000
replications.
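As a small worked check of how the lower-panel flags arise, the Python sketch below compares the CD(S2) and RS(S2) statistics reported in Table 9 with the 5% critical values listed in Table 2 for T = 1,280 and k = 2, 4, 8, 16. The rejection rule "statistic greater than critical value" is an assumption about the rejection region, and the helper name `rejections` is hypothetical.

```python
# 5% critical values for T = 1,280 and k = 2, 4, 8, 16, as listed in Table 2.
CRITICAL_5PCT = {"CD(S2)": 2.43, "RS(S2)": 10.14}

# Statistics reported in the lower panel of Table 9 (Wright (2000) data).
TABLE_9 = {
    "CD(S2)": {"Canada": 3.293, "France": 2.630, "Germany": 2.757,
               "Japan": 4.733, "UK": 2.800},
    "RS(S2)": {"Canada": 12.193, "France": 13.611, "Germany": 10.996,
               "Japan": 31.260, "UK": 8.676},
}


def rejections(table, critical):
    """Flag, per test and currency, whether the statistic exceeds its critical value."""
    return {
        test: {currency: stat > critical[test] for currency, stat in row.items()}
        for test, row in table.items()
    }


flags = rejections(TABLE_9, CRITICAL_5PCT)
```

Under this rule every CD(S2) statistic exceeds 2.43, while the RS(S2) statistic for the UK (8.676) falls below 10.14, in line with the absence of a 5% flag for that entry.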
Table 10: Application of multiple variance ratio tests. Luger (2003) data
Test           Canada      France      Germany     Japan       UK
CD             2.125       2.080       1.879       4.220∗∗     2.075
RS             10.552∗∗    8.805∗      6.320       26.959∗∗    10.557∗∗
WK             4.375       10.517      9.515       19.272      11.545
  b1           10.259      9.888∗∗     10.295      11.564∗∗    8.912∗∗
  b2           8.947       9.633∗∗     10.313      13.756∗∗    12.893
  b3           7.309       9.192∗∗     9.353∗∗     14.235∗∗    10.472∗∗
  b4           7.700       10.277∗∗    9.563∗      14.809∗∗    13.357
  b5           6.616       9.116∗∗     7.942∗∗     15.278∗∗    15.171
  b6           6.492       10.364∗∗    8.759∗∗     16.301∗∗    16.028
$S_2^{(H)}$    0.043∗∗     0.024∗∗     0.054∗      0.010∗∗     0.046∗∗
$S_2^{(S)}$    0.050∗      0.043∗∗     0.077∗      0.039∗∗     0.058∗
$S_2^{(B)}$    0.000∗∗     0.001∗∗     0.005∗∗     0.000∗∗     0.001∗∗
CD(S2)         3.270∗∗     3.613∗∗     2.630∗∗     5.498∗∗     3.040∗∗
RS(S2)         12.193∗∗    14.225∗∗    11.281∗∗    31.928∗∗    9.665∗
Statistics computed using four k values (2, 4, 8 and 16). ∗ and ∗∗ denote
significance at the 10% and 5% levels, respectively. Rows b1, ..., b6 display the 5%
critical values of the WK test for each block length in the range
$2.5\,T^{0.3} < b < 3.5\,T^{0.6}$. Critical values used to test the significance of
the $S_2^{(H)}$, $S_2^{(S)}$, CD(S2) and RS(S2) tests correspond to the sample size
T = 1,280. The bootstrap-adjusted p-values have been computed using N = 1,000
replications.