This Content Downloaded From 2.14.59.251 On Sat, 18 Jul 2020 08:34:10 UTC
BY NILS BLOMQVIST
The definition of q' is not new [5], but as far as is known its statistical
properties have never been studied completely.
As the number of sample points belonging to the first and third quadrants
around (x, y) must be equal, the probability of the combined event
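(4)  $S = \frac{r}{a}\,\frac{\partial a}{\partial x}dx\,\frac{\partial a}{\partial y}dy - \frac{k-r}{b}\,\frac{\partial b}{\partial x}dx\,\frac{\partial b}{\partial y}dy + \frac{r}{c}\,\frac{\partial c}{\partial x}dx\,\frac{\partial c}{\partial y}dy - \frac{k-r}{d}\,\frac{\partial d}{\partial x}dx\,\frac{\partial d}{\partial y}dy + dF.$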
Each of the first four terms of the expression (4) refers to a case in which two
sample points determine (x, y), and the last term refers to a case in which (x, y)
is determined by only one point. From (3) it follows that the probability of
obtaining n1 at most equal to 2R is
(6)  $P\{n_1 \le 2R\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{\sum_{r=0}^{R} p_k(2r; x, y)}{\sum_{r=0}^{k} p_k(2r; x, y)}\, d\Psi_k(x, y).$
Clearly the integrand in (6) is ≤ 1 wherever it exists. At the points (x, y)
where the denominator is equal to zero the integrand is undefined, but as the
set of such points has measure zero with respect to the distribution over which
(6) integrates, they cause no difficulty.
Under the conditions a)-d) $\tilde{x}$ and $\tilde{y}$ converge in probability to zero; that is
(7)  $\lim_{k \to \infty} P\{n_1 \le 2R\} = \lim_{k \to \infty} \frac{\sum_{r=0}^{R} p_k(2r; 0, 0)}{\sum_{r=0}^{k} p_k(2r; 0, 0)}.$
According to (3)
(8)  $p_k(2r; 0, 0) = \frac{(2k+1)!}{(r!)^2\,[(k-r)!]^2}\,(a_0 c_0)^r\,(b_0 d_0)^{k-r}\,S_0,$
where the subscripts indicate the value at the point (0, 0). Because of (2),
$c_0 = a_0$, $d_0 = b_0$ and $a_0 + b_0 = \tfrac{1}{2}$
and
$r = 2k a_0 + t\sqrt{2k a_0 b_0}$,
$R = 2k a_0 + T\sqrt{2k a_0 b_0}$
$q' = \frac{2n_1}{2k} - 1 = \frac{n_1}{k} - 1$
where, as before, (0, 0) are the coordinates of the population medians. Then q
has the desired property of being equal to zero in the case of independence and
equal to ±1 in the case of a linear relationship between x and y.
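As an illustration of the sample measure q' discussed above, the following Python sketch counts the sample points in the quadrants formed by the two sample medians and returns (n1 - n2)/(n1 + n2), which equals n1/k - 1 when n1 + n2 = 2k. The function name `quadrant_measure` and the rule of skipping points that lie exactly on a median line are assumptions made for this sketch; the paper's own convention for such points may differ.

```python
import statistics


def quadrant_measure(xs, ys):
    """Return q' = (n1 - n2) / (n1 + n2) for paired observations.

    n1 counts points in the first or third quadrant around the sample
    medians, n2 counts points in the second or fourth quadrant.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")

    med_x = statistics.median(xs)
    med_y = statistics.median(ys)

    n1 = 0
    n2 = 0
    for x, y in zip(xs, ys):
        sign = (x - med_x) * (y - med_y)
        if sign > 0:
            n1 += 1  # first or third quadrant
        elif sign < 0:
            n2 += 1  # second or fourth quadrant
        # sign == 0: the point lies on a median line and is skipped
        # (an assumption of this sketch, not necessarily the paper's rule)

    if n1 + n2 == 0:
        raise ValueError("every point lies on a median line")
    return (n1 - n2) / (n1 + n2)
```

For a strictly increasing relationship every point falls in the first or third quadrant and the function returns 1; for a strictly decreasing one it returns -1.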
According to (9) q' is a consistent estimate of q when the conditions a)-d) are
fulfilled. Furthermore, as the standard deviation of q' is, to a first approximation,
independent of quantities other than q, it is possible to construct approximate
confidence limits for q for large sample sizes. This is done in the following way.
In terms of n and q we have, according to the last paragraph of section 3 and
(10),
$E q' \approx q, \qquad \text{s.d.}(q') \approx \sqrt{(1 - q^2)/n}.$
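Let $\Phi(x)$ be a standardized normal cdf and $\lambda_1$ and $\lambda_2$ two numbers such that
(11)  $P\left\{\lambda_1 < \frac{(q' - q)\sqrt{n}}{\sqrt{1 - q^2}} < \lambda_2\right\} \approx \alpha,$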
which gives the desired result.
If we let $\lambda_2 = -\lambda_1 = \lambda$ and solve the inequality in (11) for q, the following
symmetrical confidence interval is obtained
(12)  $q = \frac{2}{\pi} \arcsin \rho.$
2( r) - arcsin P)] =
a2(ff) r / \2- -9
- *q 1. - arcsin p)
for ρ = 0.
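The symmetrical confidence interval for q described above can be sketched numerically. The Python function below assumes that the inequality to be solved has the form |q' - q| sqrt(n) < lambda sqrt(1 - q^2), which follows from treating q' as approximately normal with mean q and s.d. sqrt((1 - q^2)/n). Squaring gives a quadratic in q, and its two roots are taken as the limits. The names `q_confidence_limits` and `lam`, and the default value 1.96 (roughly a 95% level), are illustrative assumptions.

```python
import math


def q_confidence_limits(q_prime, n, lam=1.96):
    """Approximate symmetric confidence limits for q given q' and n.

    Solves (q' - q)^2 * n = lam^2 * (1 - q^2) for q, i.e. the quadratic
    (n + lam^2) q^2 - 2 n q' q + (n q'^2 - lam^2) = 0.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not -1.0 <= q_prime <= 1.0:
        raise ValueError("q_prime must lie in [-1, 1]")

    # Half-width term from the discriminant of the quadratic in q.
    half = lam * math.sqrt(n * (1.0 - q_prime ** 2) + lam ** 2)
    denom = n + lam ** 2
    lower = (n * q_prime - half) / denom
    upper = (n * q_prime + half) / denom
    return lower, upper
```

Because the standard deviation depends on q itself, the limits are not symmetric about q' unless q' = 0; they always stay inside the interval from -1 to 1.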
6. Tests of independence based on q'. In testing independence between x
and y it is in practice more convenient to use critical regions based on n1 instead
of q'. Since, under the null hypothesis, the measure of a critical region is
independent of F(x, y) (F1(x) and F2(y) are assumed to be continuous), any test
based on n1 is non-parametric. We have made exact calculations of the
q'-distribution for sample sizes n up to 50. For larger sample sizes the normal
approximation for n1 does not seem to entail errors of practical importance.
To derive the exact distribution of n1 under the null hypothesis we suppose
that n equals 2k. The probability that any k sample points shall have smaller
x-values than the other k points is
$1 \Big/ \binom{2k}{k}.$
Hence, since any arrangement of the sample points according to their x-values
does not affect the distribution of the y-values,
2k =      4      8     12     16     20     24     28     32     36     40     44     48
   0  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
   2   .333   .486   .567   .619   .656   .684   .706   .724   .740   .752   .764   .773
   4          .029   .080   .132   .179   .220   .257   .289   .318   .343   .366   .387
   6                .0022   .010   .023   .039   .057   .076   .094   .113   .131   .148
   8                       .0002  .0011  .0033  .0070   .012   .018   .026   .034   .042
  10                                     .0001  .0004  .0011  .0022  .0038  .0060  .0087
  12                                                          .0002  .0004  .0007  .0012
  14                                                                        .0001  .0001

2k =      6     10     14     18     22     26     30     34     38     42     46     50
   1  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
   3   .100   .206   .286   .347   .395   .434   .466   .494   .517   .538   .556   .572
   5         .0079   .029   .057   .086   .115   .143   .169   .194   .217   .238   .258
   7                .0006  .0034  .0089   .017   .027   .038   .050   .063   .076   .089
   9                              .0003  .0012  .0028  .0053  .0086   .013   .017   .023
  11                                            .0001  .0004  .0009  .0017  .0028  .0042
  13                                                          .0001  .0001  .0003  .0005
  15
2k - 1
- k *V2k - 1
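The exact null distribution behind the table above can be sketched in Python. The sketch assumes the hypergeometric form P{n1 = 2r} = C(k, r)^2 / C(2k, k) for n = 2k, which is what the counting argument of this section leads to, and it reads the tabulated values as two-sided tail probabilities P{|n1 - k| >= v}. Both are assumptions of this sketch, checked only against a few table entries (.333 and .486 in row 2, .100 and .206 in row 3). The function names are hypothetical.

```python
from math import comb


def null_pmf(k):
    """Return {n1: P{n1}} for n = 2k under independence.

    Assumes n1 = 2r with P{n1 = 2r} = C(k, r)^2 / C(2k, k), r = 0..k.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    total = comb(2 * k, k)
    return {2 * r: comb(k, r) ** 2 / total for r in range(k + 1)}


def two_sided_tail(k, v):
    """Return P{|n1 - k| >= v} for n = 2k under independence."""
    return sum(p for n1, p in null_pmf(k).items() if abs(n1 - k) >= v)


if __name__ == "__main__":
    # Print the tail probabilities for a few even sample sizes n = 2k.
    for k in (2, 4, 6, 3, 5):
        tails = [round(two_sided_tail(k, v), 4) for v in range(k % 2, k + 1, 2)]
        print(f"2k = {2 * k:2d}: {tails}")
```

A test of independence at a chosen level can then reject when |n1 - k| reaches the smallest v whose tail probability does not exceed that level.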
7. The asymptotic efficiency of the q'-test. In the case that x and y are
normally distributed with the correlation coefficient ρ, it is possible, but rather
tedious, to calculate the power function of the q'-test. We will, therefore, restrict
ourselves to considering only the asymptotic behavior of the power function.
Consider tests of independence (ρ = 0) against one-sided alternatives ρ > 0.
Let $L^{(1)}(\rho)$ be the power function of the q'-test for the sample size m and $L^{(2)}(\rho)$
be the power function of the test based on the correlation coefficient r in a
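sample of size n. We assume that all tests have the same size, i.e.
$L^{(1)}(0) = L^{(2)}(0)$
for all m and n. We shall say that the q'-test has the asymptotic efficiency e if
$\left(\frac{\partial L^{(1)}}{\partial \rho}\right)_0 = \left(\frac{\partial L^{(2)}}{\partial \rho}\right)_0$ implies $\lim \frac{n}{m} = e$.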
This means that the sample size in using the r-test need only be 100e% of
that in using the q'-test, in order to get the same derivative of the power function
at ρ = 0 (for large sample sizes). Since the definition of e only concerns the
behavior in the neighborhood of ρ = 0, it might perhaps be more correct to call e
the asymptotic local efficiency.
In order to calculate e we define two sequences $\{q_m\}$ and $\{r_n\}$ such that
$\{q' > q_m\}$ and $\{r > r_n\}$ are tests with the aforementioned properties. According
to (9) and (10) q' is asymptotically normally distributed with mean q and s.d.
$\sqrt{(1 - q^2)/m}$. Furthermore, r is asymptotically normally distributed with mean
ρ and s.d. $(1 - \rho^2)/\sqrt{n}$. Hence,
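$\left(\frac{\partial L^{(1)}}{\partial \rho}\right)_0 \Big/ \left(\frac{\partial L^{(2)}}{\partial \rho}\right)_0 \approx \sqrt{\frac{m}{n}}\,\left(\frac{dq}{d\rho}\right)_0.$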
Thus we conclude
$e = \lim \frac{n}{m} = \left(\frac{dq}{d\rho}\right)_0^2 = \left(\frac{2}{\pi}\right)^2.$
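The step from the two asymptotic normal distributions to the value of e can be written out as follows. This is a sketch under assumptions not stated in the text: both tests are taken to have size $\alpha$ with $1 - \Phi(\lambda) = \alpha$, so that $q_m\sqrt{m} = r_n\sqrt{n} = \lambda$ at $\rho = 0$, and the normal approximations are differentiated directly. The symbols $\lambda$, $\alpha$ and $\Phi'$ are introduced here for illustration only, and $q = \frac{2}{\pi}\arcsin\rho$ is the relation for the normal case.

```latex
\begin{align*}
L^{(1)}(\rho) &\approx 1 - \Phi\!\left(\frac{(q_m - q)\sqrt{m}}{\sqrt{1 - q^2}}\right), &
L^{(2)}(\rho) &\approx 1 - \Phi\!\left(\frac{(r_n - \rho)\sqrt{n}}{1 - \rho^2}\right), \\[4pt]
\left(\frac{\partial L^{(1)}}{\partial \rho}\right)_0 &\approx \Phi'(\lambda)\,\sqrt{m}\left(\frac{dq}{d\rho}\right)_0, &
\left(\frac{\partial L^{(2)}}{\partial \rho}\right)_0 &\approx \Phi'(\lambda)\,\sqrt{n}.
\end{align*}
% Equating the two derivatives gives sqrt(n) = sqrt(m) (dq/d rho)_0, hence
\[
e = \lim \frac{n}{m} = \left(\frac{dq}{d\rho}\right)_0^2,
\qquad
\left(\frac{dq}{d\rho}\right)_0 = \left.\frac{2}{\pi}\,\frac{1}{\sqrt{1 - \rho^2}}\right|_{\rho = 0} = \frac{2}{\pi},
\qquad
e = \frac{4}{\pi^2} \approx 0.405 .
\]
```

Under these assumptions the r-test needs roughly 40% of the sample size of the q'-test to reach the same slope of the power function at ρ = 0.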
REFERENCES