
Stable Marked Point Processes

Tucker McElroy
U.S. Bureau of the Census
University of California, San Diego
Dimitris N. Politis
University of California, San Diego
Abstract

In many contexts, such as queueing theory, spatial statistics, geostatistics and meteorology, data are observed at irregular spatial positions. One model of this situation is to consider the observation points as generated by a Poisson process. Under this assumption, we study the limit behavior of the partial sums of the Marked Point Process $\{(t_i, X(t_i))\}$, where X(t) is a stationary random field and the points $t_i$ are generated from an independent Poisson random measure N on $\mathbb{R}^d$. We define the sample mean and sample variance statistics, and determine their joint asymptotic behavior in a heavy-tailed setting, thus extending some finite variance results of Karr (1986). New results on subsampling in the context of a Marked Point Process are also presented, with the application of forming a confidence interval for the unknown mean under an unknown degree of heavy tails.
1 Introduction

Random field data arise in diverse areas, such as spatial statistics, geostatistics, and meteorology, to name a few. It often happens that the observation locations of the data are irregularly spaced, which is a serious deviation from the typical formulation of random field theory where data are located at lattice points. One effective way of modeling the observation points is through a Poisson process. In Karr (1982, 1986), the statistical problem of mean estimation is addressed given a Marked Point Process structure of the data where the observation locations are governed by a Poisson Random Measure N assumed independent of the distribution of the stationary random field itself. Karr (1986) obtained Central Limit Theorem results for the sample mean in this context under a finite second moment assumption, and showed that the limiting variance depends on the integrated autocovariance function. The paper at hand is a first analysis of infinite-variance Marked Point Processes.

Within the literature of dependent, heavy-tailed stationary time series, the discrete-time stochastic process

X(t) = \int_{\mathbb{R}} \psi(t + x)\, M(dx)   (1)

has been studied in Resnick, Samorodnitsky, and Xue (1999). In their work, t is the integer index of the discrete-time process, M is an $\alpha$-stable random measure with Lebesgue control measure, and $\psi$ is a sufficiently regular real-valued function. This is an excellent model for the types of data discussed above because X(t) can be defined for any t in $\mathbb{R}^d$, when $\psi$ and M are extended to $\mathbb{R}^d$ as well. A Marked Point Process can be defined using model (1); data from a Marked Point Process are of the form $\{t_i, X(t_i)\}$ for $i = 1, 2, \ldots$, where the points $t_1, t_2, \ldots$ are generated by a Poisson Random Measure. We focus on the non-Gaussian case where $\alpha < 2$, so that the variance of X is infinite. Nevertheless, we study the sample variance statistic, since it forms a suitable studentization of the mean; see McElroy and Politis (2002). The sample variance (as well as its square root, the sample standard deviation) is always well-defined, even though the true variance may be infinite.

The second section of this paper develops some theory on continuous-time stable processes, and the convergence of integrated partial sums of the data is established. This is relevant to our main discussion since a continuously observed process must be defined before we can conceive of a Marked Point Process, as the random field must be well-defined at every observation point t. Since the limit results for continuous-time stable processes are new, and helpful towards our Marked Point Process problem, they are also included.

In the third section, we describe the Marked Point Process situation, and derive the joint asymptotics for the sample mean and sample variance. As expected, the limit is stochastic, but its randomness only comes from the stable random measure M, and not from the Poisson Random Measure N. It will be seen that our results generalize the limit theory of Karr (1986) to the case of infinite variance.

Finally, it is well known that subsampling is applicable in the context of a Marked Point Process under some conditions; see Chapter 6 of Politis, Romano, and Wolf (1999), hereafter denoted PRW (1999). Our limit theorems permit us to verify the subsampling requirements, so that a valid confidence interval for the mean is constructed in our Section 4. Two methods are presented: one based on the asymptotics of the sample mean when $\alpha$ is known, and one based on the asymptotics of the self-normalized sample mean when $\alpha$ is not known. These methods are tested and compared through a simulation study in Section 5. All technical proofs are placed in the Appendix, Section 6.
2 Continuous Parameter Processes

In this section we develop a limit theory for the sample mean and sample variance of an $\alpha$-stable continuous-time random field $\{X(t), t \in \mathbb{R}^d\}$. Consider an $\alpha$-stable random measure M with skewness intensity $\beta(\cdot)$ and Lebesgue control measure (denoted by $\lambda$) defined on the space $\mathbb{R}^d$. The random measure is independently scattered, and for Lebesgue-measurable sets A the distribution of M(A) is $\alpha$-stable with scale $\lambda(A)^{1/\alpha}$, skewness $\int_A \beta(x)\,\lambda(dx)/\lambda(A)$, and location 0; see Samorodnitsky and Taqqu (1994, page 118) for details. This choice of control measure reflects our desire that the process be strictly stationary; translation invariance is a necessary condition for stationarity in such models. In addition, it is necessary for the skewness intensity to be translation invariant for stationarity, i.e., $\beta(x)$ is constant as a function of x. We will denote this constant by $\beta$. Let $\psi$ be a filter function in

L^\alpha := \{ f : \|f\|_\alpha^\alpha := \int_{\mathbb{R}^d} |f(x)|^\alpha\, \lambda(dx) < \infty \}

that is continuous and bounded for almost every x with respect to Lebesgue measure. Then we may construct the following stochastic integral with respect to an $\alpha$-stable random measure M (see Samorodnitsky and Taqqu, 1994):

X(t) = \int_{\mathbb{R}^d} \psi(x + t)\, M(dx)   (2)

with $t \in \mathbb{R}^d$. We stipulate that the number $\alpha$ is in (0, 2] and that the skewness constant $\beta$ is in [-1, 1] (later we require $\psi$ to be integrable); $\alpha$ will be fixed throughout the discussion. Note that $\alpha = 2$ corresponds to a Gaussian stochastic process, and has been extensively studied. It is a fact that the random variable X(t) is $\alpha$-stable with scale parameter

\sigma = \left( \int_{\mathbb{R}^d} |\psi(x + t)|^\alpha\, \lambda(dx) \right)^{1/\alpha} = \left( \int_{\mathbb{R}^d} |\psi(x)|^\alpha\, \lambda(dx) \right)^{1/\alpha}

and skewness parameter

\beta_X = \beta\, \frac{ \int_{\mathbb{R}^d} \psi(x + t)^{\langle\alpha\rangle}\, \lambda(dx) }{ \sigma^\alpha } = \beta\, \frac{ \int_{\mathbb{R}^d} \psi(x)^{\langle\alpha\rangle}\, \lambda(dx) }{ \sigma^\alpha }

where $b^{\langle\alpha\rangle} = \mathrm{sign}(b)\,|b|^\alpha$. The location parameter is zero unless $\alpha = 1$, in which case it is

\mu = -\frac{2}{\pi}\, \beta \int_{\mathbb{R}^d} \psi(x + t) \log|\psi(x + t)|\, \lambda(dx) = -\frac{2}{\pi}\, \beta \int_{\mathbb{R}^d} \psi(x) \log|\psi(x)|\, \lambda(dx).

Intuitively, we may think of X as the convolution of $\psi$ and M, in analogy with the infinite-order moving average of classical time series analysis. Essentially, one runs the independent-increments $\alpha$-stable measure M through the linear filter $\psi$, and the resulting time series is strictly stationary with non-trivial dependence; two random variables X(t) and X(t + k) are independent if and only if the lag k exceeds the diameter of $\psi$'s support. Of course, $\psi$ need not be compactly supported, in which case all the variables are dependent. Hence, this construction makes for an interesting and relevant linear heavy-tailed model. As shown in Proposition 1 of McElroy and Politis (2004), the model defined by equation (1) is well-defined and stationary. That is, for each t, the random variable X(t) is $\alpha$-stable with location zero (unless $\alpha = 1$, in which case the location is $\mu$), constant skewness, and constant scale.
We will be interested in the asymptotic distribution of the partial sums. In order to consider fairly general, non-rectangular regions, we let K be a prototype region, which is a Lebesgue-measurable set in $\mathbb{R}^d$ with measurable boundary $\partial K$ such that $\lambda(\partial K) = 0$; see Nordman and Lahiri (2003) for background on this concept. Then let $K_n = n \cdot K$, which essentially scales the prototype region by the integer n. Our statistics are computed over $K_n$, and our asymptotic results are achieved as $n \to \infty$. This device allows us to consider non-rectangular regions, and at the same time is a realistic construction. The appropriate rate of convergence for the partial sums will then be $n^{d/\alpha}$, as shown in Theorem 1 below. We wish to average X(t) over all points in $K_n$, and so to that end we must calculate

\int_{K_n} X(t)\, \lambda(dt).   (3)

Now a discrete sum of the random field is certainly well-defined by the linearity of the $\alpha$-stable random integral, but it is not a priori clear that (3) makes sense. Thus we make the following definition:

Definition 1 By the expression (3) we mean the limit in probability as $m \to \infty$ of the following:

\sum_{t_i \in K_n^m} X(t_i)\, \Delta t_i,   (4)

where $K_n^m$ is a mesh of m points $t_i$ in $K_n$, $\Delta t_i$ is the d-dimensional volume of the elements of the mesh, and the mesh gets progressively finer as m is increased (this is the usual Riemann sums construction).
Let us establish that this definition makes sense. By using the linearity of the stable integral, line (4) becomes

\int_{\mathbb{R}^d} \sum_{t_i \in K_n^m} \psi(t_i + x)\, \Delta t_i\; M(dx) = \int_{\mathbb{R}^d} F_m(x)\, M(dx)   (5)

where $F_m(x) = \sum_{t_i \in K_n^m} \psi(t_i + x)\, \Delta t_i$. Now it follows that the limit in probability as $m \to \infty$ of (5) is $\int_{\mathbb{R}^d} F(x)\, M(dx)$ for $F(x) := \int_{K_n} \psi(t + x)\, \lambda(dt)$, so long as

\int_{\mathbb{R}^d} |F_m(x) - F(x)|^\alpha\, \lambda(dx) \to 0

as $m \to \infty$. Since the integrands are bounded in $L^1$, we may apply the Lebesgue Dominated Convergence Theorem and obtain our result, since $F_m(x) \to F(x)$ pointwise. Thus we have established that the expression (3) makes sense, and also that it is equal to

\int_{\mathbb{R}^d} F(x)\, M(dx) = \int_{\mathbb{R}^d} \int_{K_n} \psi(t + x)\, \lambda(dt)\, M(dx).   (6)

In a similar fashion, one can define the integral of the second moment

\int_{K_n} X^2(t)\, \lambda(dt)   (7)

as a limit in probability of a Riemann sum of $X^2(t)$; unfortunately, due to squaring, a nice representation such as (6) is not possible. However, the Laplace transform of (7) is closely related to the Fourier transform of

\int_{\mathbb{R}^d} \int_{K_n} \psi(t + x)\, B(dt)\, M(dx),   (8)

where B is an independent Gaussian random measure. This is shown in the proof of Theorem 1. Note that, conditional on B, (8) is an $\alpha$-stable random variable with scale

\left( \int_{\mathbb{R}^d} \left| \int_{K_n} \psi(x + t)\, B(dt) \right|^\alpha \lambda(dx) \right)^{1/\alpha}.
Now we are interested in the continuous versions of the sample mean and sample variance; it is sufficient to examine the asymptotics of

\left( n^{-d/\alpha} \int_{K_n} X(t)\, \lambda(dt),\; n^{-2d/\alpha} \int_{K_n} X^2(t)\, \lambda(dt) \right)

which we explore through the joint Fourier/Laplace transform. For two random variables A and B with $B \geq 0$, this is defined by

\Psi_{A,B}(\theta, \gamma) = E\exp\{ i\theta A - \gamma B \}, \quad \theta \in \mathbb{R},\; \gamma \geq 0.

Like the joint characteristic function, the pointwise convergence of $\Psi_{A_n, B_n}$ to a function continuous at (0, 0) establishes joint weak convergence of $A_n$ and $B_n$; see Fitzsimmons and McElroy (2006). The next theorem gives a complete answer to our inquiry:
Theorem 1 Consider a random field defined by the model given by (1), where $0 < \alpha \leq 2$. Then the sample first and second moments jointly have a limit:

\left( n^{-d/\alpha} \int_{K_n} X(t)\, \lambda(dt),\; n^{-2d/\alpha} \int_{K_n} X^2(t)\, \lambda(dt) \right) \;\stackrel{\mathcal{L}}{\Longrightarrow}\; (S_\alpha(\psi),\, U_\alpha(\psi))   (9)

as $n \to \infty$; in the above, $U_\alpha(\psi)$ is nondegenerate only if $\alpha < 2$. $S_\alpha(\psi)$ is an $\alpha$-stable random variable with scale parameter $\lambda(K)^{1/\alpha}|\tau|$, where $\tau = \int_{\mathbb{R}^d} \psi(x)\, \lambda(dx)$, and skewness parameter $\beta\, \mathrm{sign}(\tau)$. If $\alpha \neq 1$, the location parameter is zero; otherwise the location parameter is $-\frac{2}{\pi}\beta\lambda(K)\tau \log|\tau|$. When either $\alpha \neq 1$, or $\alpha = 1$ and $\beta = 0$, we may write

S_\alpha(\psi) \stackrel{\mathcal{L}}{=} \tau\, \lambda(K)^{1/\alpha}\, M(B)

for the marginal distribution, where $B = (0,1]^d$ denotes the unit cube. $U_\alpha(\psi)$ is, for $\alpha < 2$, an $\alpha/2$-stable random variable with scale parameter

2\left[ \lambda(K)\, E|G|^\alpha \right]^{2/\alpha} \int_{\mathbb{R}^d} \psi^2(x)\, \lambda(dx)\, \left( \cos(\pi\alpha/4) \right)^{2/\alpha},

skewness 1, and location 0, where G is a standard normal random variable. If $\alpha = 2$, then $U_\alpha(\psi)$ is a point mass at the second moment of X, which is $2\int_{\mathbb{R}^d} \psi^2(s)\, \lambda(ds)$. We may write the marginal distribution as

U_\alpha(\psi) \stackrel{\mathcal{L}}{=} 2\left[ \lambda(K)\, E|G|^\alpha \right]^{2/\alpha} \int_{\mathbb{R}^d} \psi^2(x)\, \lambda(dx)\, \left( \cos(\pi\alpha/4) \right)^{2/\alpha}\, \Sigma(1)

where $\Sigma(\cdot)$ is an $\alpha/2$-stable subordinator. The limiting joint Fourier/Laplace transform $E\exp\{ i\theta S_\alpha(\psi) - \gamma U_\alpha(\psi) \}$ for $\alpha < 2$ is (letting $\sigma_2 = ( \int_{\mathbb{R}^d} \psi^2(x)\, \lambda(dx) )^{1/2}$)

\exp\Big\{ -\lambda(K)\, E\big| \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \big|^\alpha \Big( 1 - i\beta\, \frac{ E\big( \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \big)^{\langle\alpha\rangle} }{ E\big| \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \big|^\alpha } \Big) - i\lambda(K)\frac{2}{\pi}\beta\, 1_{\{\alpha=1\}}\, E\Big[ \big( \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \big) \log\big| \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \big| \Big] \Big\}   (10)

for real $\theta$ and $\gamma \geq 0$.

Remark 1 One can develop confidence intervals via this result. However, due to considerations of space, we will give the details only in the Marked Point case, as the continuous case is much simpler.
3 Marked Point Processes

We will now consider the more intricate situation wherein the observation locations of the random field are themselves random. It often happens in statistical problems that random field data is not observed at lattice points, but instead at points scattered around the observation region with no discernible pattern. Frequently, we can model this situation through the employment of a Random Measure for the point locations. Generally, this probabilistic structure is referred to as a Marked Point Process $\tilde{N}$:

\tilde{N} = \sum_i \delta_{(T_i,\, X(T_i))}

for $\delta_x(A) = 1$ if $x \in A$ and 0 otherwise. See Karr (1986) for a treatment of Marked Point Processes for $L^2$ random fields. If we wish to impose that the distribution of points does not depend on the location of the observation region, but only on its size and shape, then we say the random measure is spatially homogeneous; this is like a stationarity assumption. Also it is often sensible that the distribution of points in one observation region can be assumed independent of the distribution of points in another disjoint observation region; this is the independent scattering property. It turns out that a homogeneous Poisson Random Measure (PRM) satisfies these properties, and therefore is a reasonable model in many scenarios. So let N denote a PRM with mean measure $\mu$ (hence N is sometimes denoted PRM($\mu$)), i.e.,

Definition 2 We say that N is PRM($\mu$) on the measure space $\{\mathbb{R}^d, \mathcal{B}, \mu\}$ (where $\mathcal{B}$ are the Borel sets in $\mathbb{R}^d$) if and only if it is an independently scattered, countably additive random measure that satisfies

P[N(A) = k] = \exp\{-\mu(A)\}\, \frac{\mu(A)^k}{k!}, \quad k = 0, 1, \ldots   (11)

for every $A \in \mathcal{B}_0 := \{ A \in \mathcal{B} : \mu(A) < \infty \}$. In other words, $N(A) \sim \mathrm{Pois}(\mu(A))$. We call $\mu$ the mean measure of N.

Remark 2 More generally, we could just define the mean measure to be $\mu(\cdot) := E\,N(\cdot)$. Now if we impose that $\mu$ be a translation-invariant measure on $\mathbb{R}^d$, then spatial homogeneity follows at once from (11). Of course, $\mu$ must then be Lebesgue measure (denoted by $\lambda$, as in the previous section) modulo some constant positive multiplicative factor, i.e., $\mu = r\lambda$ for some $r \in \mathbb{R}^+$.
We are interested in investigating the limit behavior of the sample mean and sample variance over the observation region $K_n$ (we preserve the notation from the last section), where the data locations are now determined by the random measure N, which is independent of the stochastic process (1). Thus we wish to study the joint convergence of

\left( N(K_n)^{-1/\alpha} \int_{K_n} X(t)\, N(dt),\; N(K_n)^{-2/\alpha} \int_{K_n} X^2(t)\, N(dt) \right)   (12)

as $n \to \infty$; note that $N(K_n)$ is the actual observed sample size.
Theorem 2 Consider a continuous-parameter random field generated from the model given by (1), where $0 < \alpha \leq 2$ and the skewness intensity is constant. Suppose that a PRM N with mean measure $\mu = r\lambda$, independent of the stochastic process, governs the distribution of observation locations. If the observation region is the set $K_n$ and $\alpha < 2$, then the normalized sample mean and sample variance computed over the observation region jointly converge in distribution to an $\alpha$-stable random variable $\tilde{S}_\alpha(\psi)$ and a positive $\alpha/2$-stable random variable $\tilde{U}_\alpha(\psi)$ as $n \to \infty$:

\left( N(K_n)^{-1/\alpha} \int_{K_n} X(t)\, N(dt),\; N(K_n)^{-2/\alpha} \int_{K_n} X^2(t)\, N(dt) \right)   (13)
\stackrel{\mathcal{L}}{\Longrightarrow} \left( r^{-1/\alpha}\, \tilde{S}_\alpha(\psi),\; r^{-2/\alpha}\, \tilde{U}_\alpha(\psi) \right).

The joint Fourier/Laplace transform of the limit ($\theta$ real and $\gamma > 0$) is given by

E\exp\{ i\theta \tilde{S}_\alpha(\psi) - \gamma \tilde{U}_\alpha(\psi) \} = \exp\left\{ -\sigma^\alpha(\theta,\gamma) \left( 1 - i\, \frac{\beta(\theta,\gamma)}{\sigma^\alpha(\theta,\gamma)} \right) + i\, 1_{\{\alpha=1\}}\, \mu(\theta,\gamma) \right\}

with the parameters given by

\sigma(\theta,\gamma) = \left( E|g(\theta,\gamma)|^\alpha \right)^{1/\alpha}
\beta(\theta,\gamma) = \beta\, E\left( g(\theta,\gamma) \right)^{\langle\alpha\rangle}
\mu(\theta,\gamma) = -\frac{2}{\pi}\,\beta\, E[\, g(\theta,\gamma)\, \log|g(\theta,\gamma)|\, ]
g(\theta,\gamma) = \theta \int_{\mathbb{R}^d} \psi(s)\, N(ds) + \sqrt{2\gamma}\, \sqrt{ \int_{\mathbb{R}^d} \psi^2(s)\, N(ds) }\; G

for G a standard normal random variable independent of N. Hence $\tilde{S}_\alpha(\psi)$ is $\alpha$-stable with scale $\sigma(1, 0)$, skewness $\beta(1,0)/\sigma^\alpha(1,0)$, and location $1_{\{\alpha=1\}}\, \mu(1, 0)$; whereas $\tilde{U}_\alpha(\psi)$ is $\alpha/2$-stable with scale

2\left[ \cos(\pi\alpha/4)\; E\left( \int_{\mathbb{R}^d} \psi^2(s)\, N(ds) \right)^{\alpha/2} E|Z|^\alpha \right]^{2/\alpha}

(Z again denoting a standard normal random variable), skewness one, and location zero. If $\alpha = 2$, the limit of the sample variance $\tilde{U}_\alpha(\psi)$ is a point mass at $2r\int_{\mathbb{R}^d} \psi^2(s)\, \lambda(ds)$. The stated limit for the sample mean still holds when $\alpha = 2$; in this case, $\tilde{S}_2(\psi)$ is a Gaussian random variable.
As a special case, let us fix $\alpha = 2$ in Theorem 2: the limiting squared scale $\sigma^2(\theta, 0)$ (half the variance) of the partial sums is then

E\left| \theta \int_{\mathbb{R}^d} \psi(s)\, N(ds) \right|^2 = \theta^2 \left[ \left( \int_{\mathbb{R}^d} \psi(s)\, r\lambda(ds) \right)^2 + \int_{\mathbb{R}^d} \psi^2(s)\, r\lambda(ds) \right]
= \theta^2 \left[ r^2 \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \psi(x)\psi(x+t)\, \lambda(dx)\lambda(dt) + r \int_{\mathbb{R}^d} \psi^2(s)\, \lambda(ds) \right]
= \theta^2 \left[ \frac{1}{2}\, r^2 \int_{\mathbb{R}^d} \rho(t)\, \lambda(dt) + \frac{1}{2}\, r\, \rho(0) \right]

where $\rho(t) = \mathrm{Cov}(X(t), X(0)) = 2\int_{\mathbb{R}^d} \psi(x)\psi(x+t)\, \lambda(dx)$ is the covariance (or codifference) function. Therefore the variance of the limit $r^{-1/2}\tilde{S}_2(\psi)$ is $r\int_{\mathbb{R}^d} \rho(t)\, \lambda(dt) + \rho(0)$, which agrees with equation (3.10) of Karr (1986) in the finite variance case.
Remark 3 Since the codifference function $\tau(t) = 2\sigma^\alpha - \|\psi - \psi(\cdot + t)\|_\alpha^\alpha$ is a natural generalization of the covariance function to $\alpha < 2$, one may be tempted to conjecture that

E\left| \int_{\mathbb{R}^d} \psi(s)\, N(ds) \right|^\alpha = \frac{1}{2}\left( \int_{\mathbb{R}^d} \tau(s)\, \lambda(ds) + \tau(0) \right).

This is in fact false, as evaluation on a simple $\psi$ will confirm.
Corollary 1 follows from Theorem 2 by the continuous mapping theorem:

Corollary 1 Under the same assumptions as Theorem 2, the self-normalized mean converges as $n \to \infty$:

\frac{ \int_{K_n} X(t)\, N(dt) }{ \sqrt{ \int_{K_n} X^2(t)\, N(dt) } } \;\stackrel{\mathcal{L}}{\Longrightarrow}\; \frac{ \tilde{S}_\alpha(\psi) }{ \sqrt{ \tilde{U}_\alpha(\psi) } }.

Remark 4 Note that no knowledge of $\alpha$ is required to compute the self-normalized mean. It is also interesting that the limit does not depend on r. The ratio is nonconstant, since a squared $\alpha$-stable variable never has an $\alpha/2$-stable distribution.
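Computationally the self-normalized mean is trivial; here is a one-function sketch (ours) matching Corollary 1, requiring neither $\alpha$ nor r.

```python
import numpy as np

def self_normalized_mean(marks):
    """sum X(t_i) / sqrt(sum X(t_i)^2) over the observed marks in K_n."""
    marks = np.asarray(marks, dtype=float)
    return marks.sum() / np.sqrt(np.sum(marks ** 2))
```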
4 Subsampling Applications

This next section of the paper describes how subsampling methods may be used for practical application of the results of the previous section. The idea is to use the subsampling distribution estimator as an approximation of the limit distribution of our mean-centered statistic from a Marked Point Process that depends on unknown parameters (including $\alpha$); this procedure will yield approximate quantiles for its sampling distribution, and thus confidence intervals for the mean can be formed. For more details and background on these methods, see PRW (1999).

In order to use subsampling, it is desirable that the model satisfy certain mixing conditions. Strong mixing is one common condition on the dependence structure which is sufficient to insure the validity of the subsampling theorems; see Rosenblatt (1956) for its introduction. Many time series models satisfy the assumption of strong mixing; for Gaussian processes, the summability of the autocovariance function implies the strong mixing property. In the case where d = 1, if our process (1) is symmetric, the strong mixing condition is always satisfied, as Proposition 3 of McElroy and Politis (2004) demonstrates. The mixing condition that is needed in general is a bit more complicated, and is defined in the next subsection.
4.1 Subsampling with Known Index

In this subsection assume that $\alpha$ is known, and is greater than 1; consider the following location-shifted model:

Z(t) := X(t) + \mu = \int_{\mathbb{R}^d} \psi(x + t)\, M(dx) + \mu.   (14)

This is the appropriate model for stable stationary random fields with nonzero location. Note that, since $\alpha > 1$, the location parameter of X(t) is zero. For applications, we will suppose that our data are the observations $\{ Z(t_i) : t_i \in K_n \}$ for some specified observation region $K_n$, and a random collection of $N(K_n)$ points $t_i$ generated by the Poisson Random Measure N. Our goal is to estimate the location parameter $\mu$ (i.e., the mean) with the sample mean

\hat{\mu}_{K_n} = \frac{1}{N(K_n)} \int_{K_n} Z(t)\, N(dt).

Let $\alpha_Z(k; l_1)$ be the mixing coefficients defined in PRW (1999, p. 141). Since by (13) the sample mean converges, we can apply Theorem 6.3.1 of PRW (1999) to this situation. Let $K_n(1-c) := \{ y \in K_n : B + y \subset K_n \}$ for $B := cK_n$, where $c = c_n \in (0, 1)$ is a sequence that tends to zero as the diameter of $K_n$, denoted by $\delta(K_n)$, tends to infinity. We also require that $c_n \delta(K_n) \to \infty$ as $n \to \infty$. Since $\delta(K_n) = \delta(nK) = n\,\delta(K)$, this means that $1/c_n = o(n)$. In practice, since $K_n$ is not clearly defined by the data, one may take it to be the convex hull or rectangular hull of the observation points. Then it is simple to produce the scaled-down copy B of $K_n$, once c is chosen. Let $\hat{\mu}_{K_n,B,y}$ be the statistic evaluated on the set B + y for any $y \in K_n(1-c)$, and let $L_{K_n,B}(x)$ be the subsampling distribution estimator (6.8) of PRW (1999). We will assume (6.9) of PRW (1999) as a condition on the mixing coefficients; this condition is easily satisfied in the d = 1 special case if the random field is strong mixing (which is always true for symmetric one-dimensional stable integrals, by Proposition 3 of McElroy and Politis (2004)). Then by Theorem 6.3.1 of PRW (1999),

L_{K_n,B}(x) \stackrel{P}{\longrightarrow} J(x)

as $n \to \infty$, where J(x) is the cdf of $r^{-1/\alpha} \tilde{S}_\alpha(\psi)$.
Remark 5 It follows that an asymptotically correct $(1-p)100\%$ confidence interval for $\mu$ is given by

\left[ \hat{\mu}_{K_n} - L_{1-p/2}\, N(K_n)^{1/\alpha - 1},\; \hat{\mu}_{K_n} - L_{p/2}\, N(K_n)^{1/\alpha - 1} \right]

where $L_p$ is the p-th quantile of $L_{K_n,B}$, defined as $\inf\{ x : L_{K_n,B}(x) \geq p \}$. Note that explicit knowledge of $\alpha$ is necessary for this construction.
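A sketch (ours) of the resulting Method 1 interval; the array `subsample_stats` is assumed to hold the values $N(B+y)^{1-1/\alpha}(\hat{\mu}_{K_n,B,y} - \hat{\mu}_{K_n})$ collected over the subregions $B + y$.

```python
import numpy as np

def method1_ci(mu_hat, n_obs, subsample_stats, alpha, p=0.10):
    """Remark 5 interval with alpha known; n_obs is the full sample size N(K_n)."""
    lo_q, hi_q = np.quantile(subsample_stats, [p / 2.0, 1.0 - p / 2.0])
    scale = n_obs ** (1.0 / alpha - 1.0)       # the rate N(K_n)^(1/alpha - 1)
    return mu_hat - hi_q * scale, mu_hat - lo_q * scale
```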
4.2 Subsampling with Unknown Index

The method outlined above is often not immediately applicable because the rate of convergence $\tau_{\lambda(K_n)}$ depends on $\alpha$, which is typically unknown; thus, in practice, it may be necessary to estimate $\alpha$. This can be done via a subsampling estimator of the rate, as discussed in Bertail, Politis, and Romano (1999) or PRW (1999, Chapter 8). These methods can be extended to the Marked Point Process scenario in a straightforward fashion; details are omitted here, but may be obtained by contacting the authors. The single important difference from Bertail, Politis, and Romano (1999) is that the data-driven rate $\tau_{N(B+y)}$ appearing in the subsampling distribution estimator must be replaced by a deterministic rate $\tau_{\mu(B+y)}$; one can even use Lebesgue measure $\lambda$ instead of the mean measure $\mu$, since the use of logarithms in the estimator ensures that the differences in scale are irrelevant. A rough sketch of the rate-estimation idea is given below.
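To convey the flavor of that rate estimation, here is a sketch (ours) of the quantile-based idea of Bertail, Politis, and Romano (1999): if the statistic must be scaled by $b^\epsilon$ to have a nondegenerate limit, then log interquantile ranges of the unscaled subsample statistics decrease roughly linearly in log b, and the slope recovers $\epsilon$; in the sample-mean setting $\epsilon = 1 - 1/\alpha$, which yields an estimate of $\alpha$. The function name and the exact quantile choices are illustrative, not from the cited papers.

```python
import numpy as np

def estimate_rate_exponent(stats_by_blocksize, t_lo=0.25, t_hi=0.75):
    """Sketch of a subsampling rate estimator.

    stats_by_blocksize: dict mapping block size b (e.g. lambda(B)) to an array
    of unscaled subsample statistics mu_hat_{B,y} - mu_hat.  If the rate is
    b^eps, then log IQR(b) ~ const - eps * log b, so -slope estimates eps.
    """
    log_b, log_iqr = [], []
    for b, stats in stats_by_blocksize.items():
        q_lo, q_hi = np.quantile(stats, [t_lo, t_hi])
        log_b.append(np.log(b))
        log_iqr.append(np.log(q_hi - q_lo))
    slope, _ = np.polyfit(log_b, log_iqr, 1)
    eps = -slope
    return eps      # eps estimates 1 - 1/alpha, so alpha ~ 1/(1 - eps)
```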
Alternatively, one may be able to avoid the estimation of $\alpha$ by using a self-normalized estimate of the mean; e.g., one may consider dividing by the sample standard deviation as in our Corollary 1. Suppose that $\hat{\sigma}_{K_n}$ is an estimate of scale for X(t). Then we may form the ratio

\tau_{N(K_n)}\, \frac{ \hat{\mu}_{K_n} - \mu }{ \hat{\sigma}_{K_n} }

where $\tau_u$ is the appropriate rate of convergence, such that the ratio has a nontrivial weak limit. The goal is to self-normalize such that $\tau_u$ will be a known rate, i.e., a rate that does not depend on unknown model parameters. A leading example is to self-normalize such that an asymptotic result with $\tau_u = \sqrt{u}$ holds. This is an improvement over the convergence rate of the (un-normalized) sample mean, where $\tau_u$ depends on $\alpha$, which is unknown.

We consider, generally, the scenario in which such a self-normalized convergence holds, such that $\tau_u$ is a known rate. Corollary 1 furnishes an example of such a scenario; see the discussion following Remark 6 below. Now we adjust the definition of the subsampling distribution estimator accordingly:

L_{K_n,B}(x) := \lambda(K_n(1-c))^{-1} \int_{K_n(1-c)} 1_{\left\{ \tau_{N(B+y)}\, \frac{ \hat{\mu}_{K_n,B,y} - \hat{\mu}_{K_n} }{ \hat{\sigma}_{K_n,B,y} } \leq x \right\}}\, dy,   (15)

with $\hat{\sigma}_{K_n,B,y}$ defined similarly to $\hat{\mu}_{K_n,B,y}$. On a practical note, it is possible that $N(B+y) = 0$; in this case we get a point mass in $L_{K_n,B}(x)$ at zero, but this won't affect the asymptotics.
Theorem 3 Assume the following convergences:

\tau_{N(K_n)}\, \frac{ \hat{\mu}_{K_n} - \mu }{ \hat{\sigma}_{K_n} } \;\stackrel{\mathcal{L}}{\Longrightarrow}\; J   (16)

a_{N(K_n)}\, ( \hat{\mu}_{K_n} - \mu ) \;\stackrel{\mathcal{L}}{\Longrightarrow}\; V, \qquad d_{N(K_n)}\, \hat{\sigma}_{K_n} \;\stackrel{\mathcal{L}}{\Longrightarrow}\; W

for positive $a_n$ and $d_n$ such that $\tau_n = a_n / d_n$, and W does not have positive mass at zero. Let $\epsilon$ be a parameter and assume that $\tau_u$ has the form (6.6) of PRW (1999). Let $c = c_n \in (0, 1)$ be such that $c_n \to 0$ but $c_n \delta(K_n) \to \infty$. Finally, assume the mixing condition (6.9) of PRW (1999). Then the following conclusions hold:

(i.) $L_{K_n,B}(x) \stackrel{P}{\to} J(x)$ for every continuity point x of J(x).

(ii.) If $J(\cdot)$ is continuous, then $\sup_x |L_{K_n,B}(x) - J(x)| \stackrel{P}{\to} 0$.

(iii.) Let

c_{K_n,B}(1 - t) = \inf\{ x : L_{K_n,B}(x) \geq 1 - t \}.

If J(x) is continuous at $x = \inf\{ x : J(x) \geq 1 - t \}$, then

P\left\{ \tau_{N(K_n)}\, \frac{ \hat{\mu}_{K_n} - \mu }{ \hat{\sigma}_{K_n} } \leq c_{K_n,B}(1 - t) \right\} \to 1 - t.

Thus the asymptotic coverage probability of the interval $[\, \hat{\mu}_{K_n} - \tau_{N(K_n)}^{-1}\, \hat{\sigma}_{K_n}\, c_{K_n,B}(1-p),\; \infty )$ is the nominal level $1 - p$.

Remark 6 Since the subsampling distribution estimator involves Riemann integration, some numerical approximation must be made in its calculation; see Section 6.4 of PRW (1999) for further details.
As a specific application of Theorem 3, consider the model given by (14), and define the sample standard deviation by

\hat{\sigma}_{K_n} = \sqrt{ \frac{1}{N(K_n)} \int_{K_n} \left( Z(t) - \hat{\mu}_{K_n} \right)^2 N(dt) }.

Hence, we have the following convergence:

N(K_n)^{1/2}\, \frac{ \hat{\mu}_{K_n} - \mu }{ \hat{\sigma}_{K_n} } \;\stackrel{\mathcal{L}}{\Longrightarrow}\; \frac{ \tilde{S}_\alpha(\psi) }{ \sqrt{ \tilde{U}_\alpha(\psi) } }   (17)

as $n \to \infty$, which follows at once from Corollary 1. Note that when $\alpha = 2$, this convergence is valid, but $\tilde{U}_\alpha(\psi)$ is degenerate (it is a point mass at $2r\int_{\mathbb{R}^d} \psi^2(s)\, \lambda(ds)$, corresponding to the variance of X). The limiting ratio, however, is always absolutely continuous. Hence, we may apply Theorem 3 with $\tau_u = \sqrt{u}$, and obtain an asymptotic $1-p$ confidence interval for $\mu$:

\left[ \hat{\mu}_{K_n} - \frac{ L_{1-p/2}\, \hat{\sigma}_{K_n} }{ \sqrt{N(K_n)} },\; \hat{\mu}_{K_n} - \frac{ L_{p/2}\, \hat{\sigma}_{K_n} }{ \sqrt{N(K_n)} } \right]

where $L_p$ is defined via $p = L_{K_n,B}(L_p)$, i.e., $L_p$ is the p-th quantile of $L_{K_n,B}$. Note that no explicit knowledge of $\alpha$ is necessary for this construction; neither is it necessary to know the Poisson intensity r.
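A companion sketch (ours) of the Method 2 interval; `subsample_stats` is assumed to hold the self-normalized values $\sqrt{N(B+y)}\, (\hat{\mu}_{K_n,B,y} - \hat{\mu}_{K_n}) / \hat{\sigma}_{K_n,B,y}$ from (15).

```python
import numpy as np

def method2_ci(mu_hat, sigma_hat, n_obs, subsample_stats, p=0.10):
    """Self-normalized interval with tau_u = sqrt(u); alpha and r are not needed."""
    lo_q, hi_q = np.quantile(subsample_stats, [p / 2.0, 1.0 - p / 2.0])
    half = sigma_hat / np.sqrt(n_obs)
    return mu_hat - hi_q * half, mu_hat - lo_q * half
```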
5 Simulation Studies

Focus on the above mean estimation situation, with the asymptotic result (17). So in this case, $\tau_u = \sqrt{u}$ is a known rate. In this section we demonstrate the methods of Section 4 through several simulation studies. First we illustrate how a stable marked point process can be simulated, and then we discuss the practical implementation of the subsampling distribution estimators. Finally, our simulation results present the empirical coverage of the confidence intervals constructed via the subsampling methodology.
5.1 Implementation

Following Samorodnitsky and Taqqu (1994, p. 149), and also Resnick, Samorodnitsky, and Xue (1999), the series representation of (1) is

X(t) = C_\alpha^{1/\alpha} \sum_{i \geq 1} \epsilon_i\, \Gamma_i^{-1/\alpha}\, \psi(U_i + t)\, q(U_i)^{-1/\alpha}

so long as X(t) is symmetric $\alpha$-stable. In this representation,

- $\{\epsilon_i\}$ are iid Rademacher random variables;
- $\{\Gamma_i\}$ are the arrival times of a unit-rate Poisson process (so $\Gamma_i$ is a sum of i iid unit exponentials);
- $\{U_i\}$ are iid random variables with pdf q;
- $C_\alpha$ is a positive constant defined in (1.2.9) of Samorodnitsky and Taqqu (1994);

and all three sequences are independent of each other. We have freedom to select q, as long as it is a pdf with support on the whole real line. In our simulations, we take q to be the Cauchy density in $\mathbb{R}^d$; in practice, a heavy-tailed q was more effective in producing realistic simulations. For simulation, we adopt the following procedure: firstly, fix $\alpha$ and determine the observation region $K_n = nK$, which can have a variety of shapes in $\mathbb{R}^d$. Also let $\mu$ be Lebesgue measure for simplicity. If we want a sample of size n, we might pick $K_n$ such that $\mu(K_n) = n$, though this is no guarantee that we will obtain n data points.

1. Simulate $N(K_n)$, which is Poisson with mean rate $\mu(K_n)$.
2. Simulate $T_1, \ldots, T_{N(K_n)}$ iid from a uniform distribution on $K_n$.
3. Simulate $\{\epsilon_i\}$, $\{\Gamma_i\}$, and $\{U_i\}$ for $i \leq I$, where I is a predetermined truncation threshold.
4. Determine a vector v, which is the center of the region $K_n$. Compute

X(T_j) = C_\alpha^{1/\alpha} \sum_{i=1}^{I} \epsilon_i\, \Gamma_i^{-1/\alpha}\, \psi(U_i + T_j - v)\, q(U_i)^{-1/\alpha}

for $j = 1, \ldots, N(K_n)$.

Use of the centering constant v is optional, but we found that it improves the quality of the simulation; theoretically, it merely introduces a deterministic lag, and thus does not affect the distribution. One choice of v is to let each component be defined by the various centroids. As mentioned in Samorodnitsky and Taqqu (1994), simulation by this method is unwieldy because convergence in I is slow. However, there is no alternative method for correlated stable random fields. By trial and error, we found that I = 100 gave a decent trade-off between simulation quality and speed; increasing I to 1000 gave little visible improvement to the simulation, while greatly retarding the speed. A sketch of this procedure in code is given below.
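The following Python sketch (ours) implements steps 1 through 4 under the choices made in this paper: q a product of standard Cauchy densities (our reading of "the Cauchy density in R^d"), the Gaussian filter of Section 5.2, intensity r = 1, and truncation I = 100. The closed form for $C_\alpha$ is taken from (1.2.9) of Samorodnitsky and Taqqu (1994), valid for $\alpha \neq 1$.

```python
import numpy as np
from math import cos, gamma, pi

def c_alpha(alpha):
    # S&T (1.2.9): C_alpha = (1 - alpha) / (Gamma(2 - alpha) cos(pi alpha / 2)), alpha != 1
    return (1.0 - alpha) / (gamma(2.0 - alpha) * cos(pi * alpha / 2.0))

def psi(x):
    # Gaussian filter psi(x1, x2) = exp(-(x1^2 + x2^2)/2) from Section 5.2
    return np.exp(-0.5 * np.sum(x ** 2, axis=-1))

def simulate_marked_field(alpha, lower, upper, trunc=100, rng=None):
    """Steps 1-4 of Section 5.1 on the box [lower, upper] with r = 1 (a sketch)."""
    rng = np.random.default_rng(rng)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    d = len(lower)
    # Steps 1-2: Poisson sample size, then uniform locations T_j on K_n
    n_pts = rng.poisson(np.prod(upper - lower))
    T = lower + (upper - lower) * rng.random((n_pts, d))
    # Step 3: Rademacher signs, unit-Poisson arrival times, Cauchy-distributed U_i
    eps = rng.choice([-1.0, 1.0], size=trunc)
    Gam = np.cumsum(rng.exponential(size=trunc))
    U = rng.standard_cauchy(size=(trunc, d))
    q_U = np.prod(1.0 / (pi * (1.0 + U ** 2)), axis=1)   # product Cauchy pdf
    # Step 4: truncated series, centered at the midpoint v of the region
    v = (lower + upper) / 2.0
    coef = (c_alpha(alpha) ** (1.0 / alpha) * eps
            * Gam ** (-1.0 / alpha) * q_U ** (-1.0 / alpha))
    X = np.array([np.sum(coef * psi(U + (t - v))) for t in T])
    return T, X
```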
Next, we need to compute the subsampling distribution estimators given by (6.8) of PRW (1999) and (15). The easiest method is to approximate the integrals with a Monte Carlo approximation. This is achieved by drawing a large number of random variables (we used 10,000) uniform on $K_n(1-c)$. One detail is that for a given simulated y, the number N(B + y) could be zero; this will create divide-by-zero problems for $\hat{\mu}_{K_n,B,y}$, so by convention the latter is set equal to zero in this case. In practice, this creates a point mass at zero in the subsampling distribution, but the effect is lessened by taking larger c values. Another practical problem occurs when N(B + y) = 1, which creates a divide-by-zero issue for the self-normalized subsampling distribution estimator. In this case, we would have

\tau_{N(B+y)} \left( \frac{ \hat{\mu}_{K_n,B,y} - \hat{\mu}_{K_n} }{ \hat{\sigma}_{K_n,B,y} } \right) = \pm 1 \cdot \left( X(t^*) - \hat{\mu}_{K_n} \right)/0 = \pm\infty

where the sign depends on whether or not the data value at the single point $t^*$ exceeds the mean. In our code, we replace $\hat{\sigma}_{K_n,B,y}$ by a very small value in this case, so the resulting ratio will be a large positive or negative value, corresponding to whether $X(t^*)$ is greater than or less than the mean.
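A minimal sketch (ours) of the Monte Carlo approximation just described, for the self-normalized estimator (15); the zero-point and one-point conventions above are built in, with `tiny` standing in for the "very small value".

```python
import numpy as np

def subsample_stats_mc(points, marks, box_lo, box_hi, c, n_draws=10_000,
                       tiny=1e-12, rng=None):
    """Monte Carlo draws of the self-normalized subsample statistic in (15).

    K_n is the box [box_lo, box_hi] and B is the box shrunk by the factor c;
    y is drawn uniformly on K_n(1 - c).  Empty blocks contribute 0, and a
    singleton block uses `tiny` in place of its zero standard deviation.
    """
    rng = np.random.default_rng(rng)
    points = np.asarray(points, dtype=float)
    marks = np.asarray(marks, dtype=float)
    box_lo = np.asarray(box_lo, dtype=float)
    box_hi = np.asarray(box_hi, dtype=float)
    side = box_hi - box_lo
    mu_hat = marks.mean()
    stats = np.empty(n_draws)
    for k in range(n_draws):
        y = box_lo + (1.0 - c) * side * rng.random(len(side))  # y in K_n(1-c)
        inside = np.all((points >= y) & (points <= y + c * side), axis=1)
        m = int(inside.sum())                                  # N(B + y)
        if m == 0:
            stats[k] = 0.0                                     # convention
            continue
        sub = marks[inside]
        sd = sub.std() if m > 1 else tiny
        stats[k] = np.sqrt(m) * (sub.mean() - mu_hat) / max(sd, tiny)
    return stats
```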
5.2 Results

Our simulation study focuses on dimension d = 2, with $K_{100}$ given by both a square region (10 by 10) and a rectangular one (5 by 20). Using Lebesgue mean measure, this gives us samples of average size 100. The centering vector v is just given by the midpoints (5, 5) and (2.5, 10), respectively. We simulated stable processes for values $\alpha = 1.1, 1.2, \ldots, 1.9$ and a Gaussian filter function $\psi(x_1, x_2) = \exp\{-(x_1^2 + x_2^2)/2\}$, and investigated the block ratios c = .1, .2, .3, .4.

Method 1 assumes that $\alpha$ is known (see Subsection 4.1) whereas Method 2 uses a self-normalized subsampling distribution, as in Subsection 4.2. Each simulation was performed 1000 times; the tables record the proportion of simulations for which the constructed confidence interval contained the true mean of zero (the standard errors are approximately .0095, .0069, and .0031 for p = .1, .05, and .01 respectively). Clearly both methods are sensitive to the choice of c. For Method 2, the coverage is a decreasing function of c and an increasing function of $\alpha$, whereas for Method 1 the coverage is not monotonic in c and is decreasing in $\alpha$. For most cases, an optimal value of c for Method 2 would seem to lie between .2 and .3, based on the observed pattern that c = .2 resulted in over-coverage and c = .3 in under-coverage. In contrast, it seems that for high values of $\alpha$, no value of c could be found to provide good coverage for Method 1. Neither method was particularly sensitive to the shape of the sampling region, since results for the square and the rectangular region were similar. Although Method 1 had superior coverage for low $\alpha$, the overall performance of Method 2 was superior. In general, the choice of c will depend on how important under-coverage and over-coverage are for a particular problem; c is also sensitive to the shape of $K_n$. Finally, the coverage did improve with larger sample size, but for reasons of brevity those results are not displayed here.
Table 1. Coverage at Nominal Level .90, Square Region

              Method 1                    Method 2
c          .1    .2    .3    .4        .1    .2    .3    .4
α = 1.1   .926  .961  .954  .869      .983  .909  .699  .577
α = 1.2   .904  .954  .918  .857      .981  .933  .719  .591
α = 1.3   .807  .928  .909  .810      .991  .942  .752  .638
α = 1.4   .726  .911  .880  .821      .987  .961  .770  .637
α = 1.5   .622  .854  .863  .806      .995  .954  .780  .687
α = 1.6   .547  .777  .834  .777      .998  .965  .783  .721
α = 1.7   .440  .761  .802  .740      .997  .974  .798  .716
α = 1.8   .414  .712  .746  .723      .997  .984  .810  .706
α = 1.9   .366  .670  .739  .691      .999  .986  .839  .727

Table 2. Coverage at Nominal Level .95, Square Region

              Method 1                    Method 2
c          .1    .2    .3    .4        .1    .2    .3    .4
α = 1.1   .987  .989  .987  .934      .999  .984  .810  .676
α = 1.2   .987  .992  .959  .881      .996  .984  .829  .663
α = 1.3   .943  .983  .999  .992     1.000  .991  .845  .721
α = 1.4   .925  .964  .938  .886      .998  .990  .857  .701
α = 1.5   .844  .941  .929  .872      .998  .991  .856  .749
α = 1.6   .763  .871  .894  .850     1.000  .993  .873  .785
α = 1.7   .661  .860  .880  .811     1.000  .995  .874  .785
α = 1.8   .616  .808  .820  .794     1.000  .998  .890  .762
α = 1.9   .519  .776  .801  .759     1.000  .997  .901  .784

Table 3. Coverage at Nominal Level .99, Square Region

              Method 1                    Method 2
c          .1    .2    .3    .4        .1    .2    .3    .4
α = 1.1   .999  .999  .999  .981     1.000 1.000  .926  .784
α = 1.2  1.000  .999  .987  .945     1.000  .999  .941  .763
α = 1.3   .995  .995  .999  .997     1.000 1.000  .946  .806
α = 1.4   .987  .993  .985  .945     1.000 1.000  .948  .802
α = 1.5   .964  .978  .966  .925     1.000 1.000  .952  .831
α = 1.6   .930  .943  .949  .911     1.000 1.000  .955  .851
α = 1.7   .862  .927  .922  .876     1.000 1.000  .949  .863
α = 1.8   .838  .895  .885  .863     1.000 1.000  .957  .834
α = 1.9   .734  .873  .873  .827     1.000 1.000  .976  .845

Table 4. Coverage at Nominal Level .90, Rectangular Region

              Method 1                    Method 2
c          .1    .2    .3    .4        .1    .2    .3    .4
α = 1.1   .948  .965  .926  .874      .983  .861  .582  .492
α = 1.2   .893  .945  .896  .853      .987  .848  .582  .507
α = 1.3   .835  .925  .883  .838      .992  .894  .637  .548
α = 1.4   .737  .879  .848  .784      .992  .887  .632  .558
α = 1.5   .631  .840  .804  .764      .997  .921  .675  .560
α = 1.6   .571  .784  .788  .759      .996  .922  .687  .602
α = 1.7   .472  .727  .731  .693      .998  .940  .707  .601
α = 1.8   .418  .681  .712  .695      .998  .951  .728  .626
α = 1.9   .396  .645  .709  .672      .997  .954  .737  .651

Table 5. Coverage at Nominal Level .95, Rectangular Region

              Method 1                    Method 2
c          .1    .2    .3    .4        .1    .2    .3    .4
α = 1.1   .991  .990  .979  .932      .998  .960  .673  .558
α = 1.2   .979  .978  .966  .911      .998  .948  .681  .578
α = 1.3   .966  .973  .943  .901      .999  .953  .716  .599
α = 1.4   .920  .946  .911  .864      .998  .967  .724  .625
α = 1.5   .859  .909  .882  .833     1.000  .960  .752  .625
α = 1.6   .797  .884  .857  .827      .999  .976  .771  .658
α = 1.7   .719  .821  .812  .782      .999  .988  .795  .666
α = 1.8   .618  .799  .787  .764     1.000  .988  .806  .699
α = 1.9   .567  .751  .773  .745     1.000  .990  .806  .696

Table 6. Coverage at Nominal Level .99, Rectangular Region

              Method 1                    Method 2
c          .1    .2    .3    .4        .1    .2    .3    .4
α = 1.1   .999 1.000  .996  .975     1.000  .994  .818  .655
α = 1.2   .998 1.000  .994  .958     1.000  .998  .820  .656
α = 1.3  1.000  .996  .978  .952     1.000  .996  .843  .685
α = 1.4   .992  .982  .968  .931     1.000  .999  .852  .694
α = 1.5   .963  .970  .942  .889     1.000  .997  .861  .707
α = 1.6   .942  .948  .924  .886     1.000  .996  .885  .747
α = 1.7   .889  .912  .882  .852     1.000 1.000  .902  .736
α = 1.8   .832  .878  .870  .826     1.000 1.000  .894  .777
α = 1.9   .794  .856  .860  .813     1.000 1.000  .899  .776
Method 2's superior performance in simulation is interesting, since it also uses less information (it does not assume that $\alpha$ is known) than Method 1. This seems to corroborate the assertion that a data-driven normalization (such as the standard deviation) is superior in finite samples to one based purely on a rate of convergence. If an extreme does not occur in the observed data, normalization via a rate will over-compensate; conversely, if an unusual number of extremes (or an unusually large extreme) occurs, the rate under-compensates. Use of the standard deviation instead will automatically adjust in an appropriate fashion, since it will be smaller in the first scenario and larger in the second.
6 Appendix

Proof of Theorem 1 We will principally treat the $\alpha < 2$ case, since when $\alpha = 2$ the results are already known; see Karr (1986). We will consider the joint Fourier/Laplace transform of

\left( n^{-d/\alpha} \int_{K_n} X(t)\, \lambda(dt),\; n^{-2d/\alpha} \int_{K_n} X^2(t)\, \lambda(dt) \right).

We first consider the case that the filter function $\psi$ has compact support in the set $L = \{ x \in \mathbb{R}^d : |x_i| \leq l\ \forall i \}$. Then we can write the following:

E\exp\Big\{ i\theta n^{-d/\alpha} \int_{K_n} X(t)\, dt - \gamma n^{-2d/\alpha} \int_{K_n} X^2(t)\, dt \Big\}
= E\exp\Big\{ i\theta n^{-d/\alpha} \int_{K_n} X(t)\, dt + i\sqrt{2\gamma}\, n^{-d/\alpha} \int_{K_n} X(t)\, B(dt) \Big\}
= E\exp\Big\{ i n^{-d/\alpha} \int_{\mathbb{R}^d} \Big( \int_{K_n} \psi(x+t)\, W(dt) \Big)\, M(dx) \Big\}

where $W_t$ is a Brownian motion with drift $\theta$ and volatility $\sqrt{2\gamma}$, and B is a Gaussian random measure independent of M. Conditional on this Brownian motion, the scale parameter is

\sigma_n = \left( \frac{1}{n^d} \int_{\mathbb{R}^d} \left| \int_{K_n} \psi(x+t)\, W(dt) \right|^\alpha dx \right)^{1/\alpha},
the skewness is $\beta\gamma_n/\sigma_n^\alpha$, with

\gamma_n = \frac{1}{n^d} \int_{\mathbb{R}^d} \left( \int_{K_n} \psi(x+t)\, W(dt) \right)^{\langle\alpha\rangle} dx,

and the location is $1_{\{\alpha=1\}}\mu_n$, with

\mu_n = -\frac{2}{\pi}\beta\, \frac{1}{n^d} \int_{\mathbb{R}^d} \left( \int_{K_n} \psi(x+t)\, W(dt) \right) \log\left| \int_{K_n} \psi(x+t)\, W(dt) \right| dx.
We only need to determine the convergence in probability of each parameter. We focus on the scale, as the other two are proved in a similar fashion. Let us define $H_x = \int_{K_n} \psi(x+t)\, W(dt)$, which is Gaussian with mean $\theta\int_{K_n} \psi(x+t)\, dt$ and covariance

E[(H_x - E[H_x])(H_y - E[H_y])] = 2\gamma \int_{K_n} \psi(x+t)\,\psi(y+t)\, dt.

Hence all moments are bounded in $K_n$, since the variance is uniformly bounded by $2\gamma\int_{\mathbb{R}^d} \psi^2(t)\, dt$. Let $B = (0, 1]^d$, and $J_j = \int_B |H_{x+j}|^\alpha\, dx$, so that

\sigma_n^\alpha = \frac{1}{n^d} \int_{\mathbb{R}^d} |H_x|^\alpha\, dx = \frac{1}{n^d} \sum_{j \in \mathbb{Z}^d} \int_B |H_{x+j}|^\alpha\, dx = \frac{1}{n^d} \sum_{j \in \mathbb{Z}^d} J_j

where $J_j = 0$ unless $x + j + t \in L$ for some $x \in B$ and $t \in K_n$. Since the summands are bounded in probability (since, for example, the second moment exists), we claim that the above expression is $o_P(1) + \frac{1}{n^d} \sum_{k \in K_n} J_{-k}$. If $J_j \neq 0$, then there must exist some $x \in B$ and $t \in K_n$ such that $j + x + t \in L$. For any set A, define $A(l) = \{ y : d_\infty(y, A) \leq l \}$, where $d_\infty$ is the sup-norm metric on $\mathbb{R}^d$. Then $\sum_{j \in \mathbb{Z}^d} J_j = \sum_{j \in -K_n(l+1)} J_j$, since

j + x + t \in L \implies \max_{i=1,\ldots,d} |j_i + x_i + t_i| \leq l \implies \max_{i=1,\ldots,d} |j_i + t_i| \leq l + 1 \text{ for some } t \in K_n \implies \inf_{t \in K_n} \max_{i=1,\ldots,d} |j_i + t_i| \leq l + 1,

and hence for such a j

d_\infty(j, -K_n) = \inf_{k \in K_n} d_\infty(j, -k) = \inf_{k \in K_n} \max_{i=1,\ldots,d} |j_i + k_i| \leq l + 1,

which implies that $j \in -K_n(l+1)$. Now we partition into a disjoint union $-K_n(l+1) = -K_n \cup (-K_n(l+1) \setminus -K_n)$. It is simple to show that $K_n(l+1) = nK((l+1)/n)$, where $K = K_1$ is the prototype region. Hence $K_n(l+1) \setminus K_n = n( K((l+1)/n) \setminus K )$, and the count of integer lattice points in this set (equivalently, in its reflection) will be asymptotic (as $n \to \infty$) to $n^d \lambda( K((l+1)/n) \setminus K )$ by the definition of Lebesgue measure. However, the sequence of sets $K((l+1)/n) \setminus K$ is decreasing for l fixed, so by continuity from above,

\lambda\big( K((l+1)/n) \setminus K \big) \downarrow \lambda\Big( \bigcap_{n \geq 1} K((l+1)/n) \setminus K \Big) \leq \lambda(\partial K) = 0,

since the intersection is a subset of the boundary. This shows that asymptotically, the only terms that count in $\sum_j J_j$ are those with $j \in -K_n$. Now these $J_j$ variables are weakly dependent, since the Gaussian variables $H_x$ and $H_y$ are independent if $|x_i - y_i| > 2l$ for some i. Hence the law of large numbers holds, and

\sigma_n^\alpha = o_P(1) + \frac{1}{n^d} \sum_{k \in K_n} E[J_{-k}].

It is necessary to determine $E[J_{-k}]$; here we follow the argument given in the proof of Theorem 2 below, which is actually more complicated. Defining $\tilde{H}_x = \int_{\mathbb{R}^d} \psi(x+t)\, W(dt)$, one can show that the difference between $E|H_{x-k}|^\alpha$ and $E|\tilde{H}_{x-k}|^\alpha$ tends to zero. Now

E|\tilde{H}_{x-k}|^\alpha = E\left| \theta \int_{\mathbb{R}^d} \psi(t)\, dt + \sqrt{2\gamma}\, \sqrt{ \int_{\mathbb{R}^d} \psi^2(t)\, dt }\; G \right|^\alpha,

which no longer depends on x or k (here G denotes a standard normal random variable).
With similar results for the other parameters, the joint Fourier/Laplace transform satisfies

E\exp\left\{ -\sigma_n^\alpha \left( 1 - i\frac{\beta\gamma_n}{\sigma_n^\alpha} \right) + i\mu_n 1_{\{\alpha=1\}} \right\}
\longrightarrow \exp\left\{ -\lambda(K)\, \sigma^\alpha(\theta,\gamma) \left( 1 - i\frac{\beta(\theta,\gamma)}{\sigma^\alpha(\theta,\gamma)} \right) + i\lambda(K)\, \mu(\theta,\gamma)\, 1_{\{\alpha=1\}} \right\}

where

\sigma(\theta,\gamma) = \left( E\left| \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \right|^\alpha \right)^{1/\alpha}
\beta(\theta,\gamma) = \beta\, E\left( \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \right)^{\langle\alpha\rangle}
\mu(\theta,\gamma) = -\frac{2}{\pi}\beta\, E\left[ \left( \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \right) \log\left| \theta\tau + \sqrt{2\gamma}\,\sigma_2 G \right| \right]

with $\tau = \int_{\mathbb{R}^d} \psi(s)\, \lambda(ds)$ and $\sigma_2 = \left( \int_{\mathbb{R}^d} \psi^2(s)\, \lambda(ds) \right)^{1/2}$. This convergence follows from the convergence in probability of the parameters, because the exponential function in the transform is bounded. The existence of $S_\alpha(\psi)$ and $U_\alpha(\psi)$ follows from the continuity of the limiting Fourier/Laplace transform at (0, 0). Letting $\gamma = 0$, one recognizes the Fourier transform of an $\alpha$-stable random variable with parameters as described in Theorem 1, which is $S_\alpha(\psi)$; if $\theta = 0$, one obtains the Laplace transform of a positive $\alpha/2$-stable random variable $U_\alpha(\psi)$, whose scale parameter is calculated in the Theorem's statement.

We comment that it is sufficient to consider convergence of the joint Fourier/Laplace transform, by Fitzsimmons and McElroy (2006). Finally, we must remove the truncation of the model. This is similar to (and actually easier than) the argument presented in the proof of Theorem 2, and is not repeated here. Thus the proof is complete. □
Proof of Theorem 2 Note that, since N is a PRM, it follows from the Law of Large Numbers, independent scattering, spatial homogeneity, and shift invariance of $\mu$ that $N(K_n)/\mu(K_n) \stackrel{a.s.}{\longrightarrow} 1$ as n tends to infinity. Thus it suffices to examine the limit behavior of

\left( n^{-d/\alpha} \int_{K_n} X(t)\, N(dt),\; n^{-2d/\alpha} \int_{K_n} X^2(t)\, N(dt) \right)   (18)

since $\mu(K_n) = r n^d \lambda(K)$; at the end we must multiply our results by $r^{-1/\alpha}\lambda(K)^{-1/\alpha}$ (and correspondingly for the second component). Let us first consider a filter function $\psi$ with compact support in the set $L = \{ x \in \mathbb{R}^d : |x_i| \leq l\ \forall i \}$. Then we can write
E\exp\Big\{ i\theta n^{-d/\alpha} \int_{K_n} X(t)\, N(dt) - \gamma n^{-2d/\alpha} \int_{K_n} X^2(t)\, N(dt) \Big\}
= E\exp\Big\{ i\theta n^{-d/\alpha} \int_{K_n} X(t)\, N(dt) + i\sqrt{2\gamma}\, n^{-d/\alpha} \int_{K_n} X(t) G(t)\, N(dt) \Big\}
= E\exp\Big\{ i n^{-d/\alpha} \int_{\mathbb{R}^d} \Big( \theta \int_{K_n} \psi(x+t)\, N(dt) + \sqrt{2\gamma} \int_{K_n} \psi(x+t) G(t)\, N(dt) \Big)\, M(dx) \Big\}

by introducing a process of iid standard normal random variables $\{G(t)\}$ that are independent of M. Conditional on N and the G(t)'s, this is an $\alpha$-stable random variable with scale

\sigma_N = \left( \frac{1}{n^d} \int_{\mathbb{R}^d} \left| \theta \int_{K_n} \psi(x+t)\, N(dt) + \sqrt{2\gamma} \int_{K_n} \psi(x+t) G(t)\, N(dt) \right|^\alpha dx \right)^{1/\alpha},

skewness $\beta\gamma_N/\sigma_N^\alpha$, with

\gamma_N = \frac{1}{n^d} \int_{\mathbb{R}^d} \left( \theta \int_{K_n} \psi(x+t)\, N(dt) + \sqrt{2\gamma} \int_{K_n} \psi(x+t) G(t)\, N(dt) \right)^{\langle\alpha\rangle} dx,

and location $1_{\{\alpha=1\}}\mu_N$, with

\mu_N = -\frac{2}{\pi}\beta\, \frac{1}{n^d} \int_{\mathbb{R}^d} \left( \theta \int_{K_n} \psi(x+t)\, N(dt) + \sqrt{2\gamma} \int_{K_n} \psi(x+t) G(t)\, N(dt) \right) \log\left| \theta \int_{K_n} \psi(x+t)\, N(dt) + \sqrt{2\gamma} \int_{K_n} \psi(x+t) G(t)\, N(dt) \right| dx.

Hence, to determine convergence, we will establish the limits in probability of each of these parameters and thereby determine the joint Fourier/Laplace transform of the sample mean and sample variance. We provide an explicit proof of the convergence of $\sigma_N$; the proofs for the other two parameters are similar. Let us write

H_x = \theta \int_{K_n} \psi(x+s)\, N(ds) + \sqrt{2\gamma} \int_{K_n} \psi(x+s) G(s)\, N(ds)

so that $\sigma_N^\alpha = n^{-d} \int_{\mathbb{R}^d} |H_x|^\alpha\, dx$. Now $\{H_x\}$ is, conditional on N, a Gaussian process with mean $\theta\int_{K_n} \psi(x+s)\, N(ds)$ and covariance

\mathrm{Cov}_N[H_x, H_y] = E[(H_x - EH_x)(H_y - EH_y) \mid N] = 2\gamma \int_{K_n} \psi(x+s)\,\psi(y+s)\, N(ds).
Taking a second expectation shows that

\mathrm{Cov}[H_x, H_y] = E[(H_x - EH_x)(H_y - EH_y)] = 2\gamma \int_{K_n} \psi(x+s)\,\psi(y+s)\, ds,

which is zero if $|x_i - y_i| > 2l$ for at least one i between 1 and d. Hence, these variables are finitely dependent (taking the sup-norm on $\mathbb{R}^d$), which will be useful in establishing a weak law of large numbers. It is also true that all moments of $H_x$ exist (even as n increases):

|H_x| \leq |\theta| \int_{K_n} |\psi(x+s)|\, N(ds) + \sqrt{2\gamma} \left| \int_{K_n} \psi(x+s) G(s)\, N(ds) \right|
\stackrel{\mathcal{L}}{=} |\theta| \int_{K_n} |\psi(x+s)|\, N(ds) + \sqrt{2\gamma}\, \sqrt{ \int_{K_n} \psi^2(x+s)\, N(ds) }\; |G_x|
\leq \int_{\mathbb{R}^d} |\psi(x+s)|\, N(ds) \left( |\theta| + \sqrt{2\gamma}\, |G_x| \right)

where $G_x$ is a dependent sequence of standard normal random variables. The equality in distribution follows from the stability property of Gaussian random variables, and the fact that integration with respect to N is, conditional on N, a discrete sum. The final random variable is a Poisson integral with mean $\int_{\mathbb{R}^d} |\psi(s)|\, ds$, multiplied by the independent factor $|\theta| + \sqrt{2\gamma}\,|G_x|$, which has Gaussian tails; all moments therefore exist, even as $n \to \infty$.
Next, we write

\sigma_N^\alpha = \frac{1}{n^d} \int_{\mathbb{R}^d} |H_x|^\alpha\, dx = \frac{1}{n^d} \sum_{j \in \mathbb{Z}^d} \int_B |H_{x+j}|^\alpha\, dx

using the same notation as in the proof of Theorem 1. By the same arguments, up to terms going to zero in probability this expression is the same as

\frac{1}{n^d} \sum_{k \in K_n} \int_B |H_{x-k}|^\alpha\, dx.

Because the variables $J_k = \int_B |H_{x-k}|^\alpha\, dx$ are a random field in k with finite dependence, the weak law of large numbers applies. Hence

\sigma_N^\alpha = o_P(1) + \frac{1}{n^d} \sum_{k \in K_n} E[J_k]

as $n \to \infty$. It remains to compute the expectations. Let

\tilde{H}_x = \theta \int_{\mathbb{R}^d} \psi(x+s)\, N(ds) + \sqrt{2\gamma} \int_{\mathbb{R}^d} \psi(x+s) G(s)\, N(ds)

so that

E|\tilde{H}_{x-k}|^\alpha = E\left| \theta \int_{\mathbb{R}^d} \psi(y)\, N(dy) + \sqrt{2\gamma}\, \sqrt{ \int_{\mathbb{R}^d} \psi^2(y)\, N(dy) }\; \tilde{G}_{x-k} \right|^\alpha

where $\tilde{G}_x$ is a mean-zero Gaussian sequence with known correlation structure:

E[\tilde{G}_x \tilde{G}_{x+h}] = \frac{ \int_{\mathbb{R}^d} \psi(y)\,\psi(y+h)\, dy }{ \int_{\mathbb{R}^d} \psi^2(y)\, dy }.
We claim that the average $\frac{1}{n^d}\sum_{k \in K_n} E[J_k]$ is asymptotically $E|\tilde{H}_0|^\alpha$. The absolute difference is

\left| \frac{1}{n^d} \sum_{k \in K_n} E[J_k] - \frac{1}{n^d} \sum_{k \in K_n} E\int_B |\tilde{H}_{x-k}|^\alpha\, dx \right|   (19)
\leq \frac{1}{n^d} \sum_{k \in K_n} \int_B E\left|\, |H_{x-k}|^\alpha - |\tilde{H}_{x-k}|^\alpha \,\right| dx.
We have the following inequality, stated as a separate lemma:

Lemma 1 If $0 < \alpha \leq 1$,

\left|\, |a|^\alpha - |b|^\alpha \,\right| \leq |a - b|^\alpha,

and if $1 < \alpha \leq 2$,

\left|\, |a|^\alpha - |b|^\alpha \,\right| \leq |a - b|^\alpha + 2\max\{|a|, |b|\}^{\alpha/2}\, |a - b|^{\alpha/2}

for all real numbers a and b.

Proof of Lemma 1. The case $\alpha \leq 1$ is well known. So suppose that $\alpha > 1$. If $|a| > |b|$, then

|a - b|^\alpha = \left( |a - b|^{\alpha/2} \right)^2 \geq \left( |a|^{\alpha/2} - |b|^{\alpha/2} \right)^2
= \left( |a|^{\alpha/2} - |b|^{\alpha/2} \right)\left( |a|^{\alpha/2} + |b|^{\alpha/2} - 2|b|^{\alpha/2} \right)
= \left( |a|^\alpha - |b|^\alpha \right) - 2|b|^{\alpha/2}\left( |a|^{\alpha/2} - |b|^{\alpha/2} \right),

where the second line follows from $\alpha \leq 2$. This in turn implies that

|a|^\alpha - |b|^\alpha \leq |a - b|^\alpha + 2|b|^{\alpha/2}\left( |a|^{\alpha/2} - |b|^{\alpha/2} \right) \leq |a - b|^\alpha + 2\max\{ |a|^{\alpha/2}, |b|^{\alpha/2} \}\, |a - b|^{\alpha/2}.

Now the case that $|b| > |a|$ is similar, which proves the Lemma. □
Using this lemma, it suffices to examine the average expected integral of

|H_{x-k} - \tilde{H}_{x-k}|^\nu = \left| \theta \int_{K_n^c} \psi(x-k+s)\, N(ds) + \sqrt{2\gamma} \int_{K_n^c} \psi(x-k+s) G(s)\, N(ds) \right|^\nu

where $\nu$ is either $\alpha$ or $\alpha/2$. If $\nu = \alpha/2$, we have

\frac{1}{n^d} \sum_{k \in K_n} \int_B E|H_{x-k} - \tilde{H}_{x-k}|^{\alpha/2}\, dx \leq 2^{\alpha/2}\, \frac{1}{n^d} \sum_{k \in K_n} \int_B \int_{K_n^c} |\psi(x-k+s)|^{\alpha/2}\, ds\, dx \left( |\theta|^{\alpha/2} + (2\gamma)^{\alpha/4} \right)

using the fact that $E\,N(ds) = ds$. If $\nu = \alpha$, using the result that

E\left| \theta \int_{K_n^c} \psi(x-k+s)\, N(ds) \right|^\alpha + E\left| \sqrt{2\gamma} \int_{K_n^c} \psi(x-k+s) G(s)\, N(ds) \right|^\alpha
\leq |\theta|^\alpha\, E\left( \int_{K_n^c} |\psi(x-k+s)|^{\alpha/2}\, N(ds) \right)^2 + (2\gamma)^{\alpha/2}\, E\left( \int_{K_n^c} |\psi(x-k+s)|^{\alpha/2}\, |G(s)|^{\alpha/2}\, N(ds) \right)^2
\leq \left( |\theta|^\alpha + (2\gamma)^{\alpha/2} E|G|^\alpha \right) \left[ \int_{K_n^c} |\psi(x-k+s)|^\alpha\, ds + \left( \int_{K_n^c} |\psi(x-k+s)|^{\alpha/2}\, ds \right)^2 \right],

we easily obtain

\frac{1}{n^d} \sum_{k \in K_n} \int_B E|H_{x-k} - \tilde{H}_{x-k}|^\alpha\, dx \leq \frac{1}{n^d} \sum_{k \in K_n} \int_B \int_{K_n^c} |\psi(x-k+s)|^\alpha\, ds\, dx \left( |\theta|^\alpha + (2\gamma)^{\alpha/2} E|G|^\alpha \right)
+ \frac{1}{n^d} \sum_{k \in K_n} \int_B \left( \int_{K_n^c} |\psi(x-k+s)|^{\alpha/2}\, ds \right)^2 dx \left( |\theta|^\alpha + (2\gamma)^{\alpha/2} E|G|^\alpha \right).

So in order to show that (19) tends to zero as $n \to \infty$, it is enough to demonstrate that

\frac{1}{n^d} \sum_{k \in K_n} \int_B \left( \int_{K_n^c} |\psi(x-k+s)|^\nu\, ds \right)^\kappa dx

tends to zero, for $\nu$ equal to either $\alpha$ or $\alpha/2$, and $\kappa$ equal to one or two. Note that, through a simple change of variable, this becomes

\frac{1}{n^d} \int_{K_n} \left( \int_{K_n^c} |\psi(s-x)|^\nu\, ds \right)^\kappa dx.

It is a simple but tedious analysis exercise to show that this tends to zero as $n \to \infty$, for $\nu = \alpha$ or $\alpha/2$ and $\kappa$ equal to one or two. The end result of this analysis is that

\sigma_N^\alpha \stackrel{P}{\longrightarrow} \lambda(K)\, E\left| \theta \int_{\mathbb{R}^d} \psi(y)\, N(dy) + \sqrt{2\gamma}\, \sqrt{ \int_{\mathbb{R}^d} \psi^2(y)\, N(dy) }\; Z \right|^\alpha = \lambda(K)\, \sigma^\alpha(\theta,\gamma)

where Z is a standard normal random variable. Note that this limiting scale parameter is not random. Using similar techniques, one can show that $\beta\gamma_N$ converges in probability to

\lambda(K)\, \beta\, E\left( \theta \int_{\mathbb{R}^d} \psi(y)\, N(dy) + \sqrt{2\gamma}\, \sqrt{ \int_{\mathbb{R}^d} \psi^2(y)\, N(dy) }\; Z \right)^{\langle\alpha\rangle} = \lambda(K)\, \beta(\theta,\gamma),

and $\mu_N$ tends to

-\lambda(K)\, \frac{2}{\pi}\beta\, E\left[ \left( \theta \int_{\mathbb{R}^d} \psi(y)\, N(dy) + \sqrt{2\gamma}\, \sqrt{ \int_{\mathbb{R}^d} \psi^2(y)\, N(dy) }\; Z \right) \log\left| \theta \int_{\mathbb{R}^d} \psi(y)\, N(dy) + \sqrt{2\gamma}\, \sqrt{ \int_{\mathbb{R}^d} \psi^2(y)\, N(dy) }\; Z \right| \right],

which is $\lambda(K)\, \mu(\theta,\gamma)$. Now multiplying our results by $r^{-1/\alpha}\lambda(K)^{-1/\alpha}$, our joint Fourier/Laplace transform is

E\exp\left\{ -r^{-1}\lambda(K)^{-1}\, \sigma_N^\alpha \left( 1 - i\frac{\beta\gamma_N}{\sigma_N^\alpha} \right) + i\, r^{-1}\lambda(K)^{-1}\, \mu_N\, 1_{\{\alpha=1\}} \right\}   (20)
\longrightarrow \exp\left\{ -r^{-1}\, \sigma^\alpha(\theta,\gamma) \left( 1 - i\frac{\beta(\theta,\gamma)}{\sigma^\alpha(\theta,\gamma)} \right) + i\, r^{-1}\, \mu(\theta,\gamma)\, 1_{\{\alpha=1\}} \right\}

by the dominated convergence theorem. The existence of $\tilde{S}_\alpha(\psi)$ and $\tilde{U}_\alpha(\psi)$ now follows from the continuity of the limiting Fourier/Laplace transform at (0, 0). Letting $\gamma = 0$, one recognizes the Fourier transform of an $\alpha$-stable random variable with parameters as described in Theorem 2, which is $\tilde{S}_\alpha(\psi)$; if $\theta = 0$, one obtains the Laplace transform of a positive $\alpha/2$-stable random variable $\tilde{U}_\alpha(\psi)$, whose scale parameter is calculated in the Theorem's statement.
Finally, we must remove the truncation of the model. Define

X_l(t) = \int_{\mathbb{R}^d} \psi_l(x+t)\, M(dx) = \int_{\mathbb{R}^d} \psi(x+t)\, 1_{lD}(x+t)\, M(dx)

where $D = (-1, 1]^d$, so that lD is a d-dimensional cube centered at the origin with width 2l. Then

n^{-d/\alpha} \int_{K_n} X(t)\, N(dt) - n^{-d/\alpha} \int_{K_n} X_l(t)\, N(dt) = n^{-d/\alpha} \int_{\mathbb{R}^d} \int_{K_n} \psi(x+t)\, 1_{\{lD\}^c}(x+t)\, N(dt)\, M(dx)

must tend to zero in probability as $l \to \infty$ for any fixed n. Taking the $\alpha$-th power of the scale of the above random variable, conditional on N, we obtain

\frac{1}{n^d} \int_{\mathbb{R}^d} \left| \int_{K_n} \psi(x+t)\, 1_{\{lD\}^c}(x+t)\, N(dt) \right|^\alpha \lambda(dx).

If $\alpha < 1$, this is bounded by

\frac{1}{n^d} \int_{\mathbb{R}^d} \int_{K_n} |\psi(x+t)|^\alpha\, 1_{\{lD\}^c}(x+t)\, N(dt)\, \lambda(dx) = \frac{1}{n^d} \int_{K_n} \int_{\{lD\}^c - t} |\psi(x+t)|^\alpha\, \lambda(dx)\, N(dt)
= \int_{\{lD\}^c} |\psi(y)|^\alpha\, \lambda(dy)\, \frac{N(K_n)}{n^d}.

Thus, the limit superior as $n \to \infty$ of

\frac{1}{n^d} \int_{\mathbb{R}^d} \left| \int_{K_n} \psi(x+t)\, 1_{\{lD\}^c}(x+t)\, N(dt) \right|^\alpha \lambda(dx)

tends to zero as $l \to \infty$. When $\alpha > 1$, we use the bound

\frac{1}{n^d} \int_{\mathbb{R}^d} \left| \int_{K_n} \psi(x+t)\, 1_{\{lD\}^c}(x+t)\, N(dt) \right|^\alpha \lambda(dx)
\leq \sup_x \left| \int_{K_n} \psi(x+t)\, 1_{\{lD\}^c}(x+t)\, N(dt) \right| \cdot \frac{1}{n^d} \int_{\mathbb{R}^d} \left| \int_{K_n} \psi(x+t)\, 1_{\{lD\}^c}(x+t)\, N(dt) \right|^{\alpha-1} \lambda(dx)
\leq \int_{\mathbb{R}^d} |\psi(t)|\, N(dt) \cdot \frac{1}{n^d} \int_{\mathbb{R}^d} \int_{K_n} |\psi(x+t)|^{\alpha-1}\, 1_{\{lD\}^c}(x+t)\, N(dt)\, \lambda(dx) = Q \int_{\{lD\}^c} |\psi(y)|^{\alpha-1}\, dy,

where $Q = \int_{\mathbb{R}^d} |\psi(t)|\, N(dt) \cdot N(K_n)/n^d$ is a positive random variable that is bounded in probability in n. So the limit superior in this case also tends to zero as $l \to \infty$. Similar arguments can be applied to the sum of squares, and since the limiting joint Fourier/Laplace transform is continuous in l, we can take the limit as $l \to \infty$ on both sides of our joint weak convergence (20). This completes the proof. □
Proof of Theorem 3 This proof follows the same structure as Theorem 6.3.1 of PRW (1999). First we show that $\tau_{N(B+y)}$ is almost surely asymptotic to $\tau_{\mu(B+y)} = \tau_{\mu(B)}$. As in the proof of Theorem 2, $N(B+y)/\mu(B+y) \stackrel{a.s.}{\longrightarrow} 1$ as $n \to \infty$; this uses the condition that $c_n \delta(K_n) \to \infty$. It follows from the form of $\tau(u)$ that $\tau_{N(B+y)}/\tau_{\mu(B+y)} \stackrel{a.s.}{\longrightarrow} 1$. Let x be a continuity point of J(x), the cdf of the limit random variable J. Then

\left\{ \tau_{N(B+y)}\, \frac{ \hat{\mu}_{K_n,B,y} - \hat{\mu}_{K_n} }{ \hat{\sigma}_{K_n,B,y} } \leq x \right\} = \left\{ \tau_{N(B+y)}\, \frac{ \hat{\mu}_{K_n,B,y} - \mu }{ \hat{\sigma}_{K_n,B,y} } \leq x + \tau_{N(B+y)}\, \frac{ \hat{\mu}_{K_n} - \mu }{ \hat{\sigma}_{K_n,B,y} } \right\}.

For any t > 0, let

R_{K_n,B}(t) = \lambda(K_n(1-c))^{-1} \int_{K_n(1-c)} 1_{\left\{ \tau_{N(B+y)}\, \frac{ |\hat{\mu}_{K_n} - \mu| }{ \hat{\sigma}_{K_n,B,y} } \leq t \right\}}\, dy
= \lambda(K_n(1-c))^{-1} \int_{K_n(1-c)} 1_{\left\{ d_{N(B+y)}\, \hat{\sigma}_{K_n,B,y} \geq a_{N(B+y)}\, |\hat{\mu}_{K_n} - \mu| / t \right\}}\, dy.

Now for all $\epsilon > 0$, $a_{N(B+y)}\, |\hat{\mu}_{K_n} - \mu| \leq \epsilon$ with probability tending to one, since $a_{N(B+y)}/a_{N(K_n)} \stackrel{P}{\to} 0$ follows from $c_n \to 0$. So with probability tending to one,

R_{K_n,B}(t) \geq \lambda(K_n(1-c))^{-1} \int_{K_n(1-c)} 1_{\left\{ d_{N(B+y)}\, \hat{\sigma}_{K_n,B,y} \geq \epsilon/t \right\}}\, dy.

If $\epsilon/t$ is a continuity point of W, then the above expression tends to $P[W \geq \epsilon/t]$ by Theorem 6.3.1 of PRW (1999). Since W has no point mass at zero, we can make $R_{K_n,B}(t)$ arbitrarily close to 1 by choosing $\epsilon$ small enough. So for all t, $R_{K_n,B}(t) \stackrel{P}{\to} 1$. Next, using the inequality

1_{\left\{ \tau_{N(B+y)} \frac{ \hat{\mu}_{K_n,B,y} - \mu }{ \hat{\sigma}_{K_n,B,y} } \leq x + \tau_{N(B+y)} \frac{ \hat{\mu}_{K_n} - \mu }{ \hat{\sigma}_{K_n,B,y} } \right\}} \leq 1_{\left\{ \tau_{N(B+y)} \frac{ \hat{\mu}_{K_n,B,y} - \mu }{ \hat{\sigma}_{K_n,B,y} } \leq x + t \right\}} + 1_{\left\{ \tau_{N(B+y)} \frac{ |\hat{\mu}_{K_n} - \mu| }{ \hat{\sigma}_{K_n,B,y} } > t \right\}}

we can establish

L_{K_n,B}(x) \leq \lambda(K_n(1-c))^{-1} \int_{K_n(1-c)} 1_{\left\{ \tau_{N(B+y)} \frac{ \hat{\mu}_{K_n,B,y} - \mu }{ \hat{\sigma}_{K_n,B,y} } \leq x + t \right\}}\, dy + \left( 1 - R_{K_n,B}(t) \right).

Now by the first convergence in (16), we may apply Theorem 6.3.1 of PRW (1999) to

\tau_{N(B+y)}\, \frac{ \hat{\mu}_{K_n,B,y} - \mu }{ \hat{\sigma}_{K_n,B,y} }

and obtain, for any $\epsilon > 0$, $L_{K_n,B}(x) \leq J(x+t) + \epsilon$ with probability tending to one. At this point let t tend to zero. Similar arguments produce the opposite inequality $L_{K_n,B}(x) \geq J(x-t) - \epsilon$. Now letting $\epsilon \to 0$, we obtain $L_{K_n,B}(x) \stackrel{P}{\to} J(x)$, as desired. The proofs of (ii) and (iii) are similar to the proof of Theorem 2.2.1 in PRW (1999). □
References

[1] Bertail, P., Politis, D., and Romano, J. (1999). On subsampling estimators with unknown rate of convergence. Journal of the American Statistical Association 94, No. 446, 569-579.

[2] Fitzsimmons, P., and McElroy, T. (2006). On joint Fourier-Laplace transforms. Mimeo, http://www.math.ucsd.edu/~politis/PAPER/FL.pdf.

[3] Karr, A.F. (1982). A partially observed Poisson process. Stochastic Processes and their Applications 12, 249-269.

[4] Karr, A.F. (1986). Inference for stationary random fields given Poisson samples. Advances in Applied Probability 18, 406-422.

[5] McElroy, T., and Politis, D. (2002). Robust inference for the mean in the presence of serial correlation and heavy-tailed distributions. Econometric Theory 18, 1019-1039.

[6] McElroy, T., and Politis, D. (2004). Large sample theory for statistics of stable moving averages. Journal of Nonparametric Statistics 16, 623-657.

[7] Nordman, D., and Lahiri, S. (2003). On optimal variance estimation under different spatial subsampling schemes. In Recent Advances and Trends in Nonparametric Statistics. Elsevier, San Diego, CA, 421-436.

[8] Politis, D., Romano, J., and Wolf, M. (1999). Subsampling. Springer, New York.

[9] Resnick, S., Samorodnitsky, G., and Xue, F. (1999). How misleading can sample ACFs of stable MAs be? (Very!). Annals of Applied Probability 9, No. 3, 797-817.

[10] Rosenblatt, M. (1956). A central limit theorem and a strong mixing condition. Proceedings of the National Academy of Sciences 42, 43-47.

[11] Samorodnitsky, G., and Taqqu, M. (1994). Stable Non-Gaussian Random Processes. Chapman and Hall.