1 Introduction
The estimation of population quantiles is of great interest when a parametric form
for the underlying distribution is not available. In addition, quantiles often arise
as the natural thing to estimate when the underlying distribution is skewed. Let
X1 , X2 , · · · , Xn be an independent and identically distributed random sample drawn
from an absolutely continuous distribution function F with density f . Let X(1) ≤
X(2) ≤ · · · ≤ X(n) denote the corresponding order statistics. The quantile function Q
of the population is defined as Q(p) = inf{x : F (x) ≥ p}, 0 < p < 1. Note that Q is
the left-continuous inverse of F . Denote, for each 0 < p < 1, the pth quantile of F
by ξp , that is, ξp = Q(p).
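As a small numerical illustration of this definition, the left-continuous inverse $Q(p) = \inf\{x : F(x) \ge p\}$ can be evaluated for a given continuous $F$ by bisection. The sketch below is ours (the function name and bracketing interval are illustrative choices, not from the text):

```python
import math

def quantile_from_cdf(F, p, lo, hi, tol=1e-10):
    """Evaluate Q(p) = inf{x : F(x) >= p} by bisection.

    Assumes F is continuous and nondecreasing on [lo, hi],
    with F(lo) < p <= F(hi).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) >= p:
            hi = mid   # mid still satisfies F(mid) >= p; shrink from the right
        else:
            lo = mid
    return hi

# Example: for the exponential distribution F(x) = 1 - exp(-x),
# the quantile function is Q(p) = -log(1 - p).
q = quantile_from_cdf(lambda x: 1.0 - math.exp(-x), 0.5, 0.0, 50.0)
```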
A traditional nonparametric estimator of the distribution function is the empirical distribution function
$$F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I_{(-\infty,x]}(X_i),$$
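The empirical distribution function is straightforward to compute; a minimal sketch (the function name is ours):

```python
import numpy as np

def ecdf(sample, x):
    """F_n(x) = (1/n) * #{i : X_i <= x}, the empirical distribution function."""
    sample = np.asarray(sample, dtype=float)
    return float(np.mean(sample <= x))

val = ecdf([1.0, 2.0, 3.0, 4.0], 2.5)
```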
The empirical $p$-th quantile is $Q_n(p) = X_{([np]+1)}$, where $[np]$ denotes the integer part of $np$. Let $p_r = r/(n+1)$ and $q_r = 1 - p_r$. If we use $X_{(r)}$ to estimate the $p_r$-th quantile, then the asymptotic bias and variance can be obtained from classical order-statistic theory.
Nadaraya (1964) showed that, under some assumptions on $k$, $f$ and $h$, $\hat Q_n(p)$ (appropriately normalized) has an asymptotic standard normal distribution. Another notable property of $\hat Q_n(p)$, namely almost sure consistency, was obtained by Yamato
(1973). Ralescu and Sun (1993) obtained necessary and sufficient conditions for the asymptotic normality of $\hat Q_n(p)$. Azzalini (1981) and an unpublished report used heuristic arguments based on second-order approximations and performed numerical comparisons of $\hat Q_n(p)$ with the classical sample quantile for estimating the 0.95 quantile of the Gamma(1) distribution. These studies provided considerable empirical evidence for the superiority of $\hat Q_n(p)$ for a variety of smooth distribution functions.
Azzalini (1981) considered the second-order properties of $\hat F_n$ under the following assumptions: (i) $h \to 0$ as $n \to \infty$; (ii) the kernel has finite support, that is, $k(t) = 0$ if $|t| > t_0$ for some positive $t_0$; (iii) the density $f$ is continuous in the interval $(x - t_0 h, x + t_0 h)$; and (iv) $f'(x)$ exists. He pointed out that the asymptotically optimal bandwidth for $\hat F$ is of the form
$$h_{\mathrm{opt}} = \left(\frac{u}{4vn}\right)^{1/3} \qquad (2)$$
where
$$u = f(x)\Big\{t_0 - \int_{-t_0}^{t_0} K^2(t)\,dt\Big\}, \qquad v = \Big\{\tfrac{1}{2}\, f'(x) \int_{-t_0}^{t_0} t^2 k(t)\,dt\Big\}^2.$$
Also, Azzalini (1981) suggested, without offering a proof, that (2) is again the asymptotically optimal choice of $h$ for $\hat Q_n(p)$. We state the result in the following theorem; its proof can be found in Shankar (1998).
We make the following assumptions:
Assumption A
Theorem 1. Under assumptions (1)-(3), the asymptotic mean squared error of $\hat Q(p)$ is
$$\mathrm{AMSE}\,\hat Q(p) = \frac{p(1-p)}{n f(\xi_p)^2} + \frac{h^4}{4}\,\frac{f'(\xi_p)^2}{f(\xi_p)^2}\,\mu_2(k)^2 - \frac{h}{n}\,\frac{\psi(k)}{f(\xi_p)},$$
and the asymptotically optimal choice of bandwidth for the smoothed empirical quantile function $\hat Q_n(p)$ is
$$h_{\mathrm{opt},1} = \left\{\frac{f(\xi_p)\,\psi(k)}{n\,\{f'(\xi_p)\}^2\,\mu_2(k)^2}\right\}^{1/3}. \qquad (3)$$
where $\mu_2(k) = \int_{-\infty}^{\infty} t^2 k(t)\,dt$ and $\psi(k) = 2\int y\,k(y)K(y)\,dy$. If we take $k$ as the standard normal density, then $\int_{-\infty}^{\infty} t\,dK^2(t) = 1/\sqrt{\pi}$, $\mu_2(k) = 1$ and
$$h_{\mathrm{opt},1} = \left[\frac{f(\xi_p)}{\sqrt{\pi}\,n\,\{f'(\xi_p)\}^2}\right]^{1/3}.$$
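For the normal-kernel case, $h_{\mathrm{opt},1}$ is a one-line computation once plug-in values for $f(\xi_p)$ and $f'(\xi_p)$ are available. The sketch below (function name ours) simply evaluates the displayed formula:

```python
import math

def h_opt1(f_val, fprime_val, n):
    """h_opt,1 = [ f(xi_p) / (sqrt(pi) * n * f'(xi_p)^2) ]^(1/3), normal kernel."""
    return (f_val / (math.sqrt(math.pi) * n * fprime_val ** 2)) ** (1.0 / 3.0)

# Example with the standard normal f at xi_p = 1, where f'(x) = -x f(x):
f1 = math.exp(-0.5) / math.sqrt(2.0 * math.pi)
h = h_opt1(f1, -1.0 * f1, n=100)
```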
It is clear that when i/n is close to p, Q̃n (p) puts more weight on the order statistics
X(i) . The asymptotic normality and mean squared consistency of Q̃n (p) were pro-
vided by Yang (1985), while Falk (1984) showed that the asymptotic performance of
Q̃n (p) is better than that of the empirical sample quantile Qn (p) in the sense of rel-
ative deficiency for appropriately chosen kernels and sufficiently smooth distribution
functions.
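The definition of $\tilde Q_n(p)$ appears earlier in the paper and is not reproduced in this excerpt. The sketch below assumes the standard kernel quantile form $\tilde Q_n(p) = \sum_i X_{(i)} \int_{(i-1)/n}^{i/n} h^{-1} k((t-p)/h)\,dt$ studied by Falk (1984) and Sheather and Marron (1990); with a Gaussian kernel the weights reduce to differences of normal CDF values:

```python
import math

def _Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_quantile(sample, p, h):
    """Kernel quantile estimate: order statistics weighted by the integral
    of the rescaled Gaussian kernel over ((i-1)/n, i/n]."""
    x = sorted(sample)
    n = len(x)
    est = 0.0
    for i in range(1, n + 1):
        w = _Phi((i / n - p) / h) - _Phi(((i - 1) / n - p) / h)
        est += w * x[i - 1]
    return est

# On an evenly spread sample, the estimate at p = 0.5 should sit near the median.
sample = [i / 100.0 for i in range(101)]   # values 0.00, 0.01, ..., 1.00
est = kernel_quantile(sample, p=0.5, h=0.05)
```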
Building on Falk (1984), Sheather and Marron (1990) gave the asymptotic mean squared error (AMSE) of $\tilde Q_n(p)$ as follows if $f$ is not symmetric, or $f$ is symmetric but $p \neq 0.5$:
$$\mathrm{AMSE}\,\tilde Q_n(p) = \frac{p(1-p)}{n}\,q^2(p) + \frac{h^4}{4}\,q'(p)^2\,\mu_2(k)^2 - \frac{h}{n}\,q^2(p)\,\psi(k) \qquad (5)$$
where $q = Q'$ and $q' = Q''$. If $Q''(p) \neq 0$, then
$$h_{\mathrm{opt},2} = \left\{\frac{Q'(p)^2\,\psi(k)}{n\,Q''(p)^2\,\mu_2(k)^2}\right\}^{1/3}. \qquad (6)$$
If $Q''(p) = 0$, there is no single optimal bandwidth minimizing the AMSE.
Remark 2.2. If $q = 0$, we need higher-order terms. The AMSE of $\tilde Q_n(p)$ can be shown to be
$$\mathrm{AMSE}\,\tilde Q_n(p) = \Big(\frac{1}{4} - \frac{1}{n}\Big) h^4\, Q''(q)^2\, \mu_2^2(k) + 2 n^{-1} h^2\, Q''(q)^2 \int (q - ht)\, t\, k(t)\, j(t)\, dt$$
where $j(t) = \int_{-\infty}^{t} x k(x)\,dx$. The proof is provided in the Appendix.
where $g_n$ is the smoothing parameter (Wand and Jones, 1995). Therefore, the asymptotic mean squared error properties of $\hat f^{(r)}_{g_n}(x)$ can be derived straightforwardly to obtain (Wand and Jones, 1995)
$$\mathrm{AMSE}\{\hat f^{(r)}_{g_n}(x)\} = \frac{1}{n g_n^{2r+1}}\, R(k^{(r)})\, f(x) + \frac{g_n^4}{4}\, \{\mu_2(k)\}^2\, f^{(r+2)}(x)^2 \qquad (9)$$
where $R(\eta) = \int \eta^2(x)\,dx$ for any square-integrable function $\eta$. It follows that the AMSE-optimal bandwidth for estimating $f^{(r)}(x)$ is of order $n^{-1/(2r+5)}$. The asymptotically optimal bandwidth for $\hat f_{g_n}(x)$ is given by
$$g_n^{*} = \left\{\frac{R(k)\, f(x)}{n\, (\mu_2(k))^2\, f''(x)^2}\right\}^{1/5}, \qquad (10)$$
and the asymptotically optimal bandwidth for $\hat f'_{g_n}(x)$ is
$$g_n^{**} = \left\{\frac{3 R(k')\, f(x)}{n\, (\mu_2(k))^2\, f'''(x)^2}\right\}^{1/7}. \qquad (11)$$
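Equations (10) and (11) are direct to evaluate for plug-in values. A sketch (function names ours), using the normal-kernel constants $R(k) = 1/(2\sqrt{\pi})$, $R(k') = 1/(4\sqrt{\pi})$ and $\mu_2(k) = 1$ computed in Section 5:

```python
import math

SQRT_PI = math.sqrt(math.pi)

def g_star(f_val, f2_val, n, Rk=1.0 / (2.0 * SQRT_PI), mu2=1.0):
    """Equation (10): AMSE-optimal bandwidth for estimating f(x)."""
    return (Rk * f_val / (n * mu2 ** 2 * f2_val ** 2)) ** 0.2

def g_2star(f_val, f3_val, n, Rkp=1.0 / (4.0 * SQRT_PI), mu2=1.0):
    """Equation (11): AMSE-optimal bandwidth for estimating f'(x)."""
    return (3.0 * Rkp * f_val / (n * mu2 ** 2 * f3_val ** 2)) ** (1.0 / 7.0)
```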
$$\mathrm{AMSE}\{\tilde q(p)\} = \frac{a^4}{4}\, q''(p)^2\, \mu_2(k)^2 + \frac{1}{na}\, q^2(p) \int k^2(y)\,dy. \qquad (13)$$
Minimizing (13) with respect to $a$, we obtain the asymptotically optimal bandwidth for $\tilde q(p)$ as
$$a^{*}_{\mathrm{opt}} = \left\{\frac{Q'(p)^2 \int k^2(y)\,dy}{n\, Q'''(p)^2\, \mu_2(k)^2}\right\}^{1/5}. \qquad (14)$$
To estimate $Q'' = q'$ in (6), note that
$$\tilde Q''_n(p) = \frac{d}{dp}\tilde Q'_n(p) = \frac{1}{na^2} \sum_{i=1}^{n} X_{(i)} \Big\{ k'\Big(\frac{p - \frac{i-1}{n}}{a}\Big) - k'\Big(\frac{p - \frac{i}{n}}{a}\Big) \Big\}. \qquad (15)$$
4 Bandwidth selection
In this section, we consider several data-based methods to find the asymptotically
optimal bandwidths for the estimators $\hat Q_n(p)$ and $\tilde Q_n(p)$. Bandwidth plays a critical role in practical implementation: it determines the trade-off between the amount of smoothing and the closeness of the estimate to the true distribution (see Wand and Jones, 1995).
is the empirical $p$-th quantile $Q_n(p)$. Using $g_n^{*}$ in (10) with $f(\hat\xi_p)$ and $f''(\hat\xi_p)$ replaced by their Normal$(\mu, \sigma^2)$ reference values, we obtain $\hat f_{g_n^{*}}(x)$. Using $g_n^{**}$ in (11) with $f(\hat\xi_p)$ and $f'''(\hat\xi_p)$ replaced by their Normal$(\mu, \sigma^2)$ reference values, we obtain $\hat f'_{g_n^{**}}(x)$.
Remark 4.1. In the expression for $h_{\mathrm{opt},1}$, the derivative of $f$ appears in the denominator. If $f'$ has zeros, then estimates of $f'$ near those zeros are also very small, so the estimator $\hat h_{\mathrm{opt},1}$ of $h_{\mathrm{opt},1}$ is very unstable there. For example, if $f$ is standard normal, then $f'(x) = -x f(x)$ has a zero at $x = 0$, which corresponds to $p = 0.5$; hence, when $p = 0.5$, the estimator $\hat h_{\mathrm{opt},1}$ is very unstable. Similarly, the first derivative of the double exponential density has a zero at $x = 0$, and the first derivative of the lognormal density has a zero at $x = e^{-1}$.
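The lognormal claim can be checked numerically: since $f'(x) = -f(x)(1 + \log x)/x$ for the lognormal density (derived in Section 5, Case 3), the derivative vanishes exactly where $\log x = -1$. A finite-difference check (step size is our choice):

```python
import math

def f_lognormal(x):
    """Lognormal(0, 1) density."""
    return math.exp(-0.5 * math.log(x) ** 2) / (math.sqrt(2.0 * math.pi) * x)

def fprime_fd(x, eps=1e-6):
    """Central finite-difference approximation to f'(x)."""
    return (f_lognormal(x + eps) - f_lognormal(x - eps)) / (2.0 * eps)

slope_at_zero = fprime_fd(math.exp(-1.0))   # should be ~ 0
slope_elsewhere = fprime_fd(2.0)            # should be clearly negative
```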
4.2 Method 2. Approximate hopt,2 for Q̃n (p) using quantile
derivative estimators
The asymptotically optimal bandwidth $h_{\mathrm{opt},2}$, given in (6), for $\tilde Q_n(p)$ involves the unknown quantities $Q'(p)$ and $Q''(p)$, which can be estimated by $\tilde Q'_n(p)$ and $\tilde Q''_n(p)$ in (12) and (15), respectively. The asymptotically optimal bandwidths $a^{*}_{\mathrm{opt}}$ and $a^{**}_{\mathrm{opt}}$, given in (14) and (16), for $\tilde Q'_n(p)$ and $\tilde Q''_n(p)$ depend on $Q'(p)$, $Q'''(p)$ and $Q^{(4)}(p)$. We replace these unknowns by their Normal$(\mu, \sigma^2)$ reference values. Then, using $\tilde Q'_n(p)$ with $a = a^{*}_{\mathrm{opt}}$ and $\tilde Q''_n(p)$ with $a = a^{**}_{\mathrm{opt}}$, we have the following data-based bandwidth
$$\hat h_{\mathrm{opt},2} = \left\{\frac{\tilde Q'_n(p)^2\, \psi(k)}{n\, \tilde Q''_n(p)^2\, \mu_2(k)^2}\right\}^{1/3} \qquad (19)$$
for Q̃n (p).
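Computationally, (19) is a one-liner once the two derivative estimates are in hand; with the normal kernel ($\psi(k) = 1/\sqrt{\pi}$, $\mu_2(k) = 1$, from Section 5) it reads (function name ours):

```python
import math

def h_opt2_hat(q1, q2, n):
    """Equation (19) with a Gaussian kernel:
    h = { q1^2 * (1/sqrt(pi)) / (n * q2^2) }^(1/3),
    where q1 and q2 estimate Q'(p) and Q''(p)."""
    psi = 1.0 / math.sqrt(math.pi)
    return (q1 ** 2 * psi / (n * q2 ** 2)) ** (1.0 / 3.0)

h = h_opt2_hat(q1=2.0, q2=4.0, n=200)
```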
reference values, and then use $\tilde Q'_n(p)$ with $a = a^{*}_{\mathrm{opt}}$ and $\tilde Q''_n(p)$ with $a = a^{**}_{\mathrm{opt}}$ to get
$$\bar h_{\mathrm{opt},2} = \left\{\frac{\tilde Q'_n(p)^5\, \psi(k)}{n\, \tilde Q''_n(p)^2\, \mu_2(k)^2}\right\}^{1/3}. \qquad (22)$$
If we take $k$ as the standard normal density, then
$$\bar h_{\mathrm{opt},2} = \left\{\frac{\tilde Q'_n(p)^5}{\sqrt{\pi}\, n\, \tilde Q''_n(p)^2}\right\}^{1/3}.$$
4.4 Method 4. Approximate hopt,2 for Q̃n (p) using density
derivative estimators
From (20) and (21), we have
$$h_{\mathrm{opt},2} = \left\{\frac{f(\xi_p)^4\, \psi(k)}{n\, f'(\xi_p)^2\, \mu_2(k)^2}\right\}^{1/3}. \qquad (23)$$
Then, plugging in the estimators of $f(\xi_p)$ and $f'(\xi_p)$ from Method 1 (see (17)), we obtain
$$\bar h_{\mathrm{opt},2} = \left\{\frac{\hat f_{g_n^{*}}(\hat\xi_p)^4\, \psi(k)}{n\, \hat f'_{g_n^{**}}(\hat\xi_p)^2\, \mu_2(k)^2}\right\}^{1/3}. \qquad (24)$$
5 Numerical Performance
We implement the methods in Section 4. Four distributions are selected: Exponential, Double Exponential, Lognormal, and standard Normal. We use the standard normal density as the kernel $k$, i.e. $k(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$. Then $k'(x) = -x\,k(x)$, and we can find
$$\mu_2(k) = \int x^2 k(x)\,dx = 1,$$
$$\psi(k) = 2\int x\, k(x)\, K(x)\,dx = \frac{1}{\sqrt{\pi}},$$
$$R(k) = \int k^2(x)\,dx = \frac{1}{2\sqrt{\pi}},$$
$$R(k') = \int \{k'(x)\}^2\,dx = \int x^2 k^2(x)\,dx = \frac{1}{4\sqrt{\pi}}.$$
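These four constants are easy to confirm by numerical quadrature (the grid and tolerances below are our choices):

```python
import math

# Trapezoidal quadrature on a wide grid; k is the standard normal density.
N, A = 100000, 10.0
dx = 2.0 * A / N
xs = [-A + dx * i for i in range(N + 1)]
k = [math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) for x in xs]
K = [0.5 * (1.0 + math.erf(x / math.sqrt(2.0))) for x in xs]  # normal CDF

def trapz(vals):
    return dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

mu2 = trapz([x * x * kx for x, kx in zip(xs, k)])                  # expect 1
psi = trapz([2.0 * x * kx * Kx for x, kx, Kx in zip(xs, k, K)])    # expect 1/sqrt(pi)
Rk  = trapz([kx * kx for kx in k])                                 # expect 1/(2 sqrt(pi))
Rkp = trapz([x * x * kx * kx for x, kx in zip(xs, k)])             # expect 1/(4 sqrt(pi))
```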
Hence, for the exponential density $f(x) = e^{-x}$, $x > 0$,
$$g_n^{*} = \left\{\frac{e^{x}}{n\sqrt{\pi}}\right\}^{1/5}, \qquad g_n^{**} = \left\{\frac{3 e^{x}}{4 n\sqrt{\pi}}\right\}^{1/7},$$
$$\mathrm{AMSE}\,\hat Q_n(p) = \frac{p(1-p)}{n}\, e^{2x} - \frac{h}{n\sqrt{\pi}}\, e^{x} + \frac{h^4}{4},$$
$$a^{*} = \left\{\frac{e^{-4x}}{8\sqrt{\pi}\, n}\right\}^{1/5}, \qquad a^{**} = \left\{\frac{3 e^{-6x}}{144\sqrt{\pi}\, n}\right\}^{1/7},$$
$$\mathrm{AMSE}\,\tilde Q_n(p) = \frac{p(1-p)}{n}\, e^{2x} + \frac{h^4}{4}\, e^{4x} - \frac{h}{n\sqrt{\pi}}\, e^{2x}.$$
Case 3. $f$ is the lognormal density. We have
$$f(x) = \frac{1}{\sqrt{2\pi}\,x}\, e^{-\frac{\log^2 x}{2}},$$
$$f'(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{\log^2 x}{2}} \Big(-\frac{1}{x^2} - \frac{\log x}{x^2}\Big) = -\frac{f(x)}{x}\,(1 + \log x),$$
$$f''(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{\log^2 x}{2}}\, \frac{1}{x^3}\,(1 + 3\log x + \log^2 x) = \frac{f(x)}{x^2}\,(1 + 3\log x + \log^2 x),$$
$$f'''(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{\log^2 x}{2}}\, \frac{1}{x^4}\,(-8\log x - 6\log^2 x - \log^3 x) = -\frac{f(x)}{x^3}\,(8\log x + 6\log^2 x + \log^3 x).$$
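The closed-form second derivative above can be sanity-checked against a finite difference (step size is our choice):

```python
import math

def f(x):
    """Lognormal(0, 1) density."""
    return math.exp(-0.5 * math.log(x) ** 2) / (math.sqrt(2.0 * math.pi) * x)

def f2_closed(x):
    """Closed form: f''(x) = f(x) (1 + 3 log x + log^2 x) / x^2."""
    L = math.log(x)
    return f(x) * (1.0 + 3.0 * L + L * L) / (x * x)

def f2_fd(x, eps=1e-3):
    """Second central difference approximation to f''(x)."""
    return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / (eps * eps)
```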
Hence
$$g_n^{*} = \left\{\frac{x^4}{n\sqrt{\pi}\,(1 + 3\log x + \log^2 x)^2\, f(x)}\right\}^{1/5}, \qquad g_n^{**} = \left\{\frac{3 x^6}{4 n\sqrt{\pi}\,(8\log x + 6\log^2 x + \log^3 x)^2\, f(x)}\right\}^{1/7},$$
$$\mathrm{AMSE}\,\hat Q_n(p) = \frac{2\pi p(1-p)}{n}\, x^2 e^{\log^2 x} - \frac{\sqrt{2}\, h}{n}\, x\, e^{\frac{\log^2 x}{2}} + \frac{h^4}{4}\, \frac{(1 + \log x)^2}{x^2},$$
$$a^{*} = \frac{1}{\sqrt{2\pi}} \left\{\frac{e^{-2\log^2 x}}{n\sqrt{2}\,(2 + 3\log x + 2\log^2 x)^2}\right\}^{1/5},$$
$$a^{**} = \frac{1}{\sqrt{2\pi}} \left\{\frac{3 e^{-3\log^2 x}}{2n\sqrt{2}\,(5 + 13\log x + 11\log^2 x + 6\log^3 x)^2}\right\}^{1/7},$$
$$\mathrm{AMSE}\,\tilde Q_n(p) = \pi^2 h^4 x^2 (1 + \log x)^2 e^{2\log^2 x} + \frac{2\pi x^2 e^{\log^2 x}}{n}\Big\{p(1-p) - \frac{h}{\sqrt{\pi}}\Big\}.$$
Case 4. $f$ is the double exponential density. We have $f(x) = \frac{1}{2} e^{-|x|} = f''(x)$ except at $x = 0$, and
$$f'(x) = \begin{cases} -\frac{1}{2} e^{-x}, & x > 0 \\ \frac{1}{2} e^{x}, & x < 0 \end{cases} \;=\; -\frac{1}{2}\,\mathrm{sign}(x)\, e^{-|x|} = f'''(x).$$
Hence
$$g_n^{*} = \left\{\frac{2 e^{|x|}}{n\sqrt{\pi}}\right\}^{1/5}, \qquad g_n^{**} = \left\{\frac{3 e^{|x|}}{2 n\sqrt{\pi}}\right\}^{1/7},$$
$$\mathrm{AMSE}\,\hat Q_n(p) = \frac{4 p(1-p)}{n}\, e^{2|x|} - \frac{2h}{n\sqrt{\pi}}\, e^{|x|} + \frac{h^4}{4},$$
$$a^{*} = \left\{\frac{e^{-4|x|}}{2^{7}\sqrt{\pi}\, n}\right\}^{1/5}, \qquad a^{**} = \left\{\frac{e^{-6|x|}}{3 \cdot 2^{10}\sqrt{\pi}\, n}\right\}^{1/7},$$
$$\mathrm{AMSE}\,\tilde Q_n(p) = 4 h^4 e^{4|x|} + \frac{4 p(1-p)}{n}\, e^{2|x|} - \frac{4h}{n\sqrt{\pi}}\, e^{2|x|}.$$
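The identity $f''(x) = f(x)$ for the double exponential density (away from $x = 0$), which underlies these formulas, can likewise be checked by a finite difference:

```python
import math

def f_de(x):
    """Double exponential (Laplace) density."""
    return 0.5 * math.exp(-abs(x))

def f2_fd(x, eps=1e-3):
    """Second central difference approximation to f''(x); valid away from x = 0."""
    return (f_de(x + eps) - 2.0 * f_de(x) + f_de(x - eps)) / (eps * eps)

diff = f2_fd(1.5) - f_de(1.5)   # should be ~ 0 since f'' = f for x != 0
```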
methods are under 1 for all $p$ values. Unfortunately, no method works better than all the other methods for all distributions and all sample sizes. In Figure 8, for example, Method 2 sometimes works better than the others, but sometimes worse. From that figure it seems that Method 1 is always more efficient than Method 3; but in Figure 6, Method 3 is more efficient than Method 1 for many $p$ values at each sample size. We can also see from Figures 5-8 that the plots for Method 1 (respectively, Method 2) are similar to those for Method 3 (respectively, Method 4). This is not coincidental, because we use the same formula to compute their asymptotic MSEs. From Figures 1-4, we observe that another common behavior of Method 2 and Method 4 is that they perform badly near the boundaries, i.e. when $p$ is close to 0 or 1.
In summary, the kernel quantile estimators, regardless of which bandwidth selection method is used, are more efficient than the empirical quantile estimator in most situations. When the sample size $n$ is relatively small, say $n = 50$, they are significantly more efficient than the empirical quantile estimator. But no single method is the most efficient in all situations.
References
[1] Azzalini, A. (1981). A note on the estimation of a distribution function and quantiles by a kernel method. Biometrika, 68, 326-328.
[4] Nadaraya, E. A. (1964). Some new estimates for distribution functions. Theory Probab. Appl., 9, 497-500.
[6] Read, R. R. (1972). The asymptotic inadmissibility of the sample distribution function. Ann. Math. Statist., 43, 89-95.
[7] Ralescu, S. S. and Sun, S. (1993). Necessary and sufficient conditions for the asymptotic normality of perturbed sample quantiles. J. Statist. Plann. Inference, 35, 55-64.
[10] Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.
Appendix
We now provide the proof of the AMSE in Remark 2.2. Here we follow the notation of Falk (1984). Since $F^{-1\prime}(q) = Q'(q) = 0$, we have
$$\begin{aligned}
\mathrm{Var}\,\tilde Q_n(p)
&= n^{-1}\int_0^1\Big\{\int k(x)\big(q-\alpha_n x-1_{(0,q-\alpha_n x)}(y)\big)\,F^{-1\prime}(q-\alpha_n x)\,dx\Big\}^2 dy\\
&= n^{-1}\int_0^1\Big\{\int k(x)\big(q-\alpha_n x-1_{(0,q-\alpha_n x)}(y)\big)\big[F^{-1\prime}(q)-\alpha_n x F^{-1\prime\prime}(q)+O(\alpha_n^2)\big]dx\Big\}^2 dy\\
&= n^{-1}\int_0^1\Big\{\int k(x)\big(q-\alpha_n x-1_{(0,q-\alpha_n x)}(y)\big)\big(-\alpha_n x F^{-1\prime\prime}(q)\big)dx\Big\}^2 dy+O(n^{-1}\alpha_n^2)\\
&= b\int_0^1\Big\{\int k(x)\big(q-\alpha_n x-1_{(0,q-\alpha_n x)}(y)\big)\,x\,dx\Big\}^2 dy+O(n^{-1}\alpha_n^2)\\
&= b\int_0^1\Big\{q\int xk(x)\,dx-\alpha_n\int x^2k(x)\,dx-\int xk(x)1_{(0,q-\alpha_n x)}(y)\,dx\Big\}^2 dy+O(n^{-1}\alpha_n^2)\\
&= b\int_0^1\Big[\alpha_n\mu_2(k)+\int xk(x)1_{(0,q-\alpha_n x)}(y)\,dx\Big]^2 dy+O(n^{-1}\alpha_n^2)\\
&= b\int_0^1\Big\{\alpha_n^2\mu_2^2(k)+2\alpha_n\mu_2(k)\int xk(x)1_{(0,q-\alpha_n x)}(y)\,dx+\Big[\int xk(x)1_{(0,q-\alpha_n x)}(y)\,dx\Big]^2\Big\}dy+O(n^{-1}\alpha_n^2)\\
&= b\,\alpha_n^2\mu_2^2(k)+2b\,\alpha_n\mu_2(k)\,S_1+b\,S_2+O(n^{-1}\alpha_n^2)
\end{aligned}$$
where $b = n^{-1}\alpha_n^2 F^{-1\prime\prime}(q)^2$. But
$$\begin{aligned}
S_1&=\int_0^1\int xk(x)1_{(0,q-\alpha_n x)}(y)\,dx\,dy=\int xk(x)\int_0^1 1_{(0,q-\alpha_n x)}(y)\,dy\,dx\\
&=\int xk(x)(q-\alpha_n x)\,dx=q\int xk(x)\,dx-\alpha_n\int x^2k(x)\,dx=-\alpha_n\mu_2(k)
\end{aligned}$$
and
$$\begin{aligned}
S_2&=\int_0^1\Big[\int xk(x)1_{(0,q-\alpha_n x)}(y)\,dx\Big]^2dy=\int_0^1\Big[\int_{(q-1)/\alpha_n}^{(q-y)/\alpha_n}xk(x)\,dx\Big]^2dy\\
&=\Big\{y\Big[\int_{(q-1)/\alpha_n}^{(q-y)/\alpha_n}xk(x)\,dx\Big]^2\Big\}\Big|_0^1-\int_0^1 y\,d\Big\{\Big[\int_{(q-1)/\alpha_n}^{(q-y)/\alpha_n}xk(x)\,dx\Big]^2\Big\}\\
&=-2\int_0^1 y\Big[\int_{(q-1)/\alpha_n}^{(q-y)/\alpha_n}xk(x)\,dx\Big]\,\frac{q-y}{\alpha_n}\,k\Big(\frac{q-y}{\alpha_n}\Big)\Big(-\frac{1}{\alpha_n}\Big)dy\\
&=\frac{2}{\alpha_n}\int_0^1 y\,\frac{q-y}{\alpha_n}\,k\Big(\frac{q-y}{\alpha_n}\Big)\Big[\int_{(q-1)/\alpha_n}^{(q-y)/\alpha_n}xk(x)\,dx\Big]dy\\
&=\frac{2}{\alpha_n}\int_{q/\alpha_n}^{(q-1)/\alpha_n}(q-\alpha_n t)\,t\,k(t)\Big[\int_{(q-1)/\alpha_n}^{t}xk(x)\,dx\Big](-\alpha_n)\,dt\\
&=2\int_{(q-1)/\alpha_n}^{q/\alpha_n}(q-\alpha_n t)\,t\,k(t)\Big[\int_{(q-1)/\alpha_n}^{t}xk(x)\,dx\Big]dt\\
&=2\int_{(q-1)/\alpha_n}^{q/\alpha_n}(q-\alpha_n t)\,t\,k(t)\,j(t)\,dt
\end{aligned}$$
where $j(t)=\int_{-c}^{t}xk(x)\,dx$ and $c$ is such that $k$ is finitely supported in $[-c,c]$. Then
$$\mathrm{Var}\,\tilde Q_n(p)=-n^{-1}\alpha_n^4 F^{-1\prime\prime}(q)^2\mu_2^2(k)+2n^{-1}\alpha_n^2 F^{-1\prime\prime}(q)^2\int(q-\alpha_n t)\,t\,k(t)\,j(t)\,dt+O(n^{-1}\alpha_n^2).$$
If we replace $\alpha_n$ by $h$ and $F^{-1\prime\prime}(q)$ by $Q''(q)$, then
$$\mathrm{Var}\,\tilde Q_n(p)=-n^{-1}h^4 Q''(q)^2\mu_2^2(k)+2n^{-1}h^2 Q''(q)^2\int(q-ht)\,t\,k(t)\,j(t)\,dt+O(n^{-1}h^2),$$
$$\mathrm{MSE}\,\tilde Q_n(p)=\frac{h^4}{4}\mu_2^2(k)Q''(q)^2-n^{-1}h^4 Q''(q)^2\mu_2^2(k)+2n^{-1}h^2 Q''(q)^2\int(q-ht)\,t\,k(t)\,j(t)\,dt+O(h^4)+O(n^{-1}h^2).$$
That is,
$$\mathrm{AMSE}\,\tilde Q_n(p)=\Big(\frac{1}{4}-\frac{1}{n}\Big)h^4 Q''(q)^2\mu_2^2(k)+2n^{-1}h^2 Q''(q)^2\int(q-ht)\,t\,k(t)\,j(t)\,dt.$$
Figure 1: Efficiency under double exponential. Different panels correspond to different methods.
Figure 2: Efficiency under exponential. Different panels correspond to different methods.
Figure 3: Efficiency under Log Normal. Different panels correspond to different methods.
Figure 4: Efficiency under standard Normal. Different panels correspond to different methods.
Figure 5: Efficiency under double exponential. Different panels correspond to different sample sizes.
Figure 6: Efficiency under exponential. Different panels correspond to different sample sizes.
Figure 7: Efficiency under Log Normal. Different panels correspond to different sample sizes.
Figure 8: Efficiency under standard Normal. Different panels correspond to different sample sizes.