
Journal of Nonparametric Statistics

Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/gnst20

Nonparametric regression expectiles

Biao Zhang, The University of Toledo
Published online: 12 Apr 2007.

To cite this article: Biao Zhang (1994) Nonparametric regression expectiles, Journal of Nonparametric Statistics, 3:3-4, 255-275, DOI: 10.1080/10485259408832586

To link to this article: http://dx.doi.org/10.1080/10485259408832586

Nonparametric Statistics, Vol. 3, pp. 255-275. © 1994 Gordon and Breach Science Publishers.
Reprints available directly from the publisher. Printed in the United States of America.
Photocopying permitted by license only.

NONPARAMETRIC REGRESSION EXPECTILES*


BIAO ZHANG

The University of Toledo

It is well known that a standard nonparametric regression analysis models the average behavior of the dependent variable Y given the explanatory variable x. But such an approach may not always be appropriate if one is interested in the extreme behavior of Y conditional on x. This paper considers the problem of estimating the expectile function of the conditional distribution of Y given x based on observational data generated according to a nonparametric regression model. We propose a kernel-type nonparametric regression estimator, called the nonparametric regression expectile, using an asymmetric squared loss function. This estimator models not only the average behavior but also the extreme behavior of Y given x in the nonparametric regression setting. An iterative algorithm is presented to calculate the estimator. It is shown that the nonparametric regression expectile is consistent and asymptotically normally distributed. We also derive a lower bound for the asymptotic variance and the asymptotic expression for the mean square error and the optimal bandwidth. A simulation study is given to demonstrate the utility of the nonparametric regression expectile for understanding nonparametric regression data.

KEYWORDS: Asymmetric squared loss, bandwidth, kernel-type, expectile function, nonparametric regression.

1. INTRODUCTION

Suppose that the n observable data $\{(x_1, y_1), \ldots, (x_n, y_n)\}$ are generated according to the following nonparametric regression model

$y_i = g(x_i) + \varepsilon_i, \quad i = 1, \ldots, n,$  (1.1)

where g is an unknown but smooth regression function evaluated at the design points $0 = x_0 \le x_1 \le \cdots \le x_n = 1$, and the $\varepsilon_i$ are independent, identically distributed, mean-zero observation errors with $E\varepsilon_i^2 = \sigma^2$ and continuous density function $f(\cdot)$.

One statistical problem posed by the observation model (1.1) is to estimate g without assuming a particular parametric form for the unknown function g. Standard nonparametric regression analyses study the relationship between the dependent variable Y and the explanatory variable x by modeling the average behavior of Y given x. Several estimators of the regression function g have been introduced. The kernel estimator proposed by Gasser and Müller (1979) is

$g_n^*(x) = \sum_{i=1}^{n} a_i(x)\, y_i$  (1.2)
*This research was supported by NSF Grant DMS 89-02667. Computations were performed using
computer facilities supported in part by the National Science Foundation Grants DMS 86-01732, DMS
87-03942 and DMS 89-05292 awarded to the Department of Statistics at The University of Chicago,
and by The University of Chicago Block Fund.

with weights

$a_i(x) = \frac{1}{h_n} \int_{s_{i-1}}^{s_i} K\Big(\frac{x-u}{h_n}\Big)\, du,$  (1.3)

where $K(\cdot)$ is a kernel function having finite support on $[-1, 1]$ with a maximum at zero, $\{h_n\}$ is a sequence of positive bandwidths tending to zero as the sample size n tends to infinity, and $\{s_i\}_{i=0}^{n}$ is a sequence of interpolating points such that $x_{i-1} \le s_i \le x_i$, $i = 1, \ldots, n-1$, and $s_0 = 0$, $s_n = 1$.
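To make the construction concrete, here is a minimal numerical sketch of (1.2)-(1.3). It is not from the paper: the function names are illustrative, the $s_i$ are taken as midpoints of consecutive design points, the integral in (1.3) is approximated by the midpoint rule, and the quadratic kernel anticipates (3.14) below.

```python
import numpy as np

def kernel(u):
    # Quadratic kernel supported on [-1, 1]; it integrates to 1 and is
    # symmetric, so it satisfies the moment conditions in (A3).
    return np.where(np.abs(u) <= 1, 0.75 * (1.0 - u**2), 0.0)

def gm_weights(x, xs, h):
    # Weights (1.3): a_i(x) = h^{-1} * integral of K((x - u)/h) over
    # [s_{i-1}, s_i], with s_0 = 0, s_n = 1 and midpoints in between;
    # each integral is approximated by the midpoint rule.
    s = np.concatenate(([0.0], 0.5 * (xs[:-1] + xs[1:]), [1.0]))
    mid = 0.5 * (s[:-1] + s[1:])          # midpoint of each [s_{i-1}, s_i]
    return kernel((x - mid) / h) * np.diff(s) / h

def kernel_estimate(x, xs, ys, h):
    # Kernel estimator (1.2): a weighted average of the responses y_i.
    return np.sum(gm_weights(x, xs, h) * ys)
```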
If the kernel function $K(\cdot)$ is smooth, then the kernel estimator (1.2) is asymptotically unbiased. Thus, given data $(x_i, y_i)$, $i = 1, \ldots, n$, which is a cloud of points in Euclidean space $R^2$, the kernel estimator $g_n^*(x)$ of the regression function $g(x)$, for large n, describes the middle of the point cloud in the y direction, as a function of x. But such an estimator may not always be appropriate for modeling the relationship between Y and x when one is interested in the extreme behavior of Y conditional on x. For example, in order to know whether an action concerning Y will have an effect on extreme values of Y, one may want to capture the local behavior of the data either in the center or in the tails, and thus one is interested in the higher or lower parts of the point cloud as well as its middle. The objective of this paper is to present an approach modeling not only the average behavior but also the extreme behavior of Y given x in the nonparametric regression setting.
For the linear regression model, Koenker and Bassett (1978, 1982) defined (linear) regression quantiles using asymmetric absolute loss functions. Breckling and Chambers (1988) considered asymmetric M-estimators. Newey and Powell (1987) and Efron (1991) used asymmetric squared loss functions to define what they call expectiles and regression percentiles. In this paper, we will, in the nonparametric regression setting, introduce a new class of kernel-type nonparametric regression estimators, called nonparametric regression expectiles, using asymmetric squared loss functions. These estimators are used to estimate the expectile function (see Subsection 2.1) of the conditional distribution of Y given x. The proposed approach also generalizes the bounded and symmetric loss function in the M-type kernel estimator to an unbounded and asymmetric loss function and is useful in understanding nonparametric regression data.
In Section 2, we define nonparametric regression expectiles using asymmetric
squared loss functions. We also propose an iterative algorithm to calculate the
nonparametric regression expectiles. Section 3 establishes large sample properties
of nonparametric regression expectiles. In Section 4, we derive the asymptotic
expression for the mean square error of the nonparametric regression expectile,
including the asymptotically optimal bandwidth. Last, in Section 5, we present a
simulation study to demonstrate the utility of nonparametric regression expectiles
for understanding nonparametric regression data. Proofs of lemmas and theorems
appear in the Appendix.

2. THE NONPARAMETRIC REGRESSION EXPECTILE

This section concerns the definition and calculation of nonparametric regression expectiles.

2.1. The Expectile Function


Suppose that a dependent variable Y and an explanatory variable x are related according to the nonparametric regression model, i.e.,

$Y = g(x) + \varepsilon,$  (2.1)

where the error $\varepsilon$ has mean 0, variance $\sigma^2$ and continuous density function $f(\cdot)$. Let $f(y \mid x)$ be the conditional density function of Y given x; then $f(y \mid x) = f(y - g(x))$. For $0 < p < 1$, the pth quantile function $q_p(x)$ of this conditional distribution satisfies $\int_{-\infty}^{q_p(x)} f(y \mid x)\, dy = p$. Depending on $f(\cdot \mid x)$, this quantile function can be quite complicated. This motivates us to find an alternative function which, like the quantile function, characterizes the conditional distribution of Y given x, but is easy to handle. If $f(y \mid x)$ is symmetric in y for all $x \in [0, 1]$, then for $p = 1/2$, $E(Y - q_{1/2}(x) \mid x) = 0$, or

$E[W_{1/2}(Y - q_{1/2}(x))(Y - q_{1/2}(x)) \mid x] = 0,$  (2.2)

where, for $0 < p < 1$,

$W_p(r) = p$ if $r > 0$, and $W_p(r) = 1 - p$ if $r \le 0$.  (2.3)

For $0 < p < 1$ and each $\theta$, let

$\Lambda_p(\theta \mid x) = E[W_p(Y - \theta)(Y - \theta) \mid x].$  (2.4)

For $0 < p < 1$, (2.2) motivates us to choose the function $g_p(x)$ that solves the equation

$\Lambda_p(\theta \mid x) = E[W_p(Y - \theta)(Y - \theta) \mid x] = 0$  (2.5)

over $\theta$ as an alternative to the quantile function $q_p(x)$. It is easy to check that $\Lambda_p(\theta \mid x)$ is continuously differentiable and strictly decreasing in $\theta$. Therefore, $g_p(x)$ is the unique solution to (2.5) and is well defined. We will call $g_p(x)$ the expectile function of the conditional distribution of Y given x. The idea of the expectile function is due to Newey and Powell (1987), who demonstrated that the expectile function $g_p(x)$ summarizes the distribution function in much the same way that the quantile function $q_p(x)$ does, and that for given x, $g_p(x)$ is determined by the properties of the expectation of Y conditional on Y being in a tail of the distribution.

Let

$\rho_p(r) = W_p(r)\, r^2,$  (2.6)

where $0 < p < 1$; then $\rho_p(\cdot)$ is an asymmetric squared loss function, and $\rho_p(r)$ reduces to the symmetric squared loss function when $p = 0.5$. Furthermore, it can be shown that $g_p(x)$ is the unique global minimum of

$E[\rho_p(Y - \theta) \mid x] = \min!$  (2.7)

with respect to $\theta$, where $0 \le x \le 1$. This is an equivalent way to define the expectile function $g_p(x)$.
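To fix ideas, the following sketch computes the expectile of a finite sample by solving the empirical analogue of (2.5). It is a hypothetical illustration, not code from the paper, and it anticipates the fixed-point iteration of Subsection 2.3.

```python
import numpy as np

def expectile(y, p, tol=1e-10, max_iter=200):
    # Solve mean(W_p(y - theta) * (y - theta)) = 0 for theta, where, as in
    # (2.3), W_p(r) = p for r > 0 and W_p(r) = 1 - p for r <= 0.
    theta = float(np.mean(y))                # the p = 1/2 solution is the mean
    for _ in range(max_iter):
        w = np.where(y > theta, p, 1.0 - p)  # asymmetric weights W_p(y - theta)
        theta_new = float(np.sum(w * y) / np.sum(w))
        if abs(theta_new - theta) < tol:
            break
        theta = theta_new
    return theta

# Example: for N(0, 1) samples the 0.5-expectile is near 0,
# and the 0.75-expectile sits above it.
y = np.random.default_rng(1).standard_normal(10_000)
print(expectile(y, 0.5), expectile(y, 0.75))
```

The fixed point $\theta = \sum_i W_p(y_i - \theta)\, y_i / \sum_i W_p(y_i - \theta)$ is exactly the stationarity condition obtained by differentiating the empirical version of (2.7).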

2.2. The Nonparametric Regression Expectile


This subsection concerns the estimation of the expectile function $g_p(x)$ based on the observational data $\{(x_1, y_1), \ldots, (x_n, y_n)\}$ generated by the nonparametric regression model (1.1). We propose a kernel-type estimator based on minimizing the kernel estimator of $E[\rho_p(Y - \theta) \mid x]$.
For $p = 1/2$, notice that the kernel estimator (1.2) of $g_{1/2}(x)$ is a weighted least-squares estimator in the sense that it is the minimizer of the weighted squared error criterion

$\sum_{i=1}^n a_i(x)(y_i - \theta)^2 = \min!$  (2.8)

or, equivalently, of

$\sum_{i=1}^n a_i(x)\, \rho_{1/2}(y_i - \theta) = \min!$  (2.9)

with respect to $\theta$, provided that $\sum_{i=1}^n a_i(x) = 1$, which is true for large n such that $[(x-1)/h_n, x/h_n] \supseteq [-1, 1]$. For $0 < p < 1$, (2.2), (2.7) and (2.9) motivate us to estimate $g_p(x)$ by what is called the nonparametric regression expectile, denoted by $g_{np}(x)$, which is defined to be the minimizer of

$\sum_{i=1}^n a_i(x)\, \rho_p(y_i - \theta) = \min!$  (2.10)

with respect to $\theta$, where $0 \le x \le 1$. Notice that $g_{np}(x)$ is the kernel estimator (1.2) when $p = 1/2$. Let

$S_n(\theta \mid x, p) = \sum_{i=1}^n a_i(x)\, \rho_p(y_i - \theta);$  (2.11)

then it is easy to show that $S_n(\theta \mid x, p)$ is strictly convex and continuously differentiable as a function of $\theta$, and goes to infinity as $|\theta|$ goes to infinity. This implies that the minimizer $g_{np}(x)$ of $S_n(\theta \mid x, p)$ exists uniquely and is the solution of

$\sum_{i=1}^n a_i(x)\, W_p(y_i - \theta)(y_i - \theta) = 0.$  (2.13)
We will present an iterative algorithm to solve for $g_{np}(x)$. Furthermore, we will show that $g_{np}(x)$ is a consistent estimator of $g_p(x)$ and that $g_{np}(x) - g_p(x)$ is asymptotically normally distributed. Consequently, the nonparametric regression expectile $g_{np}(x)$ provides a convenient and useful method of summarizing the conditional distribution of the dependent variable Y given the explanatory variable x.

Finally, we briefly mention the connection between the nonparametric regression expectile and what is called the M-type kernel estimator. Due to the well-known tendency of least-squares estimates to be dominated by outliers, the kernel estimator (1.2) is not robust in the distributional sense. Therefore, Härdle and Gasser (1984), among others, proposed the M-type kernel estimator as the estimator of the regression function $g(x)$, which includes some robust alternatives to the estimator (1.2) and is defined as any minimizer of

$\sum_{i=1}^n a_i(x)\, \rho(y_i - \theta) = \min!$  (2.14)

with respect to $\theta$, where $\rho$ is some bounded and nonnegative symmetric loss function with derivative $\rho' = \psi$. It is seen that our weighted error criterion (2.10) generalizes the weighted error criterion (2.14) in the sense that we generalize the bounded and symmetric loss function $\rho$ in (2.14) to an unbounded and asymmetric loss function $\rho_p$ in (2.10). This extension is useful in understanding nonparametric regression data.

2.3. An Iterative Algorithm


This subsection concerns the calculation of the nonparametric regression expectile $g_{np}(x)$. Let

$T_n(\theta)(x) = \dfrac{\sum_{i=1}^n a_i(x)\, W_p(y_i - \theta(x))\, y_i}{\sum_{i=1}^n a_i(x)\, W_p(y_i - \theta(x))};$

then it is seen from (2.13) that $g_{np}(x)$ is the stationary value of

$\theta = T_n(\theta).$  (2.15)

Iterative methods are needed to solve (2.15). Starting from an initial approximation $g_{np}^{(0)}(x)$ to $g_{np}(x)$, we successively calculate

$g_{np}^{(k+1)}(x) = T_n(g_{np}^{(k)})(x)$  (2.16)

for $k = 0, 1, \ldots$. In the following, we will show that, provided we restrict ourselves to solving the fixed-point equation (2.15) within the class of continuous functions, i.e., $\theta = \theta(x) \in C[0, 1]$, the iterates $g_{np}^{(k)}(x)$ calculated according to (2.16) always converge to the desired nonparametric regression expectile $g_{np}(x)$. In the meantime, we will also give the rate of convergence of the iteration (2.16).

For $\theta \in C[0, 1]$, let the norm $\|\cdot\|$ be defined by $\|\theta\| = \max_{x \in [0,1]} |\theta(x)|$; then we have

THEOREM 2.1. Suppose that (A3) (see Section 3) holds and that $g_{np}, g_{np}^{(0)} \in C[0, 1]$; then

$\|g_{np}^{(k+1)} - g_{np}\| < \|g_{np}^{(k)} - g_{np}\|.$  (2.17)

Furthermore, there exists a constant $\alpha_p$ ($0 < \alpha_p < 1$) such that

$\|g_{np}^{(k)} - g_{np}\| \le \alpha_p^k\, \|g_{np}^{(0)} - g_{np}\|.$  (2.18)

Proof. See the Appendix.



Theorem 2.1 shows that the distance from the true nonparametric regression expectile is strictly decreasing in the iteration. Furthermore, for any starting value, the iteration (2.16) converges linearly to $g_{np}(x)$; that is, the deviation in terms of the norm $\|\cdot\|$ decreases at a geometric rate.
Next we present an algorithm showing how to use the iteration (2.16) to find the true nonparametric regression expectile $g_{np}(x)$. Let $g_{np}^{(0)}(x)$ be any starting value for each x, say $g_{np}^{(0)}(x) = 0$ for all $x \in [0, 1]$, and let $g_{np}^{(k)}(x)$ be given according to (2.16) for $k \ge 1$; then it can be shown by induction that for $k \ge 1$,

$\|g_{np}^{(k+1)} - g_{np}^{(k)}\| \le \alpha_p^k\, \|g_{np}^{(1)} - g_{np}^{(0)}\|.$  (2.19)

Thus, for $m \ge 1$, it follows from (2.19) and the triangle inequality that

$\|g_{np}^{(k+m)} - g_{np}^{(k)}\| \le \sum_{j=k}^{k+m-1} \alpha_p^j\, \|g_{np}^{(1)} - g_{np}^{(0)}\| \le \frac{\alpha_p^k}{1 - \alpha_p}\, \|g_{np}^{(1)} - g_{np}^{(0)}\|.$  (2.20)

Letting $m \to \infty$ in (2.20) gives

$\|g_{np} - g_{np}^{(k)}\| \le \frac{\alpha_p^k}{1 - \alpha_p}\, \|g_{np}^{(1)} - g_{np}^{(0)}\|.$  (2.21)

But

$\|g_{np}^{(1)} - g_{np}^{(0)}\| = \|T_n(0)\| \le \frac{\max(p, 1-p)}{\min(p, 1-p)}\, M_n,$  (2.22)

where $M_n = \max_{x \in [0,1]} \sum_{i=1}^n a_i(x)\, |y_i|$. For given p, n and positive $\eta$, if we choose the smallest integer $k_0$ such that

$\frac{\alpha_p^{k_0}}{1 - \alpha_p} \cdot \frac{\max(p, 1-p)}{\min(p, 1-p)}\, M_n \le \eta,$  (2.23)

then by (2.21), (2.22) and (2.23), we have for each x

$|g_{np}^{(k_0)}(x) - g_{np}(x)| \le \eta,$  (2.24)

which suggests the following iterative algorithm: successively calculate $g_{np}^{(k)}(x) = T_n(g_{np}^{(k-1)})(x)$ for each x and $k \ge 1$ until $k = k_0$, with $k_0$ determined by (2.23); then the $k_0$th iterate $g_{np}^{(k_0)}(x)$ can be used to approximate $g_{np}(x)$ with absolute error at most $\eta$.
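As a concrete illustration of the iteration (2.16), the sketch below computes $g_{np}(x)$ at a single point, assuming the form of $T_n$ displayed above and reusing the midpoint-rule weights from the Section 1 sketch; the absolute tolerance eta plays the role that $k_0$ plays in (2.23). It is an illustration under these assumptions, not the paper's FORTRAN implementation.

```python
import numpy as np

def regression_expectile(x, xs, ys, h, p, eta=1e-8, max_iter=500):
    # Gasser-Mueller-type weights a_i(x), as in the sketch following (1.3).
    s = np.concatenate(([0.0], 0.5 * (xs[:-1] + xs[1:]), [1.0]))
    u = (x - 0.5 * (s[:-1] + s[1:])) / h
    a = np.where(np.abs(u) <= 1, 0.75 * (1.0 - u**2), 0.0) * np.diff(s) / h

    theta = 0.0                               # starting value g_np^(0)(x) = 0
    for _ in range(max_iter):
        w = a * np.where(ys > theta, p, 1.0 - p)   # a_i(x) * W_p(y_i - theta)
        theta_new = float(np.sum(w * ys) / np.sum(w))
        if abs(theta_new - theta) < eta:      # geometric convergence (Theorem 2.1)
            return theta_new
        theta = theta_new
    return theta
```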
Finally, we mention that results similar to those in Theorem 2.1 also hold for the expectile function $g_p(x)$ of Subsection 2.1. Let

$U_p(\theta)(x) = \dfrac{E[W_p(Y - \theta(x))\, Y \mid x]}{E[W_p(Y - \theta(x)) \mid x]};$  (2.25)

then, according to (2.5), $g_p(x)$ is the stationary value of

$\theta = U_p(\theta).$

Furthermore, we have the following result.

THEOREM 2.2. Let $g_p, g_p^{(0)} \in C[0, 1]$, and define $g_p^{(k+1)} = U_p(g_p^{(k)})$ for $k \ge 0$; then

$\|g_p^{(k+1)} - g_p\| < \|g_p^{(k)} - g_p\|$  (2.26)

and there exists a constant $\beta_p$ ($0 < \beta_p < 1$) such that

$\|g_p^{(k)} - g_p\| \le \beta_p^k\, \|g_p^{(0)} - g_p\|.$  (2.27)

3. LARGE SAMPLE PROPERTIES OF THE NONPARAMETRIC REGRESSION EXPECTILE

The asymptotic theory for the nonparametric regression expectile $g_{np}(x)$ is developed in this section under the following assumptions.

(A1) The bandwidths satisfy $h_n \to 0$ and $n h_n \to \infty$.

(A2) The interpolating sequence $\{s_i\}_{i=0}^n$ satisfies $\max_{1 \le i \le n} |s_i - s_{i-1} - n^{-1}| = O(n^{-\delta})$ for some $\delta \ge 2$.

(A3) The kernel function $K(u)$ is Lipschitz-continuous and has compact support $[-1, 1]$. It further satisfies

$\int_{-1}^{1} K(u)\, du = 1 \quad \text{and} \quad \int_{-1}^{1} u\, K(u)\, du = 0.$

3.1. Weak Consistency

We establish in this subsection the weak consistency of the nonparametric regression expectile $g_{np}(x)$ to the expectile function $g_p(x)$. Denote by $\bar S_n(\theta \mid x, p)$ the expectation of $S_n(\theta \mid x, p)$; then

$\bar S_n(\theta \mid x, p) = \sum_{i=1}^n a_i(x)\, \lambda_p(\theta \mid x_i),$  (3.1)

where

$\lambda_p(\theta \mid x) = E[\rho_p(Y - \theta) \mid x].$  (3.2)

According to (2.7), $g_p(x)$ is the unique global minimum of $\lambda_p(\theta \mid x)$.

The idea behind the establishment of the weak consistency of $g_{np}(x)$ to $g_p(x)$ is first to show the weak consistency of $S_n(\theta \mid x, p)$ to $\lambda_p(\theta \mid x)$; then we would anticipate that the minimum $g_{np}(x)$ of $S_n(\theta \mid x, p)$ converges in probability to the minimum $g_p(x)$ of $\lambda_p(\theta \mid x)$. In doing so, let $[\theta_a, \theta_b]$ be any bounded closed set; then, since both $\bar S_n(\theta \mid x, p)$ and $\lambda_p(\theta \mid x)$ are continuous functions of $\theta$, it follows by the following Lemma 3.1 that for each $x \in (0, 1)$ and $p \in (0, 1)$,

$\lim_{n\to\infty} \sup_{\theta \in [\theta_a, \theta_b]} \Big| \sum_{i=1}^n a_i(x)\, \lambda_p(\theta \mid x_i) - \lambda_p(\theta \mid x) \Big| = 0.$  (3.3)

LEMMA 3.1. Let $h(x) : [0, 1] \to R$ be a Lipschitz-continuous function. Suppose that (A1), (A2) and (A3) hold; then we have

$\lim_{n\to\infty} \Big| \sum_{i=1}^n a_i(x)\, h(x_i) - h(x) \Big| = 0.$  (3.4)

Furthermore, for $m > 1$,

$\sum_{i=1}^n a_i^m(x)\, h(x_i) = \frac{1}{(n h_n)^{m-1}} \Big[ h(x) \int_{-1}^{1} K^m(u)\, du + o(1) \Big].$  (3.5)

Proof. See the Appendix.

Notice that for sufficiently large n an alternative expression to (3.5) is

$(n h_n)^{m-1} \sum_{i=1}^n a_i^m(x)\, h(x_i) = h(x) \int_{-1}^{1} K^m(u)\, du + o(1).$  (3.6)

The following lemma gives the weak consistency of $S_n(\theta \mid x, p)$.

LEMMA 3.2. Suppose that (A1) to (A3) hold; then for each $x \in (0, 1)$ and $p \in (0, 1)$, $S_n(\theta \mid x, p)$ converges to $\lambda_p(\theta \mid x)$ in probability uniformly on any bounded open interval $(\theta_a, \theta_b)$ containing $g_p(x)$; i.e., for all $\varepsilon > 0$,

$\lim_{n\to\infty} P\Big\{ \sup_{\theta \in (\theta_a, \theta_b)} |S_n(\theta \mid x, p) - \lambda_p(\theta \mid x)| > \varepsilon \Big\} = 0.$  (3.7)

Proof. See the Appendix.


Lemma 3.2 enables us to establish the weak consistency of the unique minimum $g_{np}(x)$ of $S_n(\theta \mid x, p)$ to the unique minimum $g_p(x)$ of $\lambda_p(\theta \mid x)$ for each $x \in (0, 1)$ and $p \in (0, 1)$.

THEOREM 3.1. Suppose that (A1) to (A3) hold; then $g_{np}(x)$ is weakly consistent; i.e., $g_{np}(x) \to g_p(x)$ in probability for each $x \in (0, 1)$ and each $p \in (0, 1)$.

Proof. The proof follows from Lemma 3.2 and the following lemma due to Newey and Powell (1987, Lemma A).

LEMMA 3.3. Let $\theta_0$ be a point in $R^q$ and $\Theta$ an open set containing $\theta_0$. If

(A) $Q_n(\theta)$ converges to $Q(\theta)$ in probability uniformly on $\Theta$;
(B) $Q(\theta)$ has a unique minimum on $\Theta$ at $\theta_0$;
(C) $Q_n(\theta)$ is convex in $\theta$;

then, for $\hat\theta = \arg\min_{\theta \in \Theta} Q_n(\theta)$:

(i) $\hat\theta$ exists with probability approaching one;
(ii) $\hat\theta$ converges in probability to $\theta_0$.

3.2. Asymptotic Normality

In this subsection we consider the asymptotic normality of $g_{np}(x)$.

THEOREM 3.2. Suppose that (A1) to (A3) hold; then for each $p \in (0, 1)$ and $x \in (0, 1)$,

(3.8)

where $\psi_p(Y, \theta) = (Y - \theta)\, W_p(Y - \theta)$.

THEOREM 3.3. Suppose that (A1) to (A3) hold, $E|\varepsilon_1|^{2+\alpha} < \infty$ for some $\alpha > 0$, and $g(x)$ is continuously differentiable; then for each $p \in (0, 1)$ and $x \in (0, 1)$, we have

$\sqrt{n h_n}\,\big(g_{np}(x) - g_p(x)\big) \to N\big(0, \sigma_p^2(x)\big)$ in distribution,  (3.9)

where

$\sigma_p^2(x) = \frac{V_p(x)}{E_p^2(x)} \int_{-1}^{1} K^2(u)\, du, \quad E_p(x) = E[W_p(Y - g_p(x)) \mid x], \quad V_p(x) = E[(Y - g_p(x))^2\, W_p^2(Y - g_p(x)) \mid x].$  (3.10)

3.3. The Lower Bound of $\sigma_p^2(x)$

By Theorem 3.3, we can split the asymptotic variance of $g_{np}(x)$ into two factors,

$\sigma_p^2(x) = R_p(x)\, S_K,$

where

$R_p(x) = \frac{V_p(x)}{E_p^2(x)}, \qquad S_K = \int_{-1}^{1} K^2(u)\, du.$

Note that $S_K$ does not depend on p, x and the error density function $f(\cdot)$. According to Epanechnikov (1969), the quadratic kernel

$K(u) = \tfrac{3}{4}(1 - u^2), \qquad |u| \le 1,$  (3.14)

is the kernel function that minimizes $S_K$ subject to (A3) and a fixed value of $\int u^2 K(u)\, du$.
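As a quick check on the factorization above, the exact value of $S_K$ for the quadratic kernel (3.14) is 3/5; the snippet below verifies this numerically (an illustration, not part of the paper).

```python
import numpy as np

u = np.linspace(-1.0, 1.0, 200_001)
K = 0.75 * (1.0 - u**2)          # quadratic kernel (3.14)
print(np.trapz(K**2, u))         # ~0.6, i.e., S_K = 3/5 exactly
```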


Now we consider the lower bound of $R_p(x)$. Assume that f has derivative $f'$; then, writing $\psi_p(r) = W_p(r)\, r$ and integrating by parts,

$E_p(x) = E[W_p(Y - g_p(x)) \mid x] = -\int \psi_p\big(y - g_p(x)\big)\, f'\big(y - g(x)\big)\, dy,$

and it follows by the Schwarz inequality that

$\inf_{p,\, x \in (0,1)} R_p(x) = \inf_{p,\, x \in (0,1)} \frac{E[(Y - g_p(x))^2\, W_p^2(Y - g_p(x)) \mid x]}{E_p^2(x)} \ge \frac{1}{I(f)},$  (3.16)

where $I(f) = \int \big(f'(z)/f(z)\big)^2 f(z)\, dz$. Furthermore, it can be shown that the lower bound in (3.16) is achieved if $f'(z)/f(z)$ is constant or $f'(z)/f(z)$ is of the form $az$ for some constant a. In summary, we have shown the inequality

$\sigma_p^2(x) \ge \frac{S_K}{I(f)}$  (3.17)

for all p and x in (0, 1) under the conditions described above.

4. MEAN SQUARE ERROR AND OPTIMAL BANDWIDTH

In this section, we establish the asymptotic expression for the mean square error of $g_{np}(x)$, from which we derive the asymptotically optimal bandwidth. Note from (A.24) in the Appendix that

(4.1)

where $\hat g_n(x, p)$ is given by (A.25). Let

$B_n(x, p) = E\big(g_{np}(x)\big) - g_p(x), \qquad \sigma_n^2(x, p) = \operatorname{Var}\big(g_{np}(x)\big);$

then we have the following asymptotic expressions for $B_n(x, p)$ and $\sigma_n^2(x, p)$.

LEMMA 4.1. Suppose that conditions (A1) to (A3) hold and $g(x)$ is twice continuously differentiable; then

$B_n(x, p) = h_n^2\, b_p(x) + o(h_n^2),$  (4.4)

$\sigma_n^2(x, p) = \frac{1}{n h_n} \cdot \frac{V_p(x)}{E_p^2(x)} \int_{-1}^{1} K^2(u)\, du + o\Big(\frac{1}{n h_n}\Big),$  (4.5)

where $E_p(x)$ and $V_p(x)$ are given by (3.10), and $b_p(x)$ is the bias constant arising from the second-order Taylor expansion of $\Lambda_p(g_p(x) \mid z)$ in z around $z = x$.

Now we are ready to give the asymptotic expression for the mean square error of $g_{np}(x)$, which is defined by

$\mathrm{MSE}\big(g_{np}(x)\big) = E\big(g_{np}(x) - g_p(x)\big)^2.$  (4.6)

THEOREM 4.1. Suppose that (A1) to (A3) hold and $g(x)$ is twice continuously differentiable; then, if $b_p(x) \ne 0$,

$\mathrm{MSE}\big(g_{np}(x)\big) = h_n^4\, b_p^2(x) + \frac{1}{n h_n} \cdot \frac{V_p(x)}{E_p^2(x)} \int_{-1}^{1} K^2(u)\, du + o\Big(h_n^4 + \frac{1}{n h_n}\Big).$  (4.7)

The asymptotically optimal bandwidth is

$h_n^* = \left[ \frac{V_p(x) \int_{-1}^{1} K^2(u)\, du}{4\, b_p^2(x)\, E_p^2(x)} \right]^{1/5} n^{-1/5},$  (4.8)

and

$\mathrm{MSE}\big(g_{np}(x)\big)\Big|_{h_n = h_n^*} = O(n^{-4/5}).$  (4.9)

Theorem 4.1 implies that, with the choice of the optimal bandwidth $h_n^*$, the nonparametric regression expectile $g_{np}(x)$ has order of consistency $n^{4/5}$; i.e., $n^{4/5}\, E\big(g_{np}(x) - g_p(x)\big)^2 \to C < \infty$ as $n \to \infty$.
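The rate statement can be seen directly from the bias-variance form of (4.7): writing $\mathrm{MSE}(h) = c_1 h^4 + c_2/(nh)$ with $c_1 = b_p^2(x)$ and $c_2 = V_p(x)\int K^2 / E_p^2(x)$, calculus gives $h^* = (c_2/(4 c_1 n))^{1/5}$. The sketch below, with illustrative constants $c_1 = c_2 = 1$, confirms numerically that $n^{4/5}\,\mathrm{MSE}(h^*)$ stabilizes.

```python
import numpy as np

def optimal_bandwidth(c1, c2, n):
    # Minimizer of c1*h**4 + c2/(n*h): h* = (c2 / (4*c1*n)) ** (1/5), as in (4.8).
    return (c2 / (4.0 * c1 * n)) ** 0.2

for n in (10**2, 10**3, 10**4):
    h = optimal_bandwidth(1.0, 1.0, n)
    mse = h**4 + 1.0 / (n * h)
    print(n, round(h, 4), round(n**0.8 * mse, 4))   # last column is ~constant
```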

5. A SIMULATION STUDY

A small simulation study was performed to provide insight into the behavior of the nonparametric regression expectile $g_{np}(x)$. For model (1.1), we adopted a rescaled version of a regression function of Wahba and Wold (1975),

(5.1)

with $s_i = x_i = (i - 1)/(n - 1)$, $i = 1, \ldots, n$. The $\varepsilon_i$ are taken to be independent, identically distributed $N(0, 1)$ errors. Under these assumptions, 100 independent observations $\{(x_1, y_1), \ldots, (x_{100}, y_{100})\}$, displayed in Figure 1, were generated from the nonparametric regression model (1.1) using the New S Language (a statistical package). The iterative algorithm discussed in Subsection 2.3 was used to numerically compute $g_{np}(x)$ for $n = 100$, $p = 0.25, 0.5, 0.75$ and $x = 0.01(0.01)0.99$, with the quadratic kernel (3.14) used in evaluating the weights $a_i(x)$ given by (1.3). All computations were done in double-precision FORTRAN. The bandwidth $h_n$ was selected according to the optimal bandwidth $h_n^*$, which can be calculated from (4.8) as $h_n^* = 0.0505$, 0.0616 and 0.0737 for $p = 0.25$, 0.5 and 0.75, respectively. The resulting three nonparametric regression expectile curves, denoted by $g_{100(0.25)}(x)$, $g_{100(0.5)}(x)$ and $g_{100(0.75)}(x)$, were superimposed on the observations in Figure 1.

Figure 1. The nonparametric regression expectile $g_{np}(x)$ based on 100 independent observations simulated from the model (1.1) and (5.1) with $s_i = x_i = (i - 1)/99$, $i = 1, \ldots, 100$, and the $\varepsilon_i$ being $N(0, 1)$ errors. The top, middle and bottom curves correspond to $g_{np}(x)$ with $p = 0.75$, 0.5 and 0.25, respectively.

The following three points can be made from Figure 1. First, all three nonparametric regression expectile curves have similar patterns, reflecting the homogeneity of the observations. Second, there are 32 observations below the curve $g_{100(0.25)}(x)$, 48 observations below $g_{100(0.5)}(x)$ and 67 observations below $g_{100(0.75)}(x)$, suggesting that the lower ($p = 0.25$) and upper ($p = 0.75$) nonparametric regression expectile curves are inclined to be closer to the mean ($p = 0.5$) nonparametric regression expectile curve. This is a consequence of the decreasing outlier sensitivity of the nonparametric regression expectile $g_{np}(x)$ for smaller or larger values of p. Finally, the nonparametric regression expectile $g_{np}(x)$ conveys more information than the kernel estimator (1.2) by itself. For example, when $x = 0.5$, the central 50% of the response Y, from the 25th through the 75th expectile, is estimated to lie between 0.0177 and 0.3418.
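The study is straightforward to replicate in outline. The exact rescaled form of (5.1) is not legible in this copy, so the sketch below substitutes the classic Wahba-Wold test curve $g(x) = 4.26(e^{-x} - 4e^{-2x} + 3e^{-3x})$ as a hypothetical stand-in, reuses regression_expectile from the Subsection 2.3 sketch, and plugs in the bandwidths reported above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
xs = np.arange(n) / (n - 1)               # x_i = (i - 1)/(n - 1)
# Stand-in for (5.1): the Wahba-Wold (1975) test curve; the paper's exact
# rescaling of this function is not reproduced here.
g = 4.26 * (np.exp(-xs) - 4 * np.exp(-2 * xs) + 3 * np.exp(-3 * xs))
ys = g + rng.standard_normal(n)           # i.i.d. N(0, 1) errors

grid = np.arange(1, 100) / 100.0          # x = 0.01(0.01)0.99
h_star = {0.25: 0.0505, 0.5: 0.0616, 0.75: 0.0737}  # h_n* values from (4.8)
curves = {p: [regression_expectile(x, xs, ys, h, p) for x in grid]
          for p, h in h_star.items()}
# Counting observations under each curve reproduces the kind of tally
# reported for Figure 1.
```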

APPENDIX: Proofs

Proof of Theorem 2.1. If $g_{np}^{(0)} \in C[0, 1]$, then it is easily seen that $g_{np}^{(k+1)} = T_n(g_{np}^{(k)}) \in C[0, 1]$ for $k \ge 0$. With the contraction constant $\alpha_p = |r_p| < 1$, (2.17) follows from (2.16), and (2.18) can be proved by induction.

Proof of Lemma 3.1. (3.4) was established by Härdle and Gasser (1984, Lemma 2.2). Now we prove (3.5). Let $\Delta_n(x)$ denote the difference between the sum $\sum_{i=1}^n a_i^m(x)\, h(x_i)$, suitably normalized, and the corresponding kernel integral $h_n^{-1} \int_0^1 K^m\big((x-u)/h_n\big)\, h(u)\, du$. Applying the mean value theorem for integrals on each interval $[s_{i-1}, s_i]$, with $s_{i-1} \le t_i, \eta_i \le s_i$, bounds $\Delta_n(x)$ by sums of local increments of $K^m$ and h. Since $K(\cdot)$ and $h(\cdot)$ are both Lipschitz-continuous, there are constants $L_1$ and $L_2$ such that

$|K^m(u) - K^m(v)| \le L_1 |u - v| \quad \text{and} \quad |h(u) - h(v)| \le L_2 |u - v|,$

and thus, with $I_n = \{i : |x - x_i| \le h_n\}$, we obtain

$\Delta_n(x) = O(n^{-m}) + O\big(n^{-(m+\delta-2)}\, h_n\big).$

Therefore, (3.5) follows.


Proof of Lemma 3.2. It is enough to show that $S_n(\theta \mid x, p)$ converges to $\lambda_p(\theta \mid x)$ in probability pointwise in $\theta$ (A.1); the uniform convergence on $(\theta_a, \theta_b)$ then follows from the convexity and continuity of $S_n(\theta \mid x, p)$ in $\theta$ (A.2). By (1.3) and assumptions (A2), (A3), it is seen that $n h_n a_k(x)$ is uniformly bounded; i.e., there is a constant $C_1$ such that

$n h_n\, a_k(x) \le C_1$  (A.3)

uniformly for $k = 1, \ldots, n$. Consider the characteristic function $\phi_n(t)$ of the centered sum $S_n(\theta \mid x, p) - \bar S_n(\theta \mid x, p)$, expanded term by term (A.4). The error term in (A.4) is at most $a_k(x)\, |t|\, r_{nk}(x)$ (Billingsley 1986, p. 353), where $r_{nk}(x)$ is defined by the truncated second-moment integral (A.5). It can be shown by (A.3) and the smoothness of g that there exists a constant $C_2$ such that the integrand in (A.5) is bounded uniformly for $k = 1, \ldots, n$; it goes to 0 as $n \to \infty$ and is dominated by $2C_2 + 4\varepsilon_1^2$. Thus, it follows by the dominated convergence theorem that for each $x \in (0, 1)$,

$\lim_{n\to\infty} \max_{1 \le k \le n} r_{nk}(x) = 0.$  (A.6)

For fixed t, applying Lemma 1 of Billingsley (1986, p. 367) gives

$|\phi_n(t) - 1| \le |t| \max_{1 \le k \le n} r_{nk}(x).$  (A.7)

It follows from (A.6) and (A.7) that for fixed t,

$\lim_{n\to\infty} \phi_n(t) = 1,$  (A.8)

which implies that $S_n(\theta \mid x, p) - \bar S_n(\theta \mid x, p) \to 0$ in probability. Therefore, (A.1) follows from (A.2), (A.8) and the triangle inequality.
Proof of Theorem 3.2. Note first that $\Lambda_p(\theta \mid x)$ is continuously differentiable in $\theta$, satisfying $\Lambda_p(g_p(x) \mid x) = 0$ and $\partial \Lambda_p(\theta \mid x)/\partial\theta < 0$; therefore, there are positive constants a and b such that

$|\Lambda_p(\theta \mid x)| \ge a\, |\theta - g_p(x)| \quad \text{for } |\theta - g_p(x)| \le b.$  (A.9)

Now by (2.13), we obtain the identities (A.10) and (A.11). It follows from (A.10) and Theorem 3.1 that (A.12) holds with probability tending to 1. We will show that (A.13) holds; if it does, then the term on the left-hand side of (A.12) tends to 0 in probability. Let $\eta > 0$ be given and $K_n = 2L_n/\eta$, where $L_n = n h_n \sum_{i=1}^n a_i^2(x)\, E\psi_p^2(\varepsilon_i, g_p(x))$; then for sufficiently large n, say $n \ge N$, (A.14) holds. On the other hand, by Chebyshev's inequality, (A.15) holds. (A.14) and (A.15) imply that the equalities (A.16) and (A.17) hold simultaneously with probability exceeding $1 - \eta$. Furthermore, (A.16) implies (A.18); hence, by (A.17), (A.19) holds. Thus, combining (A.16) and (A.19) implies that the equality (A.20) holds with probability exceeding $1 - \eta$ for $n \ge N$. Since it can be shown that $L_n = O(1)$, the right-hand side of (A.20) can be made arbitrarily small by choosing $\eta$ small enough. Therefore, Theorem 3.2 follows if (A.13) holds.

To prove (A.13), let

$\Delta_n(x, p) = \sup_{|\tau - g_p(x)| \le b} \frac{\big| E\big\{\sum_{i=1}^n a_i(x)\,[\psi_p(y_i, \tau) - \psi_p(\varepsilon_i, g_p(x))]\big\} - \Lambda_p(\tau \mid x) \big|}{|\Lambda_p(\tau \mid x)|};$  (A.21)

then it is easy to see that

$\lim_{n\to\infty} \Delta_n(x, p) = 0.$

It then follows from (A.9) and Chebyshev's inequality that the probability in (A.13) is bounded by a quantity which tends to 0 according to Lemma 3.1 and (A.21).


Proof of Theorem 3.3. Since $\Lambda_p(\theta \mid x)$ is continuously differentiable in $\theta$ and satisfies $\Lambda_p(g_p(x) \mid x) = 0$, it follows from the weak consistency of $g_{np}(x)$ that

$\Lambda_p(g_{np}(x) \mid x) = \beta_p(\xi_n, x)\, \big(g_{np}(x) - g_p(x)\big),$  (A.22)

where $\xi_n = \xi_n(x, p)$ is between $g_{np}(x)$ and $g_p(x)$, and $\beta_p(\theta, x) = \partial \Lambda_p(\theta \mid x)/\partial\theta = -E[W_p(g(x) - \theta + \varepsilon) \mid x]$. It follows by Theorem 3.1 and the continuity of $\beta_p(\theta, x)$ in $\theta$ that

$\beta_p(\xi_n, x) \to \beta_p(g_p(x), x) = -E_p(x)$ in probability.  (A.23)

Furthermore, by Theorem 3.2, we then have

$\sqrt{n h_n}\,\big(g_{np}(x) - g_p(x)\big) = G_n(x, p) + o_p(1),$  (A.24)

where $G_n(x, p)$ is given by (A.25). To prove Theorem 3.3, it is enough to show that $G_n(x, p)$ is asymptotically normally distributed. Since $\Lambda_p(g_p(x) \mid x) = 0$, it follows by Lemma 3.1 that the centering terms of $G_n(x, p)$ are asymptotically negligible (A.26).


On the other hand, for $m \ge 2$, we have the moment bound (A.27), where the mth absolute moments are taken with respect to the summands of $G_n(x, p)$. It is easy to check that $\psi_p(z \mid x)$ is continuously differentiable in z, and hence by Lemma 3.1 the bound (A.28) holds. Thus, for $m = 2 + \alpha$, where $\alpha > 0$, (A.27) and (A.28) imply that the Lyapunov ratio tends to 0. Therefore, according to the Lyapunov central limit theorem, the standardized sum converges in distribution to $N(0, 1)$ (A.29), which implies the asymptotic normality of $G_n(x, p)$ (A.30). But, according to (A.26) and (A.28), we have

$\lim_{n\to\infty} \operatorname{Var}\big(G_n(x, p)\big) = \frac{E[(Y - g_p(x))^2\, W_p^2(Y - g_p(x)) \mid x]}{E_p^2(x)} \int_{-1}^{1} K^2(u)\, du = \sigma_p^2(x).$  (A.31)

Thus, it follows from (A.29), (A.30), (A.31) and the Slutsky theorem that $G_n(x, p)$ converges in distribution to $N(0, \sigma_p^2(x))$ (A.32). Therefore, Theorem 3.3 follows from (A.24), (A.32) and the Slutsky theorem.
Proof of Lemma 4.1. It follows from (2.4), (A.24) and Lemma 3.1 that, for large n,

$B_n(x, p) = \frac{1}{E_p(x)} \cdot \frac{1}{h_n} \int_0^1 K\Big(\frac{x-u}{h_n}\Big)\, \Lambda_p\big(g_p(x) \mid u\big)\, du + O(n^{-1}).$  (A.33)

Since $\Lambda_p(\theta \mid z)$ is twice continuously differentiable with respect to z, applying the Taylor expansion of $\Lambda_p(g_p(x) \mid u)$ around $u = x$ gives, for large n, the expansion (A.34). Combining (A.33) and (A.34) implies (4.4). Similarly, by Lemma 3.1, we have for large n the corresponding expansion of $\operatorname{Var}(g_{np}(x))$; therefore, (4.5) follows by noting that its leading term equals $(n h_n)^{-1}\, V_p(x)\, E_p^{-2}(x) \int_{-1}^{1} K^2(u)\, du$.

Proof of Theorem 4.1. Using (4.1) and Lemma 4.1 gives

$\mathrm{MSE}\big(g_{np}(x)\big) = h_n^4\, b_p^2(x) + \frac{1}{n h_n} \cdot \frac{V_p(x)}{E_p^2(x)} \int_{-1}^{1} K^2(u)\, du + o\Big(h_n^4 + \frac{1}{n h_n}\Big).$  (A.35)

It is easy to check that the right-hand side of (A.35) is minimized by $h_n^*$ given by (4.8). Substituting $h_n^*$ into (A.35) gives (4.9).

Acknowledgement
The author wishes to thank Professors Michael Stein and Wing Wong for their
helpful comments.

References

Billingsley, P. (1986), Probability and Measure, 2nd ed., New York: John Wiley & Sons.
Breckling, J. and Chambers, R. (1988), "M-quantiles," Biometrika, 75, 761-771.
Efron, B. (1991), "Regression percentiles using asymmetric squared error loss," Statistica Sinica, 1, 94-125.
Epanechnikov, V. A. (1969), "Nonparametric estimates of a multivariate probability density," Theory of Probability and Its Applications, 14, 153-158.
Gasser, T. and Müller, H. G. (1979), "Kernel estimation of regression functions," in Smoothing Techniques for Curve Estimation, Springer Lecture Notes 757, eds. T. Gasser and M. Rosenblatt.
Härdle, W. and Gasser, T. (1984), "Robust non-parametric function fitting," Journal of the Royal Statistical Society, Series B, 46, 42-51.
Koenker, R. and Bassett, G. (1978), "Regression quantiles," Econometrica, 46, 33-50.
Koenker, R. and Bassett, G. (1982), "Robust tests for heteroscedasticity based on regression quantiles," Econometrica, 50, 43-61.
Newey, W. K. and Powell, J. L. (1987), "Asymmetric least squares estimation and testing," Econometrica, 55, 819-847.
Wahba, G. and Wold, S. (1975), "A completely automatic French curve: fitting spline functions by cross-validation," Communications in Statistics, 4, 1-17.
